this post was submitted on 29 Jan 2025
Technology
Sure, it made the training process faster, but it still takes a fraction of the energy to generate a single output compared to other LLMs like ChatGPT or Llama. Plus, it's open source. You can't discredit a technological advancement for building upon previous advancements, especially when it's done with transparency.
As I said, the architectural changes are quite cool.
As far as I've understood, it mostly comes down to splitting the model into multiple "experts" (a mixture-of-experts design), so you don't need to activate the complete model with every request.
But I've only scratched the surface...
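To make the "only activate part of the model" idea a bit more concrete, here's a rough toy sketch of generic top-k mixture-of-experts routing in plain Python. To be clear, this is not DeepSeek's actual code: the names `make_expert`, `moe_forward`, and `gate_weights` are made up for illustration, and the "experts" are trivial scaling functions standing in for real feed-forward blocks.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    # Hypothetical stand-in for an expert network: just scales its input.
    def expert(x):
        return [scale * v for v in x]
    return expert

def moe_forward(x, gate_weights, experts, top_k=2):
    # Router: one score per expert (dot product of x with that expert's gate vector).
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Keep only the top_k highest-scoring experts; the rest are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Renormalize the routing weights over the chosen experts only.
    probs = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for p, i in zip(probs, chosen):
        y = experts[i](x)  # only the selected experts do any work
        out = [o + p * v for o, v in zip(out, y)]
    return out, chosen

# Toy usage: 4 experts, but only 2 of them run for this input.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]  # assumed, not learned
x = [2.0, 1.0]
out, chosen = moe_forward(x, gate_weights, experts, top_k=2)
print(chosen)  # [0, 1]
```

The point is just that the compute per request scales with `top_k`, not with the total number of experts. Real models do this routing per token inside each layer and learn the gate weights during training.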
Also, "open source"... The weights are publicly available, but none of the training data or training systems are.
Edit: regarding "open source":
Meta's Llama is also on Hugging Face, just like DeepSeek. I still wouldn't talk about transparency here.