The LLM is a machine that, simplified down, takes two inputs: a dataset and weight variables. These two inputs are not the focus of the software; as long as the structure is valid, the machine will give an output. The input is not the machine, and the machine's source code is open source. The machine IS what is revolutionary about this LLM. It's not being praised because its weights are finely tuned, and it didn't knock $700 billion off Nvidia's market cap because it has extra-special training data. It's special because of its optimizations and its novel method of using two halves to bounce ideas back and forth and evaluate its answers. It's the methodology of its function. And that is given to you, open, in its source code.
I don't know what, if any, CS background you have, but that is way off. The training dataset is used to generate the weights, i.e. the trained model. In the context of building a trained LLM, the input is the dataset and the output is the trained model (the weights).
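To make that input/output relationship concrete, here's a minimal sketch assuming PyTorch. Everything here is illustrative (toy model, random tokens, made-up shapes), not DeepSeek's actual code; the point is just the direction of the arrow: dataset in, weights out.

```python
import torch
import torch.nn as nn

# The architecture (this code) is the part a truly open-source release ships.
# Toy "language model": embed 8 tokens, flatten, predict the next token id.
model = nn.Sequential(
    nn.Embedding(1000, 32),
    nn.Flatten(),
    nn.Linear(32 * 8, 1000),
)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Input: the training dataset (random stand-in data here).
tokens = torch.randint(0, 1000, (64, 8))
targets = torch.randint(0, 1000, (64,))

# Training *consumes* the dataset and *produces* the weights.
for _ in range(3):
    optimizer.zero_grad()
    loss = loss_fn(model(tokens), targets)
    loss.backward()
    optimizer.step()

# Output: the trained weights -- the artifact an "open-weight" release ships.
torch.save(model.state_dict(), "weights.pt")
```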
It's more appropriate to call DeepSeek "open-weight" rather than open-source.
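To underline the distinction, a hypothetical sketch assuming the Hugging Face transformers library and DeepSeek's public model id: an open-weight release lets you download and run the trained parameters, but it ships neither the training dataset nor the full training pipeline that produced them.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "Open-weight": the trained parameters are public and runnable.
# What you do NOT get: the training data or the complete training code.
# (Loading the real model needs serious hardware; this is illustrative.)
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1")
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1",
    trust_remote_code=True,  # may be required depending on transformers version
)
```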