this post was submitted on 31 Jan 2025
232 points (94.3% liked)

Article: https://proton.me/blog/deepseek

Calls it "Deepsneak", failing to make it clear that the reason people love DeepSeek is that you can download and run it securely on any of your own private devices or servers - unlike most of the competing SOTA AIs.

I can't speak for Proton, but the last couple of weeks have shown some very clear biases coming out.
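For anyone who hasn't tried it, here's a minimal sketch of what running it locally looks like, using the ollama Python client (this assumes Ollama is installed and serving, and that the model has already been pulled, e.g. with `ollama pull deepseek-r1:14b`; the tag is just an example):

```python
# Minimal local-inference sketch with the ollama Python client.
# Assumes the Ollama server is running and the model was pulled first,
# e.g.: ollama pull deepseek-r1:14b
import ollama

response = ollama.chat(
    model="deepseek-r1:14b",  # example tag; pick whatever fits your hardware
    messages=[{"role": "user", "content": "Summarize what knowledge distillation is."}],
)

# Nothing leaves your machine: the prompt and the reply stay local.
print(response["message"]["content"])
```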

[–] Dyf_Tfh@lemmy.sdf.org 5 points 15 hours ago* (last edited 15 hours ago) (1 children)

Those are not DeepSeek R1. They are unrelated models, like Llama 3 from Meta or Qwen from Alibaba, that were "distilled" by DeepSeek.

This is a common method for transferring some of a larger model's capability into a smaller one.
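Roughly, a minimal sketch of the classic formulation (temperature-scaled logit matching, per Hinton et al.; DeepSeek's distills were reportedly produced by fine-tuning on R1-generated outputs instead, but the goal is the same):

```python
# Sketch of classic knowledge distillation (temperature-scaled logit
# matching), not DeepSeek's actual pipeline: the student is trained to
# match the teacher's softened output distribution.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions, then minimize KL(teacher || student).
    # The temperature**2 factor keeps gradient magnitudes comparable
    # across temperatures.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature**2

# Toy example: batch of 4 tokens, vocabulary of 10.
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)

loss = distillation_loss(student_logits, teacher_logits)
loss.backward()  # gradients flow only into the student
print(f"distillation loss: {loss.item():.4f}")
```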

Ollama should never have labelled them deepseek:8B/32B. Way too many people misunderstood that.

[–] yogthos@lemmy.ml 1 points 13 hours ago (1 children)

I'm running deepseek-r1:14b-qwen-distill-fp16 locally, and I find it produces really good results. Yeah, it's a reduced version of the online one, but it's still far better than anything else I've tried running locally.

[–] morrowind@lemmy.ml 1 points 5 hours ago (1 children)

Have you compared it with the regular qwen? It was also very good.

[–] yogthos@lemmy.ml 0 points 1 hour ago

The main difference is speed and memory usage. Qwen is a full-sized, high-parameter model while qwen-distill is a smaller model created using knowledge distillation to mimic qwen's outputs. If you have the resources to run qwen fast then I'd just go with that.
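For a rough sense of scale, here's a back-of-the-envelope sketch (the parameter counts and precisions are illustrative assumptions based on the tags mentioned above, and real usage adds KV cache and runtime overhead on top of the weights):

```python
# Back-of-the-envelope weight memory: bytes = params * bytes_per_param.
# Parameter counts below are illustrative assumptions, not exact figures.
GIB = 1024**3

models = {
    "full qwen 32B @ fp16": (32e9, 2.0),
    "r1 qwen-distill 14B @ fp16": (14e9, 2.0),
    "r1 qwen-distill 14B @ ~4-bit": (14e9, 0.5),
}

for name, (params, bytes_per_param) in models.items():
    gib = params * bytes_per_param / GIB
    print(f"{name}: ~{gib:.0f} GiB just for the weights")
```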