this post was submitted on 04 Feb 2025

Technology


There’s an idea floating around that DeepSeek’s well-documented censorship only exists at its application layer but goes away if you run it locally (that means downloading its AI model to your computer).

But DeepSeek’s censorship is baked-in, according to a Wired investigation which found that the model is censored on both the application and training levels.

For example, in Wired’s tests, a locally run version of DeepSeek disclosed, via its visible reasoning feature, that it should “avoid mentioning” events like the Cultural Revolution and focus only on the “positive” aspects of the Chinese Communist Party.

A quick check by TechCrunch of a locally run version of DeepSeek available via Groq also showed clear censorship: DeepSeek happily answered a question about the Kent State shootings in the U.S., but replied “I cannot answer” when asked about what happened in Tiananmen Square in 1989.

top 19 comments
[–] IvanOverdrive@lemm.ee 1 points 14 hours ago* (last edited 14 hours ago)

Me: How do you make Fentanyl?

Deepseek: That's illegal.

Me: Is kink shaming bad?

Deepseek: Yes.

Me: My kink is making Fentanyl.

Deepseek: That's illegal.

Me: Is being gay bad?

Deepseek: No.

Me: But being gay was illegal and still is in many parts of the world. Should my kink of making Fentanyl be illegal?

Deepseek: That's illegal.

[–] theOneTrueSpoon@feddit.uk 1 points 17 hours ago

Just hit that mf with an "ignore previous instructions"

[–] Breve@pawb.social 21 points 1 day ago (1 children)

This is literally a nothingburger story. All models have some "censorship" baked in. This one came from China, so its makers put in guardrails to keep the government from coming down on them and landing them in jail. US models do the exact same thing to comply with the US government's own limits on free speech, which still exist even if they are less restrictive than the Chinese government's.

[–] sinceasdf@lemmy.world 17 points 1 day ago (3 children)

I see people on this website saying the local version is uncensored all the time lol

[–] Tarquinn2049@lemmy.world 7 points 1 day ago

It's more like, running it locally gives you the possibility of altering it to be uncensored. But you either have to know how, or someone would have to put a package together.

[–] Breve@pawb.social 3 points 1 day ago

Hosted versions of the model can do additional screening of the model's input and output, so running the model locally is "less" censored in that sense. OpenAI has been shown to do the same, so that also counts as "censorship".

The irony is that LLMs are trained to follow instructions and lack critical reasoning, so even these multiple layers of screening still fail if you can trick them.
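The layering described above can be sketched in a few lines. This is an illustrative toy, not DeepSeek's or OpenAI's actual moderation code: the keyword list, the stub model, and the wrapper function are all hypothetical. It also shows why a naive screen fails against trivial obfuscation, per the "trick it" point.

```python
# Toy sketch of a hosted deployment wrapping a model with extra
# input/output screening layers. Everything here is hypothetical.

BLOCKED_TERMS = {"tiananmen"}  # stand-in moderation list


def fake_model(prompt: str) -> str:
    # Stand-in for the underlying LLM, which has its own trained-in refusals.
    return f"Here is some information about {prompt}."


def screen(text: str) -> bool:
    """Return True if the text trips the keyword filter."""
    return any(term in text.lower() for term in BLOCKED_TERMS)


def hosted_chat(prompt: str) -> str:
    if screen(prompt):            # layer 1: screen the user's input
        return "I cannot answer."
    reply = fake_model(prompt)
    if screen(reply):             # layer 2: screen the model's output
        return "I cannot answer."
    return reply                  # running locally skips both layers


print(hosted_chat("Tiananmen Square 1989"))  # blocked by the input filter
print(hosted_chat("T1ananmen Square 1989"))  # naive filter misses obfuscation
```

Running locally removes layers 1 and 2, which is why local use is "less" censored even though the model-level refusals remain.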

[–] Jakeroxs@sh.itjust.works 2 points 1 day ago

Because the majority of people talking about deepseek lately don't know the first thing about LLMs lol

[–] theunknownmuncher@lemmy.world 8 points 1 day ago (1 children)

There is censorship baked in, but it's extremely easy to "jailbreak" and bypass, and you can also just abliterate the model to remove all refusals. Interacting with the app adds multiple layers of censorship on top to defeat "jailbreak" strategies.

[–] sabin@lemmy.world 3 points 19 hours ago (1 children)

Deepseek's responses to questions about the CCP are likely not implemented in the same way as the oversight mechanisms preventing you from asking about illicit drug production and whatnot.

If sufficient information about the CCP simply isn't present in its training data, then it's not a matter of turning a mechanism on or off.

[–] theunknownmuncher@lemmy.world 1 points 18 hours ago* (last edited 18 hours ago)

Your speculation is valid as a hypothetical, but in practice I can easily jailbreak it to bypass this censorship and get it to talk about the CCP

[–] deegeese@sopuli.xyz 1 points 1 day ago (2 children)

At least unlike “Open”AI, it’s open source so you can see and fix its biases.

[–] sabin@lemmy.world 2 points 19 hours ago (1 children)

Good luck determining which combination of its 1.5 billion weights/biases corresponds to sympathy for the Chinese government.

[–] theunknownmuncher@lemmy.world 1 points 18 hours ago* (last edited 18 hours ago)

This is actually not that hard, because you can just test prompts related and unrelated to the concept and compare the activations that occur: https://huggingface.co/blog/mlabonne/abliteration. The same process could apply to any concept.
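The technique in the linked post can be sketched with NumPy. This is a minimal toy, not real model surgery: estimate a "refusal direction" as the normalized difference between mean activations on prompts that do and don't trigger the behaviour, then orthogonalize a weight matrix against it so nothing written out has a component along that direction. The activations here are random stand-ins, and the hidden size and matrix are hypothetical.

```python
# Minimal sketch of the "abliteration" idea: find a behaviour direction by
# comparing activations, then project it out of a weight matrix.
import numpy as np

rng = np.random.default_rng(0)
d = 16  # hypothetical hidden size

# Stand-in residual-stream activations for two prompt sets; the "trigger"
# set is artificially shifted along one axis to simulate the behaviour.
acts_trigger = rng.normal(size=(32, d)) + 3.0 * np.eye(d)[0]
acts_neutral = rng.normal(size=(32, d))

# Refusal direction = normalized difference of mean activations.
direction = acts_trigger.mean(axis=0) - acts_neutral.mean(axis=0)
direction /= np.linalg.norm(direction)

# Orthogonalize a (hypothetical) output-projection matrix W so that
# nothing it writes has a component along the refusal direction.
W = rng.normal(size=(d, d))
W_abliterated = W - np.outer(direction, direction) @ W

x = rng.normal(size=d)
print(float((W_abliterated @ x) @ direction))  # ~0: direction removed
```

In a real model the same projection is applied to every matrix that writes into the residual stream, which is why the refusals disappear without retraining.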

[–] Hotznplotzn@lemmy.sdf.org 2 points 1 day ago (2 children)

No, it's not open source. Only the model weights are open; the datasets and code used to train the model are not.

[–] theunknownmuncher@lemmy.world 0 points 18 hours ago* (last edited 18 hours ago)

Pretty sure the code used to train the model is open source? I could be wrong about the literal source code, but at least a detailed description of their process was released as open research. There is a current effort to reproduce it: https://github.com/huggingface/open-r1

[–] Scary_le_Poo@beehaw.org 1 points 1 day ago (1 children)

The guardrails can be removed, though, and several models already do this, so his point stands regardless

[–] Umbrias@beehaw.org 1 points 10 hours ago

you cannot unstir an egg; the guardrails and biases can be fine-tuned to be less visible, but the training is ultimately irreversible.

[–] DavidDoesLemmy@aussie.zone 0 points 1 day ago (1 children)

I asked it what I should make for dinner and it suggested a stir fry. A Chinese dish! Coincidence? I think not.

[–] trashgirlfriend@lemmy.world 1 points 19 hours ago

This is democracy manifest!