this post was submitted on 21 Oct 2024
872 points (98.8% liked)
Technology
They would have to hire a shitload of people to police it all, along with the rest of the questionable shit on there, like jailbait or whatever other shit they turned a blind eye to until it showed up on the news.
Not saying it's right, but from a business standpoint it makes sense.
Don't they flag stuff automatically?
Not sure what they're using on the backend, but open source LLMs that take image inputs are good now. Like, they can read garbled text from a meme and interpret it with context, easily. And this is apparently a field that's been refined over years due to the legal need for CSAM detection anyway.
They do, but they'd still need someone to go through the flagged items and check them. Reddit gets away with it the same way Facebook groups do: by offloading moderation to users, with the admins only being roped in for ostensibly big things like ban evasion or site-wide bans, or lately, when the moderators don't toe the company line exactly.
I doubt that they would use an LLM for that. That's very expensive and slow, especially for the volume of images that they would need to process. Existing CSAM detectors aren't as expensive, and are faster. They basically compute a hash for the image, and compare it to known hashes for CSAM.
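To make the hash-and-compare idea concrete, here's a rough sketch. It is not how any production detector actually works (those use proprietary perceptual hashes and far more robust matching); it's a toy "average hash" over a tiny grayscale grid, compared against a set of known hashes by Hamming distance. The function names, the `max_distance` threshold, and the assumption that images are already downscaled to a small grayscale grid are all made up for illustration.

```python
from typing import Iterable, List

# Assumption: the image has already been downscaled to a small grayscale grid
# (values 0-255). Real pipelines would do that resize step first.
Image = List[List[int]]


def average_hash(pixels: Image) -> int:
    """Toy perceptual hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions where the two hashes differ."""
    return bin(a ^ b).count("1")


def is_flagged(pixels: Image, known_hashes: Iterable[int], max_distance: int = 2) -> bool:
    """Flag the image if its hash is within max_distance bits of any known hash."""
    h = average_hash(pixels)
    return any(hamming(h, k) <= max_distance for k in known_hashes)


# Hypothetical data: one "known" image, a slightly brightened copy, and an unrelated image.
known = [[10, 200, 10, 200], [200, 10, 200, 10], [10, 200, 10, 200], [200, 10, 200, 10]]
brightened = [[p + 5 for p in row] for row in known]
unrelated = [[200, 200, 10, 10] for _ in range(4)]

known_hashes = {average_hash(known)}
print(is_flagged(brightened, known_hashes))
print(is_flagged(unrelated, known_hashes))
```

The point is just that each check is a cheap hash plus a lookup, which is why it's so much faster than running a model over every upload. The catch is that it only catches images close to ones already in the known set.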
Small LLMs are quite fast these days, even the multimodal ones. Same with small models explicitly used to filter diffusion output.
A shitload of people, like as many as 10!