this post was submitted on 13 Jan 2024

Fediverse


Though Lemmy and Mastodon are public sites, and their structures are open-source I guess? (I'm not a programmer/coder), can they really dodge the ability of AIs to collect/track any data every time they search everywhere on the Internet?

[–] skeletorfw@lemmy.world 2 points 10 months ago

Yeah as an ecologist that same thing made me giggle. I suppose why not the lesser-spotted 🍆warbler :P

In terms of exposing it only to bots, that is a frustration: unless you make it seamless, it becomes kinda trivial to mitigate. The approach I'd take to mitigate it is to adapt a Lemmy client that already does the filtering, or to reverse-engineer the deciding element of the app. Similarly, if you use garbage then it needs to look enough like normal words that it's hard to classify as AI-generated.

The funny thing is that LLMs are not actually much good at telling whether something is AI-generated. You need to train another model to do that, but to train that model you need good sources of non-corrupt data. Also, the whole point of generative AI language models is that they are actively trying to pass that test by design, so it becomes an arms race that the detectors can never really win!

Man, what a shitshow generative AI is