this post was submitted on 17 Sep 2023

78 points (88.2% liked)

Terraforming Mars Publisher Calls AI "Too Powerful" Not to Use (gizmodo.com)

submitted 1 year ago by FlyingSquid@lemmy.world to c/technology@lemmy.world

[–] circuitfarmer@lemmy.sdf.org 1 points 1 year ago* (last edited 1 year ago) (1 children)

I work in generative AI, specifically curated training sets.

The issue is training on "licensed materials". If that happened with all AI, no one would have a problem. But it's disingenuous to suggest that's how most AI is currently being trained. A lot of materials have been scraped off of the web, especially for image generation, meaning some portion of the training data was used without the author's consent or, often, even their knowledge. It's important to note that scraping training data in this way usually breaks a TOS.

The number of people I've seen supporting AI usage in this context is staggering, with one commenter even telling me it was about the "greed" of the artists, whose work may be in a training set without consent, wanting royalties for slightly changing a parameter with their art (that is, of course, a strawman fallacy).

To me, the only issue here is handling the ethics of what goes into training data and what doesn't. Authors should have the choice of their materials not being used. Adobe understood this, which is why Firefly being trained on explicitly licensed materials makes it a different beast, to which you allude.

But it's clear a lot of people don't understand why using data without consent is a bad thing in this context, and for that reason, some other people will choose not to support companies using it until the issue is resolved. It seems quite reasonable to me.

[–] FaceDeer@kbin.social 2 points 1 year ago (1 children)

The issue is training on “licensed materials”.

People usually say that's the issue, until you show them that it's possible to generate images and whatnot from models trained on "fully licensed" data. Then they come up with some other reason why evil AI is awful and evil. I've been involved in these debates for a long time now and those goalposts have well-worn tracks from how frequently they shift that way.

But it’s clear a lot of people don’t understand why using data without consent is a bad thing in this context

No, they don't agree that using data without consent is a bad thing. Saying "they don't understand" it is begging the question, in the literal sense. You're saying that people who disagree about that are simply being ignorant of some underlying "truth."

[–] circuitfarmer@lemmy.sdf.org 1 points 1 year ago (1 children)

No, they don't agree that using data without consent is a bad thing.

If this developer doesn't mind taking data without consent, I hope they don't have an issue with people pirating their game. That's a slippery slope if I ever saw one.

[–] FaceDeer@kbin.social 0 points 1 year ago* (last edited 1 year ago) (1 children)

"Slippery slope" is also a fallacy. Training an AI and copying a game are two different things and it's entirely reasonable to hold the position that one is ok and the other is not.

[–] circuitfarmer@lemmy.sdf.org 0 points 1 year ago (1 children)

You're missing the point. Both are using data (work of the dev on a game, work of an artist on art) without consent.

[–] FaceDeer@kbin.social 1 points 1 year ago (1 children)

I'm not missing the point. Just because they're both "using data without consent" doesn't mean they're the same thing. Playing baseball and smashing someone's car both involve swinging a bat but that's where the similarity ends.

There are many ways that you can "use data without consent" that are perfectly legal.

[–] circuitfarmer@lemmy.sdf.org 0 points 1 year ago (1 children)

Legal does not necessarily equate to ethical. And the law will eventually change (I think) to mitigate some of these shortcomings that AI training has highlighted.

[–] FaceDeer@kbin.social 2 points 1 year ago (1 children)

Legal does not necessarily equate to ethical.

Of course not. But "ethical" is a matter of subjective debate. You say X is unethical, I say X is ethical, and ultimately there's no way to tell who's "right."

Law's different; the whole point of it is to have a system that sorts these things out.

And the law will eventually change (I think) to mitigate some of these shortcomings that AI training has highlighted.

So it's not currently illegal to train AIs like this? That's been my point this whole time. It's a different thing from the things that are currently illegal (such as "theft").

[–] circuitfarmer@lemmy.sdf.org 0 points 1 year ago* (last edited 1 year ago) (1 children)

Currently legal, but unethical. I never claimed it was illegal. (I did mention that scraping usually breaks a TOS, but that's definitely a legal grey area and moot if it's publicly accessible data.)

[–] FaceDeer@kbin.social 2 points 1 year ago (1 children)

Unethical according to your personal opinion. My opinion on the ethics of the matter differs, and that's just as valid as yours. You don't get to declare "that's unethical" and then expect everyone to just fall in line with your belief. Way back at the root of this you said:

But it’s clear a lot of people don’t understand why using data without consent is a bad thing in this context,

Which, as I argued back then, suggests that you think "using data without consent is a bad thing" is an established truth that people who disagree with you just don't understand. No, they understand perfectly well. They just disagree with you.

[–] circuitfarmer@lemmy.sdf.org 1 points 1 year ago (1 children)

Can you explicate why you believe it is ethical to use data without consent of the data creator?

[–] FaceDeer@kbin.social 2 points 1 year ago (1 children)

Because it's no different from what people have been doing since time immemorial - learning concepts and styles from things that they can see in public. To place restrictions on this is going to require a whole new category of intellectual property and it leads in very dubious directions.

"Intellectual property" is inherently a restriction of people's rights, and you need to have a very good reason to apply any such restriction, one that balances those restrictions with the public benefits that derive from it. Copyright, for example, promotes the progress of science and the useful arts by making it "safe" to publish stuff rather than keeping it squirrelled away. Trademarks benefit people by making the provenance of goods clear. Patents ensure that inventions aren't lost.

Rights are not restricted by default, they are unrestricted by default. When something new comes along it's up to the people who want to restrict it to make their case. The default state of the world should be freedom, not prohibition and control.

Trying to restrict the right to learn is an extremely dark place to be going. I strongly oppose that.

[–] circuitfarmer@lemmy.sdf.org 2 points 1 year ago (1 children)

Thanks for your explanation.

[–] FaceDeer@kbin.social 1 points 1 year ago

No problem. People often assume the worst about their opponents in debates (I succumb to that too, even though I try to avoid it). Thank you for asking for an explanation of my position.