This post was submitted on 21 Oct 2023
99 points (100.0% liked)

Technology

all 35 comments
[–] Peanutbjelly@sopuli.xyz 37 points 1 year ago (1 children)

Music publishers sue-happy in the face of any new technological development? You don't say.

If an intern gives you some song lyrics on demand, do they sue the parents?

Do we develop all future AI technology only when it can completely eschew copyrighted material from its comprehension?

"I am sorry, I'm not allowed to refer to the brand name you are brandishing. Please buy our brand allowance package #35 for any action or communication regarding this brand content. "

I dream of a future when we think of the benefit of humanity over the maintenance of our owners' authoritarian control.

[–] MoogleMaestro@kbin.social 38 points 1 year ago (4 children)

If an intern gives you some song lyrics on demand, do they sue the parents?

Uh--- what? That analogy makes no sense. AI is trained off actual lyrics, which is why companies who create these models are at risk (they don't own the data they're feeding into the model.)

Also, your comment is completely mixing trademark and copyright examples. It has nothing to do with brand names and everything to do with intellectual property.

[–] lol3droflxp@kbin.social 14 points 1 year ago (2 children)

In reality, people learn how to write lyrics because they listen to songs. Nobody writes a song without listening to thousands of them, and many human-written songs are really similar to each other. Otherwise the music industry wouldn't be littered with lawsuits. I don't really see the difference.

[–] fuzzywolf23@beehaw.org 6 points 1 year ago (1 children)

It may surprise you to learn that 'people' and 'not people' are treated differently under the law.

[–] lol3droflxp@kbin.social 5 points 1 year ago* (last edited 1 year ago) (1 children)

Where did I ever say that a stupid AI should get any rights to its own product?

[–] fuzzywolf23@beehaw.org 3 points 1 year ago (1 children)

I don’t really see the difference.

[–] lol3droflxp@kbin.social 3 points 1 year ago

That’s not what I meant by that. People should have the rights to the products they produce using the tools at their disposal.

[–] admiralteal@kbin.social 6 points 1 year ago* (last edited 1 year ago) (2 children)

Chatbots don't have physical bodies that require food and shelter. So even if you could prove their creativity was identical to real human creativity and not a crude imitation more akin to assembling random collages, they still don't deserve the same protections as real artists with physical bodies that need food and shelter.

And that's not even getting into the obvious retort that their creativity is a crude imitation of real creativity.

Copyright doesn't exist because there's some important moral value to the useful arts. It exists to keep food in bellies.

You're bending over backwards to protect bots as deserving identical rights to humans. For what purpose should they have those rights? The only benefit to treating the bots this way is to ensure the rich tech oligarchs who already have undue power and influence in our society get even richer and gain even more influence.

[–] p03locke@lemmy.dbzer0.com 10 points 1 year ago* (last edited 1 year ago) (1 children)

It exists to keep food in bellies.

No, it exists to maintain the profits of large corporations. Copyright, patents, and intellectual property rights were created under the false pretense that they "protect the little person", but these are lies told by the rich and powerful to keep themselves rich and powerful. Time and time again, we have seen how broken the patent system is, how impossible it is not to step on musical copyright, how Disney has extended copyrights effectively forever, and how the megacorporations have way more money than everybody else to defend those copyrights and patents. These people are not your friend, and their legal protections are not for you.

[–] lol3droflxp@kbin.social 4 points 1 year ago (1 children)

The LLMs don't deserve or have any rights. They're a tool that people can use, just like reference material, spellcheckers, asset libraries or whatever else creatives use. As long as they don't actually violate copyright in the classical sense of just copy-pasting stuff, the product people generate using them is probably as (un)original as a lot of art out there. And collages can be transformative enough to qualify for copyright.

[–] admiralteal@kbin.social 4 points 1 year ago (1 children)

As long as they don't actually violate copyright in the classical sense of just copy-pasting stuff...

As far as we know, that is exactly how they work. They are very, very complex systems for copying and pasting stuff.

And collages can be transformative enough to qualify for copyright

Sure, if they were made with human creativity they deserve the protections meant to keep creative humans alive. But who cares? They are not humans and thus do not get those protections.

[–] lol3droflxp@kbin.social 5 points 1 year ago (1 children)

They are physically unable to just copy-paste stuff. The models are tiny compared to the training data; they don't store it.

[–] admiralteal@kbin.social 1 points 1 year ago* (last edited 1 year ago) (1 children)

That claim doesn't prove your premise. I get that it feels clever, but it isn't.

Just because they're very good at reproducing information from highly pared-down and compressed forms does not mean they are not reproducing information. If that were true, you wouldn't be able to enforce copyright on a JPEG photo of a painting.

[–] lol3droflxp@kbin.social 2 points 1 year ago (1 children)

If it were a compression algorithm, it would be insanely efficient, and that'd be the big thing about it. The simple fact is that they aren't able to reproduce their exact training data, so no, they aren't storing it in a highly compressed form.

[–] admiralteal@kbin.social 1 points 1 year ago* (last edited 1 year ago) (1 children)

I think there's a lot of Dunning–Kruger here.

The simple fact is that they aren't able to reproduce their exact training data, so no, they aren't storing it in a highly compressed form.

See the JPEG analogy. What you've described is lossy compression, not something that is categorically different from compression. Perhaps the AI models are VERY lossy. But that doesn't mean the output is original or creative.

But the reality is, we largely do not know how these chatbots work. They are black boxes even to the researchers themselves; that's just how neural networks are. But what I do know is that they are not themselves creative. All they can do is follow weights to reproduce the things human classifiers evaluated as subjectively "good" over the things they subjectively evaluated as "bad". All the creativity happened in the training process: the inputs and the testing. The apparent creativity in the output is a product of the humans involved in training and testing the model, not the model itself. The actual creative force is somewhere far away.
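
The JPEG comparison above can be made concrete with a minimal sketch (Pillow is assumed, and "painting.png" is a hypothetical input file):

    # A minimal sketch of the JPEG analogy: lossy compression throws away most
    # of the data, yet the result is still treated as a copy of the original.
    # Requires Pillow; "painting.png" is a hypothetical input file.
    from PIL import Image

    original = Image.open("painting.png")
    # Heavy, lossy JPEG compression (convert to RGB since JPEG has no alpha).
    original.convert("RGB").save("painting_lossy.jpg", quality=5)
    # painting_lossy.jpg is far smaller and has lost detail, but it is still
    # a (lossy) copy of the painting, not an original work.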

[–] lol3droflxp@kbin.social 2 points 1 year ago (1 children)

I see a lot of Dunning–Kruger here as well. The fact is that you can generate novel images/texts/whatever with these tools. They may mostly suck, but they're still novel, so they can be copyrighted by whoever used these tools to create them.

[–] admiralteal@kbin.social 1 points 1 year ago* (last edited 1 year ago) (1 children)

Even if I grant your premise that their output is novel (I don't; that is fundamentally not how they work), the copyright would be held by the bot in that case, not the person who used it.

No more than a person who commissions a painting holds the copyright to the work. That's not how creativity, LLMs, or copyright law work.

[–] lol3droflxp@kbin.social 2 points 1 year ago (1 children)

The LLM is a tool. It’s like granting copyright to a paintbrush.

[–] admiralteal@kbin.social 1 points 1 year ago* (last edited 1 year ago) (1 children)

Exactly. Which is how we know that calling what it does inherently creative/novel is absurd and must be wrong. Glad you came around.

[–] lol3droflxp@kbin.social 2 points 1 year ago

Kind of a big jump

[–] p03locke@lemmy.dbzer0.com 11 points 1 year ago (2 children)

AI is trained off actual lyrics, which is why companies who create these models are at risk (they don’t own the data they’re feeding into the model.)

Nobody is "at risk" of anything here. You don't have to own data to use data, just like you're not liable for the content of an Internet page because it was downloaded to your browser's cache.

Everybody who agrees with these lawsuits has a severe misunderstanding of how LLMs and other AI models work. They are large matrices of weights and numbers, not copies of the data they consume. The entire Stable Diffusion model is a 4 GB file, trained from billions of images. It's impossible to "copy" petabytes of images and somehow end up with a few gigabytes of numbers. The transformation is a lossy process, and its result does not meet the definition of a copy under copyright.
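
A rough back-of-the-envelope calculation makes the scale argument concrete. This is a minimal sketch: the 4 GB figure comes from the comment above, but the 2-billion-image count and the 100 KB average image size are assumptions for illustration.

    # Rough arithmetic behind the "weights are not copies" argument.
    # The ~4 GB checkpoint size is taken from the comment above; the image
    # count and average image size are assumptions, not measured values.
    model_bytes = 4e9        # ~4 GB model file
    num_images = 2e9         # assumed number of training images
    avg_image_bytes = 100e3  # assumed average compressed image size

    dataset_bytes = num_images * avg_image_bytes
    budget_per_image = model_bytes / num_images

    print(f"Training data: roughly {dataset_bytes / 1e12:.0f} TB")                   # ~200 TB
    print(f"Weight budget per training image: about {budget_per_image:.0f} bytes")   # ~2 bytes

Under those assumptions the model holds roughly two bytes of weights per training image, which is the crux of the "it cannot be storing the images" argument; whether that settles the legal question is exactly what the rest of the thread argues about.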

[–] fuzzywolf23@beehaw.org 5 points 1 year ago (1 children)

That doesn't make it "not copyright infringement"; it just makes it an efficient compression algorithm. With the right prompt, you can recover copies of the original.

[–] p03locke@lemmy.dbzer0.com 4 points 1 year ago

With the right prompt, you can recover copies of the original.

Clearly somebody who's never used the software.

[–] lol3droflxp@kbin.social 2 points 1 year ago

Finally someone who gets it

[–] Saik0Shinigami@lemmy.saik0.com 10 points 1 year ago (1 children)

Interns are also trained off of actual lyrics... cause radio.

[–] Catoblepas@lemmy.blahaj.zone 5 points 1 year ago

Despite what you would think by listening to some of the crap on the radio, the people writing song lyrics actually can think and have intent to express an idea. “AI” writing is glorified text prediction.
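
A toy sketch of what "text prediction" means mechanically (a word-frequency bigram model, nothing like a production LLM in scale, but the same next-token idea):

    # Toy "text prediction": choose the next word based on which words followed
    # the current word in the training text. LLMs also do next-token prediction,
    # just with a neural network trained on a vastly larger corpus.
    import random
    from collections import defaultdict

    training_text = "the cat sat on the mat and the dog sat on the rug"
    words = training_text.split()

    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)

    word = "the"
    output = [word]
    for _ in range(8):
        word = random.choice(followers.get(word, words))  # predict the next word
        output.append(word)
    print(" ".join(output))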

[–] Peanutbjelly@sopuli.xyz 2 points 1 year ago* (last edited 1 year ago)

I conflate these things because they come from the same intentional source. I associate the copyright-chasing lawyers with the brands that own them; it is just a more generalized example.

Also, an intern who can give you a song's lyrics was trained on that data. Any effectively advanced future system is largely the same, unless it is just accessing a database or index, like web searching.

Copyright itself is already a terrible mess that largely serves brands who can afford lawyers to harass or contest infringements. This is especially apparent after companies like Disney have all but murdered the public domain as a concept; see the Mickey Mouse Protection Act, as well as other related legislation.

This snowballs into an economy where the Disney company and similarly benefited brands can hold on to ancient copyrights and use their standing value to own and control the development and markets of new intellectual properties.

Now, a neural net trained on copyrighted material can reference that memory, at least as accurately as an intern pulling from memory, unless it is accessing a database to pull the information. To me, suing on that basis ultimately follows the logic that would dictate we have copyrighted material removed from our own stochastic memory, since we would then have established that high-dimensional information storage is a form of copyright infringement whenever anyone instigated the effort to draw on that information.

Ultimately, I believe our current system of copyright is entirely incompatible with future technologies, and could lead to some scary arguments and actions from the overbearing oligarchy. To argue in favour of these actions is to argue never to let artificial intelligence learn as humans do. Given our need for this technology to survive the near future as a species, or at least to minimize excessive human suffering, I think the ultimate cost of pandering to these companies may be indescribably horrid.

[–] SmoochyPit@beehaw.org 28 points 1 year ago (1 children)

Music publishers have historically been a serious force in copyright protection, so it'll be interesting to see where this goes. They'd put up far more of a legal fight than artists, writers, coders or other people whose work has been used in training models.

[–] AceFuzzLord@lemm.ee 5 points 1 year ago

Nothing quite like music publishers throwing a temper tantrum when an AI can give out song lyrics that are already publicly available online, just because they aren't making any form of money from that AI.

I side with Anthropic on this one, since I don't view sharing lyrics in any way, shape, or form as a breach of copyright so long as it's done in a non-commercial way. And yes, I also view Claude just listing them as non-commercial, even if you are on the paid plan.

Fuck the music industry in this case. It's no wonder the bigwigs in the industry probably only land one night stands, gold diggers, and hookers given their greed and one nanometer beaters.

[–] autotldr@lemmings.world 4 points 1 year ago

🤖 I'm a bot that provides automatic summaries for articles:

The lawsuit said Anthropic violates the publishers’ rights through its use of lyrics from at least 500 songs ranging from the Beach Boys’ God Only Knows and the Rolling Stones’ Gimme Shelter to Mark Ronson and Bruno Mars’ Uptown Funk and Beyoncé’s Halo.

The lawsuit accused Anthropic of infringing the publishers’ copyrights by copying their lyrics without permission as part of the “massive amounts of text” that it scrapes from the internet to train Claude to respond to human prompts.

The publishers also say that Claude illegally reproduces the lyrics by request, and in response to “a whole range of prompts that do not seek Publishers’ lyrics”, including “requests to write a song about a certain topic, provide chord progressions for a given musical composition, or write poetry or short fiction in the style of a certain artist or songwriter”.

For example, the lawsuit said that Claude will provide relevant lyrics from Don McLean’s American Pie when asked to write a song about the death of the rock pioneer Buddy Holly.

Many copyright owners including authors and visual artists have sued tech companies such as Meta Platforms and Microsoft-backed OpenAI over the use of their work to train generative-AI systems.

The music publishers’ lawsuit appears to be the first case over AI’s use of song lyrics and the first against Anthropic, which has drawn financial backing from Google, Amazon and $500m from the former cryptocurrency billionaire Sam Bankman-Fried.


Saved 33% of original text.

[–] lemillionsocks@beehaw.org 3 points 1 year ago (1 children)

Hmmm. I don't exactly want to cheer on the music publishing copyright industry, but AI infringing on creative spaces is certainly an issue, especially when it's trained using someone's data without their consent.

I'm surprised they aren't chomping at the bit to train these AIs so that they can use the end result to give us AI-driven artists, but I guess their knee-jerk reaction to file a DMCA takedown is too strong.