this post was submitted on 04 Jan 2024

76 points (96.3% liked)

Japan determines copyright doesn't apply to LLM/ML training data (infosec.town)

submitted 10 months ago by ericjmorey@programming.dev to c/programming@programming.dev

cross-posted from: https://programming.dev/post/8121669

Taggart (@mttaggart) writes:

Japan determines copyright doesn't apply to LLM/ML training data.

On a global scale, Japan’s move adds a twist to the regulation debate. Current discussions have focused on a “rogue nation” scenario where a less developed country might disregard a global framework to gain an advantage. But with Japan, we see a different dynamic. The world’s third-largest economy is saying it won’t hinder AI research and development. Plus, it’s prepared to leverage this new technology to compete directly with the West.

I am going to live in the sea.

www.biia.com/japan-goes-all-in-copyright-doesnt-apply-to-ai-training/

top 22 comments

[–] Jean_Lurk_Picard@lemmy.world 26 points 10 months ago (3 children)

Does this mean I can continue to peer-to-peer share 4K movies for 'AI purposes'?

[–] ericjmorey@programming.dev 9 points 10 months ago

Let's find out

[–] Rosco@sh.itjust.works 3 points 10 months ago

Not a crime if you don't get caught!

[–] infinitepcg@lemmy.world 0 points 10 months ago

[–] charonn0@startrek.website 21 points 10 months ago (3 children)

I tend to support this idea. If inputting copyrighted materials isn't infringement then neither should taking the output be.

[–] ericjmorey@programming.dev 19 points 10 months ago (1 children)

[–] Omega_Haxors@lemmy.ml 3 points 10 months ago

Sickos: Yes ha ha ha, yes!!

[–] pkill@programming.dev 11 points 10 months ago (1 children)

laundering copyleft inputs into copyrighted outputs sucks tho. This has been happening before AI, but I think that any form of violating GPL, CC-NC or CC-ND should be punished.

[–] Pipoca@lemmy.world 6 points 10 months ago

In the US, at least, AI works are inherently public domain, because copyright only applies to works with a human author.

[–] hallettj@beehaw.org 11 points 10 months ago (2 children)

Yeah, that makes a lot of sense. If the thinking is that AI learning from others' works is analogous to humans learning from others' works then the logical conclusion is that AI is an independent creative, non-human entity. And there is precedent that works created by non-humans cannot be copyrighted. (I'm guessing this is what you are thinking, I just wanted to think it out for myself.)

I've been thinking about this issue as two opposing viewpoints:

The logic-in-a-vacuum viewpoint says that AI learning from others' works is analogous to humans learning from others' works. If one is not restricted by copyright, neither should the other be.

The pragmatic viewpoint says that AI imperils human creators, and it's beneficial to society to put restrictions on its use.

I think historically that kind of pragmatic viewpoint has been steamrolled by the utility of a new technology. But maybe if AI work is not copyrightable that could help somewhat to mitigate screwing people over.

[–] milo128@lemm.ee 1 points 10 months ago

Allow me to introduce you to the pragmatic and ideologically consistent viewpoint: that human creativity is unique and not comparable to AI creativity. Humans draw from experiences and memories, and have a unique perspective that is introduced to their art, while AI just churns through thousands of images and tries to replicate them. Don't assume that just because the results are impressive, AI creativity is analogous to human creativity.

[–] Miaou@jlai.lu 1 points 10 months ago

Just to point out, AI training is very different from human learning, so drawing parallels between the two does not make much sense.

[–] kunaltyagi@programming.dev 2 points 10 months ago (1 children)

Isn't this year-old news?

[–] ericjmorey@programming.dev 2 points 10 months ago

Wouldn't surprise me. I can't read or speak Japanese and don't do much active searching for changes in Japanese law and regulations because I'm pretty insulated from most of those decisions. So I'm just coming across it now and have very little context around this. It was the first time I heard about it and it seemed like a big deal. Seems many others also didn't know.

Putting the source article (in Japanese) through Google Translate indicates that the Japanese government made some decisions in April of 2023 that are still being discussed out of concern for the effects they will have on the world.

[–] state_electrician@discuss.tchncs.de 1 points 10 months ago (2 children)

I am so torn on this. On the one hand, I think training these huge models is very similar to human artists consuming things and then consciously or unconsciously using it for their own work. The source is usually no longer distinguishable. So it should be allowed to train them on anything a human could consume.

On the other hand, large AI models are mostly under the control of huge asshole corporations and I absolutely hate seeing them benefit for free from the rest of us. It'd be nice if regulation like Japan's applied only to freely available models.

[–] milo128@lemm.ee 1 points 10 months ago (1 children)

Don't fool yourself with this logic. I thought this too until I realized that humans all have a unique perspective, and an entire life of experiences to draw from. Humans don't just look through tens of thousands of images, internalize them, and spit out something similar to them like AI does. It is just not comparable the way it might seem at first glance.

[–] state_electrician@discuss.tchncs.de 2 points 10 months ago (1 children)

I think that we like to think that, but actually we're not as unique or complex as we'd like to be. We're now at a point where AI already surpasses most humans in knowledge and even creativity. And the models will evolve much faster than we do.

[–] milo128@lemm.ee 1 points 10 months ago* (last edited 10 months ago) (1 children)

Creativity isn't about looking at a million existing art pieces and continuing the pattern. Sure, humans can do something similar to that, but that's all that AI can do. AI can imitate a human artist but it cannot have real creativity because it lacks the memories, experiences, views, and perspectives that a human would use when creating something new. I'm not just saying this because I want it to be true, I'm saying it because it is true. Awe at the exploding capabilities of AI may be clouding your judgement.

[–] state_electrician@discuss.tchncs.de 1 points 10 months ago

I think this leap in AI capabilities, which looks sudden to outsiders, will also over time cause us to re-evaluate how we ourselves operate. What does it mean to be creative? Humans are mostly incapable of saying why they did something, because so many things factor into it, down to how we felt in the very moment we did it. But the same is true for current AI models. Nobody can say why they did what they did. And they will only become more complex in the future. Sure, you can come up with definitions of creativity that exclude AI on purpose. But I think we just overestimate and glorify human creativity, probably out of arrogance and fear of becoming obsolete. Almost everything humans create is derivative to some degree. I'd argue that extremely few artists in the entirety of human history have created something truly new. Everything is built on and influenced by previous art and the artist's experience. Sure, AI is not quite as complex as a good human artist. But give it some time. Like I said, AI evolves a lot faster than we do.

[–] atheken@programming.dev 1 points 10 months ago (1 children)

I think the “learning” process could be similar, but the issue is the scale.

No human artist could integrate the amount of material at the speed that these systems can. The systems are also by definition nothing but derivative. I think the process is similar, but there is important nuance that supports a different conclusion.

[–] state_electrician@discuss.tchncs.de 2 points 10 months ago (1 children)

I don't think the scale matters. Do we treat human artists who only read 10 books or watched 10 movies before creating something differently than the ones who consumed 1000 or 10000? For me the issue of who controls it is much more important. If something like ChatGPT were truly open-source and people could use it any way they want, I would have zero issues with the models being trained on everything that is available. We desperately need less copyright instead of more. Right now I think going after the big AI models with copyright is a double-edged sword. It's good to bring them down, but not at the cost of strengthening copyright.

[–] atheken@programming.dev 1 points 10 months ago

If these systems could only reorganize and regurgitate 1000 creative works, we would not be having this conversation. It’s literally because of the scale that this is even relevant. The scope of consumption by these systems, and how easy they are to access, is what makes the infringement/ownership question relevant.

We literally went through this exercise with fair use as it pertains to CD/DVD piracy in the ’90s, and Napster in the early 2000s. Individuals making copies were still robbing creative artists of royalties before those technologies existed, but the scale, ubiquity, and fidelity of those systems enabled large-scale infringement in a way that individuals copying/reproducing works previously could not.

I’m not saying these are identical examples, but the magnitude is a massive factor in why this issue needs to be regulated/litigated.