OpenAI introduces Sora, its text-to-video AI model (www.theverge.com)

submitted 9 months ago* (last edited 9 months ago) by catculation@lemmy.zip to c/technology@lemmy.world

144 comments

https://openai.com/sora

Archive https://archive.is/V8Fv3

(page 2) 50 comments

[–] jeena@jemmy.jeena.net 6 points 9 months ago (1 children)

The cat video is funny: the cat has 5 legs :D

[–] troybot@midwest.social 2 points 9 months ago

Seeing the 5-legged cat was the moment I started to believe this stuff really was AI-generated.

[–] 1984@lemmy.today 5 points 9 months ago

I'm really impressed by the demo, but yes, let's see how well it works when it's made public.

People who don't think AI will take a lot of jobs may have to rethink...

[–] autotldr@lemmings.world 4 points 9 months ago

This is the best summary I could come up with:

Sora is capable of creating “complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” according to OpenAI’s introductory blog post.

The company also notes that the model can understand how objects “exist in the physical world,” as well as “accurately interpret props and generate compelling characters that express vibrant emotions.”

Many of the demo videos have some telltale signs of AI — like a suspiciously moving floor in a video of a museum — and OpenAI says the model “may struggle with accurately simulating the physics of a complex scene,” but the results are overall pretty impressive.

A couple of years ago, it was text-to-image generators like Midjourney that were at the forefront of models’ ability to turn words into images.

But recently, video has begun to improve at a remarkable pace: companies like Runway and Pika have shown impressive text-to-video models of their own, and Google’s Lumiere figures to be one of OpenAI’s primary competitors in this space, too.

OpenAI notes that the existing model might not accurately simulate the physics of a complex scene and may not properly interpret certain instances of cause and effect.

The original article contains 395 words, the summary contains 190 words. Saved 52%. I'm a bot and I'm open source!

[–] MonkderZweite@feddit.ch 2 points 9 months ago

I'm pretty sure that's a model tho.

[–] ton618@lemm.ee 2 points 9 months ago

The demo looks pretty good, yes, but I won't believe it 'til I try it!
