this post was submitted on 07 Feb 2025

350 points (98.1% liked)

Programmer Humor

DOGE employee (lemmy.world)

submitted 11 hours ago by SwordInStone@lemmy.world to c/programmerhumor@lemmy.ml

56 comments

[–] tetris11@lemmy.ml 10 points 4 hours ago* (last edited 4 hours ago) (2 children)

I have to admit, PDF parsing being such a hot and profitable topic in computer science was really something I never saw coming.

PDFs? The things you can select text from? And when not, there's decent OCR? And when not, you just ask the person to send you an email or a word doc?

It sounds like LLMs are looking for a new unpolluted source of historical data that they can learn from, and this source exists in the form of old scanned-in paper documents. That's the only reason I can fathom as to why this is such a big thing now.

[–] chicken@lemmy.dbzer0.com 3 points 2 hours ago

Every time I try to convert a PDF to epub or something, or OCR one that doesn't actually have selectable text, it turns out shit. I assume the real reason people would want to get LLMs involved is that there is actually a lot of ambiguity in what a correct conversion would be, and there are a lot of PDFs out there.

[–] sudo@programming.dev 3 points 3 hours ago

Training the most insane AI model on classified federal documents.

[–] sudo@programming.dev 5 points 3 hours ago

Like opening source code in Word.

[–] rustydrd@sh.itjust.works 13 points 5 hours ago (1 children)

Ah yes, the famed document <-> JSON converter.

[–] xthexder@l.sw0.com 8 points 4 hours ago

It's so easy! Watch:

{"contents": "<garbled .docx contents goes here>"}

[–] NicolaHaskell@lemmy.ml 32 points 7 hours ago

This is that special blend of Tablet Kid "I don't need to know things I can google them" and Rich Kid "I don't need to do things I can crowdsource them" that makes for that Distinctively VP "I don't know what I'm doing and nobody can tell 👈😎👉"

[–] Opisek@lemmy.world 43 points 8 hours ago (2 children)

Technically OCR is an application of machine learning.

Not an LLM, though.

[–] SwordInStone@lemmy.world 11 points 7 hours ago

A world of difference

[–] jsomae@lemmy.ml 2 points 7 hours ago

Regularizing the OCR'd form into a JSON/HTML file might be a good application of an LLM, though. Perhaps this is what they were asking about.

[–] HStone32@lemmy.world 72 points 10 hours ago (1 children)

The secret to success in software engineering:

1. Lie and say that there is
2. Write or use a conversion algorithm
3. Boss won't know the difference
4. Collect bonus at performance evaluation
5. Put "AI engineer" on resume

[–] pastel_de_airfryer@lemmy.eco.br 20 points 7 hours ago (1 children)

Boss thinks AI can code at senior developer level and fires you and the entire team

[–] HStone32@lemmy.world 10 points 6 hours ago

Never plan on staying at an SE job for longer than a few years. Not in a market that volatile.

[–] LGTM@discuss.tchncs.de 100 points 11 hours ago (1 children)

No need, there's an unmaintained JavaScript library for that (written by a 12-year-old)

[–] NocturnalMorning@lemmy.world 31 points 10 hours ago* (last edited 10 hours ago) (1 children)

Omg, sign me up! I'm gonna put that script in production for a server used by millions of customers around the world!

[–] ogeist@lemmy.world 14 points 8 hours ago (1 children)

Oh no, now there is a security audit and the generated PDF is insecure; the unpaid developer who has not logged in since 2015 has to fix this ASAP

[–] umbrella@lemmy.ml 10 points 8 hours ago (1 children)

what? invest money to pay for open source software? are you nuts?

[–] GenosseFlosse@feddit.org 2 points 4 hours ago* (last edited 4 hours ago)

"Just replace it with AI!!!"

[–] SnotFlickerman@lemmy.blahaj.zone 49 points 11 hours ago (1 children)

Initially, I didn't think these kids were fall guys.

Now I think they're fall guys.

[–] lka1988@sh.itjust.works 48 points 10 hours ago (1 children)

That was my thought. Young kids fresh out of school are really easy to manipulate into delusions of grandeur, especially when said delusions are offered by the richest person in the world. He's gonna leave them out for the wolves.

[–] SnotFlickerman@lemmy.blahaj.zone 39 points 10 hours ago (2 children)

Either that or Musk himself is truly so incompetent he thinks these kids are true geniuses. Honestly, with how things are going, that's a fiddy-fiddy chance, because Musk is somehow almost as unbelievably stupid as Trump.

[–] BartyDeCanter@lemmy.sdf.org 16 points 10 hours ago

Why not both?

[–] lka1988@sh.itjust.works 8 points 9 hours ago

You're probably not wrong, what with him awkwardly hopping around onstage at multiple trump rallies.

[–] traches@sh.itjust.works 52 points 11 hours ago (2 children)

Yes it’s an LLM called pandoc, you can run it locally

[–] brbposting@sh.itjust.works 7 points 7 hours ago

Learn more: openai.com/pandoc

[–] CaptDust@sh.itjust.works 34 points 11 hours ago (1 children)

You don't need a private nuclear plant to run it? Wow very efficient.

[–] avidamoeba@lemmy.ca 5 points 9 hours ago* (last edited 9 hours ago)

Black magic software.

[–] mesamunefire@lemmy.world 41 points 11 hours ago (3 children)

Imagine getting a job like this and now half the nation knows your name... that's terrifying. Being an intern may mean you have no idea of the true scope of what they are asking you to do.

[–] Pieisawesome@lemmy.dbzer0.com 15 points 8 hours ago

They are public employees who are changing things at the core of our government. Why wouldn’t we know their names?

Government employees' names aren’t secret (aside from a few exceptions), nor is their pay.

[–] crusa187@lemmy.ml 11 points 7 hours ago (1 children)

Yeah, seems that’s the point. Old enough to competently perform what they’re told, but too young to realize the gravity of the situation and how wrong it is to partake in it.

[–] eldavi@lemmy.ml 12 points 7 hours ago (1 children)

that's why we have 18-year-old soldiers ....

[–] ChapulinColorado@lemmy.world 6 points 7 hours ago

It’s ok, with the experience gained from being forced to grow up, some will come home and use their savings to buy a Dodge Ram on a 7-year loan at 18% APR.

[–] GrumpyDuckling@sh.itjust.works 15 points 9 hours ago

We know that his dad is an engineering professor at the University of Nebraska too. Really calls into question his credentials. I checked the other day and they had already removed his contact info from their website.

[–] tsiad_mordecai_miktros@lemmy.world 45 points 11 hours ago (1 children)

yes me send me what you want me to parse and i will get back to you in 3-4 business days

[–] eldavi@lemmy.ml 18 points 11 hours ago (1 children)

be sure to include the metadata too. lol

[–] OwlPaste@lemmy.world 14 points 11 hours ago (5 children)

And processing fee!

[–] CapriciousDay@lemmy.ml 24 points 11 hours ago

Oh yeah the hosted DeepSeek has that

[–] GissaMittJobb@lemmy.ml 18 points 10 hours ago (6 children)

Is this fake?

For context, this is the guy who figured out how to see what's written on some ancient Greek scrolls without destroying them. It seems slightly far-fetched that he wouldn't know better.

[–] cmgvd3lw@discuss.tchncs.de 15 points 11 hours ago (2 children)

Actually this is what they do.

[–] ininewcrow@lemmy.ca 11 points 10 hours ago

Just use Deepseek for US government data .... what could go wrong?
