this post was submitted on 18 Dec 2024
222 points (98.3% liked)

Programmer Humor

[–] andrew_bidlaw@sh.itjust.works 12 points 1 week ago (1 children)

As it learns from our data, no wonder it fucks up at regexps. They are the arcane knowledge not accessible to us mere mortals, nor to LLMs.

[–] ryathal@sh.itjust.works 9 points 1 week ago (1 children)

If you know even a little about how an LLM works, it's obvious why regex is basically impossible for it. I suspect Perl has similar problems, but no one is capable of actually validating that.

[–] ignotum@lemmy.world 3 points 1 week ago (1 children)

What do you mean it's impossible for it? I know how LLMs work, but I don't know of any such limitations.

Write me a regex that matches a letter repeated four times, followed by a 3 or 4 digit number

Here’s your regex: ([a-zA-Z])\1{3}\d{3,4}
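For anyone who wants to check that pattern rather than take it on faith, here is a quick Python sketch. The sample strings and the `matches_whole` helper are made up for illustration, and using `fullmatch` (so the whole string has to fit) is my assumption about what "matches" should mean here.

```python
import re

# Pattern quoted above: one letter, the same letter three more times
# (backreference \1 repeated 3x), then three or four digits.
PATTERN = re.compile(r"([a-zA-Z])\1{3}\d{3,4}")


def matches_whole(text: str) -> bool:
    """Return True if the entire string fits the pattern (hypothetical helper)."""
    return PATTERN.fullmatch(text) is not None


# Illustrative inputs, not taken from the thread.
samples = ["aaaa123", "ZZZZ4567", "aaab123", "aaaa12", "aaaa12345", "aAaa123"]
for sample in samples:
    print(sample, matches_whole(sample))
```

Note that the pattern as written has no anchors, so `PATTERN.search` would still find a match inside a longer string such as "aaaa12345". The backreference is also case-sensitive, so "aAaa123" does not count as the same letter four times.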

[–] ryathal@sh.itjust.works 4 points 1 week ago

They aren't context aware; they're using statistical probability. An LLM can replicate things it has seen a lot of, like a tutorial regex, but it can't apply that to build a more complicated one. Regex in the wild isn't really standard at all, because it's rarely used to solve common problems. The model has a bunch of random regexes from code it analyzed and will spit out something that looks similar.