self

joined 2 years ago
[–] self@awful.systems 14 points 8 months ago (2 children)

our disappointing cyberpunk future where everything looks like Hollywood hacking because you’re just typing prompts at an LLM to generate stupid exploit scripts, but they all work because the people writing the software being exploited also don’t know what they’re doing

[–] self@awful.systems 17 points 8 months ago (6 children)

you can’t just hit me with fucking comedy gold with no warning like that (archive link cause losing this would be a tragedy)

So my natural thought process was, “If I’m using AI to write my anti-malware script, then why not the malware itself?”

Then as I started building my test VM, I realized I would need help with a general, not necessarily security-focused, script to help set up my testing environment. Why not have AI make me a third?

[…]

cpauto.py — the general IT automation script

First, I created a single junk file to actually encrypt. I originally made 10 files that I was manually copy pasting, and in the middle of that, I got the idea to start automating this.

this one just copies a file to another file, with an increasing numerical suffix on the filename. that’s an easily-googled oneliner in bash, but it took the article author multiple tries to fail to get Copilot to do it (they had to modify the best result it gave to make it work)
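for reference, the kind of one-liner I mean (filenames are hypothetical; this assumes you already have a single junk.txt to copy):

    for i in $(seq 1 10); do cp junk.txt "junk_$i.txt"; done

ten copies, numbered suffixes, done. that’s the whole job being described.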

rudi_ransom.py (rudimentary ransomware)

I won’t lie. This was scary. I made this while I was making lunch.

this is just a script that iterates over all the files it can access, saves a version encrypted against a random (non-persisted, they couldn’t figure out how to save it) key with a .locked suffix, deletes the original, changes their screen locker message to a “ransom” notice, and presumably locks their screen. that’s 5 whole lines of bash! they won’t stop talking about how they made this incredibly terrifying thing during lunch, because humblebragging about stupid shit and AI fans go hand in hand.

rrw.py (rudimentary ransomware wrecker)

This was honestly the hardest script to get working adequately, which compounds upon the scariness of this entire exercise. Again, while I opted for a behavior-based detection anti-ransomware script, I didn’t want it to be too granular so it could only detect the rudi_ransom.py script, but anything that exhibits similar behavior.

this is where it gets fucking hilarious. they use computer security buzzwords to describe such approaches as:

  • trying and failing to kill all python3 processes (so much for a general approach)
  • killing the process if its name contains the string “ransom”
  • using inotify to watch the specific directory containing their test files for changes, and killing any process that modifies those files (a rough sketch of that idea follows this list)
  • killing any process that opens more than 20 files (hahaha good fucking luck)
  • killing any process that uses more than 5% CPU that’s running from their test directory
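for the curious, that inotify approach is basically an inotifywait loop plus fuser. a bash sketch of the general idea (not their actual script, which was python; the directory name is made up, and fuser -k is exactly as indiscriminate as it sounds):

    #!/usr/bin/env bash
    # watch a hypothetical test directory and kill whatever still has a just-changed file open
    WATCH_DIR="$HOME/ransom-test"
    inotifywait -m -e modify -e create --format '%w%f' "$WATCH_DIR" |
    while read -r changed_file; do
        # kill any process holding the file open; legitimate processes very much included
        fuser -k "$changed_file" 2>/dev/null
    done

note that inotify events don’t tell you which process did the write, so you’re reduced to killing anything that happens to have the file open at that moment.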

at one point they describe an error caused by the LLM making shit up as progress. after that, the LLM outputs a script that starts killing random system processes.

so, after 42 tries, did they get something that worked?

I was giving friends and colleagues play-by-plays as I was testing various iterations of the scripts while writing this blog, and the consensus opinion was that what I was able to accomplish with a whim was terrifying.

I’m not going to lie, I tend to agree. It’s scary that I was able to create the ransomware/data wiper script so quickly, but it took many hours, several days, 42 different versions, and even more minor edits to fail to stop said ransomware script from executing or kill it after it did. I’m glad the static analysis part worked, but that has a high probability of causing accidental deletions from false positives.

I just want to reiterate that I had my AI app generate my ransomware script while I was making lunch

of course they fucking didn’t

[–] self@awful.systems 25 points 8 months ago (2 children)

finally, the guy who willingly named himself BasedBeffJezos and posts the stupidest fucking things I’ve ever seen has pivoted his startup to the grift that you can replicate faster and more reliably on a Commodore 64

[–] self@awful.systems 11 points 8 months ago (2 children)

we have some running commentary on FreeAssembly, a smaller community for open source collaboration on the same instance as TechTakes. I don’t mind having posts about the open letter here too though; what happens to Nix will have an impact on our instance, because it’s a NixOS deployment and because I (the infrastructure admin) have previously been fairly active in the Nix community.

for the record (and to summarize the other thread), there’s not a lot of gray to my stance: anduril, eelco, and friends can fuck off, and I regret letting folks convince me that the systemic problems I’d heard about from marginalized folks in the Nix community were minor or solved. we unfortunately know what the Nix project will look like if it continues down this path: it’ll become something a lot like Urbit, which is a grim fucking fate for something I think is still technologically worthwhile.

[–] self@awful.systems 5 points 8 months ago (3 children)

red tidday up, white tidday down, last done squat gonna tear the ground

[–] self@awful.systems 10 points 8 months ago

damn he better get some question marks on those allegations before dang sees

[–] self@awful.systems 18 points 8 months ago

I’ve suggested the orange site changes the title to “In response to Google?” to keep up with latest practices around there.

fuck yes it begins

[–] self@awful.systems 6 points 8 months ago
[–] self@awful.systems 14 points 8 months ago (4 children)

all of the developers I know at AI-related startups identify as researchers, regardless of their actual role

the underlying hardware, OS, VMs etc.

no, let’s not blame unaffiliated systems engineers for this dumb shit, thanks

[–] self@awful.systems 27 points 8 months ago (11 children)

the inputs required to cause this are so basic, I really want to dig in and find out if this is a stupid attempt to make the LLM better at evaluating code (by doing a lazy match on the input for “evaluate” and using the LLM to guess the language) or intern-level bad code in the frameworks that integrate the LLM with the hosting websites. both paths are pretty fucking embarrassing mistakes for supposedly world-class researchers to make, though the first option points to a pretty hilarious amount of cheating going on when LLMs are supposedly evaluating and analyzing code in-model.

[–] self@awful.systems 6 points 8 months ago

found the original post! https://mastodon.social/@kennwhite/112290497758846218 the prompt to make them execute code is incredibly basic. no idea right now if the exploit is in the chatbot framework or the model itself though
