LLMs continue to be so good and wagmi that they've progressed to the serving ads part of the extractivist SaaS lifecycle
this isn’t surprising at all, but some of the details are interesting: Server found in apartment funded by Russian government used AI to interfere with 2024 US elections
LLMs really are designed for this kind of thing, aren’t they?
ellison wants to compete with thiel for title of chief boot-wielder https://archive.is/cOnPx
I can't help but feel like for Ellison in particular, he must have given himself no choice but to believe this stuff is more capable than it is. He's 80 years old now, and if building towards honest-to-god "real AI" wasn't what his whole career was about, then what was the point? The twilight of the older generations of tech executives is going to be its own special kind of pathology.
and if building towards honest-to-god “real AI” wasn’t what his whole career was about, then what was the point?
For Larry? Building a corporation that will last a thousand years, fueled by greed and contempt for developer and consumer alike, and which would make nazis blush for its industrial disregard for ethics in pursuit of profit. He did build a legacy for himself. I'll go to my grave cursing his name and he'll hear it from the depths of hell and smile.
Nobody outside the company has been able to confirm whether the impressive benchmark performance of OpenAI's o3 model represents a significant leap in actual utility or just a significant gap in the value of those benchmarks. However, they have released information showing that the most ostensibly-powerful model costs orders of magnitude more. The lede is in that first graph, which shows that, whatever the performance gain, o3 costs over ~$10 per request, with the headline-grabbing version costing ~$1500 per request.
I hope they've been able to identify a market willing to pay out the ass for performance that, even if it somehow isn't overhyped, is roughly equivalent to an average college graduate.
if all of that $1500 cost is electricity, and at arbitrarily chosen but probably high electricity price of $0.2/kWh, that's 7.5MWh per request. could be easily twice that. this is approx how much electricity four 4-person households consume in a year in poland. or about half of american one. six tons of TNT equivalent, or almost 2/3 ton of oil equivalent if you prefer
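For anyone who wants to check the arithmetic, here is a quick sketch. It takes the comment's own assumptions at face value ($1500 per request, all of it electricity, $0.20/kWh) and uses the standard conversion factors for TNT and oil equivalent; the function name is just made up for illustration.

```python
# Assumptions taken from the comment above, not measured figures:
COST_PER_REQUEST_USD = 1500.0   # headline o3 configuration, per request
PRICE_USD_PER_KWH = 0.20        # deliberately high electricity price

# Standard conversion factors.
JOULES_PER_KWH = 3.6e6
JOULES_PER_TON_TNT = 4.184e9    # 1 ton of TNT equivalent
JOULES_PER_TOE = 41.868e9       # 1 tonne of oil equivalent


def energy_for_cost(cost_usd: float, price_per_kwh: float) -> dict:
    """Convert a dollar cost into energy, assuming all of it buys electricity."""
    kwh = cost_usd / price_per_kwh
    joules = kwh * JOULES_PER_KWH
    return {
        "kwh": kwh,
        "mwh": kwh / 1000.0,
        "tons_tnt": joules / JOULES_PER_TON_TNT,
        "toe": joules / JOULES_PER_TOE,
    }


result = energy_for_cost(COST_PER_REQUEST_USD, PRICE_USD_PER_KWH)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under those assumptions this gives 7.5 MWh, a bit over six tons of TNT, and just under two thirds of a tonne of oil, which matches the figures above.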
Actually wait I'm pretty sure it's even worse because I'm terrible at reading logarithmic scales. It's roughly halfway between $1,000 and $10,000 on their log scale, which if I do the math while actually awake works out closer to $3,000.
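The log-scale reading can be sanity-checked with a couple of lines. This is only a sketch of the interpolation: a point a given fraction of the way between two ticks on a log axis sits at the geometric, not arithmetic, interpolation of the tick values. The helper name is hypothetical.

```python
def log_scale_value(low: float, high: float, fraction: float) -> float:
    """Value at `fraction` of the distance between two ticks on a log axis."""
    return low * (high / low) ** fraction


# "Roughly halfway" between the $1,000 and $10,000 ticks.
midpoint = log_scale_value(1_000, 10_000, 0.5)
print(f"${midpoint:,.0f}")
```

Halfway works out to 1000 times the square root of 10, so a little over $3,000 rather than the $5,500 a linear reading would suggest.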
I'm wondering about the benchmark too. It's way above my level to figure out how it can be gamed. But, buried in the article:
Moreover, ARC-AGI-1 is now saturating – besides o3's new score, the fact is that a large ensemble of low-compute Kaggle solutions can now score 81% on the private eval.
The most expensive o3 version achieved 87.5%
as an amuse-bouche for the horrors that will follow this year, please enjoy this lobste.rs user reaching the meltdown end stage after going full Karen at someone who agrees with a submitted post saying LLMs are a dead end when it comes to AI.
https://lobste.rs/s/lgqwje/does_current_ai_represent_dead_end#c_tefto4
Thankfully, accusing someone of being a crapto promoter is seen as an attack that is beyond the pale.
Highlights from the rest of the thread include bemoaning the lack of a downvote button for registering disapproval:
https://lobste.rs/s/lgqwje/does_current_ai_represent_dead_end#c_ft9mpj
unilaterally deciding to reply multiple times to one comment, necessitating a meta comment with hyperlinks
https://lobste.rs/s/lgqwje/does_current_ai_represent_dead_end#c_jjk5ei
And of course they're a MoreWronger (moroner?)
Lol of course they think they are civil and other people are pushing nasty rhetoric. Quite the sealion feeling.
Wonder if they even notice how much communication weirdness they themselves used, with the emphasis on emotionally laden language. (They didn't use bold so I can't call it crank capitalization, but more crank cursive. A big deal for me! ;) )
Anyway the questioning of "how do you know this is why there is no downvoting" shows the type of person they are. (And is quite the annoying Rationalist behavior, suddenly they demand excessive sourcing for small remarks by people they disagree with).
I have landed on a "you can get fucked if you make this annoying for me, I don't need your product anyway" response to everything. The silver lining is that I will be dealing with way more bullshit while being just as angry all the time at everything.
noodling on a blog post - does anyone with more experience of LW/EA than me know if "AI safety" people are referencing the invention of nuclear weapons as a template for regulating/forbidding "AGI"?
choose your silicon valley thinkboi
edit: goddammit istewart got in first because we both saw this on the zitron discord
via this I just learned that google's about[0] to open the taps on fingerprinting allowance for advertisers
that'll go well.
I realize that a lot of people in the rtb space already spend an utterly obscene amount of effort and resources to try to do this shit in the first place, but jesus, this isn't even pretending. guess their projections for ad revenue must be looking real scary!
edit [0] - "about", as in next month. and they announced it last month.
The Google post appears to be "Updating our platform policies to reflect innovations in the ads ecosystem".
I have no idea what the heck those words mean (it appears to be some bizarro form of English), so I diffed the policy itself. Here are the parts I found notable.
This will be removed:
You must not use device fingerprints or locally shared objects (e.g., Flash cookies, Browser Helper Objects, HTML5 local storage) other than HTTP cookies, or user-resettable mobile device identifiers designed for use in advertising, in connection with Google's platform products. This does not limit the use of IP address for the detection of fraud.
This will be removed:
You must not pass any information to Google [...] that permanently identifies a particular device (such as a mobile phone's unique device identifier if such an identifier cannot be reset).
This will be added:
You must disclose clearly any data collection, sharing and usage that takes place in connection with your use of Google products, including information about the technologies used, such as your use of cookies, web beacons, IP addresses, or other identifiers. This applies for data collection, sharing and usage on any platform, surface or property (e.g., web, app, Connected TV, gaming console or email publication).
you just gotta love how vacuously pointless the wording is
You must disclose
google-rfc "must": "we want something we can bend you over a barrel with if you're caught out by one, but that's all we'll bother committing because otherwise it eats into our lovely extortion profits"
Also I'm having a fun time imagining an accurate device fingerprinting disclosure from someone who was really really thorough.
Not-A-Cookie-I-Swear Technologies LTD may collect the following information:
Don't worry none of it is a cookie :D
- Your User-Agent
- Your browser's language / locale
- The state of the service-worker associated with Not-A-Cookie-I-Swear Technologies LTD's website
- Whether your "mouse" movements look more like a mouse, trackpoint, gamepad, joystick or touchscreen according to our heuristics
- The current JavaScript time
- Whether your browser prefers dark mode or not
- Whether your browser reports itself as screen or print media
- The device size, device pixel ratio, frame size, and frame position reported by your browser
- Your browser's HTTP request headers
- The success or failure of fetching a URL included in the Easylist ad-block list
- Whether or not an element associated with the Easylist element hiding list was hidden or not
- Your IP address
- The result of tracerouting your IP address from one of our servers
- Browser Local and/or Session Storage
- The state of the WebSQL and/or IndexedDB database for our website
- The state of the OPFS filesystem store associated with our website
- Whether or not there was an HTTP cache hit for our website
- Whether or not there was a DNS entry cached for our website
- A hash of the pixels in a WebGL and/or WebGPU scene
- The browser's default styling
- The browser's minimum font size
- The browser's default font family
- The font file chosen for a variety of character (or ligature) and font-family combinations
- A hash of the pixels of a canvas with a variety of font families and shapes written into it
- A report on the presence or absence of various browser CVEs in your browser
- Information about any other open tabs that happen to include technologies from Not-A-Cookie-I-Swear Technologies LTD
- What video, audio, and/or image codecs are supported by your browser
- Whether or not your browser enables video auto play (and whether or not it's muted by default)
- Whether your browser supports MathML or not
- Whether your browser recognizes any origin trials that Not-A-Cookie-I-Swear Technologies LTD happens to have opted into at any given time
- The behavior of your browser against various web standards edge cases or the presence or absence of features in draft web standards (e.g. Web Platform Tests or Can-I-Use tests)
- Whether or not your browser supports Widevine video DRM
- Various browser performance characteristics
- All key press events
- Various form auto-fill data (if triggered)
- Any mouse down, mouse move, or mouse up events
- A rough geolocation calculated by examining the relative latency of fetches to a number of geographically distributed web servers
- The presence or absence of various browser plugins developed by, purchased by, or affiliated with Not-A-Cookie-I-Swear Technologies LTD (and any data therein as agreed to by the extension permissions dialog -- up to and including microphone, webcam, or full page DOM)
Some stuff in this list is me being silly, but overall it shows that the talk about "privacy-enhancing technologies" is premature on the web platform. The web has been trying to have better privacy defaults over time; but there's a long legacy of features from before this was considered as much, as well as Google tossing around their weight in the web standards and browser space.
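To make the point about the list concrete, here is a minimal sketch of how a pile of individually boring signals turns into a stable identifier with no cookie involved. Everything here is hypothetical: the signal names and values are invented for illustration, and real fingerprinting scripts are a lot more elaborate (fuzzy matching, weighting by entropy, and so on). This just hashes a canonical serialization of whatever was collected.

```python
import hashlib
import json


def fingerprint(signals: dict) -> str:
    """Hash a set of collected signals into a stable identifier.

    Keys are sorted so the same signals always produce the same hash,
    regardless of the order in which they were gathered.
    """
    canonical = json.dumps(signals, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Invented example values for a handful of the signals listed above.
visitor_a = {
    "user_agent": "ExampleBrowser/1.0",
    "locale": "pl-PL",
    "prefers_dark_mode": True,
    "device_pixel_ratio": 2,
    "frame_size": [1280, 720],
    "min_font_size": 10,
    "canvas_hash": "hypothetical-canvas-hash",
}

# Same machine, one tiny preference changed.
visitor_b = dict(visitor_a, min_font_size=12)

print(fingerprint(visitor_a))
print(fingerprint(visitor_b))
```

Nothing is stored on the device, so there is nothing for the user to clear or reset. The same visitor produces the same hash on every visit, and a single differing signal is enough to tell two otherwise identical visitors apart.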
now i wonder how much of that is blocked by firefox enhanced tracking protection. not all, of course, and it's probably much more than needed for a unique identifier. there's a mozilla security blog post on this topic that says some anti-fingerprinting measures were built in all the way back in 2020 (firefox 72)
Above I listed a bunch of things which would help narrow down browser version, but that's hopeless anyway -- an adversary will probably be able to figure out your rough browser version even if you fake the UA string, and that you're running in anti-fingerprinting mode.
So assuming that's out of scope I think these are probably the big categories:
- Normalize any system information presented to the webpage (e.g. remove minor version from UA header, remove OS from UA header, etc)
- Canvas, WebGL, and WebGPU need to be implemented in software in a deterministic way. Similarly any compositing (including stuff like font shaping, SVG rendering, page layout) must be done in software (prevent GPU fingerprinting)
- A fixed font set must be used rather than using the system font set (prevent fingerprinting font enthusiasts)
- The device size / frame size (and position) must be lied about (e.g. rounded to a common resolution or a multiple of 100px), and layout adjusted appropriately (Mozilla calls this "Letterboxing") (prevent fingerprinting psychos who don't run their browser in fullscreen mode).
- Page storage should be disabled or cleared (local / session storage, cookies, service workers, indexeddb, etc) (A cookie by any other name would taste as sweet)
- Caching is a big problem, probably have to disable it entirely (including HTTP caching, HTTP caching at the ISP level*, DNS lookups, favicons, JavaScript compilation cache) (Pesky pesky global state).
- Performance metrics are another big problem. Disabling JavaScript would go a long way here but you probably can't prevent them entirely unless you're prepared to go to unhealthy extremes** (this is like the past 10 years of cutting edge security research so we're doomed)
- Disable any plugins or other customizations which may provide a fingerprint accessible to the webpage (oops it turned out the FBI caught me because I configured my browser to inject pictures of cute bunnies into every webpage).
- And of course IP address, which you presumably want to do something about (proxy?)
That said while I've worked with browsers, I'm not in the biz of fingerprinting or anti-fingerprinting, so there's surely stuff I haven't thought of.
* Actually we should probably just disable non-HTTPS entirely...
** Running under a VM is probably the minimum required to mitigate the chances of cutting-edge side-channel timing attacks from James Bond level adversaries, but at that point maybe you just want a dedicated browsing computer heh. I did chuckle at the idea of someone trying to apply cryptographic constant-time algorithm techniques to writing a browser though.
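As a small illustration of the "lie about the frame size" item in the list above, here is a sketch of the rounding idea. To be clear about assumptions: this is not Mozilla's actual Letterboxing algorithm, just the simplest version of "round down to a multiple of 100px", with a hypothetical function name and a floor of one step so the reported size never hits zero.

```python
def letterbox(width: int, height: int, step: int = 100) -> tuple[int, int]:
    """Round a real content size down to a multiple of `step` pixels.

    The page would be told the rounded size, and the leftover space
    rendered as blank margins around the content.
    """
    rounded_width = max(step, (width // step) * step)
    rounded_height = max(step, (height // step) * step)
    return rounded_width, rounded_height


# Two slightly different real window sizes end up reporting the same size.
print(letterbox(1366, 683))
print(letterbox(1301, 699))
```

The point of the design is bucketing: lots of distinct real window sizes collapse into a small number of reported sizes, so the value carries far fewer identifying bits.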