Hi, I’m Eric and I work at a big chip company making chips and such! I do math for a job, but it’s cold hard stochastic optimization that makes people who know names like Tychonoff and Sylow weep.

My pfp is Hank Azaria in Heat, but you already knew that.

  • 8 Posts
  • 81 Comments
Joined 2 years ago
Cake day: January 22, 2024


  • Kind of knew that after Claude Plays Pokémon went semi-viral, it was going to immediately get Goodhart’d. I also saw the usual doomers go BY END OF YEAR AGENTS WILL BEAT POKEMON, which I thought was a severe underestimate at the time. They were undoubtedly basing their projection on the Anthropic people who posted a little chart showing how far each version of Claude made it, waiting for Pokémon-playing skill to emerge from larger and larger models, instead of thinking, hmm, they are iteratively refining the customized tools as it gets stuck. Then after Gemini ‘beat’ the game, I read a disappointed response from an RL guy who said that after trying to replicate the results, they concluded Google’s setup was basically 90% harness for the model, 10% model, despite the Google team basically implying it was raw pixels-to-action.
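    To make the “90% harness” complaint concrete, here is a minimal self-contained sketch of the pattern. Everything in it is invented for illustration (the state dict, the stub model, the scripted button expansion), not Anthropic’s or Google’s actual setup: the harness parses game state and does the pathfinding, while the model only answers a tiny multiple-choice question.

    ```python
    import random

    def read_state():
        """Harness work: parse emulator RAM into clean symbolic state."""
        return {"pos": (3, 7), "goal": (9, 7), "hp": 0.8}

    def model_choose(state):
        """The model's entire job: pick one high-level action from a menu.
        A random stub here; in the real setups this would be the LLM call."""
        options = ["MOVE_TO_GOAL", "HEAL"] if state["hp"] < 0.3 else ["MOVE_TO_GOAL"]
        return random.choice(options)

    def expand_to_buttons(state, choice):
        """Harness work again: scripted logic turns one token of model output
        into many button presses, with no model involvement at all."""
        if choice == "HEAL":
            return ["OPEN_MENU", "SELECT_POTION", "CONFIRM"]
        (x, y), (gx, gy) = state["pos"], state["goal"]
        return ["RIGHT"] * (gx - x) + ["DOWN"] * (gy - y)

    state = read_state()
    choice = model_choose(state)
    print(choice, "->", expand_to_buttons(state, choice))
    ```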

  • “I couldn’t find further holes in it”

    Here’s a couple:

    1. iirc it claims we’ll have reliable “agents” in mid-2025. Fellas, it’s almost June in the year of the “agents” and frankly I don’t see shit. We are not starting strong here.
    2. They predict a 10k-person anti-AI protest in DC. For context, the recent “Hands Off” protest in DC saw 100k turnout, and the Israel/Palestine protests saw 300k in DC in 2023. A ten-thousand-person protest isn’t really anything out of the ordinary? It’s almost like the authors have never been to a protest and don’t understand collective action because they live in a bubble or something. But they assure us this document is thoroughly researched. Maybe their point was self-deprecating: “woe is us, only 10k people show up :(”
    3. When they get into their super-AGI fanfic, they describe Agent-n as “never stops training,” continuously learning from the environment. The only way I can read this is that somehow we discover paradigm-shifting algorithms by coincidence in the next couple of years that make DL obsolete, so we can abandon the train-then-infer approach and instead have an embodied entity constantly taking feedback from the environment to “train”, yet the system itself is still described under the massive, data-center-heavy DL framework (see the sketch after this list). It’s like they know that bio intelligence has this continuous feedback mechanism, so obviously AI researchers will just patch that in, how hard can it be?
    4. Ong, I swear at some point they just put in there “hallucinations are solved”, the thing they have been claiming will be solved in the next month since 2023.
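    On point 3, the contrast in toy form, as promised. This is a self-contained numpy sketch with invented numbers, not anyone’s actual training recipe: standard deployment freezes the weights, while the scenario’s “never stops training” agent would have to fold every interaction back into them.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    w = rng.normal(size=3)  # "weights" of a toy linear model

    def predict(x):
        return x @ w

    # Standard DL deployment: weights are frozen after training, so every
    # environment interaction runs through the same w forever.
    def deployed_step(x):
        return predict(x)

    # The scenario's agent instead folds every single observation back into
    # the weights: online SGD that literally never stops.
    def never_stops_training_step(x, y, lr=0.01):
        global w
        err = predict(x) - y
        w = w - lr * err * x  # one gradient step per interaction
        return predict(x)

    x, y = rng.normal(size=3), 1.0
    print(deployed_step(x), never_stops_training_step(x, y))
    ```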

  • “Daniel Kokotajlo, the actual ex-OpenAI researcher”

    Unclear to me what Daniel actually did as a ‘researcher’ besides draw a curve going up on a chalkboard (true story: the one interaction I had with LeCun was showing him Daniel’s LW account, which is just singularity posting, and Yann thought it was big funny). I admit I am guilty of engineer-gatekeeping posting here, but I always read Danny boy as a guy they hired to give lip service to the whole “we are taking safety very seriously, so we hired LW philosophers” bit, and then after Sam did the uno-reverse coup, he dropped all pretense of giving a shit / funding their fanfic circles.

    Ex-OAI “governance” researcher just means they couldn’t forecast that they were the marks all along. This is my belief, unless he reveals that he superforecasted that Altman would coup and sideline him back in 1998. Someone please correct me if I’m wrong and there is evidence that Daniel actually understands how computers work.

  • “Deep thinker asks why?”

    Thus spoketh the Yud: “The weird part is that DOGE is happening 0.5-2 years before the point where you actually could get an AGI cluster to go in and judge every molecule of government. Out of all the American generations, why is this happening now, that bare bit too early?”

    Yud, you sweet naive smol uwu babyesian boi, how gullible do you have to be to believe that a) it’s T-minus 6 months to AGI, kek (do people track these dogshit predictions?), and b) the purpose of DOGE is just accountability, and definitely not the weaponized manifestation of techno-oligarchy ripping apart our society for the copper wiring in the walls?
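    For the record, tracking these predictions is trivial; people just don’t do it. The standard tool is the Brier score: mean squared error between the stated probability and the outcome, where 0 is perfect and 0.25 is coin-flipping. A toy sketch with an invented track record (the probabilities and outcomes below are made up for illustration):

    ```python
    def brier(forecasts):
        """Mean squared error between claimed probability and outcome."""
        return sum((p - o) ** 2 for p, o in forecasts) / len(forecasts)

    # (claimed probability, what actually happened: 1 = yes, 0 = no);
    # entries below are invented examples, not anyone's documented record
    track_record = [
        (0.9, 0),  # "reliable agents by mid-2025"
        (0.8, 0),  # "hallucinations solved next month"
        (0.7, 0),  # "AGI cluster audits the government in 0.5-2 years"
    ]
    print(f"Brier score: {brier(track_record):.3f}")  # 0.647; chance is 0.25
    ```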

  • Reposting this for the new week thread since it truly is a record of how untrustworthy Sammy and co. are. Remember how OAI claimed that o3 had displayed superhuman levels on the mega-hard FrontierMath exam written by Fields Medalists? Funny / totally-not-fishy story, haha: it turns out OAI had exclusive access to that test for months, funded its creation, and refused to let the creators of the test publicly acknowledge this until after OAI did their big stupid magic trick.

    From Subbarao Kambhampati via LinkedIn:

    "𝐎𝐧 𝐭𝐡𝐞 𝐬𝐞𝐞𝐝𝐲 𝐨𝐩𝐭𝐢𝐜𝐬 𝐨𝐟 “𝑩𝒖𝒊𝒍𝒅𝒊𝒏𝒈 𝒂𝒏 𝑨𝑮𝑰 𝑴𝒐𝒂𝒕 𝒃𝒚 𝑪𝒐𝒓𝒓𝒂𝒍𝒍𝒊𝒏𝒈 𝑩𝒆𝒏𝒄𝒉𝒎𝒂𝒓𝒌 𝑪𝒓𝒆𝒂𝒕𝒐𝒓𝒔” hashtag#SundayHarangue. One of the big reasons for the increased volume of “𝐀𝐆𝐈 𝐓𝐨𝐦𝐨𝐫𝐫𝐨𝐰” hype has been o3’s performance on the “frontier math” benchmark–something that other models basically had no handle on.

    We are now being told (https://lnkd.in/gUaGKuAE) that this benchmark data may have been exclusively available (https://lnkd.in/g5E3tcse) to OpenAI since before o1–and that the benchmark creators were not allowed to disclose this *until after o3 *.

    That o3 does well on frontier math held-out set is impressive, no doubt, but the mental picture of “𝒐1/𝒐3 𝒘𝒆𝒓𝒆 𝒋𝒖𝒔𝒕 𝒃𝒆𝒊𝒏𝒈 𝒕𝒓𝒂𝒊𝒏𝒆𝒅 𝒐𝒏 𝒔𝒊𝒎𝒑𝒍𝒆 𝒎𝒂𝒕𝒉, 𝒂𝒏𝒅 𝒕𝒉𝒆𝒚 𝒃𝒐𝒐𝒕𝒔𝒕𝒓𝒂𝒑𝒑𝒆𝒅 𝒕𝒉𝒆𝒎𝒔𝒆𝒍𝒗𝒆𝒔 𝒕𝒐 𝒇𝒓𝒐𝒏𝒕𝒊𝒆𝒓 𝒎𝒂𝒕𝒉”–that the AGI tomorrow crowd seem to have–that 𝘖𝘱𝘦𝘯𝘈𝘐 𝘸𝘩𝘪𝘭𝘦 𝘯𝘰𝘵 𝘦𝘹𝘱𝘭𝘪𝘤𝘪𝘵𝘭𝘺 𝘤𝘭𝘢𝘪𝘮𝘪𝘯𝘨, 𝘤𝘦𝘳𝘵𝘢𝘪𝘯𝘭𝘺 𝘥𝘪𝘥𝘯’𝘵 𝘥𝘪𝘳𝘦𝘤𝘵𝘭𝘺 𝘤𝘰𝘯𝘵𝘳𝘢𝘥𝘪𝘤𝘵–is shattered by this. (I have, in fact, been grumbling to my students since o3 announcement that I don’t completely believe that OpenAI didn’t have access to the Olympiad/Frontier Math data before hand… )

    I do think o1/o3 are impressive technical achievements (see https://lnkd.in/gvVqmTG9 )

    𝑫𝒐𝒊𝒏𝒈 𝒘𝒆𝒍𝒍 𝒐𝒏 𝒉𝒂𝒓𝒅 𝒃𝒆𝒏𝒄𝒉𝒎𝒂𝒓𝒌𝒔 𝒕𝒉𝒂𝒕 𝒚𝒐𝒖 𝒉𝒂𝒅 𝒑𝒓𝒊𝒐𝒓 𝒂𝒄𝒄𝒆𝒔𝒔 𝒕𝒐 𝒊𝒔 𝒔𝒕𝒊𝒍𝒍 𝒊𝒎𝒑𝒓𝒆𝒔𝒔𝒊𝒗𝒆–𝒃𝒖𝒕 𝒅𝒐𝒆𝒔𝒏’𝒕 𝒒𝒖𝒊𝒕𝒆 𝒔𝒄𝒓𝒆𝒂𝒎 “𝑨𝑮𝑰 𝑻𝒐𝒎𝒐𝒓𝒓𝒐𝒘.”

    We all know that data contamination is an issue with LLMs and LRMs. We also know that reasoning claims need more careful vetting than “𝘸𝘦 𝘥𝘪𝘥𝘯’𝘵 𝘴𝘦𝘦 𝘵𝘩𝘢𝘵 𝘴𝘱𝘦𝘤𝘪𝘧𝘪𝘤 𝘱𝘳𝘰𝘣𝘭𝘦𝘮 𝘪𝘯𝘴𝘵𝘢𝘯𝘤𝘦 𝘥𝘶𝘳𝘪𝘯𝘨 𝘵𝘳𝘢𝘪𝘯𝘪𝘯𝘨” (see “In vs. Out of Distribution analyses are not that useful for understanding LLM reasoning capabilities” https://lnkd.in/gZ2wBM_F ).

    At the very least, this episode further argues for increased vigilance/skepticism on the part of AI research community in how they parse the benchmark claims put out commercial entities."

    Big stupid snake oil strikes again.
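    For anyone wondering why prior access matters so much: contamination is mundane to check for when you actually have the data. A crude version is just n-gram overlap between training text and benchmark items. The sketch below is a toy illustration (the corpus and problem strings are invented), not FrontierMath’s or OpenAI’s actual pipeline.

    ```python
    def ngrams(text, n=5):
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

    def overlap(train_corpus, benchmark_item, n=5):
        """Fraction of the benchmark item's n-grams seen in the training text."""
        item = ngrams(benchmark_item, n)
        if not item:
            return 0.0
        return len(item & ngrams(train_corpus, n)) / len(item)

    corpus = "let p be an odd prime and consider the group of units modulo p"
    problem = "let p be an odd prime and compute the units modulo p squared"
    print(f"overlap: {overlap(corpus, problem):.2f}")  # nonzero -> investigate
    ```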