
LessWrong (30+ Karma) “Why AI Evaluation Regimes are bad” by PranavG, Gabriel Alfour
How the flagship project of the AI Safety Community ended up helping AI Corporations.
I care about preventing extinction risks from superintelligence. This de facto makes me part of the “AI Safety” community, a social cluster of people who care about these risks.
In the community, a few organisations are working on “Evaluations” (which I will shorten to Evals). The most notable examples are Apollo Research, METR, and the UK AISI.
Evals form an influential cluster of safety work: auditors outside the AI Corporations racing toward ASI evaluate new AI systems before they are deployed and publish their findings.
Evals have become a go-to project for people who want to prevent extinction risks. I would say they are the primary project for those who want to work at the interface of technical work and policy.
Incidentally, Evals Orgs consistently avoid mentioning extinction risks. This makes them an ideal place for employees and funders who care about extinction risks but do not want to be public about them. (I have written about this dynamic in my article about The Spectre.)
Sadly, despite having gained such prominence in the “AI Safety” community, I believe that the [...]
---
Outline:
(00:13) How the flagship project of the AI Safety Community ended up helping AI Corporations.
(02:46) 1) The Theory of Change behind Evals is broken
(06:10) 2) Evals move the burden of proof away from AI Corporations
(09:38) 3) Evals Organisations are not independent of the AI Corporations
(15:55) Conclusion
---
First published:
March 12th, 2026
Source:
https://www.lesswrong.com/posts/Xxp6Tm8BKTkcb2m5M/why-ai-evaluation-regimes-are-bad
---
Narrated by TYPE III AUDIO.
---