LessWrong (30+ Karma)

Apr 7, 2026 • 8min

“We’re actually running out of benchmarks to upper bound AI capabilities” by LawrenceC

Written quickly as part of the Inkhaven Residency. Opinions are my own and do not represent METR's official opinion. In early 2025, the situation for upper-bounding[1] model capabilities using fixed benchmarks was already somewhat challenging. As part of a trend of benchmarks being saturated at an ever-increasing rate, benchmarks that were incredibly challenging for AI in early 2024, such as GPQA, were being saturated scarcely a year later.

[Image: An oft-cited screenshot from Our World In Data (including in our time horizon blog post!), showing the ever-increasing pace of saturation for AI benchmarks.]

Thankfully, we saw a wave of alternative approaches to measuring AI agent capabilities: for example, at METR, we released both the Time Horizon methodology and a preliminary uplift study that found no significant productivity uplift from AI. As part of their frontier AI safety policies, AI developers such as Anthropic and OpenAI built newer, more extensive evaluations, such as BrowseComp and GDPval, to demonstrate that their AIs did not reach dangerous capability thresholds. Many research teams, both in academia and in industry, stepped up and created newer, ever more challenging agentic benchmarks, including τ²-Bench, MCP-Atlas, terminal-bench [...]

The original text contained 1 footnote which was omitted from this narration.

---

First published: April 6th, 2026

Source: https://www.lesswrong.com/posts/gfkJp8Mr9sBm83Rcz/we-re-actually-running-out-of-benchmarks-to-upper-bound-ai

---

Narrated by TYPE III AUDIO.
Apr 7, 2026 • 10min

“Don’t write for LLMs, just record everything” by RobertM

They debate whether public writing buys immortality or better future LLMs. They question whether pretraining actually makes models personally useful. They argue for giving models reusable artifacts instead of just prose. They propose logging conversations, keystrokes, and screens to create richer personal data for models. They discuss privacy, feasibility, and existing tooling.
Apr 7, 2026 • 6min

“Contra Nina Panickssery on advice for children” by Sean Herrington

A critique of advice for very smart children that warns against adopting an adversarial stance toward others. Short takes on when to follow the crowd and when to question it. Discussion of realistic limits, tailoring common advice to ability, and avoiding predictable mistakes. Emphasis on respectful debate, protecting your future self, and staying humble about being right sometimes and wrong other times.
Apr 7, 2026 • 6min

“By Strong Default, ASI Will End Liberal Democracy” by MichaelDickens

Michael Dickens, writer on AI safety and political fallout, argues that artificial superintelligence will break the balance of power that sustains liberal democracy. He explores how control of an ASI could defeat military or legislative checks. He runs through scenarios: a DARPA-built ASI seizing power, slow takeoff and distributed control as escape routes, and whether an aligned ASI could actually defend civil liberties.
Apr 6, 2026 • 30min

“AIs can now often do massive easy-to-verify SWE tasks and I’ve updated towards shorter timelines” by ryan_greenblatt

Ryan Greenblatt is an AI researcher who has updated toward shorter timelines for advanced systems. He explains why massive, easy-to-verify software tasks are now much easier for AIs and how that pushes his timeline estimates much earlier. He walks through what makes those tasks verifiable, limits like taste and scaffolding, hands-on attempts to automate safety research, and implications for large-scale SWE, R&D, and cyber work.
Apr 6, 2026 • 18min

“Paper close reading: “Why Language Models Hallucinate”” by LawrenceC

A step-by-step close reading of a paper that frames hallucinations as plausible guessing under uncertainty. Short checks compare model outputs against examples and benchmarks. The talk examines a reduction of generation errors to binary classification and debates whether learning theory supports that view. It also explores incentive changes to benchmarks as a possible mitigation.
Apr 5, 2026 • 10min

“Ten different ways of thinking about Gradual Disempowerment” by David Scott Krueger (formerly: capybaralet)

A clear-eyed tour of how automation, corporate incentives, and state power can slowly strip away human decision-making. Topics include capitalist and evolutionary forces that favor AI, metrics and standardization concentrating control, competitive pressure to outsource to untrusted but capable systems, and worries about skill erosion and broader systemic failures.
Apr 5, 2026 • 6min

“11 pieces of advice for children” by Nina Panickssery

Nina Panickssery, author of the piece, is a writer who offers practical, contrarian guidance for young people. She urges resisting conformity and thinking freely. She stresses honest appraisal of abilities, prioritizing what you care about, and learning from others to avoid common pitfalls. The narration is a concise, provocative manifesto for independent thinking and self-respect.
Apr 5, 2026 • 8min

“Steering Might Stop Working Soon” by J Bostock

J Bostock, author of the LessWrong post on steering LLMs, is a researcher analyzing activation steering and eval-awareness. He warns that single-vector steering may soon fail. He compares steering models to steering humans and explains why larger systems resist simple fixes. He surveys experiments showing the harms of steering and outlines more robust alternatives to consider.
undefined
Apr 5, 2026 • 5min

“Am I the baddie?” by Ustice

A software engineer recounts a frantic sprint using advanced AI models to blaze through dozens of tickets. They explain hacking together worktrees and repo tricks to run many threads in parallel. They describe automating planning and deployments with agentic workflows and building a dashboard to spin up agents. The story ends with uneasy moral questions about automation and job displacement.
