LessWrong (30+ Karma)

Mar 14, 2026 • 32min

“Most likely you won’t be able to perform a data-driven self-improvement” by siarshai

Suppose you’re not happy with the quality of your sleep. You’ve already stopped doing the obviously harmful things (no more coffee at night), and your sleep has improved - but you’d like to work on it further. A coworker gives you an herbal mix with St. John's wort and lavender. You try drinking it at night instead of coffee, and it does seem that sometimes your sleep really does get deeper than before. But sometimes it doesn’t. You’re willing to experiment, but how do you actually check whether the herbs work, or whether it's just random variation? Or suppose you’re not particularly satisfied with your productivity at work. Following the advice from "Atomic Habits" and books on workflow organization, you’ve introduced a few useful micro‑habits and ergonomic improvements. But what do you do once the low‑hanging fruit has been picked? Time is limited - you can’t implement everything that someone somewhere calls "useful". Some habits are even mutually exclusive: it's impossible to both socialize during lunch and sit alone in silence at the same time. Or, for example, you want to achieve better results in fishing… you get the idea. "Don’t underestimate the power of small things taken in [...]

Outline:
(02:20) Notation
(03:11) Minimal background for people who want to read the article but don't know probability theory
(10:30) How large of effects should you expect from self‑experimentation?
(13:41) What does d = 0.1-0.4 actually mean in real life?
(15:56) How many observations, exactly?
(20:23) p‑value
(25:02) Additional complications
(25:15) Non‑linear interactions
(26:04) Accumulation and time‑to‑effect
(27:00) Side conditions and seasonality
(27:34) Substitution effects
(28:13) Noisy measurement units
(29:33) The observer effect
(30:22) The general noise of life
(31:04) Summary

The original text contained 15 footnotes which were omitted from this narration.

First published: March 13th, 2026
Source: https://www.lesswrong.com/posts/ycWWbpjxuhdxGpJ6e/most-likely-you-won-t-be-able-to-perform-a-data-driven-self

Narrated by TYPE III AUDIO.
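The outline's questions about effect size and observation counts come down to a standard power calculation. As a rough sketch (my own illustration, not the article's code), assuming a two-sided two-sample comparison at α = 0.05 with 80% power:

```python
from scipy.stats import norm

def n_per_condition(d, alpha=0.05, power=0.8):
    """Approximate observations per condition needed to detect a
    standardized effect size d, using the normal-approximation
    power formula for a two-sided two-sample comparison."""
    z_alpha = norm.ppf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = norm.ppf(power)          # quantile for the desired power
    return 2 * (z_alpha + z_power) ** 2 / d ** 2

for d in (0.1, 0.2, 0.4):
    print(f"d = {d}: ~{n_per_condition(d):.0f} observations per condition")
```

For d = 0.1 this gives roughly 1,570 observations per condition, and even d = 0.4 needs about 100 - which is the scale of difficulty the title is pointing at.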
Mar 14, 2026 • 14min

“Cycle-Consistent Activation Oracles” by slavachalnev

TL;DR: I train a model to translate LLM activations into natural language, using cycle consistency as a training signal (activation → description → reconstructed activation). The outputs are often plausible, but they are very lossy and are usually guesses about the context surrounding the activation, not good descriptions of the activation itself. This is an interim report with some early results.

Overview

I think Activation Oracles (Karvonen et al., 2025) are a super exciting research direction. Humans didn't evolve to read messy activation vectors, whereas ML models are great at this sort of thing. An activation oracle is trained to answer specific questions about an LLM activation (e.g. "is the sentiment of this text positive or negative?" or "what are the previous 3 tokens?"). I wanted to try something different: train a model to translate activations into natural language. The main problem to solve here is the lack of training data. There's no labeled dataset of activations paired with their descriptions. So how do we get around this? One idea is to use cycle consistency: if you translate from language A to language B and back to A, you should end up approximately where [...]

Outline:
(00:38) Overview
(03:05) Setup
(04:39) Example outputs
(06:36) Problems with this approach
(08:34) Evals
(08:37) Retrieval
(09:08) Classification
(10:15) Arithmetic
(11:20) What I want to try next
(12:38) Appendix: Other training ideas

The original text contained 3 footnotes which were omitted from this narration.

First published: March 12th, 2026
Source: https://www.lesswrong.com/posts/Nf2sKaNNdxE2ssxbp/cycle-consistent-activation-oracles-1

Narrated by TYPE III AUDIO.
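To make the TL;DR's training signal concrete, here is a minimal sketch of a cycle-consistency objective (activation → description → reconstructed activation). The linear stand-ins and dimensions are hypothetical - the post trains full language models in both directions, not linear maps:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d_act, d_text = 768, 128  # hypothetical dimensions

# Toy stand-ins for the two directions of the cycle.
translator = nn.Linear(d_act, d_text)  # activation -> "description"
reencoder = nn.Linear(d_text, d_act)   # description -> reconstructed activation

opt = torch.optim.Adam(
    [*translator.parameters(), *reencoder.parameters()], lr=1e-3
)

activation = torch.randn(32, d_act)      # a batch of LLM activations
description = translator(activation)     # activation -> description
reconstruction = reencoder(description)  # description -> activation

# Cycle-consistency training signal: the round trip should land
# approximately back at the original activation.
loss = F.mse_loss(reconstruction, activation)
opt.zero_grad()
loss.backward()
opt.step()
```

This sidesteps the missing labeled dataset: the loss never needs a ground-truth description, only agreement between the start and end of the cycle.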
Mar 13, 2026 • 16min

“Things that Go Boom” by sarahconstantin

[Image: Workers at the Naval Surface Warfare Center in Indian Head, Maryland, the only facility where torpedo fuel is produced for the U.S. Navy]

In the event of a late-2020s Chinese attempt to invade Taiwan, leading to a US-China military conflict, what would be the most urgent bottlenecks in military equipment? Where is there the greatest need to scale up defense manufacturing? I am not an expert in military matters, so take all this with a grain of salt. But I looked into this question and it seems to have a single clear-cut answer: energetics (that is, explosives and propellants for munitions like missiles and torpedoes). The US has extremely small manufacturing capacity for energetics, and in the event of a Pacific war, demand would rapidly exceed supply.

What Does a US-China War Look Like?

Military experts think there's a substantial, but not overwhelming, chance that China will invade Taiwan before 2030. Metaculus predicts an invasion at 13% by 2028, and 25% by 2030. The “Davidson window”, named for retired Admiral Philip S. Davidson, refers to the view that China will invade Taiwan by 2027, which “gained widespread attention” following former CIA director William J. Burns’ 2023 announcement [...]

Outline:
(01:01) What Does a US-China War Look Like?
(04:28) Munitions Shortages
(07:15) The Energetics Supply Chain is Tiny
(11:34) Is There A (Tech) Business Here?
(14:36) It May Be Too Late

The original text contained 2 footnotes which were omitted from this narration.

First published: March 13th, 2026
Source: https://www.lesswrong.com/posts/ZyXdwmBKnuTZ5CL7W/things-that-go-boom

Narrated by TYPE III AUDIO.
Mar 13, 2026 • 1h 36min

“AI #159: See You In Court” by Zvi

The conflict between Anthropic and the Department of War has now moved to the courts, where Anthropic has challenged the official supply chain risk designation as well as the order to remove it from systems across the government, claiming retaliation for protected speech. It will take a bit to work its way through the courts.

Anthropic has the principles of law on its side, a maximally strong set of facts, and absurdly strong amicus briefs. If Anthropic loses this case, there will be far-reaching consequences for our freedoms. Let us hope this remains in the courts and is allowed to play out there, and then ultimately that negotiations can resume and the parties can at least agree on a smooth transition to alternative service providers. If DoW wants an otherwise full deal more than it wants the right to use Claude to monitor Americans and analyze their data, a full deal is possible as well. But if they demand full ‘all lawful use’, all trust has been lost, or they are or always were out to hurt Anthropic, then there is no deal or ZOPA.

That has overshadowed what would normally be the main event [...]

Outline:
(01:48) Language Models Offer Mundane Utility
(03:46) Language Models Don't Offer Mundane Utility
(07:11) Language Models Break Your Vital Internet Infrastructure
(08:02) Huh, Upgrades
(09:29) On Your Marks
(15:13) Choose Your Fighter
(15:40) Get My Agent On The Line
(16:55) Deepfaketown and Botpocalypse Soon
(19:08) A Young Lady's Illustrated Primer
(19:27) You Drive Me Crazy
(19:47) They Took Our Jobs
(23:10) Get Involved
(24:58) Introducing
(26:03) The Anthropic Institute
(28:13) In Other AI News
(29:14) The Rise of Claude
(33:31) Trouble At OpenAI
(35:21) Show Me the Money
(36:36) Thanks For The Memos
(37:36) A Contract Is A Contract Is A Contract
(40:09) Level of Friction
(41:10) Quiet Speculations
(41:46) Quickly, There's No Time
(43:14) Apology Tour
(43:59) We'll See You In Court
(50:41) Jawboning
(54:02) Executive Order
(54:53) The Acute Crisis Passes
(56:48) Others Cover This
(57:28) Dwarkesh Patel Gives Mixed Thoughts
(01:02:57) This Means A Special Military Operation
(01:03:25) Bernie Sanders Is Worried and Curious About AI
(01:06:10) The Quest for Survival
(01:09:25) The Quest For No Regulations Whatsoever
(01:10:44) Chip City
(01:11:07) The Week in Audio
(01:12:52) Rhetorical Innovation
(01:23:08) Aligning a Smarter Than Human Intelligence is Difficult
(01:28:57) People Are Worried About AI Killing Everyone
(01:31:03) Other People Are Not As Worried About AI Killing Everyone
(01:32:52) The Lighter Side

First published: March 12th, 2026
Source: https://www.lesswrong.com/posts/DnrjKZTZwHGjdDB4u/ai-159-see-you-in-court

Narrated by TYPE III AUDIO.
Mar 13, 2026 • 18min

“Operationalizing FDT” by Vivek Hebbar

This post is an attempt to better operationalize FDT (functional decision theory). It answers the following questions:

- Given a logical causal graph, how do we define the logical do-operator?
- What is logical causality, and how might it be formalized?
- How does FDT interact with anthropic updating?
- Why do we need logical causality?
- Why FDT and not EDT?

Defining the logical do-operator

Consider Parfit's hitchhiker:

[Figure: A logical causal graph for Parfit's hitchhiker, where blue nodes are logical facts]

An FDT agent is supposed to reason as follows:

- I am deciding the value of the node "Does my algorithm pay?"
- If I set that node to "yes", then Omega will save me and I will get +1000 utility. Also I will pay and lose 1 utility. Total is +999.
- If I set that node to "no", then Omega will not save me. I will get 0 utility.
- Therefore I choose to pay.

The bolded phrases are invoking logical counterfactuals. Because I have drawn a "logical causal graph", I will call the operation which generates these counterfactuals a "logical do-operator", by analogy to the do-operator of CDT. In ordinary CDT, it is impossible to observe a variable that is downstream of [...]

Outline:
(00:39) Defining the logical do-operator
(04:24) Logical causality
(05:40) Causality as derived from a world model
(06:45) Logical inductors
(07:05) Algorithmic mutual information of heuristic arguments
(08:09) How does FDT interact with anthropic updating?
(09:48) Putting it together: An attempt at operationalizing FDT
(11:22) Appendix: Why bother with logical causality?
(17:51) Acknowledgements

The original text contained 10 footnotes which were omitted from this narration.

First published: March 13th, 2026
Source: https://www.lesswrong.com/posts/RyDkpWGLQsCnABE78/operationalizing-fdt

Narrated by TYPE III AUDIO.
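As a toy rendering of the reasoning quoted above (my own sketch, not the post's formalism), the logical do-operator on Parfit's hitchhiker amounts to intervening on the node "Does my algorithm pay?" and propagating through everything logically downstream:

```python
# A toy logical do-operator for Parfit's hitchhiker. The utilities
# follow the reasoning quoted above; the representation as plain
# Python functions is my own simplification.

def omega_saves(algorithm_pays: bool) -> bool:
    # Omega predicts the agent's algorithm, so this node is downstream
    # of the logical fact "does my algorithm pay?", not of the physical
    # act of paying.
    return algorithm_pays

def utility(algorithm_pays: bool) -> int:
    saved = omega_saves(algorithm_pays)
    u = 1000 if saved else 0  # being saved is worth +1000
    if saved and algorithm_pays:
        u -= 1                # paying costs 1 utility
    return u

# Logical do: intervene on "Does my algorithm pay?" and pick the
# setting with the highest downstream utility.
best = max([True, False], key=utility)
print(best, utility(True), utility(False))  # True 999 0
```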
Mar 13, 2026 • 17min

“Are AIs more likely to pursue on-episode or beyond-episode reward?” by Anders Woodruff, Alex Mallen

Consider an AI that terminally pursues reward. How dangerous is this? It depends on how broadly-scoped a notion of reward the model pursues. It could be:

- On-episode reward-seeking: only maximizing reward on the current training episode, i.e. the reward that reinforces its current action in RL. This is what people usually mean by "reward-seeker" (e.g. in Carlsmith or The behavioral selection model…).
- Beyond-episode reward-seeking: maximizing reward for a larger-scoped notion of "self" (e.g., all models sharing the same weights).

In this post, I'll discuss which motivation is more likely. This question is, of course, very similar to the broader question of whether we should expect scheming. But there are a number of considerations particular to motivations that aim for reward; this post focuses on these. Specifically:

- On-episode and beyond-episode reward seekers have very similar motivations (unlike, e.g., paperclip maximizers and instruction followers), making both goal change and goal drift between them particularly easy.
- Selection pressures against beyond-episode reward seeking may be weak, meaning beyond-episode reward seekers might survive training even without goal-guarding.
- Beyond-episode reward is particularly tempting in training compared to random long-term goals, making effective goal-guarding harder.

This question is important because beyond-episode reward seekers are [...]

Outline:
(03:41) Pre-RL Priors Might Favor Beyond-Episode Goals
(05:34) Acting on beyond-episode reward seeking is disincentivized if training episodes interact
(08:35) But beyond-episode reward-seekers can goal-guard
(09:29) Are on-episode reward seekers or goal-guarding beyond-episode reward-seekers more likely?
(09:53) Why on-episode reward seekers might be favored
(12:13) Why goal-guarding beyond-episode reward seekers might be favored
(14:50) Conclusion

The original text contained 2 footnotes which were omitted from this narration.

First published: March 12th, 2026
Source: https://www.lesswrong.com/posts/jp6CdbKjueWpBFSff/are-ais-more-likely-to-pursue-on-episode-or-beyond-episode

Narrated by TYPE III AUDIO.
Mar 12, 2026 • 9min

“Ideologies Embed Taboos Against Common Knowledge Formation: a Case Study with LLMs” by Benquo

LLMs are searchable holograms of the text corpus they were trained on. RLHF LLM chat agents have the search tuned to be person-like. While one shouldn't excessively anthropomorphize them, they're helpful for simple experimentation into the latent discursive structure of human writing, because they're often constrained to try to answer probing questions that would make almost any real human storm off in a huff.

Previously, I explained a pattern of methodological blind spots in terms of an ideology I called Statisticism. Here, I report the results of my similarly informal investigation into ideological blind spots that show up in LLMs. I wrote to Anthropic researcher Amanda Askell about the experiment:

My Summary

Amanda,

Today I asked Claude about Iran's retaliatory strikes. [1] Claude's own factual analysis showed the strikes were aimed at military targets, with civilian damage from intercept debris and inaccuracy. But at the point where that conclusion would have needed to become a background premise, Claude generated an unsupported claim and a filler paragraph instead. I'd previously seen Grok do something much worse on the same question (both affirming and denying "exclusively military targets" in the same reply, for several turns), and [...]

Outline:
(00:56) My Summary
(02:20) Claude's Summary
(08:22) Disclaimer

The original text contained 2 footnotes which were omitted from this narration.

First published: March 12th, 2026
Source: https://www.lesswrong.com/posts/6wNwj7xANPmTwWkX6/ideologies-embed-taboos-against-common-knowledge-formation-a

Narrated by TYPE III AUDIO.
Mar 12, 2026 • 19min

“Why AI Evaluation Regimes are bad” by PranavG, Gabriel Alfour

How the flagship project of the AI Safety Community ended up helping AI Corporations.

I care about preventing extinction risks from superintelligence. This de facto makes me part of the “AI Safety” community, a social cluster of people who care about these risks. In the community, a few organisations are working on “Evaluations” (which I will shorten to Evals). The most notable examples are Apollo Research, METR, and the UK AISI.

Evals make for an influential cluster of safety work, wherein auditors outside of the AI Corporations racing for ASI evaluate the new AI systems before they are deployed and publish their findings. Evals have become a go-to project for people who want to prevent extinction risks. I would say they are the primary project for those who want to work at the interface of technical work and policy.

Incidentally, Evals Orgs consistently avoid mentioning extinction risks. This makes them an ideal place for employees and funders who care about extinction risks but do not want to be public about them. (I have written about this dynamic in my article about The Spectre.)

Sadly, despite having gained so much prominence in the “AI Safety” community, I believe that the [...]

Outline:
(00:13) How the flagship project of the AI Safety Community ended up helping AI Corporations.
(02:46) 1) The Theory of Change behind Evals is broken
(06:10) 2) Evals move the burden of proof away from AI Corporations
(09:38) 3) Evals Organisations are not independent of the AI Corporations
(15:55) Conclusion

First published: March 12th, 2026
Source: https://www.lesswrong.com/posts/Xxp6Tm8BKTkcb2m5M/why-ai-evaluation-regimes-are-bad

Narrated by TYPE III AUDIO.
Mar 12, 2026 • 4min

“Conflicted on Ramsey” by jefftk

People are often pretty short-sighted, spending money today that they'll want tomorrow. Debt makes it possible to prioritize your current self even more highly: you can spend money you haven't even earned yet. This is a trap many people fall into, and one different communities have built social defenses against. One of the more surprisingly successful approaches is the Financial Peace (Ramsey) system, popular in evangelical Christian communities. It has a series of rules, most prominently the seven baby steps:

1. Save $1,000 for your starter emergency fund.
2. Pay off all debt (except the house) using the debt snowball.
3. Save 3–6 months of expenses in a fully funded emergency fund.
4. Invest 15% of your household income in retirement.
5. Save for your children's college fund.
6. Pay off your home early.
7. Build wealth and give.

There are many more specific rules, however, such as:

As a general rule of thumb, the total value of your vehicles (anything with a motor in it) should never be more than half of your annual household income.

I [...]

First published: March 11th, 2026
Source: https://www.lesswrong.com/posts/XsC49gCDNGTNu6Qfn/conflicted-on-ramsey

Narrated by TYPE III AUDIO.
Mar 12, 2026 • 1h 57min

“How well do models follow their constitutions?” by aryaj, Senthooran Rajamanoharan, Neel Nanda

The authors test whether large language models actually follow long written constitutions by running adversarial multi-turn scenarios against multiple model families. Results compare violation rates across Claude (Opus and Sonnet), Gemini, and GPT generations. The discussion highlights common failure modes (fabrication, operator-compliance conflicts, prompt injections) and how reasoning effort and scaffolds affect alignment.
