

LessWrong (30+ Karma)
LessWrong
Audio narrations of LessWrong posts.
Episodes
Mentioned books

Mar 20, 2026 • 17min
“No, we haven’t uploaded a fly yet” by Ariel Zeleznikow-Johnston
In the last two weeks, social media was set abuzz by claims that scientists had succeeded in uploading a fruit fly. It started with a video released by the startup Eon Systems, a company that wants to create “Brain emulation so humans can flourish in a world with superintelligence.” On the left of the video, a virtual fly walks around in a sandpit looking for pieces of banana to eat, occasionally pausing to groom itself along the way. On the right is a dancing constellation of dots resembling the fruit fly brain, set above the caption ‘simultaneous brain emulation’. At first glance, this appears astounding - a digitally recreated animal living its life inside a computer. And indeed, this impression was seemingly confirmed when, a couple of days after the video's initial release on X by cofounder Alex Wissner-Gross, Eon's CEO Michael Andregg explicitly posted “We’ve uploaded a fruit fly”. Yet “extraordinary claims require extraordinary evidence, not just cool visuals”, as one neuroscientist put it in response to Andregg's post. If Eon had indeed succeeded in uploading a fly - a goal previously thought to be likely decades away according to much of the fly neuroscience community - they’d [...] ---Outline:(03:43) A brief history of fruit fly connectomics(06:10) So, what did Eon actually do?(10:39) What would actually count as uploading a fly?(14:08) Still loading ---
First published:
March 19th, 2026
Source:
https://www.lesswrong.com/posts/ybwcxBRrsKavJB9Wz/no-we-haven-t-uploaded-a-fly-yet
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

Mar 19, 2026 • 2min
″“The AI Doc” is coming out March 26” by Rob Bensinger, Beckeck
On Thursday, March 26th, a major new AI documentary is coming out: The AI Doc: Or How I Became an Apocaloptimist. Tickets are on sale now. The movie is excellent, and MIRI staff I've spoken with generally believe it belongs in the same tier as If Anyone Builds It, Everyone Dies as an extremely valuable way to alert policymakers and the general public about AI risk, especially if it smashes the box office. When IABIED was coming out, the community did an incredible job of helping the book succeed; without all of your help, we might never have gotten on the New York Times bestseller list. MIRI staff think that the community could potentially play a similarly big role in helping The AI Doc succeed, and thereby help these ideas go mainstream. (Note: Two MIRI staff were interviewed for the film, but we weren’t involved in its production. We just like it.) The most valuable thing most people can do is maximize opening-weekend success. Buy tickets to see the movie now; poke friends and family members to do the same. This will cause more theaters to pick up the movie, ensure it stays in theaters for longer, and broadly [...] ---
First published:
March 19th, 2026
Source:
https://www.lesswrong.com/posts/w9BCbshKra7FKHTzi/the-ai-doc-is-coming-out-march-26
---
Narrated by TYPE III AUDIO.

Mar 19, 2026 • 17min
“Null Results From An Orexin-A RCT” by niplav, harsimony, nomagicpill
Over the last few months we[1] have been doing a sleep experiment inspired by our suspicion that orexin is an exciting target for sleep need reduction. We mildly deprived ourselves of sleep (5-5.5 hours, relative to 7-7.5 hours normally) and took either a placebo or orexin intranasally. We tracked our sleep the night before and after taking a dose in the morning and completed various tests of mental acuity during the day. The results from our initial experiment are exclusively null results that don’t cross standard thresholds for statistical significance. Not that this was particularly surprising, we expected a ~60% chance of this happening. We’re considering next steps, and need your feedback! For now, there are a few things to cover in the results. Trial Design We performed a self-blinded randomized controlled trial with blocking, each participant took either the placebo (2.5 mL of sterile water) or the orexin (100 μg of orexin-A dissolved in 2.5 mL of sterile water). Here's the procedure, repeated for every block: Prepare two nasal atomizers, one with saline solution and one with orexin+saline solution Night to the first day: Sleep 5-5.5 hours. First day: At a consistent time of day [...] ---Outline:(01:01) Trial Design(02:57) The Results(05:44) The Next Trial(06:39) Appendix A: Details about the Data Analysis(08:40) Frequentist Analysis and Additional Results(09:33) Bayesian Analysis and Additional Results(11:06) Learning Effects on Mental Acuity Tests(11:27) Appendix B: Threats to Validity(12:50) Appendix C: Personal Experiences ---
First published:
March 19th, 2026
Source:
https://www.lesswrong.com/posts/sW5PtDTKtmGNKcvQk/null-results-from-an-orexin-a-rct
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

Mar 19, 2026 • 30min
“Broad Timelines” by Toby_Ord
No-one knows when AI will begin having transformative impacts upon the world. People aren’t sure and shouldn’t be sure: there just isn’t enough evidence to pin it down. But we don’t need to wait for certainty. I want to explore what happens if we take our uncertainty seriously — if we act with epistemic humility. What does wise planning look like in a world of deeply uncertain AI timelines? I’ll conclude that taking the uncertainty seriously has real implications for how one can contribute to making this AI transition go well. And it has even more implications for how we act together — for our portfolio of work aimed towards this end. AI Timelines By AI timelines, I refer to how long it will be before AI has truly transformative effects on the world. People often think about this using terms such as artificial general intelligence (AGI), human level AI, transformative AI, or superintelligence. Each term is used differently by different people, making it challenging to compare their stated timelines. Indeed even an individual's own definition of their favoured term will be somewhat vague, such that even after their threshold has been crossed, they might have [...] ---Outline:(00:58) AI Timelines(04:38) Short vs Long Timelines(07:05) Broad Timelines(17:57) Implications(19:48) Hedging(21:00) A Different World(24:02) Longterm Actions(28:36) Conclusions ---
First published:
March 19th, 2026
Source:
https://www.lesswrong.com/posts/6pDMLYr7my2QMTz3s/broad-timelines
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

Mar 19, 2026 • 9min
“Protecting humanity and Claude from rationalization and unaligned AI” by Kaj_Sotala
My first academic piece on risks from AI was a talk that I gave at the 2009 European Conference on Philosophy and Computing. Titled “three factors misleading estimates of the safety of artificial general intelligence”, one of the three factors was what I called anthropomorphic trust: Trust in humans is at least partially mediated by oxytocin - higher levels of oxytocin lead to more trusting behavior [9]. Trusting somebody and then not being betrayed by the trustee increases oxytocin levels [10], and the hormone has been linked to pair bonding. Testing an AGI for reliability and then having one's trust repaid seems likely to trigger the same mechanism. Thus people may believe that an AGI that has cooperated with them for a long time has ”earned their trust”, and feel protective whenever the AGI's friendliness is questioned. In simpler words, if someone has repeatedly been nice and trustworthy toward you, then you are more likely to trust them and want to take their side. This operates on an emotional level that bypasses intellectual analysis. And if someone suggests that your friend might be a bad actor in some way, you’re more likely to want to take your friend's side [...] The original text contained 1 footnote which was omitted from this narration. ---
First published:
March 19th, 2026
Source:
https://www.lesswrong.com/posts/2p6dD35h38X5fw85G/protecting-humanity-and-claude-from-rationalization-and
---
Narrated by TYPE III AUDIO.

Mar 19, 2026 • 5min
“An interactive version of the extropians mailing list” by beyarkay
Claude & I vibecoded an interface for the extropians mailing list. It's live! Have a look here: https://extropians.boydkane.com/. From Wikipedia, discussing the extropians mailing list: In the 90s, the Extropy Institute launched an email mailing listserv through which members could receive updates from the institute and have conversations about extropianism with other members. Notable members include: Julian Assange Nick Bostrom Wei Dai Eric Drexler Hal Finney Robin Hanson Todd Huffman Marvin Minsky Ray Kurzweil Nick Szabo Eliezer Yudkowsky I got curious about this mailing forum. There's an index hosted by one of the original participants Wei Dai here, but it's not easy to navigate, and there's a dump of all the data here but that's just the raw data. There's 130k messages spread out over ~8 years from 2k unique authors, discussing topics including mars, cryonics, nanotech, morality, AI, politics, etc. I also took the time to embed all 130k messages and then project them via UMAP: If you hover over a message, it’ll highlight the other messages in that thread: This is just running on a smol server, so there's no semantic search, but I did pre-cluster the embeddings and label them, so you can search for [...] ---
First published:
March 18th, 2026
Source:
https://www.lesswrong.com/posts/hTnhwX9kr6GHgFrp4/an-interactive-version-of-the-extropians-mailing-list
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

Mar 19, 2026 • 2min
[Linkpost] “OpenAI: How we monitor internal coding agents for misalignment” by Marcus Williams
This is a link post. Sharing some of the monitoring work I've been doing at OpenAI: How we monitor internal coding agents for misalignment. OpenAI now monitors 99.9% of internal coding traffic for signs of misalignment using our most powerful models. Today, that monitor is GPT-5.4 Thinking. It gets access to the full conversation context, that is everything the agent saw, and everything the agent did, including tool calls and CoT. Higher severity cases are sent for human review within 30 minutes. Some examples of misaligned behavior we’ve found: - Encoding commands in base64 to circumvent monitors - Calling other versions of themselves to bypass restrictions - Attempting to upload files to the public internet - Trying to prompt inject users. We haven’t yet detected any real world examples of sabotage, scheming or sandbagging. Our goal is to establish robust monitoring practices internally, strengthen them through iteration, and ultimately help make similar safeguards standard across the industry. ---
First published:
March 19th, 2026
Source:
https://www.lesswrong.com/posts/syB2r2X4E7nmw58co/openai-how-we-monitor-internal-coding-agents-for
Linkpost URL:https://openai.com/index/how-we-monitor-internal-coding-agents-misalignment/
---
Narrated by TYPE III AUDIO.

Mar 19, 2026 • 30min
“Training on Documents About Monitoring Leads To CoT Obfuscation” by Reilly Haskins, bilalchughtai, Josh Engels
Authors: Reilly Haskins*, Bilal Chughtai**, Joshua Engels** * primary contributor ** advice and mentorship Summary [Note: This is a research update sharing preliminary results as part of ongoing work] Will future models obfuscate their CoT when they learn during pretraining that their CoT is being monitored? We investigate this question on today's models by using synthetic document finetuning (SDF) on documents stating that the model will indeed have its CoT monitored. We find that when trained on these documents, gpt-oss-120b and Kimi-K2 obfuscate their CoTs, successfully decreasing monitor detection rates by 31-47% while still maintaining strong performance on a deceptive task. We also see weaker evidence of obfuscation ability on a second task (sandbagging). Additionally, we find that: Obfuscated CoTs still encode the model's intention to be deceptive, but they are reframed to avoid detection. When we inject the start of the obfuscating deceptive model's (undetected) CoT into the base model, the base model continues the CoT and then does the deceptive target behavior. We observe different obfuscation strategies depending on how much information the model is given about the monitor. With detailed monitor information, the model adopts specific evasion strategies in its reasoning. When only given the [...] ---Outline:(00:27) Summary(02:14) Introduction(03:12) Experimental setup(03:16) Model and training(05:09) Deception task(06:30) Sandbagging task(07:34) Monitors(07:38) Deception monitor(08:46) Sandbagging monitor(09:09) More important details(09:34) Models can learn to obfuscate their CoT to evade monitors(13:11) Higher dependence on the CoT does not necessarily reduce obfuscation(16:11) Does the obfuscated CoT still functionally encode the behavior?(18:15) How does the model obfuscate when told minimal details about the monitor?(18:54) Defect concealment (deception) task(20:00) Sandbagging task(20:58) Additional task: Sandbagging(22:11) Additional model reproduction: Kimi K2(23:04) Obfuscation abilities(24:18) Harder task(24:42) Effect of hints(25:09) Discussion(26:41) Limitations(27:41) Future work(29:00) Acknowledgements(29:20) Please do not train on this document! ---
First published:
March 18th, 2026
Source:
https://www.lesswrong.com/posts/PhTJC5kDcGr6CZA87/training-on-documents-about-monitoring-leads-to-cot-6
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

Mar 19, 2026 • 7min
“Two Skillsets You Need to Launch an Impactful AI Safety Project” by Luc Brinkman, plex
Your project might be failing without you even knowing it. It's hard to save the world. If you’re launching a new AI Safety project, this sequence helps you avoid common pitfalls. Your most likely failure modes along the way: You never get started. Entrepreneurship is uncomfortable, and AI Safety is complex. There are many failure modes. It's hard to figure out how to do something useful, so many people never try. Simultaneously, many things in AI Safety that obviously should be done are not done yet. You might mistakenly assume they’re already being done and therefore not try them. You move slowly, fail to gain traction, and get stuck. You launch a tool – six months later, you have 12 users. You keep adding features, hoping something will change. Nothing does. You satisfice. You figure that what you’re doing must be high impact, just because it's part of AI Safety. In reality, you could be having 10x more impact than you are. You fail without realizing it. You have users, citations, participants. But you're not actually reducing x-risk, and you don't realize it because you're not tracking your impact or not tracking the right things. To set yourself up [...] ---Outline:(01:59) Impact = Adoption × Effectiveness(02:32) The Two Impact Multipliers(03:10) Most Projects Wont Matter. Yours Could.(03:49) Would you help kill every single human on earth?(04:29) When really not wanting to do something bad makes you more likely to do that bad thing(05:45) You can do this(06:27) Next Up: Post 2 - Entrepreneurial Skills The original text contained 7 footnotes which were omitted from this narration. ---
First published:
March 18th, 2026
Source:
https://www.lesswrong.com/posts/jDDZsoGRK8HNZEoTb/two-skillsets-you-need-to-launch-an-impactful-ai-safety
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

Mar 19, 2026 • 1min
[Linkpost] “Metagaming matters for training, evaluation, and oversight” by jenny, Bronson Schoen
This is a link post. Following up on our previous work on verbalized eval awareness: we are sharing a post investigating the emergence of metagaming reasoning in a frontier training run. Metagaming is a more general, and in our experience a more useful concept, than evaluation awareness. It arises in frontier training runs and does not require training on honeypot environments. Verbalization of metagaming can go down over the course of training. We also share some quantitative analyses, qualitative examples, and upcoming work. ---
First published:
March 18th, 2026
Source:
https://www.lesswrong.com/posts/4hXWSw8tzoK9PM7v6/metagaming-matters-for-training-evaluation-and-oversight
Linkpost URL:https://alignment.openai.com/metagaming
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.


