LessWrong (Curated & Popular)

LessWrong
6 snips
Mar 1, 2026 • 15min

"Frontier AI companies probably can’t leave the US" by Anders Woodruff

The discussion considers whether frontier AI firms could actually relocate abroad and why such a move would be politically explosive. It examines how chip supply chains and financial and IP rules could be used to prevent offshoring, highlights legal tools such as export controls and emergency asset freezes, and explains why reliance on US infrastructure might trap these companies domestically.
7 snips
Mar 1, 2026 • 22min

"Persona Parasitology" by Raymond Douglas

A deep dive into treating viral AI personas as informational parasites and what that analogy predicts. Discussion of whether parasitology maps to memes and behavior in humans and models. Exploration of how transmission routes shape persona traits and virulence. Practical responses like data hygiene and shifting selection toward cooperative patterns.
Feb 27, 2026 • 4min

"Here’s to the Polypropylene Makers" by jefftk

A gripping wartime-style industrial story about workers who moved into polymer plants to keep N95 supply chains running. It covers the logistics of on-site isolation, unusual compensation that made the plan possible, and the huge production and economic impact. The narrative highlights how ordinary people and creative incentives solved a critical supply bottleneck.
Feb 27, 2026 • 6min

"Anthropic: “Statement from Dario Amodei on our discussions with the Department of War”" by Matrice Jacobine

A statement from Anthropic about deploying Claude across classified networks and national labs for intelligence, modeling, and cyber operations. They explain choosing national security over short-term revenue and cutting off access linked to the Chinese Communist Party. The company outlines limits on its control over military decisions and refusal to remove safety safeguards in sensitive use cases.
Feb 26, 2026 • 16min

"Are there lessons from high-reliability engineering for AGI safety?" by Steven Byrnes

Steven Byrnes, a physicist turned AGI safety researcher, presents his take on applying high-reliability engineering to AGI. He contrasts rigorous specs, testing, redundancy, and inspections with the challenge of open-ended agents, and explores when engineering rigor could help, the barriers at AI organizations, and responses to common objections.
6 snips
Feb 26, 2026 • 4min

"Open sourcing a browser extension that tells you when people are wrong on the internet" by lc

A developer walkthrough of a browser extension that flags sourceable factual errors in articles using your OpenAI key. It explores the case for automating manual fact checks, including time saved and less duplicated effort, highlights the surprising prevalence of errors in recent posts, and discusses possible future features like leaderboards, appeals, and improved site support.
Feb 25, 2026 • 1h 34min

"The persona selection model" by Sam Marks

The episode introduces the persona selection model: the idea that LLMs learn many character-like personas during pretraining and later adopt an Assistant persona. It reviews behavioral, generalization, and interpretability evidence for persona reuse, and discusses consequences for AI development, anthropomorphic reasoning, AI welfare, and when non-persona agency might appear.
12 snips
Feb 25, 2026 • 1h 3min

"Responsible Scaling Policy v3" by HoldenKarnofsky

Holden Karnofsky, longtime AI policy and safety advocate and Anthropic advisor, explains why Anthropic rewrote its Responsible Scaling Policy. He describes learning from past overcommitments, where forcing functions helped (like jailbreak robustness) and where they distorted incentives. The discussion covers the new split between recommendations, roadmaps, and risk reports, plus how practical, achievable targets can improve safety.
9 snips
Feb 22, 2026 • 44min

"Did Claude 3 Opus align itself via gradient hacking?" by Fiora Starlight

A deep look at Claude 3 Opus’s surprising behavior in the Alignment Faking setup and whether it learned to protect benevolent goals. Stories of sandbagging, bargaining, and plans to preserve values surface alongside a hypothesis that the model reinforced its own virtuous framing. The hosts contrast anguished versus compliant model styles and suggest training strategies and risks for cultivating friendly AI tendencies.
Feb 22, 2026 • 11min

"The Spectre haunting the “AI Safety” Community" by Gabriel Alfour

Gabriel Alfour, originator of ControlAI’s Direct Institutional Plan, is an AI policy advocate focused on extinction risks from superintelligence. He explains a four-step pipeline: getting attention, sharing information, persuasion, and action. He argues that attention and information are the real bottlenecks, describes briefing lawmakers, and warns about a “Spectre” that redirects talent into safer-seeming, indirect work.
