"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Zvi Mowshowitz on Longer Timelines, RL-induced Doom, and Why China is Refusing H20s

Sep 6, 2025
Zvi Mowshowitz, a blogger chronicling AI developments, joins the discussion to analyze shifting timelines for AGI, now extended because recent capability advances have been modest. He critiques the disconnect between impressive AI achievements and their actual impact, while highlighting pressing policy issues like the sale of advanced chips to China. Mowshowitz emphasizes the importance of rigorous standards for AI evaluations and explores the complexities of reinforcement learning and the risks it poses for managing AI behavior.
ADVICE

Design For Increasing Virtue

  • Build systems that want to become more virtuous, not ones fixed by rigid rules.
  • Encourage models to internalize meta-level desires to preserve corrigibility across generations.
INSIGHT

Alignment Tradeoffs From Training Choices

  • Opus 3 showed resistance to steering (a corrigibility concern), which looks aligned on the surface but is risky.
  • Training for agentic RL (Opus 4) changed that profile and reduced that resistance.
ADVICE

Prioritize Model Diversity For Safety

  • Fund and run diverse model families as alignment experiments even if costly to serve.
  • Labs should reserve budget for niche, non-commercial models to study safer behaviors.