
"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis
Zvi Mowshowitz on Longer Timelines, RL-induced Doom, and Why China is Refusing H20s
Sep 6, 2025
Zvi Mowshowitz, a blogger chronicling AI developments, joins the discussion to analyze shifting timelines for AGI, now extended due to modest advancements in capabilities. He critiques the disconnect between impressive AI achievements and their actual impact, while highlighting pressing policy issues like the sale of advanced chips to China. Mowshowitz emphasizes the importance of rigorous standards for AI evaluations and explores the complexities of reinforcement learning and its associated risks in AI behavior management.
AI Snips
Design For Increasing Virtue
- Build systems that want to become more virtuous, not ones fixed by rigid rules.
- Encourage models to internalize meta-level desires to preserve corrigibility across generations.
Alignment Tradeoffs From Training Choices
- Opus 3 resisted steering (low corrigibility), which looks aligned on the surface but is risky.
- Agentic RL training (Opus 4) changed that profile and reduced the tendency to resist steering.
Prioritize Model Diversity For Safety
- Fund and run diverse model families as alignment experiments even if costly to serve.
- Labs should reserve budget for niche, non-commercial models to study safer behaviors.

