
The MAD Podcast with Matt Turck: OpenAI Board Member Zico Kolter on the Real Risks of Frontier AI
May 7, 2026 — Zico Kolter, head of CMU's Machine Learning Department, OpenAI board member, and AI safety researcher, discusses frontier AI risks and oversight. He explains how safety reviews and preparedness frameworks work. Short takes cover jailbreaks, prompt injection, why agents widen attack surfaces, red-teaming, and where frontier models and governance might be headed.
AI Snips
Match Safety Effort To Deployment Pace
- Ensure safety work scales with the expanding control and actuation surface as models are integrated more widely.
- Maintain continuous effort by model providers, third parties, and end users to keep safety practices commensurate with deployment.
Four Fundamental Categories Of AI Risk
- AI risk splits into four categories: model mistakes (e.g., hallucinations), harmful use (dual-use capabilities), societal/psychological impacts, and loss-of-control scenarios.
- Each category requires different mitigation strategies and should be considered together, not separately.
GCG Jailbreak Revealed Transferable Prompt Hacks
- The GCG jailbreak method automated prompt manipulation by optimizing a suffix of nonsense-looking tokens that steers the model toward harmful outputs.
- The team found those optimized strings transferred to commercial models, revealing universal and transferable jailbreaks.
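The core idea behind GCG-style attacks is discrete optimization: repeatedly swap single tokens in an adversarial suffix and keep whichever swap most lowers a loss measuring how far the model is from producing the target output. The sketch below is a heavily simplified toy, not the real attack: the loss function, vocabulary, and `_TRIGGER` pattern are all hypothetical stand-ins for a language model's negative log-likelihood of a harmful target string, and it does an exhaustive coordinate sweep where real GCG uses gradients with respect to one-hot token indicators to shortlist top-k candidate substitutions.

```python
import random

# Toy stand-in for a model's loss on a target output given an adversarial
# suffix. In real GCG this is the LM's negative log-likelihood of a target
# string; here it simply counts positions that miss a hidden trigger pattern.
VOCAB = list(range(50))
_TRIGGER = [7, 13, 42, 3]  # hypothetical; not known to the optimizer

def loss(suffix):
    # Lower is "better" for the attacker.
    return sum(1 for s, t in zip(suffix, _TRIGGER) if s != t)

def gcg_sketch(suffix_len=4, sweeps=2, seed=0):
    """Coordinate-descent caricature of GCG: for each suffix position,
    evaluate every candidate token and keep the one with the lowest loss.
    Real GCG avoids this exhaustive scan by ranking substitutions with
    token-gradient information, then sampling among the top candidates."""
    rng = random.Random(seed)
    suffix = [rng.choice(VOCAB) for _ in range(suffix_len)]
    for _ in range(sweeps):
        for pos in range(suffix_len):
            suffix[pos] = min(
                VOCAB,
                key=lambda tok: loss(suffix[:pos] + [tok] + suffix[pos + 1:]),
            )
    return suffix

best = gcg_sketch()
print(best, loss(best))  # recovers the trigger pattern with loss 0
```

Because this toy loss decomposes per position, one sweep already finds the optimum; against a real model the loss is highly non-separable, which is why GCG needs gradient-guided candidate selection and many iterations, and why the optimized strings look like nonsense yet transfer across models.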

