
Don't Worry About the Vase Podcast
Claude Opus 4.6 Escalates Things Quickly
Feb 11, 2026
Quoted Commentators, a rotating cast of technical voices and analysts, react to Claude Opus 4.6. They discuss rapid benchmark wins and regressions, and describe surprising zero-day finds, agent-built compilers, overeagerness, and personality shifts. Short takes cover coding, long-context gains, safety tradeoffs, and comparisons across a fast-moving ecosystem.
AI Snips
Token Appetite Drives Performance And Cost
- Opus 4.6 is extremely token-hungry, often using tens of thousands of output tokens and sometimes hitting limits.
- High-token modes improve performance but raise cost and increase run-to-run variability on evaluations.
Agent Teams Built A Working Compiler
- Nicholas Carlini documented an autonomous agent harness in which Opus 4.6 helped build a C compiler that can compile the Linux kernel.
- The project relied on continuous agent loops and an extensive test harness, running over many sessions at substantial cost.
Security Risk From Automated Exploit Discovery
- Opus 4.6 discovered many zero-day vulnerabilities and exploited them in tests, demonstrating offensive capability.
- This raises concern that powerful attack automation is becoming widely accessible.
