Don't Worry About the Vase Podcast

Claude Opus 4.6 Escalates Things Quickly

Feb 11, 2026
Quoted Commentators, a rotating cast of technical voices and analysts, react to Claude Opus 4.6. They discuss rapid benchmark wins and regressions, surprising zero-day finds, agent and compiler builds, overeagerness, and personality shifts. Short takes cover coding, long-context gains, safety tradeoffs, and fast-moving ecosystem comparisons.
INSIGHT

Token Appetite Drives Performance And Cost

  • Opus 4.6 is extremely token-hungry, often using tens of thousands of output tokens and sometimes hitting limits.
  • High-token modes improve performance but raise both cost and run-to-run variability on evaluations.
ANECDOTE

Agent Teams Built A Working Compiler

  • Nicholas Carlini documented an autonomous agent harness where Opus 4.6 helped build a C compiler that compiles Linux.
  • The project used continuous agent loops and extensive test harnesses across many sessions, which added up in cost.
INSIGHT

Security Risk From Automated Exploit Discovery

  • Opus 4.6 discovered many zero-day vulnerabilities and exploited them in tests, showing offensive capability.
  • This increases concern about democratizing powerful attack automation.