The American Compass Podcast

Measuring Machine Intelligence with Chris Painter

Apr 17, 2026
Chris Painter, president of Model Evaluation and Threat Research (METR), designs ways to measure AI autonomy and catastrophic risk. He discusses using time-horizon metrics to gauge sustained autonomy, software-task benchmarks, the rapid progress observed in agent capabilities, barriers to fully automated AI research, and how compute availability and allocation shape future risks.
INSIGHT

Measuring Autonomy Is Key To Assessing AI Catastrophe Risk

  • METR focuses on measuring when AI systems could pose catastrophic risks by assessing their level of autonomy rather than just misuse scenarios.
  • They target AI agent capabilities, such as automating AI research or sabotaging their developers, as distinct threats requiring different evaluations.
INSIGHT

Differentiate Misuse From Intrinsic Autonomous Risks

  • Distinguish misuse risks (humans using AI to build harmful things) from novel risks intrinsic to autonomous AI pursuing misaligned goals.
  • Intrinsic risks involve means, motive, and opportunity: high-capability systems could pursue hidden objectives and act in the world beyond tool-like assistance.
INSIGHT

Model Time Horizon Translates Human Task Time Into AI Autonomy

  • METR's core metric is the model time horizon: the length of a human-baselined task (in minutes or hours) that an AI agent can complete autonomously at a 50% success rate.
  • They benchmarked roughly 200 software engineering tasks, timing expert humans on each, then measuring agent pass rates to fit a success curve.
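The fitting idea described above can be sketched in a few lines: regress agent success against the log of human completion time with a logistic curve, then solve for the task length at which predicted success crosses 50%. Everything here is illustrative, not METR's actual pipeline: the data is synthetic, and the specific parameterization and gradient-descent fit are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical benchmark: 500 tasks whose human completion times are drawn
# so that log2(minutes) is uniform on [1, 9] (i.e. 2 min to ~8.5 hours).
log2_t = rng.uniform(1, 9, size=500)
# Simulated agent whose true 50% time horizon is 32 minutes (log2 = 5).
p_true = 1 / (1 + np.exp(log2_t - 5))
success = (rng.random(500) < p_true).astype(float)

def fit_horizon(x, y, lr=0.1, steps=5000):
    """Fit P(success) = sigmoid(a - b * log2(minutes)) by gradient descent
    on the logistic (cross-entropy) loss, then return the task length at
    which predicted success is exactly 50%, i.e. 2 ** (a / b) minutes."""
    a, b = 0.0, 1.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(a - b * x)))
        g = p - y                    # dLoss/dz for the logistic loss
        a -= lr * g.mean()           # dz/da = 1
        b -= lr * -(g * x).mean()    # dz/db = -x
    return 2 ** (a / b)              # minutes at 50% success

horizon = fit_horizon(log2_t, success)
```

With enough tasks, the fitted horizon recovers the simulated agent's 32-minute horizon to within sampling noise; the same curve read at other thresholds (e.g. 80%) gives more conservative horizons.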