
Practical AI: AI incidents, audits, and the limits of benchmarks
Feb 13, 2026

Sean McGregor, co-founder of the AI Verification & Evaluation Research Institute and creator of the AI Incident Database, is a specialist in AI safety and incident collection. He discusses why incident reporting matters and how databases track harms. He contrasts benchmarks with real-world audits, recounts red-team findings at DEF CON, and highlights common failure modes and the need for scalable verification.
Episode notes
Use A Broad Incident Definition
- 'Incident' purposely remains broad to cover harms with and without intent.
- A flexible definition helps catalog exposure, accidents, controversies, and harm events in one system.
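The AI Incident Database's actual schema is richer than this, but as an illustration only (all names here are hypothetical), a deliberately broad incident record might keep intent as optional metadata rather than a gatekeeping field, so exposures, accidents, controversies, and confirmed harms all fit in one system:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class HarmClass(Enum):
    # Broad categories so one catalog covers many event types
    EXPOSURE = "exposure"        # near-miss / risk exposure, no realized harm
    ACCIDENT = "accident"        # unintended harm
    CONTROVERSY = "controversy"  # contested or alleged harm
    HARM_EVENT = "harm_event"    # confirmed realized harm

@dataclass
class IncidentReport:
    """A broad incident record: intent is optional metadata, so harms
    with and without intent can be cataloged in the same system."""
    incident_id: int
    description: str
    harm_class: HarmClass
    intentional: Optional[bool] = None  # None = not established
    sources: list = field(default_factory=list)  # e.g. journalism links

def catalog(reports: list) -> dict:
    """Count reports per harm class for a simple incident-rate view."""
    counts: dict = {}
    for r in reports:
        counts[r.harm_class.value] = counts.get(r.harm_class.value, 0) + 1
    return counts
```

The point of the sketch is the absence of an intent filter: a report is admissible on the basis of the harm class alone, which matches the flexible definition described above.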
Journalism Now, Mandatory Reporting Later
- Public journalism currently supplies most incident reports because reporters validate core facts.
- Mandatory reporting will eventually be needed to measure true incident rates, not just the newsworthy cases.
Audit Models, Then Pilot In Your Context
- Do third-party audits for general-purpose models before trusting them in your context.
- Run pilots because vendor evaluations rarely cover your exact deployment distribution.

