
Practical AI: AI incidents, audits, and the limits of benchmarks
Feb 13, 2026

Sean McGregor, co-founder of the AI Verification & Evaluation Research Institute and creator of the AI Incident Database, is a specialist in AI safety and incident collection. He discusses why incident reporting matters and how databases track harms. He contrasts benchmarks with real-world audits, recounts red-team findings at DEF CON, and highlights common failure modes and the need for scalable verification.
Episode notes
Use A Broad Incident Definition
- 'Incident' purposely remains broad to cover harms with and without intent.
- A flexible definition helps catalog exposure, accidents, controversies, and harm events in one system.
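The AI Incident Database's actual schema is richer than this, but as an illustration only (all names here are hypothetical), a deliberately broad incident record might keep intent as optional metadata rather than a gatekeeping field, so exposures, accidents, controversies, and confirmed harms all fit in one system:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class HarmClass(Enum):
    # Broad categories so one catalog covers many event types
    EXPOSURE = "exposure"        # near-miss / risk exposure, no realized harm
    ACCIDENT = "accident"        # unintended harm
    CONTROVERSY = "controversy"  # contested or alleged harm
    HARM_EVENT = "harm_event"    # confirmed realized harm

@dataclass
class IncidentReport:
    """A broad incident record: intent is optional metadata, so harms
    with and without intent can be cataloged in the same system."""
    incident_id: int
    description: str
    harm_class: HarmClass
    intentional: Optional[bool] = None  # None = not established
    sources: list = field(default_factory=list)  # e.g. journalism links

def catalog(reports: list) -> dict:
    """Count reports per harm class for a simple incident-rate view."""
    counts: dict = {}
    for r in reports:
        counts[r.harm_class.value] = counts.get(r.harm_class.value, 0) + 1
    return counts
```

The point of the sketch is the absence of an intent filter: a report is admissible on the basis of the harm class alone, which matches the flexible definition described above.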
Journalism Now, Mandatory Reporting Later
- Public journalism currently supplies most incident reports because reporters validate core facts.
- Mandatory reporting will eventually be needed to measure true incident rates, not just the newsworthy cases.
Audit Models, Then Pilot In Your Context
- Do third-party audits for general-purpose models before trusting them in your context.
- Run pilots because vendor evaluations rarely cover your exact deployment distribution.

