
The Lawfare Podcast Lawfare Archive: Elliot Jones on the Importance and Current Limitations of AI Testing
Mar 15, 2026
Elliot Jones is a Senior Researcher at the Ada Lovelace Institute who studies AI evaluation and governance. He discusses why AI evaluations, audits, and benchmarks matter now, and explains the technical and governance hurdles in testing foundation models. He covers who should run assessments, regulatory approaches in the EU, UK, and US, and risks such as audit-washing and limited test coverage.
Use Audits For Standardized Governance Checks
- Treat audits as structured, standardized assessments that can cover governance and organizational practices, not just model behavior.
- Jones recommends that audits define clear endpoints and processes, as financial audits do, rather than rely on ad hoc testing.
Regulatory Approaches Are Diverging Globally
- EU, UK, and US approaches differ: the EU's AI Act moves toward mandatory assessments and third-party checks, while the UK and US rely more on voluntary safety institutes.
- Jones notes that the EU may require third-party assessors or evaluations by a centralized office.
Independent Institutes Reduce Gaming Of Tests
- Government-run AI Safety Institutes reduce the risk of gaming because companies don't know the exact tests or answers in advance.
- Jones highlights the UK institute developing its own evaluations and publishing results to avoid company-selected benchmarks.

