⚡️The End of SWE-Bench Verified — Mia Glaese & Olivia Watkins, OpenAI Frontier Evals & Human Data

Latent Space: The AI Engineer Podcast

Why SWE‑Bench Verified Saturated

Olivia explains SWE‑Bench Verified's saturation and contamination, arguing it no longer measures coding progress.

