The PhD students who became the judges of the AI industry

Equity

Measuring real-world intelligence versus static tests

Anastasios contrasts static benchmarks with Arena's live, user-driven evaluations, which resist overfitting and reflect a diverse range of tasks.

