
What happens when AI runs a store
Designing effective, out-of-distribution evaluations
Lukas explains their evaluation approach: put models into unfamiliar domains to reveal real-world limits.
Play episode from 56:00
Transcript
