Eye On A.I.

#298 Ryan Kolln: How Appen Trains the World's Most Powerful AI Models

Nov 6, 2025
Ryan Kolln, CEO of Appen, discusses the critical role of human evaluation in training AI models. He explains why traditional benchmarks fall short, emphasizing the need for user-centered measures. Kolln highlights how curated human evaluators provide richer insights than random feedback, ensuring AI's cultural relevance through localized data. He also covers the evolution from supervised learning to large language model evaluations, and the synergy between AI evaluators and human annotators in enhancing quality control and model performance.
INSIGHT

Scale Requires Cultural Matching

  • Appen maintains millions of contributors worldwide and focuses on matching tasks to contributors with the right demographics.
  • The hard part is translating a single task into many languages and cultures while keeping quality consistent.
ADVICE

Verify And Monitor Evaluators Continuously

  • Verify contributor identity, location, and expertise with automated checks before using their labels.
  • Run real-time quality controls to catch low-effort or fraudulent work as it happens.
INSIGHT

Evals Commonly Use Thousands Of People

  • Model evaluations often require thousands of ongoing contributors and can spike to tens of thousands.
  • Large-scale human evals are common and necessary for robust measurement.