Eye On A.I.

#298 Ryan Kolln: How Appen Trains the World's Most Powerful AI Models

Nov 6, 2025
Ryan Kolln, CEO of Appen, discusses the critical role of human evaluation in training AI models. He explains why traditional benchmarks fall short, emphasizing the need for user-centered measures. Kolln highlights how curated human evaluators provide richer insights than random feedback, ensuring AI's cultural relevance through localized data. He also covers the evolution from supervised learning to large language model evaluations, and the synergy between AI evaluators and human annotators in enhancing quality control and model performance.
INSIGHT

Scale Requires Cultural Matching

  • Appen maintains millions of contributors worldwide and focuses on matching tasks to contributors with the right demographics.
  • The hard part is translating a single task into many languages and cultures while keeping quality consistent.
ADVICE

Verify And Monitor Evaluators Continuously

  • Verify contributor identity, location, and expertise with automated checks before using their labels.
  • Run real-time quality controls to catch low-effort or fraudulent work as it happens.
INSIGHT

Evals Commonly Use Thousands Of People

  • Model evaluations often require thousands of ongoing contributors and can spike to tens of thousands.
  • Large-scale human evals are common and necessary for robust measurement.