Ep 80: CEO of Surge AI Edwin Chen on Why Frontier Labs Are Diverging, RL Environments & Developing Model Taste

174 snips

Dec 15, 2025

Edwin Chen, Founder and CEO of Surge AI, shares insights from his data infrastructure company, which supports major AI labs like OpenAI and Meta. He discusses the pitfalls of optimizing for clickbait benchmarks and explains how that practice gives a misleading picture of model quality. Chen emphasizes rigorous human evaluation over gaming benchmarks, and he critiques Silicon Valley's pivot culture. The conversation also covers the diversity of AI training approaches, with Chen advocating for multiple opinionated models tailored to specific needs.

AI Snips

INSIGHT

RL Environments Are The Next Data Frontier

RL environments represent the next step beyond SFT and RLHF, requiring realistic simulated worlds and extensive tooling.
Building them demands populated worlds, executable tools, diverse prompts, and deep measurement infrastructure.

ADVICE

Track Trajectories, Not Only Final Rewards

Track model trajectories and dig into failures to avoid short-term reward hacks masking deeper weaknesses.
Analyze why models fail, not just whether they get rewarded, to shore up underlying capabilities.
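
As a rough illustration of this advice, the sketch below stores every step of a run next to its final reward, then counts failure causes and flags runs that were rewarded despite a failed step. Everything in it is an assumption made for illustration: the `Step` and `Trajectory` records, the tool names, the failure labels, and the pass threshold are hypothetical and do not come from the episode.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Step:
    tool: str       # hypothetical name of the tool the model called
    ok: bool        # whether this step succeeded
    note: str = ""  # hypothetical failure label, empty on success


@dataclass
class Trajectory:
    task_id: str
    steps: list = field(default_factory=list)
    reward: float = 0.0  # final reward assigned to the whole run


def first_failure(traj):
    """Return the first failed step in a trajectory, or None if all succeeded."""
    for step in traj.steps:
        if not step.ok:
            return step
    return None


def summarize(trajs, pass_threshold=1.0):
    """Report the reward rate alongside what actually went wrong inside runs."""
    causes = Counter()
    masked = []  # runs that were rewarded even though a step failed
    for t in trajs:
        step = first_failure(t)
        if step is None:
            continue
        causes[(step.tool, step.note)] += 1
        if t.reward >= pass_threshold:
            masked.append(t.task_id)
    rewarded = sum(t.reward >= pass_threshold for t in trajs)
    return {
        "rewarded": rewarded,
        "total": len(trajs),
        "failure_causes": causes,
        "masked": masked,
    }


# Made-up example data: run "b" is rewarded despite a failed search step.
trajs = [
    Trajectory("a", [Step("search", True), Step("book", True)], 1.0),
    Trajectory("b", [Step("search", False, "bad_query"), Step("book", True)], 1.0),
    Trajectory("c", [Step("search", False, "bad_query")], 0.0),
]
report = summarize(trajs)
print(report)
```

Looking only at the final reward, two of the three runs pass. The per-step view shows the same search failure in two runs, one of which was rewarded anyway, which is the kind of underlying weakness a reward-only view can hide.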

INSIGHT

Human Data, Not Just Staffing, Powers Environments

Creating RL environments is primarily a human-data challenge that requires technology to scale and verify quality.
Pure staffing approaches miss the tooling and quality signals needed for rich, creative worlds.
