#289 Eiso Kant: How Reinforcement Learning and Coding Could Unlock Human-Level AI

Sep 24, 2025

Eiso Kant, Co-founder of Poolside, is an entrepreneur dedicated to developing AI systems that leverage reinforcement learning for software development. He reveals why coding serves as a perfect training ground for achieving human-level intelligence. Eiso dives into how Poolside uses reinforcement learning from code execution to improve AI capabilities and shares insights on building foundation models that focus on reasoning. He discusses the exciting evolution of agentic AI and its role in transforming enterprise software development.

INSIGHT

RL From Code Execution At Scale

RLCF (reinforcement learning from code execution feedback) trains models by sampling many rollouts, executing the resulting code, and rewarding successful executions while penalizing failures.
Poolside scaled to ~1M containerized repos and millions of revisions to diversify training tasks.
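
As a rough illustration of the reward signal described above, the sketch below scores a handful of candidate rollouts by actually running them against a test. It is a minimal toy, not Poolside's pipeline: the `ROLLOUTS` strings stand in for model samples, `TEST_SNIPPET` stands in for a repository's test suite, and a plain subprocess stands in for a container. All of these names are assumptions made for the example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical test appended to every candidate; stands in for a repo's test suite.
TEST_SNIPPET = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"


def execution_reward(candidate_source: str, timeout_s: float = 10.0) -> float:
    """Return 1.0 if the candidate passes the tests when executed, else 0.0."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(candidate_source + "\n" + TEST_SNIPPET)
        try:
            # A real system would run this inside an isolated container.
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return 0.0  # Hanging code counts as a failure.
    return 1.0 if result.returncode == 0 else 0.0


def score_rollouts(rollouts: list[str]) -> list[float]:
    """Score each sampled rollout independently by executing it."""
    return [execution_reward(source) for source in rollouts]


# Stand-ins for several samples drawn from a model for the same task.
ROLLOUTS = [
    "def add(a, b):\n    return a + b",   # correct
    "def add(a, b):\n    return a - b",   # runs, but fails the test
    "def add(a, b)\n    return a + b",    # syntax error
]

rewards = score_rollouts(ROLLOUTS)
```

Here only the first rollout earns a reward; the other two fail at execution time, which is the kind of signal a training loop could then use to update the model.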

INSIGHT

From Edits To Agentic, Multi-Step Learning

Agents extend RLCF beyond single edits into multi-step interactions with tools and systems.
Agentic training lets models run commands, open files, install dependencies, and optimize for longer-horizon rewards.
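
To make the multi-step idea concrete, here is a small sketch of an agent episode in which several tool calls happen before any reward is assigned. Everything in it is hypothetical: `ToyWorkspace` is an in-memory stand-in for a containerized repo, the tool names (`open_file`, `install_dep`, `write_file`, `run_tests`) and the `toydep` dependency are invented for illustration, and the scripted action list replaces a real model policy.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWorkspace:
    """In-memory stand-in for a containerized repo (hypothetical)."""
    files: dict = field(default_factory=dict)
    installed: set = field(default_factory=set)

    def tests_pass(self) -> bool:
        # Toy success condition: the dependency is installed and the bug is fixed.
        return "toydep" in self.installed and "return a + b" in self.files.get("calc.py", "")

    def step(self, tool: str, arg: str, payload: str = "") -> str:
        """Apply one tool call and return the observation the agent would see."""
        if tool == "open_file":
            return self.files.get(arg, "<missing>")
        if tool == "install_dep":
            self.installed.add(arg)
            return f"installed {arg}"
        if tool == "write_file":
            self.files[arg] = payload
            return "ok"
        if tool == "run_tests":
            return "pass" if self.tests_pass() else "fail"
        return "unknown tool"


def run_episode(env: ToyWorkspace, actions: list, max_steps: int = 8):
    """Run a sequence of tool calls; the reward is assigned only at the end."""
    trajectory = []
    for tool, arg, payload in actions[:max_steps]:
        observation = env.step(tool, arg, payload)
        trajectory.append((tool, arg, observation))
    reward = 1.0 if env.tests_pass() else 0.0
    return trajectory, reward


BUGGY = "def add(a, b):\n    return a - b\n"
FIXED = "def add(a, b):\n    return a + b\n"

# Scripted actions standing in for a model's decisions.
ACTIONS = [
    ("open_file", "calc.py", ""),
    ("run_tests", "", ""),
    ("install_dep", "toydep", ""),
    ("write_file", "calc.py", FIXED),
    ("run_tests", "", ""),
]

full_trajectory, full_reward = run_episode(ToyWorkspace(files={"calc.py": BUGGY}), ACTIONS)
_, truncated_reward = run_episode(ToyWorkspace(files={"calc.py": BUGGY}), ACTIONS, max_steps=2)
```

The full episode ends with passing tests and a reward of 1.0, while the episode cut off after two steps earns nothing, since the reward depends on the whole sequence of actions and not on any single edit.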

ADVICE

Ship The Trained Agent As Product

Ship the same agent used in RL training as the product so learning continues in production.
Give agents clarification tools so they can ask humans questions, and test-time compute so they can iterate on failures.
