
Eye On A.I. #289 Eiso Kant: How Reinforcement Learning and Coding Could Unlock Human-Level AI
Sep 24, 2025
Eiso Kant, co-founder of Poolside, is an entrepreneur building AI systems that use reinforcement learning for software development. He explains why he sees coding as an ideal training ground for reaching human-level intelligence, how Poolside uses reinforcement learning from code execution to improve model capabilities, and what it takes to build foundation models that focus on reasoning. He also discusses the evolution of agentic AI and its role in transforming enterprise software development.
AI Snips
RL From Code Execution At Scale
- RLCF trains models by sampling many rollouts per task, executing the resulting code, and rewarding rollouts whose execution succeeds while penalizing those that fail.
- Poolside scaled to ~1M containerized repos and millions of revisions to diversify training tasks.
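
The sample-then-reward loop described above can be sketched in a few lines. This is a toy illustration, not Poolside's pipeline: the `CANDIDATES` list stands in for a model, the `add` task and its single check are hypothetical, and subtracting the group mean to get an advantage is one common baseline choice assumed here rather than something stated in the episode.

```python
import random
from typing import List, Tuple

# Hypothetical stand-in for model samples: candidate solutions to one task.
CANDIDATES = [
    "def add(a, b):\n    return a + b\n",  # correct
    "def add(a, b):\n    return a - b\n",  # runs, but wrong result
    "def add(a, b)\n    return a + b\n",   # syntax error, fails to execute
]


def sample_rollouts(n: int, rng: random.Random) -> List[str]:
    """Draw n candidate programs, mimicking repeated sampling from a model."""
    return [rng.choice(CANDIDATES) for _ in range(n)]


def execution_reward(source: str) -> float:
    """Return 1.0 if the code executes and passes the check, else 0.0."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # toy only; a real system would sandbox this
        return 1.0 if namespace["add"](2, 3) == 5 else 0.0
    except Exception:
        return 0.0


def score_rollouts(rollouts: List[str]) -> Tuple[List[float], List[float]]:
    """Reward each rollout, then centre rewards on the group mean (assumed baseline)."""
    rewards = [execution_reward(r) for r in rollouts]
    mean = sum(rewards) / len(rewards)
    advantages = [r - mean for r in rewards]
    return rewards, advantages


if __name__ == "__main__":
    rollouts = sample_rollouts(8, random.Random(0))
    rewards, advantages = score_rollouts(rollouts)
    print(rewards)
    print(advantages)
```

In a real setup the execution step would run inside an isolated container per repository, which is where the containerized repos mentioned above would come in.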
From Edits To Agentic, Multi-Step Learning
- Agents extend RLCF beyond single edits into multi-step interactions with tools and systems.
- Agentic training lets models run commands, open files, install dependencies, and aim for longer-range rewards.
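
To make the multi-step idea concrete, here is a minimal sketch of an episode loop in which the agent calls tools over several steps and only receives a reward at the end. Everything in it is hypothetical: `ToyRepoEnv`, the tool names, the `requests` dependency, and the scripted policy are invented for illustration and do not describe Poolside's environment.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

Step = Tuple[str, str, str]  # (tool, argument, observation)


@dataclass
class ToyRepoEnv:
    """Hypothetical sandbox: tests pass once the dependency is installed and the file is fixed."""
    files: Dict[str, str] = field(default_factory=lambda: {"app.py": "BUG"})
    installed: Set[str] = field(default_factory=set)

    def step(self, tool: str, arg: str) -> str:
        if tool == "open_file":
            return self.files.get(arg, "<missing>")
        if tool == "install_dep":
            self.installed.add(arg)
            return f"installed {arg}"
        if tool == "write_file":
            name, _, content = arg.partition(":")
            self.files[name] = content
            return f"wrote {name}"
        if tool == "run_tests":
            ok = "requests" in self.installed and self.files.get("app.py") == "FIXED"
            return "pass" if ok else "fail"
        return "unknown tool"


Policy = Callable[[str, List[Step]], Tuple[str, str]]


def run_episode(env: ToyRepoEnv, policy: Policy, max_steps: int = 8) -> Tuple[List[Step], float]:
    """Roll out one episode; the only reward is 1.0 if the tests eventually pass."""
    trajectory: List[Step] = []
    observation = "start"
    for _ in range(max_steps):
        tool, arg = policy(observation, trajectory)
        observation = env.step(tool, arg)
        trajectory.append((tool, arg, observation))
        if tool == "run_tests" and observation == "pass":
            return trajectory, 1.0
    return trajectory, 0.0


# A scripted stand-in for a learned policy: follow a fixed plan, then keep testing.
PLAN = [("open_file", "app.py"), ("install_dep", "requests"),
        ("write_file", "app.py:FIXED"), ("run_tests", "")]


def scripted_policy(observation: str, trajectory: List[Step]) -> Tuple[str, str]:
    return PLAN[min(len(trajectory), len(PLAN) - 1)]


if __name__ == "__main__":
    steps, reward = run_episode(ToyRepoEnv(), scripted_policy)
    print(len(steps), reward)
```

The point of the sketch is the reward placement: no single tool call is scored, so credit for the final passing test run has to be assigned across the whole trajectory.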
Ship The Trained Agent As Product
- Ship the same agent used in RL training as the product so learning continues in production.
- Provide clarification tools and test-time compute to let agents ask humans and iterate on failures.



