

Latent Space: The AI Engineer Podcast
Latent.Space
The podcast by and for AI Engineers! In 2025, over 10 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0.
We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. We strive to give you everything from the definitive take on the Current Thing to the first introduction to the tech you'll be using in the next 3 months! We break news and run exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al.
Full show notes always on https://latent.space
Episodes

548 snips
Jan 6, 2026 • 24min
[State of Evals] LMArena's $1.7B Vision — Anastasios Angelopoulos, LMArena
Anastasios Angelopoulos, founder of LMArena, shares his journey from a Berkeley basement to a $100M valuation, and discusses why the team chose to spin out as a company to scale their mission. The conversation dives into Arena's approach to benchmarking AI models, the transparency of their public leaderboard, and their responses to critiques. Anastasios also reveals plans for expanding into new verticals like medicine and legal, the significance of community engagement, and the shift to multimodal arenas.

568 snips
Jan 2, 2026 • 28min
[NeurIPS Best Paper] 1000 Layer Networks for Self-Supervised RL — Kevin Wang et al, Princeton
Kevin Wang, an undergraduate researcher at Princeton, and Ishaan Javali, his co-author, discuss their groundbreaking work on scaling reinforcement learning networks to 1,000 layers deep, a feat previously deemed impossible. They dive into the shift from traditional reward maximization to self-supervised learning methods, highlighting architectural breakthroughs like residual connections. The duo also explores efficiency trade-offs, data collection techniques using JAX, and the implications for robotics, positioning their approach as a radical shift in reinforcement learning objectives.

407 snips
Dec 31, 2025 • 18min
[State of Code Evals] After SWE-bench, Code Clash & SOTA Coding Benchmarks recap — John Yang
Join John Yang, a Stanford PhD student and the mind behind SWE-bench and CodeClash, as he shares insights from the cutting-edge world of AI coding benchmarks. Discover how SWE-bench went from zero to industry standard in mere months, the limitations of traditional unit tests, and the innovative long-horizon tournaments of CodeClash. Yang dives into the debate around Tau-bench's 'impossible tasks' and explores the balance between autonomous agents and interactive workflows. Get ready for a glimpse into the future of human-AI collaboration!

448 snips
Dec 31, 2025 • 28min
[State of Post-Training] From GPT-4.1 to 5.1: RLVR, Agent & Token Efficiency — Josh McGrath, OpenAI
In this engaging discussion, Josh McGrath, a post-training researcher at OpenAI, dives into the evolution of AI models from GPT-4.1 to GPT-5.1. He highlights the importance of data quality over optimization methods and explains why RLHF and RLVR are simply variations of policy gradients. Josh also shares insights on how the shopping model enhances user experience with personality toggles and the complexities involved in scaling reinforcement learning. His call for more engineers proficient in both distributed systems and ML further emphasizes the need for interdisciplinary expertise in advancing AI.

601 snips
Dec 30, 2025 • 45min
[State of RL/Reasoning] IMO/IOI Gold, OpenAI o3/GPT-5, and Cursor Composer — Ashvin Nair, Cursor
In this engaging discussion, Ashvin Nair, a researcher with a rich background in robotics and AI, shares his journey from OpenAI to Cursor. He highlights the transition from robotic challenges to the quicker impact of language models. Ashvin delves into the economic dynamics of LLMs, the importance of co-designing models and products, and the complexities of continual learning. He also explores the limitations of scaling and the need for specialized models, offering insights into the future of coding automation and the evolving landscape of AI.

430 snips
Dec 30, 2025 • 29min
[State of AI Startups] Memory/Learning, RL Envs & DBT-Fivetran — Sarah Catanzaro, Amplify
Join Sarah Catanzaro, a general partner at Amplify Partners with a focus on data and AI infrastructure, as she discusses the evolving landscape of AI startups. She shares insights on the impact of the DBT-Fivetran merger and how data tools are vital for frontier labs. Sarah critiques the trend of massive seed funding without clear roadmaps while highlighting when such raises are warranted. Delve into exciting topics like memory management, personalization challenges in AI products, and the true essence of real-world training environments.

914 snips
Dec 27, 2025 • 1h 39min
One Year of MCP — with David Soria Parra and AAIF leads from OpenAI, Goose, Linux Foundation
David Soria Parra, the lead core maintainer of the Model Context Protocol (MCP) at Anthropic, shares insights from MCP’s rapid ascent in the AI world. Joined by Nick Cooper from OpenAI and Jim Zemlin, CEO of the Linux Foundation, they discuss the journey from Thanksgiving hackathons to widespread enterprise adoption. The trio explores the design challenges of ensuring interoperability between agents, the decision to join the AAIF for neutral governance, and how MCP enhances agent capabilities while maintaining flexibility and security.

1,650 snips
Dec 26, 2025 • 37min
Steve Yegge's Vibe Coding Manifesto: Why Claude Code Isn't It & What Comes After the IDE
Steve Yegge, a veteran software engineer known for his roles at Google and Amazon, dives deep into the future of coding. He argues that IDEs will soon be obsolete, pushing developers to orchestrate AI agents like NASCAR pit crews instead of writing traditional code. Steve warns against anthropomorphizing these agents, noting the risks they pose. He also discusses the growing challenge of merging in highly productive teams and predicts a world where multi-agent systems revolutionize code creation, likening it to a 'factory farming' approach.

683 snips
Dec 26, 2025 • 28min
⚡️GPT5-Codex-Max: Training Agents with Personality, Tools & Trust — Brian Fioca + Bill Chen, OpenAI
Explore Codex Max, designed to work autonomously for over 24 hours and manage code like a pro. Brian and Bill dive into the importance of training AI agents with personality and trust, making coding agents not only efficient but relatable. Discover how these agents develop tool preferences, like favoring 'rg' over 'grep', and learn about evaluations that go beyond academic benchmarks. They envision a future where coding agents enhance personal workflows and standardize high-level coding tasks across industries.

531 snips
Dec 18, 2025 • 1h 15min
SAM 3: The Eyes for AI — Nikhila & Pengchuan (Meta Superintelligence), ft. Joseph Nelson (Roboflow)
Nikhila Ravi leads the Segment Anything project at Meta, with Pengchuan Zhang contributing as a researcher specializing in vision models. They discuss the groundbreaking SAM 3, which enables concept segmentation using natural language prompts. The conversation dives into the impressive real-time performance, the massive SACO benchmark of over 200k concepts, and how SAM 3 revolutionizes data annotation—reducing time from two minutes to just 25 seconds. Joseph Nelson from Roboflow shares insights on real-world applications in fields like cancer research and the automation of complex visual reasoning.


