

Latent Space: The AI Engineer Podcast
Latent.Space
The podcast by and for AI Engineers! In 2025, over 10 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0.
We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. We strive to give you everything from the definitive take on the Current Thing to the first introduction to the tech you'll be using in the next 3 months! We break news and run exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al.
Full show notes always on https://latent.space
Episodes

548 snips
Jan 6, 2026 • 24min
[State of Evals] LMArena's $1.7B Vision — Anastasios Angelopoulos, LMArena
Anastasios Angelopoulos, founder of LMArena, shares his journey from a Berkeley basement to a $100M valuation, and discusses why the team chose to spin out as a company to scale their mission. The conversation dives into Arena's approach to benchmarking AI models, the transparency of their public leaderboard, and their responses to critiques. Anastasios also reveals plans for expanding into new verticals like medicine and legal, the significance of community engagement, and the shift to multimodal arenas.

568 snips
Jan 2, 2026 • 28min
[NeurIPS Best Paper] 1000 Layer Networks for Self-Supervised RL — Kevin Wang et al, Princeton
Kevin Wang, an undergraduate researcher at Princeton, and Ishaan Javali, his co-author, discuss their groundbreaking work on scaling reinforcement learning networks to 1,000 layers deep, a feat previously deemed impossible. They dive into the shift from traditional reward maximization to self-supervised learning methods, highlighting architectural breakthroughs like residual connections. The duo also explores efficiency trade-offs, data collection techniques using JAX, and the implications for robotics, positioning their approach as a radical shift in reinforcement learning objectives.

407 snips
Dec 31, 2025 • 18min
[State of Code Evals] After SWE-bench, Code Clash & SOTA Coding Benchmarks recap — John Yang
Join John Yang, a Stanford PhD student and the mind behind SWE-bench and CodeClash, as he shares insights from the cutting-edge world of AI coding benchmarks. Discover how SWE-bench went from zero to industry standard in mere months, the limitations of traditional unit tests, and the innovative long-horizon tournaments of CodeClash. Yang dives into the debate around Tau-bench's 'impossible tasks' and explores the balance between autonomous agents and interactive workflows. Get ready for a glimpse into the future of human-AI collaboration!

448 snips
Dec 31, 2025 • 28min
[State of Post-Training] From GPT-4.1 to 5.1: RLVR, Agent & Token Efficiency — Josh McGrath, OpenAI
In this engaging discussion, Josh McGrath, a post-training researcher at OpenAI, dives into the evolution of AI models from GPT-4.1 to GPT-5.1. He highlights the importance of data quality over optimization methods and explains why RLHF and RLVR are simply variations of policy gradients. Josh also shares insights on how the shopping model enhances user experience with personality toggles and the complexities involved in scaling reinforcement learning. His call for more engineers proficient in both distributed systems and ML further emphasizes the need for interdisciplinary expertise in advancing AI.

601 snips
Dec 30, 2025 • 45min
[State of RL/Reasoning] IMO/IOI Gold, OpenAI o3/GPT-5, and Cursor Composer — Ashvin Nair, Cursor
In this engaging discussion, Ashvin Nair, a researcher with a rich background in robotics and AI, shares his journey from OpenAI to Cursor. He highlights the transition from robotic challenges to the quicker impact of language models. Ashvin delves into the economic dynamics of LLMs, the importance of co-designing models and products, and the complexities of continual learning. He also explores the limitations of scaling and the need for specialized models, offering insights into the future of coding automation and the evolving landscape of AI.

430 snips
Dec 30, 2025 • 29min
[State of AI Startups] Memory/Learning, RL Envs & DBT-Fivetran — Sarah Catanzaro, Amplify
Join Sarah Catanzaro, a general partner at Amplify Partners with a focus on data and AI infrastructure, as she discusses the evolving landscape of AI startups. She shares insights on the impact of the DBT-Fivetran merger and how data tools are vital for frontier labs. Sarah critiques the trend of massive seed funding without clear roadmaps while highlighting when such raises are warranted. Delve into exciting topics like memory management, personalization challenges in AI products, and the true essence of real-world training environments.

914 snips
Dec 27, 2025 • 1h 39min
One Year of MCP — with David Soria Parra and AAIF leads from OpenAI, Goose, Linux Foundation
David Soria Parra, the lead core maintainer of the Model Context Protocol (MCP) at Anthropic, shares insights from MCP’s rapid ascent in the AI world. Joined by Nick Cooper from OpenAI and Jim Zemlin, CEO of the Linux Foundation, they discuss the journey from Thanksgiving hackathons to widespread enterprise adoption. The trio explores the design challenges of ensuring interoperability between agents, the decision to join the AAIF for neutral governance, and how MCP enhances agent capabilities while maintaining flexibility and security.

1,650 snips
Dec 26, 2025 • 37min
Steve Yegge's Vibe Coding Manifesto: Why Claude Code Isn't It & What Comes After the IDE
Steve Yegge, a veteran software engineer known for his roles at Google and Amazon, dives deep into the future of coding. He argues that IDEs will soon be obsolete, pushing developers to orchestrate AI agents like NASCAR pit crews instead of writing traditional code. Steve warns against anthropomorphizing these agents, noting the risks they pose. He also discusses the growing challenge of merging in highly productive teams and predicts a world where multi-agent systems revolutionize code creation, likening it to a 'factory farming' approach.

683 snips
Dec 26, 2025 • 28min
⚡️GPT5-Codex-Max: Training Agents with Personality, Tools & Trust — Brian Fioca + Bill Chen, OpenAI
Explore Codex Max, designed to work autonomously for over 24 hours and manage code like a pro. Brian and Bill dive into the importance of training AI agents with personality and trust, making coding agents not only efficient but relatable. Discover how these agents develop tool preferences, like favoring 'rg' over 'grep', and learn about evaluations that go beyond academic benchmarks. They envision a future where coding agents enhance personal workflows and standardize high-level coding tasks across industries.

531 snips
Dec 18, 2025 • 1h 15min
SAM 3: The Eyes for AI — Nikhila & Pengchuan (Meta Superintelligence), ft. Joseph Nelson (Roboflow)
Nikhila Ravi leads the Segment Anything project at Meta, with Pengchuan Zhang contributing as a researcher specializing in vision models. They discuss the groundbreaking SAM 3, which enables concept segmentation using natural language prompts. The conversation dives into the impressive real-time performance, the massive SACO benchmark of over 200k concepts, and how SAM 3 revolutionizes data annotation—reducing time from two minutes to just 25 seconds. Joseph Nelson from Roboflow shares insights on real-world applications in fields like cancer research and the automation of complex visual reasoning.


