The Information Bottleneck

Diffusion LLMs & Why the Future of AI Won't Be Autoregressive - Stefano Ermon (Stanford / Inception AI)

Mar 19, 2026
Stefano Ermon is a Stanford professor, co-founder and CEO of Inception AI, and co-inventor of DDIM and related diffusion methods. He explains what diffusion LLMs are and why iterative refinement could overtake autoregressive models. The conversation covers discrete diffusion for text, inference speed and parallel generation, Mercury II's latency wins, and the implications for architectures, tooling, and scaling.
INSIGHT

Scaling Depends On Stage Not Just Model Size

  • Scaling considerations differ across the pretraining, post-training, and test-time-compute stages.
  • Ermon stresses that diffusion's inference-speed advantage pays off in RL post-training and latency-constrained tasks.
INSIGHT

Score Matching Theory Extends To Discrete Text

  • Discrete text diffusion carries theory over from continuous score-based models via a 'concrete score' and denoising objectives.
  • Ermon says the noise process needs a tractable transition kernel, but it need not be simple masking.
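To make the masking-based noise process concrete, here is a minimal sketch of the sampling loop a masked discrete-diffusion model runs at inference time: start from an all-mask sequence, predict every masked position in parallel, and reveal a fraction of positions per step. The `dummy_denoiser`, the toy vocabulary, and the unmasking schedule are all hypothetical stand-ins, not Mercury's actual implementation.

```python
import random

VOCAB = ["the", "cat", "sat", "on", "mat"]
MASK = "<mask>"

def dummy_denoiser(tokens):
    # Stand-in for a trained network: proposes a token for every
    # masked position in parallel. (Illustrative only.)
    return [random.choice(VOCAB) if t == MASK else t for t in tokens]

def masked_diffusion_sample(length, steps, denoiser, seed=0):
    """Iteratively unmask a sequence, revealing a chunk of positions per step."""
    random.seed(seed)
    tokens = [MASK] * length
    for step in range(steps):
        preds = denoiser(tokens)
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # Reveal a proportional share of the remaining masks each step;
        # a real sampler would typically keep the most confident predictions.
        k = max(1, len(masked) // (steps - step))
        for i in random.sample(masked, k):
            tokens[i] = preds[i]
    return tokens

out = masked_diffusion_sample(length=8, steps=4, denoiser=dummy_denoiser)
```

Note the contrast with autoregressive decoding: each of the 4 steps fills in several tokens at once, which is where the parallel-generation latency win comes from.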
ADVICE

Use Theory To Guide Experiments Not Replace Them

  • Use theory to prune the experiment space, but validate empirically; theory rarely fully predicts deep-learning outcomes.
  • Ermon recommends designing loss functions for numerical stability and the right inductive biases before committing to large-scale runs.