AI + a16z

What's Missing Between LLMs and AGI - Vishal Misra & Martin Casado

Mar 17, 2026
Vishal Misra, a Columbia professor and vice dean who studies how LLMs function, describes experiments showing that transformers update token predictions in mathematically predictable ways. He compares LLM learning to human learning, highlights the plasticity gap, and argues that moving from pattern matching to causal, continual learning is essential for true AGI.
ANECDOTE

Building A DSL Front End For StatsGuru

  • Vishal Misra built a DSL and used GPT-3 in 2020 to translate natural-language cricket queries into database queries for ESPN.
  • He few-shot primed GPT-3 with ~1,500 examples and by October 2020 had deployed a production front end that completed DSL tokens for queries it had never seen.
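The priming step above can be sketched as prompt construction: concatenate (natural-language, DSL) example pairs, then leave the DSL line for the new query open for the model to complete. This is a minimal sketch; the cricket queries and DSL syntax below are invented for illustration, not Misra's actual StatsGuru DSL.

```python
# Hypothetical (NL query, DSL translation) pairs used as few-shot examples.
EXAMPLES = [
    ("runs scored by Tendulkar in ODIs",
     "SELECT runs WHERE player='Tendulkar' AND format='ODI'"),
    ("wickets taken by Warne in Tests",
     "SELECT wickets WHERE player='Warne' AND format='Test'"),
]

def build_prompt(examples, query):
    """Concatenate example pairs, then leave the new query's
    DSL line open for the model to complete."""
    lines = []
    for nl, dsl in examples:
        lines.append(f"Q: {nl}")
        lines.append(f"DSL: {dsl}")
    lines.append(f"Q: {query}")
    lines.append("DSL:")  # the model fills in the rest of this line
    return "\n".join(lines)

prompt = build_prompt(EXAMPLES, "centuries by Lara in Tests")
print(prompt)
```

In production the returned string would be sent to the completion API; the model, conditioned on the examples, emits the DSL tokens after the final `DSL:`.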
INSIGHT

LLMs As Sparse Prompt-to-Token Matrices

  • Misra models an LLM as a giant sparse matrix mapping every possible prompt (a row) to a next-token probability distribution (the columns).
  • Real models compress this matrix: the combinatorial prompt space is astronomically large but mostly gibberish, so the matrix is sparse.
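The matrix view above can be made concrete with a toy dict-of-dicts that stores only the non-empty rows, mirroring the sparsity argument. The prompts and probabilities here are invented for illustration.

```python
# Rows are prompts, columns are next-token probabilities.
# Only non-gibberish rows are stored; everything else is empty.
matrix = {
    "the cat sat on the": {"mat": 0.7, "floor": 0.2, "roof": 0.1},
    "once upon a":        {"time": 0.95, "hill": 0.05},
}

def next_token_distribution(prompt):
    # Almost every possible prompt is gibberish and absent:
    # its row is empty, which is what makes the matrix sparse.
    return matrix.get(prompt, {})

dist = next_token_distribution("the cat sat on the")
assert abs(sum(dist.values()) - 1.0) < 1e-9  # a valid distribution
print(max(dist, key=dist.get))  # most likely next token: "mat"
```

A trained transformer, on this view, is a learned compression of this table: it never materializes the rows, but generalizes a distribution for prompts it has never stored.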
INSIGHT

In-Context Learning Is Bayesian Updating

  • In-context learning is the model updating its next-token posterior as it sees examples, resembling Bayesian updating.
  • Misra observed the DSL-token probabilities rising with each added example, approaching 100% for new queries.
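The rising probabilities described above can be reproduced with a toy Bayesian update: hold a posterior over candidate "tasks", and let each in-context example that fits the DSL task sharpen it, so the probability of emitting DSL tokens climbs toward 1. All numbers here are invented for illustration.

```python
def update(posterior, likelihoods):
    """One Bayes step: multiply prior by likelihood, renormalize."""
    unnorm = {h: posterior[h] * likelihoods[h] for h in posterior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# Two hypotheses about what the prompt is asking for.
posterior = {"dsl_task": 0.5, "plain_english": 0.5}
# Each example is far more probable under the DSL-task hypothesis.
likelihoods = {"dsl_task": 0.9, "plain_english": 0.1}

for i in range(5):
    posterior = update(posterior, likelihoods)
    print(f"after example {i + 1}: P(dsl_task) = {posterior['dsl_task']:.4f}")
```

Each consistent example multiplies the odds in favor of the DSL-task hypothesis, so the posterior converges toward 1 geometrically, matching the near-100% token probabilities Misra reports.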