
AI + a16z What's Missing Between LLMs and AGI - Vishal Misra & Martin Casado
Mar 17, 2026

Vishal Misra, Columbia professor and vice dean specializing in how LLMs function, explains experiments showing that transformers update token predictions in mathematically predictable ways. He compares LLM learning to human learning, highlights the plasticity gap, and discusses why moving from pattern matching to causal, continual learning is vital for true AGI.
AI Snips
Building A DSL Front End For StatsGuru
- Vishal Misra built a DSL and, in 2020, used GPT-3 to translate natural-language cricket queries into database queries for ESPN.
- He few-shot primed GPT-3 with ~1,500 examples and by October 2020 had deployed a production front end that completed DSL tokens for unseen queries.
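The few-shot priming described above can be sketched as assembling (question, DSL) pairs into a single prompt that the model completes. The cricket queries and DSL syntax below are invented for illustration; the actual StatsGuru DSL is not given in the episode notes.

```python
# Minimal sketch of few-shot priming for NL-to-DSL translation.
# EXAMPLES stands in for a small sample of the ~1,500 pairs mentioned;
# the DSL shown here is hypothetical, not ESPN's real query language.

EXAMPLES = [
    ("most runs by an Indian batsman in 2019",
     "SELECT player, runs WHERE team=IND AND year=2019 ORDER BY runs DESC"),
    ("highest individual score at Lord's",
     "SELECT player, score WHERE ground=Lords ORDER BY score DESC LIMIT 1"),
]

def build_prompt(examples, query):
    """Concatenate solved (question, DSL) pairs, then leave the final
    query's DSL blank so the model completes the unseen DSL tokens."""
    parts = [f"Q: {q}\nDSL: {dsl}" for q, dsl in examples]
    parts.append(f"Q: {query}\nDSL:")
    return "\n\n".join(parts)

prompt = build_prompt(EXAMPLES, "most wickets in the 2020 season")
```

In production, this prompt would be sent to the GPT-3 completions endpoint and the returned DSL executed against the database; only the prompt-assembly step is shown here.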
LLMs As Sparse Prompt-to-Token Matrices
- Misra models an LLM as a giant sparse matrix mapping every possible prompt (row) to a next-token distribution (columns).
- Real models can compress this matrix because the combinatorial prompt space, though astronomically large, is mostly gibberish, leaving a sparse effective structure.
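The matrix view above can be made concrete with a toy sparse representation. The vocabulary and probabilities are invented; a real model compresses this mapping into network weights rather than storing rows explicitly.

```python
# Toy illustration of the "sparse prompt-to-token matrix" abstraction:
# rows are prompts, columns are next-token probabilities.
from collections import defaultdict

VOCAB = ["the", "cat", "sat", "mat", "<eos>"]

# Dense view: |VOCAB|^L rows for prompts of length L. Even a 5-word
# vocabulary at length 10 gives ~9.8 million rows; real vocabularies
# and context lengths make this astronomically large.
dense_rows = len(VOCAB) ** 10

# Sparse view: only meaningful prompts carry probability mass,
# so we store just the non-gibberish rows.
matrix = defaultdict(dict)
matrix[("the", "cat")] = {"sat": 0.9, "mat": 0.1}

def next_token_dist(prompt):
    """Look up the row for a prompt; gibberish rows are simply absent."""
    return matrix.get(tuple(prompt), {})
```

The gap between `dense_rows` and the handful of stored rows is the sparsity Misra appeals to: compression is possible precisely because almost all rows are empty.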
In-Context Learning Is Bayesian Updating
- In-context learning amounts to the model updating its next-token posterior as it sees examples, a process that resembles Bayesian updating.
- Misra observed the probability of the correct DSL token rising with each added example, approaching 100% even for unseen queries.
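The rising-probability behavior above can be mimicked with a toy Bayesian update over two hypothetical tasks the model might infer from the prompt. The hypothesis names and likelihood values are invented to show the mechanism, not taken from Misra's experiments.

```python
# Toy Bayesian-updating picture of in-context learning: each in-context
# example is more likely under the "this is the DSL task" hypothesis,
# so the posterior on that task sharpens toward 1.

def bayes_update(prior, likelihoods):
    """One Bayes step: posterior ∝ prior × likelihood, normalized."""
    unnorm = {h: prior[h] * likelihoods[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# Before any examples, the model is unsure which task the prompt poses.
posterior = {"dsl": 0.5, "other": 0.5}

# Invented per-example likelihoods: DSL-formatted examples are far more
# probable if the task really is DSL translation.
per_example = {"dsl": 0.9, "other": 0.2}

history = [posterior["dsl"]]
for _ in range(5):
    posterior = bayes_update(posterior, per_example)
    history.append(posterior["dsl"])
```

Each example multiplies the odds by 0.9/0.2 = 4.5, so `history` climbs monotonically toward 1, echoing the near-100% DSL token probabilities Misra reports.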

