
The a16z Show What's Missing Between LLMs and AGI - Vishal Misra & Martin Casado
Mar 17, 2026. Vishal Misra, Columbia professor and AI researcher, digs into how transformers may update their predictions in a Bayesian fashion. He explores why that still falls short of consciousness. The conversation turns to what AGI would really need: continual learning, causal reasoning, new abstractions, and why scaling alone won't get us there.
The Cricket DSL That Pushed GPT-3 Into Production
- Vishal Misra got GPT-3 to translate English cricket-stat questions into a DSL it had never seen and shipped it at ESPN.
- He built semantic retrieval over 1,500 examples, effectively an early RAG system, then set out to understand why it worked.
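The retrieval step described above can be sketched as follows. This is a minimal, hypothetical illustration of the pattern (embed the example bank, rank by similarity, put the nearest examples in the prompt), not the actual ESPN system; the bag-of-words "embedding," the example bank, and all function names are invented stand-ins.

```python
# Toy semantic retrieval over worked examples, in the spirit of an early RAG
# pipeline: embed each English-question -> DSL pair, then pull the nearest
# examples into the prompt. A real system would use learned embeddings.
import math
from collections import Counter

def embed(text):
    """Crude bag-of-words vector; stands in for a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical example bank (the real one had ~1,500 entries).
examples = [
    ("most runs by a batsman in 2019", "STATS(runs, year=2019, order=desc)"),
    ("best bowling figures at Lords", "STATS(bowling, venue=Lords, order=best)"),
    ("highest team total in World Cups", "STATS(team_total, event=WC, order=desc)"),
]

def retrieve(query, k=2):
    """Return the k examples most similar to the query, for prompt assembly."""
    q = embed(query)
    ranked = sorted(examples, key=lambda ex: cosine(q, embed(ex[0])), reverse=True)
    return ranked[:k]

prompt_examples = retrieve("who scored the most runs in 2019")
```

The retrieved pairs would then be prepended to the user's question as in-context demonstrations before calling the model.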
LLMs Work Like a Sparse Prompt-to-Token Matrix
- Misra models an LLM as a gigantic sparse matrix mapping prompt rows to next-token probability distributions.
- The prompt token "protein" shifts probability toward "synthesis" or "shake," and that single next token radically changes the entire continuation distribution.
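The matrix picture above can be made concrete with a toy lookup table. The prompts and probabilities below are invented purely for illustration; the point is that appending one sampled token selects a very different row, and with it a disjoint set of continuations.

```python
# Minimal sketch of the "giant sparse matrix" view: each prompt (a row) maps
# to a next-token probability distribution. A real LLM computes the row on
# the fly rather than storing it; these numbers are made up.
matrix = {
    "the protein": {"synthesis": 0.4, "shake": 0.3, "folds": 0.2, "is": 0.1},
    "the protein shake": {"tastes": 0.5, "is": 0.3, "recipe": 0.2},
    "the protein synthesis": {"pathway": 0.6, "rate": 0.25, "is": 0.15},
}

def next_token_dist(prompt):
    """Look up the next-token distribution for this exact prompt row."""
    return matrix[prompt]

# Sampling "shake" vs "synthesis" after "the protein" jumps to rows whose
# supports barely overlap, radically changing every later continuation.
after_shake = next_token_dist("the protein shake")
after_synthesis = next_token_dist("the protein synthesis")
```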
In-Context Learning Behaves Like Bayesian Updating
- In-context learning looks like Bayesian updating because each example changes the model's next-token beliefs in real time.
- In Misra's cricket DSL tests, English-token probabilities fell while DSL-token probabilities rose with each demonstration until the correct output became nearly certain.
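The dynamic in the bullets above can be sketched as a Bayes update over two hypotheses, plain English versus the cricket DSL. The likelihood values below are invented for illustration; the shape of the result, DSL probability rising toward certainty with each demonstration, is what the snip describes.

```python
# Hedged sketch of in-context learning as Bayesian updating: a prior over two
# "languages" (English vs the cricket DSL) sharpens toward the DSL as each
# in-context demonstration arrives. Likelihoods here are illustrative.
def update(prior_dsl, likelihood_dsl=0.9, likelihood_english=0.2):
    """One Bayes step: posterior is proportional to prior times likelihood."""
    num = prior_dsl * likelihood_dsl
    den = num + (1 - prior_dsl) * likelihood_english
    return num / den

p = 0.1  # prior belief that the continuation should be DSL tokens
history = [p]
for _ in range(4):  # four in-context demonstrations
    p = update(p)
    history.append(p)
# p climbs toward 1: DSL-token probability rises, English-token probability falls
```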

