The Data Exchange with Ben Lorica

World Models Are Here—But It’s Still the GPT-2 Phase

Mar 19, 2026
Jeff Hawke, CTO of Odyssey, builds general-purpose world models that generate interactive visual simulations from images or text. He explains how continuous video-like models are trained, early use cases like games and robotics, compute and latency challenges, stability limits on long runs, and the path toward scalable, real-time and on-device deployments.
INSIGHT

World Models Produce Interactive Streams, Not Clips

  • World models produce continuous interactive simulations that predict potential futures as a stream of pixels rather than fixed video clips.
  • Odyssey 2 Pro lets developers seed a world with an image or text, stream back evolving video, and interact with or manipulate the scene in real time.
INSIGHT

Video Volume Drives World Model Training

  • Training data for general world models is dominated by large-scale public video because the volume of available video far exceeds that of other modalities.
  • Jeff Hawke notes that the trillions of visual observations in internet video impose different constraints than language data does, making data selection the core challenge.
INSIGHT

World Models Are In The GPT-2 Phase

  • World models are early and exploratory; Jeff Hawke compares the current period to the GPT-2 era of LLMs.
  • He expects many experimental use cases (games, retail, live events, robotics) before mass commercialization.