
Inference by Turing Post
Transformers Are Not the End Game | World Models, Physical AI, and AI’s Next Frontier
Apr 7, 2026
Sanja Fidler, VP of AI Research at NVIDIA and Spatial Intelligence Lab lead, studies world models, 3D spatial intelligence, and physical AI. She discusses how transformers and world models complement each other. She highlights why 3D and multimodal sensing matter for robotics and self-driving. She explores learned simulators like AlpaDreams and the hard gaps left in physical interaction and real-time simulation.
AI Snips
Transformers Complement World Models
- Transformers are a general-purpose architecture that can power language, video, or 3D world models, so the two are not mutually exclusive.
- Sanja Fidler explains that world models simulate virtual worlds (e.g., by generating camera views) and can be built on transformers or on other architectures.
Transformers Are Not The End Game
- Transformers are unlikely to be the final architecture; the field is already exploring alternatives like state space models and mixtures of experts.
- Fidler stresses that architectures will evolve to reduce compute and data needs so that smaller teams can experiment.
AlpaDreams Demo Shows Real-Time Interactive Worlds
- NVIDIA announced AlpaDreams following Cosmos, moving from slow generative video chunks to interactive, real-time simulation.
- Fidler describes a demo with a steering wheel where the model runs like a game engine and users can drive in the loop.
