
RoboPapers Ep#42: General Intuition
Nov 13, 2025
Discover how AI can learn from video games to create predictive world models. The team shares insights on using diffusion models for better visual detail in training agents. They explore the challenges of multi-player dynamics and the importance of high-quality action labels. The discussion includes innovations for stability and speed in model training, as well as the advantages of transferring knowledge across different games. Learn about their mission to develop general agents for complex reasoning in three-dimensional spaces.
AI Snips
Real-Time CSGO World Model Demo
- Adam trained a diffusion world model on 95 hours of CSGO gameplay and demonstrated it can run interactively in real time.
- The model ran at about 10 Hz on a single RTX 3090 GPU and produced action-steerable game frames.
Engineering Tricks To Run Diffusion Fast
- Speed was achieved by reducing denoising steps (down to three) and using a two-stage downsample-upsample pipeline for high-resolution frames.
- These engineering choices let diffusion models run fast enough for RL and interactive control.
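The two tricks above can be illustrated with a toy sketch: sample a small frame with only three denoiser calls, then upsample it to the target size. This is not the team's implementation. The Euler-style update, the noise levels in `sigma_schedule`, the constant `toy_denoiser`, and the nearest-neighbor `upsample_nearest` (standing in for whatever second stage the real pipeline uses) are all assumptions made for illustration.

```python
import random


def sigma_schedule(n_steps, sigma_max=5.0, sigma_min=0.1):
    """Geometrically spaced noise levels followed by 0 (values are assumed)."""
    if n_steps == 1:
        return [sigma_max, 0.0]
    ratio = (sigma_min / sigma_max) ** (1.0 / (n_steps - 1))
    return [sigma_max * ratio ** i for i in range(n_steps)] + [0.0]


def sample_low_res(denoiser, height, width, n_steps=3, seed=0):
    """Stage 1: few-step Euler sampling of a low-resolution frame.

    Each step costs one denoiser call, so n_steps sets the per-frame cost.
    """
    rng = random.Random(seed)
    sigmas = sigma_schedule(n_steps)
    # Start from pure noise at the highest noise level.
    x = [[rng.gauss(0.0, sigmas[0]) for _ in range(width)] for _ in range(height)]
    calls = 0
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        denoised = denoiser(x, sigma)
        calls += 1
        # Euler step from noise level sigma to sigma_next.
        x = [
            [xv + (sigma_next - sigma) * (xv - dv) / sigma for xv, dv in zip(xr, dr)]
            for xr, dr in zip(x, denoised)
        ]
    return x, calls


def upsample_nearest(frame, factor):
    """Stage 2 placeholder: nearest-neighbor upsampling by an integer factor."""
    out = []
    for row in frame:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def toy_denoiser(x, sigma):
    """Hypothetical denoiser that always predicts a flat gray frame."""
    return [[0.5 for _ in row] for row in x]


low, calls = sample_low_res(toy_denoiser, height=4, width=6, n_steps=3)
frame = upsample_nearest(low, factor=2)
print(calls, len(frame), len(frame[0]))
```

In this sketch the expensive part is the denoiser call, so cutting the step count and running those calls on a smaller frame are the two levers that reduce per-frame latency.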
Fix Drift By Collecting More Data
- Prioritize scaling data coverage to improve model stability in less-visited regions of an environment.
- Pim emphasizes that more data reduces out-of-distribution drift and stabilizes long rollouts.
