Stefano Soatto, VP of AI at AWS, Professor at UCLA, and the person responsible for agentic AI at AWS, joins us to explain why building reliable AI agents is fundamentally a control-theory problem.
Stefano sees LLMs as stochastic dynamical systems that need to be controlled, not just prompted. He introduces "strands coding," a new framework AWS is building that sits between vibe coding and spec coding: you write a skeleton with AI functions constrained by pre- and post-conditions, verifying intent before a single line of code is generated. The surprising part: even as AI coding adoption rises, developer trust in the output is falling.
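The pre-/post-condition idea can be sketched in a few lines. This is a minimal design-by-contract illustration of the concept, not the actual AWS strands API: the `contract` decorator and the stand-in function body are hypothetical names invented for this example.

```python
# Sketch of the contract idea behind "strands coding": the developer writes
# the skeleton and the contract; the body could be AI-generated, and its
# output is verified against the stated intent before it is trusted.
# The decorator name and the stubbed body are illustrative, not AWS's API.

def contract(pre, post):
    """Wrap a function so inputs and outputs are checked at call time."""
    def decorate(fn):
        def wrapped(*args, **kwargs):
            assert pre(*args, **kwargs), "precondition violated"
            result = fn(*args, **kwargs)
            assert post(result, *args, **kwargs), "postcondition violated"
            return result
        return wrapped
    return decorate

# Skeleton: intent is declared up front; the implementation is replaceable.
@contract(
    pre=lambda xs: all(isinstance(x, (int, float)) for x in xs),
    post=lambda out, xs: sorted(out) == out and len(out) == len(xs),
)
def sort_numbers(xs):
    return sorted(xs)  # stand-in for a generated implementation

print(sort_numbers([3, 1, 2]))  # → [1, 2, 3]
```

The point of the pattern: a generated body that violates the contract fails loudly at the boundary instead of silently shipping, which is the failure mode attributed to pure spec coding above.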
We go deep into the philosophy of models and the world. Stefano argues that the dichotomy between "language models" and "world models" doesn't really hold: a reasoning engine trained on rich enough data is a world model. He walks us through why naive realism is indefensible, how reverse diffusion was originally intended to show that models can't be identical to reality, and why that matters now.
We also discuss three types of information (Shannon, algorithmic, and conceptual) and why algorithmic information is the one that actually matters to agents. Synthetic data doesn't add Shannon information, but it adds algorithmic information, which is why it works. Intelligence isn't about scaling to Solomonoff's universal induction; it's about learning to solve new problems fast.
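The Shannon-information claim can be made concrete with a toy calculation: a synthetic sample produced deterministically from existing data carries zero new Shannon information about that data (H(Y|X) = 0), even if the rewritten form is more useful to a learner. The setup below is an illustrative sketch, not an example from the episode.

```python
# Toy check that a deterministic "augmentation" adds no Shannon information:
# if Y = f(X), then the conditional entropy H(Y|X) is exactly zero.

import math
from collections import Counter

def conditional_entropy(pairs):
    """H(Y|X) in bits for an empirical distribution over (x, y) pairs."""
    joint = Counter(pairs)                 # counts of each (x, y) pair
    marg_x = Counter(x for x, _ in pairs)  # counts of each x alone
    n = len(pairs)
    h = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n                 # empirical P(x, y)
        p_y_given_x = c / marg_x[x]  # empirical P(y | x)
        h -= p_xy * math.log2(p_y_given_x)
    return h

data = [0, 1, 2, 3, 2, 1, 0, 3]
synthetic = [x * x for x in data]   # deterministic function of the data
print(conditional_entropy(list(zip(data, synthetic))))  # → 0.0
```

Since every `y` is a function of its `x`, P(y|x) = 1 everywhere and the sum vanishes; whatever makes synthetic data useful must be something other than Shannon information, which is the episode's argument for algorithmic information.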
Takeaways:
- Vibe coding is local feedback control with high cognitive load; spec coding is open-loop global control with silent failures; neither scales well alone.
- Trust in AI-generated code is declining even as adoption rises.
- The distinction between next-token prediction and world model is mostly nomenclature - reasoning engines operating on multimodal data are world models.
- Algorithmic information, not Shannon information, is what matters in the agentic setting.
- Intelligence isn't minimizing inference uncertainty - it's minimizing time to solve unforeseen tasks.
- The intent gap between user and model cannot be fully automated or delegated.
Timeline
(00:13) Introduction and guest welcome
(01:12) How the agentic era changed machine learning
(06:11) Vibe coding one year later
(07:23) Vibe vs. spec vs. strands coding
(14:30) Why English is not a programming language
(16:36) Constrained generation and agent choreography
(20:44) Diffusion models vs. autoregressive models
(25:59) The platonic representation hypothesis and naive realism
(31:14) Synthetic data and the information bottleneck
(36:22) Three types of information: Shannon, algorithmic, conceptual
(38:47) Scaling laws and Solomonoff induction
(42:14) World models and the Goethean vs. Marrian approach
(49:00) Encoding vs. generation and JEPA-style training
(55:50) Are language models already world models?
(59:13) Closing thoughts on trust, education, and responsibility
Music:
- "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.
- "Palms Down" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0. Changes: trimmed
About
The Information Bottleneck is hosted by Ravid Shwartz-Ziv and Allen Roush, featuring in-depth conversations with leading AI researchers about the ideas shaping the future of machine learning.