
The Reasoning Show: AI SRE for Complex Systems
Apr 5, 2026
Anish Agarwal, CEO of Traversal and a Columbia professor with MIT PhD roots in causal ML and RL, discusses AI-native approaches to observability and SRE. He covers why traditional observability breaks down, how AI-generated code inflates telemetry volume, why observability should be reframed as an AI problem, how to build a production world model, and his vision of agentic search and self-driving production stacks.
Observability Shows Symptoms Not Causes
- Observability gives you eyes into a system but not the reasoning to find root causes.
- Anish Agarwal says metrics, logs, and traces produce only correlations; during incidents, humans must still work out causality across thousands of signals.
AI Code Is Exploding Telemetry Faster Than Teams Can Learn
- AI-generated code is rapidly increasing telemetry volume while team understanding and SRE headcount remain flat.
- Anish highlights a widening gap: data keeps growing while human understanding of the system shrinks and SRE capacity stays fixed, and that gap is what drives the failure mode.
Build A Production World Model To Reason At Scale
- Traversal builds a production world model plus a causal search engine to map causal relationships across observability data.
- They re-index existing logs, metrics, and traces into a representation readable by agentic systems to find root causes at scale.
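To make the idea of searching for causes rather than symptoms concrete, here is a toy sketch. It is not Traversal's method, which the episode summary does not detail. It assumes a hypothetical service dependency graph (DEPENDS_ON) and a set of services already flagged as anomalous, and it walks from the symptomatic service toward its dependencies, keeping only anomalous services whose own dependencies look healthy. All names below are invented for illustration.

```python
from collections import deque

# Hypothetical dependency graph: each service maps to the services it calls.
DEPENDS_ON = {
    "checkout": ["payments", "cart"],
    "payments": ["db"],
    "cart": ["db", "cache"],
    "db": [],
    "cache": [],
}


def root_cause_candidates(depends_on, anomalous, symptom):
    """Breadth-first walk from `symptom` through its dependencies.

    A service is a candidate if it is anomalous and none of its direct
    dependencies are, i.e. the anomaly cannot be blamed on something
    further downstream in the call chain.
    """
    seen = {symptom}
    queue = deque([symptom])
    candidates = []
    while queue:
        service = queue.popleft()
        deps = depends_on.get(service, [])
        if service in anomalous and not any(d in anomalous for d in deps):
            candidates.append(service)
        for dep in deps:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return candidates


if __name__ == "__main__":
    flagged = {"checkout", "payments", "db"}
    print(root_cause_candidates(DEPENDS_ON, flagged, "checkout"))
```

In this toy setup, checkout and payments are anomalous only because db is, so the walk reports db. A real system would have to infer both the graph and the anomaly flags from logs, metrics, and traces, which is the hard part the episode is about.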

