
Heavy Networking HN817: Is There a Better Way to Do Software Defined Networking?
Mar 6, 2026
Alex Krenzel, a Google systems research engineer and UC Berkeley Netsys PhD student, explores Distributed SDN (DSDN) for large-scale WANs. He explains how DSDN runs controller logic on the routers themselves, floods state so every router holds a shared view, and uses source routing to avoid consensus. The conversation also covers convergence under churn, implementation tradeoffs, rollout strategies, and how DSDN could bring SDN benefits to smaller operators.
Centralized SDN Complexity Drives Failures
- Centralized SDN adds many replicated services and a separate control-plane network, which increases system complexity and failure surface area.
- Alex Krenzel found that over half of the major outages in a four-year Google dataset traced back to control-plane infrastructure complexity and interactions.
Routers Are Now Viable Compute Platforms
- Modern routers can run containers and expose standard APIs like gNMI, turning them into feasible compute platforms for control software.
- Alex leverages vendor support for embedded containers and OpenConfig APIs to run controller code directly on the routers.
Distributed Controllers Build A Shared Global View
- DSDN runs a full copy of the SDN controller on every router and floods local state, so each controller constructs a global view that resembles a richer link-state database.
- Each router floods topology, utilization, and demand so every controller can compute global TE locally.
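
To make the flood-then-compute-locally idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the DSDN implementation discussed in the episode: the names `StateAd`, `Router`, and `build_demo` are hypothetical, the flooded state is reduced to static link costs (no utilization or demand), and plain Dijkstra shortest paths stand in for real traffic engineering. Each router installs an advertisement only if its sequence number is newer, forwards it to every neighbor except the sender, and then computes a full hop list (a source route) from its own copy of the database.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class StateAd:
    """Hypothetical advertisement carrying one router's local state."""
    origin: str
    seq: int
    links: tuple  # ((neighbor_name, cost), ...)


class Router:
    """Toy router that runs its own controller over a flooded database."""

    def __init__(self, name):
        self.name = name
        self.neighbors = {}  # neighbor name -> Router
        self.lsdb = {}       # origin name -> newest StateAd seen

    def receive(self, ad, sender=None):
        """Install the ad if it is newer, then flood it onward."""
        known = self.lsdb.get(ad.origin)
        if known is not None and known.seq >= ad.seq:
            return  # stale or duplicate, so flooding stops here
        self.lsdb[ad.origin] = ad
        for peer in self.neighbors.values():
            if peer is not sender:
                peer.receive(ad, sender=self)

    def advertise(self, seq, link_costs):
        """Originate this router's own state and start flooding it."""
        links = tuple(sorted(link_costs.items()))
        self.receive(StateAd(self.name, seq, links))

    def source_route(self, dst):
        """Run Dijkstra over the local view and return the full hop list."""
        dist = {self.name: 0}
        prev = {}
        heap = [(0, self.name)]
        while heap:
            d, node = heapq.heappop(heap)
            if d > dist.get(node, float("inf")):
                continue  # outdated heap entry
            if node == dst:
                break
            ad = self.lsdb.get(node)
            for nbr, cost in (ad.links if ad else ()):
                nd = d + cost
                if nd < dist.get(nbr, float("inf")):
                    dist[nbr] = nd
                    prev[nbr] = node
                    heapq.heappush(heap, (nd, nbr))
        if dst not in dist:
            return None
        path = [dst]
        while path[-1] != self.name:
            path.append(prev[path[-1]])
        return path[::-1]


def build_demo():
    """Three-router triangle; every router floods its link costs once."""
    costs = {
        "A": {"B": 1, "C": 5},
        "B": {"A": 1, "C": 1},
        "C": {"A": 5, "B": 1},
    }
    routers = {name: Router(name) for name in costs}
    for name, links in costs.items():
        routers[name].neighbors = {m: routers[m] for m in links}
    for name, links in costs.items():
        routers[name].advertise(seq=1, link_costs=links)
    return routers
```

After `build_demo()`, every router holds the same database, so `routers["A"].source_route("C")` is computed entirely on router A with no coordination step. Because the ingress router decides the whole path and encodes it in the route, the other routers do not need to agree on it, which is the intuition behind using source routing to avoid consensus.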
