Durable computing: What is it and why now?


Mar 5, 2026

John Coleman is a lead consultant with expertise in distributed systems and state recovery. Brandon Cook is a principal engineer focused on operationalizing resiliency in event-driven platforms. They define durable computing and state recovery, compare durable platforms to orchestration patterns, weigh tradeoffs and lock-in, discuss testing and versioning, and explore durable agents for AI along with practical pitfalls.


INSIGHT

Durable Computing Guarantees Workflow Continuity

Durable computing lets a program recover its state and continue so long-running workflows can complete despite crashes.
John Coleman explained that platforms persist state or effects and then either replay or resume execution to guarantee completion.
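
The replay approach described above can be sketched in a few lines. This is a minimal illustration, not how any particular platform from the episode works: the `Journal` class, the `step` method, the JSON file format, and the `charge`/`ship` steps are all hypothetical names chosen for the example. Each step's result is persisted once it completes, so a rerun after a crash returns recorded results instead of repeating the side effects.

```python
import json
import os
import tempfile


class Journal:
    """Hypothetical step journal that persists each step's result to a JSON file."""

    def __init__(self, path):
        self.path = path
        self.results = {}
        if os.path.exists(path):
            with open(path) as f:
                self.results = json.load(f)

    def step(self, name, fn):
        # Replay: a step that already completed returns its recorded result.
        if name in self.results:
            return self.results[name]
        result = fn()
        self.results[name] = result
        # Write to a temp file and rename so a crash cannot leave a half-written journal.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.results, f)
        os.replace(tmp, self.path)
        return result


def run_workflow(journal, calls, crash_after=None):
    """Two-step workflow; `calls` records which side effects really executed."""

    def charge():
        calls.append("charge")
        return {"charged": 42}

    def ship():
        calls.append("ship")
        return {"tracking": "T-1"}

    payment = journal.step("charge", charge)
    if crash_after == "charge":
        raise RuntimeError("simulated crash")
    shipment = journal.step("ship", ship)
    return {**payment, **shipment}


def demo():
    path = os.path.join(tempfile.mkdtemp(), "journal.json")
    calls = []
    try:
        run_workflow(Journal(path), calls, crash_after="charge")
    except RuntimeError:
        pass  # the process "died" after the first step
    # A fresh Journal reloads persisted state, as a restarted process would.
    result = run_workflow(Journal(path), calls)
    return result, calls
```

Calling `demo()` runs the workflow, crashes it after the first step, then runs it again. The second run replays `charge` from the journal and only executes `ship`, so each side effect happens once. The sketch assumes step results are JSON-serializable and that the workflow code itself is deterministic between runs; it also leaves a gap if the crash lands between the side effect and the journal write, which is why real steps generally need to be idempotent.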

INSIGHT

Platforms Abstract Known Failure Patterns

Durable computing platforms package known distributed-systems failure patterns so teams don't reinvent retries and recovery.
John Coleman said the novelty lies in the power of the platforms, not in concepts like assured delivery or exactly-once semantics.

ANECDOTE

Assessment Led To A Central Durable Platform Team

After an assessment, Brandon helped create a platform team to centralize durable capabilities instead of each team reimplementing them.
He evaluated many platforms to democratize resiliency rather than forcing every team to build it.
