The Pragmatic Engineer

How AWS S3 is built

Jan 21, 2026
Mai-Lan Tomsen Bukovec, VP of Data and Analytics at AWS, leads Amazon S3, one of the largest distributed systems in the world. She discusses S3's scale, with 500 trillion objects and millions of servers, and how its simplicity and consistency have evolved. Mai-Lan shares insights on optimizing costs, building robust archival solutions like Glacier, and using formal methods to verify correctness. She also discusses how S3 is evolving with new data primitives such as tables and vectors, and what that means for the future of data management.
INSIGHT

Indexing Is Where Consistency Matters

  • S3's consistency requirements center on the indexing subsystem, which holds object metadata and is accessed by most API calls.
  • The index uses quorum replication across Availability Zones (AZs) to tolerate failures and stay available.
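
To make the quorum idea concrete, here is a minimal sketch of quorum reads and writes. It is not S3's implementation: the three replicas (one per AZ), the quorum sizes, and every class and method name below are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One hypothetical index replica, e.g. one per Availability Zone."""
    up: bool = True
    store: dict = field(default_factory=dict)  # key -> (version, value)


class QuorumIndex:
    """Toy quorum-replicated key/value index (illustrative assumption only)."""

    def __init__(self, n=3, write_quorum=2, read_quorum=2):
        # R + W > N: every read quorum overlaps every write quorum.
        assert read_quorum + write_quorum > n
        # 2W > N: every write quorum overlaps the previous write quorum,
        # so a writer always sees the latest version number.
        assert 2 * write_quorum > n
        self.replicas = [Replica() for _ in range(n)]
        self.w = write_quorum
        self.r = read_quorum

    def _live(self):
        return [rep for rep in self.replicas if rep.up]

    def put(self, key, value):
        live = self._live()
        if len(live) < self.w:
            raise RuntimeError("write quorum unavailable")
        # Next version is one past the highest version any live replica holds.
        version = 1 + max(rep.store.get(key, (0, None))[0] for rep in live)
        for rep in live:
            rep.store[key] = (version, value)
        return version

    def get(self, key):
        live = self._live()
        if len(live) < self.r:
            raise RuntimeError("read quorum unavailable")
        # Ask a read quorum and keep the answer with the highest version.
        answers = [rep.store.get(key, (0, None)) for rep in live[: self.r]]
        return max(answers, key=lambda answer: answer[0])[1]
```

With these assumed settings, losing any single replica still leaves both quorums reachable, and a replica that missed a write is outvoted on read by a peer holding the newer version.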
INSIGHT

Strong Consistency Without Price Or Latency Hits

  • S3 implemented a replicated journal plus a cache‑coherency protocol to deliver strong consistency without sacrificing availability.
  • The team validated the design with formal proofs and integrated those proofs into code check‑ins.
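
One way to picture a journal paired with a coherent cache is sketched below. The episode does not describe the actual protocol, so this is an assumption-laden toy: a single in-memory journal stands in for the replicated one, and the cache serves an entry only if its sequence number still matches the journal's latest write for that key. All names are hypothetical.

```python
class Journal:
    """Ordered log of writes (stand-in for a replicated journal; assumption)."""

    def __init__(self):
        self.entries = []  # list of (key, value), in commit order
        self.latest = {}   # key -> sequence number of its most recent write

    def append(self, key, value):
        self.entries.append((key, value))
        seq = len(self.entries)  # sequence numbers start at 1
        self.latest[key] = seq
        return seq

    def read(self, key):
        seq = self.latest.get(key, 0)
        value = self.entries[seq - 1][1] if seq else None
        return seq, value


class CoherentCache:
    """Cache that never serves an entry older than the journal's latest write."""

    def __init__(self, journal):
        self.journal = journal
        self.cached = {}  # key -> (seq, value)
        self.hits = 0

    def get(self, key):
        current_seq = self.journal.latest.get(key, 0)
        entry = self.cached.get(key)
        if entry is not None and entry[0] == current_seq:
            self.hits += 1  # cached copy is provably current
            return entry[1]
        # Cached copy is missing or stale: refetch and remember the sequence.
        seq, value = self.journal.read(key)
        self.cached[key] = (seq, value)
        return value
```

The design choice illustrated here is that a cheap freshness check on the sequence number lets reads stay fast on the common path while a write immediately invalidates stale cached copies.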
ADVICE

Integrate Formal Proofs Into Development

  • Use automated reasoning/formal methods to prove critical invariants and run proofs on every check‑in.
  • Mai-Lan argues that math at scale prevents combinatorial blind spots that you can't test for manually.
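
As a much lighter stand-in for automated reasoning, the sketch below exhaustively checks one invariant over a small model: every write quorum must share a replica with every read quorum. It is not a formal proof and not the tooling S3 uses; the function names and the idea of gating check-ins on it are assumptions for illustration.

```python
from itertools import combinations


def quorums(n, size):
    """All possible quorums of the given size over replicas 0..n-1."""
    return [set(c) for c in combinations(range(n), size)]


def overlap_invariant_holds(n, w, r):
    """True if every write quorum intersects every read quorum."""
    return all(wq & rq for wq in quorums(n, w) for rq in quorums(n, r))


def safe_configs(n):
    """Every (write, read) quorum size pair that satisfies the invariant."""
    return [
        (w, r)
        for w in range(1, n + 1)
        for r in range(1, n + 1)
        if overlap_invariant_holds(n, w, r)
    ]


if __name__ == "__main__":
    # A check like this could run on every check-in and fail the build
    # if a configuration change breaks the invariant.
    assert overlap_invariant_holds(3, 2, 2)
    assert not overlap_invariant_holds(3, 1, 2)
```

Enumeration like this only covers the small cases it visits, which is exactly the gap that proofs over all cases are meant to close.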