

The BugBash Podcast
Antithesis
The BugBash podcast is a lively look at all aspects of software reliability, by enthusiasts, for everyone.
Each episode brings leading engineers and researchers together for deep dives on everything from formal methods to testing to observability to human factors. There’s concrete advice on best practices, and nuanced discussion of how these strategies combine to deliver software that works.
And if you’re enjoying these conversations, check out the talks from BugBash 2025 on YouTube, and join us at BugBash 2026 on April 23-24, 2026, in Washington DC!
Episodes

Mar 25, 2026 • 51min
The Dollar Bet that Fuzzed Figma: Exploding Laptops and UI Reliability with Jonathan Chan
Jonathan Chan, a former Figma engineer and creator of FuzzMap, built a coverage-guided fuzzer for React UIs. He tells the lunch-bet origin of FuzzMap and why Figma needed faster, reproducible UI testing. He explains the gnarly React instrumentation hacks, state deduplication and visualization, and ideas for extending fuzzing to networks and full-stack scenarios.
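To make the core loop concrete, here is a minimal sketch of coverage-guided fuzzing with state deduplication in the spirit of what Jonathan describes. It is not FuzzMap: the ToyApp model, its event names, and the planted bug are all invented for illustration.

    import random

    class ToyApp:
        """Hypothetical stand-in for a UI under test: a tiny state machine."""
        def __init__(self):
            self.modal_open = False
            self.items = 0

        def send(self, event):
            if event == "open_modal":
                self.modal_open = True
            elif event == "close_modal":
                self.modal_open = False
            elif event == "add_item" and not self.modal_open:
                self.items += 1
            elif event == "clear" and self.items > 2:
                self.items = -1  # planted bug: the invariant items >= 0 breaks

        def state_key(self):
            # State deduplication: collapse runs that land in the same abstract state.
            return (self.modal_open, min(self.items, 3))

    EVENTS = ["open_modal", "close_modal", "add_item", "clear"]

    def fuzz(iterations=2000, seed=0):
        rng = random.Random(seed)
        corpus = [[]]        # seed corpus: the empty interaction sequence
        seen_states = set()
        for _ in range(iterations):
            parent = rng.choice(corpus)
            candidate = parent + [rng.choice(EVENTS) for _ in range(rng.randint(1, 3))]
            app = ToyApp()
            for event in candidate:
                app.send(event)
                if app.items < 0:
                    print("bug found, reproducer:", candidate)
                    return candidate
            key = app.state_key()
            if key not in seen_states:  # coverage feedback: a new state keeps the input
                seen_states.add(key)
                corpus.append(candidate)

    if __name__ == "__main__":
        fuzz()

The reproducer the fuzzer prints is the kind of fast, replayable failing interaction sequence the episode is after.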

Mar 18, 2026 • 52min
Symmathesy and the Agentic Era: Learning Systems in 2026
Jessica Kerr, software developer and systems thinker known for symmathesy, explains learning systems made of learning parts. They explore treating software as a teammate and how observability becomes the system's language. The conversation covers AI agents joining teams, hallucination risks, shaping agent behavior with context, and redefining legacy code as what agents cannot understand.

Mar 11, 2026 • 1h 2min
From Scale to Rigor: An Engineering Journey at Meta and Oxide
A journey from massive, data-driven engineering at Meta to shipping untouchable, air-gapped hardware at Oxide. The conversation covers how technical writing became a force multiplier for complex designs, the surprising power of property-based testing and test oracles for catching bugs early, practical use of LLMs for prototyping and doc review with strong human oversight, and the real-world challenges of testing and operating a cloud-in-a-box without live patches.
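The test-oracle thread reduces to a simple pattern: check a clever implementation against an obviously correct reference. A hedged sketch, assuming a hypothetical hand-rolled sort as the code under test:

    from hypothesis import given, strategies as st

    def sort_under_test(xs):
        """Hypothetical implementation under test: a hand-rolled insertion sort."""
        out = []
        for x in xs:
            i = 0
            while i < len(out) and out[i] < x:
                i += 1
            out.insert(i, x)
        return out

    @given(st.lists(st.integers()))
    def test_matches_oracle(xs):
        # Differential oracle: the clever implementation must agree with the
        # trivially correct one (Python's built-in sorted).
        assert sort_under_test(xs) == sorted(xs)

    if __name__ == "__main__":
        test_matches_oracle()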

Mar 4, 2026 • 58min
Escaping the Spaghetti: How to Test Untestable Codebases
Lewis Campbell, consultant at OutData who specializes in testing and reliability for legacy SaaS, explains practical ways to tame messy code. He covers adding static typing and schemas, extracting deterministic islands from tangled apps, moving business logic out of React, and using property-based and simulation-style randomized tests to reveal bugs. He also talks about political tactics for surgical refactors and team buy-in.
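The "deterministic islands" idea is easiest to see in miniature: pull a rule out of the tangled handler so it becomes a pure function, then let randomized inputs check its invariants. The discount helper below is invented for illustration, not taken from the episode.

    from hypothesis import given, strategies as st

    # A pure "island" extracted from a hypothetical tangled request handler:
    # no database, no clock, no network -- just inputs to outputs.
    def apply_discount(subtotal_cents: int, percent_off: int) -> int:
        percent_off = max(0, min(percent_off, 100))
        return subtotal_cents - (subtotal_cents * percent_off) // 100

    @given(st.integers(min_value=0, max_value=10**9),
           st.integers(min_value=-50, max_value=150))
    def test_discount_invariants(subtotal, percent):
        total = apply_discount(subtotal, percent)
        # The total is never negative and never exceeds the original subtotal.
        assert 0 <= total <= subtotal

    if __name__ == "__main__":
        test_discount_invariants()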

Feb 25, 2026 • 52min
How rr Became a Protected Species: A Story of Necessary Hacks
Robert O'Callahan, creator of the rr record-and-replay debugger and co-founder of Pernosco, is a systems engineer who made time-travel debugging practical. He recounts building rr with pragmatic hacks, patching syscalls and courting the kernel community to keep rr safe. Topics include why browsers needed record-and-replay, Pernosco's data-flow visualizations, and the tradeoff of practical hacking over academic purity.

Feb 18, 2026 • 1h 20min
Re-Designing Data-Intensive Applications: The Shift to Cloud-Native Storage
Chris Riccomini, an engineer who builds distributed systems and startups, and Martin Kleppmann, a researcher and author on distributed systems, discuss rebuilding databases on cloud-native object storage. They explore how S3-like stores change system assumptions. They revisit CAP and propose offline availability. They debate using formal methods, model checking, and LLMs as test oracles for migrations.

Dec 10, 2025 • 1h 19min
Hypothesis vs. Hallucinations: Property Testing AI-Generated Code
David R. MacIver, creator of Hypothesis and property-based testing researcher, discusses making AI-generated code trustworthy. He explains why standard benchmarks mislead, how property-based testing uncovers hidden failures, and the challenges of getting developers to think in invariants. The conversation covers LLMs writing tests, Hypothesis' UX for onboarding, shrinking failing cases, and workflows for continuous fuzzing and regression protection.
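The "thinking in invariants" point is easiest to see in miniature: instead of hand-picking examples for a generated helper, state what must always hold and let Hypothesis search for counterexamples. The dedupe function below is a hypothetical stand-in for AI-generated code, not something from the episode.

    from hypothesis import given, strategies as st

    def dedupe(xs):
        """Hypothetical AI-generated helper: drop duplicates, keep first occurrences."""
        return list(dict.fromkeys(xs))

    @given(st.lists(st.integers()))
    def test_dedupe_invariants(xs):
        out = dedupe(xs)
        assert len(out) == len(set(out))         # no duplicates remain
        assert set(out) == set(xs)               # nothing lost, nothing invented
        assert out == sorted(out, key=xs.index)  # first-occurrence order preserved

    if __name__ == "__main__":
        test_dedupe_invariants()

When an assertion fails, Hypothesis shrinks the input to a minimal counterexample, the same shrinking workflow the episode touches on.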

Nov 26, 2025 • 40min
From the Lab to Production: Making Cutting-Edge Testing Practical
Software testing research is exploding, but in practice, most companies' testing approaches seem stuck in the past. Where does that gap come from? It often boils down to the distance between academic promises and the practical needs of developers who need usable tools and fast results. In this episode, David talks with Rohan Padhye, head of the PASTA research group at Carnegie Mellon University, who has lived on both sides of that divide. They explore how fuzz testing crossed that chasm, from industry curiosity to academic focus and back again, and what it will take for other techniques to do the same. Rohan shares insights on designing testable software, building a robust testing culture, and what truly makes a "good" property for finding bugs.
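One common answer to what makes a good bug-finding property is a round-trip: decoding what you encoded must give back the original. A small hedged sketch; the toy codec below is invented here, not from PASTA or the episode.

    from hypothesis import given, strategies as st

    def encode(fields):
        """Toy serializer: join fields with commas."""
        return ",".join(fields)

    def decode(s):
        """Toy deserializer."""
        return s.split(",") if s else []

    # Restricting the alphabet keeps the toy codec honest; widening it to allow
    # commas or empty fields is exactly how a round-trip property starts finding bugs.
    @given(st.lists(st.text(alphabet="abcxyz", min_size=1)))
    def test_round_trip(fields):
        assert decode(encode(fields)) == fields

    if __name__ == "__main__":
        test_round_trip()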

Nov 12, 2025 • 40min
Ergonomics, reliability, durability
Integrating non-deterministic, non-durable elements like AI agents into our workflows tends to lead to a lot of do-overs. But restarting AI processes can be costly, burning through tokens and losing valuable progress. Wouldn't it be easier if there were always a clear checkpoint to restart a task from? Today I talk with Qian Li, co-founder of the DBOS durable execution engine, about reliability, ergonomics, and actually understanding your software. We discuss the long history of checkpointing, mental models, and how durable execution lets systems resume right where they left off after a crash, making software resilient by default. Learn how this architectural pattern can improve an AI-assisted system, or any complex system that could use a little improvement in how developers work with it.
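The checkpoint-and-resume idea can be shown without any particular engine. This is a toy illustration only, using a JSON file as the checkpoint store, and is not the DBOS API: each completed step records its result, so re-running the script after a crash skips straight to the first unfinished step.

    import json, os

    CHECKPOINT_FILE = "workflow_state.json"  # hypothetical checkpoint store

    def load_checkpoints():
        if os.path.exists(CHECKPOINT_FILE):
            with open(CHECKPOINT_FILE) as f:
                return json.load(f)
        return {}

    def save_checkpoint(state, step, result):
        state[step] = result
        with open(CHECKPOINT_FILE, "w") as f:
            json.dump(state, f)

    def durable_step(state, name, fn):
        """Run fn once; on re-runs after a crash, return the recorded result instead."""
        if name in state:
            return state[name]  # finished before the crash: skip the work
        result = fn()
        save_checkpoint(state, name, result)
        return result

    def workflow():
        state = load_checkpoints()
        doc = durable_step(state, "fetch", lambda: "raw document")
        summary = durable_step(state, "summarize", lambda: "summary of " + doc)
        durable_step(state, "notify", lambda: "sent: " + summary)
        return state

    if __name__ == "__main__":
        # Running this script again after a crash resumes from the last saved step.
        print(workflow())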

Oct 30, 2025 • 55min
No actually, you can property test your UI
How do you test for bugs that only appear when a user clicks frantically, or when asynchronous data loads in an unexpected order? Standard UI tests often miss the subtle stuff that happens all the time in stateful, dynamic applications. In this episode, Paul Ryan and I sit down with Oskar Wickström, creator of the Quickstrom framework, among other things, to explore how to apply generative testing to the complex world of user interfaces. Oskar argues that you don't need to be a formal methods genius to get real value out of the approach. Even simple properties can uncover deep bugs, like ensuring a loading spinner eventually disappears or that the screen never goes blank. If you've been intrigued by property-based testing but intimidated by the thought of writing complex formal models for UIs, stick around.
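Oskar's examples translate into checks you can run today without special tooling: drive a model of the screen with random event sequences and assert that whenever the spinner is gone, something is on screen. A hedged sketch against a made-up UI model, not Quickstrom syntax:

    from hypothesis import given, strategies as st

    class FakeUI:
        """Hypothetical model of a screen that loads data asynchronously."""
        def __init__(self):
            self.loading = True
            self.content = None

        def send(self, event):
            if event == "data_arrived":
                self.content = "items"
                self.loading = False
            elif event == "refresh":
                self.loading = True
            elif event == "error":
                self.content = "error message"
                self.loading = False

    @given(st.lists(st.sampled_from(["data_arrived", "refresh", "error"])))
    def test_screen_never_blank_after_load(events):
        ui = FakeUI()
        for event in events:
            ui.send(event)
            # Safety property: whenever the spinner is gone, something is on screen.
            if not ui.loading:
                assert ui.content is not None

    if __name__ == "__main__":
        test_screen_never_blank_after_load()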


