
Book Overflow: The Ethics of Data-Intensive Applications - Designing Data-Intensive Applications by Martin Kleppmann
Feb 9, 2026
A lively wrap-up of stream processing, logs, Kafka-style replayability, and change data capture tools. They cover event sourcing, idempotence, and delivery guarantees for resilient pipelines. The conversation tackles privacy risks, bias in data-driven systems, and ethical trade-offs for engineers. Practical architecture tips and career-facing advice round out the tech-focused discussion.
Streams Are The Next Step After Batch
- Stream processing is the natural next step after batch for continuous, unbounded data such as market feeds or analytics events.
- Understanding streams reveals trade-offs around ordering, latency, and state that shape modern data systems.
Nightly FTP Feeds Fit Batch Jobs
- Nathan describes a fintech batch job that ingested nightly NASDAQ and NYSE FTP payloads and made data available by morning.
- Batch processing was simple, reliable, and suited to bounded nightly data delivery in that use case.
Use Append-Only Logs For Replay And Scaling
- Use log-based message brokers (like Kafka) to append events and let consumers replay history for experiments and bug fixes.
- Partition by a stable key (e.g., user ID) to preserve ordering while scaling processing across workers.
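The two ideas in this snip can be sketched together: an append-only log partitioned by a stable key, with consumers that replay history from any offset. This is a minimal in-memory illustration, not Kafka's actual API; the class and method names are hypothetical, and a real broker would persist to disk and replicate across machines.

```python
import hashlib

class AppendOnlyLog:
    """Toy partitioned, append-only event log (illustrative sketch only)."""

    def __init__(self, num_partitions=4):
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Hash a stable key (e.g. user ID) so every event for that key
        # lands in the same partition, preserving per-key ordering.
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:4], "big") % len(self.partitions)

    def append(self, key, event):
        p = self.partition_for(key)
        self.partitions[p].append((key, event))
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def replay(self, partition, from_offset=0):
        # Consumers re-read history from any offset, e.g. to rerun an
        # experiment or reprocess events after fixing a bug.
        return self.partitions[partition][from_offset:]

log = AppendOnlyLog()
for i in range(3):
    log.append("user-42", f"click-{i}")

p = log.partition_for("user-42")
events = [e for _, e in log.replay(p)]
# All of user-42's events sit in one partition, in append order.
```

Because appends only ever grow a partition and replays are pure reads, adding a new consumer (or re-running an old one from offset 0) never disturbs existing processing, which is what makes log-based brokers suited to experiments and bug-fix reprocessing.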
