Book Overflow

The Ethics of Data-Intensive Applications - Designing Data-Intensive Applications by Martin Kleppmann

Feb 9, 2026
A lively wrap-up of stream processing, logs, Kafka-style replayability, and change data capture tools. The hosts cover event sourcing, idempotence, and delivery guarantees for resilient pipelines, and tackle privacy risks, bias in data-driven systems, and the ethical trade-offs engineers face. Practical architecture tips and career-facing advice round out the tech-focused discussion.
INSIGHT

Streams Are The Next Step After Batch

  • Stream processing is the natural next step after batch for continuous, unbounded data like markets or analytics.
  • Understanding streams reveals trade-offs around ordering, latency, and state that shape modern data systems.
ANECDOTE

Nightly FTP Feeds Fit Batch Jobs

  • Nathan describes a fintech batch job that ingested nightly NASDAQ and NYSE FTP payloads and made data available by morning.
  • Batch processing was simple, reliable, and suited to bounded nightly data delivery in that use case.
ADVICE

Use Append-Only Logs For Replay And Scaling

  • Use log-based message brokers (like Kafka) to append events and let consumers replay history for experiments and bug fixes.
  • Partition by a stable key (e.g., user ID) to preserve ordering while scaling processing across workers.
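The advice above can be sketched in a few lines. This is a minimal in-memory stand-in for a log-based broker, not Kafka's actual API: `partition_for`, `append_event`, and `replay` are illustrative names, and the hash-based key-to-partition mapping is one common assumption for how a stable key preserves per-key ordering while partitions scale out.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # illustrative partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Deterministically map a stable key (e.g. a user ID) to a partition."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# Stand-in for a log-based broker: one append-only list per partition.
log = defaultdict(list)


def append_event(key: str, event: dict) -> int:
    """Append an event to its key's partition; return its offset in that log."""
    p = partition_for(key)
    log[p].append((key, event))
    return len(log[p]) - 1


def replay(partition: int, from_offset: int = 0):
    """Consumers can re-read history from any offset -- the basis for replay."""
    return log[partition][from_offset:]
```

Because the mapping is deterministic, all events for one user land in the same partition and stay in append order, while different users spread across partitions; a consumer can rewind `from_offset` to re-process history for experiments or bug fixes.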