
Book Overflow: The Ethics of Data-Intensive Applications - Designing Data-Intensive Applications by Martin Kleppmann
Feb 9, 2026
A lively wrap-up of stream processing, logs, Kafka-style replayability, and change data capture tools. They cover event sourcing, idempotence, and delivery guarantees for resilient pipelines. The conversation tackles privacy risks, bias in data-driven systems, and ethical trade-offs for engineers. Practical architecture tips and career-facing advice round out the tech-focused discussion.
Streams Are The Next Step After Batch
- Stream processing is the natural next step after batch for continuous, unbounded data such as market feeds or analytics events.
- Understanding streams reveals trade-offs around ordering, latency, and state that shape modern data systems.
Nightly FTP Feeds Fit Batch Jobs
- Nathan describes a fintech batch job that ingested nightly NASDAQ and NYSE FTP payloads and made data available by morning.
- Batch processing was simple, reliable, and suited to bounded nightly data delivery in that use case.
Use Append-Only Logs For Replay And Scaling
- Use log-based message brokers (like Kafka) to append events and let consumers replay history for experiments and bug fixes.
- Partition by a stable key (e.g., user ID) to preserve ordering while scaling processing across workers.
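The two ideas in this snip can be sketched together: an append-only log partitioned by a stable key, with consumers that replay history from any offset. This is a minimal in-memory illustration, not Kafka's actual API; the class and method names are hypothetical, and a real broker would persist to disk and replicate across machines.

```python
import hashlib

class AppendOnlyLog:
    """Toy partitioned, append-only event log (illustrative sketch only)."""

    def __init__(self, num_partitions=4):
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Hash a stable key (e.g. user ID) so every event for that key
        # lands in the same partition, preserving per-key ordering.
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:4], "big") % len(self.partitions)

    def append(self, key, event):
        p = self.partition_for(key)
        self.partitions[p].append((key, event))
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def replay(self, partition, from_offset=0):
        # Consumers re-read history from any offset, e.g. to rerun an
        # experiment or reprocess events after fixing a bug.
        return self.partitions[partition][from_offset:]

log = AppendOnlyLog()
for i in range(3):
    log.append("user-42", f"click-{i}")

p = log.partition_for("user-42")
events = [e for _, e in log.replay(p)]
# All of user-42's events sit in one partition, in append order.
```

Because appends only ever grow a partition and replays are pure reads, adding a new consumer (or re-running an old one from offset 0) never disturbs existing processing, which is what makes log-based brokers suited to experiments and bug-fix reprocessing.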
