Engineering Lakehouses with Open Table Formats

Build scalable and efficient lakehouses with Apache Iceberg, Apache Hudi, and Delta Lake
Book • 2025
Engineering Lakehouses with Open Table Formats explains how open table formats (for example, Apache Iceberg) decouple storage from compute to enable interoperable, evolvable data tables for analytics and AI. It covers schema evolution, performance techniques such as compaction and partitioning, streaming versus batch patterns, catalogs, governance, lineage, and the cost-control practices needed to move AI from pilot to production.

The book compares open table formats and integration patterns with various compute engines, offering architecture guidance and real-world case studies to help practitioners choose and operate lakehouse components.

It emphasizes building governance, metadata, and lineage into lakehouse deployments from day one to meet security and compliance needs.

Practical chapters and quizzes reinforce concepts so readers can apply them to business problems and production systems.

Mentioned by

Paul Muller, to introduce the guest as the book's co-author and to frame the episode's topic of open lakehouse architecture.
Episode: Open Lakehouse Architecture: How to Scale AI to Production