
Data Engineering Podcast Advanced Lakehouse Management With The LakeKeeper Iceberg REST Catalog
Apr 21, 2025
Victor Kessler, co-founder of Vakama and developer of Lakekeeper, dives into the world of advanced lakehouse management with a focus on Apache Iceberg. He discusses the pivotal role of metadata in data actionability and the evolution of data catalogs. Victor highlights innovative features of Lakekeeper, like integration with OpenFGA for access control and its deployment using Rust on Kubernetes. He also addresses the challenges of migrating data catalogs and the importance of community involvement in open-source projects for better data management.
Catalogs in Multi-Cloud Ecosystems
- Unity Catalog (Databricks) and Polaris (Snowflake) are managed catalog services that work best inside their own vendors' ecosystems.
- Enterprises with multi-cloud and hybrid-cloud setups need vendor-agnostic catalogs that focus on metadata, not just on compute.
Centralize Authorization Controls
- Centralize authorization management separate from compute and storage to avoid inconsistent data access.
- Use an identity provider (IdP) and integrate authorization with catalogs and compute engines via tools like OpenFGA and Open Policy Agent.
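As a rough illustration of the centralization idea above, the sketch below keeps all grants in one place and has both a catalog and a compute engine ask the same authorizer before serving a request. Everything here is hypothetical: `CentralAuthorizer`, `catalog_load_table`, `engine_run_query`, and the `can_read` relation are made-up names, and the in-memory tuple store only loosely mimics the user/relation/object style of relationship-based systems. It does not use the OpenFGA or Open Policy Agent APIs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RelationTuple:
    """One grant: `user` has `relation` on `obj` (hypothetical model)."""
    user: str
    relation: str
    obj: str


@dataclass
class CentralAuthorizer:
    """Single source of truth for grants, shared by every component."""
    tuples: set = field(default_factory=set)

    def grant(self, user: str, relation: str, obj: str) -> None:
        self.tuples.add(RelationTuple(user, relation, obj))

    def check(self, user: str, relation: str, obj: str) -> bool:
        return RelationTuple(user, relation, obj) in self.tuples


def catalog_load_table(authz: CentralAuthorizer, user: str, table: str) -> str:
    # The catalog asks the central authorizer instead of keeping its own rules.
    if not authz.check(user, "can_read", table):
        raise PermissionError(f"{user} may not read {table}")
    return f"metadata for {table}"


def engine_run_query(authz: CentralAuthorizer, user: str, table: str) -> str:
    # The compute engine consults the same authorizer, so decisions stay consistent.
    if not authz.check(user, "can_read", table):
        raise PermissionError(f"{user} may not read {table}")
    return f"rows from {table}"


authz = CentralAuthorizer()
authz.grant("user:alice", "can_read", "table:sales.orders")
```

Because both entry points call the same `check`, a grant or revocation made once takes effect everywhere, which is the consistency benefit the snip describes. A real deployment would replace the in-memory set with a call to an authorization service.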
Lakekeeper's Modular Rust Design
- Lakekeeper is built in Rust for performance and stability, since it serves as a critical piece of infrastructure.
- Its design is modular, with PostgreSQL for storage and Kubernetes support for flexible deployment options.
