Data Engineering Podcast

Advanced Lakehouse Management With The LakeKeeper Iceberg REST Catalog

Apr 21, 2025
Viktor Kessler, co-founder of Vakamo and developer of Lakekeeper, discusses advanced lakehouse management with a focus on Apache Iceberg. He covers the role of metadata in making data actionable and the evolution of data catalogs. Viktor highlights Lakekeeper features such as its integration with OpenFGA for access control, its Rust implementation, and its deployment on Kubernetes. He also addresses the challenges of migrating between data catalogs and the importance of community involvement in open-source projects for better data management.
INSIGHT

Catalogs in Multi-Cloud Ecosystems

  • Unity Catalog (Databricks) and Polaris (Snowflake) are offered as managed catalog services that work best inside their vendors' ecosystems.
  • Enterprises running multi-cloud or hybrid-cloud environments need vendor-agnostic catalogs that focus on metadata, not just compute.
ADVICE

Centralize Authorization Controls

  • Manage authorization centrally, separate from compute and storage, to avoid inconsistent data access.
  • Use an identity provider (IdP) and integrate authorization with catalogs and compute engines via tools like OpenFGA and Open Policy Agent.
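
The advice above can be illustrated with a minimal sketch in Python. It assumes a relationship-tuple model in the spirit of what OpenFGA uses, but it is not the OpenFGA API: `CentralAuthorizer`, `RelationTuple`, the `select` relation, and the user and table names are all hypothetical and exist only for illustration. The point is that the catalog and the compute engine ask the same store, so they cannot disagree about who may read a table.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RelationTuple:
    """One authorization fact: `user` has `relation` on `obj`."""
    user: str
    relation: str
    obj: str


@dataclass
class CentralAuthorizer:
    """Hypothetical single source of truth for access decisions."""
    tuples: set = field(default_factory=set)

    def grant(self, user: str, relation: str, obj: str) -> None:
        # Record the fact once, in one place.
        self.tuples.add(RelationTuple(user, relation, obj))

    def check(self, user: str, relation: str, obj: str) -> bool:
        # Allowed only if the exact fact was granted.
        return RelationTuple(user, relation, obj) in self.tuples


# One shared authorizer, kept outside the catalog and the engine.
authz = CentralAuthorizer()
authz.grant("user:anna", "select", "table:sales.orders")


def catalog_can_load_table(user: str, table: str) -> bool:
    """Stand-in for the catalog deciding whether to serve table metadata."""
    return authz.check(user, "select", table)


def engine_can_scan(user: str, table: str) -> bool:
    """Stand-in for a compute engine deciding whether to run a scan."""
    return authz.check(user, "select", table)
```

Because both functions delegate to the same `authz` instance, granting or revoking a permission in one place changes the answer everywhere. A real deployment would replace the in-memory set with calls to an authorization service and take the user identity from the IdP's token.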
ADVICE

Lakekeeper's Modular Rust Design

  • Lakekeeper is written in Rust for performance and stability, since a catalog is a critical piece of infrastructure.
  • It is designed to be modular, with PostgreSQL as its storage backend and Kubernetes support for flexible deployment.