GOTO - The Brightest Minds in Tech

Building Planetary-Scale Data Systems with Venice • Félix GV & Olimpiu Pop

Mar 3, 2026
Félix GV, former LinkedIn engineer who built the Venice planetary-scale derived data system, explains how Venice unbundles components like Kafka and RocksDB into independent distributed systems. He covers client caching patterns, rigorous chaos engineering and load tests, trade-offs of asynchronous writes and CAP theorem in multi-region deployments, and experiments integrating DuckDB for analytics.
INSIGHT

Unbundled Architecture Makes Each Piece A Distributed System

  • Venice is built as an unbundled distributed database where each component (pub/sub, servers, control plane, clients) is its own distributed system.
  • Servers host RocksDB locally. Venice also offers an eager-cache client that embeds RocksDB inside the application process, where it acts as a follower replica for lower read latency.
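
To make the follower-replica idea concrete, here is a minimal sketch in Python. It is not Venice's client API: the `EagerCache` class, the list-based change log, and the dict standing in for an embedded RocksDB instance are all assumptions made for illustration. It only shows the general pattern of an in-process cache that replays an ordered log and then serves reads locally.

```python
from dataclasses import dataclass, field


@dataclass
class EagerCache:
    """Hypothetical in-process follower that replays a change log locally."""

    store: dict = field(default_factory=dict)  # stands in for embedded RocksDB
    offset: int = 0                            # next log position to apply

    def catch_up(self, log):
        """Apply every record not yet consumed, in log order."""
        for key, value in log[self.offset:]:
            if value is None:
                self.store.pop(key, None)      # None is treated as a delete
            else:
                self.store[key] = value
        self.offset = len(log)

    def get(self, key):
        """Serve the read from local state, with no network hop."""
        return self.store.get(key)


# Illustrative change log: (key, value) pairs in write order.
log = [("member:1", "a"), ("member:2", "b"), ("member:1", "c")]
cache = EagerCache()
cache.catch_up(log)
```

Because the cache tracks its own offset, calling `catch_up` again after new records are appended applies only the new records, which is the property a follower needs to stay close to the leader.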
ADVICE

Exercise Multi‑DC Failover With Realistic Peak Load

  • Regularly run aggressive load tests that concentrate traffic into a single data center to validate failover behavior under peak conditions.
  • The Venice team ran multi-data-center chaos tests several times a week, draining traffic to one DC during weekday morning peaks to expose weak components.
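
The arithmetic behind such a drain test can be sketched as follows. The data center names, traffic figures, and capacity numbers are made up for illustration, and `drain_to` and `overloaded` are hypothetical helpers, not part of any Venice tooling. The sketch only shows why concentrating peak traffic in one DC is a demanding test: the surviving DC must absorb the sum of all regions' load.

```python
def drain_to(per_dc_qps, target):
    """Return the load per DC after all traffic is redirected to `target`."""
    if target not in per_dc_qps:
        raise KeyError(f"unknown data center: {target}")
    total = sum(per_dc_qps.values())
    return {dc: (total if dc == target else 0) for dc in per_dc_qps}


def overloaded(load, capacity):
    """List the DCs whose load exceeds their provisioned capacity."""
    return sorted(dc for dc, qps in load.items() if qps > capacity[dc])


# Hypothetical weekday-morning peak, in queries per second.
peak = {"dc-a": 400, "dc-b": 350, "dc-c": 250}
# Hypothetical provisioned capacity per DC.
capacity = {"dc-a": 900, "dc-b": 900, "dc-c": 900}

drained = drain_to(peak, "dc-a")
at_risk = overloaded(drained, capacity)
```

With these assumed numbers, every DC is comfortable under normal routing, yet the drained DC exceeds its capacity. That gap is the kind of weakness a realistic peak-load drain is meant to surface before a real outage does.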
INSIGHT

Derived Data Systems Favor Asynchronous Ingestion

  • Venice is a derived data system: data is ingested asynchronously from pub/sub streams or batch jobs, which optimizes for very high throughput rather than immediate visibility of writes.
  • It supports mixed ingestion (batch + stream) and can orchestrate partial column refresh patterns for different latency needs.
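
The trade-off between throughput and immediate visibility can be illustrated with a small sketch. The `DerivedStore` class and its method names are assumptions for illustration and do not reflect Venice's actual interfaces. A batch load replaces the dataset, stream writes are queued and return immediately, and a partial write only touches the columns it names.

```python
from collections import deque


class DerivedStore:
    """Hypothetical store with batch loads and asynchronous stream writes."""

    def __init__(self):
        self.rows = {}          # key -> dict of column values
        self.pending = deque()  # stream writes not yet applied

    def batch_load(self, snapshot):
        """Full refresh: replace the dataset with a batch job's output."""
        self.rows = {key: dict(row) for key, row in snapshot.items()}

    def write_async(self, key, partial_row):
        """Queue a stream write and return at once; it is not readable yet."""
        self.pending.append((key, partial_row))

    def ingest(self):
        """Apply queued writes in order, merging only the columns provided."""
        applied = 0
        while self.pending:
            key, partial_row = self.pending.popleft()
            self.rows.setdefault(key, {}).update(partial_row)
            applied += 1
        return applied

    def get(self, key):
        return self.rows.get(key)


store = DerivedStore()
store.batch_load({"m1": {"title": "engineer", "score": 1}})
store.write_async("m1", {"score": 2})      # partial update of one column
before = dict(store.get("m1"))             # read before ingestion runs
applied = store.ingest()
after = dict(store.get("m1"))              # read after ingestion runs
```

The read taken before `ingest()` still returns the batch-loaded value, which is the delayed visibility the snip describes. The read taken afterwards shows the streamed column merged into the batch-loaded row.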