
Heavybit Podcasts Ep. #10, Data Modeling Matters Most with Toby Mao
Apr 7, 2026
Toby Mao, creator of SQLGlot and co-creator of SQLMesh, led data infrastructure at Netflix and Airbnb. In this episode he argues that data modeling is the toughest challenge in engineering. He recounts building tools to handle multi-engine SQL and using AI to speed up refactors, and explains why architectural intuition and software practices matter. He also explores future UIs and LLM-assisted workflows, and offers practical advice for engineers.
AI Snips
SQL Dialects Break Portability
- SQL dialect fragmentation is a practical blocker: Toby built SQLGlot to parse and transpile between Spark, Trino, Druid and others so users can write SQL once and run it on different engines.
- He observed data scientists prefer writing SQL over Python, so a robust SQL parser/transpiler unlocked cross-engine portability at Netflix and Airbnb.
Treat Transformations Like Software
- Apply software engineering practices to data pipelines: treat transformations as code that is tested, checked, and deployed, so changes are safer and repeatable.
- Toby built SQLMesh to track state and time for incremental, idempotent transforms because full refreshes don’t scale at Netflix/Airbnb volumes.
Data Modeling Is The Core Hard Problem
- Data modeling is the hardest and most important long-term problem in data engineering and will grow in importance with AI.
- Good models let pipelines evolve with the business and provide the context AI needs to deliver value.
