Data Engineering Podcast

Tobias Macey
19 snips
Mar 29, 2026 • 50min

Treat Metering Like Finance: Building Data Platforms for Consumption Economics

Himant Goyal, a Senior Product Manager who builds usage-driven data platforms, explains why metering should be treated like a financial system. He covers the architecture needed for reliable consumption tracking, then touches on real-time versus batch tradeoffs, rate-card versioning, handling late or duplicate data, and the cross-functional shift required for finance, product, and engineering to co-own metering.
9 snips
Mar 22, 2026 • 43min

Beyond the PDF: Rowan Cockett on Reproducible, Composable Science

Rowan Cockett, co-founder and CEO of CurveNote and Continuous Science Foundation, builds tools for reproducible, composable scientific research. He talks about fixing PDF-bound workflows, cloud-optimized formats like Zarr, Jupyter-based interactive articles, graceful degradation of interactives, storage partnerships that avoid hosting huge datasets, and the Open Exchange Architecture push for interoperable scientific components.
62 snips
Mar 16, 2026 • 1h 2min

Beyond Prompts: Practical Paths to Self‑Improving AI

Raj Shukla, CTO at SymphonyAI and veteran applied AI leader, discusses building production-grade self-improving AI for regulated industries. He covers agentic architectures, feedback loops, and intelligent memory as a practical middle ground. He also talks about sandboxing, policy alignment, subagent code loops, model brittleness, and how owning memory and process graphs creates enterprise differentiation.
13 snips
Mar 8, 2026 • 1h 5min

Orion at Gravity: Trustworthy AI Analysts for the Enterprise

Drew Gilson, co-founder of Gravity and former Looker/Google product leader focused on agentic analytics, and Lucas Thelosen, co-founder of Gravity and former Looker/Google analytics lead building Orion, discuss building trustworthy AI analysts using semantic layers and context engineering. They cover preserving data investments, bootstrapping semantics from messy systems, memory and lineage, and connecting insights to action.
37 snips
Mar 2, 2026 • 45min

From Models to Momentum: Uniting Architects and Engineers with ER/Studio

Ryan Hirsch, product marketing lead with a data warehousing background, and Jamie Knowles, product director and enterprise data modeling expert, discuss ER/Studio’s role in creating shared semantic models. They cover translating logical designs to code, preventing semantic drift, integrating governance, collaboration features like Team Server, and new AI-assisted modeling and semantic exports.
32 snips
Feb 22, 2026 • 58min

From Data Models to Mind Models: Designing AI Memory at Scale

Vasilije Markovich, founder of Cognee and former data engineer turned cognitive-science-informed entrepreneur, builds agentic memory and knowledge-layer systems. He discusses permanent vs session memory, graph+vector architectures, storage and latency trade-offs, metadata and decay strategies, trace-based scoring, multi-tenant isolation, and practical vertical uses like pharma, logistics, and security.
40 snips
Feb 15, 2026 • 51min

Prompt Management, Tracing, and Evals: The New Table Stakes for GenAI Ops

Aman Agarwal, creator of OpenLit and AI engineering tools builder, talks about making LLM apps reliable and debuggable. He covers opaque model behavior, runaway token costs, and brittle prompt management. He explains OpenTelemetry-native observability, prompt/secret versioning, eval workflows, and integrations that turn black-box model runs into stepwise traces for production readiness.
47 snips
Feb 8, 2026 • 47min

From Legacy to AI-Ready: How MongoDB AMP Accelerates Modernization

Shilpa Kolhar, SVP of Product and Engineering at MongoDB who built large-scale data and ML infrastructure, explains modernizing legacy relational systems to a document-first, AI-ready platform. She covers AMP, Atlas Vector Search and embeddings, schema validation and versioning patterns, incremental migration units, and balancing LLM automation with human governance.
35 snips
Feb 1, 2026 • 57min

Branches, Diffs, and SQL: How Dolt Powers Agentic Workflows

Tim Sehn, founder and CEO of DoltHub and creator of Dolt — a version-controlled SQL database — explains why Git-style semantics belong in data systems. He covers row-level branching, merging, and diffs, real production use cases like reproducible ML feature stores and game config, and how branches enable safe agentic writes and PR-style data reviews.
81 snips
Jan 25, 2026 • 41min

Logical First, Physical Second: A Pragmatic Path to Trusted Data

Jamie Knowles, Product Director for ER/Studio with decades in data modeling and architecture, explains why meaning should drive designs. He talks about building shared semantic models, avoiding schema sprawl, and evolving architecture alongside delivery. He also covers governance, practical modeling techniques, and the double-edged role of generative AI in drafting models without human-approved ontologies.
