Data Engineering Podcast

From Academia to Industry: Bridging Data Engineering Challenges

Aug 26, 2025
In this episode, Professor Paul Groth of the University of Amsterdam discusses AI systems and intelligent data engineering. He traces the evolution of data provenance and lineage and explains why they matter in today's workflows. Paul also describes how large language models are changing knowledge graph construction and data integration. The conversation covers collaboration between academia and industry, human-AI collaboration, and the need for tailored data management solutions.
INSIGHT

Semantics Cause Access Control Friction

  • Semantic divergence (for example, "customer" vs. "person") causes access-control and governance headaches across teams.
  • Future systems must adapt to varied models instead of enforcing a single org-wide schema.
INSIGHT

Graphs Clarify But Don’t Perfect Identity

  • Knowledge graphs make semantics explicit, but they do not by themselves solve identity or disambiguation issues.
  • Even curated graphs like Wikidata contain unresolved identity proliferation.
INSIGHT

LLMs Lower Graph Construction Cost

  • LLMs dramatically simplify information extraction and mapping for knowledge graph construction.
  • You can build partial graphs and link to raw data instead of converting everything into the graph.
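
The partial-graph idea above can be sketched roughly as follows. This is a minimal illustration, not something shown in the episode: the `Triple` type, the `build_partial_graph` helper, the `ex:` predicates, and the sample file URI are all hypothetical names chosen for the example. Only a few fields are lifted into the graph, and each entity keeps a pointer back to its raw record so the remaining fields stay where they are.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A single subject-predicate-object statement (hypothetical minimal model)."""
    subject: str
    predicate: str
    obj: str


def build_partial_graph(rows, source_uri):
    """Lift only the type and name of each row into the graph.

    Every other field stays in the raw source; the graph keeps a link
    to the originating row instead of copying its contents.
    """
    triples = []
    for index, row in enumerate(rows):
        entity = f"ex:customer/{row['id']}"
        triples.append(Triple(entity, "rdf:type", "ex:Customer"))
        triples.append(Triple(entity, "ex:name", row["name"]))
        # Pointer back to the raw record rather than a full conversion.
        triples.append(Triple(entity, "ex:rawRecord", f"{source_uri}#row={index}"))
    return triples


# Illustrative rows; "notes" and "balance" are deliberately not converted.
rows = [
    {"id": "42", "name": "Ada", "notes": "free text", "balance": "10.5"},
    {"id": "43", "name": "Grace", "notes": "free text", "balance": "7.0"},
]
graph = build_partial_graph(rows, "file:///data/customers.csv")
```

A consumer that needs `balance` or `notes` would follow the `ex:rawRecord` link to the source file, which keeps the graph small and avoids modeling fields nobody queries yet.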