Weaviate Podcast

Semantic Query Engines with Matthew Russo - Weaviate Podcast #131!

Nov 18, 2025
Matthew Russo, a Ph.D. student at MIT, discusses semantic query processing engines and how they could change database systems. He covers the emergence of semantic operators such as AI_WHERE and their role in processing unstructured data, how query plans can be optimized (including why the order in which filters run matters), and SemBench, a standardized benchmark for evaluating semantic queries. The conversation also looks at the future of AI in databases and at practical optimization strategies.
INSIGHT

Logical Plans Let Optimizers Reorder Work

  • Logical plans separate query intent from physical execution, enabling optimizers to reorder semantic operators.
  • Physical plans assign specific models and implementations to each operator to meet cost/latency/quality goals.
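
As a rough illustration of the two points above, the sketch below keeps a logical plan (a list of semantic filters) separate from the physical choice of model. Everything in it is an assumption made for illustration, not something described in the episode: the `SemanticFilter` and `Model` classes, the selectivity and cost estimates, the model names, and the rank heuristic (ascending cost divided by the fraction of rows dropped, which assumes the filters are independent).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticFilter:
    """Logical operator: says what to filter on, not how to run it."""
    predicate: str
    est_selectivity: float   # assumed fraction of rows that pass (0..1)
    est_cost_per_row: float  # assumed cost of one model call


@dataclass(frozen=True)
class Model:
    """Hypothetical model catalog entry used for physical planning."""
    name: str
    cost_per_row: float
    quality: float


def reorder_filters(filters):
    # Rank ordering for independent filters: run cheap, highly selective
    # filters first (ascending cost / fraction of rows dropped).
    def rank(f):
        dropped = max(1e-9, 1.0 - f.est_selectivity)
        return f.est_cost_per_row / dropped
    return sorted(filters, key=rank)


def expected_cost(filters, n_rows):
    # Each filter only sees the rows that survived the previous ones.
    total, rows = 0.0, float(n_rows)
    for f in filters:
        total += rows * f.est_cost_per_row
        rows *= f.est_selectivity
    return total


def pick_model(catalog, min_quality):
    # Physical planning step: cheapest model that meets the quality target.
    eligible = [m for m in catalog if m.quality >= min_quality]
    if not eligible:
        raise ValueError("no model meets the quality target")
    return min(eligible, key=lambda m: m.cost_per_row)


# Logical plan as written by the user (estimates are made up).
logical_plan = [
    SemanticFilter("review mentions battery life", 0.9, 1.0),
    SemanticFilter("review is negative", 0.1, 2.0),
]
optimized_plan = reorder_filters(logical_plan)

# Physical plan: bind the operators to a concrete (hypothetical) model.
catalog = [Model("small-model", 1.0, 0.80), Model("large-model", 10.0, 0.95)]
chosen = pick_model(catalog, min_quality=0.9)
```

With these made-up estimates, running the more selective filter first lowers the expected cost for 1,000 rows from 2,800 to 2,100 cost units, even though that filter is the more expensive one per row.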
INSIGHT

Joins Are Harder And Costlier In Multimodal Settings

  • Semantic joins are high-cost and require creative implementations (embeddings, batching, hybrid strategies).
  • Use cases for text joins are easy to identify, but compelling use cases for multimodal joins (image↔audio) are harder to find, and the joins themselves are harder to optimize.
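
One possible reading of a "hybrid strategy" is sketched below: a cheap embedding similarity check prunes candidate pairs, and an expensive pairwise judge (for example, a model call) runs only on the survivors. The function names, the 0.8 threshold, the toy two-dimensional vectors, and the string-matching stand-in for the judge are all hypothetical. A real system would use embeddings from an actual model and would likely batch the judge calls.

```python
import math


def cosine(a, b):
    # Plain cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_semantic_join(left, right, judge, threshold=0.8):
    """Join (item, embedding) pairs; call `judge` only on similar pairs.

    Returns the matched pairs and the number of judge calls made.
    """
    matches, judge_calls = [], 0
    for l_item, l_vec in left:
        for r_item, r_vec in right:
            if cosine(l_vec, r_vec) < threshold:
                continue  # pruned by the cheap embedding check
            judge_calls += 1
            if judge(l_item, r_item):
                matches.append((l_item, r_item))
    return matches, judge_calls


# Toy data: the vectors stand in for embeddings from a shared
# image/audio embedding space (an assumption for this sketch).
images = [("photo of a dog", (1.0, 0.0)), ("photo of a cat", (0.0, 1.0))]
audio = [("dog barking", (0.9, 0.1)), ("cat meowing", (0.1, 0.9))]


def stub_judge(image_caption, audio_caption):
    # Stand-in for an expensive model call that compares two items.
    return audio_caption.split()[0] in image_caption


pairs, calls = hybrid_semantic_join(images, audio, stub_judge)
```

In this toy case the prefilter cuts the judge calls from four (every pair) to two. The trade-off is recall: a true match whose embeddings fall below the threshold is never shown to the judge.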
ADVICE

Turn Joins Into Relational Matches When Feasible

  • When possible, map both sides to discrete labels and perform a relational join to reduce expensive pairwise comparisons.
  • Watch out for label mismatches (e.g., 'elephant' vs 'North African elephant') that can break exact joins.
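
The advice above can be sketched as a label-based hash join. Each side is first mapped to a discrete label (here the labels are already attached to the records, standing in for a per-item model call), and the join itself is an ordinary equality match. The record layout, the `CANONICAL` mapping, and the `canonical` helper are hypothetical; normalizing labels to a shared vocabulary is only one possible way to handle the mismatch problem and is not a method described in the episode.

```python
from collections import defaultdict


def label_join(left, right, left_label, right_label):
    # Hash join on labels: one pass to index the right side,
    # one pass to probe it, with no pairwise model comparisons.
    index = defaultdict(list)
    for r in right:
        index[right_label(r)].append(r)
    return [(l, r) for l in left for r in index.get(left_label(l), [])]


# Labels assumed to come from one model call per item (N + M calls
# instead of N * M pairwise comparisons).
images = [
    {"id": "img1", "label": "elephant"},
    {"id": "img2", "label": "zebra"},
]
clips = [
    {"id": "aud1", "label": "North African elephant"},
    {"id": "aud2", "label": "zebra"},
]

# Exact match on raw labels silently drops the elephant pair.
exact = label_join(images, clips, lambda x: x["label"], lambda x: x["label"])

# Hypothetical mitigation: map labels onto a shared, coarser vocabulary.
CANONICAL = {"north african elephant": "elephant"}


def canonical(record):
    label = record["label"].lower()
    return CANONICAL.get(label, label)


normalized = label_join(images, clips, canonical, canonical)
```

Here `exact` contains only the zebra pair, while `normalized` recovers both pairs. Constraining the labeler to a fixed label set up front would be another way to avoid the mismatch.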