Weaviate Podcast

Weaviate

Join Connor Shorten as he interviews machine learning experts and explores Weaviate use cases from users and customers.

Episodes

Mar 23, 2026 • 1h 21min

Multi-Vector Search with Amélie Chatelain and Antoine Chaffin - Weaviate Podcast #134!

Antoine Chaffin, an ML researcher at LightOn working on ColBERT and efficient multi-vector search, and Amélie Chatelain, a LightOn engineer focused on multi-vector models and PyLate, join the podcast. They dive into the trade-offs between late-interaction and single-vector retrieval, and cover the code-focused ColGrep, reasoning-intensive retrieval, multimodal search, and scaling approaches like PLAID and MuVERA.
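To make the late-interaction vs single-vector contrast concrete, here is a minimal sketch of ColBERT-style MaxSim scoring next to a pooled single-vector score. The toy 2-dimensional token vectors and the mean-pooling step are assumptions for illustration only; this is not PyLate's API and not how any particular model builds its embeddings.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_pool(tokens: List[Vector]) -> Vector:
    """Collapse token vectors into one vector (assumed pooling strategy)."""
    return [sum(col) / len(tokens) for col in zip(*tokens)]


def single_vector_score(query_tokens: List[Vector], doc_tokens: List[Vector]) -> float:
    """Score with one vector per query and one per document."""
    return dot(mean_pool(query_tokens), mean_pool(doc_tokens))


def maxsim_score(query_tokens: List[Vector], doc_tokens: List[Vector]) -> float:
    """Late interaction: each query token keeps its best-matching document
    token, and the per-token maxima are summed."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Hypothetical toy data: two query tokens pointing in different directions.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # covers both query tokens
doc_b = [[1.0, 0.0], [1.0, 0.0]]  # covers only the first query token

if __name__ == "__main__":
    print("pooled:", single_vector_score(query, doc_a), single_vector_score(query, doc_b))
    print("maxsim:", maxsim_score(query, doc_a), maxsim_score(query, doc_b))
```

In this toy case the pooled scores tie while MaxSim prefers the document that matches both query tokens. The cost is storing one vector per token, which is the scaling pressure that approaches like PLAID and MuVERA target.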

Mar 1, 2026 • 53min

AI-Powered Search with Doug Turnbull and Trey Grainger [#133]

Doug Turnbull and Trey Grainger join the Weaviate Podcast to discuss all things AI-Powered Search! The conversation kicks off with designing search experiences, because not all search queries are the same. Sometimes the user knows exactly what they want (a product ID, a specific file), other times they're exploring a broad category, and other times they need to compare and contrast options. AI is now making it possible to dynamically construct UIs around search results, moving toward what Trey describes as a "Minority Report"-style future where visualizations adapt on the fly to the query and the data.

From there, the discussion dives into query understanding and domain modeling. Doug and Trey break down how LLMs can classify queries against existing taxonomies (like NAICS codes or Google's product taxonomy), while Trey explains a multi-tier RAG approach that uses the index itself as grounding for query interpretation before executing the final retrieval. The conversation moves into agentic search, exploring whether iterative LLM-driven search loops reduce the need for ever-better embedding models, or whether simple tools like BM25 and grep are sufficient when paired with strong reasoning.

Trey introduces wormhole vectors, a technique for traversing between sparse (lexical) and dense (semantic) vector spaces by treating query results as document sets with shared meaning, enabling exploration across vector spaces rather than treating them as orthogonal. The discussion also covers reflected intelligence, the idea of making search systems self-learning by mining user behavioral signals (clicks, purchases, skipped results) to continuously improve relevance through techniques like signals boosting, collaborative filtering, and learning to rank.

The episode wraps with a conversation about how coding agents are changing the way Doug and Trey work, and Trey's philosophy of designing intentional agentic workflows with atomic agents rather than just handing an LLM a bag of tools.

AI Powered Search (Discount Code = "weaviate"): https://aipoweredsearch.com/live-course?promoCode=weaviate

Dec 8, 2025 • 1h 1min

Pyversity with Thomas van Dongen - Weaviate Podcast #132!

Thomas van Dongen is the head of AI engineering at Springer Nature and the creator of Pyversity! Pyversity is a fast, lightweight open-source Python library for diversifying retrieval results. Retrieval systems often return highly similar items. Pyversity efficiently re-ranks these results to encourage diversity, surfacing items that remain relevant but less redundant. It implements several popular diversification strategies such as MMR, MSD, DPP, and Cover with a clear, unified API.
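As a rough illustration of what diversification re-ranking does, below is a minimal sketch of MMR (Maximal Marginal Relevance), one of the strategies named above. It is written from the general definition of MMR and does not use or mirror Pyversity's API; the function names, the `lam` parameter, and the toy vectors are all assumptions.

```python
import math
from typing import List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    num = sum(x * y for x, y in zip(a, b))
    return num / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr(query: Vector, docs: List[Vector], k: int, lam: float = 0.5) -> List[int]:
    """Greedy MMR: repeatedly pick the document that balances relevance to the
    query against similarity to documents already selected.
    lam=1.0 is pure relevance; lower values push harder for diversity."""
    selected: List[int] = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:
        def score(i: int) -> float:
            relevance = cosine(query, docs[i])
            redundancy = max((cosine(docs[i], docs[j]) for j in selected), default=0.0)
            return lam * relevance - (1.0 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical toy data: documents 0 and 1 are near-duplicates, 2 is different.
query = [1.0, 0.3]
docs = [[1.0, 0.1], [1.0, 0.12], [0.6, 0.8]]

if __name__ == "__main__":
    print("relevance only:", mmr(query, docs, k=2, lam=1.0))
    print("with diversity:", mmr(query, docs, k=2, lam=0.5))
```

With relevance only, the top two results are the near-duplicates; with `lam=0.5`, the second slot goes to the less similar document, which is the "relevant but less redundant" behavior described above.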

Nov 18, 2025 • 1h 2min

Semantic Query Engines with Matthew Russo - Weaviate Podcast #131!

Matthew Russo, a Ph.D. student at MIT, dives into the world of semantic query processing engines and their potential to revolutionize database systems. He discusses the emergence of semantic operators like AI_WHERE and their role in transforming how we handle unstructured data. With insights on optimizing query planning and the benefits of filtering order, Matthew also introduces SemBench, a crucial standardized benchmark for evaluating semantic queries. Expect a lively exploration of the future of AI in databases and practical optimization strategies!
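The point about filtering order can be sketched with a toy pipeline: running a cheap structured filter before an expensive semantic one gives the same result with fewer expensive calls. Everything here is a stand-in. The `semantic_filter` keyword check merely simulates an LLM-backed predicate such as AI_WHERE, and the rows and helper names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Filter = Tuple[str, Callable[[Row], bool]]


def run_filters(rows: List[Row], filters: List[Filter]) -> Tuple[List[Row], Dict[str, int]]:
    """Apply filters in the given order and count how many rows each one sees."""
    calls: Dict[str, int] = {}
    for name, predicate in filters:
        calls[name] = len(rows)
        rows = [row for row in rows if predicate(row)]
    return rows, calls


def year_filter(row: Row) -> bool:
    """Cheap structured predicate."""
    return int(row["year"]) >= 2024


def semantic_filter(row: Row) -> bool:
    """Stand-in for an expensive LLM call (assumption: a keyword check)."""
    return "retrieval" in str(row["text"])


rows: List[Row] = [
    {"year": 2021, "text": "retrieval basics"},
    {"year": 2022, "text": "graph databases"},
    {"year": 2023, "text": "retrieval evaluation"},
    {"year": 2024, "text": "retrieval agents"},
    {"year": 2025, "text": "query planning"},
    {"year": 2020, "text": "vector retrieval"},
]

cheap_first, cheap_first_calls = run_filters(
    rows, [("year", year_filter), ("semantic", semantic_filter)]
)
semantic_first, semantic_first_calls = run_filters(
    rows, [("semantic", semantic_filter), ("year", year_filter)]
)

if __name__ == "__main__":
    print(cheap_first_calls, semantic_first_calls)
```

Both orders return the same row, but the semantic predicate runs on 2 rows in one plan and on all 6 in the other. A real optimizer would also have to estimate selectivity and cost rather than know them up front.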

Nov 3, 2025 • 60min

REFRAG with Xiaoqiang Lin - Weaviate Podcast #130!

Xiaoqiang Lin, a Ph.D. student at the National University of Singapore and former Meta researcher, dives into the innovative REFRAG method for enhancing retrieval-augmented generation. He explains how REFRAG improves LLM inference speeds, making Time-To-First-Token 31x faster. The discussion also covers multi-granular chunk embeddings, performance trade-offs in compression, and the exciting future of agentic AI. Listeners will learn about the balance between data and architecture for long-context capabilities and the practical compute requirements for training.

Oct 13, 2025 • 44min

Weaviate and SAS with Saurabh Mishra and Bob van Luijt - Weaviate Podcast #129!

In this conversation, Saurabh Mishra, a senior product and engineering leader at SAS, discusses the partnership between SAS and Weaviate on the SAS Retrieval Agent Manager (SAS RAM). He explores how retrieval-augmented generation is transforming enterprise AI, particularly for managing unstructured data. Saurabh highlights real-world use cases, including predictive maintenance in manufacturing, and addresses persistent challenges in data security and AI trustworthiness. He also shares insights on the evolving developer experience and the future of SAS RAM.

Sep 22, 2025 • 1h 2min

Weaviate's Query Agent with Charles Pierse - Weaviate Podcast #128!

Charles Pierse, Director of Weaviate Labs, shares insights on the GA release of the Weaviate Query Agent. He discusses the journey from beta to GA, highlighting unexpected lessons and team collaborations. The conversation dives into technical aspects, including response models, citations, and how Search Mode enhances retrieval. Charles explains how the Query Agent integrates with the Cloud Console, making it intuitive for users. He also presents a compelling case study featuring MetaBuddy's innovative use of the agent for nutrition data.

Aug 13, 2025 • 1h 2min

GEPA with Lakshya A. Agrawal - Weaviate Podcast #127!

Lakshya A. Agrawal, a Ph.D. student at U.C. Berkeley, discusses his groundbreaking work on GEPA, an innovative optimizer using Large Language Models (LLMs). He elaborates on three key innovations: Pareto-Optimal Candidate Selection, Reflective Prompt Mutation, and System-Aware Merging. Lakshya explores how these techniques enhance AI efficiency, the importance of incorporating domain knowledge, and the role of benchmarks like LangProBe. He also delves into the future of AI in scientific simulations and the advantages of merging language-based learning with traditional methods.
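For readers unfamiliar with Pareto-based selection, the sketch below shows the general idea: keep every candidate prompt that is not beaten across the board by another, instead of keeping only the best average. This is a generic Pareto-front computation under assumed per-example scores, not GEPA's actual implementation; the candidate names and numbers are hypothetical.

```python
from typing import Dict, List


def dominates(a: List[float], b: List[float]) -> bool:
    """a dominates b if it is at least as good on every example
    and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, List[float]]) -> List[str]:
    """Return the candidates that no other candidate dominates."""
    return [
        name
        for name, own in scores.items()
        if not any(dominates(other, own) for other_name, other in scores.items() if other_name != name)
    ]


def best_by_average(scores: Dict[str, List[float]]) -> str:
    """Greedy baseline: the single candidate with the highest mean score."""
    return max(scores, key=lambda name: sum(scores[name]) / len(scores[name]))


# Hypothetical per-example scores for three candidate prompts.
scores = {
    "prompt_a": [0.9, 0.2, 0.5],
    "prompt_b": [0.4, 0.8, 0.5],
    "prompt_c": [0.3, 0.1, 0.4],
}

if __name__ == "__main__":
    print("front:", pareto_front(scores))
    print("best average:", best_by_average(scores))
```

Here the greedy baseline keeps only `prompt_b`, while the Pareto front also retains `prompt_a`, which wins on the first example. Keeping such specialists around gives later mutation or merging steps more varied material to work with.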

Jul 9, 2025 • 1h 5min

Agentic Topic Modeling with Maarten Grootendorst - Weaviate Podcast #126!

Maarten Grootendorst, a psychologist turned AI engineer known for creating BERTopic, dives into the exciting world of agentic topic modeling. He discusses how large language models (LLMs) are revolutionizing the way we extract and categorize topics from complex data. The conversation highlights the challenges of traditional vs. LLM-driven methods and the critical role of human feedback. Maarten also emphasizes the importance of modularity in BERTopic, allowing for adaptive and efficient topic exploration tailored to user needs.

Jul 2, 2025 • 51min

Sufficient Context with Hailey Joren - Weaviate Podcast #125!

In this installment, Hailey Joren, a Ph.D. student at UCSD, shares her groundbreaking insights on retrieval augmented generation systems. She sheds light on the crucial difference between relevant search results and 'sufficient context' for accurate answers. With her team's innovative autorater, they tackle the future of AI, addressing how current models struggle with hallucinations. Expect discussions on fine-tuning methodologies, the role of context in AI responses, and the exciting prospects of enhancing model reliability and interpretability.
