
Software Engineering Daily Vespa AI and Surpassing the Limits of Vector Search
May 12, 2026
Radu Gheorghe, a software engineer at Vespa who moved from Elasticsearch and Solr consulting into tensor-based retrieval, explains why single-vector similarity is not enough. He discusses chunking and the lossiness of embeddings, outlines multi-stage retrieval and re-ranking trade-offs, and describes how tensors with named dimensions enable richer, scalable search for multimodal and real-time systems.
AI Snips
Joining Vespa From A Consulting Background
- Radu joined Vespa out of curiosity about non-Lucene internals and different distributed trade-offs.
- His consulting background on Elasticsearch and Solr made him appreciate Vespa's generalized, scale-oriented design choices like tensors.
Make First-Stage Ranking Efficient Before Re-Ranking
- Optimize early-stage ranking efficiency so you can afford richer re-rankers later.
- Run a cheap but solid base relevance function on all documents, then apply heavier re-ranking only to the top candidates.
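The two-phase pattern the snip describes can be sketched in plain Python. This is a minimal illustration, not Vespa's actual ranking pipeline: the term-overlap first phase, the cosine second phase, the document fields, and the `rerank_depth` parameter are all stand-ins for whatever cheap and expensive rankers a real system would use.

```python
import math

def cheap_score(query_terms, doc_terms):
    # First phase: inexpensive term-overlap score, run over every document.
    return len(set(query_terms) & set(doc_terms))

def expensive_score(query_vec, doc_vec):
    # Second phase: costlier vector similarity (cosine), applied to few docs.
    dot = sum(q * d for q, d in zip(query_vec, doc_vec))
    nq = math.sqrt(sum(q * q for q in query_vec))
    nd = math.sqrt(sum(d * d for d in doc_vec))
    return dot / (nq * nd) if nq and nd else 0.0

def two_phase_rank(query_terms, query_vec, docs, rerank_depth=2):
    # Phase 1: rank all documents with the cheap score.
    ranked = sorted(docs, key=lambda d: cheap_score(query_terms, d["terms"]),
                    reverse=True)
    head, tail = ranked[:rerank_depth], ranked[rerank_depth:]
    # Phase 2: re-rank only the top candidates with the heavier scorer.
    head = sorted(head, key=lambda d: expensive_score(query_vec, d["vec"]),
                  reverse=True)
    return head + tail
```

The key design point from the snip is that `expensive_score` is never evaluated on the full corpus, only on the `rerank_depth` documents that survive the first phase.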
Tensors Enable Richer, Named Dimension Math
- Tensors generalize vectors into named, sparse, and multi-dimensional arrays that support richer math.
- Named dimensions let you store maps of vectors (e.g., patches) or user preferences and compute dot-products or MaxSim at scale.
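The map-of-vectors idea above can be sketched with a MaxSim (late-interaction) score in plain Python. The patch names, vectors, and query below are hypothetical data, and a dict stands in for a tensor with one mapped dimension ("patch") and one indexed dimension; this is only a sketch of the math, not Vespa's tensor API.

```python
def max_sim(query_tokens, doc_patches):
    # MaxSim (late interaction): for each query token, take the best
    # dot product against any document patch, then sum those maxima.
    total = 0.0
    for q in query_tokens:
        best = max(sum(qi * pi for qi, pi in zip(q, patch))
                   for patch in doc_patches.values())
        total += best
    return total

# A document as a map of named patches to vectors (hypothetical values).
doc = {
    "patch0": [0.9, 0.1, 0.0],
    "patch1": [0.0, 0.8, 0.2],
}

# Per-token query embeddings (hypothetical values).
query = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0]]
```

Because the patch dimension is named rather than positional, documents can hold different numbers of patches, and the same scheme works for maps keyed by user, image region, or any other sparse identifier.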


