How AI Is Built

#029 Search Systems at Scale, Avoiding Local Maxima and Other Engineering Lessons

Oct 31, 2024
Stuart Cam and Russ Cam, seasoned search infrastructure experts from Elastic and Canva, dive into the complexities of modern search systems. They discuss combining traditional text search with vector capabilities to improve result quality. The conversation emphasizes systematic relevancy testing and avoiding local maxima traps, where tuning for one query degrades others. They also explore the trade-offs among performance, cost, and indexing strategies, with practical advice on architecting search pipelines.
ADVICE

Use A Golden Query Set

  • Create a golden query set covering head, mid, and tail queries with human-judged expected results.
  • Automate offline evaluations (NDCG, reciprocal rank) before any production rollout.
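As a rough illustration of the offline evaluation step, the sketch below scores a ranker against a small golden query set using NDCG and reciprocal rank. The query strings, document IDs, judgment grades, and the `search_fn` callable are all hypothetical placeholders, not anything described in the episode, and the NDCG here uses linear gains (some implementations use an exponential gain instead).

```python
import math

def dcg(gains):
    # Discounted cumulative gain with linear gains and a log2 position discount.
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg_at_k(ranked_ids, judgments, k=10):
    # judgments maps doc id -> human-judged grade (0 = irrelevant).
    gains = [judgments.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal = sorted(judgments.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

def reciprocal_rank(ranked_ids, judgments):
    # 1 / rank of the first relevant result, or 0.0 if none is returned.
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if judgments.get(doc_id, 0) > 0:
            return 1.0 / rank
    return 0.0

# Hypothetical golden set: one head, one mid, and one tail query.
GOLDEN_SET = {
    "shoes": {"d1": 3, "d2": 1},
    "red running shoes": {"d7": 3, "d1": 2},
    "size 14 wide trail shoes waterproof": {"d42": 3},
}

def evaluate(search_fn, golden_set, k=10):
    # search_fn is an assumed callable: query string -> ranked list of doc ids.
    report = {}
    for query, judgments in golden_set.items():
        ranked = search_fn(query)
        report[query] = {
            "ndcg": ndcg_at_k(ranked, judgments, k),
            "rr": reciprocal_rank(ranked, judgments),
        }
    return report
```

Keeping the scores per query, rather than only a single average, makes it easier to spot the local-maximum case where a change lifts one query while quietly hurting others.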
ADVICE

Validate With A/B And Interleaving

  • Validate offline improvements with A/B testing or interleaving to measure real user impact.
  • Use interleaving when tail-query traffic is too low for standard A/B tests.
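To make the interleaving idea concrete, here is a minimal sketch of team-draft interleaving, one common interleaving scheme; the episode does not say which variant the speakers use, so treat this as an assumption. Two rankers take turns contributing their best not-yet-shown result to a single list, and clicks are credited to whichever ranker contributed the clicked document. The function names and document IDs are illustrative only.

```python
import random

def team_draft_interleave(ranking_a, ranking_b, k=10, rng=None):
    # Returns (interleaved list, mapping doc id -> contributing team "A"/"B").
    rng = rng or random.Random()
    rankings = {"A": ranking_a, "B": ranking_b}
    interleaved, team_of = [], {}
    picks = {"A": 0, "B": 0}
    while len(interleaved) < k:
        # Documents each ranker could still contribute, in its own order.
        remaining = {
            team: [d for d in docs if d not in team_of]
            for team, docs in rankings.items()
        }
        if not remaining["A"] and not remaining["B"]:
            break
        # The team with fewer picks goes next; ties are broken by coin flip.
        if picks["A"] != picks["B"]:
            team = "A" if picks["A"] < picks["B"] else "B"
        else:
            team = rng.choice(["A", "B"])
        # If that team has nothing left, the other team picks instead.
        if not remaining[team]:
            team = "B" if team == "A" else "A"
        doc = remaining[team][0]
        interleaved.append(doc)
        team_of[doc] = team
        picks[team] += 1
    return interleaved, team_of

def credit_clicks(clicked_ids, team_of):
    # Count clicks per team; clicks on unknown documents are ignored.
    wins = {"A": 0, "B": 0}
    for doc in clicked_ids:
        if doc in team_of:
            wins[team_of[doc]] += 1
    return wins
```

Because every user sees results from both rankers in the same list, each impression yields a direct preference signal, which is why interleaving can reach a conclusion with less traffic than splitting users into separate A/B buckets.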
ADVICE

Test On Production Data Or Mirrors

  • Run evaluations against production data or a real-time prod mirror to avoid misleading results.
  • Offload heavy profiling or explain calls to a mirror to prevent production strain.