Software Engineering Daily

Production-Grade AI Systems with Fred Roma

Jan 27, 2026
Fred Roma, SVP of Product and Engineering at MongoDB and a veteran in cloud and data management, discusses the complex AI stack: LLMs, embeddings, vector search, caching, and observability. He covers schema evolution in the LLM era, Voyage AI’s multimodal embeddings and rerankers, and how data platforms must adapt for production-grade AI systems.
ADVICE

Fuse Keyword And Semantic Search

  • Fuse keyword search with semantic (vector) search and tune weights to surface precise results.
  • Use rank/score fusion operators to control how exact matches (e.g., brand) are prioritized alongside semantic matches.
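The fusion idea above can be sketched in a few lines. This is a minimal, illustrative implementation of weighted reciprocal-rank fusion (RRF), a common way to merge keyword and vector result lists; the document IDs, weights, and the `k` dampening constant are assumptions for the example, not details from the episode.

```python
# Weighted reciprocal-rank fusion (RRF) of two ranked result lists:
# one from keyword search, one from vector search. Tuning the weights
# controls how exact matches (e.g., a brand name) rank against
# semantically similar documents.

def rrf_fuse(keyword_ids, vector_ids,
             keyword_weight=0.6, vector_weight=0.4, k=60):
    """Merge two ranked ID lists; higher fused score ranks first.

    k dampens the influence of any single list (60 is a common default).
    """
    scores = {}
    for rank, doc_id in enumerate(keyword_ids):
        scores[doc_id] = scores.get(doc_id, 0.0) + keyword_weight / (k + rank + 1)
    for rank, doc_id in enumerate(vector_ids):
        scores[doc_id] = scores.get(doc_id, 0.0) + vector_weight / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# An exact brand match tops the keyword list; semantic neighbors come
# from the vector list. A document appearing in both lists gets a boost.
fused = rrf_fuse(["brand-123", "doc-7"], ["doc-7", "doc-9"])
```

Because `doc-7` appears in both lists, its fused score exceeds either single-list contribution; raising `keyword_weight` pushes exact matches back toward the top.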
ADVICE

Use Aggregation Pipelines For Retrieval Logic

  • Use MongoDB's aggregation pipeline to compose search, vector search, reranking and transformations in one query.
  • Roma notes new operators like score fusion let developers control how results merge without external ETL.
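To make the one-query composition concrete, here is a hedged sketch of a hybrid-retrieval aggregation pipeline built as pymongo-style dicts. The index names, field paths, query text, and the exact shape of the rank-fusion stage are assumptions for illustration; verify the operator syntax against the documentation for your MongoDB version.

```python
# Sketch: a single aggregation pipeline that runs a keyword arm and a
# vector arm, then merges them server-side with a rank-fusion stage,
# so no external ETL is needed to combine the two result sets.
# All names below (indexes, fields, weights) are illustrative.

query_vector = [0.12, -0.08, 0.33]  # placeholder embedding

pipeline = [
    {
        "$rankFusion": {
            "input": {
                "pipelines": {
                    # Full-text (keyword) arm
                    "keyword": [
                        {"$search": {
                            "index": "default",
                            "text": {"query": "running shoes", "path": "title"},
                        }}
                    ],
                    # Vector (semantic) arm
                    "semantic": [
                        {"$vectorSearch": {
                            "index": "vector_index",
                            "path": "embedding",
                            "queryVector": query_vector,
                            "numCandidates": 100,
                            "limit": 20,
                        }}
                    ],
                }
            },
            # Weight exact keyword matches above semantic neighbors
            "combination": {"weights": {"keyword": 2, "semantic": 1}},
        }
    },
    {"$limit": 10},
]

# With a live connection this would run entirely inside the database:
# db.products.aggregate(pipeline)
```

The point of the sketch is the shape: retrieval, fusion, and post-processing stages compose in one pipeline, which is what removes the external merge step.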
ADVICE

Never Mix Private Data With Model Training

  • Keep private data out of LLM training by retrieving and inserting documents into prompts at runtime.
  • Choose LLM hosting (cloud or on‑prem) based on regulatory and security needs and control prompts sent to providers.
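The retrieve-at-runtime pattern above can be sketched as follows. This is a minimal illustration, assuming a `retrieve()` stand-in for any vector or hybrid search call against a private store and an invented prompt template; nothing here is a specific API from the episode.

```python
# Retrieval-augmented prompting: private documents are fetched at
# query time and inlined into the prompt, so they never enter model
# training. Only the assembled prompt leaves your boundary.

def retrieve(question):
    # Placeholder for a real vector/hybrid search over a private store.
    return ["Policy doc: refunds allowed within 30 days."]

def build_prompt(question):
    context = "\n".join(retrieve(question))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt("What is the refund window?")
# This prompt string is the full payload sent to the LLM provider --
# which is why hosting (cloud vs. on-prem) is a policy decision about
# where this text is allowed to travel.
```

Because the model sees private data only inside the prompt, swapping providers or moving on-prem requires no retraining, just a change in where `prompt` is sent.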