
Software Engineering Daily: Production-Grade AI Systems with Fred Roma
Jan 27, 2026. Fred Roma, SVP of Product and Engineering at MongoDB and a veteran in cloud and data management, discusses the complex AI stack: LLMs, embeddings, vector search, caching, and observability. He covers schema evolution in the LLM era, Voyage AI’s multimodal embeddings and rerankers, and how data platforms must adapt for production-grade AI systems.
Fuse Keyword And Semantic Search
- Fuse keyword search with semantic (vector) search and tune weights to surface precise results.
- Use rank/score fusion operators to control how exact matches (e.g., brand) are prioritized alongside semantic matches.
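The fusion idea above can be sketched as a single aggregation pipeline. This is a minimal sketch assuming MongoDB Atlas with the `$rankFusion` stage available; the index names (`default`, `vector_index`), collection fields (`title`, `embedding`), and weight values are illustrative, not from the episode.

```python
def hybrid_search_pipeline(query_text, query_vector,
                           keyword_weight=2.0, vector_weight=1.0):
    """Build a pipeline that fuses keyword ($search) and semantic
    ($vectorSearch) results, weighting exact matches (e.g. brand
    terms) more heavily than purely semantic ones."""
    return [
        {
            "$rankFusion": {
                "input": {
                    "pipelines": {
                        # Keyword (lexical) arm of the hybrid query.
                        "keyword": [
                            {"$search": {
                                "index": "default",
                                "text": {"query": query_text, "path": "title"},
                            }}
                        ],
                        # Semantic (vector) arm of the hybrid query.
                        "semantic": [
                            {"$vectorSearch": {
                                "index": "vector_index",
                                "path": "embedding",
                                "queryVector": query_vector,
                                "numCandidates": 100,
                                "limit": 20,
                            }}
                        ],
                    }
                },
                # Tunable weights control how each arm's ranks combine.
                "combination": {"weights": {
                    "keyword": keyword_weight,
                    "semantic": vector_weight,
                }},
            }
        },
        {"$limit": 10},
    ]
```

Raising `keyword_weight` pushes exact matches up the fused ranking; lowering it lets semantic neighbors dominate.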
Use Aggregation Pipelines For Retrieval Logic
- Use MongoDB's aggregation pipeline to compose search, vector search, reranking, and transformations in one query.
- Roma notes that new operators like score fusion let developers control how result sets merge without external ETL.
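A sketch of composing that retrieval logic in one pipeline, assuming MongoDB Atlas Vector Search; the index name, metadata filter, and projected fields are illustrative assumptions.

```python
def retrieval_pipeline(query_vector, category):
    """One aggregation pipeline: semantic retrieval, score exposure
    for downstream reranking, and result shaping, with no external ETL."""
    return [
        # 1. Semantic retrieval with a metadata pre-filter.
        {"$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": query_vector,
            "filter": {"category": category},
            "numCandidates": 200,
            "limit": 50,
        }},
        # 2. Surface the similarity score so later stages can rerank on it.
        {"$addFields": {"score": {"$meta": "vectorSearchScore"}}},
        # 3. Transform: keep only the fields the application needs.
        {"$project": {"_id": 0, "title": 1, "body": 1, "score": 1}},
        {"$limit": 10},
    ]
```

The whole flow runs server-side as one query, which is the point Roma makes about avoiding a separate ETL hop between search and application logic.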
Never Mix Private Data With Model Training
- Keep private data out of LLM training by retrieving and inserting documents into prompts at runtime.
- Choose LLM hosting (cloud or on-prem) based on regulatory and security needs, and control which prompts are sent to providers.
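The retrieve-and-insert pattern above can be sketched as a small prompt builder. This is a minimal illustration: `retrieve` stands in for any vector-search call, and the prompt wording is an assumption, not from the episode. Private documents enter only the runtime prompt, never a training set.

```python
def build_prompt(question, retrieve, k=3):
    """Fetch the top-k private documents at request time and insert
    them into the prompt context (retrieval-augmented generation)."""
    docs = retrieve(question, k)
    # Number each document so the model can cite which one it used.
    context = "\n\n".join(f"[doc {i + 1}] {d}" for i, d in enumerate(docs))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Because retrieval happens per request, revoking a document from the index immediately removes it from future answers, which is not possible once data has been baked into model weights.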

