Software Engineering Daily

Production-Grade AI Systems with Fred Roma

Jan 27, 2026
Fred Roma, SVP of Product and Engineering at MongoDB and a veteran in cloud and data management, discusses the complex AI stack: LLMs, embeddings, vector search, caching, and observability. He covers schema evolution in the LLM era, Voyage AI’s multimodal embeddings and rerankers, and how data platforms must adapt for production-grade AI systems.
ADVICE

Fuse Keyword And Semantic Search

  • Fuse keyword search with semantic (vector) search and tune weights to surface precise results.
  • Use rank/score fusion operators to control how exact matches (e.g., brand) are prioritized alongside semantic matches.
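The fusion idea above can be sketched in a few lines. This is a minimal, illustrative implementation of weighted reciprocal-rank fusion (RRF), a common way to merge keyword and vector result lists; the document IDs, weights, and the `k` dampening constant are assumptions for the example, not details from the episode.

```python
# Weighted reciprocal-rank fusion (RRF) of two ranked result lists:
# one from keyword search, one from vector search. Tuning the weights
# controls how exact matches (e.g., a brand name) rank against
# semantically similar documents.

def rrf_fuse(keyword_ids, vector_ids,
             keyword_weight=0.6, vector_weight=0.4, k=60):
    """Merge two ranked ID lists; higher fused score ranks first.

    k dampens the influence of any single list (60 is a common default).
    """
    scores = {}
    for rank, doc_id in enumerate(keyword_ids):
        scores[doc_id] = scores.get(doc_id, 0.0) + keyword_weight / (k + rank + 1)
    for rank, doc_id in enumerate(vector_ids):
        scores[doc_id] = scores.get(doc_id, 0.0) + vector_weight / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# An exact brand match tops the keyword list; semantic neighbors come
# from the vector list. A document appearing in both lists gets a boost.
fused = rrf_fuse(["brand-123", "doc-7"], ["doc-7", "doc-9"])
```

Because `doc-7` appears in both lists, its fused score exceeds either single-list contribution; raising `keyword_weight` pushes exact matches back toward the top.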
ADVICE

Use Aggregation Pipelines For Retrieval Logic

  • Use MongoDB's aggregation pipeline to compose search, vector search, reranking and transformations in one query.
  • Roma notes new operators like score fusion let developers control how results merge without external ETL.
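To make the one-query composition concrete, here is a hedged sketch of a hybrid-retrieval aggregation pipeline built as pymongo-style dicts. The index names, field paths, query text, and the exact shape of the rank-fusion stage are assumptions for illustration; verify the operator syntax against the documentation for your MongoDB version.

```python
# Sketch: a single aggregation pipeline that runs a keyword arm and a
# vector arm, then merges them server-side with a rank-fusion stage,
# so no external ETL is needed to combine the two result sets.
# All names below (indexes, fields, weights) are illustrative.

query_vector = [0.12, -0.08, 0.33]  # placeholder embedding

pipeline = [
    {
        "$rankFusion": {
            "input": {
                "pipelines": {
                    # Full-text (keyword) arm
                    "keyword": [
                        {"$search": {
                            "index": "default",
                            "text": {"query": "running shoes", "path": "title"},
                        }}
                    ],
                    # Vector (semantic) arm
                    "semantic": [
                        {"$vectorSearch": {
                            "index": "vector_index",
                            "path": "embedding",
                            "queryVector": query_vector,
                            "numCandidates": 100,
                            "limit": 20,
                        }}
                    ],
                }
            },
            # Weight exact keyword matches above semantic neighbors
            "combination": {"weights": {"keyword": 2, "semantic": 1}},
        }
    },
    {"$limit": 10},
]

# With a live connection this would run entirely inside the database:
# db.products.aggregate(pipeline)
```

The point of the sketch is the shape: retrieval, fusion, and post-processing stages compose in one pipeline, which is what removes the external merge step.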
ADVICE

Never Mix Private Data With Model Training

  • Keep private data out of LLM training by retrieving and inserting documents into prompts at runtime.
  • Choose LLM hosting (cloud or on‑prem) based on regulatory and security needs and control prompts sent to providers.
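The retrieve-at-runtime pattern above can be sketched as follows. This is a minimal illustration, assuming a `retrieve()` stand-in for any vector or hybrid search call against a private store and an invented prompt template; nothing here is a specific API from the episode.

```python
# Retrieval-augmented prompting: private documents are fetched at
# query time and inlined into the prompt, so they never enter model
# training. Only the assembled prompt leaves your boundary.

def retrieve(question):
    # Placeholder for a real vector/hybrid search over a private store.
    return ["Policy doc: refunds allowed within 30 days."]

def build_prompt(question):
    context = "\n".join(retrieve(question))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt("What is the refund window?")
# This prompt string is the full payload sent to the LLM provider --
# which is why hosting (cloud vs. on-prem) is a policy decision about
# where this text is allowed to travel.
```

Because the model sees private data only inside the prompt, swapping providers or moving on-prem requires no retraining, just a change in where `prompt` is sent.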