How AI Is Built

#045 RAG As Two Things - Prompt Engineering and Search

Mar 6, 2025
In this episode, John Berryman, who moved from aerospace engineering into search and machine learning, examines retrieval-augmented generation (RAG) as two separate problems: search and prompt engineering. He argues for treating the two independently so that each can be optimized on its own. Berryman covers prompting strategies built on structures the model already knows, testing prompts with human evaluation, and managing token limits. He also discusses the differences between chat and completion models and shares practical techniques for AI applications and workflows.
INSIGHT

RAG Splits Into Two Distinct Problems

  • RAG is two separate problems: retrieval and prompt engineering.
  • Treat and optimize search and prompting independently to find where failures occur.
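
One way to picture the split is to keep retrieval and prompt assembly in separate functions, each with its own check. The sketch below is illustrative only: the keyword-overlap search, the function names (`retrieve`, `build_prompt`, `recall_at_k`), and the toy corpus are assumptions, not anything described in the episode.

```python
def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[str]:
    """Toy keyword-overlap search (an assumed stand-in for a real search engine).

    Returns the ids of the top-k documents that share at least one term with the query.
    """
    terms = set(query.lower().split())
    scored = []
    for doc_id, text in corpus.items():
        overlap = len(terms & set(text.lower().split()))
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by id so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


def build_prompt(query: str, passages: list[str]) -> str:
    """Assemble the prompt from already-retrieved passages. Knows nothing about search."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


def recall_at_k(retrieved: list[str], relevant: set[str]) -> float:
    """Search-only metric: fraction of the relevant ids that were retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


# Hypothetical corpus, used only for illustration.
corpus = {
    "a": "token limits in prompts",
    "b": "aerospace engineering history",
    "c": "managing token budgets",
}
hits = retrieve("token limits", corpus)
prompt = build_prompt("token limits", [corpus[h] for h in hits])
```

Because `recall_at_k` looks only at retrieved ids and `build_prompt` looks only at passages, a bad answer can be traced to one side or the other: if recall is low, the search is at fault; if recall is fine, the prompt is the place to look.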
ADVICE

Stay On The Model's Familiar Path

  • Mimic formats and structures the model saw in training when you prompt.
  • Use Markdown, docstrings, or domain report formats so the model recognizes the pattern.
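
As a rough illustration of the Markdown suggestion, a prompt can be laid out like an ordinary Markdown document with a title, headed sections, and a trailing answer heading for the model to continue. The helper name `format_markdown_prompt` and the section names are hypothetical; this is a sketch, not a format prescribed in the episode.

```python
def format_markdown_prompt(title: str, sections: dict[str, str], question: str) -> str:
    """Lay out context as a Markdown document ending in an open 'Answer' section."""
    lines = [f"# {title}", ""]
    for heading, body in sections.items():
        lines += [f"## {heading}", "", body.strip(), ""]
    # End on an empty 'Answer' heading so the model's natural continuation is the answer.
    lines += ["## Question", "", question.strip(), "", "## Answer", ""]
    return "\n".join(lines)


# Hypothetical content, used only for illustration.
prompt = format_markdown_prompt(
    "Support Notes",
    {"Product": "Widget v2", "Known Issues": "Battery drains when idle."},
    "Why does the battery drain?",
)
```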
ADVICE

Start With Vibe Tests, Then Quantify

  • Start prompt tuning with human 'vibe testing' to detect obvious failures fast.
  • Then build systematic tests and use token probabilities to measure when few-shot examples stop adding value.
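
One possible reading of the token-probability idea: for each number of few-shot examples, record the log probabilities the model assigns to the tokens of the expected answer, then look for the point where adding another example barely moves the average. The sketch below assumes those log probabilities have already been collected from a model; the function names, the `min_gain` threshold, and the scores are invented for illustration and are not measurements.

```python
def mean_logprob(token_logprobs: list[float]) -> float:
    """Average log probability over the tokens of the expected answer."""
    return sum(token_logprobs) / len(token_logprobs)


def saturation_point(logprob_by_shots: list[float], min_gain: float = 0.05) -> int:
    """Return the example count after which one more example gains less than min_gain.

    logprob_by_shots[n] is the mean log probability of the expected answer
    when the prompt contains n few-shot examples.
    """
    for n in range(1, len(logprob_by_shots)):
        if logprob_by_shots[n] - logprob_by_shots[n - 1] < min_gain:
            return n - 1
    # Every added example still helped by at least min_gain.
    return len(logprob_by_shots) - 1


# Made-up scores for 0, 1, 2, 3, and 4 few-shot examples.
scores = [-2.0, -1.0, -0.6, -0.58, -0.57]
best = saturation_point(scores)
```

With these made-up scores the third example adds only about 0.02, so the function settles on two examples. The threshold is a judgment call and would need tuning against real measurements.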