
How AI Is Built #045 RAG As Two Things - Prompt Engineering and Search
Mar 6, 2025
In this discussion, John Berryman, who moved from aerospace engineering into search and machine learning, explores the dual nature of retrieval-augmented generation (RAG). He argues for treating search and prompt engineering as separate problems so that each can be optimized on its own. Berryman shares advice on prompting with structures the model already knows, on testing prompts with human evaluation, and on managing token limits. He also covers the differences between chat and completion models and practical techniques for building AI applications and workflows.
AI Snips
RAG Splits Into Two Distinct Problems
- RAG is two separate problems: retrieval and prompt engineering.
- Evaluate and optimize search and prompting independently so you can tell which of the two a failure comes from.
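One way to picture this split is a pipeline where retrieval and prompt assembly are separate functions, each with its own check. The sketch below is illustrative only: the `Doc` type, the word-overlap retriever, and the `recall_at_k` metric are assumptions made for this example, not anything described in the episode.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    text: str


def retrieve(query: str, corpus: list[Doc], k: int = 2) -> list[Doc]:
    """Toy retriever (assumption): rank documents by word overlap with the query."""
    q_words = set(query.lower().split())

    def overlap(doc: Doc) -> int:
        return len(q_words & set(doc.text.lower().split()))

    # sorted() is stable, so ties keep their original corpus order.
    return sorted(corpus, key=overlap, reverse=True)[:k]


def build_prompt(query: str, docs: list[Doc]) -> str:
    """Prompt assembly, independent of how the documents were found."""
    context = "\n".join(f"- {d.text}" for d in docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


def recall_at_k(retrieved: list[Doc], relevant_ids: set[str]) -> float:
    """Search-only metric: fraction of known-relevant docs that were returned."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for d in retrieved if d.doc_id in relevant_ids)
    return hits / len(relevant_ids)


corpus = [
    Doc("a", "the cat sat on the mat"),
    Doc("b", "rag combines search and prompting"),
    Doc("c", "bananas are yellow"),
]
query = "what does rag combine"
found = retrieve(query, corpus, k=2)
search_score = recall_at_k(found, relevant_ids={"b"})
prompt = build_prompt(query, found)
```

If `search_score` is low, the problem is in retrieval. If it is high but the final answers are still poor, the prompt is the place to look. To test the prompt on its own, `build_prompt` can be called with hand-picked documents.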
Stay On The Model's Familiar Path
- Mimic formats and structures the model saw in training when you prompt.
- Use Markdown, docstrings, or domain report formats so the model recognizes the pattern.
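As a rough illustration of the Markdown suggestion, the sketch below lays out a prompt as a Markdown document with headed sections. The function name `markdown_prompt` and the section names are hypothetical, chosen only for this example.

```python
def markdown_prompt(title: str, sections: dict[str, str], question: str) -> str:
    """Lay out context as a Markdown document that ends where the answer begins."""
    parts = [f"# {title}", ""]
    for heading, body in sections.items():
        parts += [f"## {heading}", "", body.strip(), ""]
    # Ending on an open "Answer" heading invites the model to continue the document.
    parts += ["## Question", "", question.strip(), "", "## Answer", ""]
    return "\n".join(parts)


prompt = markdown_prompt(
    title="Support ticket summary",
    sections={
        "Customer message": "My export job has been stuck since this morning.",
        "Relevant docs": "Export jobs time out after 30 minutes of inactivity.",
    },
    question="What should the customer try first?",
)
```

The same idea carries over to docstrings or a domain report template: the output should look like a document type the model is likely to have seen many times.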
Start With Vibe Tests, Then Quantify
- Start prompt tuning with human 'vibe testing' to detect obvious failures fast.
- Then build systematic tests and use token probabilities to measure when few-shot examples stop adding value.
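One possible reading of the token-probability idea is to score the expected answer under prompts with 0, 1, 2, ... few-shot examples and stop adding examples once the average token log-probability no longer improves by a meaningful margin. The sketch below assumes a hypothetical `score_fn(n_shots)` that returns the per-token log-probabilities of the expected answer; how those are obtained depends on the model API and is not shown. The `min_gain` threshold and the `fake_score` stand-in are also assumptions for illustration.

```python
from typing import Callable


def mean_logprob(token_logprobs: list[float]) -> float:
    """Average log-probability per token of the expected answer."""
    return sum(token_logprobs) / len(token_logprobs)


def shots_until_plateau(
    score_fn: Callable[[int], list[float]],
    max_shots: int,
    min_gain: float = 0.05,
) -> int:
    """Return the number of shots after which one more example gains < min_gain."""
    previous = mean_logprob(score_fn(0))
    for n in range(1, max_shots + 1):
        current = mean_logprob(score_fn(n))
        if current - previous < min_gain:
            return n - 1
        previous = current
    return max_shots


def fake_score(n_shots: int) -> list[float]:
    """Stand-in for a real model call (assumption): gains shrink as shots grow."""
    return [-2.0 / (n_shots + 1)] * 3


best_n = shots_until_plateau(fake_score, max_shots=8)
```

With this made-up scorer the gain from a sixth example drops below the threshold, so `best_n` is 5. Real measurements are noisier, so averaging over several test cases before picking a cutoff is likely to be more reliable.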


