The Deep View: Conversations

#28 - How to solve the ROI problem for AI inference - Rob May

Jan 22, 2026
Rob May, founder and CEO of Neurometric AI, shares his expertise in optimizing AI inference costs. He discusses the pressing challenge of making AI affordable enough to yield real ROI for enterprises. Rob explains his innovative approach using 'thinking algorithms' and specialized small models to reduce costs while enhancing accuracy. He also reflects on his journey back to startups, emphasizing the importance of authentic storytelling in gaining media attention and driving growth in the AI sector.
INSIGHT

Small Models Cut Inference Costs Deeply

  • Moving from large models to small task-specific models can cut inference costs drastically, often by ~90%.
  • For heavy users, even modest percentage savings on inference spend translate to massive dollar savings.
ADVICE

Benchmark With Real Workloads First

  • Run task-based leaderboards and evaluate models with your real workloads to spot where small models win.
  • Combine models with probing strategies ("thinking algorithms"), and factor in the hardware context, to optimize accuracy, latency, and cost together.
ADVICE

Automate Model Management For Users

  • Start with recommendations and dashboards, then automate model selection and routing for customers.
  • Hide most knobs: many companies prefer managed optimization over manual tuning.
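The "hide most knobs" idea reduces to a routing table the customer never sees: registered task types go to a small specialized model, and everything else falls back to a general one. All model names and task categories here are made up for the sketch.

```python
# Hypothetical routing table: task type -> small task-specific model.
ROUTES = {
    "classify": "small-classifier-v1",
    "extract": "small-extractor-v1",
}
DEFAULT_MODEL = "large-general-v1"  # assumed fallback for unrecognized tasks

def route(task_type: str) -> str:
    """Pick a model for a request; customers never touch this directly."""
    return ROUTES.get(task_type, DEFAULT_MODEL)

print(route("classify"))   # small-classifier-v1
print(route("summarize"))  # large-general-v1
```

In practice the table would be populated automatically from the benchmarking step, which is what turns recommendations and dashboards into managed optimization.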