The Engineering Leadership Podcast

Building reliable and proactive agentic systems at scale: how Shopify’s reflexive AI culture was instrumental in their development of Sidekick w/ Andrew McNamara #258

May 12, 2026
Andrew McNamara, Director of Applied Machine Learning at Shopify, leads the team behind Shopify Sidekick and builds production-scale AI assistants. He talks about Shopify’s prototype-first culture and how interns shape AI adoption. The conversation covers Sidekick’s merchant-driven vision, building ground-truth evals, subagent specializations, proactive features like Sidekick Pulse, and reliability and latency strategies.
Ask episode
AI Snips
Chapters
Books
Transcript
Episode notes
ANECDOTE

Merchant Survey Drove Sidekick's AI Co-founder Vision

  • Merchant surveys showed access to a business-savvy friend correlated strongly with store survival, motivating Sidekick as an AI co-founder.
  • Andrew recounts that merchants with mentorship-like access were far more likely to continue and succeed, inspiring Sidekick's vision.
ADVICE

Prefer Simple Orchestration Over Complex Multiagent Systems

  • Favor simple orchestration: use one model to orchestrate tool calls rather than complex multi-agent systems.
  • Andrew recommends iterating with parallel prototypes to empirically find that less-complex design wins in practice.
ADVICE

Make Evals Central With A Growing Ground Truth Set

  • Build evolving ground truth sets (GTX) and an AI judge calibrated to them to evaluate conversational agents at scale.
  • Andrew says product specs become labelled conversation rubrics; the judge must match human experts to enable scalable evals and training.
Get the Snipd Podcast app to discover more snips from this episode
Get the app