The Data Exchange with Ben Lorica

Are Multi-Agent Systems More Complex Than They Need to Be?

Apr 2, 2026
Arun Kumar, Associate Professor at UC San Diego and co-founder/CTO of RapidFire AI, researches data systems, ML engineering, and agent engineering. He discusses ensembles versus multi-agent workflows; how the two differ in memory, dynamic topologies, and tool use; and systematic evaluation, failure taxonomies, AutoML for agents, observability, and scaling experiments for robust LLM-based pipelines.
INSIGHT

MAST Benchmark Reveals Common Agent Failures

  • The MAST benchmark catalogs failure taxonomies for multi-agent traces: system design, task verification, and inter-agent misalignment.
  • Use its taxonomy post hoc to design workflows robust to common failure modes.
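Applying the taxonomy post hoc can be as simple as labeling each trace with one of the three top-level categories and tallying the counts, so you know which failure mode to design against first. A minimal Python sketch (the `FailureCategory` names and `tally_failures` helper are illustrative, not from MAST itself):

```python
from enum import Enum

# The three top-level MAST failure categories mentioned above
class FailureCategory(Enum):
    SYSTEM_DESIGN = "specification and system design"
    TASK_VERIFICATION = "task verification and termination"
    INTER_AGENT_MISALIGNMENT = "inter-agent misalignment"

def tally_failures(labeled_traces):
    """Count labeled failures per category across a set of agent traces.

    labeled_traces: iterable of (trace_id, FailureCategory) pairs,
    e.g. produced by a human or LLM judge applying the taxonomy post hoc.
    """
    counts = {c: 0 for c in FailureCategory}
    for _, category in labeled_traces:
        counts[category] += 1
    return counts
```

The resulting histogram tells you which category dominates your workflow's failures, which is the signal to redesign around.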
ADVICE

Optimize Agents For Cost And Latency Too

  • Optimize agents for eval metrics, latency, and total cost of ownership rather than raw accuracy alone.
  • Use experimentation frameworks and limit search to what the application's ROI justifies to control cost.
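One way to operationalize both bullets is a scalar objective that trades off accuracy against latency and cost, plus a spend cap that stops the search once the budget runs out. A minimal Python sketch; the weights, field names, and `search_within_budget` helper are assumptions for illustration, not a specific framework's API:

```python
def search_within_budget(trial_results, budget_usd,
                         w_latency=0.1, w_cost=0.05):
    """Scan candidate configs in order, stopping once cumulative spend
    would exceed the budget; return the best-scoring config seen.

    trial_results: iterable of dicts with "accuracy" (eval metric),
    "latency_s" (seconds), and "cost_usd" (dollars per trial).
    Weights are illustrative; set them from the application's ROI.
    """
    spent, best, best_score = 0.0, None, float("-inf")
    for r in trial_results:
        if spent + r["cost_usd"] > budget_usd:
            break  # the ROI-justified search budget is exhausted
        spent += r["cost_usd"]
        # Penalize raw accuracy by latency and per-call cost
        s = (r["accuracy"]
             - w_latency * r["latency_s"]
             - w_cost * r["cost_usd"])
        if s > best_score:
            best, best_score = r, s
    return best
```

The point is not the particular weights but that the search optimizes a composite objective and halts when further trials are no longer worth their cost.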
ADVICE

Define Eval Metrics Before Tuning Research Agents

  • For deep research agents, first codify eval metrics and task definitions before optimizing any knobs.
  • If outputs are unstructured, create gold data or programmatic checks so optimization and automation become feasible.
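For unstructured research-agent output, "programmatic checks" can start as cheap heuristics: does the report mention a small set of hand-curated gold facts, and does it cite sources? A minimal Python sketch (the `programmatic_check` helper, `[n]`-style citation pattern, and gold-fact format are illustrative assumptions):

```python
import re

def programmatic_check(report: str, gold_facts: list[str]) -> dict:
    """Cheap programmatic checks over an unstructured research report.

    gold_facts: short strings a correct report must mention, curated
    by hand as a small gold dataset for this task.
    """
    text = report.lower()
    facts_hit = sum(1 for fact in gold_facts if fact.lower() in text)
    return {
        # Fraction of gold facts the report actually mentions
        "fact_recall": facts_hit / len(gold_facts) if gold_facts else 0.0,
        # Assumes numeric bracket citations like [1]; adapt to your format
        "has_citations": bool(re.search(r"\[\d+\]", report)),
        "word_count": len(report.split()),
    }
```

Once checks like these exist, each knob change in the agent pipeline produces comparable numbers, which is what makes optimization and automation feasible.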