
The Data Exchange with Ben Lorica
Are Multi-Agent Systems More Complex Than They Need to Be?
Apr 2, 2026
Arun Kumar, Associate Professor at UC San Diego and co-founder/CTO of RapidFire AI, researches data systems, ML engineering, and agent engineering. He discusses ensembles versus multi-agent workflows, explains how memory, dynamic topologies, and tool use differ between them, and covers systematic evaluation, failure taxonomies, AutoML for agents, observability, and scaling experiments for robust LLM-based pipelines.
MAST Benchmark Reveals Common Agent Failures
- The MAST benchmark catalogs a failure taxonomy for multi-agent traces across three broad categories: system design flaws, task verification gaps, and inter-agent misalignment.
- Apply its taxonomy post hoc to your own traces to design workflows that are robust to these common failure modes.
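A post-hoc tagging pass over traces can be sketched as below. The category names and keyword markers are illustrative assumptions for this sketch, not the benchmark's actual detectors, which use richer annotation of full traces.

```python
# Sketch: tagging agent traces with MAST-style failure categories post hoc.
# Category names and keyword heuristics here are illustrative assumptions.
from collections import Counter

MAST_CATEGORIES = {
    "system_design": ["role violation", "step repetition", "lost context"],
    "inter_agent_misalignment": ["ignored input", "contradicted peer"],
    "task_verification": ["no verification", "premature termination"],
}

def tag_trace(trace_notes: list[str]) -> Counter:
    """Count MAST-style categories whose markers appear in reviewer notes."""
    tags = Counter()
    for note in trace_notes:
        for category, markers in MAST_CATEGORIES.items():
            if any(m in note.lower() for m in markers):
                tags[category] += 1
    return tags

notes = ["Agent B ignored input from Agent A",
         "Run ended with no verification step"]
print(tag_trace(notes))
```

Aggregating these tags across many runs shows which failure modes dominate, which is what then guides the workflow redesign.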
Optimize Agents For Cost And Latency Too
- Optimize agents for eval metrics, latency, and total cost of ownership rather than raw accuracy alone.
- Use experimentation frameworks, and limit the search space to what the application's ROI justifies, in order to control experimentation cost.
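One way to operationalize the first point is a composite score that penalizes latency and spend rather than ranking configurations on accuracy alone. The weights and the example configurations below are illustrative assumptions; in practice they would come from the application's ROI model.

```python
# Sketch: scoring agent configurations on eval accuracy, latency, and cost
# together. Weights are illustrative assumptions, to be set per application.
def score(accuracy: float, latency_s: float, cost_usd: float,
          w_lat: float = 0.1, w_cost: float = 2.0) -> float:
    """Higher is better: accuracy minus penalties for latency and spend."""
    return accuracy - w_lat * latency_s - w_cost * cost_usd

configs = {
    "large_model": dict(accuracy=0.92, latency_s=4.0, cost_usd=0.08),
    "small_model": dict(accuracy=0.88, latency_s=1.0, cost_usd=0.01),
}
best = max(configs, key=lambda name: score(**configs[name]))
print(best)
```

With these particular weights the cheaper, faster configuration wins despite slightly lower accuracy, which is the trade-off the snip is pointing at.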
Define Eval Metrics Before Tuning Research Agents
- For deep research agents, first codify eval metrics and task definitions before optimizing any knobs.
- If outputs are unstructured, create gold data or programmatic checks so optimization and automation become feasible.
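Programmatic checks on unstructured output can be as simple as the sketch below, which turns a free-text research answer into pass/fail signals that an optimizer can consume. The specific checks, the regex for citations, and the gold terms are all illustrative assumptions.

```python
# Sketch: programmatic checks over an unstructured research-agent answer.
# The checks and gold terms are illustrative assumptions for this example.
import re

GOLD_FACTS = ["transformer", "attention"]  # terms a correct answer must cover

def check_output(answer: str) -> dict[str, bool]:
    """Return named pass/fail signals for one answer."""
    return {
        "cites_sources": bool(re.search(r"\[\d+\]", answer)),  # [1]-style cites
        "covers_gold_facts": all(f in answer.lower() for f in GOLD_FACTS),
        "within_length": len(answer.split()) <= 300,
    }

answer = "Transformers rely on attention [1]."
print(check_output(answer))
```

Once outputs are reduced to signals like these, the tuning knobs of a deep research agent can be searched automatically against them.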

