
The Data Exchange with Ben Lorica
Are Multi-Agent Systems More Complex Than They Need to Be?
Apr 2, 2026
Arun Kumar, Associate Professor at UC San Diego and co-founder/CTO of RapidFire AI, researches data systems, ML engineering, and agent engineering. He discusses ensembles versus multi-agent workflows, explains how memory, dynamic topologies, and tool use differ between them, and covers systematic evaluation, failure taxonomies, AutoML for agents, observability, and scaling experiments for robust LLM-based pipelines.
MAST Benchmark Reveals Common Agent Failures
- The MAST benchmark catalogs a failure taxonomy for multi-agent traces across three broad categories: system design flaws, task verification gaps, and inter-agent misalignment.
- Apply its taxonomy post hoc to your own traces to design workflows that are robust to these common failure modes.
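A post-hoc tagging pass over traces can be sketched as below. The category names and keyword markers are illustrative assumptions for this sketch, not the benchmark's actual detectors, which use richer annotation of full traces.

```python
# Sketch: tagging agent traces with MAST-style failure categories post hoc.
# Category names and keyword heuristics here are illustrative assumptions.
from collections import Counter

MAST_CATEGORIES = {
    "system_design": ["role violation", "step repetition", "lost context"],
    "inter_agent_misalignment": ["ignored input", "contradicted peer"],
    "task_verification": ["no verification", "premature termination"],
}

def tag_trace(trace_notes: list[str]) -> Counter:
    """Count MAST-style categories whose markers appear in reviewer notes."""
    tags = Counter()
    for note in trace_notes:
        for category, markers in MAST_CATEGORIES.items():
            if any(m in note.lower() for m in markers):
                tags[category] += 1
    return tags

notes = ["Agent B ignored input from Agent A",
         "Run ended with no verification step"]
print(tag_trace(notes))
```

Aggregating these tags across many runs shows which failure modes dominate, which is what then guides the workflow redesign.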
Optimize Agents For Cost And Latency Too
- Optimize agents for eval metrics, latency, and total cost of ownership rather than raw accuracy alone.
- Use experimentation frameworks, and limit the search space to what the application's ROI justifies, in order to control experimentation cost.
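One way to operationalize the first point is a composite score that penalizes latency and spend rather than ranking configurations on accuracy alone. The weights and the example configurations below are illustrative assumptions; in practice they would come from the application's ROI model.

```python
# Sketch: scoring agent configurations on eval accuracy, latency, and cost
# together. Weights are illustrative assumptions, to be set per application.
def score(accuracy: float, latency_s: float, cost_usd: float,
          w_lat: float = 0.1, w_cost: float = 2.0) -> float:
    """Higher is better: accuracy minus penalties for latency and spend."""
    return accuracy - w_lat * latency_s - w_cost * cost_usd

configs = {
    "large_model": dict(accuracy=0.92, latency_s=4.0, cost_usd=0.08),
    "small_model": dict(accuracy=0.88, latency_s=1.0, cost_usd=0.01),
}
best = max(configs, key=lambda name: score(**configs[name]))
print(best)
```

With these particular weights the cheaper, faster configuration wins despite slightly lower accuracy, which is the trade-off the snip is pointing at.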
Define Eval Metrics Before Tuning Research Agents
- For deep research agents, first codify eval metrics and task definitions before optimizing any knobs.
- If outputs are unstructured, create gold data or programmatic checks so optimization and automation become feasible.
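Programmatic checks on unstructured output can be as simple as the sketch below, which turns a free-text research answer into pass/fail signals that an optimizer can consume. The specific checks, the regex for citations, and the gold terms are all illustrative assumptions.

```python
# Sketch: programmatic checks over an unstructured research-agent answer.
# The checks and gold terms are illustrative assumptions for this example.
import re

GOLD_FACTS = ["transformer", "attention"]  # terms a correct answer must cover

def check_output(answer: str) -> dict[str, bool]:
    """Return named pass/fail signals for one answer."""
    return {
        "cites_sources": bool(re.search(r"\[\d+\]", answer)),  # [1]-style cites
        "covers_gold_facts": all(f in answer.lower() for f in GOLD_FACTS),
        "within_length": len(answer.split()) <= 300,
    }

answer = "Transformers rely on attention [1]."
print(check_output(answer))
```

Once outputs are reduced to signals like these, the tuning knobs of a deep research agent can be searched automatically against them.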

