MLOps.community

Operationalizing AI Agents: From Experimentation to Production // Databricks Roundtable

Mar 30, 2026
Samraj Moorjani, an MLflow engineer focused on agent quality and observability; Apurva Misra, an AI consultant who helps startups scope POCs and automation; and Ben Epstein, a CTO building LLM-driven internal tools for property teams, discuss scaling agent reliability, observability, and testing strategies. The conversation covers eval-driven development, sandboxing, and production-grade monitoring for agent workflows.
ANECDOTE

Slack Agents Replaced A Team Of Engineers

  • Ben describes internal Slack agents that handle many requests previously routed to engineers, providing data, spreadsheets, and analysis on demand.
  • He says a team of six now does work that previously required 40 people, because the agents give everyone access to company data and context.
ADVICE

Build Narrow Composable Agents First

  • Start with narrow, composable agents rather than one monolithic assistant to reduce failure modes and speed iteration.
  • Break problems into state transitions or classification/regression tasks so you can build test sets and validate behavior.
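The framing above can be made concrete: once a step in an agent workflow is cast as a classification task, it gets a labeled test set and a measurable accuracy, just like any supervised problem. A minimal sketch, where `classify_intent` is a hypothetical stand-in for whatever model or prompt performs that step:

```python
# Hedged sketch: treating one narrow agent step as a classification task
# with a small labeled test set. `classify_intent` is a toy rule-based
# stand-in; in practice this would call an LLM or a trained classifier.

def classify_intent(message: str) -> str:
    """Route an incoming message to a handler category."""
    text = message.lower()
    if "refund" in text:
        return "billing"
    if "password" in text:
        return "account"
    return "general"

# A labeled test set makes the step's behavior verifiable in isolation,
# so regressions surface before the step is composed into a larger agent.
TEST_SET = [
    ("I want a refund for last month", "billing"),
    ("I forgot my password", "account"),
    ("What are your opening hours?", "general"),
]

def accuracy(cases) -> float:
    correct = sum(1 for text, label in cases if classify_intent(text) == label)
    return correct / len(cases)
```

Because each narrow step has its own test set, a failure points at one component rather than at a monolithic assistant.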
INSIGHT

Eval-Driven Development Is TDD For GenAI

  • Treat eval-driven development like TDD: write unit-like evaluations, integration tests, and production telemetry for GenAI systems.
  • Use evaluations as verifiable goals so agents can self-check and improve via feedback loops.
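The TDD analogy above can be sketched in code: an evaluation written up front acts like a failing unit test, and the agent loop retries until it passes. All names here (`run_agent`, `evaluate`, `improve_until_passing`) are hypothetical placeholders, not any specific framework's API:

```python
# Hedged sketch of eval-driven development: a verifiable, binary evaluation
# defined before the agent is built, used as the goal of a feedback loop.

def run_agent(prompt: str, hint: str = "") -> str:
    # Stand-in for a real agent call; the hint crudely simulates the agent
    # incorporating feedback from a failed evaluation.
    return "Summary: " + prompt[:20] + ((" " + hint) if hint else "")

def evaluate(output: str) -> bool:
    # Unit-like evaluation: a pass/fail criterion the agent can self-check
    # against, analogous to an assertion in TDD.
    return output.startswith("Summary:") and "sources" in output

def improve_until_passing(prompt: str, max_iters: int = 3) -> str:
    hint = ""
    for _ in range(max_iters):
        output = run_agent(prompt, hint)
        if evaluate(output):
            return output
        hint = "cite sources"  # feedback loop: eval failure informs the retry
    return output  # last attempt, even if still failing (telemetry would flag it)
```

In production, the same evaluations run as telemetry over live traffic, so the "tests" keep executing after deployment rather than only in CI.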