

MLOps.community
Demetrios
Relaxed conversations about getting AI into production, whatever shape that may come in (agentic, traditional ML, LLMs, vibes, etc.)

Mar 31, 2026 • 59min
Spec Driven Development, Workflows, and the Recent Coding Agent Conference
Jens Bodal, a senior software engineer who builds backend systems and developer tooling, discusses how abstract AI agents shift work from coding to defining intent. He covers agent orchestration, evaluation challenges, security and sandboxing, local self-hosted stacks, spec-driven workflows, and how teams must rethink ownership, reviews, and telemetry in an agent-first world.

Mar 30, 2026 • 1h 1min
Operationalizing AI Agents: From Experimentation to Production // Databricks Roundtable
Samraj Moorjani is an MLflow engineer focused on agent quality and observability, Apurva Misra is an AI consultant helping startups scope POCs and automation, and Ben Epstein is a CTO building LLM-driven internal tools for property teams. They discuss scaling agent reliability, observability, and testing strategies. The conversation covers eval-driven development, sandboxing, and production-grade monitoring for agent workflows.

Mar 27, 2026 • 56min
arrowspace: Vector Spaces and Graph Wiring
Lorenzo Moriondo, Technical Lead for AI at tuned.org.uk and creator of Arrowspace, builds graph-based tools for LLM systems. He talks about turning embeddings into graphs to reveal structure, topological versus geometric search, and using graph wiring for smarter retrieval, dataset curation, drift detection, and agent memory.

Mar 20, 2026 • 51min
Agentic Marketplace
Donné Stevenson is an ML engineer building scalable GenAI infrastructure and rollout strategies, and Pedro Chaves is a data science manager shaping GenAI search and marketplace recommendations. They explore agent-driven home search, enriched listings with commute and safety data, progressive permissioning and undo patterns, agent-to-agent marketplace layers, and agent-powered uploads, negotiation, and logistics.

Mar 17, 2026 • 1h 1min
Durable Execution and Modern Distributed Systems
Johann Schleier-Smith, Technical Lead for AI at Temporal Technologies, builds reliable infrastructure for long-running production AI workflows. He explains durable execution and why it makes regular Python programs crash-proof and scalable. Topics include deterministic workflows, cross-region resilience, integrating durable state with databases, using durable execution with LLMs and agents, and practical operational patterns.

Feb 24, 2026 • 1h 26min
Performance Optimization and Software/Hardware Co-design across PyTorch, CUDA, and NVIDIA GPUs
Chris Fregly, AI performance engineer, founder, and author, walks through software/hardware co-design for PyTorch, CUDA, and NVIDIA GPUs. He talks about mechanical sympathy, GPU generations, NVLink and networking, kernel tuning with coding agents, and infrastructure trade-offs for training versus inference. The discussion is technical and focused on building scalable, high-performance AI systems.

Feb 19, 2026 • 1h 6min
Serving LLMs in Production: Performance, Cost & Scale // CAST AI Roundtable
Igor Šušić is a founding ML engineer focused on large-scale inference and performance tuning, and Ioana Apetrei is a senior product manager building accessible, cost-effective LLM deployment. They debate why deployments fail at scale and cover model routing and the trade-off between cost and accuracy. They also explain time-sharing GPUs, quantization, prefill vs decode separation, and when self-hosting or managed endpoints make sense.

Feb 17, 2026 • 1h 3min
The Future of Information Retrieval: From Dense Vectors to Cognitive Search
Rahul Raja, Staff Software Engineer at LinkedIn who builds large-scale search and retrieval systems, discusses the shift from keyword search to dense, vector-based retrieval. He explores cognitive search, LLM-driven reasoning and personalization, scalability of billions of embeddings, evaluation signals beyond recall, and challenges like embedding drift, access control, and cost-effective infrastructure.

Feb 13, 2026 • 26min
Rethinking Notebooks Powered by AI
Vincent Warmerdam, a founding engineer at marimo, works on reinventing Python notebooks for reactive, reproducible, and interactive data work. He discusses marimo’s reactive execution model, Molab hosted GPU notebooks, LLM and agent integrations that inspect and modify notebook state, dynamic UI generation, WASM/pyodide exports, and treating notebooks as shareable, testable Python apps.

Feb 10, 2026 • 57min
Software Engineering in the Age of Coding Agents: Testing, Evals, and Shipping Safely at Scale
Ereli Eran, a founding engineer at 7AI who builds agentic AI systems for security ops, joins to unpack real-world agent engineering. He covers how agentic systems mix deterministic code with stochastic LLM behavior. The conversation covers testing, evals, safety gates, progressive prompts, model hybrids, observability and audit trails, and strategies for shipping agents reliably at scale.


