MLOps.community

Demetrios
21 snips
Mar 31, 2026 • 59min

Spec Driven Development, Workflows, and the Recent Coding Agent Conference

Jens Bodal, a senior software engineer who builds backend systems and developer tooling, discusses how abstract AI agents shift work from coding to defining intent. He covers agent orchestration, evaluation challenges, security and sandboxing, local self-hosted stacks, spec-driven workflows, and how teams must rethink ownership, reviews, and telemetry in an agent-first world.
33 snips
Mar 30, 2026 • 1h 1min

Operationalizing AI Agents: From Experimentation to Production // Databricks Roundtable

Samraj Moorjani, an MLflow engineer focused on agent quality and observability, Apurva Misra, an AI consultant helping startups scope POCs and automation, and Ben Epstein, a CTO building LLM-driven internal tools for property teams, discuss scaling agent reliability, observability, and testing strategies. The conversation covers eval-driven development, sandboxing, and production-grade monitoring for agent workflows.
35 snips
Mar 27, 2026 • 56min

arrowspace: Vector Spaces and Graph Wiring

Lorenzo Moriondo, Technical Lead for AI at tuned.org.uk and creator of Arrowspace, builds graph-based tools for LLM systems. He talks about turning embeddings into graphs to reveal structure, topological versus geometric search, and using graph wiring for smarter retrieval, dataset curation, drift detection, and agent memory.
37 snips
Mar 20, 2026 • 51min

Agentic Marketplace

Donné Stevenson, an ML engineer building scalable GenAI infrastructure and rollout strategies, and Pedro Chaves, a data science manager shaping GenAI search and marketplace recommendations, explore agent-driven home search, enriched listings with commute and safety data, progressive permissioning and undo patterns, agent-to-agent marketplace layers, and agent-powered uploads, negotiation, and logistics.
40 snips
Mar 17, 2026 • 1h 1min

Durable Execution and Modern Distributed Systems

Johann Schleier-Smith, Technical Lead for AI at Temporal Technologies, builds reliable infrastructure for long-running production AI workflows. He explains durable execution and why it makes regular Python programs crash-proof and scalable. Topics include deterministic workflows, cross-region resilience, integrating durable state with databases, using durable execution with LLMs and agents, and practical operational patterns.
58 snips
Feb 24, 2026 • 1h 26min

Performance Optimization and Software/Hardware Co-design across PyTorch, CUDA, and NVIDIA GPUs

Chris Fregly, AI performance engineer, founder, and author, walks through software/hardware co-design for PyTorch, CUDA, and NVIDIA GPUs. He discusses mechanical sympathy, GPU generations, NVLink and networking, kernel tuning with coding agents, and infrastructure trade-offs for training versus inference. The episode is technical and focused on building scalable, high-performance AI systems.
44 snips
Feb 19, 2026 • 1h 6min

Serving LLMs in Production: Performance, Cost & Scale // CAST AI Roundtable

Igor Šušić, a founding ML engineer focused on large-scale inference and performance tuning, and Ioana Apetrei, a senior product manager building accessible, cost-effective LLM deployment, debate why deployments fail at scale. They cover model routing and cost versus accuracy, and explain time-sharing GPUs, quantization, prefill versus decode separation, and when self-hosting or managed endpoints make sense.
23 snips
Feb 17, 2026 • 1h 3min

The Future of Information Retrieval: From Dense Vectors to Cognitive Search

Rahul Raja, Staff Software Engineer at LinkedIn who builds large-scale search and retrieval systems, discusses the shift from keyword search to dense, vector-based retrieval. He explores cognitive search, LLM-driven reasoning and personalization, scaling to billions of embeddings, evaluation signals beyond recall, and challenges like embedding drift, access control, and cost-effective infrastructure.
58 snips
Feb 13, 2026 • 26min

Rethinking Notebooks Powered by AI

Vincent Warmerdam, a founding engineer at marimo, is reinventing Python notebooks for reactive, reproducible, and interactive data work. He discusses marimo’s reactive execution model, Molab hosted GPU notebooks, LLM and agent integrations that inspect and modify notebook state, dynamic UI generation, WASM/pyodide exports, and treating notebooks as shareable, testable Python apps.
70 snips
Feb 10, 2026 • 57min

Software Engineering in the Age of Coding Agents: Testing, Evals, and Shipping Safely at Scale

Ereli Eran, a founding engineer at 7AI who builds agentic AI systems for security ops, joins to unpack real-world agent engineering. He covers how agentic systems mix deterministic code with stochastic LLM behavior. The conversation spans testing, evals, safety gates, progressive prompts, model hybrids, observability and audit trails, and strategies for shipping agents reliably at scale.
