

MLOps.community
Demetrios
Relaxed conversations about getting AI into production, whatever shape that may come in (agentic, traditional ML, LLMs, vibes, etc.)
Episodes

16 snips
Mar 27, 2026 • 56min
arrowspace: Vector Spaces and Graph Wiring
Lorenzo Moriondo, Technical Lead for AI at tuned.org.uk and creator of Arrowspace, builds graph-based tools for LLM systems. He talks about turning embeddings into graphs to reveal structure, topological versus geometric search, and using graph wiring for smarter retrieval, dataset curation, drift detection, and agent memory.

37 snips
Mar 20, 2026 • 51min
Agentic Marketplace
Donné Stevenson is an ML engineer building scalable GenAI infrastructure and rollout strategies. Pedro Chaves is a data science manager shaping GenAI search and marketplace recommendations. They explore agent-driven home search, enriched listings with commute and safety data, progressive permissioning and undo patterns, agent-to-agent marketplace layers, and agent-powered uploads, negotiation, and logistics.

27 snips
Mar 17, 2026 • 1h 1min
Durable Execution and Modern Distributed Systems
Johann Schleier-Smith, Technical Lead for AI at Temporal Technologies, builds reliable infrastructure for long-running production AI workflows. He explains durable execution and why it makes regular Python programs crash-proof and scalable. Topics include deterministic workflows, cross-region resilience, integrating durable state with databases, using durable execution with LLMs and agents, and practical operational patterns.

58 snips
Feb 24, 2026 • 1h 26min
Performance Optimization and Software/Hardware Co-design across PyTorch, CUDA, and NVIDIA GPUs
Chris Fregly, AI performance engineer, founder, and author, walks through software/hardware co-design for PyTorch, CUDA, and NVIDIA GPUs. He covers mechanical sympathy, GPU generations, NVLink and networking, kernel tuning with coding agents, and infrastructure trade-offs for training versus inference. Short, technical, and focused on building scalable, high-performance AI systems.

44 snips
Feb 19, 2026 • 1h 6min
Serving LLMs in Production: Performance, Cost & Scale // CAST AI Roundtable
Igor Šušić is a founding ML engineer focused on large-scale inference and performance tuning. Ioana Apetrei is a senior product manager building accessible, cost-effective LLM deployment. They debate why deployments fail at scale and cover model routing and cost versus accuracy. They also explain time-sharing GPUs, quantization, prefill versus decode separation, and when self-hosting or managed endpoints make sense.

23 snips
Feb 17, 2026 • 1h 3min
The Future of Information Retrieval: From Dense Vectors to Cognitive Search
Rahul Raja, Staff Software Engineer at LinkedIn who builds large-scale search and retrieval systems, discusses the shift from keyword search to dense, vector-based retrieval. He explores cognitive search, LLM-driven reasoning and personalization, scaling to billions of embeddings, evaluation signals beyond recall, and challenges like embedding drift, access control, and cost-effective infrastructure.

58 snips
Feb 13, 2026 • 26min
Rethinking Notebooks Powered by AI
Vincent Warmerdam, a founding engineer at marimo, is reinventing Python notebooks for reactive, reproducible, and interactive data work. He discusses marimo’s reactive execution model, Molab hosted GPU notebooks, LLM and agent integrations that inspect and modify notebook state, dynamic UI generation, WASM/pyodide exports, and treating notebooks as shareable, testable Python apps.

70 snips
Feb 10, 2026 • 57min
Software Engineering in the Age of Coding Agents: Testing, Evals, and Shipping Safely at Scale
Ereli Eran, founding engineer at 7AI who builds agentic AI systems for security ops, joins to unpack real-world agent engineering. He covers how agentic systems mix deterministic code with stochastic LLM behavior. He also discusses testing, evals, safety gates, progressive prompts, model hybrids, observability and audit trails, and strategies for shipping agents reliably at scale.

62 snips
Feb 6, 2026 • 52min
Physical AI: Teaching Machines to Understand the Real World
Nick Gillian, Co-Founder and CTO of Archetype AI and builder of Newton, is an expert in real-time sensor understanding. He discusses Physical AI: single foundation models for diverse sensors, fusing non-visual modalities, and the engineering challenges of massive continuous data. He also covers evaluation strategies, product-driven dataset design, and turning research into deployable, safety-minded systems.

46 snips
Feb 3, 2026 • 1h 7min
Speed and Scale: How Today's AI Datacenters Are Operating Through Hypergrowth
Kris Beevers, CEO and co-founder of NetBox Labs and veteran network engineer with a Ph.D., discusses how modern AI datacenters handle hypergrowth. He talks about modeling infrastructure as a single system of record, tackling power and procurement bottlenecks, and using programmatic blueprints and digital twins to speed builds. He highlights rapid iteration, robotics in racking, and the need for vendor data standards for automation.


