KubeFM

Aug 26, 2025 • 28min

Teaching Kubernetes to Scale with a MacBook Screen Lock, with Brian Donelan

Brian Donelan, VP of Cloud Platform Engineering at JPMorgan Chase, shares his innovative side project that automates Kubernetes workload scaling based on MacBook screen lock status. He connects macOS notifications to CloudWatch, achieving impressive 80% cost savings by scaling resources to zero when idle. The discussion highlights KEDA's unique event-driven scaling capabilities, creative metrics for different industries, and strategies for optimizing cloud resource usage, making workload management more efficient and sustainable.
Aug 19, 2025 • 41min

Building a Carbon and Price-Aware Kubernetes Scheduler, with Dave Masselink

Data centers consume over 4% of global electricity and this number is projected to triple in the next few years due to AI workloads. Dave Masselink, founder of Compute Gardener, discusses how he built a Kubernetes scheduler that makes scheduling decisions based on real-time carbon intensity data from power grids.
You will learn:
- How carbon-aware scheduling works: using real-time grid data to shift workloads to periods when electricity generation has lower carbon intensity, without changing energy consumption
- Technical implementation details: building custom Kubernetes schedulers using the scheduler plugin framework, including pre-filter and filter stages for carbon and time-of-use pricing optimization
- Energy measurement strategies: approaches for tracking power consumption across CPUs, memory, and GPUs
Sponsor: This episode is brought to you by Testkube, the ultimate Continuous Testing Platform for Cloud Native applications. Scale fast, test continuously, and ship confidently. Check it out at testkube.io
More info: Find all the links and info for this episode here: https://ku.bz/zk2xM1lfW
Interested in sponsoring an episode? Learn more.
Aug 12, 2025 • 33min

How Policies Saved us a Thousand Headaches, with Alessandro Pomponio

Alessandro Pomponio from IBM Research explains how his team transformed their chaotic bare-metal clusters into a well-governed, self-service platform for AI and scientific workloads. He walks through their journey from manual cluster interventions to a fully automated GitOps-first architecture using ArgoCD, Kyverno, and Kueue to handle everything from policy enforcement to GPU scheduling.
You will learn:
- How to implement GitOps workflows that reduce administrative burden while maintaining governance and visibility across multi-tenant research environments
- Practical policy enforcement strategies using Kyverno to prevent GPU monopolization, block interactive pod usage, and automatically inject scheduling constraints
- Fair resource sharing techniques with Kueue to manage scarce GPU resources across different hardware types while supporting both specific and flexible allocation requests
- Organizational change management approaches for gaining stakeholder buy-in, upskilling admin teams, and communicating policy changes to research users
Sponsor: This episode is brought to you by Testkube, the ultimate Continuous Testing Platform for Cloud Native applications. Scale fast, test continuously, and ship confidently. Check it out at testkube.io
More info: Find all the links and info for this episode here: https://ku.bz/5sK7BFZ-8
Interested in sponsoring an episode? Learn more.
Jun 24, 2025 • 20min

Dear friend, you have built a Kubernetes, with Mac Chaffee

Mac Chaffee, a platform engineer and security champion, dives into the underestimated complexities of running modern applications. He discusses how overconfidence can lead to costly mistakes, particularly when teams reject proven tools like Kubernetes. Mac highlights the tipping point where DIY solutions become burdensome and stresses the importance of mentorship in preventing poor technical decisions. He advocates for transparency in technology, urging teams to establish effective guardrails rather than hiding complexity.
Jun 17, 2025 • 23min

Beyond Kubernetes: Serverless Execution Models for Variable Workloads, with Marc Campora

Marc Campora, a systems consultant with experience in high-throughput platforms, shares his analysis of a real customer deployment with 500+ microservices. He breaks down the cost implications, technical constraints, and operational trade-offs between Kubernetes containers and AWS Lambda functions based on actual production data and migration assessments.
You will learn:
- Cost analysis frameworks for comparing Lambda vs Kubernetes across different traffic patterns, including specific examples of 3x savings potential and the 80/20 rule for service utilization
- Migration complexity factors when moving existing microservices to Lambda, including cold start issues, runtime model changes, and why it's often a complete rewrite rather than a simple port
- Decision criteria for choosing between platforms based on traffic consistency, computational requirements, and operational overhead tolerance
Sponsor: This episode is sponsored by LearnKube. Get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info: Find all the links and info for this episode here: https://ku.bz/5gMTkzLhV
Interested in sponsoring an episode? Learn more.
Jun 10, 2025 • 36min

Shared Nothing, Shared Everything: The Truth About Kubernetes Multi-Tenancy, with Molly Sheets

Molly Sheets, Director of Engineering for Kubernetes at Zynga, leads platform engineering behind popular games like Words with Friends. She discusses how her team shifted from a one-cluster-per-team model to a more efficient multi-tenant architecture. Molly highlights the dangers of slowing deployment speeds and shares practical strategies for resource allocation and SLOs. She also delves into the unique challenges of Kubernetes in the gaming sector and candidly addresses the balance between technical roles and her journey as a new parent.
Jun 3, 2025 • 48min

My pipelines from GitLab Commit to ArgoCD got beaten by FTP, with David Pech

David Pech, a Staff Cloud Ops Engineer at Wrike with all CNCF certifications, shares his insights on cloud-native adoption challenges. He recounts how a sophisticated GitLab CI/CD setup was overtaken by simple FTP due to cultural resistance. David discusses the hidden costs of complex tooling and the importance of team readiness over technical superiority. He offers practical strategies for gradual cloud transitions, emphasizing in-house expertise and management advocacy, while also reflecting on his own journey through cloud technologies and Docker misconceptions.
May 27, 2025 • 36min

Performance testing Kubernetes workloads, with Stephan Schwarz

Stephan Schwarz, a DevOps engineer at iits-consulting specializing in Kubernetes, shares invaluable insights on performance testing workloads. He discusses defining performance metrics and the methodology of testing individual pods to uncover their limitations. The conversation delves into the impact of shared Kubernetes components on results and the complexities of configuring Horizontal Pod Autoscaling. Stephan also highlights the importance of tools like OpenTelemetry for monitoring performance in production, emphasizing a holistic approach to testing and continuous learning in the DevOps landscape.
May 20, 2025 • 33min

Managing 100s of Kubernetes Clusters using Cluster API, with Zain Malik

Discover how to manage Kubernetes at scale with declarative infrastructure and automation principles. Zain Malik shares his experience managing multi-tenant Kubernetes clusters with up to 30,000 pods across clusters capped at 950 nodes. He explains how his team transitioned from Terraform to Cluster API for declarative cluster lifecycle management, contributing upstream to improve AKS support while implementing GitOps workflows.
You will learn:
- How to address challenges in large-scale Kubernetes operations, including node pool management inconsistencies and lengthy provisioning times
- Why Cluster API provides a powerful foundation for multi-cloud cluster management, and how to extend it with custom operators for production-specific needs
- How implementing GitOps principles eliminates manual intervention in critical operations like cluster upgrades
- Strategies for handling production incidents and bugs when adopting emerging technologies like Cluster API
Sponsor: This episode is sponsored by LearnKube. Get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info: Find all the links and info for this episode here: https://ku.bz/5PLksqVlk
Interested in sponsoring an episode? Learn more.
May 13, 2025 • 46min

Super-Scaling Open Policy Agent with Batch Queries, with Nicholaos Mouzourakis

Nicholaos Mouzourakis, a Staff Product Security Engineer at Gusto, dives into the intricacies of scaling authorization within Kubernetes using Open Policy Agent (OPA). He explains how traditional approaches fall short in microservices and shares his team's journey optimizing OPA performance through batch queries for impressive efficiency gains. Nicholaos also highlights surprising interactions between Kubernetes CPU limits and Go's performance, alongside deployment strategies that ensure smooth operations in production. His unique transition from the gaming industry enriches his insights.
