Adventures in DevOps

Will Button, Warren Parad

Join us as experienced experts discuss cutting-edge challenges in the world of DevOps, from applying the mindset at your company, to career growth and leadership challenges within engineering teams, to avoiding the common antipatterns. Every episode you'll meet a new industry veteran guest with their own unique story.

Episodes

Mar 28, 2026 • 53min

Upskilling your agents

In this adventure, we sit down with Dan Wahlin, Principal of DevRel for JavaScript, AI, and Cloud at Microsoft, to explore the complexities of modern infrastructure. We examine how cloud platforms like Azure function as "building blocks", which of course can quickly become overwhelming without the right instruction manuals. To bridge this gap, one potential solution we discuss is the emerging reliance on AI "skills": specialized markdown files that give coding agents the exact knowledge needed to deploy complex, poorly documented open-source projects to container apps without requiring deep infrastructure expertise. And we say the quiet part out loud as we review how handing the keys over to autonomous agents introduces terrifying new attack vectors: the security nightmare of prompt injections and the careless execution of unvetted AI skills. It's a blast from the past, and we reminisce about how downloading random agent instructions today resembles running untrusted executables from early internet sites. While tools like OpenClaw purport to offer incredible automation, such as allowing agents to scour the internet and execute code without human oversight, this has already led to disastrous leaks of API keys. We emphasize the critical necessity of validating skills through trusted repositories, where even having agents perform security reviews on the code before execution is not enough. Finally, we tackle the philosophical debate around AI productivity and why Dorota's claim that LLMs raise the floor and not the ceiling is so spot on. The standout pick deserves a mention: a fascinating 1983 paper titled "Ironies of Automation" by Lisanne Bainbridge. This paper perfectly predicts our current dilemma: automating systems often leaves the most complex, difficult tasks to human operators, proving that as automation scales, the need for rigorous human monitoring actually increases, destroying the very value the original innovation was meant to capture.
💡 Notable Links: Agent Skill Marketplace • AI Fatigue is real • Episode: Does Productivity even exist?
🎯 Picks: Warren - Paper: Ironies of Automation (& AI) • Dan - Tool: SkillShare

Mar 20, 2026 • 52min

There's no way it's DNS...

How much do you really know about the protocol that everything is built upon? This week, we go behind the scenes with Simone Carletti, a 13-year industry veteran and CTO at DNSimple, to explore the hidden complexities of DNS. We attempt to uncover why exactly DNS is often the last place developers check during an outage, drawing fascinating parallels between modern web framework abstractions and network-level opacity. Simone shares why his team relies on bare-metal machines instead of cloud providers to run their Erlang-based authoritative name servers, highlighting the critical need to control BGP routing. We trade incredible war stories, from Facebook locking themselves out of their own data centers due to a BGP error, to a massive 2014 DDoS attack that left DNSimple unable to access their own log aggregation service. The conversation also tackles the reality of implementing new standards like SVCB and HTTPS records, and why widespread DNSSEC adoption might require an industry-wide mandate. And of course we have the picks, but I'm not spoiling this week's just yet...
💡 Notable Links: Episode: IPv6 • SVCB + HTTPS DNS Resource Records RFC 9460 • Avian Carrier RFC 1149
🎯 Picks: Warren - Book: One Second After • Simone - Recommended Diving locations in Italy

Mar 15, 2026 • 49min

Getting better at networking

We are joined by Daan Boerlage, CTO at Mavexa, as we tackle the long-awaited arrival of IPv6 in cloud infrastructure. Here, we highlight how migrating to an IPv6-native setup eliminates public/private subnet complexity and expensive NAT gateways, while entirely sidestepping the nightmare of IP collisions during VPC peering. Beyond the financial savings of ditching IPv4 charges, we explore the technical superiority of IPv6. Daan breaks down just how mind-bogglingly large the address space is, and focuses on how it solves serverless IP exhaustion while systematically debunking the pervasive myth that NAT is a security feature. We also discuss how IPv6's end-to-end connectivity paves the way for next-generation protocols like QUIC, HTTP/3, and WebTransport. The episode rounds out with a cathartic venting session about legacy architecture, detailing a grueling nine-year migration away from a central shared database that ironically culminated in a move to Salesforce. Fittingly, Daan's pick is SolidJS, which he praises for its intuitive use of signals and fine-grained reactivity compared with React. And Warren's pick explores storing data in the internet itself by leveraging the dwell time of ICMP ping packets.
💡 Notable Links: FOSDEM talk on the internet of threads • Hilbert Map of IPv6 address space
🎯 Picks: Warren - Harder Drive: what we didn't want or need • Daan - SolidJS
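
To put the size of the IPv6 address space in perspective, here is a small arithmetic sketch using Python's standard `ipaddress` module. It is not from the episode; the `2001:db8::/64` prefix is just the reserved documentation range, chosen here as an assumption for illustration.

```python
import ipaddress

# Total addresses available in each protocol (32-bit vs 128-bit addressing).
IPV4_TOTAL = 2 ** 32
IPV6_TOTAL = 2 ** 128

# A single /64 subnet taken from the IPv6 documentation prefix (illustrative only).
subnet = ipaddress.ip_network("2001:db8::/64")

# How many complete IPv4 internets fit inside that one subnet.
ipv4_internets_per_subnet = subnet.num_addresses // IPV4_TOTAL

print(f"IPv4 addresses:            {IPV4_TOTAL:,}")
print(f"Addresses in one /64:      {subnet.num_addresses:,}")
print(f"IPv4 internets per /64:    {ipv4_internets_per_subnet:,}")
print(f"IPv6 total / IPv4 total =  2**{(IPV6_TOTAL // IPV4_TOTAL).bit_length() - 1}")
```

A single /64 holds 2^64 addresses, which is 2^32 times the entire IPv4 space, so handing every function or container its own address stops being a scarcity problem.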

Mar 6, 2026 • 58min

Varied Designer Does Vibecoding: Why testing always wins

Matt Edmunds, longtime UX director and Principal UX Consultant at Tiny Pixls, talks about vibe coding and how nontraditional builders use LLMs to prototype real tools. He recounts building a virtual audio driver with Claude and compares model behaviors. Discussion covers prompt-first strategies, using tests and schemas as anchors, the slop-app economy, and economic and hardware limits facing AI providers.

Feb 20, 2026 • 32min

DevOps trifecta: documentation, reliability, and feature flags

We dive into the shifting landscape of developer relations and the new necessity of optimizing documentation for both humans and LLMs. Melinda Fekete joins from Unleash and suggests that transitioning to a docs platform can help get this right, using llms.txt files to cleanly expose content to AI models. The conversation then takes a look at the June GCP outage, which was triggered by a single IAM policy change. This illustrates that even with world-class CI/CD pipelines, deploying code using runtime controls such as feature flags is still risky. Feature flags can't even save GCP and other cloud providers, so what hope do the rest of us have? Finally, we discuss the practical implementation of these systems, advocating for "boring technology" like polling over streaming to ensure reliability, and conducting internal "breakathons" to test features before a full rollout.
💡 Notable Links: Diátaxis - Who is this article for? • Fern - Docs Platform • Cloudflare - Feature Flag causes outage • AWS - Graceful degradation • Building for 5 nines reliability • Episode: Latency is always more important than freshness • Episode: DORA 2025 Report
🎯 Picks: Warren - Show: Bosch - LA Detective procedural • Melinda - Wavelength - Party Game

Jan 30, 2026 • 51min

The Productivity Delusion: Gizmos, Resentment Metrics, and the Art of Deleting Code

Dorota, CEO of Authress, returns to apply the US Supreme Court’s definition of obscenity to a scandalous topic: Engineering Productivity. In a world obsessed with AI-driven efficiency, Dorota and Warren argue that software development productivity has nothing to do with manufacturing "gizmos" and everything to do with feelings. They dismantle the factory-floor mentality that equates typing speed with value, suggesting instead that the most productive work often happens while staring out a train window or disassociating in the shower. The conversation takes a dark turn into the reality of performance reviews. If productivity is subjective, how do you decide who gets promoted? Dorota proposes the "Resentment Metric": ignoring Jira tickets in favor of figuring out who the team has secret concerns about. They also roast the "100% utilization" fallacy, noting that a fully utilized highway is just a parking lot, and the same logic applies to engineering teams that don't schedule downtime for actual thinking. Ultimately, they land on a definition of productivity that would make any optimizer proud: deleting things. If the best code is no code, then the most productive engineer is the one removing waste, deleting replicas, and emptying S3 buckets. The episode wraps up with a credit-card-sized transformer (it's a tripod) and a book recommendation on why your international colleagues might be misinterpreting your silence.
💡 Notable Links: DevOps Episode: DORA 2025 Report • Research: Happy software developers solve problems better
🎯 Picks: Warren - Book: The Culture Map • Dorota - GEOMETRICAL Pocket tripod

Jan 16, 2026 • 51min

Project Yellow Brick Road: Creative, Practical, and Unconventional Engineering

Episode Sponsor: Rootly AI - https://dev0ps.fyi/rootlyai
Paul Conroy, CTO at Square1, joins the show to prove that the best defense against malicious bots isn't always a firewall; sometimes, it’s creative data poisoning. Paul recounts a legendary story from the Irish property market where a well-funded competitor attempted to solve their "chicken and egg" problem by scraping his company's listings. Instead of waiting years for lawyers, Paul’s team fed the scrapers "Project Yellow Brick Road": fake listings that placed the British Prime Minister at 10 Downing Street in Dublin and the White House in County Cork. The result? The competitor’s site went viral for all the wrong reasons, forcing them to burn resources manually filtering junk until they eventually gave up and targeted someone else. We also dive into the high-stakes world of election coverage, where Paul had three weeks to build a "coalition builder" tool for a national election. The solution wasn't a complex microservice architecture, but a humble Google Sheet wrapped in a Cloudflare Worker. Paul explains how they mitigated Google's rate limits and cold start times by putting a heavy cache in front of the sheet, leading to a crucial lesson in pragmatism: data that is "one minute stale" is perfectly acceptable if it saves the engineering team from building a complex invalidation strategy. Pragmatism wins. Finally, the conversation turns to the one thing that causes more sleepless nights than malicious scrapers: caching layers. Paul and the host commiserate over the "turtles all the way down" nature of modern caching, where a single misconfiguration can lead to a news site accidentally attaching a marathon runner’s photo to a crime story. They wrap up with picks, including a history of cryptography that features the Pope breaking Spanish codes and a defense of North Face hiking boots that might just be "glamping" gear in disguise.
🎯 Picks: Warren - The North Face Hedgehog Gore-tex Hiking Shoes • Paul - The Code Book
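
The "one minute stale" idea from this episode can be sketched as a time-to-live cache sitting in front of a slow, rate-limited source. This is a minimal Python illustration under stated assumptions, not Paul's actual Cloudflare Worker: `TTLCache`, `fetch`, and the injectable `clock` are hypothetical names introduced here.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Serve a cached value until it is older than ttl_seconds, then refetch.

    There is no invalidation logic at all: callers simply accept data that
    may be up to ttl_seconds stale.
    """

    def __init__(
        self,
        fetch: Callable[[], T],                 # hypothetical slow upstream call
        ttl_seconds: float = 60.0,              # "one minute stale" is acceptable
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[T] = None
        self._expires_at = float("-inf")        # forces a fetch on first use

    def get(self) -> T:
        now = self._clock()
        if now >= self._expires_at:
            # Cache is empty or expired: hit the upstream exactly once.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value  # type: ignore[return-value]
```

With a 60-second TTL the upstream sees at most one request per minute per cache instance, no matter how many readers call `get()`, which is the trade the episode describes: bounded staleness in exchange for not building an invalidation strategy.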

Jan 2, 2026 • 59min

Special: The DORA 2025 Critical Review

Dorota Parade, tech entrepreneur and CEO of Authress with engineering leadership chops, skewers the 2025 DORA Report. She argues it trades hard data for AI-flavored narrative. They debate LLMs that make engineers feel faster but do not shorten feedback loops, the limits of agentic tools, feature-flag pitfalls, and why 30% acceptance rates get celebrated.

Dec 15, 2025 • 50min

Browser Native Auth and FedCM is finally here!

Episode Sponsor: Incident.io - https://dev0ps.fyi/incidentio
"My biggest legacy at Google is the amount of systems I broke." Sam Goto joins the show with a name that strikes fear into engineering systems everywhere. As a Senior Staff Engineer on the Chrome team, Sam shares the hilarious reality of having the last name "Goto," which once took down Google's internal URL shortener for four hours simply because he plugged in a new computer. Sam gets us up to speed with Federated Credential Management (FedCM), as we dive deep into why authentication has been built despite the browser rather than with it, and why it’s time to move identity from "user-land" to "kernel-land". This shift allows for critical UX improvements when logging in any user, regardless of which login providers you use, finally addressing the "NASCAR flag" problem of infinite login lists. Most importantly, he shares why you don't need to change your technology stack to get all the benefits of FedCM. Finally, Sam details the "self-sustaining flame" strategy (as opposed to an ecosystem "flamethrower"), revealing how they utilized JavaScript SDKs to migrate massive platforms like Shopify and 50% of the web's login traffic without requiring application developers to rewrite their code.
💡 Notable Links: HSMs + TPM in production environments • Get involved: FedCM W3C WG • The FedCM spec GitHub repo • TPAC Browser Conference
🎯 Picks: Warren - Book: The Platform Revolution • Sam - The 7 Laws of Identity and Short Story: The Egg By Andy Weir

Dec 4, 2025 • 36min

Are we building the right thing?

Elise Stanley Breval, VP and Head of UX at Unleash, brings 30 years of experience in user-centered design to the conversation. She challenges the misconception that UX is merely about aesthetics, discussing the critical friction between engineering and customer needs. Elise shares a memorable story of overcoming misguided branding decisions with practical user research. They also debate the role of engineers in user interactions and highlight the importance of fostering a culture that values feedback and collaboration in product development.
