Slight Reliability

Stephen Townshend
Apr 25, 2023 • 9min

Slight Reliability Episode 52 - Double, Double, Toil and Trouble!

In this episode Stephen explores the SRE concept of "toil". What is it? How can we measure it? How do we reduce it?
Also in this episode: whether we can make non-technology systems observable (as we do technology ones), the ineffectiveness of change advisory boards (CABs), and Stephen's upcoming attendance at SREcon, AWS Summit, and SLOconf.
Shout-outs to Steve McGhee, Dom Finn, and Shea Stewart.
You can find the official Slight Reliability podcast website at: https://slightreliability.com/
You can find Stephen at:
LinkedIn: https://www.linkedin.com/in/stephentownshend/
Twitter: https://twitter.com/the_kiwi_sre
Apr 18, 2023 • 30min

Slight Reliability Episode 51 - The reliability.org Community with Anurag Gupta

In this episode Stephen Townshend and Anurag Gupta discuss the new reliability.org community for SREs or reliability engineers to share experiences, ask questions, and find community. They discuss the value of community and sharing your thoughts, collaboration between organisations, vicious versus virtuous cycles for reliability, and much more.
You can join us in the community by visiting https://www.reliability.org/
You can find Anurag on LinkedIn: https://www.linkedin.com/in/awgupta/
You can find out more about Shoreline by visiting https://www.shoreline.io/
Apr 11, 2023 • 39min

Slight Reliability Episode 50 - The 50th Episode Special with Bruce Cullen

In this episode Bruce Cullen interviews Stephen Townshend about the past, present, and future of the Slight Reliability podcast. They discuss their shared backgrounds in software testing, the different career paths that testing has opened up, and much more!
Bruce is the Director of Engineering at SquaredUp. You can find him on LinkedIn: https://www.linkedin.com/in/bruce-cullen/
Apr 4, 2023 • 39min

Slight Reliability Episode 49 - Implementing Observability in the Real World with Ivan Merrill

In this episode Ivan Merrill from Fiberplane shares his experiences implementing observability within some of the large, complex organisations he's worked for in the past.
You can find Ivan on LinkedIn: https://www.linkedin.com/in/ivan-merrill-1a05223/
You can find out more about Fiberplane here: https://fiberplane.com/
Mar 21, 2023 • 8min

Slight Reliability Episode 48 - Blind Insight

In this episode I discuss the word "insight" within the context of observability. Is insight something tools can provide? Is it something you can reproduce?
Mar 14, 2023 • 33min

Slight Reliability Episode 47 - Cloud Dependency Reliability with Jeff Martens and Ryan Duffield

In this episode Stephen Townshend discusses our increased dependency on third-party cloud services and what this means for reliability with Jeff Martens and Ryan Duffield from https://metrist.io/.
You can find Jeff:
On LinkedIn: https://www.linkedin.com/in/jmartens/
On Twitter: https://twitter.com/Jmartens
You can find Ryan:
On StackOverflow: https://stackoverflow.com/users/2696/ryan-duffield
On GitHub: https://github.com/rduffield
Mar 7, 2023 • 10min

Slight Reliability Episode 46 - Raw Telemetry

In this episode I propose the use of scatterplots of raw data to better understand how our systems are behaving and what our customers are experiencing. The ideas from this episode come from my time as a performance engineer, working with legends in that space: Richard Leeke (https://www.linkedin.com/in/richard-leeke-450448/) and Neil Davies (https://www.linkedin.com/in/neildaviesnz/).
For some basic examples of scatterplots and what they show you versus line charts, check out an article I wrote back in 2017 called "Let's Talk About Averages": https://www.linkedin.com/pulse/lets-talk-averages-stephen-townshend/
Another proponent of scatterplots is Stijn Schepers (https://www.linkedin.com/in/stijnschepers/). Here's an article he wrote about it in 2019: https://www.linkedin.com/pulse/performance-testing-act-like-detective-use-raw-data-stijn-schepers/
Neil Davies' article on tornado scatters, "Chasing Tornadoes", can be found here: http://www.performance-workshop.org/wp/wp-content/uploads/2013/12/Chasing_Tornadoes_Davies.pdf
Feb 28, 2023 • 49min

Slight Reliability Episode 45 - Telemetry Fluency with Paige Cruz

In this episode we discuss uplifting telemetry knowledge within engineering teams to enrich their work (and their lives) with Paige Cruz from Chronosphere. We cover why not to take a chainsaw to your observability in order to cut costs, the dark side of auto-instrumentation, storytelling with live data, and much more.
The book that Paige recommends at the end is "Effective Monitoring and Alerting for Web Operations": https://www.oreilly.com/library/view/effective-monitoring-and/9781449333515/
You can check out Chronosphere here: https://chronosphere.io/
You can find Paige on LinkedIn: https://www.linkedin.com/in/paigerduty/
Feb 21, 2023 • 39min

Slight Reliability Episode 44 - Cognitive Overload with Paige Cruz

In this episode we discuss cognitive overload in SRE with Paige Cruz from Chronosphere. We cover what cognitive load is, what causes it, and some potential antidotes and preventative measures.
You can check out Chronosphere here: https://chronosphere.io/
You can find Paige on LinkedIn: https://www.linkedin.com/in/paigerduty/
Feb 14, 2023 • 10min

Slight Reliability Episode 43 - Beyond Observability

In this episode I discuss my "bigger picture" perspective of what observability needs to be, and why it's important that we include the business and the customer in what we monitor in the Digital Era.
The books I highlight in this episode are:
Observability Engineering: https://www.oreilly.com/library/view/observability-engineering/9781492076438/
Sooner, Safer, Happier: https://soonersaferhappier.com/book/
The Phoenix Project: https://www.oreilly.com/library/view/the-phoenix-project/9781457191350/
The Unicorn Project: https://www.oreilly.com/library/view/the-unicorn-project/9781098124175/
Accelerate: https://www.oreilly.com/library/view/accelerate/9781457191435/
You can grab a copy of the 2022 State of DevOps report at: https://cloud.google.com/devops/state-of-devops
The blog I mentioned was The Insight Industrial Complex: https://benn.substack.com/p/insight-industrial-complex
