
The Changelog: Software Development, Open Source
Learning from incidents (Interview)
Feb 4, 2022
This week, Nora Jones, Founder and CEO of Jeli, shares her insights from chaos engineering at Netflix and incident analysis at Slack. She emphasizes the importance of learning from incidents to improve team resilience. Nora discusses creating developer-centric tools and the emotional complexities of incident reviews. She explores knowledge silos, the balance between quantitative and qualitative insights, and the evolving role of incident analysts. Additionally, she reflects on how studying real-world incidents can enhance software practices and decision-making.
AI Snips
Third-Party Incident Reviews
- Have a third party, uninvolved in the incident, conduct the review for an objective perspective.
- This approach reduces emotional bias and promotes a more thorough analysis.
Sully vs. Costa Concordia
- Captain Sully's deviation from the runbook saved lives, while the Costa Concordia captain's deviation from procedure ended in disaster and jail time.
- This highlights the conflict between expertise and strict adherence to procedures.
The Role of Runbooks
- Use runbooks for learning, but recognize that expertise surpasses them.
- Documenting expertise reduces the need to constantly update runbooks.





