Incident Post-Mortem Best Practices
One of the most important tenets of DevOps is continuous learning: always looking for ways to improve your processes, tools, and overall operational effectiveness. This becomes even more important after a major incident or outage. Your team can use the incident as a learning opportunity by conducting an effective post-mortem.
This session describes how to create an effective learning feedback loop as part of your incident management process by conducting a blameless post-mortem. We use a real-life incident post-mortem from PagerDuty’s history as an example. We’ll cover what goes into a post-mortem investigation and what goes into the actual post-mortem report, and touch on the importance of blamelessness as part of your post-mortem process.