"Incident Management"
Read first
- What did Jesse Robbins build at Amazon?
As Amazon's 'Master of Disaster,' Jesse Robbins was responsible for the availability of every property bearing the Amazon brand. He helped define Amazon's always-on architecture. Adapting the Incident Command System he learned as a volunteer firefighter, he built three connected practices as one body of work: modern Incident Management, and what we now call Site Reliability Engineering and Chaos Engineering.
Articles and mentions
An oral history of #hugops: How tech's first responders built a culture of empathy
Protocol's oral history of
“I've got to change the way that I approach this entirely and make it safe to experiment.”
Incident Management for Operations (foreword by Jesse Robbins)
I wrote the foreword to Schnepp, Vidal, and Hawley's O'Reilly book bringing fire-service incident command into IT operations. The lineage runs from my work at Amazon as Master of Disaster through the first Web Ops/Fire Ops summit I convened in 2012.
“This groundbreaking book is the foundation to building an effective operations culture for organizations of any size, with systems of any complexity, and failures of any severity.”
DevOps Culture Hacks: Infecting your Boss & your Business with Awesome
DevOpsDays Boston 2011. I gave the culture hacks talk for the first time: no slides, no video, just the framework I had figured out the hard way at Amazon.
“Don't fight stupid, make more awesome.”