"Site Reliability Engineering"

Heavybit

Incident Response and DevOps in the Age of Generative AI

Jesse Robbins convenes a panel of incident management veterans to examine where generative AI genuinely helps in SRE and DevOps — and where humans must stay in the loop.

“GenAI is good at confidently delivering text that is pleasant to read, but not always complete, or correct.”

The Confident Commit

DevOps is dead? Nope, it is maturing ft. Jesse Robbins

Podcast · 37:57

Jesse Robbins joins CircleCI CTO Rob Zuber on The Confident Commit to argue that DevOps is not dead but growing up, and that the hard problems now are organizational, not technical.

♪ Apple Podcasts
Heavybit

What to Know About the Modern Incident Response Lifecycle

Heavybit's guide to modern incident management quotes Jesse Robbins on why teams only master incident response when they embrace the whole process — detection, response, and learning.

“Teams only get good at this when they embrace the whole process and each of its steps.”

— Jesse Robbins
Protocol

An oral history of #hugops: How tech's first responders built a culture of empathy

Protocol's oral history of #hugops traces how tech's first responders built a culture of empathy.

“I've got to change the way that I approach this entirely and make it safe to experiment.”

— Jesse Robbins
O'Reilly Radar

Tim O'Reilly on Why We Started the Velocity Conference

Tim O'Reilly's retrospective on the origins of the Velocity Conference explains why the event was launched and how web operations emerged as a strategic discipline, with Jesse Robbins as co-founder and conference chair.

ACM Queue

Resilience Engineering: Learning to Embrace Failure

Jesse Robbins (Amazon), Kripa Krishnan (Google), and John Allspaw (Etsy) discuss how they built organizations that deliberately trigger failure to get stronger: powering off data centers, running 96-hour disaster simulations, and transforming blame cultures into learning cultures.

“You can't choose whether or not you're going to have failures — they are going to happen no matter what — but you can choose in many cases when you're going to learn the lessons.”

— Jesse Robbins
USENIX

GameDay: Creating Resiliency Through Destruction

Talk · 52:50

In this USENIX LISA'11 talk, Jesse Robbins explains GameDay: deliberately injecting failures into production systems to build organizational resilience before real outages happen.

“You don't choose the moment, the moment chooses you. You only choose how prepared you are when it does.”

— Jesse Robbins
▶ YouTube
DevOpsDays

DevOps Culture Hacks: Infecting your Boss & your Business with Awesome

Talk

The original DevOps culture hacks talk at DevOpsDays Boston 2011. Jesse Robbins shares the formula for changing engineering culture from the inside, drawn from his years as Amazon's Master of Disaster.

“Don't fight stupid, make more awesome.”

— Jesse Robbins