---
title: Meet 2011 TR35 Winner Jesse Robbins
description: MIT Technology Review interviewed me as a 2011 TR35 honoree, recognizing the work on web operations, infrastructure automation, and reliability at Opscode.
doc_version: "1.0"
last_updated: 2011-12-02
slug: meet-2011-tr35-winner-jesse-robbins-mit-tr
outlet: MIT Technology Review
author: MIT Technology Review
date: 2011-12-02
url: https://www.youtube.com/watch?v=s55G8eDHGgY
type: Video
excerpt: MIT Technology Review interviewed me as a 2011 TR35 honoree, recognizing the work on web operations, infrastructure automation, and reliability at Opscode.
tags:
  - Awards
  - DevOps
  - Chef
  - Cloud Infrastructure
duration: 04:16
thumbnail: https://i.ytimg.com/vi/s55G8eDHGgY/hqdefault.jpg
---

<!-- source -->

## From the MIT Technology Review TR35 honoree page

- **Category:** Internet & web
- **Year Honored:** 2011
- **Organization:** Opscode
- **Region:** Global
- **Focus:** Fault-tolerant online infrastructure

### Biography

Jesse Robbins applied for two jobs in 2001: a Seattle bus driver position and a backup systems engineer role at Amazon.com. Amazon's offer came first, beginning a decade of work on how web companies operate complex server and software networks at scale.

Drawing on his background as a volunteer firefighter, Robbins brought crisis-management principles to infrastructure design. He recognized that massive global operations inevitably experience failures, so he built systems to withstand them safely. Rather than trying to prevent every failure, he made Amazon resilient to failure through architectural fault tolerance and live operational drills that tested teams by temporarily taking entire data centers offline without affecting the customer experience.

After leaving Amazon in 2006, Robbins shared his methodologies through blogging. In 2007, he cofounded Velocity, now an annual conference where major competitors openly discuss infrastructure management.

Robbins cofounded Opscode in 2008. The company's flagship product, Chef, is an open-source framework for cloud-based infrastructure automation. In one notable application, scientists used Chef to deploy a 10,000-processor supercomputing cluster on Amazon's cloud in 45 minutes, completed complex protein-binding research in eight hours, and then shut the cluster down, all at a fraction of traditional supercomputing costs.

## Also Mentioned

- Jesse Robbins (person)
- [MIT Technology Review](https://www.technologyreview.com) (company)

## Transcript

Hi, I'm Jesse Robbins. I'm co-founder of Opscode, the leader in cloud infrastructure automation, and part of a large community of people that build and operate the websites that we all depend on every day. Which may seem like very strange work for a firefighter, but I promise you it isn't.

In 2001 I joined Amazon, really as a day job while I was making a transition to the fire service. I ended up with the title of Master of Disaster, which sort of foreshadows what I'm going to talk about.

We found at Amazon, and most websites at any significant scale find, that as complexity and operating size increase, so does the number of failures and outages. And this is on the test: failure happens. So in the past decade, the conventional wisdom was: spend a whole lot more money to make the systems more reliable. We found that instead, focusing on resiliency of both technology and operational culture is the way that you build successful websites.

And I realized as we were discovering this that my experiences in the fire department directly correlated to building sites at scale. So with some support of Amazon executives, I began turning Amazon into something of a fire department. I began training software developers using fire-department-style incident management techniques and doing things like fire drills — a program we called GameDay, which gave people an opportunity to learn how to deal with failures at scale, learn from those in stressful situations, and eventually work up to full-scale exercises where we were able to turn off data centers with no notice to developers and no impact to customers.

As a result of this operational culture — many people know that Amazon's success is a direct result of its operating culture — it was a privilege to be able to contribute something unique from my background to that. These evolved into safety standards and building codes, much like the things that protect you in this room right now: technical controls that allow people to be safe even though the environment is significant and complex.

I left in 2006, and I realized that the skills and operating capabilities that we had at Amazon were really necessary for any organization that depended on the web. There was no community, no culture, no way to share this. And so I, and a group of other crazy people, founded a conference called the Velocity Web Performance and Operations Conference, which is now in its fifth year and teaches about 2,000 people every year how to operate and succeed.

One of the things that we learned from that conference was that there was both a cultural component and a tool component that was required to succeed. All the big companies had built their own tools over the years, but they had held them as very tightly guarded secret sauce, which left every other startup in a really bad position — poorly experienced and not equipped to be able to operate at the scale that they wanted to.

And so we founded Opscode. Opscode does cloud infrastructure automation. We provide a tool called Chef, which is an open-source framework for systems integration. It's like a little sysadmin robot. It uses recipes and cookbooks, which are very easy to share, which allows people that are new to operating at scale to stand on the shoulders of other giants and then contribute back themselves. Chef is one of the most successful open-source projects in infrastructure history — over 450 contributors, and it is used by over 6,000 organizations, including many very relevant to other speakers today.

We're hiring, and if you know people that care about infrastructure — send 'em our way. I'm Jesse Robbins, thank you very much.
