Meet 2011 TR35 Winner Jesse Robbins

MIT Technology Review · Video · 04:16

MIT Technology Review's TR35 honoree page for Jesse Robbins, 33 — citation 'Fault-tolerant online infrastructure' under the TR35 Innovators Under 35 badge, with John Keatley's portrait of Jesse in firefighter turnout gear.
The original 2011 MIT Technology Review TR35 honoree page. Portrait by John Keatley. (MIT Technology Review)

MIT Technology Review interviewed me as a 2011 TR35 honoree, recognizing my work on web operations, infrastructure automation, and reliability at Opscode.

I was truly honored to be recognized by MIT and Technology Review.

MIT Technology Review’s 2011 TR35 recognition highlighted Jesse Robbins as an innovator under 35 for his work on DevOps culture and infrastructure automation at Opscode, marking the shift from operations as manual toil to infrastructure as code.


Jesse Robbins, 2011 TR35 Winner

Category: Internet & web
Year Honored: 2011
Organization: Opscode
Region: Global
Focus: Fault-tolerant online infrastructure

Biography

Jesse Robbins initially pursued two career paths in 2001: a Seattle bus driver position and a backup systems engineer role at Amazon.com. Amazon’s offer came first, launching a decade-long career that revolutionized how web companies handle complex server and software networks.

Drawing from his background as a volunteer firefighter, Robbins brought crisis management principles to infrastructure design. He recognized that massive global operations inevitably experience failures and built systems to withstand them safely. Rather than preventing failures, he made Amazon resilient to them through architectural fault tolerance and live operational drills that tested teams by temporarily taking entire data centers offline, without affecting customer experience.

After departing Amazon in 2006, Robbins shared his methodologies through blogging. In 2007, he established Velocity, now an annual conference where major competitors openly discuss infrastructure management solutions.

Robbins cofounded Opscode in 2008. The company’s flagship product, Chef, is an open-source framework for automating cloud-based infrastructure management. In one notable application, scientists used Chef to deploy a 10,000-processor supercomputing cluster on Amazon’s cloud in 45 minutes, completed complex protein-binding research in eight hours, and then shut the cluster down, all at a fraction of traditional supercomputing costs.

Full Transcript (AI-generated)
Hi, I'm Jesse Robbins. I'm co-founder of Opscode, the leader in cloud infrastructure automation, and part of a large community of people that build and operate the websites that we all depend on every day. Which may seem like very strange work for a firefighter, but I promise you it isn't.

In 2001 I joined Amazon, really as a day job while I was making a transition to the fire service. I ended up with the title of Master of Disaster, which sort of foreshadows what I'm going to talk about. We found at Amazon, and most websites at any significant scale find, that as complexity and operating size increase, so do the number of failures and outages. And this is on the test: failure happens.

So in the past decade, the conventional wisdom was: spend a whole lot more money to make the systems more reliable. We found that instead, focusing on resiliency of both technology and operational culture is the way that you build successful websites. And I realized as we were discovering this that my experiences in the fire department directly correlated to building sites at scale.

So with some support of Amazon executives, I began turning Amazon into something of a fire department. I began training software developers using fire-department-style incident management techniques and doing things like fire drills, a program we called GameDay, which gave people an opportunity to learn how to deal with failures at scale, learn from those in stressful situations, and eventually work up to full-scale exercises where we were able to turn off data centers with no notice to developers and no impact to customers. As a result of this operational culture (many people know that Amazon's success is a direct result of its operating culture), it was a privilege to be able to contribute something unique from my background to that.

These evolved into safety standards and building codes, much like the things that protect you in this room right now: technical controls that allow people to be safe even though the environment is significant and complex.

I left in 2006, and I realized that the skills and operating capabilities that we had at Amazon were really necessary for any organization that depended on the web. There was no community, no culture, no way to share this. And so I, and a group of other crazy people, founded a conference called the Velocity Web Performance and Operations Conference, which is now in its fifth year and teaches about 2,000 people every year how to operate and succeed.

One of the things that we learned from that conference was that there was both a cultural component and a tool component that was required to succeed. All the big companies had built their own tools over the years, but they had held them as very tightly guarded secret sauce, which left every other startup in a really bad position, poorly experienced and not equipped to be able to operate at the scale that they wanted to.

And so we founded Opscode. Opscode does cloud infrastructure automation. We provide a tool called Chef, which is an open-source framework for systems integration. It's like a little sysadmin robot. It uses recipes and cookbooks, which are very easy to share, which allows people that are new to operating at scale to stand on the shoulders of other giants and then contribute back themselves. Chef is one of the most successful open-source projects in infrastructure history: over 450 contributors, and it is used by over 6,000 organizations, including many very relevant to other speakers today.

We're hiring, and if you know people that care about infrastructure, send 'em our way. I'm Jesse Robbins, thank you very much.
