Release Management Best Practices at Amazon

Published on Puppet Labs

When you think about scale, you probably think about Amazon. And Werner Vogels, Amazon's CTO, has plenty to say about practicing release management at scale.

In a review of Vogels' speech at Amazon's developer conference, Charles Babcock of InformationWeek highlights the influential CTO's view that enterprise IT development has to be recrafted in favor of systems that are controllable, resilient, adaptive, and data-driven.

Vogels thinks of controllable applications as "decomposed into small, loosely coupled, stateless building blocks." These run in an environment where the systems architect no longer thinks about physical servers, and so is no longer bound by resource constraints. This same concept, of course, is what underlies Amazon's own Elastic Compute Cloud, which provides resource scaling on demand.

Key to Amazon's success at scale is its ability to roll out new software across its whole platform. The company can update 10,000 servers at a time and roll back changes with a single system call. This continuous deployment environment allows Amazon to push out new code once every 11 seconds, on average.
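
To make the idea of batched rollout with one-step rollback concrete, here is a minimal sketch in Python. It is not Amazon's tooling: the `rolling_deploy` function, the in-memory `fleet` dictionary, and the `healthy` callback are all hypothetical names invented for illustration. The only figures taken from the article are the 10,000-server batch size and the 11-second average interval.

```python
from typing import Callable, Dict, List


def rolling_deploy(fleet: Dict[str, str], new_version: str,
                   healthy: Callable[[str, str], bool],
                   batch_size: int = 10_000) -> bool:
    """Update `fleet` (server name -> version) one batch at a time.

    Returns True if every batch passes its health check. On the first
    unhealthy batch, restores every server touched so far and returns False.
    """
    previous: Dict[str, str] = {}  # versions to restore on rollback
    servers: List[str] = list(fleet)
    for start in range(0, len(servers), batch_size):
        batch = servers[start:start + batch_size]
        for name in batch:
            previous[name] = fleet[name]
            fleet[name] = new_version
        if not all(healthy(name, fleet[name]) for name in batch):
            fleet.update(previous)  # roll back in a single step
            return False
    return True


# Arithmetic behind the article's figure: one deployment every 11 seconds.
SECONDS_PER_DAY = 24 * 60 * 60
deploys_per_day = SECONDS_PER_DAY // 11  # 7854 full 11-second intervals per day
```

At one push every 11 seconds, that works out to a little under 8,000 deployments per day. The sketch keeps the prior version of each server it touches so that a rollback is a single operation rather than a second deployment.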

Effective release management requires constant monitoring of new releases, and Amazon has its entire infrastructure instrumented to log performance and failures. This gives the company early warning of system slowdowns or failures. In such a large environment, failure is to be expected, so it's critical that an event such as a disk drive failure in the S3 storage system is anticipated and tolerated through resilient design.
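
As a rough illustration of how instrumentation can produce an early warning, the sketch below tracks the failure rate over a sliding window of recent requests and raises a flag when it crosses a threshold. The `ErrorRateMonitor` class, its window size, and its threshold are assumptions made up for this example, not a description of Amazon's monitoring.

```python
from collections import deque


class ErrorRateMonitor:
    """Hypothetical sliding-window failure-rate alarm."""

    def __init__(self, window: int = 100, threshold: float = 0.05) -> None:
        self.results: deque = deque(maxlen=window)  # most recent outcomes only
        self.threshold = threshold

    def record(self, ok: bool) -> bool:
        """Record one request outcome; return True if the alarm should fire."""
        self.results.append(ok)
        failures = self.results.count(False)
        return failures / len(self.results) > self.threshold
```

A release pipeline could feed each request outcome from newly updated servers into `record` and trigger a rollback as soon as it returns True.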

Amazon is a challenging IT operations environment that provides services to both internal and external customers. That the company is able to do this at enormous scale is a testament to its success at understanding release management across a very large environment.
