Healthy Software Development

High-quality DevOps practices produce a seamless flow of continuous development, deployment and maintenance of large-scale web applications. Today’s business environment demands accelerated releases of functionality from inception to production code. The notion of allowing developers to push code directly into production sets off alarms for most traditional IT leaders. Many experienced development managers recall stories of mistakes made in production environments, some resulting in significant business disruption. However, leading large-scale consumer sites such as Google, Facebook, Netflix and Amazon have adopted these practices and push small code releases every day at an enormous rate. Agile DevOps has its roots in Lean and Kanban. In lean manufacturing, cycle time is defined as the time from when a work product is started to when the finished work product is delivered. For software, this corresponds to the time between when a user story is created to when the story is real code in production. For many high volume sites, the preferred batch size for a release is a single user story. Each story is put into production as soon as it’s complete. Due to massively dense server farms, there is no way to do this manually and automation tools must be used.

We see engineering style alignment with these tools such as Puppet or Chef. If you’re coming from the dev side of DevOps, then the procedural nature of Chef using Ruby is natural. Puppet appeals to Ops pros since it is more mature, data driven and geared toward sysadmins. DevOps seeks to bring these styles together. The freedom of allowing developers to push code into production also comes with the responsibility to ensure stability of their code after deployment. A cultural shift from Project-based IT to Product-based IT is necessary to make DevOps successful. If not, then you have speedy agile development sprints constrained within quarterly or monthly waterfall release cycles. Bottlenecks upstream and downstream outside of normal scope are addressed more readily in a product-based construct. The attitude changes from “it’s not my job” to “it’s my workflow” – and a strong sense of ownership accompanies this new attitude. With increased velocity & cycle time, we can use metrics of mean time between failure and mean time to restore service rather than the number of lines of code produced or number of defects resolved.