Have You Started Your Infrastructure as Code (IaC) Journey?
What is Infrastructure as Code? How do you determine if your IT infrastructure is truly automated?
Nov. 13, 2017 02:13 PM
A recent survey of top Fortune 500 companies shows that almost 70% of CIOs have either heard about IaC from their infrastructure head or are on their way to implementing it. Yet if you look under the hood, while some level of automation is in place, most infrastructure is still managed in the traditional, legacy way.
What is Infrastructure as Code? How do you determine if your IT infrastructure is truly automated? "IaC is an approach and mind-set to automate infrastructure provisioning and management based on best practices from the world of software development. It underscores consistent, repeatable procedures for provisioning and changing systems configuration." The two key words here are 'approach' and 'mind-set'. 'Approach' signifies that deploying and managing infrastructure should be no different from deploying and managing a software application. 'Mind-set' signifies a radical change in how a traditional operations engineer thinks about operating and managing systems (compute, storage, network).
Treating infrastructure as a piece of software is a major shift from the traditional way of provisioning and managing infrastructure. It requires change in both the technology and the underlying processes, and it calls for thinking like a software engineer rather than like a systems administrator.
Kief Morris, author of the book Infrastructure as Code, describes the following basic principles of Infrastructure as Code:
- Systems are easily reproduced
- Systems are disposable
- Systems are consistent
- Processes are reproducible
- Design is always changing
Principle #1. Systems are easily reproduced.
In the cloud-based world we live in today, this is the fundamental building block. Systems need to be provisioned automatically at the press of a button. While most companies have figured this out and have some form of automation to provision compute infrastructure, it is still far from where the business can see value. Value is generated not when one element of infrastructure is provisioned programmatically, but when all components of the infrastructure can be. Everything from compute to storage and network should be automated. And why stop at the IaaS layer? Automation should go all the way to installing and configuring the required software, so that the business can use the system immediately. That is where the maximum business value lies. Automating end to end not only reduces provisioning time but also reduces the risk of human error and the configuration mismatches that often occur when building manually. It also makes it easier for infrastructure engineers to manage the systems over their life-cycle.
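As a rough illustration of end-to-end provisioning, the sketch below describes a whole stack (network, storage, compute and the software on top) as data and works out the order in which it would be built. It is a minimal sketch, not a real provisioning tool: the `Resource` class, the resource names and the `provision` function are hypothetical, and no cloud API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    kind: str              # "network", "storage", "compute" or "software"
    depends_on: tuple = ()  # names of resources that must exist first


# Hypothetical stack: every layer is declared, not just the VM.
STACK = [
    Resource("app-net", "network"),
    Resource("app-disk", "storage"),
    Resource("app-vm", "compute", depends_on=("app-net", "app-disk")),
    Resource("app-runtime", "software", depends_on=("app-vm",)),
]


def provisioning_order(stack):
    """Return resource names so that each dependency precedes its dependents."""
    by_name = {r.name: r for r in stack}
    ordered, visiting = [], set()

    def visit(name):
        if name in ordered:
            return
        if name in visiting:
            raise ValueError(f"dependency cycle at {name}")
        visiting.add(name)
        for dep in by_name[name].depends_on:
            visit(dep)
        visiting.discard(name)
        ordered.append(name)

    for resource in stack:
        visit(resource.name)
    return ordered


def provision(stack):
    """Stand-in for real provisioning: returns the steps that would run."""
    by_name = {r.name: r for r in stack}
    return [f"create {by_name[n].kind} {n}" for n in provisioning_order(stack)]


if __name__ == "__main__":
    for step in provision(STACK):
        print(step)
```

Because the stack is plain data, the same definition can be reviewed, versioned and replayed, which is what makes the resulting system easy to reproduce.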
Principle #2. Systems are disposable.
A common analogy in the industry is that of cattle and pets. Pets are special and need special care. Cattle are plentiful, serve the same purpose and, most importantly, are easily replaceable. In the traditional infrastructure world, all systems are treated like pets. Each system needs special care and, most importantly, each system differs from the others. If a system goes down for any unplanned reason, the only option is to scramble the operators to bring it back up. This comes at a huge cost in business downtime and maintenance. With the advance of cloud native applications, which are mostly resilient at the application layer, this problem can be avoided. By creating systems that are clones of each other (containers, for example) and that can easily be created, destroyed, replaced, resized and moved, we no longer need to treat each system like a pet. If one system (VM or container) goes down, instead of spending time troubleshooting and bringing it back up, we simply build another one from a predefined image and replace the one that went down. While this sounds simple, the complexity of implementation lies at the application tier. If the applications are traditional monoliths that depend on the infrastructure to provide resiliency, implementing this model will be tough. However, if the applications are cloud native, stateless microservices, it is easy to treat systems as disposable.
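The "replace, don't repair" idea can be sketched as a small reconcile loop: unhealthy instances are dropped and fresh ones are launched from the same predefined image until the desired count is reached. This is a minimal sketch under stated assumptions; `Instance`, `launch` and `reconcile` are hypothetical names, and a real platform would call its own API where `launch` only creates an object in memory.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    image: str
    healthy: bool = True


_ids = itertools.count(1)


def launch(image):
    """Hypothetical stand-in for starting a VM or container from an image."""
    return Instance(instance_id=f"i-{next(_ids)}", image=image)


def reconcile(pool, image, desired_count):
    """Drop unhealthy instances and launch replacements from the image."""
    survivors = [inst for inst in pool if inst.healthy]
    while len(survivors) < desired_count:
        survivors.append(launch(image))
    return survivors


if __name__ == "__main__":
    pool = [launch("web-v1") for _ in range(3)]
    pool[1].healthy = False  # simulate an unplanned failure
    pool = reconcile(pool, "web-v1", desired_count=3)
    print([inst.instance_id for inst in pool])
```

Nobody logs in to the failed instance to fix it. The loop only works if instances hold no state of their own, which is why the application tier determines how easy this model is to adopt.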
Principle #3. Systems are consistent.
How many times have you come across an incident only to realize that the root cause was a missing file or an outdated driver or firmware version? Chances are that most outages happen because the systems are not consistent. Each system looks different from the others even though they all serve the same application. Commonly known in the industry as snowflake computing, this creates a huge problem for any infrastructure head. Inconsistent systems not only result in unplanned outages, but can also lead to performance issues and other problems that impact the business. Peeling the onion further reveals the underlying root cause: human failure. Humans are prone to making mistakes, we forget, and even the best systems engineer can cause an outage if he or she is making system changes manually. Automating configuration and system management avoids snowflake computing. Manual system changes should be allowed only in rare circumstances, such as critical incidents, and only if there is no alternative. A defect ticket should be created after the incident to ensure all systems are updated with the changes made during it. Systems should be built from a common operating system image, and any application-related customization should be pushed via run-books. This keeps all the systems consistent with each other at any given time and leads to higher business uptime.
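One way to spot snowflakes is to compare each system's settings with a common baseline and report the differences. The sketch below assumes configurations are available as simple key/value dictionaries; the host names, keys and values are made up for illustration, and `find_drift` is a hypothetical helper, not part of any particular tool.

```python
# Hypothetical baseline that every system is expected to match.
BASELINE = {"os_image": "base-2017.11", "nic_driver": "5.2", "ntp": "enabled"}

# Hypothetical settings as reported by each host.
HOSTS = {
    "web-01": {"os_image": "base-2017.11", "nic_driver": "5.2", "ntp": "enabled"},
    "web-02": {"os_image": "base-2017.11", "nic_driver": "4.9", "ntp": "enabled"},
    "web-03": {"os_image": "base-2017.11", "nic_driver": "5.2"},
}


def find_drift(baseline, actual):
    """Return {key: (expected, found)} for each setting that differs.

    A key missing on either side shows up with None in its place.
    """
    drift = {}
    for key in baseline.keys() | actual.keys():
        expected, found = baseline.get(key), actual.get(key)
        if expected != found:
            drift[key] = (expected, found)
    return drift


def snowflakes(baseline, hosts):
    """Return only the hosts that deviate from the baseline, with their drift."""
    report = {name: find_drift(baseline, cfg) for name, cfg in hosts.items()}
    return {name: drift for name, drift in report.items() if drift}


if __name__ == "__main__":
    for host, drift in sorted(snowflakes(BASELINE, HOSTS).items()):
        print(host, drift)
```

A report like this is also a natural follow-up to an incident: any manual change made under pressure shows up as drift until it is folded back into the baseline.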
Principle #4. Processes are reproducible.
One cannot deny that any enterprise today is loaded with processes. The more processes humans are expected to follow, the higher the chance of human error. While processes are an integral part of operations, it is important that they are consistent and reproducible, so that everyone follows exactly the same steps to do a particular task. Left to humans, each person will execute the same task following a slightly different process, which leads to issues. Most of us have seen what happens when a key member leaves the team and the remaining members cannot figure out how to carry out the same task. The solution lies in ensuring each task is scripted and stored in a common software repository. Every system operator is expected to execute the predefined script rather than manual steps that differ from it. Any change in a process or task should result in a new version of the script being committed to the repository, with clear documentation of the changes made.
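A scripted task can be as simple as an ordered list of steps that every operator runs the same way, with each run recording which version of the script was used. The sketch below is only an illustration: the step functions are hypothetical placeholders, and the fingerprint is computed from the ordered step names alone, where in practice the version would normally be the repository's commit identifier.

```python
import hashlib


def rotate_logs(state):
    """Hypothetical step: mark logs as rotated."""
    state["logs_rotated"] = True


def restart_service(state):
    """Hypothetical step: count a service restart."""
    state["restarts"] = state.get("restarts", 0) + 1


# The task as a script: the same steps, in the same order, for every operator.
RUNBOOK = [rotate_logs, restart_service]


def runbook_version(steps):
    """Short fingerprint of the ordered step names (not of the step bodies)."""
    joined = "\n".join(step.__name__ for step in steps)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:12]


def run(steps, state):
    """Execute every step in order and return a record of what was run."""
    executed = []
    for step in steps:
        step(state)
        executed.append(step.__name__)
    return {"version": runbook_version(steps), "steps": executed}


if __name__ == "__main__":
    print(run(RUNBOOK, {}))
```

Two operators running `RUNBOOK` get the same steps and the same recorded version, so the task no longer depends on what any one person remembers.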
Principle #5. Design is always changing.
In today's age, where business models change at a much faster pace than ever before, the underlying infrastructure serving the business application is expected to adapt at the same pace. Yet in traditional enterprise IT, a change in application architecture almost always results in building new underlying infrastructure, which is very costly. Systems are built for a specific application architecture, and any change at the application layer becomes a complex and expensive task to implement at the infrastructure layer. In the cloud-based world, infrastructure should be built so that it is easy to scale to meet business demand. Systems should be designed to accept change as the norm rather than the exception. After all, the only thing that is constant is change!
Infrastructure as Code sounds complex to implement; however, it is a journey, and every journey starts with a single step. Automating infrastructure to meet business demand is no longer an option but a must for any IT organization. What are you waiting for?