Disaster Recovery Plans: Be Prepared
Learning from recent events
Dec. 26, 2005 10:15 AM
It would seem only logical that after 9/11, one of the most horrific days in American history, corporations large and small would be ready for unforeseen catastrophic events. However, by one recent estimate, less than 38% have put a complete disaster recovery plan in place - the policies, processes, procedures, and architecture to deal with unforeseen events. In the wake of Hurricanes Katrina and Rita, IT managers are again forced to reassess how well prepared they and their organizations are to manage through and recover from natural or man-made disasters.
Understanding the strategic goals and requirements for surviving a catastrophic event is one thing, but actually having a set of guidelines in place for handling the tactical issues involved is quite another. Ultimately, the goal is to recover and restart business operations quickly and efficiently. But successfully arriving at that goal involves doing a hundred little things before, during, and after the event. That is a challenge for IT organizations struggling just to keep up with day-to-day operations, but given what's at stake, the effort is imperative. Advance planning, preparation, awareness, and testing determine the success or failure of a disaster recovery plan.
Planning for the Worst
CIOs and IT managers should ask themselves, "How do I deal with 21st century threats?" Imagine what you would do if you were faced with a disaster. Would you be in good shape, or would you find yourself scurrying to implement a recovery plan? If the latter, ask yourself: How long can my company's IT operations afford to be offline?
Preparing data and IT systems for potential disaster requires a combination of well-planned procedures and thoughtful policies. All companies should ask themselves a "What if..." question at every stage of their operation, for example: "What if the power were down for more than 10 days? Do I have a plan to deal with it?"
The problem today is that most software development organizations view disaster recovery as an afterthought at best, even though prevention is the key. The inability to resume everyday operations quickly and protect resources can be detrimental to a business and its community. The companies most prepared for the unexpected are the ones that will reduce the risk of operational downtime, protect their valuable intellectual property, and get back on their feet quickly.
Approach Planning Incrementally
Organizations have to determine what it will take to protect themselves during a disaster and how long they can manage before full restoration is required. By asking the "What if..." question, organizations can be better prepared for an emergency by defining an initial plan and then refining it with more and more detail. It's best to create such a plan incrementally, starting with the simple questions and then moving to more complex ones as appropriate for the business. Keep in mind that this project is an ongoing one. As your business changes and new initiatives are defined, your plan has to change too. Reviewing your disaster recovery plan regularly should therefore be a key part of the project.
When looking at disaster recovery from an application lifecycle management perspective, questions such as these should be addressed:
Disaster Recovery Planning Checklist
- From a Requirements Stage: What are the business, system, and data requirements needed in the event of a disaster?
a. Are these items isolated so they can be moved in an emergency?
- From a Design Stage: Is there adequate design for failure, and is the architecture defined to enable recovery?
a. Where are the hardware, software, and other assets located and where will they be if something happens? What are the steps for restarting or continuing these assets?
- From a Development Stage: Do the systems that are being built include concepts of failover, redundancy, and co-location?
a. Do the architects and developers understand the importance of disaster recovery as it relates to the systems being created, and has management participated in these requirements?
- From a Testing Stage: Could we implement a copy of our environment and test to ensure that it is stable and live in the event of an emergency?
a. Where are the assets going to be located in an emergency?
- From a Production Stage: Could we move all operations to another location seamlessly and immediately if we had to, and are the location and infrastructure needed to do this available and ready?
a. What are the steps needed to move the company assets?
- From a Configuration Management Stage: Is there adequate backup and redundancy built into our large software investment?
a. Do you have a recent copy of your entire business off-site? And is it stored in a safe place?
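One way to keep a checklist like this from going stale is to record the answers and list what is still open. The following is only a minimal Python sketch under stated assumptions: the `StageReview` class, the `unresolved` helper, the shortened question wording, and the sample answers are all hypothetical and are not part of any particular tool or methodology.

```python
from dataclasses import dataclass, field


@dataclass
class StageReview:
    """One lifecycle stage and the answers recorded for its questions."""
    stage: str
    # Maps question text to True (yes), False (no), or None (not answered yet).
    answers: dict = field(default_factory=dict)

    def open_items(self) -> list:
        """Questions that are unanswered (None) or answered 'no' (False)."""
        return [q for q, ok in self.answers.items() if not ok]


def unresolved(reviews: list) -> dict:
    """Map each stage that has open items to its list of open questions."""
    return {r.stage: r.open_items() for r in reviews if r.open_items()}


# Hypothetical answers for two of the stages in the checklist.
reviews = [
    StageReview("Design", {"Architecture enables recovery?": True}),
    StageReview("Configuration Management", {
        "Recent off-site copy exists?": True,
        "Copy stored in a safe place?": None,  # not yet verified
    }),
]
print(unresolved(reviews))
# {'Configuration Management': ['Copy stored in a safe place?']}
```

Rerunning a review like this whenever the business or its systems change fits the incremental approach described earlier: the list of open items shows where the plan still needs detail.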
Besides asking these questions, there are other measures IT executives can take to protect their people, information, infrastructure, and assets. The following are some steps companies can start taking today to give their systems the best chance of survival:
- Create a disaster recovery plan. Once you have a plan in place, communicate it, rehearse it and keep it updated.
- Review software and hardware contracts to ensure that proper licensing and contingencies are in place to help in the case of an emergency.
- Enhance the company's software development methodology to include the guidelines needed for disaster recovery in every phase of the development process.
- Test the plan to see if it will work. If your company already has a disaster plan in place, great; you're in far better shape than most. Still, contemplate the worst possible case scenarios and test the plan against them. As we all know, plans look great on paper, but when it comes time to execute, things don't always go as expected.
- Backup. Many plans for data recovery are far too limited. Here are a few questions companies can ask themselves to ensure their backup plans are extensive enough:
I. Do I have a current backup? Is any data missing between the last backup and the present, and does that matter?
II. Do I have an alternate location from which to conduct business? If you have a good backup, and the data is current, is there a place to run the software or actually fail over to?
III. If the power outage is widespread and all businesses are down locally, will the company supplying the recovery location be overwhelmed? If so, what are our alternate plans?
IV. Does the disaster recovery location have the staff to actually run the business? If not, what is the plan to get competent people there?
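The first backup question above (how much data is missing since the last backup, and does that matter?) can be checked mechanically. The following is a minimal Python sketch, not a definitive implementation. The function names `backup_gap` and `gap_is_acceptable`, the `max_tolerable_loss` threshold, and the sample timestamps are assumptions made for illustration; each business has to decide for itself how much loss it can tolerate.

```python
from datetime import datetime, timedelta


def backup_gap(last_backup: datetime, now: datetime) -> timedelta:
    """Return the window of changes not yet captured by any backup."""
    if last_backup > now:
        raise ValueError("last_backup cannot be later than now")
    return now - last_backup


def gap_is_acceptable(last_backup: datetime, now: datetime,
                      max_tolerable_loss: timedelta) -> bool:
    """True when the data at risk fits within the tolerated loss window."""
    return backup_gap(last_backup, now) <= max_tolerable_loss


# Hypothetical values: a nightly backup checked the following morning.
last_backup = datetime(2005, 12, 25, 23, 0)
now = datetime(2005, 12, 26, 10, 15)

print(backup_gap(last_backup, now))                              # 11:15:00
print(gap_is_acceptable(last_backup, now, timedelta(hours=24)))  # True
print(gap_is_acceptable(last_backup, now, timedelta(hours=4)))   # False
```

In this example the same nightly backup passes for a business that can afford to lose a day of data and fails for one that can afford to lose only four hours, which is why the "does that matter?" part of the question has to be answered by the business rather than by IT alone.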
Many of us today are thinking about our own disaster preparedness. What lessons can be learned from recent events? What can be done differently next time? Is it possible to be prepared for every contingency? Coming up with better plans for the future - and executing on them - may take some time. However, adapting existing technology to new and unforeseen circumstances is something that we can start doing today. The unpredictable nature of events that can cause IT disruption continues to be a threat and one that businesses can't afford to dismiss.