Disaster Recovery is a process for securing business continuity in the event of a disaster. Disasters differ in severity and in how they are resolved. However, the goal of Disaster Recovery, and of the Disaster Recovery Plan, should always be to return to business as usual as soon as possible.
The most important part of Disaster Recovery is preparation and planning. When a disaster actually happens, you cannot change any prerequisites and are forced to work with what you already have. If your plan was broken or did not address all the critical aspects of your IT, your Disaster Recovery environment will end up broken as well. A misconfigured backup job can always be reconfigured during normal operations, but once the environment is down, that ship has sailed.
To make sure we cover these aspects, we will split this section into a couple of subsections.
The first question we need to ask ourselves when creating a Disaster Recovery Plan, or DRP, is what the needs of the business actually are. In most cases, an organization has some critical infrastructure or systems that must be available at all times for the core business to function. This might be an Enterprise Resource Planning (ERP) system, a document collaboration or versioning system, a production system, etc.
The size of the organization and its degree of information technology adoption both need to be taken into consideration when creating a plan for Disaster Recovery.
Defining these business needs can't be the sole work of the IT Department or an Operations Engineer. It needs attention from all levels of the organizational hierarchy. One method for identifying the needs is to interview colleagues who work in the core business.
Questions that need to be answered by every role in the core process(es):
- In your everyday work, what activities do you need to be able to perform to fulfill the goals of your role?
- How do you perform these activities?
Using the answers to these questions, we then need to identify each system or infrastructure component that is needed to reach those goals. The identified components give us a list of what we actually need to secure in the case of a real disaster.
These components can be secured in many ways, for example:
- Backup/Restore
- Data Mirroring
- Off-Site Mirrored Environment
- Failover
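
The inventory step described earlier, turning interview answers into a list of components to secure, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the roles, activities, and system names are hypothetical placeholders, and the `build_inventory` helper is invented for this example rather than taken from any tool.

```python
from collections import defaultdict

# Hypothetical interview answers: role -> activity -> systems the activity relies on.
# All role, activity, and system names are illustrative placeholders.
interviews = {
    "accountant": {
        "issue invoices": ["erp", "mail"],
        "archive receipts": ["document-store"],
    },
    "engineer": {
        "commit code": ["version-control"],
        "write specs": ["document-store", "mail"],
    },
}


def build_inventory(answers):
    """Return {system: sorted list of (role, activity) pairs that depend on it}."""
    inventory = defaultdict(set)
    for role, activities in answers.items():
        for activity, systems in activities.items():
            for system in systems:
                inventory[system].add((role, activity))
    # Sort systems and dependents so the resulting list is stable and easy to review.
    return {system: sorted(deps) for system, deps in sorted(inventory.items())}


inventory = build_inventory(interviews)
for system, dependents in inventory.items():
    print(f"{system}: needed by {len(dependents)} activities")
```

Each key of the resulting dictionary is a component that needs protecting, and the list of dependents shows which roles and activities are affected if it is lost. That can help when deciding which of the approaches above fits each component.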