While disaster recovery plan testing is notorious for interrupting your workflow, this doesn’t need to be the reality. In this two-part blog post (see Part 1 here), we explain how you can test your disaster recovery plan. See why Baseline’s testing technique keeps business flow continuous and secure.
Where to Begin
See Part 1 for the detailed first steps in disaster recovery plan (DRP) testing. In a nutshell, companies must:
- Define the scope of their plan
- Determine changes that have taken place since the original plan’s development
- Ensure that these changes are taken into account (e.g. added servers, new customer info)
- Address issues like application codependency, partial server loss, order of post-disaster restoration, etc.
Companies often mistakenly think that disaster recovery plans are timeless, but the reality is that every change in your company will affect your plan as well. Make sure that your plan is current, and that members from each department agree on the prioritization of applications and servers as they are backed up and restored.
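To make the restoration-order point concrete, here is a minimal sketch of how an agreed set of application dependencies could be turned into a restore sequence. It is only an illustration, not a description of our tooling: the application names and the `restore_order` helper are hypothetical, and it assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical applications, each mapped to the systems it depends on.
# A real plan would take this list from the departments that own the systems.
DEPENDENCIES = {
    "database": set(),
    "auth_service": {"database"},
    "order_system": {"database", "auth_service"},
    "reporting": {"order_system"},
}


def restore_order(dependencies):
    """Return a restore sequence in which every dependency comes first."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, application in enumerate(restore_order(DEPENDENCIES), start=1):
        print(f"{step}. restore {application}")
```

If two applications depend on each other, `static_order()` raises `graphlib.CycleError`, which is a useful signal that the dependency needs to be untangled before the plan can specify an order.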
How Our Technique Differs
Many disaster recovery (DR) providers suggest something called the “Full Interruption Test.” This entails simulating a disaster that you are likely to face, like a weather-related outage, hardware failure, or accidental file deletion. If the simulation goes wrong, you can suffer all of the same consequences as if the disaster had actually happened. It seems bizarre and almost unreal, but it’s true: companies really subject themselves to this invented disaster, essentially doubling the stress on their systems.
Unlike companies that recommend this type of testing, we test in a way that keeps your data secure at all times and does not interrupt your business operations or workflow. Instead of simulating a disaster on your systems, we conduct the entire testing process in our custom sandbox environment. This sandbox is essentially a virtual copy of your production environment that our engineers create and host at our secure hot site.
When the test is done, the sandbox is wiped clean; anything that went wrong in the sandbox will never affect or influence your real IT environment. As a result, our tests do not interrupt workflow, require odd hours to run, or put any data at risk. We also document all of the information that we acquire during the testing process, like how much time it takes to restore your system or how quickly the prioritized applications are back online. This documentation fulfills auditor and compliance requirements, so you get a bonus out of testing your systems without any added inconvenience.
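As a rough illustration of how documented restore times can be put to use, the sketch below compares the time measured for each application during a test with its recovery time objective (RTO), the maximum downtime the business has agreed it can tolerate. The application names, the figures, and the `RestoreResult` and `missed_targets` names are all hypothetical; this is a sketch under those assumptions, not our actual reporting format.

```python
from dataclasses import dataclass


@dataclass
class RestoreResult:
    application: str
    rto_minutes: float       # agreed recovery time objective (hypothetical)
    measured_minutes: float  # restore time observed during the test (hypothetical)

    @property
    def meets_rto(self) -> bool:
        # A restore meets its target when it finishes within the RTO.
        return self.measured_minutes <= self.rto_minutes


def missed_targets(results):
    """Return the names of applications whose restore exceeded the RTO."""
    return [r.application for r in results if not r.meets_rto]


# Made-up figures for illustration only.
results = [
    RestoreResult("order_system", rto_minutes=60, measured_minutes=45),
    RestoreResult("reporting", rto_minutes=240, measured_minutes=310),
]

if __name__ == "__main__":
    for name in missed_targets(results):
        print(f"{name} missed its RTO and needs follow-up")
```

Any application that lands on the missed list is a gap found during testing rather than during a real outage.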
How we do test:
- In a sandbox environment
- Fully managed testing process
- Establish customized RTOs and RPOs
- Ensure disaster recovery plan scalability
- Validate changes in your IT environment
- Fulfill auditor and compliance requirements
- Testing of IBM i, Power, AIX, Windows and Unisys
How we don’t test:
- “Full Interruption Test”
- Incur unnecessary costs
- Put data or systems at risk
- Interrupt business operations
- Inconvenience company employees
- Require strange hours to run the test
Think of it this way: every gap found in the testing process is one less gap waiting for you when a real disaster occurs. Finding faults during a test is actually a good thing; it means that we can fix these problems before a real disaster tests your systems.
Faults found during the testing process fall into two main categories: human-related and technical. The former usually involves new or modified systems that have slipped through the cracks since your initial disaster recovery plan was established. Technical challenges can range from servers in poor condition that negatively impact backups to difficulties restoring data. All of these problems can be addressed during the testing process.
There’s no way of knowing whether your disaster recovery plan meets your business requirements without a real disaster or a test. Remember: time spent testing is time saved from downtime. Rather than risking everything during a real disaster, test to make sure your data can weather any storm. That’s why we treat testing as an integrated part of our data recovery solutions.
For more information on disaster recovery plan testing while maintaining a continuous workflow, contact us today at 317-707-3941.