What If? Disaster Recovery For Software Systems
December 2008
By Deb Canning
With contractors depending more than ever on the reliability of their IT systems, disaster recovery of data has become a critical component of any prudent business plan. The recent spate of hurricanes, floods, and utility blackouts has made it imperative that systems be protected and that a plan be developed for quick and effective recovery in the event of a natural disaster, human sabotage, or IT failure. However, while most contracting organizations have business continuity insurance, not all have reliable disaster recovery plans.
Define Business Critical
Like any other commercial enterprise, a contractor relies on its ability to keep its commitments to customers and employees. Today’s contracting companies rely on digital information for every business process, from job costing to invoicing, accounts receivable, and payroll.
The process of developing an effective disaster recovery plan begins with a definition of “business-critical” IT applications as part of a company’s overall business continuity planning. There is no single definition; it depends on the nature and size of the contractor and its business environment. For example, what would happen if a natural disaster, human sabotage, or an IT system failure struck the company? Would it be prepared to keep its business running with confidence, pay its employees and subcontractors on schedule, and invoice its customers in a timely manner? How much downtime can the company afford: an hour, a day, or a week? And how much data can it afford to lose?
As one might expect, the more business-critical components there are to the IT system, and the greater the associated level of disaster protection that is required, the higher the cost. Thus, the results of this cost/benefit analysis will drive the disaster recovery solution.
Scheduling Data Backups
Contractors must be prepared for both planned and unplanned outages. Planned outages include system upgrades, daily and weekly backups, application maintenance, and maintenance upgrades. Unplanned outages include simple human errors (“Oops, I hit the delete key!”), power outages, network failures, natural disasters, and human tampering.
First, the company needs to quantify how much downtime and data loss is acceptable in the event of an outage. Most companies express this in terms of the resources (such as staff, dollars, and time) that would be required to recreate the lost data.
For example, if a contractor has staff inputting data all day, every day, and loses its server—and hasn’t backed up the system since last Friday—can it afford to pay staff overtime to re-key the data for the last 5 days while keeping up with current data entry? Meanwhile, can it also afford to operate the business without the missing data? In particular, can the company meet its payroll date?
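The questions above can be turned into a rough estimate. The following is a minimal sketch, not a costing method from the article: the function name, the staffing figures, the hourly rate, and the 1.5x overtime multiplier are all hypothetical values chosen for illustration.

```python
def rekey_cost(days_lost, clerks, hours_per_day, hourly_rate, overtime_multiplier=1.5):
    """Estimate the overtime cost of re-entering data lost since the last backup.

    All inputs are assumptions supplied by the contractor; the default
    overtime multiplier of 1.5 is illustrative only.
    """
    hours_to_rekey = days_lost * clerks * hours_per_day
    return hours_to_rekey * hourly_rate * overtime_multiplier


# Hypothetical scenario: 5 days of entries lost, 3 data-entry staff,
# 6 hours of data entry per person per day, paid $20.00 per hour.
cost = rekey_cost(days_lost=5, clerks=3, hours_per_day=6, hourly_rate=20.0)
print(f"Estimated re-keying cost: ${cost:,.2f}")  # Estimated re-keying cost: $2,700.00
```

An estimate like this covers labor only. It does not capture the cost of running the business without the missing data or of missing a payroll date, which a contractor would need to weigh separately.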
Every contractor must schedule backups to match its individual level of risk. Most contractors schedule a nightly backup of only the objects that have changed, plus a weekly full backup.
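A nightly-plus-weekly schedule of this kind could be sketched as follows. This is an illustration under stated assumptions rather than a description of any particular backup product: the function names are hypothetical, Friday is assumed to be the full-backup day, and "changed" is approximated by comparing a file's modification time with the time of the last backup.

```python
import datetime as dt

FRIDAY = 4  # date.weekday() numbers Monday as 0, so Friday is 4


def backup_type(day, full_backup_weekday=FRIDAY):
    """Return 'full' on the weekly full-backup day, otherwise 'changed'."""
    return "full" if day.weekday() == full_backup_weekday else "changed"


def files_to_back_up(files, last_backup, kind):
    """Select file names for tonight's backup.

    files: dict mapping file name -> last-modified datetime (hypothetical catalog)
    last_backup: datetime of the previous backup
    kind: 'full' or 'changed', as returned by backup_type()
    """
    if kind == "full":
        return sorted(files)
    return sorted(name for name, modified in files.items() if modified > last_backup)


# Hypothetical catalog of two files.
catalog = {
    "payroll.dat": dt.datetime(2008, 12, 2, 16, 30),
    "jobcost.dat": dt.datetime(2008, 11, 28, 9, 0),
}
last_run = dt.datetime(2008, 12, 1, 23, 0)

tuesday = dt.date(2008, 12, 2)
print(files_to_back_up(catalog, last_run, backup_type(tuesday)))
```

On the Tuesday in this example only `payroll.dat` is selected, because it is the only file modified after the previous backup; on a Friday both files would be selected.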
Off-Site Tape Storage
Backing up data to tapes is only the first step. If these tapes are stored on-site, there is a risk of data loss in the event of a theft or physical damage to the building as a result of a fire or natural disaster. Therefore, even smaller contractors send their data tapes to an off-site data archiving and storage company once a month. However, depending on the size of the company, its geographic location, season of the year, and/or how critical the data are, the company might want to consider a more frequent schedule.
Data archiving and storage companies return tapes to be rewritten after the appropriate period of time. Typically, weekly tapes are returned after 2 or 3 months, while monthly tapes should be stored for a longer period so that they are available, for example, in the event of a contract dispute.
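A retention rule like the one described can be expressed as a simple date check. The sketch below is illustrative only: the 90-day figure stands in for the "2 or 3 months" mentioned for weekly tapes, and the 730-day figure for monthly tapes is a placeholder assumption, since the right period depends on the contractor's own contractual and legal needs.

```python
import datetime as dt

# Assumed retention periods in days; adjust to the contractor's own policy.
RETENTION_DAYS = {
    "weekly": 90,    # roughly 3 months
    "monthly": 730,  # placeholder for "a longer period" (2 years assumed)
}


def due_for_return(tape_kind, written_on, today):
    """Return True if a tape has been held off-site for its full retention period."""
    return (today - written_on).days >= RETENTION_DAYS[tape_kind]


today = dt.date(2008, 12, 1)
print(due_for_return("weekly", dt.date(2008, 9, 1), today))   # True (91 days held)
print(due_for_return("monthly", dt.date(2008, 9, 1), today))  # False
```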
Data Replication Solutions
Backups and off-site tape storage are the fundamentals. The IT system can also be set up to replicate data so that it is easily recoverable in the event of a planned or unplanned outage. There are several ways in which data can be replicated, including the following:
- Storage Area Network (SAN). Basically a remote storage device, a SAN attaches to a server through the network, so that it appears as if it is locally attached to the server. The SAN is part of the company network, and because it is not part of the local server hardware, it would not be affected if that server's hardware failed. The SAN can be located off-site for an added measure of security. This solution avoids the expense and problems associated with tapes, notably the need to swap tapes as they become full, the risk of defective tapes or tape drives, and the need for off-site storage. Using a SAN also reduces recovery time.
- Clustered server. A clustered server arrangement is the ultimate high-availability data solution for a contractor that requires no downtime. In this arrangement, the primary server “synchs” or transfers data in real time to a secondary server, which could be located off-site. If primary server A goes down, then server B will take over as the primary server; when server A comes back online, server B will transfer back the missing data and revert to secondary status.
- Remote online backup. These services are typically inexpensive, and some are easy to install and use. Some services will even encrypt the data when transferring it over the Internet. However, not all services provide enough support for configuring the product, and some have confusing interfaces. If using this option, try to find a service that encrypts data both during and after the transfer, and check that it stores the data on several servers. Some services will also ensure that the contractor is the only one that holds the decryption key. But beware: if the key is lost, the data will be as good as lost, too.
- Outsourcing to an enterprise systems specialist. A contractor can feel confident in outsourcing data storage and recovery to an enterprise systems specialist, that is, the provider of its financial and project management systems. Such a provider has the expertise, capabilities, and resources to ensure the fastest, most reliable data storage and recovery through a comprehensive solution comprising tape storage, a SAN, and/or a clustered server arrangement at its own data center.
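To make the clustered-server option in the list above more concrete, the failover and fail-back sequence can be simulated in a few lines. This is an in-memory toy model under stated assumptions, not real clustering software: the `Server` and `Cluster` classes and their methods are hypothetical names, records are simple strings, and replication is modeled as appending each record to both servers.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.records = []


class Cluster:
    def __init__(self, preferred, standby):
        self.preferred = preferred  # server A: primary whenever it is healthy
        self.primary = preferred
        self.secondary = standby

    def write(self, record):
        """Write to the primary and replicate to the secondary in the same step."""
        if not self.primary.online:
            if not self.secondary.online:
                raise RuntimeError("no server is online")
            # Failover: the secondary takes over as primary.
            self.primary, self.secondary = self.secondary, self.primary
        self.primary.records.append(record)
        if self.secondary.online:
            self.secondary.records.append(record)

    def bring_back(self, server):
        """Bring a failed server online and copy over the records it missed."""
        server.online = True
        missing = self.primary.records[len(server.records):]
        server.records.extend(missing)
        # Fail back: the preferred server resumes the primary role.
        if server is self.preferred and self.primary is not server:
            self.primary, self.secondary = server, self.primary


a, b = Server("A"), Server("B")
cluster = Cluster(a, b)
cluster.write("invoice-1")  # stored on A, replicated to B
a.online = False            # server A goes down
cluster.write("invoice-2")  # B takes over and stores the record
cluster.bring_back(a)       # A receives invoice-2 and becomes primary again
print(cluster.primary.name, a.records == b.records)
```

After the final step both servers hold the same two records and server A is primary again, mirroring the sequence described for servers A and B.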
About The Author:
Deb Canning is technical services manager for Computer Guidance Corporation, a leading developer of financial and project management software solutions for the commercial construction industry out of Phoenix, Arizona. For more information, please visit www.computerguidance.com.