January 6, 2009
 

Disaster recovery solutions:

Issue

What will it cost the company if the server is down for an extended period?
What will happen to your business if the server gets stolen/damaged or has a major hardware failure?

Solution

1. Use RAID on the server and centralise all data.
2. Run a daily tape backup and keep the tapes offsite.
3. Prepare a backup server.
4. Create images of all your servers and keep them up to date.
5. Keep secondary backups on a NAS or other disk storage.
6. Keep a spare backup server offsite.
7. Take out service agreements on your server hardware.
8. Carry out routine maintenance on your network equipment.
9. Write a disaster recovery plan.


Benefits

- Data security
- Less down-time
- Insurance against major disasters
- Peace of mind



Disaster Recovery Planning

With the level of dependence many organisations place on their information systems, it is necessary to consider the business consequences of a potential system failure. For example, what is the financial cost to your organisation of a single day of lost productivity due to an unexpected system failure?

What would happen if...
• A new ‘virus’ or ‘worm’ infects your system and corrupts it, rendering it unusable?
• A power surge damages critical server components?
• A theft occurs and your server is stolen? (It’s happened before!)

System failure commonly occurs for one or more of the following reasons:
- user error
- inadequate maintenance
- hardware failure
- software conflict
- a deliberate attack

Unfortunately there is no guaranteed method to protect against all the above issues. Therefore the best course of action is to be aware of the potential risks, take precautions wherever possible and be prepared in the event of a failure.

All-Tasks employ a two-part strategy to contend with system failures, whether hardware or software. Firstly, we offer comprehensive preventative maintenance to actively manage the network infrastructure. Secondly, we work with our clients to implement procedures and put in place tools that allow us to recover rapidly in the event of a system failure.

Should a ‘worst case scenario’ occur, the procedures and associated tools will already be in place to get the system back up and running within a predefined time period. Common expectations are for a 4-hour or an 8-hour recovery window.

Without forward planning, it could take 15 to 30 hours to rebuild a downed server.
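
To put rough numbers on that difference, the short sketch below compares the cost of an outage under each recovery time. The hourly cost figure and the function name are purely illustrative assumptions; substitute your own organisation's estimate.

```python
# Hypothetical figure for illustration only: what one hour of server
# downtime costs the business (lost productivity, lost sales, etc.).
HOURLY_DOWNTIME_COST = 500.0


def downtime_cost(hours_down, hourly_cost=HOURLY_DOWNTIME_COST):
    """Return the estimated cost of an outage lasting `hours_down` hours."""
    if hours_down < 0:
        raise ValueError("hours_down must not be negative")
    return hours_down * hourly_cost


# Recovery times taken from the text: 4- and 8-hour planned windows,
# versus a 15 to 30 hour unplanned rebuild.
scenarios = {
    "4-hour recovery window": 4,
    "8-hour recovery window": 8,
    "unplanned rebuild (best case)": 15,
    "unplanned rebuild (worst case)": 30,
}

for label, hours in scenarios.items():
    print(f"{label}: {hours} h down, about {downtime_cost(hours):,.0f} in lost time")
```

Whatever hourly figure you use, the gap between a planned recovery window and an unplanned rebuild scales directly with it, which is the case for planning ahead.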

Disaster Recovery Implementation

The single most effective tool in ‘disaster recovery’ is having an image of the working system volume of your server. (This image will include your specific program and user configurations.) Should a critical failure occur, rather than rebuilding the server installation from scratch, we can restore a known working image.

Q. I have a tape backup, isn’t that doing the same thing?

A. Unfortunately not. Tape is an archival medium used primarily for data storage. It is designed to store data, so you cannot effectively back up a working system volume (programs) with it.
A tape backup device can take hours to back up a system, and by the time the backup is complete many things may have changed. In addition, you cannot use a tape backup to restore data until the system is back up and running. If the system is corrupted, you will need to rebuild it first.

There are many methods of implementing system imaging technology. The simplest version is an off-line image. Once the server is working in a satisfactory state, we take it off-line. While the server is shut down, we take a complete copy of the working system and store it on another hard drive. Once the image is complete, the backup drive is removed from the working system. The whole procedure should take under an hour and can save us 15+ hours of work in the event of a system corruption. The cost to the client is minimal: the only hardware required is one additional hard drive.
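
An image is only useful if the copy is intact, so it is worth checking the copy before the backup drive is put away. The sketch below is one way to do that, assuming the imaging tool has already saved the system volume as a single image file. The checksum step and the helper names (`sha256_of`, `copy_and_verify`) are illustrative additions, not part of the procedure described above.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large images do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(image_path, backup_path):
    """Copy an image file to the backup drive and confirm the copy matches."""
    shutil.copyfile(image_path, backup_path)
    return sha256_of(image_path) == sha256_of(backup_path)
```

A call such as `copy_and_verify("system.img", "/mnt/backup/system.img")` (both paths hypothetical) returns True only when the two files have identical contents.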


The downside of off-line imaging is that it is a manual process: the image needs to be updated regularly, which involves taking the server off-line each time. Ideally the image should be updated every 3 to 6 months, or any time significant changes are made to the configuration, especially when rolling out updated applications.

As a further benefit, the drive also doubles as a spare (a cold spare, since it is kept outside the server). Should one of the live drives fail, we have a replacement on hand.
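
Because the image is refreshed by hand, it is easy to let it fall behind. A simple reminder check along the following lines can help. It is a minimal sketch that treats the upper end of the 3 to 6 month guideline as a limit of roughly 183 days; that threshold and the function name are assumptions to adjust to your own policy.

```python
from datetime import date, timedelta

# Assumption: "every 3 to 6 months" is treated as a hard limit of about
# six months (183 days). Adjust to suit your own policy.
MAX_IMAGE_AGE = timedelta(days=183)


def image_is_stale(last_imaged, today=None, max_age=MAX_IMAGE_AGE):
    """Return True if the image is older than the allowed maximum age."""
    today = today or date.today()
    return today - last_imaged > max_age
```

For example, `image_is_stale(date(2008, 1, 1), today=date(2009, 1, 6))` reports a stale image, since more than a year has passed. A check like this does not cover the other trigger mentioned above, a significant configuration change, which still has to be tracked by hand.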

Taking it further

There are better methods of server imaging available. The ultimate solution is to use an advanced software tool to automatically create a daily live backup on another server. The image can then be archived to tape for storage.



 

 

© 2005 All-Tasks Computer Services.