CIW Course Revision Site


Introduction to Fault Tolerance

In a Nutshell - CIW Course Section 3, Part B3, Chapter 8

RAID

Redundant Array of Independent Disks (RAID) is also referred to as Redundant Array of Inexpensive Disks. The two terms seem to be used interchangeably.

Originally, RAID was developed simply to increase disk space as, at the time, disks of over 1GB capacity were prohibitively expensive.

RAID can be implemented as a hardware or as a software solution. Hardware is the preferred route as it offers greater flexibility and better performance, but it is the more expensive option.

There are five main RAID levels in common use:

RAID 0 is good from a performance point of view, but offers no fault tolerance.

RAID 1 mirrors data across disks. Read speed is as good as a single disk, but write speeds may be slower, depending on whether it is a hardware or a software solution. A disk failure does not interrupt service.

RAID 3 is not a very common implementation. It places all parity information on a single disk and data is striped over the remaining disks in the set.

RAID 4 is similar to RAID 3 but stripes data in larger blocks, so a small file may be written entirely to a single disk.

RAID 5 requires a minimum of three disks and is, by far, the most common server solution. It is like RAID 3 except that the parity information is spread across all disks in the set.
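
The parity idea behind RAID 3, 4 and 5 can be sketched with XOR arithmetic. This is only an illustration: the byte values and variable names are made up, and a real controller works on whole blocks across the set rather than on single numbers.

```shell
# Hypothetical data bytes held on two data disks (values are made up).
disk1=165   # 0xA5
disk2=60    # 0x3C

# The parity information is the XOR of the data on the other disks.
parity=$(( disk1 ^ disk2 ))

# If disk1 fails, its contents can be rebuilt from the surviving disk and the parity.
rebuilt=$(( parity ^ disk2 ))

echo "parity=$parity rebuilt=$rebuilt"
```

The same XOR that produced the parity also recovers the missing value, which is why the set can lose any one disk without losing data.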

Data Protection Options

Protecting data is important to all computer users. Lost data can mean many hours of work lost irretrievably. To a company, this loss of data could represent huge monetary losses as well. RAID disks are perhaps the best way to protect data as, hopefully, a disk failure will not result in even temporary data loss. Swapping system components in the event of failure usually requires the computer to be switched off. Improved technology, however, allows devices to be hot-swapped or warm-swapped without powering off the system. A warm-swap is the term used when software configuration is necessary before a swap can occur. This configuration may be as simple as disabling a service that is using the device.

An Uninterruptible Power Supply (UPS) can help protect data. Sudden loss of power can result in data stored on the disks becoming corrupted. A UPS provides continuous power for a limited time in the event of a mains failure. This battery-backed power is, generally, used to shut systems down in an orderly fashion, to prevent data loss.

Another means of protecting data is to back it up to an alternative medium. The medium may be floppy disk (not one I would recommend), Zip disk, CD, DVD or tape. Tape is still the most common for large server backups. Another possibility is off-site storage, which addresses another scenario: physical damage to the workplace. Any form of backup should ideally be stored in a fire-proof safe or kept off-site so that fire, flood or other catastrophe does not lead to total data loss. Yes, you may lose the computer systems, you may even lose the building, but for a business the most critical element is the data - this can't be insured!

Backup Strategy Planning

The course identifies five main considerations when planning your backup strategy, all of which I agree with:

I prefer to back up to tape, which I have been doing on a weekly basis. I have chosen to back up only my data files, as a full system backup occupies a lot of space and, in my experience, seldom restores correctly. Most backup software will verify files as they are written unless you deselect this option. I also use a combination of local and network backup. Having a tape drive for each system would be a tad expensive.

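There are four methods of backing up data: Full, Differential, Incremental and Copy. Differential and Incremental only back up files that have changed, but they differ in how they decide which files those are. A differential backup takes everything changed since the last full backup, while an incremental backup takes only what has changed since the last full or incremental backup.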
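
The difference between a differential and an incremental selection can be sketched with the -newer test of find. This is a rough illustration only: the directory, file names and dates are invented, and real backup software usually tracks changes in its own way (for example with an archive attribute) rather than with marker files.

```shell
# Work in a throwaway directory; all names and dates here are made up.
work=$(mktemp -d)
mkdir "$work/data"

# Marker files record when each backup ran.
touch -t 202001010000 "$work/last_full"     # full backup on 1 Jan
touch -t 202001030000 "$work/last_backup"   # latest (incremental) backup on 3 Jan

# Three data files, modified before, between and after those backups.
touch -t 201912310000 "$work/data/before.txt"
touch -t 202001020000 "$work/data/between.txt"
touch -t 202001040000 "$work/data/after.txt"

# Differential: everything changed since the last FULL backup.
differential=$(cd "$work/data" && find . -type f -newer ../last_full | sort)

# Incremental: only what changed since the most recent backup of either kind.
incremental=$(cd "$work/data" && find . -type f -newer ../last_backup | sort)

echo "differential:"; echo "$differential"
echo "incremental:";  echo "$incremental"
```

The differential list keeps growing until the next full backup, whereas the incremental list stays short but means a restore needs the full backup plus every incremental taken since.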

If you do choose the full system backup then practicing the restoration is, in my opinion, paramount. If you have the facilities, restore to a spare machine and see if the system can be fully restored. I think you will be surprised, and not pleasantly.

TAR Command

Tar is a Unix command to create and extract archive files. It can archive entire directory structures and is, allegedly, an intuitive command. In my experience, and we're going back many years, it was not at all intuitive when trying to extract only part of a directory structure from an archive. In fact, the customer's IT manager, a DEC engineer and I all failed to accomplish this.

Tar options or switches:

An example might be: tar -cf archive.tar cpi where cpi is the directory to be archived.
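
As a rough sketch of that example in use, including extracting only part of a directory structure: only the name cpi comes from the text above, while the sub-directories, file names and the restore directory are made up for the illustration. It assumes a tar that accepts -C to choose the extraction directory, as GNU tar and bsdtar do.

```shell
# Build a small sample tree in a throwaway directory.
work=$(mktemp -d)
cd "$work"
mkdir -p cpi/docs cpi/images
echo "notes" > cpi/docs/notes.txt
echo "logo"  > cpi/images/logo.txt

# c = create an archive, f = the archive file name follows.
tar -cf archive.tar cpi

# t = list the archive contents without extracting anything.
tar -tf archive.tar

# x = extract; naming a member path restores just that part of the tree.
mkdir restore
tar -xf archive.tar -C restore cpi/docs
```

The path given on the extract line has to match the way it appears in the listing, so running the t option first is the safest way to find the right spelling.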

Compressing and Decompressing Files

Compress and gzip are two Unix compression tools. Compress is a proprietary product, while gzip is open-source and is distributed with Linux. They each use an algorithm to reduce the storage space required by files. The file name extension ".Z" indicates the use of the compress utility while a file name extension of ".gz" indicates gzip.

Uncompress and gunzip are the associated decompression utilities.
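
A minimal round trip with gzip and gunzip might look like the following. The file name and its contents are made up, and the sketch sticks to gzip because compress is not installed on every system.

```shell
# Create a sample file with very repetitive content in a throwaway directory.
work=$(mktemp -d)
cd "$work"
awk 'BEGIN { for (i = 0; i < 1000; i++) print "the same line again" }' > report.txt
before=$(wc -c < report.txt)

# gzip replaces report.txt with the compressed report.txt.gz.
gzip report.txt
after=$(wc -c < report.txt.gz)

# gunzip restores report.txt and removes the .gz file.
gunzip report.txt.gz
restored=$(wc -c < report.txt)

echo "before=$before compressed=$after restored=$restored"
```

Repetitive text like this shrinks dramatically; files that are already compressed, such as most image formats, will hardly shrink at all.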

WinZip is the most common compression utility for the Windows platform and is, as the name implies, a Windows-based GUI utility.

Dump and Restore

These are Unix backup commands. The basic syntax for the dump command is: "dump epoch number options" where the epoch number indicates the type of backup: full, incremental or differential. An example dump command might be: "dump -0uf /dev/ht0 /mydirectory". Here the "0" requests a full backup. The "u" option updates the /etc/dumpdates file, which records the last time the dump command was executed. The "f" option specifies the device to be used.

Restore is the correct Unix tool to extract data previously backed up using the dump command. The syntax is: "restore if device". The "i" option specifies interactive mode which allows commands to be issued during the restore process.

Design by Stephen

Certified Internet Webmaster

Page last Edited: 20 Nov 2011