This section covers the basic backup tools that come with almost all Linux distributions, explains how they work, and discusses the pros and cons of each.
Tar is short for Tape ARchive, and it has been around for a very long time. The GNU version of tar that ships on all Linux systems is a feature-laden miracle worker that most praise for its convenience and rich feature set, and a few curse for its departures from POSIX. Star (pronounced "ess-tar") is pretty much like tar, but much faster at local streaming to tape and more POSIX-compliant. It also handles file attributes like creation time, access time, and modification time, which makes it a bit more powerful than plain old tar.
Tar and related backup utilities go through the filesystem, just as a user would when accessing files. As a result, they can be a bit slower than raw-device backup tools such as dump, cat, or dd (depending on the usage).
Tar works very well as a general-purpose filesystem and directory archiving tool, and for network-based backups in conjunction with tools like ssh. However, in and of itself it does not support "level backups", so it must be combined creatively with other tools to build a full Towers of Hanoi (TOH) or level 0 to 9 backup script.
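One way to combine tar with another tool to approximate level backups is to pair it with find and a timestamp file. The following is only a minimal sketch under stated assumptions: the scratch directory, the file names, and the two-level scheme are all hypothetical, and the -T (read names from a file) option assumes GNU tar.

```shell
#!/bin/sh
# Sketch: emulate a level 0 and a level 1 backup with tar plus find.
# All paths below are hypothetical scratch locations, not real backup targets.
set -eu

work=$(mktemp -d)
src="$work/data"        # directory being backed up
dest="$work/backups"    # where archives and the timestamp file are kept
mkdir -p "$src" "$dest"
echo "first" > "$src/a.txt"
echo "second" > "$src/b.txt"

# Level 0: record when the backup started, then archive everything.
touch "$dest/level0.stamp"
tar -czf "$dest/level0.tar.gz" -C "$src" .

# Simulate later activity: one file changes after the level 0 run.
sleep 1
echo "changed" > "$src/b.txt"

# Level 1: archive only the files modified since the level 0 timestamp.
(cd "$src" && find . -type f -newer "$dest/level0.stamp" -print) > "$dest/level1.list"
tar -czf "$dest/level1.tar.gz" -C "$src" -T "$dest/level1.list"

# Show what the level 1 archive picked up.
level1_contents=$(tar -tzf "$dest/level1.tar.gz")
echo "$level1_contents"
```

Here the level 1 archive should contain only the file that changed after the level 0 run. A fuller script would keep one timestamp file per level and would need to cope with file names containing newlines.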
Tar is also nice in that it comes with gzip compression support built in; however, this needs to be used with caution on live filesystems. If a filesystem changes underneath the tar/gzip session, this can break the compression stream and break the tar session altogether. We'll examine ways to get around this later.
Dump and restore are actually two different programs that are both included in the dump package. Dump is one of the more popular OS-included, production-level, filesystem-based backup utilities out there. Used by itself or in conjunction with other backup programs, it is found on almost all UNIX systems for doing full system backups, with support for levels 0 to 9 (differential and incremental backups).
Dump talks to the raw block devices that underlie your filesystems, and as such is never supposed to be used on a live, mounted read/write, changing filesystem. That being said, many people choose to do this anyway. Red Hat and even Linus Torvalds warn against using it like this (see the warnings about using dump at http://dump.sourceforge.net/isdumpdeprecated.html, along with info on how to use it safely). However, even with such authoritative foreboding, you can still use it safely and get great results if you do so carefully and prepare the system with a little admin sleight of hand.
Besides having a default "batch mode" for scripting, dump and restore's other features include an interactive command-line interface. Restore's interface lets you browse back into the backup dump file, cd'ing through directories, adding files to your restore list, and then restoring them with the interactive extract command. Dump/restore also handles multivolume (tape) support, pre- and post-tape scripting (for changers), output to a file, device, or stdout, and much more.
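To illustrate how batch mode and dump levels might fit together in a script, here is a small sketch that picks a level from the day of the week and prints the dump command it would run, without running it. The weekly schedule, the function names, and the /home and /dev/tape arguments are assumptions made for the example; only the level digit, -u (record the dump in /etc/dumpdates), and -f (output file or device) are standard dump options.

```shell
#!/bin/sh
# Sketch: choose a dump level by weekday and print the resulting command.
# The schedule and function names are hypothetical.

# level_for_day DAY: DAY is 0 (Sunday) through 6 (Saturday).
# Sunday gets a full level 0 dump; each later day uses the next level up,
# so it captures only what changed since the previous day's dump.
level_for_day() {
    case "$1" in
        0)     echo 0 ;;
        [1-6]) echo "$1" ;;
        *)     return 1 ;;   # not a valid weekday number
    esac
}

# dump_command DAY FILESYSTEM DEVICE: print the command, do not run it.
dump_command() {
    level=$(level_for_day "$1") || return 1
    echo "dump -${level}u -f $3 $2"
}

today=$(date +%w)
dump_command "$today" /home /dev/tape
```

Printing the command rather than executing it keeps the sketch safe to try on a machine with no tape drive; a real script would run the command and check its exit status.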
The major issues to watch out for with dump are (as previously mentioned) copying live, changing filesystems (active logs, e-mail, databases, and so on), as well as binary compatibility between dump and restore. Dump and restore are joined at the hip. Never try to do a dump with one version and a restore with another; this will often fail. Be sure you know what version of dump you have on each of your systems, and make sure that you have the corresponding version of restore. In fact, I recommend keeping a copy of the dump/restore binaries from your backup system for safekeeping, so that if the restore you're using doesn't seem to be reading the data off the tape, you'll have another copy to fall back on. Red Hat Fedora currently ships with dump/restore version 0.4b34. You can see this with the following commands:
# which dump restore
/sbin/dump
/sbin/restore
# rpm -q dump
dump-0.4b34-1
# dump 2>&1 | grep ^dump
dump 0.4b34 (using libext2fs 1.34 of 25-Jul-2003)
# restore 2>&1 | grep ^restore
restore 0.4b34 (using libext2fs 1.34 of 25-Jul-2003)
Be sure that you have both binaries on your system, that they're the same version on every server where you do dumps and restores, and that there are no other dump/restore binaries anywhere in your system path. You can check like this:
# locate bin/dump | grep dump$
/sbin/dump
# locate bin/restore | grep restore$
/sbin/restore
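The locate database can be stale, so as a complementary check you could walk the directories in $PATH directly. This is only a sketch; the function name list_in_path is made up for the example.

```shell
#!/bin/sh
# list_in_path NAME: print every executable called NAME found in $PATH,
# one per line, in search order. More than one line means duplicates.
list_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.    # an empty PATH entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

for tool in dump restore; do
    count=$(list_in_path "$tool" | wc -l | tr -d ' ')
    echo "$tool: $count found in PATH"
    list_in_path "$tool"
done
```

If either tool reports more than one copy, the first path listed is the one the shell will actually run.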
Note: Most backup-related packages, including dump, expect a standard device file or symlink that points to the tape drive. The link name expected is /dev/tape. If you do not have it, you will have to point each tool at the tape device (/dev/st0, for example) by hand. To fix this, create a symlink with the ln -s command like this:
# ln -s /dev/st0 /dev/tape
# ls -la /dev/tape
lrwxrwxrwx 1 root root 8 Mar 14 15:17 /dev/tape -> /dev/st0
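If you set this up on several machines, the check-then-link step could be wrapped in a small function. The sketch below is an illustration only: ensure_link is a made-up name, and it is demonstrated in a scratch directory rather than in /dev so that it is safe to try.

```shell
#!/bin/sh
# ensure_link LINK TARGET: make LINK a symlink to TARGET unless LINK
# already exists, then print what LINK points to.
ensure_link() {
    link=$1
    target=$2
    if [ -L "$link" ]; then
        :   # already a symlink; leave it alone
    elif [ -e "$link" ]; then
        echo "$link exists and is not a symlink" >&2
        return 1
    else
        ln -s "$target" "$link" || return 1
    fi
    readlink "$link"
}

# Try it in a scratch directory; on a real system LINK would be /dev/tape.
scratch=$(mktemp -d)
ensure_link "$scratch/tape" /dev/st0
```

Refusing to touch an existing non-symlink is a deliberate choice here, since silently replacing a real device file would be hard to notice later.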
Short for Advanced Maryland Automatic Network Disk Archiver, Amanda is a pretty slick and polished automated, production client/server backup package that's free and comes installed on many Linux systems out of the box, including Red Hat/Fedora systems.
Amanda is a suite of command-line tools that brings together the best commodity backup tools, such as dump/restore, tar, and ssh, with production-quality backup concepts such as parallel client dumps, a "holding disk" that caches client dumps on disk while the tape is busy, and many other advanced features.
Note: Amanda is a very advanced backup suite, and replaces much of the manual level-x and TOH work shown in this chapter, but it also demands a more complex setup configuration. See Table 5-4 at the end of this chapter for more information on setting up the client/server configuration for Amanda.
The server config files are located in /etc/amanda/, where each backup set is defined in its own subdirectory.
The daemon is configured on the centralized backup server, and the service is turned on like this:
# chkconfig --list | grep amanda
amandaidx: off
amanda: on
# chkconfig amanda on
# chkconfig --list amanda
amanda on
Note: As you can see from the grep, it's on, but there are no runlevels visible. This means that it is actually a subservice (on Fedora anyway) of the xinetd service.
One of the best walkthroughs for setting up and using this comprehensive client/server backup package is located at http://backupcentral.com/amanda.html.
One of the most useful tools you'll find out there for implementing great backup feats of wonder is the find command. It epitomizes the UNIX philosophy of many small tools that do one thing well. For example, it can be used to find all files on your file server that have not been accessed in over a year, and move them off the system and back them up to tape:
# find /mnt/fileserver/ -atime +365 -exec mv-and-backup.sh {} \;
Or, how about finding, backing up, and deleting all files on a hard drive that are owned by an ex-employee?
# find / -user bob -exec backup-n-rm.sh {} \;
Note: These examples use an imaginary backup script. In its place, you could substitute a command such as "tar -czvf /dev/tape {}; rm -rf {}" or the like.
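As a rough idea of what such a script might do, the sketch below collects files that have not been accessed in over a year, archives them, and only then removes them. It is an illustration under stated assumptions: it works in a scratch directory and writes to a tar file instead of a file server and tape, the file names are invented, and -T assumes GNU tar.

```shell
#!/bin/sh
# Sketch: archive files not accessed in over a year, then remove them.
# Uses a scratch directory and a tar file instead of a file server and tape.
set -eu

work=$(mktemp -d)
mkdir "$work/share"
echo "stale" > "$work/share/old.txt"
echo "fresh" > "$work/share/new.txt"
# Backdate the access time of one file so find has something to match.
touch -a -t 200001010000 "$work/share/old.txt"

cd "$work/share"
find . -type f -atime +365 -print > "$work/stale.list"

# Archive first; with set -e the script stops here if tar fails,
# so nothing is deleted unless the archive was written.
tar -czf "$work/stale.tar.gz" -T "$work/stale.list"
while IFS= read -r f; do
    rm -f -- "$f"
done < "$work/stale.list"

archived=$(tar -tzf "$work/stale.tar.gz")
remaining=$(ls)
echo "archived: $archived"
echo "remaining: $remaining"
```

Building one list and running tar once avoids starting a new archive for every file, which matters when the destination is a tape device. File names containing newlines would need the null-separated variants of these options.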
If you want to master UNIX or Linux, find is one of those commands that can greatly assist you, while making you appear all-powerful to those who are less inclined to use a non-GUI interface.
The command mt is another tool you'll need to understand to do backups well. It stands for magnetic tape, and it allows you to send commands to and talk to the tape drive. The thing to remember is that it expects to talk to the tape device /dev/tape. If that device is not in place, then it will not work without pointing it elsewhere. Be sure you have that set up. It should look something like this:
# ls -la /dev/tape
lrwxrwxrwx 1 root root 8 Mar 14 15:17 /dev/tape -> /dev/st0
If you do not have this symlink set up, please refer back to the Dump/Restore section for details on how to fix this. After this is set up, you can issue simple tape drive control commands such as rewind, erase, and status:
# mt status
SCSI 2 tape drive:
File number=0, block number=0, partition=0.
Tape block size 512 bytes. Density code 0x13 (DDS (61000 bpi)).
Soft error count since last status=0
General status bits on (41010000):
 BOT ONLINE IM_REP_EN
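When the /dev/tape link is missing, mt can be pointed at a device explicitly with its -f option. The wrapper below is a sketch for rehearsing tape commands without a drive attached; tape_cmd and DRY_RUN are names invented for this example, and in dry-run mode the function only prints the command it would run.

```shell
#!/bin/sh
# Sketch: wrap mt so tape commands can be rehearsed without a drive.
# TAPE selects the device; DRY_RUN=1 (the default here) prints instead of running.
TAPE=${TAPE:-/dev/tape}
DRY_RUN=${DRY_RUN:-1}

tape_cmd() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "mt -f $TAPE $*"
    else
        mt -f "$TAPE" "$@"
    fi
}

tape_cmd status
tape_cmd rewind
```

Setting DRY_RUN=0 makes the same calls go to the real drive, so a script can be checked end to end before it is allowed to touch a tape.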
If you're working with tape changer libraries (with robot arm changers and the like), you'll need to familiarize yourself with the mtx package. This is the software that allows you to talk directly to the tape changer and tape drive using standardized commands.