Unix and Linux Backups – SANS GIAC LevelOne © 2000, 2001
Unix and Linux Backups for System Administrators
By Robert Blader
Hello, my name is Robert Blader. I’m here to present a tutorial on how to make use of the backup utilities that
UNIX provides and apply them to the development of a backup plan. For the past 10 years, I have worked as
a system administrator at the Naval Surface Warfare Center in Dahlgren, Virginia. The mission of the site I
managed was to develop fire-control software for deployment on board submarines. As such, data
availability, security, and configuration management were of paramount importance.
Before I start, I’d like to tell a story. Perhaps some of you can identify with it. You’re tasked with managing
a system. If it’s new, you start with hardware: connect cables, attach peripherals, etc. Next, you install and
configure the operating system, the latest security patches, and security software (Tripwire, TCP Wrappers,
COPS, etc.). Next, you create user accounts, groups, and directories. Finally, you add your applications,
compilers, tools, etc. You’re running along fine for six months until one morning users notice they cannot
access files.
On the day in question, there is some deadline that must be met and that data is essential. You confirm what you
are being told: you cannot access directories that should be there, and attempts to mount the filesystem are
futile. Your choices are (A) panic; (B) panic while trying to locate a backup that you fear is old and was not
done with a timely recovery in mind; or (C) break out your contingency plan that has your backup/recovery
plan documented step by step. If this is your first crisis, then you probably will handle it using some
combination of A and B. Hopefully, after going through this tutorial, choice C will be a viable option.
Course Objectives
• Use three Unix/Linux backup
commands: tar , dump , and dd (or
• Operate the tape device via the mt
command
• Develop a backup strategy that meets
your needs as well as your users’
At the completion of this tutorial, the student will know how to (1) use tar, dump, and dd to
archive data; (2) use the mt command to control the tape media and the tape device;
and (3) apply the UNIX archiving tool set to formulate a backup plan. (Editor’s note:
information on the UNIX command cpio is also included as an appendix to this course. – JEK)
No one can argue against the value of a backup in a time of crisis. Whether the crisis is the result of
a hardware failure such as a disk crash, a security breach, or a user accidentally deleting files, the
ability to recover from the event in a timely manner is what will separate an excellent system
administrator from a mediocre one. Obtaining funding – and the respect and confidence of users – is
a lot easier when you can provide them with restored data rather than with excuses. Devising a backup
scheme that achieves this in a UNIX environment may seem a daunting task, but it does not need to be.
This tutorial will explain the concepts you need to be able to meet this challenge and succeed.
We will also discuss the requirements that a backup plan should meet. A little bit of time spent
creating a backup plan now will make dealing with lost data much less stressful later.
We will start by presenting the three backup utilities that UNIX provides us.
They are tar, dump, and dd. Each command will be presented with usage, examples, and a
description of the situation that it is best suited for. We will also touch on some personal "war
stories" and useful examples. This way, we will see how the utilities come together to form a
comprehensive backup scheme.
Since magnetic tape is by far the most common medium, we will show how the mt command
comes into play to manage the tape device and manipulate the tape. Next, we will present some
considerations to take into account when creating a backup plan, and wrap up with some closing
notes.
Unix/Linux Backup Commands
• tar
• dump
• dd
• cpio (in Appendix)
The archival commands we will discuss here are tar, dump, and dd.
As we will see, each is suited for different types of backups. Combined, they form a versatile toolkit
for performing backups.
Some information on syntax: the dash preceding the option flags for tar and dump is optional.
Dashes, however, are not used with dd.
tar Usage
• Create tar file
tar cvf <archive> <file>
• Extract tar file
tar xvf <archive> <file>
• List contents of tar archive
tar tvf <archive> <file>
• Copy current directory to another
tar cpf - . | ( cd newdir; tar xvpf - )
– Where
• <Archive> is a file or tape device
• <File> is the file or directory to archive
The three primary functions of tar are (1) to create an archive; (2) to extract files from the archive;
and (3) to generate a table of contents for a tar file.
It is simple to use and ideal for backing up only a particular directory tree or a list of files.
Note how in the fourth bullet, we use a dash instead of specifying an “archive”. A dash can be used
in lieu of a device or file name to indicate that the data will either be read from standard input or
written to standard output, depending on which side of the pipe it is used.
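The four tar operations can be tried safely on scratch files before they are pointed at a tape. The following is a minimal sketch, assuming a POSIX shell, mktemp, and a tar that accepts the flags shown; the directory and file names (src, newdir, restore, demo.tar, notes.txt) are made up for illustration.

```shell
# Scratch area so nothing outside it is touched (all names are illustrative).
WORK=$(mktemp -d)
mkdir "$WORK/src" "$WORK/newdir" "$WORK/restore"
echo "hello" > "$WORK/src/notes.txt"

# Create an archive of the current directory using a relative path.
( cd "$WORK/src" && tar cf "$WORK/demo.tar" . )

# List the table of contents without extracting anything.
tar tf "$WORK/demo.tar"

# Extract into a separate directory.
( cd "$WORK/restore" && tar xf "$WORK/demo.tar" )

# Copy one directory to another through a pipe: the first tar writes the
# archive to standard output (-), the second reads it from standard input.
( cd "$WORK/src" && tar cpf - . ) | ( cd "$WORK/newdir" && tar xpf - )
```

Replacing the archive file name with a tape device such as /dev/nrst0 gives the tape form of the same commands.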
tar - the <File> Parameter
• Warning: -p to get all ACL and
permission information
• Absolute vs Relative path
– Affects whether files will be placed in current working directory or in
absolute path when restoring
• If restoring file from tar created using absolute pathname, could wind up overwriting a file if one exists by that name
tar, when used with the -p flag, will preserve access information. If you administer a heterogeneous
environment, it may be important to try to extract your tar files on the same platform as they were created
on. This is because some operating systems (such as Solaris) support Access Control Lists; others (such as
Linux) do not. If maintaining ACLs is important at your site, note that this information will
be lost when an archive is extracted on a platform that does not support them.
Another thing to keep in mind when creating a tar archive is the use of absolute vs. relative path names. Tar
files are restored to locations based on how they were put on the tape. If they were created using absolute
path names, they will be restored to the same location. Otherwise, they are restored relative to the current
working directory. To illustrate the significance, here is a true story:
At the site I used to work at, we routinely got deliveries of software from our contractors. Unfortunately,
one company was lax in their documentation, especially when it came to installation notes. The normal
course of action with a new delivery was to unload it to a “test” area, where the code would be tested prior to
being put into production. The current version remained in use until the code was tested. One day, I was given
an update to install. I extracted the tar file that was delivered. Since it was backed up using absolute path
names, the current version wound up being overwritten. I had to restore the original version, move it to a
temporary location, extract the new files, move them to a test directory, and move the old version back to
where it belonged. Moral of the story: know what you are extracting, make sure you know where the files
are going, and know if the files already exist on disk. Otherwise, a 15 minute task could take you all
afternoon.
Absolute vs Relative Examples
Absolute path (would overwrite /etc when
extracted):
tar -cvf etc_archive.tar /etc
Relative path: use “.” to indicate current directory
cd /etc; tar cvf /etc_archive.tar .
Here are examples of how an archive is created with tar using both absolute and relative path names.
In the absolute path example, the contents of /etc would be overwritten when restored.
Use of the “.” indicates that the archive uses relative path names. Restoring files created in this
manner will place them in the current directory. Typically, you would want to first create an empty
directory from which to stage the tar extraction.
By the way, Linux (Red Hat) tar, by default, strips any leading slashes, although this can be
overridden with the -P flag. Not all vendors’ implementations of tar behave this way.
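One way to act on this advice is to look at the member names in an archive before extracting it, and to extract only from a staging directory. The sketch below assumes a POSIX shell and a tar whose -t listing (without v) prints one member name per line; the helper name listing_has_absolute and the scratch paths are hypothetical.

```shell
# Reads a tar listing on standard input; succeeds if any member name
# starts with "/" (that is, was archived with an absolute path).
listing_has_absolute() {
    grep -q '^/'
}

WORK=$(mktemp -d)
mkdir -p "$WORK/project/conf" "$WORK/stage"
echo "10.0.0.1 gateway" > "$WORK/project/conf/hosts"

# Archive with a relative path: cd first, then name "." as the file.
( cd "$WORK/project" && tar cf "$WORK/project.tar" . )

# Inspect the listing before extracting, then extract in the staging area.
if tar tf "$WORK/project.tar" | listing_has_absolute; then
    echo "archive contains absolute paths; extract with care"
else
    ( cd "$WORK/stage" && tar xf "$WORK/project.tar" )
fi
```

Here the files land under the staging directory as conf/hosts, from where they can be copied to their final location once they have been checked.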
Use Caution When Extracting
Tar Files
• If backed up with absolute path:
– Take care that files by that name
don’t already exist
• If backed up with relative path:
– Will restore to current directory. Be
certain you cd to the directory you want the files to reside in
Whether using relative or absolute pathnames, caution should be used. If absolute pathnames are
used, make sure you do not accidentally overwrite files on disk. The next slide shows a snippet of
code that can be used as a shell script to check that the files that are on a tape will not overwrite any
files without you knowing it.
Alternatively, if relative paths are used and the files go to the directory you are in, you need to make
sure that is where you want them to wind up. A common mistake is to untar the file while still sitting
in a directory full of files, like /usr for example, and then having to “relocate” the files that do not
belong there.
Ensure Don’t Overwrite Files
With tar
• The following code could help find
files that could get overwritten:
tar -tvf /dev/nrst0 > tar_listing.out
# The field number given to cut depends on the listing format of your tar
for FILE in `cat tar_listing.out | cut -f6 -d" "`
do
    if [ -f "$FILE" ]; then
        echo "$FILE exists"
        mv "$FILE" "$FILE.orig"
    fi
done
Here is one way to ensure that you don’t overwrite files. First, we use tar with the -t option to
extract a file listing and save it off to a temporary file called tar_listing.out.
Then, we read the contents of the tar listing, extract the filename with the cut command, and test to
see if a file by that name exists. If so, we print a warning and save the existing file with a .orig extension. This
way we can be proactive when we restore files and not just cross our fingers and hope for the best.
As a rule of thumb, it is recommended that you use relative path names, extract to a temporary
directory, and then copy files to where you want them to permanently reside. This way, you avoid
overwriting a file by accident.
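The same check can be written without depending on the column layout of the verbose listing. This is a sketch rather than the author's script: it assumes that tar -t without the v flag prints only member names, one per line, and the function name protect_existing and the scratch files are hypothetical.

```shell
# Rename any existing regular file that an extraction would overwrite.
# $1 = archive (file or tape device); run this from the directory you
# will extract into.
protect_existing() {
    tar -tf "$1" | while IFS= read -r name; do
        if [ -f "$name" ]; then
            echo "$name exists; saving it as $name.orig"
            mv "$name" "$name.orig"
        fi
    done
}

WORK=$(mktemp -d)
mkdir "$WORK/src" "$WORK/dest"
echo "new" > "$WORK/src/config"
( cd "$WORK/src" && tar cf "$WORK/update.tar" config )

# An older copy already sits where the archive will be extracted.
echo "old" > "$WORK/dest/config"
( cd "$WORK/dest" && protect_existing "$WORK/update.tar" && tar xf "$WORK/update.tar" )
```

Reading names line by line keeps file names containing spaces intact. With a non-rewinding tape device, the tape has to be repositioned (for example with mt) between the listing and the extraction.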
Other Tar Options
• Tar a list of files with -I (include)
– Want all *.c files from the /development
directory tree or file system:
find /development -name "*.c" > filelist.out
tar -I filelist.out -cvf c_files_archive.tar
• Likewise, exclude files with -X
This example shows how you can use the find command in conjunction with tar (with the -I
flag) to create an include list. Here we are archiving C source files.
The find command says “search the /development directory tree for files matching the pattern *.c.
Save the results to a file called filelist.out”.
The tar command says “archive all the files in filelist.out and call the archive c_files_archive.tar”.
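The letter used for the include list varies between tar implementations, so check your man page. GNU tar, for instance, reads an include list with -T (--files-from) and an exclude list with -X (--exclude-from). The sketch below assumes a tar that supports -T and uses a scratch copy of a development tree; the file names are made up for illustration.

```shell
WORK=$(mktemp -d)
mkdir -p "$WORK/development/lib"
echo 'int main(void) { return 0; }' > "$WORK/development/main.c"
echo 'int helper(void) { return 1; }' > "$WORK/development/lib/helper.c"
echo "notes" > "$WORK/development/README"

# Build the include list with relative names, then archive only those files.
( cd "$WORK/development" &&
  find . -name "*.c" > "$WORK/filelist.out" &&
  tar -cf "$WORK/c_files_archive.tar" -T "$WORK/filelist.out" )

# Confirm what went into the archive.
tar -tf "$WORK/c_files_archive.tar"
```

Running find from inside the tree keeps the names in the list relative, so the archive can later be extracted into a staging directory.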
tar Summary
• Use relative paths when possible
• Can be used for directory trees or selected
files if listed with an include or exclude file
• Use -p to retain security attributes (e.g.,
ACLs)
• Can archive on-line, to tape, or use on either
side of a pipe
To summarize, tar is the simplest and perhaps the most versatile of the backup commands
available.
To be safe, use relative pathnames and use -t to double check the contents of a tape prior to
extracting it.
You can tar an entire directory tree or just selected files.
Use the -p flag to retain ownership, group and access mode, and ACLs on platforms that support
them.
You can use tar to archive on-line, to a tape device, or use it on either side of a pipe.
tar does not support device files. Also, on some versions of UNIX, multiple volumes may not be
supported. However, Red Hat Linux supports multiple volumes with the -M option. Check to see if
your version of tar supports the use of multiple volumes/tapes.
– Good time for housekeeping.
dump gives you the option to archive either an entire file system, or only the files that have been
changed since a previous dump. tar does not look at whether or not a file was previously dumped
(or tarred).
Since dump is file system based, there are some things to keep in mind.
First, a full dump should be run after an upgrade or reinstallation of the operating system. This is
because the dates on the files reflect when the files were “mastered”, not when they were actually copied to your system.
Therefore, their creation dates relative to your dumps will be out of synch. In other words, the files
you install will be NEW, yet could have older time stamps than the files they are replacing. The
/etc/dumpdates file will not be accurate, and incremental dumps will not pick them up as being
changed files.
Second, tar only requires that you be able to read the file in order to archive it. dump accesses the
raw device (which typically is readable only by root), so non-privileged users cannot run it (without
use of sudo, or a setUID script).
And third, dump only supports local UNIX file systems. It cannot dump NFS-mounted partitions. If
you need to dump a remote partition, run dump on the system serving it and use
hostname:device to specify a remote tape device.
dump
• Full dump or level “0”
– dump captures entire file systems - can only be used to
dump an entire local file system
• Incremental dump or level 1-9
– Captures all files modified since previous dump of a
lower level
– Uses /etc/dumpdates to store date, dumplevel and
filesystem name
• Set block size, tape length, density, etc
– Defaults are often okay Some large drives require
specific values which can be obtained from the vendor
Aside from full vs. incremental, you really have no control over which files get dumped.
A level 0 dump captures an entire file system. Incremental dumps (levels 1-9) record files modified
since the most recent dump of a lower level.
dump uses the /etc/dumpdates file to record what level dump was done on which file system
and when.
dump also keeps track of the amount of media used. When dumping small partitions to tape, you
can usually rely on defaults; but if dumping large partitions (several GB) to large capacity media,
you may need to specify tape length, density, or both. Tape drive vendors can usually assist with
these parameters.
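The level rule is easier to see with a small worked example. The sketch below is not how dump itself is implemented and does not read /etc/dumpdates; it assumes a made-up history format of one "level time" pair per line, and the function name reference_dump is hypothetical. It reports which earlier dump a new dump of a given level would be measured against.

```shell
# Reads "level time" pairs on standard input and prints the most recent
# entry whose level is lower than the requested level ($1).
reference_dump() {
    awk -v lvl="$1" '
        $1 < lvl && $2 >= best_time { best_time = $2; best = $0 }
        END { if (best != "") print best }
    '
}

# Hypothetical history: a level 0, then levels 2, 1 and 3 at later times.
history='0 1000
2 2000
1 3000
3 4000'

printf '%s\n' "$history" | reference_dump 2   # prints: 1 3000
printf '%s\n' "$history" | reference_dump 1   # prints: 0 1000
printf '%s\n' "$history" | reference_dump 0   # prints nothing: a level 0 is always full
```

A new level 2 dump would therefore capture files modified since the level 1 dump taken at time 3000, not since the earlier level 2.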
• Full dump of /usr:
dump 0uf /dev/nrst0 /usr
• Incremental (level 2) dump of /usr to
a 20 GB Travan tape drive
dump 2usdf 740 106400 /dev/nrst0 /usr
Some dump examples:
The simplest form of the dump command is:
dump dump_level u (update dumpdates file) f (device name) and the file_system to dump.
The last parameter may be specified as a mount point (like /usr) or a disk device name
(/dev/hd0a). If you need to specify other parameters, they must be in the order of the flags used;
example two uses size and density, so that must be the order of the parameters.
Example one is a full dump of the /usr file system.
Example two dumps /usr to a 20 GB tape drive.
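Because the arguments must follow the order of the flag letters, it can help to build the command line mechanically. The sketch below only prints a dump command and never runs one; the function name show_dump and the variables SIZE, DENSITY and DEVICE are hypothetical, and the values are the ones from the second example.

```shell
# Print (do not run) a dump command line in which the argument for each
# flag appears in the same order as the flag letters.
# Usage: show_dump LEVEL FLAGS FILESYSTEM
show_dump() {
    level=$1 flags=$2 filesystem=$3
    args=""
    rest=$flags
    while [ -n "$rest" ]; do
        letter=${rest%"${rest#?}"}    # first remaining flag letter
        rest=${rest#?}
        case $letter in
            s) args="$args $SIZE" ;;      # tape size
            d) args="$args $DENSITY" ;;   # tape density
            f) args="$args $DEVICE" ;;    # device name
        esac
    done
    echo "dump ${level}${flags}${args} $filesystem"
}

SIZE=740 DENSITY=106400 DEVICE=/dev/nrst0
show_dump 2 usdf /usr   # prints: dump 2usdf 740 106400 /dev/nrst0 /usr
show_dump 2 ufsd /usr   # prints: dump 2ufsd /dev/nrst0 740 106400 /usr
```

The two printed lines carry the same settings; only the order of the flags, and therefore of their arguments, differs.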
• u updates the dumpdates file
– Otherwise dump is full (regardless of
dump level)
tape prior to writing
– If you write multiple dump files to tape
and do NOT use a non-rewinding device, you will overwrite archives
We’ve already discussed the use of the dumpdates file. If you do not use the -u flag to indicate that
the dumpdates file should be updated, the effect on your incremental dumps is that they will be run
as though you specified full dumps. This is because dump has no record of when the file system
was last dumped.
Be careful to use a non-rewinding device! Dumps rewind to the beginning of the tape if not directed
otherwise. This will result in all but the last dump file being overwritten.
restore
• Used for reading dump-formatted
archives to restore or create “table of
contents”
• Interactive and non-interactive mode
– Interactive mode permits “browsing
of dump file”
restore reads back archives created by dump. It can be used to generate a listing or extract files.
Note that it cannot read tar files, nor can tar read files created by dump.
restore has both an interactive and a non-interactive mode. The interactive mode permits you to
“browse” through a dump file so that you may search for what you want to retrieve.
Restore Examples
• Interactive restore of /etc/hosts:
cd /tmp; restore ivf /dev/nrst0
restore> cd etc
restore> add hosts
make node /etc
restore> extract
Specify next volume #: 1
Extract file /etc/hosts
Add links
Set directory mode, owner, and times
Set owner/mode for '.'? [yn] y
restore> quit
• Add builds list of files to search for
• When extracted, full path is recreated relative to
current directory and restored file is
/tmp/etc/hosts
Here is an example of an interactive restore.
We first cd to /tmp and use /tmp as a staging directory.
Next, we invoke restore with the i (interactive) flag.
Note that add only builds the search list; it does not actually retrieve files. Issue an add command
for each file you want to extract. The extract command tells restore to start searching
the tape.
The full path to each restored file is recreated relative to the current directory. In this case, we are retrieving /etc/hosts in
the /tmp directory. Therefore, restore will put the restored file in /tmp/etc/hosts.
Restore Examples
• Noninteractive mode
– Restore full filesystem, /big_project
– Start with most recent level 0 dump tape
cd /big_project; restore rf /dev/nrst0
– Repeat for each incremental dump taken
between the level 0 and present:
restore rf /dev/nrst0
In non-interactive mode, be certain that you have set your current working directory to be the
directory that you want data restored to prior to running the restore -r command.
Position the tape to the desired dump file on the tape. There will be more on this when we discuss
tape operations.
cd to the directory where you want the data restored.
Issue restore with the r and f flags to extract the entire dump.
Repeat the procedure for each incremental dump taken between the level 0 and the present.
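The procedure can be written down as a checklist before any tape is loaded. The sketch below only prints the commands and runs nothing; the function name plan_restore and the tape labels are hypothetical, and it assumes one dump file per tape.

```shell
# Print the commands for a full restore; nothing is executed.
# Usage: plan_restore TARGET_DIR DEVICE TAPE_LABEL...
# List the level 0 tape first, then each incremental in the order taken.
plan_restore() {
    target=$1 device=$2
    shift 2
    echo "cd $target"
    for tape in "$@"; do
        echo "# load tape: $tape"
        echo "restore rf $device"
    done
}

plan_restore /big_project /dev/nrst0 "level 0" "level 1" "level 2"
```

Keeping a printed plan like this with the contingency documentation makes the order of tapes explicit when the restore is done under pressure.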
Using dd
• Provides image of disk - as near to a
bit-by-bit copy as you can get with Linux tools
– In event of compromise, could include
“deleted” files
• Ability to copy tape to tape
• Ability to read from non-UNIX platform or
UNIX systems with different byte order
(Sun/SGI)
dd is a utility that reads input files block by block.
If you specify a disk device, you can capture file system metadata – blocks of “data” marked deleted
that could be useful for evidence gathering following a break-in. This data would be missed if using
tar or dump, which rely on the UNIX file system.
The input file can be a tape or disk device name, enabling you to make tape to tape copies without
having to unpack the archive.
Additionally, you can do conversions on the blocks of data, permitting you to swap byte pairs and
enabling you to go between SGI and other UNIX variants.
Other conversions include changing upper to lowercase data, ASCII to EBCDIC, and others. Refer
to the man pages for a complete list.
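The conversions are easy to try on a few bytes of text before using them on a tape. A minimal sketch, assuming a dd that supports the standard conv=swab and conv=ucase conversions; dd's transfer statistics are sent to /dev/null so only the converted data is shown.

```shell
# Swap each pair of bytes, as is needed between machines of different
# byte order.
printf 'abcd' | dd conv=swab 2>/dev/null; echo      # prints: badc

# Convert lowercase characters to uppercase.
printf 'backup' | dd conv=ucase 2>/dev/null; echo   # prints: BACKUP
```

With if= pointing at a tape device, the same conv=swab conversion can feed a byte-swapped archive into a tar command on the other side of a pipe.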
• Image copy of a file system
Here are three sample uses of dd:
First, we copy a disk partition to a tape.
In the second example, we are copying from one tape drive to another.
Example three shows a more complex use of dd. Here we are using it to help transfer a tar file from
an SGI to a Linux system. Since these two platforms have a different byte order, a conversion needs
to take place. The byte-order conversion is made to an archive residing on a tape and piped to a tar
command.
This is probably not something you need to do often, but it is shown to illustrate how powerful dd can
be.
• Files AND filesystem metadata are
saved for forensics study
• Files that might have been maliciously
deleted (i.e., log files) might be able to
be restored
• Usage:
dd if=input_device of=output_device
If your system should ever be infected by a virus, Trojan horse, etc., first perform a backup of the
filesystem using dd. This will preserve filesystem information, along with “deleted” disk blocks
which forensics experts may be able to recover.
Ideally, you will have a ready spare to rebuild onto from your backups and can set the compromised
disk aside for forensic study.
A binary copy, in the hands of a computer forensics expert, might also provide insight into how the
virus operates.
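An image is only useful as evidence if it matches the original, so it is worth comparing the two after the copy. The sketch below uses an ordinary file as a stand-in for a disk partition so that it can be run without special privileges; the file names partition.img and evidence.img are made up, and /dev/urandom is assumed to exist.

```shell
WORK=$(mktemp -d)

# Stand-in for a disk partition: a 64 KB file of arbitrary bytes.
dd if=/dev/urandom of="$WORK/partition.img" bs=1024 count=64 2>/dev/null

# Image copy: read the input block by block and write it out unchanged.
dd if="$WORK/partition.img" of="$WORK/evidence.img" bs=4096 2>/dev/null

# Confirm the copy is identical before relying on it.
if cmp -s "$WORK/partition.img" "$WORK/evidence.img"; then
    echo "image verified"
fi

# Record checksums of both files for later reference.
cksum "$WORK/partition.img" "$WORK/evidence.img"
```

On a real system the input would be the raw disk device and the output a tape device or a file on a separate disk.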
Command Summary
• tar
– Archive selected files or directories
– Copy contents of one directory to another
• dump/restore
– Can choose between full and incremental
– Can only dump entire file system, not a
single file or directory
• dd
– Binary backup
– Use to modify format of a dump file
– Copy archives between tapes
This slide compares and contrasts the backup commands tar, dump, and dd.
tar is best for backing up a single directory or selected files. You can also use it to copy the
contents of one directory to another, with the exception of /dev.
dump and its counterpart restore are best suited for creating backups of entire partitions. dump is usually
run from cron as root (since it needs to be able to read the raw device) and is typically used for
nightly backups.
dd is used to create a binary (byte for byte) copy of a device, convert between block formats, and
copy archive files without extracting or restoring the archives.
None of the three is better or worse than the others; they are just suited for different things. Combined, they make for
an effective tool set that can support your backup plan.
• However, disks and CDROM are other
technologies that might be considered
Tapes are by far the backup media of choice. They offer high capacity, take little space, are
available in various sizes and formats, and are cost efficient. However, they are not the only option
available.
Disks are getting cheaper and bigger. Experience has shown me that about 90% of user-requested
restores are for files that were modified within a week’s time. In an environment where restores are
requested often, and restore time must be kept low, keeping archives on line for a brief period of
time may be worth considering.
mt Usage
mt -f /dev/nrst0 command
command can be any of the following:
• status = status of device
– (Tape must be loaded to get information on
device type, on-line, etc)
• rew = rewind
• offl = rewinds tape, ejects it
• fsf n = fast forward over n archive files
• bsf n = rewind over n archive files
• eom = skip to end of recorded media
Tape devices are managed using the mt command.
This slide shows some of the mt options for manipulating tape. As mentioned earlier, mt can
provide tape drive status, rewind, eject a tape and take the drive offline, skip forward over archive files,
back up over archive files, and jump to the end of the tape in preparation for appending a new
archive.
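Positioning a tape at a particular archive file combines two of these commands: rewind, then skip forward. The sketch below only prints the mt commands so it can be run without a tape drive; the function name position_tape is hypothetical, and archive files are counted from 1.

```shell
# Print the mt commands that position a tape at the start of the Nth
# archive file; nothing is executed.
# Usage: position_tape DEVICE N
position_tape() {
    device=$1 n=$2
    echo "mt -f $device rew"
    if [ "$n" -gt 1 ]; then
        # Skipping forward over N-1 archive files leaves the tape at file N.
        echo "mt -f $device fsf $((n - 1))"
    fi
}

position_tape /dev/nrst0 3
# prints:
# mt -f /dev/nrst0 rew
# mt -f /dev/nrst0 fsf 2
```

The non-rewinding device is used throughout so that the tape stays where mt leaves it for the tar or restore command that follows.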