You should also back up your kernel sources (if you have upgraded or built your own kernel); these are found in /usr/src/linux.
During your Linux adventures it's a good idea to keep notes on what features of the system you've made changes to so that you can make intelligent choices when taking backups. If you're truly paranoid, go ahead and back up the whole system; that can't hurt, but the cost of backup media might.
Of course, you should also back up the home directories for each user on the system; these are generally found in /home. If you have your system configured to receive electronic mail (see Section 16.2 in Chapter 16), you might want to back up the incoming mail files for each user. Many people tend to keep old and "important" electronic mail in their incoming mail spool, and it's not difficult to accidentally corrupt one of these files through a mailer error or other mistake. These files are usually found in /var/spool/mail. Of course, this applies only if you are using the local mail system, not to people who access mail directly via POP3 or IMAP.
8.1.1.1 Backing up to tape
Assuming you know what files or directories to back up, you're ready to roll. You can use the tar command directly, as we saw in Section 7.1.2 in Chapter 7, to make a backup. For example, the command:
tar cvf /dev/qft0 /usr/src /etc /home
archives all the files from /usr/src, /etc, and /home to /dev/qft0. /dev/qft0 is the first "floppy-tape" device, that is, a tape drive that hangs off of the floppy controller. Many popular tape drives for the PC use this interface. If you have a SCSI tape drive, the device names are /dev/st0, /dev/st1, and so on, based on the drive number. Those tape drives with another type of interface have their own device names; you can determine these by looking at the documentation for the device driver in the kernel.
You can then read the archive back from the tape using a command, such as:
tar xvf /dev/qft0
This is exactly as if you were dealing with a tar file on disk, as seen in Section 7.1 in Chapter 7.
When you use the tape drive, the tape is seen as a stream that may be read from or written to in one direction only. Once tar is done, the tape device will be closed, and the tape will rewind. You don't create a filesystem on a tape, nor do you mount it or attempt to access the data on it as files. You simply treat the tape device itself as a single "file" from which to create or extract archives.
Be sure your tapes are formatted before you use them. This ensures that the beginning-of-tape marker and bad-blocks information have been written to the tape. For formatting QIC-80 tapes (those used with floppy-tape drivers), you can use a tool called ftformat that is either already included with your distribution or can be downloaded from ftp://sunsite.unc.edu/pub/Linux/kernel/tapes as part of the ftape package.
Creating one tar file per tape might be wasteful if the archive requires but a fraction of the capacity of the tape. In order to place more than one file on a tape, you must first prevent the tape from rewinding after each use, and you must have a way to position the tape to the next "file marker," for both tar file creation and extraction.
The way to do this is to use the nonrewinding tape devices, which are named /dev/nqft0, /dev/nqft1, and so on for floppy-tape drivers, and /dev/nst0, /dev/nst1, and so on for SCSI tapes. When this device is used for reading or writing, the tape will not be rewound when the device is closed (that is, once tar has completed). You can then use tar again to add another archive to the tape. The two tar files on the tape won't have anything to do with each other. Of course, if you later overwrite the first tar file, you may overwrite the second file or leave an undesirable gap between the first and second files (which may be interpreted as garbage). In general, don't attempt to replace just one file on a tape that has multiple files on it.
Using the nonrewinding tape device, you can add as many files to the tape as space permits. In order to rewind the tape after use, use the mt command. mt is a general-purpose command that performs a number of functions with the tape drive.
For example, the command:
retensions the tape by winding it to the end and then rewinding it.
When reading files on a multiple-file tape, you must use the nonrewinding tape device with tar and the mt command to position the tape to the appropriate file.
For example, to skip to the next file on the tape, use the command:
mt -f /dev/nqft0 fsf 1
This skips over one file on the tape. Similarly, to skip over two files, use:
mt -f /dev/nqft0 fsf 2
Be sure to use the appropriate nonrewinding tape device with mt. Note that this command does not move to "file number two" on the tape; it skips over the next two files based on the current tape position. Just use mt to rewind the tape if you're not sure where the tape is currently positioned. You can also skip back; see the mt(1) manual page for a complete list of options.
You need to use mt every time you read a multifile tape. Using tar twice in succession to read two archive files usually won't work; this is because tar doesn't recognize the file marker placed on the tape between files. Once the first tar finishes, the tape is positioned at the beginning of the file marker. Using tar immediately will give you an error message because tar will attempt to read the file marker. After reading one file from a tape, just use:
mt -f device fsf 1
to move to the next file.
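To tie the positioning steps together, here is a minimal sketch of a helper that extracts the Nth archive from a multi-file tape. The names extract_nth, run_or_print, and DRY_RUN are illustrative inventions, not part of tar or mt. The sketch also assumes an mt that takes the device through the -f option, the form used with mt in the cron examples later in this chapter.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN is set.
run_or_print() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: extract the Nth archive (counting from 1) from a
# multi-file tape. Pass a nonrewinding device such as /dev/nqft0.
extract_nth() {
    tape=$1
    n=$2

    # Start from a known position, then skip the archives before the Nth.
    run_or_print mt -f "$tape" rewind || return 1
    if [ "$n" -gt 1 ]; then
        run_or_print mt -f "$tape" fsf "$((n - 1))" || return 1
    fi
    run_or_print tar xvf "$tape"
}
```

Running it first with DRY_RUN=1 (for example, DRY_RUN=1 extract_nth /dev/nqft0 3) prints the mt and tar commands without touching the drive, which is a cheap way to check the positioning logic before committing to a slow tape operation.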
8.1.1.2 Backing up to floppy
Just as we saw in the last section, the command:
tar cvf /dev/fd0 /usr/src /etc /home
makes a backup of /usr/src, /etc, and /home to /dev/fd0, the first floppy device. You can then read the backup using a command, such as:
tar xvf /dev/fd0
Because floppies have a rather limited storage capacity, GNU tar allows you to create a "multivolume" archive. (This feature applies to tapes as well, but it is far more useful in the case of floppies.) With this feature, tar prompts you to insert a new volume after reading or writing each floppy. To use this feature, simply provide the M option to tar, as in:
tar cvMf /dev/fd0 /usr/src /etc /home
Be sure to label your floppies well, and don't get them out of order when attempting to restore the archive.
One caveat of this feature is that it doesn't support the automatic compression provided by the z and I options. However, there are various reasons why you may not want to compress your backups created with tar, as discussed later. At any rate, you can create your own multivolume backups using tar and gzip in conjunction with a program that reads and writes data to a sequence of floppies (or tapes), prompting for each in succession. One such program is backflops, available on several Linux distributions and on the FTP archive sites. A do-it-yourself way to accomplish the same thing is to write the backup archive to a disk file and use dd or a similar command to write the archive as individual chunks to each floppy. If you're brave enough to try this, you can figure it out for yourself.
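For readers who want a starting point, the do-it-yourself approach can be sketched with split in place of hand-computed dd offsets. This is only one possible arrangement, not a method the text prescribes: make_chunks and join_chunks are hypothetical helper names, and the chunk naming (.aa, .ab, and so on) is simply the default behavior of split.

```shell
# make_chunks ARCHIVE CHUNK_SIZE DIR...
# Writes a gzip-compressed tar file to ARCHIVE, then cuts it into pieces
# named ARCHIVE.aa, ARCHIVE.ab, and so on, each at most CHUNK_SIZE long.
make_chunks() {
    archive=$1
    size=$2
    shift 2
    tar czf "$archive" "$@" || return 1
    split -b "$size" "$archive" "$archive."
}

# join_chunks ARCHIVE
# Reassembles the pieces in name order into ARCHIVE.restored and lists
# the contents of the result as a sanity check.
join_chunks() {
    archive=$1
    cat "$archive".?? > "$archive.restored" || return 1
    tar tzf "$archive.restored"
}
```

With a chunk size of 1440k, each piece fits a 1.44 MB floppy and could be copied out with something like dd if=/tmp/backup.tar.gz.aa of=/dev/fd0, swapping disks between pieces. Keep in mind the caveat discussed in the next section: because the whole archive is compressed as one stream, a single bad floppy can make everything after it unreadable.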
8.1.1.3 To compress, or not to compress?
There are good arguments both for and against compression of tar archives when making backups. The overall problem is that neither tar nor the compression tools gzip and bzip2 are particularly fault-tolerant, no matter how convenient they are. Although compression using gzip or bzip2 can greatly reduce the amount of backup media required to store an archive, compressing entire tar files as they are written to floppy or tape makes the backup prone to complete loss if one block of the archive is corrupted, say, through a media error (not uncommon in the case of floppies and tapes). Most compression algorithms, gzip and bzip2 included, depend on the coherency of data across many bytes in order to achieve compression. If any data within a compressed archive is corrupt, gunzip may not be able to uncompress the file from that point on, making it completely unreadable to tar.
This is much worse than if the tar file were uncompressed on the tape. Although tar doesn't provide much protection against data corruption within an archive, if there is minimal corruption within a tar file, you can usually recover most of the archived files with little trouble, or at least those files up until the corruption occurs. Although far from perfect, it's better than losing your entire backup.
A better solution is to use an archiving tool other than tar to make backups. Several options are available. cpio is an archiving utility that packs files together, similar in fashion to tar. However, because of the simpler storage method used by cpio, it recovers cleanly from data corruption in an archive. (It still doesn't handle errors well on gzipped files.)
The best solution may be to use a tool such as afio. afio supports multivolume backups and is similar in some respects to cpio. However, afio includes compression and is more reliable because each individual file is compressed. This means that if data on an archive is corrupted, the damage can be isolated to individual files, instead of to the entire backup.
These tools should be available with your Linux distribution, as well as from all the Internet-based Linux archives. A number of other backup utilities, with varying degrees of popularity and usability, have been developed or ported for Linux. If you're serious about backups, you should look into them. Among those programs are the freely available taper, tob, and Amanda, as well as commercial programs like ARKEIA (free for use with up to two computers), BRU, and Arcserve. Lots of free backup tools can also be found at http://velocom.linux.tucows.com/system/backup.html.
8.1.2 Incremental Backups
Incremental backups, as described earlier in this chapter, are a good way to keep your system backups up-to-date. For example, you can take nightly backups of only those files that changed in the last 24 hours, weekly backups of all files that changed in the last week, and monthly backups of the entire system.
You can create incremental backups using the tools mentioned previously: tar, gzip, cpio, and so on. The first step in creating an incremental backup is to produce a list of files that changed since a certain amount of time ago. You can do this easily with the find command. If you use a special backup program, you will most likely not have to do this, but set some option somewhere that you want to do an incremental backup.
For example, to produce a list of all files that were modified in the last 24 hours, we can use the command:
find / -mtime -1 \! -type d -print > /tmp/filelist.daily
The first argument to find is the directory to start from (here, /, the root directory). The -mtime -1 option tells find to locate all files that changed in the last 24 hours.
The \! -type d is complicated (and optional), but it cuts some unnecessary stuff from your output. It tells find to exclude directories from the resulting file list. The ! is a negation operator (meaning here, "exclude files of type d"), but put a backslash in front of it because otherwise the shell interprets it as a special character.
The -print causes all filenames matching the search to be printed to standard output. We redirect standard output to a file for later use. Likewise, to locate all files that changed in the last week, use:
find / -mtime -7 -print > /tmp/filelist.weekly
Note that if you use find in this way, it traverses all mounted filesystems. If you have a CD-ROM mounted, for example, find attempts to locate all files on the CD-ROM as well (which you probably do not wish to back up). The -prune option can be used to exclude certain directories from the walk that find performs across the system; or, you can use find multiple times with a first argument other than /. See the manual page for find(1) for details.
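As an illustration of -prune, the following sketch wraps the weekly find in a function that skips one subtree. The function name list_changed is hypothetical, and the mount point /mnt/cdrom in the usage note is an assumption; substitute wherever your CD-ROM is actually mounted.

```shell
# list_changed TOP SKIP_DIR DAYS
# Prints non-directory files under TOP that changed in the last DAYS
# days, without descending into SKIP_DIR.
list_changed() {
    top=$1
    skip=$2
    days=$3
    # -prune stops find from walking into SKIP_DIR; every other path
    # falls through to the tests that follow -o.
    find "$top" -path "$skip" -prune -o -mtime "-$days" \! -type d -print
}
```

For example, list_changed / /mnt/cdrom 7 > /tmp/filelist.weekly would build the weekly list while leaving the CD-ROM out of the walk.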
Now you have produced a list of files to back up. Previously, when using tar, we have specified the files to archive on the command line. However, this list of files may be too long for a single command line (which is usually limited to around 2048 characters), and the list itself is contained within a file.
You can use the -T option with tar to specify a file containing a list of files for tar to back up. In order to use this option, you have to use an alternate syntax to tar in which all options are specified explicitly with dashes. For example, to back up the files listed in /tmp/filelist.daily to the device /dev/qft0, use the command:
tar -cv -T /tmp/filelist.daily -f /dev/qft0
You can now write a short shell script that automatically produces the list of files and backs them up using tar. You can use cron to execute the script nightly at a certain time; all you have to do is make sure there's a tape in the drive. You can write similar scripts for your weekly and monthly backups. cron is covered in the next section.
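A script of the kind just described might look like the following sketch. It is not the authors' script: the function name daily_backup is hypothetical, and using mktemp for the list file (instead of the fixed /tmp/filelist.daily used earlier) is a design choice made here so that two runs cannot clobber each other's list.

```shell
# daily_backup SRC DEST
# Archives files under SRC modified in the last 24 hours to DEST, which
# may be a tape device such as /dev/qft0 or an ordinary file.
daily_backup() {
    src=$1
    dest=$2
    list=$(mktemp) || return 1

    # Build the file list, leaving out directories.
    find "$src" -mtime -1 \! -type d -print > "$list"

    # Archive only if something changed; -T reads the names from the list.
    if [ -s "$list" ]; then
        tar -c -T "$list" -f "$dest"
        status=$?
    else
        status=0
    fi

    rm -f "$list"
    return "$status"
}
```

A nightly job could then call daily_backup / /dev/qft0. A weekly variant would differ only in the -mtime argument.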
8.2 Scheduling Jobs Using cron
The original purpose of the computer was to automate routine tasks. If you must back up your disk at 1:00 A.M. every day, why should you have to enter the commands manually each time, particularly if it means getting out of bed? You should be able to tell the computer to do it and then forget about it. On Unix systems, cron exists to perform this automating function. Briefly, you use cron by running the crontab command and entering lines in a special format recognized by cron. Each line specifies a command to run and when to run it.
Behind your back, crontab saves your commands in a file bearing your username in the /var/spool/cron/crontabs directory. (For instance, the crontab file for user mdw would be called /var/spool/cron/crontabs/mdw.) A daemon called crond reads this file regularly and executes the commands at the proper times. One of the rc files on your system starts up crond when the system boots. There actually is no command named cron, only the crontab utility and the crond daemon.
On some systems, use of cron is limited to the root user. In any case, let's look at a useful command you might want to run as root and show how you'd specify it as a crontab entry. Suppose that every day you'd like to clean old files out of the /tmp directory, which is supposed to serve as temporary storage for files created by lots of utilities.
Notice that cron never writes anything to the console. All output and error messages are sent as an email message to the user who owns the corresponding crontab. You can override this setting by specifying MAILTO=address in the crontab file before the jobs themselves.
Most systems remove the contents of /tmp when the system reboots, but if you keep it up for a long time, you may find it useful to use cron to check for old files (say, files that haven't been accessed in the past three days). The command you want to enter is:
ls -l filename
But how do you know which filename to specify? You have to place the command inside a find command, which lists all files beneath a directory and performs the operation you specify on each one.
We've already seen the find command in Section 8.1.2. Here, we'll specify /tmp as the directory to search, and use the -atime option to find files whose last access time is more than three days in the past. The -exec option means "execute the following command on every file we find":
find /tmp \! -type d -atime +3 -exec ls -l {} \;
The command we are asking find to execute is ls -l, which simply shows details about the files. (Many people use a similar crontab entry to remove files, but this is hard to do without leaving a security hole.) The funny string {} is just a way of saying "Do it to each file you find, according to the previous selection material." The string \; tells find that the -exec option is finished.
Now we have a command that looks for old files on /tmp. We still have to say how often it runs. The format used by crontab consists of six fields:
minute hour day month dayofweek command
Fill the fields as follows:
1. Minute (specify from 0 to 59)
2. Hour (specify from 0 to 23)
3. Day of the month (specify from 1 to 31)
4. Month (specify from 1 to 12, or a name such as jan, feb, and so on)
5. Day of the week (specify from 0 to 6 where 0 is Sunday, or a name such as mon, tue, and so on)
6. Command (can be multiple words)
Figure 8-1 shows a cron entry with all the fields filled in. The command is a shell script, run with the Bourne shell sh. But the entry is not too realistic: it restricts both the day of the month (the 15th) and the day of the week (Sunday) in January and July. Be aware that the cron found on most Linux systems runs a command when either of these two day fields matches if both are restricted, so such an entry fires on every Sunday in those months as well as on the 15th, rather than only on a Sunday that falls on the 15th. So this is not a particularly useful example.
Figure 8-1 Sample cron entry
If you want a command to run every day at 1:00 A.M., specify the minute as 0 and the hour as 1. The other three fields should be asterisks, which mean "every day and month at the given time." The complete line in crontab is:
0 1 * * * find /tmp -atime +3 -exec ls -l {} \;
Because you can do a lot of fancy things with the time fields, let's play with this command a bit more. Suppose you want to run the command just on the first day of each month. You would keep the first two fields, but add a 1 in the third field:
0 1 1 * * find /tmp -atime +3 -exec ls -l {} \;
To do it once a week on Monday, restore the third field to an asterisk but specify either 1 or mon as the fifth field:
0 1 * * mon find /tmp -atime +3 -exec ls -l {} \;
To get even more sophisticated, there are ways to specify multiple times in each field. Here, a comma means "run on the 1st and 15th day" of each month:
0 1 1,15 * * find /tmp -atime +3 -exec ls -l {} \;
while a hyphen means "run every day from the 1st through the 15th, inclusive":
0 1 1-15 * * find /tmp -atime +3 -exec ls -l {} \;
and a slash followed by a 5 means "run every fifth day," which comes out to the 1st, 6th, 11th, and so on:
0 1 */5 * * find /tmp -atime +3 -exec ls -l {} \;
Now we're ready to actually put the entry in our crontab file Become root (because this is the kind of thing root should do) and enter the crontab command with the -e option for
"edit":
rutabaga# crontab -e
By default, this command starts a vi edit session. If you'd like to use Emacs instead, you can specify this before you start crontab. For a Bourne-compliant shell, enter the command:
rutabaga# export VISUAL=emacs
For the C shell:
rutabaga# setenv VISUAL emacs
The environment variable EDITOR also works in place of VISUAL for some versions of crontab. Enter a line or two beginning with hash marks (#) to serve as comments explaining what you're doing, then put in your crontab entry:
# List files on /tmp that have not been accessed for more than 3 days.
# Runs at 1:00 AM each morning.
0 1 * * * find /tmp -atime +3 -exec ls -l {} \;
When you exit vi, the commands are saved. Look at your crontab entry by entering:
rutabaga# crontab -l
We have not yet talked about a critical aspect of our crontab entry: where does the output go? By default, cron saves up the standard output and standard error and sends them to the user as a mail message. In this example, the mail goes to root, but that should automatically be directed to you as the system administrator. Make sure the following line appears in /usr/lib/aliases (/etc/aliases on SuSE, Debian, and Red Hat):
root: your-account-name
In a moment, we'll show what to do if you want output saved in a file instead of being mailed to you.
Here's another example of a common type of command used in crontab files. It performs a tape backup of a directory. We assume that someone has put a tape in the drive before the command runs. First, an mt command makes sure the tape in the /dev/qft0 device is rewound to the beginning. Then a tar command transfers all the files from the directory /src to the tape. A semicolon is used to separate the commands; that is standard shell syntax:
# back up the /src directory once every two months
0 2 1 */2 * mt -f /dev/qft0 rewind; tar cf /dev/qft0 /src
The first two fields ensure that the command runs at 2:00 A.M., and the third field specifies the first day of the month. The fourth field specifies every two months. We could achieve the same effect, in a possibly more readable manner, by entering:
0 2 1 jan,mar,may,jul,sep,nov * mt -f /dev/qft0 rewind; tar cf /dev/qft0 /src
Section 8.1 explains how to perform backups on a regular basis.
The following example uses mailq every two days to test whether any mail is stuck in the mail queue, and sends the mail administrator the results by mail. If mail is stuck in the mail queue, the report includes details about addressing and delivery problems, but otherwise the message is empty:
0 6 */2 * * mailq -v | mail -s "Tested Mail Queue for Stuck Email" postmaster
Probably you don't want to receive a mail message every day when everything is going normally. In the examples we've used so far, the commands do not produce any output unless they encounter errors. But you may want to get into the habit of redirecting the standard output to /dev/null, or sending it to a log file like this (note the use of two > signs so that we don't wipe out previous output):
0 1 * * * find /tmp -atime +3 -exec ls -l {} \; >> /home/mdw/log
In this entry, we redirect the standard output, but allow the standard error to be sent as a mail message. This can be a nice feature because we'll get a mail message if anything goes wrong. If you want to make sure you don't receive mail under any circumstances, redirect both the standard output and the standard error to a file:
0 1 * * * find /tmp -atime +3 -exec ls -l {} \; >> /home/mdw/log 2>&1
When you save output in a log file, you get the problem of a file that grows continuously. You may want another cron entry that runs once a week or so, just to remove the file.
Only Bourne shell commands can be used in crontab entries. That means you can't use any of the convenient extensions recognized by bash and other modern shells, such as aliases or the use of ~ to mean "my home directory." You can use $HOME, however; cron recognizes the $USER, $HOME, and $SHELL environment variables. Each command runs with your home directory as its current directory.
Some people like to specify absolute pathnames for commands, like /usr/bin/find and /bin/rm, in crontab entries. This ensures that the right command is always found, instead of relying on the path being set correctly.
If a command gets too long and complicated to put on a single line, write a shell script and invoke it from cron. Make sure the script is executable (use chmod +x) or execute it by using a shell, such as:
0 1 * * * sh runcron
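The text does not show what runcron contains. As a purely hypothetical illustration, a script invoked this way could wrap the find command from earlier in this section in a function, which keeps the crontab line short and lets you test the logic by hand. The name list_stale and its two arguments are inventions for this sketch.

```shell
#!/bin/sh
# Hypothetical body for a script such as runcron.
#
# list_stale DIR DAYS
# Shows details of files under DIR that have not been accessed for more
# than DAYS days. In a real cron script you might spell out
# /usr/bin/find and /bin/ls so the job does not depend on cron's PATH.
list_stale() {
    dir=$1
    days=$2
    find "$dir" \! -type d -atime "+$days" -exec ls -l {} \;
}

# The script would end with the call it is meant to perform, e.g.:
# list_stale /tmp 3
```

Because cron starts each job in your home directory, the entry above finds runcron only if the script lives there; otherwise give its full path.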
As a system administrator, you often have to create crontab files for dummy users, such as news or UUCP. Running all utilities as root would be overkill and possibly dangerous, so these special users exist instead.
The choice of a user also affects file ownership: a crontab file for news should run files owned by news, and so on. In general, make sure utilities are owned by the user in whose name you create the crontab file.
As root, you can edit other users' crontab files by using the -u option For example:
tigger # crontab -u news -e
This is useful because you can't log in as user news, but you still might want to edit this user's crontab entry.
8.3 Managing System Logs
The syslogd utility logs various kinds of system activity, such as debugging output from sendmail and warnings printed by the kernel. syslogd runs as a daemon and is usually started in one of the rc files at boot time.
The file /etc/syslog.conf is used to control where syslogd records information. Such a file might look like the following (even though they tend to be much more complicated on most systems):
facility.level [; facility.level ]
where facility is the system application or facility generating the message, and level is the severity of the message.
For example, facility can be mail (for the mail daemon), kern (for the kernel), user (for user programs), or auth (for authentication programs such as login or su). An asterisk in this field specifies all facilities.
level can be (in increasing severity): debug, info, notice, warning, err, crit, alert, or emerg.
In the previous /etc/syslog.conf, we see that all messages of severity info and notice are logged to /var/log/messages, all debug messages from the mail daemon are logged to /var/log/maillog, and all warn messages are logged to /var/log/syslog. Also, any emerg warnings from the kernel are sent to the console (which is the current virtual console, or an xterm started with the -C option).
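The sample file that this description refers to is not reproduced above. As a reconstruction from the description alone (an assumption, not the authors' original listing), a syslog.conf with these four rules might read as follows. The column spacing and the use of /dev/console as the console destination are guesses.

```
*.info;*.notice    /var/log/messages
mail.debug         /var/log/maillog
*.warn             /var/log/syslog
kern.emerg         /dev/console
```

Each line pairs one or more facility.level selectors with a destination. Some syslogd implementations accept only tabs, not spaces, between the two columns, so check syslog.conf(5) on your system before copying a layout like this.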
The messages logged by syslogd usually include the date, an indication of what process or facility delivered the message, and the message itself, all on one line. For example, a kernel error message indicating a problem with data on an ext2fs filesystem might appear in the log files, as in:
Dec 1 21:03:35 loomer kernel: EXT2-fs error (device 3/2):
ext2_check_blocks_bitmap: Wrong free blocks count in super block,
stored = 27202, counted = 27853
Similarly, if an su to the root account succeeds, you might see a log message, such as:
Dec 11 15:31:51 loomer su: mdw on /dev/ttyp3
Log files can be important in tracking down system problems. If a log file grows too large, you can empty it using cat /dev/null > logfile. This clears out the file, but leaves it there for the logging system to write to.
Your system probably comes equipped with a running syslogd and an /etc/syslog.conf that does the right thing. However, it's important to know where your log files are and what programs they represent. If you need to log many messages (say, debugging messages from the kernel, which can be very verbose) you can edit syslog.conf and tell syslogd to reread its configuration file with the command:
kill -HUP `cat /var/run/syslog.pid`
Note the use of backquotes to obtain the process ID of syslogd, contained in /var/run/syslog.pid.
Other system logs might be available as well. These include:
/var/log/wtmp
This file contains binary data indicating the login times and duration for each user on the system; it is used by the last command to generate a listing of user logins. The output of last might look like:
mdw tty3 Sun Dec 11 15:25 still logged in
mdw tty3 Sun Dec 11 15:24 - 15:25 (00:00)
mdw tty1 Sun Dec 11 11:46 still logged in
reboot ~ Sun Dec 11 06:46
A record is also logged in /var/log/wtmp when the system is rebooted.
/var/run/utmp
This is another binary file that contains information on users currently logged into the system. Commands such as who, w, and finger use this file to produce information on who is logged in. For example, the w command might print:
3:58pm up 4:12, 5 users, load average: 0.01, 0.02, 0.00
User tty login@ idle JCPU PCPU what
mdw ttyp3 11:46am 14 -
mdw ttyp2 11:46am 1 w
mdw ttyp4 11:46am kermit
mdw ttyp0 11:46am 14 bash
We see the login times for each user (in this case, one user logged in many times), as well as the command currently being used. The w(1) manual page describes all the fields displayed.
/var/log/lastlog
This file is similar to wtmp but is used by different programs (such as finger) to determine when a user was last logged in.
Note that the format of the wtmp and utmp files differs from system to system. Some programs may be compiled to expect one format and others another format. For this reason, commands that use the files may produce confusing or inaccurate information, especially if the files become corrupted by a program that writes information to them in the wrong format.
Log files can get quite large, and if you do not have the necessary hard-disk space, you have to do something about your partitions being filled too fast. Of course, you can delete the log files from time to time, but you may not want to do this, because the log files also contain information that can be valuable in crisis situations.
One option is to copy the log files from time to time to another file and compress this file. The log file itself starts at 0 again. A short shell script can do this for a log file.
First, we move the log file to a different name and then truncate the original file to 0 bytes by copying to it from /dev/null. We do this so that further logging can be done without problems while the next steps are done. Then, we compute a date string for the current date that is used as a suffix for the filename, rename the backup file, and finally compress it with gzip.
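The steps just described can be sketched as follows. The script itself is not reproduced here, so this is a reconstruction under assumptions: the function name rotate_log, the -backup intermediate name, and the month-day-year date format are all choices made for this sketch.

```shell
# rotate_log LOGFILE
# Saves LOGFILE as LOGFILE-<date>.gz and leaves an empty LOGFILE behind.
rotate_log() {
    log=$1

    # Move the log aside, then recreate it empty by copying /dev/null.
    mv "$log" "$log-backup" || return 1
    cp /dev/null "$log"

    # Date suffix for the saved copy (month, day, two-digit year).
    curdate=$(date +%m%d%y)

    # Give the backup its dated name, then compress it. The -f flag
    # lets a second run on the same day overwrite the earlier copy.
    mv "$log-backup" "$log-$curdate" || return 1
    gzip -f "$log-$curdate"
}
```

A call such as rotate_log /var/log/messages would be typical. Depending on your syslogd, the daemon may keep writing to the renamed file until it is told to reopen its files, for example with the kill -HUP command shown earlier in this section.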
You might want to run this small script from cron, but as it is presented here, it should not be run more than once a day; otherwise the compressed backup copy will be overwritten because the filename reflects the date but not the time of day (of course, you could change the date format string to include the time). If you want to run this script more often, you must use additional numbers to distinguish between the various copies.
You could make many more improvements here. For example, you might want to check the size of the log file first and copy and compress it only if this size exceeds a certain limit. Even though this is already an improvement, your partition containing the log files will eventually get filled. You can solve this problem by keeping around only a certain number of compressed log files (say, 10). When you have created as many log files as you want to have, you delete the oldest, and overwrite it with the next one to be copied. This principle is also called log rotation. Some distributions have scripts like savelog or logrotate that can do this automatically.
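If you are rolling your own rotation rather than using savelog or logrotate, the "keep only the newest few" step might be sketched like this. The function name prune_logs is hypothetical, and the sketch assumes the saved copies are named LOGFILE-suffix.gz, with no newlines in their names.

```shell
# prune_logs LOGFILE KEEP
# Deletes all but the KEEP most recently modified LOGFILE-*.gz copies.
prune_logs() {
    log=$1
    keep=$2
    # ls -t lists the newest copy first; tail skips the first KEEP names
    # and passes the older ones on for removal.
    ls -t "$log"-*.gz 2>/dev/null | tail -n "+$((keep + 1))" |
    while IFS= read -r old; do
        rm -f -- "$old"
    done
}
```

Calling prune_logs /var/log/messages 10 after each rotation would hold the archive at ten compressed copies.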
To finish this discussion, it should be noted that most recent distributions, such as SuSE, Debian, and Red Hat, already have built-in cron scripts that manage your log files and are much more sophisticated than the small one presented here.
8.4 Managing Print Services
Linux has a fairly complicated printing system, compared to the printing services most PCs use. It allows many users to print documents at the same time, and each user can send documents from one or more applications without waiting for the previous document to finish printing. The printing system processes the files to be printed correctly on different kinds of printers connected to the computer in different ways. If you print on a network, files can be created on one host and printed out on a printer controlled by another host.
Before we go into the inner workings of the Linux printing system, we would like to point you to www.linuxprinting.org, a very comprehensive site with information about printing on Linux. If you have problems or questions concerning printing that this chapter cannot answer, this site should be your next stop.
The whole process happens without much fuss when you press the Print button in an application or issue a command, such as lpr, to print a document. That document does not go directly to the printer, though, because it might already be busy. Instead, the document is stored in a temporary file in a directory called the printer spool directory. As the word "spool" suggests, the documents get taken out of the directory one by one as the printer becomes free. Each printer has its own spool directory.
When Linux starts, it sets up a printer daemon (an independently running process) called lpd. This process waits around, checking each spool directory for files that should be printed. When the process finds a file, it makes a copy of itself. The new lpd takes control of the print spool where the file was placed and queues it for printing. It won't send the next file to that printer until the last file has finished printing. The master lpd starts an lpd for each spooling directory on the system when a file is sent to it, so there may be as many lpd daemons running as the number of active spooling directories, plus the master lpd. Each subordinate lpd stays around until its spool directory is empty.
Your Linux installation process associates the printer port on your system to a device named in the /dev directory. You must then link that device name to the convenient printer names you use in your commands; that's the role of the printer capability file called /etc/printcap.
Another key task in printer management is to make sure you have filters in place for lpd to use when formatting documents for printing. These filters are also specified in the /etc/printcap file, and we'll talk a lot about them in this section.
There are several printer-support packages for Linux. Most distributions use the BSD-derived package that contains the lpd printer daemon. These packages include a set of utilities and manpage documents to support traditional Unix-style printing on Linux. The BSD printing system doesn't have as many administrative tools or user controls as, for example, the System V Unix printer-management system (which uses the lpsched or lprng daemon), but each user controls the files that she sends to the printer. This section describes installation and configuration of the BSD printer-support package. (The various printing utilities are described in Section 9.6 in Chapter 9.)
There is a new system called the Common Unix Printing System (CUPS) that is bound to take over the Linux (if not Unix) printing world. At this point, very few distributions come with CUPS preinstalled (the BSD printing system is still ubiquitous), which is why we concentrate on the older system here. We'll look at CUPS in brief later in this chapter, though.
Some Linux distributions provide a printer-management tool that simplifies printer installation and management through a GUI. These tools are documented by the vendor that supplies them. They manage printing by controlling the same tools and files we are about to describe, but with less fine control. They can save you a lot of trouble getting started, but they don't always get things right. If you want to correct an installation set up through these tools or want to improve on their performance, you still should work through the procedures in this section.
8.4.1 Checking Printer Hardware
Before you set up printer services, be sure the printing devices are online. If you also use another operating system, such as Microsoft Windows, you can exercise the hardware to ensure that it is connected properly and working before loading Linux. Successfully printing a document from another operating system immediately eliminates one major source of woe and head scratching. Similarly, if you are going to use printer services on a network, your system should be on the network and all protocols functioning before proceeding.
A word about the so-called GDI printers (or Windows printers) is in order here. GDI printers are really brain-damaged printers in the true sense of the word: their "brain," the internal processing unit that builds up a page from the data sent to it, has been removed; this task is performed in the printer driver on the computer itself. The printer itself consists only of the actual printing hardware and a very small amount of software that controls the hardware. Of course, drivers for these printers are typically available only for Microsoft Windows systems (where the graphics subsystem is called GDI, which is where the name comes from), so there is hardly any hope of getting such a printer to work with Linux.
Install printer services as the root user, with superuser privileges. The superuser is the only user besides the lpd print daemon able to write directly to a printer by directing output to the corresponding output device. Other users cannot send output directly to the printer and must instead use the printer utilities to handle printing tasks.
Before you get started, you can abuse your root privileges to verify that your system's assigned device files actually have a valid link to the physical device. Just send a brief ASCII test file directly to the printer by redirection. For instance, if you have a printer on your first parallel port, its device name is probably either /dev/lp0 or /dev/lp1, depending on your installation. The following command outputs some text suited for testing a printer setup, which you can redirect to your printer. (If you have an early PostScript printer, you may need instead to send it a small PostScript test file to prevent it from getting confused. Newer PostScript printers can often perform this conversion themselves.)
lptest > /dev/lp1
The lptest utility (which may not be available on all distributions) is designed to conveniently exercise an ASCII printer or terminal to make sure it is working correctly. It sends a prepared file composed of the 96 ASCII characters in a sequence that creates a "ripple" or "barber-pole" output effect. The default output of lptest on Linux is 16,000 characters arrayed in 79-character lines, long enough to require more than one page to print. If you run lptest with no arguments, it writes to standard output (your screen), and you can see what should be sent to the printer. The lptest command allows you to trim the width of the output column and to limit the number of output lines. For example, to display an output 35 characters wide, limited to six lines, you would enter:
lptest 35 6
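The output should look much like this:
!"#$%&'()*+,-./0123456789:;<=>?@ABC
"#$%&'()*+,-./0123456789:;<=>?@ABCD
#$%&'()*+,-./0123456789:;<=>?@ABCDE
$%&'()*+,-./0123456789:;<=>?@ABCDEF
%&'()*+,-./0123456789:;<=>?@ABCDEFG
&'()*+,-./0123456789:;<=>?@ABCDEFGH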
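Because lptest may be missing on some distributions, the ripple pattern can be approximated with a few lines of awk. The following is only a sketch under stated assumptions: ripple is a hypothetical name, the defaults of 79 columns and 200 lines mirror the figures quoted above, and the cycle covers the printable ASCII range from space through tilde, which may differ in detail from your lptest.

```shell
# ripple WIDTH LINES: hypothetical stand-in for lptest (not the real utility).
ripple() {
    awk -v len="${1:-79}" -v count="${2:-200}" 'BEGIN {
        first = 32                          # one below the first printed character
        for (i = 0; i < count; i++) {
            if (++first == 127) first = 32  # wrap around after the tilde
            c = first
            line = ""
            for (j = 0; j < len; j++) {
                line = line sprintf("%c", c)
                if (++c == 127) c = 32
            }
            print line                      # each line starts one character later
        }
    }'
}

ripple 35 6
```

As with lptest itself, the output can be redirected to a printer device by root, for example ripple 35 6 > /dev/lp1.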
Of course, you can also use the cat command to direct a file to the printer. To send a PostScript test file to a PostScript printer, for example, type:
cat testfile.ps > /dev/lp1
If you have a serial printer, try directing output to the serial port to which it is connected. For the first serial port (COM1 in MS-DOS) try something like:
lptest > /dev/ttys0
or:
lptest > /dev/ttyS0
Make sure you send to the correct serial port; don't try to output the file to a serial mouse, for example. If your serial printer is on, say, the second serial port, it is addressed as /dev/ttyS1 or /dev/ttys1.
If you have a page printer that buffers a partial page, after it stops printing you may need to take the printer offline and press the Form Feed button to get it to print the last partial page. Don't forget to put the printer back online afterward. (A permanent solution is to get Linux to send the formfeed character to the printer, either by forcing it through the /etc/printcap entry for the printer or by having the printer filter append it to the end of the file. We'll discuss these options later.)
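For a quick manual test, the form feed can also be appended by hand. The sketch below is not the printcap or filter mechanism mentioned above; with_formfeed is a hypothetical helper that simply copies its input and then emits one form feed character (ASCII 0x0C).

```shell
# with_formfeed [FILE...]: hypothetical helper, not part of the lpd package.
with_formfeed() {
    cat "$@"          # copy the named files, or standard input if none are given
    printf '\f'       # then emit a single form feed so the last page is ejected
}
```

As root, something like with_formfeed testfile.txt > /dev/lp1 would send the file followed by the form feed; the file and device names here are placeholders.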
If your little test resulted in "laddered" text (each line beginning where the previous one ended, so that the text steps down and to the right) and then continued off the page, the printer did not insert a carriage return at the end of each line. The remedy is usually a printer setting; look for it in the printer manual. (Be careful about changing your printer characteristics if you use multiple operating systems.)
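Where the printer setting cannot be changed, laddering can also be worked around in software by ending every line with a carriage return plus line feed. This is a minimal sketch, assuming plain ASCII input with Unix line endings; add_cr is a hypothetical name and is not part of the lpd package.

```shell
# add_cr [FILE...]: rewrite each line so that it ends in CR LF instead of LF.
# Input that already uses CR LF would end up with a doubled carriage return.
add_cr() {
    awk '{ printf "%s\r\n", $0 }' "$@"
}
```

A test run as root might look like add_cr testfile.txt > /dev/lp1, again with placeholder names.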
Laddering won't be an issue if you have a printer using a page-description language, such as PostScript (the universally used page-layout language from Adobe), and you always filter plain text into that output form before printing. Filtering is described later in this chapter.
8.4.2 Gathering Resources
OK, you have your printer hardware set up and connected. You should collect a hardcopy of your resource documents (at least the manpages for the print utilities and files described here, and the Printing HOWTO file). Also, it is useful to have the technical specifications for your printer. Often these are no longer provided when you buy your printer, but you can usually download the information you need from an FTP site or a web site operated by the printer manufacturer. While you are retrieving such information, look around and see if there is documentation (such as a description of printer error messages and their meanings) that can help you troubleshoot and manage your printer. Most printer manufacturers also offer a technical manual for the printer that you can buy. This may or may not be the same volume as the service manual.
For example, on the Hewlett-Packard web site, http://www.hp.com/cposupport/software.html, you can retrieve printer technical data sheets; product specs; port-configuration information; PostScript, PCL, and HP-GL files for testing your printers (and filters); descriptions of printer control sequences you can pass to the printer to control its behavior; and documents telling
how you can integrate lpd-based printing services with HP's JetAdmin package (and thereby
with Netware-networked printers as well).
Now, before starting, take a deep breath; be patient with yourself. Printer services configuration is a skill that takes time to develop. If you have sufficiently standard equipment and successfully use one of the new-fangled printer management utilities to quickly and efficiently install and configure printer services, celebrate! Then note that you can probably fine-tune the installation for better performance by applying the procedures we describe next, and perhaps by using filters and utilities written specifically to support all the features of your printer model. If you decide to revise a successful printer installation, make sure you take notes on the changes you make so that you can find your way back if your changes don't do what you expected.
8.4.3 Choosing Printer Software
In order to print from Linux, you need to install the BSD print system (or an alternative system). This provides basic tools, but it does not support modern printers. Indeed, it was designed to support line printers (hence the name "line printing daemon") and devices common to the computer rooms of the 1960s and 1970s. In order to support modern printers, powerful supplemental packages provide the features most users think of as essential. (The ftp://ftp.ibiblio.org FTP site and its mirrors archive the packages we mention here.)
In this section, we discuss some important packages to support modern print services. We assume your system will have at least the groff formatter, the Ghostscript page-formatting package, and the GNU Enscript filter packages, which are described in Chapter 9. Most Linux distributions already include these as well as other formatting and printing utilities. If yours does not, you can retrieve them from the usual Linux FTP sites or take them from the CD-ROM of another distribution.
It matters where you get formatting and filtering packages. If you receive Ghostscript from a European distribution, for example, it probably defaults to an A4 paper format rather than the 8.5x11-inch paper format kept in U.S. binary archives. In either case, you can easily override the default through an lpr option passed to the filter. Alternatively, you can build the tools from source.
The trend in printer technology is away from character-oriented output and toward adoption of a page-description language (PDL) that provides sophisticated graphics and font control. By far the most popular of the PDLs is PostScript, which has been widely adopted in the Unix and Internet communities. A major reason for its acceptance is Ghostscript, a PostScript implementation copyrighted by Aladdin Enterprises. A version is also distributed under the GNU General Public License through the Free Software Foundation, along with a large font library that can be used with either version and with other PostScript interpreters. Ghostscript is indispensable if you do any kind of printing besides character-based output, and it is easily extensible.
Ghostscript implements almost all the instructions of the PostScript language and supports viewer utilities, such as Ghostview, that allow PostScript documents to be displayed in an X window. Similarly, excellent filters are readily available that convert PostScript output into other printing languages, such as Hewlett-Packard's PCL, and into forms printable as raster output on inkjet, dot matrix, and laser printers. The Ghostscript package supports Adobe Type 1 and 3 PostScript fonts and provides a number of utilities for graphics format conversion and filtering. It can even generate PDF files, that is, files that conform to the Adobe Portable Document Format specification.
Ghostscript may be insufficient to use by itself, however, because it doesn't provide printer control to switch between PostScript and text modes. Although Ghostscript does provide a filter that provides this capability (and more), the nenscript filter meets the tests of simplicity, flexibility, and reliability for most systems, so we document it here.
A typical Linux formatting and printing system might primarily use groff to format documents, creating PostScript output that is then processed by Ghostscript for printing and display.
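The groff-to-Ghostscript chain can be sketched as a short pipeline. This is an illustration under stated assumptions rather than a recommended setup: the function names are hypothetical, ljet4 and letter are placeholder values for the Ghostscript output device and paper size, and groff and gs must both be installed for the second function to run.

```shell
# gs_options [DEVICE] [PAPER]: print one Ghostscript option per line.
# -sPAPERSIZE is where an A4 or letter default can be overridden.
gs_options() {
    printf '%s\n' -q -dSAFER -dNOPAUSE -dBATCH \
        "-sDEVICE=${1:-ljet4}" "-sPAPERSIZE=${2:-letter}" -sOutputFile=-
}

# troff_to_printer FILE [DEVICE] [PAPER]: format FILE with groff, then let
# Ghostscript turn the PostScript into printer data on standard output.
troff_to_printer() {
    # The option list is deliberately unquoted so that it splits into words.
    groff -Tps "$1" | gs $(gs_options "$2" "$3") -
}
```

For example, troff_to_printer report.ms ljet4 a4 > report.pcl would produce a file of printer data for later spooling; report.ms is a placeholder input file.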
8.4.4 Checking Print Utilities
You probably also want to install the TeX formatting package. Even if you do not install the full TeX distribution, you should at least install the xdvi utility, in order to view TeX output and (processed) Texinfo files in an X window (unless you have installed the KDE Desktop Environment, which contains a more user-friendly replacement called kdvi). Other filters can process device-independent (DVI) output into forms such as PostScript (dvips) or PCL (dvilj) if you have an aversion to the Ghostscript package or need to use native printer fonts for efficient data transfer and rapid printing.
The Lout package is also worthy of consideration as an efficient and compact package to format text documents for PostScript output. It supports Level 2 PostScript and the Adobe Structuring Conventions, takes comparatively little memory, and comes with good enough documentation to learn quickly. Lout doesn't create an intermediate output form; it goes directly from markup input to PostScript output.
To support graphics work and X Window System utilities, you probably want to install other tools, some of which probably come with your distribution. A collection of current versions of the most popular print support packages for Linux can be found at the ftp://ftp.ibiblio.org Linux archive, in /pub/Linux/system/printing. The netpbm and pbmplus packages support a large variety of graphics file format conversions. (Such formats have to be converted to PostScript before you try to print them.) The Ghostview package provides display tools to view PostScript files in an X Window System environment, and also provides PostScript and PDF support for other packages, such as your web browser.
The ImageMagick package, described in Chapter 9, deserves special mention. It lets you display a large number of graphics formats in an X window and convert many file formats to other file formats. (It uses Ghostview and Ghostscript when it needs to display a PostScript image.) Most of the graphics files you can print you can also display using ImageMagick.
A "magic" filter package may also save you much grief in configuring and supporting different document output formats. We will touch on the APSfilter magic filter package, but you may prefer the Magic-Filter package instead. Both are available at the ftp://ftp.ibiblio.org FTP archive. For more on magic filters, see Section 8.4.9 later in this chapter.
If you want to support fax devices, you can use the tiffg3 utility with Ghostscript to output Group III fax format files. To control a Class 1 or Class 2 fax modem on your Linux host, you can use the efax package, which is provided in many distributions, or you can install and configure the more capable, but more complex, FlexFax or HylaFax packages.
There are additional tools to support double-sided printing on laser printers, and packages that convert PostScript to less common printer-control languages to support Canon and IBM Proprinter devices, for example. There is a package to support printing in Chinese on laser printers and bitmap devices. Most of these packages don't directly affect management of print services, so we don't describe them in detail here, but this is a good time to install them if you wish to use them.
For the benefit of your users, make sure that all the manual pages for the packages you install are prepared properly when you complete your installations. Then run /sbin/makewhatis (/usr/bin/mandb on Debian) to build the manual page index file that facilitates locating information online. Some packages, such as Ghostscript, also provide additional documentation that you can print or make available on the system for reference. (Linux distributions tend to omit these documents, but you can FTP them from the sites where the software packages are developed and maintained. The GNU archives of the Free Software Foundation, for example, are accessed by anonymous FTP at ftp://GNU.ai.mit.edu.)
8.4.5 Setting Up the Printcap File
The essence of printer configuration is creating correct entries in the printer capabilities file, /etc/printcap. A simple printcap entry for an HP LaserJet 4MP laser printer attached to the first (bidirectional) parallel port on an ISA bus PC might look something like this:3
The /etc/printcap file should accommodate every printer and printer port or address (serial, parallel, SCSI, USB, or networked) your system will use. Make sure it reflects any change in hardware. And as always, be aware that some hardware changes should be performed only when power to your system is shut off.
8.4.5.1 Printcap file format rules
The printcap file format rules, briefly, are:
• A comment line begins with a pound sign (#).
• Each "line" of the printcap file defines a printer. A line that ends with a backslash character (\) is continued on the next line. Be very careful that no space or tab character follows the backslash character on the line. Traditionally, a continuation line is indented for readability. Multiple printer definitions can use the same actual printer, applying the same or different filters.
• Fields in the line are separated by colon characters (:); fields can be empty. However, a printcap line entry cannot be an empty field.
• Traditionally, the first field in the entry has no preceding colon.
• The first field of the entry line contains names for the printer, each separated by a vertical bar character (|). In the earlier example entry, this portion is the name field:
ljet|lp|ps|PostScript|600dpi 20MB memory|local|LPT1
Printer naming is discussed in detail in the next section. You should create a subdirectory of /var/spool/lpd with the same name as the first printer ID listed in each printcap entry. However, the actual print spool that is used is assigned by the sd variable for the printcap entry; if the sd variable doesn't point to the actual print spool directory, any file sent to that printer definition will vanish.
• There must be at least a default printer entry in printcap. If a printer is named lp, that printer is used as the default system printer. Do not confuse the lp default printer name with the lp local printer variable, which is described next. We recommend you use lp as an alias (one of the names after the | characters) rather than the primary printer name (the first in the list), so you can switch the default printer without difficulty.
• Each local printer must have an lp variable set. In the previous example, the variable was set by this segment of the printcap entry:
ljet|lp|ps|PostScript|600dpi 20MB memory|local|LPT1:
3 In this chapter, we use ljet4 in several examples. Be aware that the HP LaserJet 4 model is available in several versions. Some LaserJet 4 models are PCL5 printers only, and others use PostScript. Unless you are aware that different types exist, you can find it very frustrating trying to debug a printer filter that is expecting, for example, PostScript, when Ghostscript is passing it PCL5 input.
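The format rules above (comment lines, backslash continuation, colon-separated fields, and names separated by vertical bars) are simple enough to check mechanically. The sketch below is not how lpd itself reads the file; printcap_names is a hypothetical helper, and the lp and sd values in the sample entry are placeholders.

```shell
# printcap_names FILE: print "primary<TAB>name" for every name of every entry.
printcap_names() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }    # skip comments and blank lines
        {
            sub(/^[ \t]+/, "")               # drop continuation-line indent
            entry = entry $0
            if (sub(/\\$/, "", entry)) next  # trailing backslash: entry continues
            split(entry, fields, ":")        # fields[1] is the name field
            n = split(fields[1], names, "|")
            for (i = 1; i <= n; i++) print names[1] "\t" names[i]
            entry = ""
        }
    ' "$1"
}

# Try it on a throwaway file built around the name field shown above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# sample entry; the lp and sd values are placeholders
ljet|lp|ps|PostScript|600dpi 20MB memory|local|LPT1:\
    :lp=/dev/lp1:sd=/var/spool/lpd/ljet:
EOF
printcap_names "$sample"
rm -f "$sample"
```

Each output line pairs the primary name with one of its aliases, which makes it easy to spot which entry currently carries the lp alias.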
Documents can be output to any printer named in a name line in /etc/printcap. You might name your printers after the model (HP, Epson), or the type of printer (PS, PCL), or its specific modes. The DeskJet 540, for example, is a printer that should have two definitions in the printcap file, one for black print and another for color. The filters you use to support it are likely to be those for the DeskJet 500 or 550C. For simple administration, you can assign printer names that are the names of a filter or filter parameter used for a specific device. Thus, if you have one LaserJet 4 and will use the ljet4 filter only for it, ljet4 is one logical name for the printer. Similarly, a dot-matrix printer might be named 72dpi when accessed via its low-resolution printer definition line, and have the name 144dpi when accessed in a higher resolution.
If you use a printer administration utility that comes with your Linux distribution, you may have to follow certain arbitrary rules in preparing your printcap entries in order to get the tools working. For example, if you use Red Hat's printer manager utility provided on the administrator's desktop, you may need to make sure that hp is the first name of the first active printer entry in the printcap file. This means that when you need to switch default printers, you need to move the new default printer to the top entry of the list and then remove the hp name from the old default printer and prepend it as the first name of the new default printer. In order to prevent confusion in use of the spool queues, you should just leave the /var/spool/lpd/lp directory set up and create a new directory with the actual name of the spool directory that corresponds to the name by which you actually will address the printer. Thus, if you want to send your files to print on a printer named moa, you will need to create a directory named /var/spool/lpd/moa, with appropriate permissions, and specify that directory as the printer spool for that printer. Setting up printer directories is described in the next section.
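Creating the directory and naming it in the printcap entry go hand in hand, so a small helper can do the first step and print the field for the second. This is a sketch only: make_spool is a hypothetical name, and the 755 mode is a conservative assumption, since the ownership and permissions lpd expects vary between distributions.

```shell
# make_spool NAME [BASE]: create BASE/NAME and print the matching sd field.
# BASE defaults to /var/spool/lpd, which normally requires root to modify.
make_spool() {
    name=$1
    base=${2:-/var/spool/lpd}
    mkdir -p "$base/$name" || return 1
    # Assumption: 755 is a safe starting mode. Many systems instead hand the
    # directory to the user or group lpd runs as and tighten the mode.
    chmod 755 "$base/$name" || return 1
    printf ':sd=%s:\n' "$base/$name"    # field to place in the printcap entry
}
```

Run as root, make_spool moa would create /var/spool/lpd/moa and print the sd field to copy into the moa entry.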
8.4.5.3 The rest of the printcap variables
The printcap file provides a number of variables that you can define. Most variables are provided to specify page parameters, files and directories, filters, communications channel settings, and remote access control. Anytime you prepare a printcap file on a new system, read the printcap manual page to make sure you use the correct variable names. Variables set in a printcap entry that are not recognized are passed through to the filter for processing. The printcap variables described here are listed in roughly their order of importance. Some variables are boolean, and are considered set if they are present. Others are set with the assignment operator (=) or numeric value operator (#) and a value; the variable precedes the operator and the string or number follows the operator. Examples of the variables described in the following list are included in the contrived sample /etc/printcap file that follows. The printcap manual page has a more complete listing of variables recognized by lpd.
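The three forms a variable can take (boolean, string with =, number with #) can be told apart with a few lines of awk. The following sketch assumes a single entry that has already been joined into one line and two-letter variable names; printcap_fields is a hypothetical helper, and the entry passed to it is made up for illustration.

```shell
# printcap_fields ENTRY: classify each variable of one joined printcap entry.
printcap_fields() {
    printf '%s\n' "$1" | tr ':' '\n' | awk '
        NR == 1 || $0 == "" { next }    # skip the name field and empty fields
        /^[a-z][a-z]=/ { print substr($0, 1, 2), "string", substr($0, 4); next }
        /^[a-z][a-z]#/ { print substr($0, 1, 2), "number", substr($0, 4); next }
        { print $0, "boolean" }         # present with no value: considered set
    '
}

printcap_fields 'samoa|S:lp=/dev/ttyS1:br#38400:rw:sd=/var/spool/lpd/samoa:'
```

The output lists one variable per line with its kind and, where there is one, its value.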
sd
Specifies the spool directory used by the printer. Spool directories should all be in the same directory tree (on a fast hard disk), which is usually /var/spool. Spool files are defined even for remote printers. Each spool file should have one of the names assigned to the printer it serves.
lp
Assigns a local printer device, typically connected through a parallel port, serial port, or SCSI interface. The lp variable must be assigned to a device file in the /dev directory, which may be a link to a physical device. The lp variable must be assigned if there is a local printer. This variable should not be assigned if the rp variable is assigned (that is, the print spool manager is on another host).4 If lp assigns a serial device, the baud rate must be specified with the br variable.
lf
Specifies the log file for storing error messages. All printers should have this variable set and normally use the same error log file. Error entries include the name of the printer and can reveal problems with the user's printer environment, the host configuration, the communications channel that is used, and sometimes the printer hardware itself.
rw
This variable should be specified if the printer is able to send data back to the host through the specified device file. The rw variable tells lpd that the device should be opened for both reading and writing. This can be useful for serial or SCSI PostScript printers, for example, because they may return fairly useful error messages to lpd, which stores them in the error log.
4 A special case arises where the printer to be addressed is a true networked printer (that is, it has its own IP address). In that instance, the lp variable assigns the name of a dummy file that is used for setting a temporary lock on the file when the networked printer is in use. The documentation for the networked printer should describe the procedure for setting up and managing print services to access it.
mx
Specifies the maximum size of a print job in the spool. A value of zero sets no limit (the default, mx#0), and any other value sets the maximum file size in blocks. Most of the time you don't want to set a limit, but you could, for example, set a value slightly smaller than the expected minimum space available on a disk.
if
Specifies an input filter to use. If you do not specify an input (if) or output (of) filter, the system uses the default /usr/sbin/lpf filter. For some MS-DOS-style character printers, this is sufficient. Other useful filters are provided in the formatting utilities, and there are some flexible "magic filter" packages that will determine (usually correctly) the filtering to apply from the content of the data file passed to it. See Section 8.4.7 that follows.
of
Specifies an output filter to use. When you assign the of variable and don't assign the if variable, the system uses the filter once when the device is opened. All queued jobs are then sent until the queue is exhausted (and lpd removes the lock file from the spool directory). This is not normally useful, but it could serve such purposes as sending faxes to a fax modem for dialed connection over a telephone line.
When you assign both the if and of variables, the if-specified filter normally processes the file, but the of-specified filter prints a banner page before the input filter is applied. Using both input and output filters effectively on the same print queue is notoriously difficult.
br
Specifies the data-transfer rate (baud rate) for a serial port. You must supply this value if the printer is accessed via serial port. A pound sign precedes a numeric value that expresses the data-transfer rate in bits per second (not truly the baud rate, which is an effective rate of flow as opposed to the maximum rate of flow). The specified rate should not exceed any hardware limits. For example, if your serial port is capable of a 57.6 Kbps rate and the printer can process 28.8 Kbps, the assigned rate should not exceed that lower limit (perhaps br#19200). Supported bps values are the usual multiples for serial communications: 300, 600, 1200, 2400, 4800, 9600, and so on.
A number of additional data conditioning values may be set if you do assign a br value, but most of them aren't useful for typical Linux installations. The default behavior is probably acceptable for your printing purposes, but if you intend to print via serial port, study the br, fc, fs, xc, and xs variables in the printcap manual page.
pl
Specifies the page length as the number of lines using the default font characters for character devices (and printers that can use a character mode). An example is pl#66 for an 11-inch page at six lines per inch. This value allows space for cropping and accommodates the limits of some other devices, such as inkjet printers that cannot print to the bottom of the sheet or the edge of the paper. Normally used in conjunction with the pw variable.
mc
Specifies the maximum number of copies you can print. Values are the same as for the mx variable; usually you want to allow unlimited copies (mc#0), which is the default.
sc
Suppresses multiple copies (equivalent to mc#1).
Example 8-1 contains a sample printcap file that shows off many of the variables discussed in the previous list. It contains an entry for a remote printer; printing to the printer named hp (also the default, as it is the first entry) sends the documents to the host spigot.berk.ora.com, where they are printed on the printer queue lp.
Example 8-1 Sample /etc/printcap file
# Fare well, sweet prints
hp|bat|west|spigot|berkeley|TI MicroLaser Turbo:\
:mx#0:rp=lp:\
:lp=:sd=/var/spool/lpd:rm=spigot.berk.ora.com:\
:lf=/var/log/lpd-errs:
# To the print room
kiwi|810|rint|Big Apple|Apple 810 via EtherTalk:\
:lp=/var/spool/lpd/kiwi:sh:\
:sd=/var/spool/lpd/kiwi:pl#72:pw#85:mx#0:\
:lf=/var/log/lpd-errs:if=/usr/local/cap/kiwi:
# big bird agapornis via shielded serial access
samoa|S|PostScript|secure|QMS 1725 by serial adapter:\
:lp=/dev/tty01:br#38400:rw:xc#0:xs#0400040:sh:\
:sd=/var/spool/lpd/samoa:pl#72:pw#85:mx#0:mc#0:\
:lf=/var/log/lpd-errs:if=/usr/local/cap/samoa:
# agapornis via printer room subnet (standard access)
moa|ps|QMS 1725 via Ethernet:\