CHAPTER 40
Removing Large Files and Log Rolling
This chapter gives a few tips relating to moving or removing files that are consuming
large amounts of disk space. In some cases you may have a file system that’s filling up
because a process writes a large number of entries to a log file. When you run into a large
log file that is found to be the primary cause of a full file system, your first inclination may
be to remove the file to reclaim space. However, this may not work as you might intend.
Before I go into why, let me first talk a little about files and their structure. The file name
is the visible representation of an inode that allows you to access the data in the file. The
inode contains all the important information about a file, such as its ownership,
permissions, modification times, and other items of interest. An inode also holds information
about the location of the data on the hard disk. A file name is a user-friendly way of
accessing an inode, which is represented by a number. You can determine the inode number of
any file by running the command ls -i filename.
Here is an example of a full directory:
$ ls -i
474322 37l5152.pdf 214111 rsync.tar.gz
474633 770tref.pdf 215939 yum-2.0.5-1.fd.fr.noarch.rpm
215944 openbox-3.1-1.i386.rpm
Note that the inode number precedes each of the five files listed here. As just
mentioned, part of the data contained in an inode is a pointer to the data on the physical disk.
If the file has been opened by a process for writing, the process writes to this location on
the hard disk.
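To make the relationship between names and inodes concrete, here is a minimal sketch that gives one inode two names by creating a hard link. The file names and the temporary directory are hypothetical, and the sketch assumes the mktemp utility is available.

```sh
#!/bin/sh
# Sketch: two file names that refer to the same inode.
workdir=$(mktemp -d)                               # scratch directory (assumes mktemp exists)
echo "some data" > "$workdir/original.txt"
ln "$workdir/original.txt" "$workdir/alias.txt"    # second name, same inode

# ls -i prints the inode number first; keep only that field.
ino_a=$(ls -i "$workdir/original.txt" | awk '{print $1}')
ino_b=$(ls -i "$workdir/alias.txt" | awk '{print $1}')
echo "original.txt -> inode $ino_a"
echo "alias.txt    -> inode $ino_b"

# Removing one name leaves the data reachable through the other.
rm "$workdir/original.txt"
remaining=$(cat "$workdir/alias.txt")
echo "$remaining"
rm -r "$workdir"
```

Both names report the same inode number, and the data survives until the last name is removed.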
The potential problem with removing a large file to clean up disk space is that if a user
or an administrator removes the file, the inode may still be kept open by a process that is
writing the data, and the disk space will not be returned to the system. The operating
system won’t release the disk space for reuse until every process that has the file open
closes it. Only then is the disk space reclaimed.
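The following sketch reproduces this behavior on a small scale. The shell itself plays the part of the long-running writer by holding file descriptors open; the file name is hypothetical and mktemp is assumed to be available.

```sh
#!/bin/sh
# Sketch: a removed file stays alive while a descriptor is open on it.
workdir=$(mktemp -d)
logfile="$workdir/app.log"      # hypothetical log file

echo "first entry" > "$logfile"
exec 3>> "$logfile"             # stand-in for a long-running writer
exec 4< "$logfile"              # second descriptor, used to read the data back

rm "$logfile"                   # the name is gone...
echo "second entry" >&3         # ...but the writer can still append

[ -e "$logfile" ] || echo "name removed"
content=$(cat <&4)              # the data is still reachable through descriptor 4
echo "$content"

exec 3>&- 4<&-                  # closing the last descriptors lets the space be freed
rmdir "$workdir"
```

Even after rm, both entries can be read back through the open descriptor, which is why the file system shows no additional free space until the descriptors are closed.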
One way of finding out if a process is keeping a file open is to use the fuser command,
which will display a list of process IDs that are accessing a given file. You also can use
the lsof command to find this information. lsof is designed to list open files and the
processes that are accessing them. Once you know the processes that are using the file, you can stop them before deleting the file. However, you may not be able (or want) to do this. Stopping a process that is holding a file open may be against existing site policies, and the process or application might be production-critical and impact business needs if it is halted.
Another way to clean up the space used by a file is to zero out the file by redirecting /dev/null into the file. This trims the file down to zero bytes while leaving the file itself in place. The file remains open and accessible to any process that might be using it, yet the operating system releases the disk space promptly. Keep in mind that some processes may keep a file open for writing for a very long time, so waiting for them to close the file may not be an option.
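Before choosing between stopping a process and zeroing the file, it helps to see who has the file open. The function below is a minimal sketch; show_openers is a hypothetical name, and the sketch assumes that at least one of fuser or lsof is installed, which is not true on every system.

```sh
#!/bin/sh
# Sketch: report which processes, if any, hold a file open.
# Tries fuser first, then lsof; both are optional tools.
show_openers() {
    target=$1
    if command -v fuser >/dev/null 2>&1; then
        fuser "$target"             # prints the PIDs that are using the file
    elif command -v lsof >/dev/null 2>&1; then
        lsof -- "$target"           # prints one line per process and descriptor
    else
        echo "neither fuser nor lsof is installed" >&2
        return 2
    fi
}

# Example: show_openers really_big.log
```

Both tools typically return a nonzero status when no process has the file open. If the file has already been removed, running lsof +L1 lists open files whose link count has dropped to zero, which is one way to find space that is being held by deleted files.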
Here is a sample of a directory listing, including an offending log file that is consuming large amounts of disk space. We have a choice of several possible commands that will zero out the file.
$ ls -l
total 113172
-rw-rw-r 1 rbpeters rbpeters 1057862 Jun 7 18:21 37l5152.log
-rw-rw-r 1 rbpeters rbpeters 449184 Jun 7 18:21 770tref.log
-rw-rw-r 1 rbpeters rbpeters 1104249096 Jun 7 21:22 really_big.log
The following command is one, and cp /dev/null really_big.log is another. Both overwrite a large file with nothing. echo > really_big.log is sometimes used as well, but it writes a single newline and so leaves a 1-byte file rather than an empty one.
$ >really_big.log
This is the resulting directory listing after the file has been zeroed out:
$ ls -l
total 1484
-rw-rw-r 1 rbpeters rbpeters 1057862 Jun 7 18:21 37l5152.log
-rw-rw-r 1 rbpeters rbpeters 449184 Jun 7 18:21 770tref.log
-rw-rw-r 1 rbpeters rbpeters 0 Jun 7 21:25 really_big.log
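The truncation commands are easy to compare side by side. This sketch applies each one to a small scratch file and records the resulting size; the file names are hypothetical and mktemp is assumed to be available. Note that echo writes a newline, so it leaves one byte behind.

```sh
#!/bin/sh
# Sketch: compare several ways of emptying a file.
workdir=$(mktemp -d)
sizes=""
for method in redirect colon devnull echo; do
    f="$workdir/$method.log"
    echo "pretend this is a huge log" > "$f"
    case $method in
        redirect) > "$f" ;;                 # bare redirect
        colon)    : > "$f" ;;               # no-op command plus redirect
        devnull)  cp /dev/null "$f" ;;      # copy an empty source over the file
        echo)     echo > "$f" ;;            # writes a single newline
    esac
    size=$(wc -c < "$f" | tr -d ' ')
    echo "$method: $size bytes"
    sizes="$sizes $method=$size"
done
rm -r "$workdir"
```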
This procedure can be used to rotate log files. For example, you may have an application for which you would like to keep one week’s worth of log information. You would then rotate it to an older version to keep up to one month. After a month, you would delete the file. Another option is the logrotate command, which is included in many Linux distributions.
The following is a simple script that copies the existing log file to a backup with a version number and then zeroes out the file. While the script is running, any existing processes that are writing to the file can still access it.
The code first defines the file in question and then checks whether it exists. If it does exist, execution continues.
#!/bin/sh
LOGFILE=./my.log
if [ ! -f "$LOGFILE" ]
then
    echo "Nothing to do; exiting."
else
You’ve already seen the use of the redirect (>) to zero a file. The main addition to this
idea here is the cp -p command.
    cp -p "$LOGFILE" "${LOGFILE}.0"
    > "$LOGFILE"
fi
The -p option preserves the original file’s permissions, ownership, and time stamps for the new
copy. Thus the new copy maintains the attributes of the original file and it appears as if
the file had been moved. This is also a useful technique when modifying scripts,
configuration files, or any other file for which preserving the original file’s attributes would be
valuable.
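The script above keeps a single backup. To keep several older versions, as in the week-to-month example, the same copy-and-truncate idea can be combined with a renaming step. The function below is a sketch under that assumption; rotate_log and its KEEP argument are hypothetical names and not part of the original script.

```sh
#!/bin/sh
# Sketch: rotate_log FILE [KEEP] keeps KEEP numbered copies (.0 is the newest).
rotate_log() {
    log=$1
    keep=${2:-4}
    [ -f "$log" ] || { echo "Nothing to do; exiting."; return 0; }

    # Shift older copies up by one: .2 -> .3, .1 -> .2, .0 -> .1.
    # The oldest copy is overwritten, which is how it gets dropped.
    n=$((keep - 1))
    while [ "$n" -gt 0 ]; do
        prev=$((n - 1))
        if [ -f "$log.$prev" ]; then
            mv "$log.$prev" "$log.$n"
        fi
        n=$prev
    done

    cp -p "$log" "$log.0"   # copy, preserving mode, ownership, and time stamps
    : > "$log"              # truncate in place; writers keep their descriptors
}

# Example: rotate_log ./my.log 4
```

Anything written to the log between the cp and the truncation is lost, so this approach suits logs where a few missing lines are acceptable.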
CHAPTER 41
Core Finder
In a production environment, it is a good idea to know whether your applications are
dumping core often. It is also considered good housekeeping to know about your core
files so your hard disk won’t fill up with unnecessary files. The small script in this chapter
tracks down and cleans up core files. The script was intended to be run as an hourly cron
job, although you could change the schedule to fit your needs. The job also has its priority
lowered using the nice command so that it won’t interfere with the performance of
regular processes on the machine. The notifications I’ve received from this script have
revealed chronic issues with applications on more than one occasion. Without the
script, I would have never seen the patterns.
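As an illustration of that setup, a crontab entry along the following lines would run the script at the top of every hour at the lowest scheduling priority. The installation path /usr/local/bin/core_finder is an assumption; substitute wherever you keep the script.

```
# minute hour day-of-month month day-of-week  command
0 * * * * nice -n 19 /usr/local/bin/core_finder
```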
This script steps through each of the locally mounted file systems and finds all core
files. It determines the applications that created the core files and moves the core files to
a central location for later examination. The script also logs its actions and cleans up old
saved files.
First we have to set up some straightforward variables.
#!/bin/sh
HOWOLD=30
FSLIST="/ /boot"
UNAME=`uname -n`
DATE=`date +%m%d%H%M`    # month, day, 24-hour time; %I would repeat every 12 hours
DATADIR="/usr/local/data/cores"
LOGFILE="$DATADIR/cor_report"
The HOWOLD variable is used as the maximum age, in days, for saved core files. Any files
older than this will be removed.
The FSLIST variable contains the list of file systems that will be checked for core files.
The list will vary from system to system. You could set FSLIST dynamically by using the
df command to determine the locally mounted file systems. The command might look
something like this: FSLIST=`df -l | grep '^/dev' | awk '{print $6}'`, which gathers
the lines starting with /dev and then prints the field containing each file system’s mount point.
The other four variables contain the name of the system that the script is running
on (UNAME), the current date (DATE), the directory to which the core files will be saved
(DATADIR), and the name of the log file (LOGFILE).
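The dynamic approach to FSLIST might be sketched as follows. Adding -P asks df for the POSIX output layout, which keeps each file system on one line so that the mount point is reliably the sixth field. The -l option is not available in every df implementation, so the fallback to / is an assumption made for this sketch.

```sh
#!/bin/sh
# Sketch: build FSLIST from the locally mounted /dev-backed file systems.
# NR > 1 skips the header line; field 6 is the "Mounted on" column.
FSLIST=$(df -lP 2>/dev/null | awk 'NR > 1 && $1 ~ /^\/dev/ { print $6 }')

# Fall back to the root file system if df produced nothing usable.
if [ -z "$FSLIST" ]; then
    FSLIST="/"
fi
echo "Will search: $FSLIST"
```

Mount points that contain spaces would be split incorrectly by this approach, so it suits systems with conventional mount-point names.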
Now we determine whether the data directory exists. This is the directory where the core files will be saved. If it doesn’t exist, the script creates it. The -p option to mkdir creates any missing parent directories in the path of the directory being created.
if [ ! -d "$DATADIR" ]
then
    mkdir -p "$DATADIR"
fi
Then you need to find all previously saved core files that don’t need to be kept around anymore, and remove them.
find "$DATADIR" -name '*.core.*' -mtime +"$HOWOLD" -exec rm {} \;
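Because this command deletes files, it can be worth running it with -print first to see what would be removed. The sketch below sets up a scratch directory containing one artificially old file and one new file; the file names are invented for illustration, and mktemp is assumed to be available.

```sh
#!/bin/sh
# Sketch: preview, then perform, the age-based cleanup.
HOWOLD=30
DATADIR=$(mktemp -d)                    # stand-in for /usr/local/data/cores
old="$DATADIR/host1.core.httpd.01010000"
new="$DATADIR/host1.core.sshd.06071821"
touch -t 200001010000 "$old"            # modification time set to 1 Jan 2000
touch "$new"                            # modified just now

# Dry run: list what would be removed without touching anything.
expired=$(find "$DATADIR" -type f -name '*.core.*' -mtime +"$HOWOLD" -print)
echo "Would remove: $expired"

# Real cleanup: same tests, with rm in place of -print.
find "$DATADIR" -type f -name '*.core.*' -mtime +"$HOWOLD" -exec rm -f {} \;
ls "$DATADIR"
```

Only the file older than HOWOLD days is removed; the scratch directory is left in place so the result can be inspected.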
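The following is the script’s main loop. It finds every regular file named core on the listed file systems, and the -mount option keeps find from crossing into other file systems. Because earlier runs have already moved or deleted what they found, these are the core files created since the last time the script was run.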
# $FSLIST is left unquoted on purpose so each file system becomes its own argument.
find $FSLIST -mount -type f -name core -print | while read -r file
do
    if [ -s "$file" ]
    then
        # The program name is the text between single quotes in the file output.
        coretype=`file "$file" | cut -d\' -f2`
        mv "$file" "$DATADIR/$UNAME.core.$coretype.$DATE"
        echo "$coretype $UNAME $DATADIR/$UNAME.core.$coretype.$DATE" | \
            tee -a "$LOGFILE" | \
            mailx -s "core_finder: $coretype core file found on $UNAME" sysadmins
    else
        rm "$file"
    fi
done
If a core file’s size is zero bytes, we remove it. If it is larger than zero bytes, we use the file command to determine the application that created the core file. Then the core file is saved in the archive directory and its name is changed to include the type of core and the date when the file was found. Then we add an entry to the log file noting the action that was taken, and send an e-mail notice to the administrators.
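The text that the file command reports between single quotes can include a path or arguments, which makes for an awkward file name. The helper below is a sketch of one way to reduce it to a safe label; core_label is a hypothetical name, and the sample line in the usage comment only illustrates the general shape of file output, which varies between versions.

```sh
#!/bin/sh
# Sketch: turn one line of file(1) output into a file-name-safe program label.
core_label() {
    # Text between the first pair of single quotes; -s drops lines with no quote.
    prog=$(printf '%s\n' "$1" | cut -s -d"'" -f2)
    prog=${prog%% *}        # drop any arguments
    prog=${prog##*/}        # drop any leading directory
    # Replace anything outside a conservative character set with "_".
    prog=$(printf '%s' "$prog" | sed 's/[^A-Za-z0-9._-]/_/g')
    echo "${prog:-unknown}"
}

# Example:
#   core_label "core: ELF 64-bit LSB core file, x86-64, from '/usr/sbin/httpd -k start'"
# prints: httpd
```

In the main loop, the result of core_label could be used in place of coretype when building the destination name.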