Tools and Scripts



As with any job, tools are required to simplify tasks, increase efficiency, and often just to make things possible. The job of system administration is almost cursed with too many tools. It seems like every month there is yet another scripting language to do everything you used to do with that old language. Combining tools into something more than the sum of its parts is extremely easy in the computer field. After many years, most system administrators have their own toolbox of favorite tools that they have honed over time to work just the way they need.

This part of the course will be very UNIX specific. It isn't really possible to teach tools and scripts without actual examples. However, many programs listed in this section are available on many different platforms and operating systems, and most of those that are not have equivalents. I'm afraid it is left up to the reader to discover these if they wish.


Utilities

Most tools start with simple UNIX utilities such as du, telnet, and ps. These are used all the time by sysadmins to check the status of running machines, see what users are doing, and test that certain services are working.

ps and top

Since just about the only way to do anything on a UNIX machine is with a process (except for the kernel itself), tools like ps and top are used constantly. If a machine is running slowly, a good bet is that there are too many processes or that one process is taking up too much CPU time.

example output from top...


 11:31pm  up  5:44,  5 users,  load average: 0.06, 0.07, 0.04
48 processes: 46 sleeping, 2 running, 0 zombie, 0 stopped
CPU states:  5.1% user,  1.5% system,  0.0% nice, 93.2% idle
Mem:  192936K av, 189784K used,   3152K free,  49044K shrd,   3928K buff
Swap: 216868K av,   6628K used, 210240K free                159316K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
 1128 kscott    18   0  8164 8164  8048 S       0  4.3  4.2   3:43 mpg123
  181 kscott     1   0  3772 3544  1404 S       0  0.5  1.8   0:29 emacs
 1438 root       3   0  1132 1132   940 R       0  0.5  0.5   0:00 top

example output from ps...


USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
kscott     116  0.0  0.5  1788 1084 tty1     S    17:47   0:00 -bash
root       117  0.0  0.2  1048  436 tty2     S    17:47   0:00 /sbin/agetty 38400 tty2 linux
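
When a machine is slow, the question is usually which process is the culprit. The following is a minimal sketch, not part of the original notes, of a filter that reads ps aux style output and prints only the processes at or above a given CPU percentage. The name cpu_hogs and the threshold are made up for illustration, and the column numbers assume the ps layout shown above, which varies between UNIX flavors.

```shell
#!/bin/sh
# cpu_hogs LIMIT  (hypothetical helper, not a standard utility)
# Reads "ps aux"-style lines on stdin, skips the header line, and prints
# PID, %CPU and COMMAND for each process using LIMIT percent CPU or more.
# Assumes column 2 is PID, column 3 is %CPU and column 11 is COMMAND.
cpu_hogs() {
    awk -v limit="$1" 'NR > 1 && $3 + 0 >= limit + 0 { print $2, $3, $11 }'
}

# Typical use:
#   ps aux | cpu_hogs 50
```

Keeping the filter separate from the ps call means it can be checked against saved output, and the ps options can be adjusted per platform without touching the awk program.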

find

find is a wonderful program with some of the worst syntax ever found in a UNIX utility. However, once you get a handle on it, you can write very powerful one-line scripts to do such things as recursively remove all files that have not been accessed in the past 10 days, and print out the names as you do it.

find /tmp -type f -xdev -atime +10 -exec rm {} \; -a -print

Something like this could be put in a cron job to clean up disk space.
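
Before a command that deletes files goes into cron, it is worth seeing what it would remove. This sketch wraps the same find tests in a function that only prints the matches. The name stale_files is invented here, and the function deliberately leaves out the rm step.

```shell
#!/bin/sh
# stale_files DIR DAYS  (hypothetical helper)
# Prints regular files under DIR, without crossing filesystem boundaries,
# whose last access time is more than DAYS days ago. Nothing is deleted.
stale_files() {
    find "$1" -xdev -type f -atime +"$2" -print
}

# Typical use:
#   stale_files /tmp 10
```

Once the list looks right, the -exec rm {} \; action from the one-liner above can be added back.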

strace

strace shows system calls and signals issued by processes and spawned child processes. It's most often used to see what files a process is accessing. It can be attached to a process that is already running, or it can trace a process from beginning to end, as this example does.

strace -f -o /tmp/man.out ls

This is just a sample from the output...

1015  open(".", O_RDONLY|O_NONBLOCK|0x10000) = 4
1015  close(4)                          = 0
1015  getdents(4, /* 58 entries */, 3933) = 1156
1015  write(1, "dir\t\t    nsmail\n", 16) = 16
1015  close(1)                          = 0

I used strace to diagnose a really slow running Netscape. It turned out that it was running a real-time Java applet and was constantly checking an NFS mounted cache directory for its files. The solution was to turn off file caching.
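
A trace file gets long quickly, so a summary of which files were opened, and how often, helps spot problems like the one just described. This is a rough sketch that assumes the output format shown above (an optional PID, then the call name, with the path as the first quoted string). The name opened_files is hypothetical, and only open and openat calls are counted.

```shell
#!/bin/sh
# opened_files TRACEFILE  (hypothetical helper)
# Counts how many times each path appears in open()/openat() calls in a file
# written by "strace -o", and prints "count path", most frequent first.
opened_files() {
    awk -F'"' '/^[0-9 ]*open(at)?\(/ { count[$2]++ }
               END { for (f in count) print count[f], f }' "$1" | sort -rn
}

# Typical use:
#   strace -f -o /tmp/man.out ls
#   opened_files /tmp/man.out
```

A path that shows up hundreds of times in a short trace is a good place to start looking.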

Services

syslog

syslog is a system logger. Combined with a syslog.conf file, one can define which machines and which files certain logs get sent to. These logs can be useful in debugging machines, tracking malicious users, or just watching for anything out of the ordinary.

Here are some examples of common syslog messages...


Jan 31 09:22:35 jupiter kernel: nfs: server saturn not responding, timed out
Jan 31 09:23:09 mail sendmail[17310]: JAA17309: forward /u/kscott/.forward.mail+: World writable directory
Jan 31 00:12:58 mars sshd2[6450]: connection from "192.168.11.1"
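
Watching for anything out of the ordinary usually means filtering out the ordinary first. One way to sketch that, assuming a hand-maintained file with one extended regular expression per line for messages known to be harmless, is shown below. The name unusual_lines and both file paths are placeholders.

```shell
#!/bin/sh
# unusual_lines LOGFILE IGNOREFILE  (hypothetical helper)
# Prints every line of LOGFILE that matches none of the extended regular
# expressions listed, one per line, in IGNOREFILE.
unusual_lines() {
    grep -v -E -f "$2" "$1"
}

# Typical use (paths are examples only):
#   unusual_lines /var/log/messages /usr/local/etc/syslog.ignore
```

The ignore list tends to grow over time as routine messages are identified, and whatever is left is worth a look.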

email

email may be the most commonly used tool in system administration. It's used to send warnings to people when they are running out of disk space, keep notes of things that need to be fixed, inform users of impending downtime, and just about anything else you can think of. Many automated scripts that run in the night send out reports via email.

And of course, once something is as widely used as email, there needs to be another tool to separate the useful from the useless. procmail is just this tool. This is an entry from my procmailrc that deletes any mail message from root whose subject starts with cron.


# Use this when we are going to be down for a while
:0
* ^From: root@mailhost.nmt.edu
* ^Subject: cron:*
/dev/null
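
On the sending side, the disk space warning mentioned earlier can be sketched as a small filter over df output. This is an illustration only: the name full_filesystems is invented, the column positions assume the POSIX df -P layout, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# full_filesystems LIMIT  (hypothetical helper)
# Reads "df -P" output on stdin, skips the header line, and prints the mount
# point and capacity of each filesystem that is LIMIT percent full or more.
# Assumes column 5 is the capacity (such as 95%) and column 6 the mount point.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= limit + 0) print $6, $5
    }'
}

# Typical use, mailing a report only when there is something to say:
#   report=$(df -P | full_filesystems 90)
#   [ -n "$report" ] && printf '%s\n' "$report" | mail -s "disk space warning" root
```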

cron

cron is the clock daemon in UNIX, and is used to run programs at specific times. A job can be scheduled to run anywhere from once a minute, to a certain day of the week, to once a year. Many of the above tools can be combined into a monitoring utility spawned via cron. This example looks for files larger than 10 MB every night at 2 minutes to midnight (23:58) and mails their names to root.

58 23 * * * find /home -type f -xdev -size +10000k -print | mail -s "large files" root
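
As a crontab line grows, it is often easier to move the work into a script and have cron call that instead. The sketch below puts the same find tests in a function so they can be tried by hand on a small directory first. The name large_files is made up, and the k size suffix is the same find extension the crontab entry above relies on.

```shell
#!/bin/sh
# large_files DIR SIZE_KB  (hypothetical helper)
# Prints regular files under DIR, without crossing filesystem boundaries,
# that are larger than SIZE_KB kilobytes.
large_files() {
    find "$1" -xdev -type f -size +"$2"k -print
}

# Typical use from a script run by cron, mailing only when files were found:
#   report=$(large_files /home 10000)
#   [ -n "$report" ] && printf '%s\n' "$report" | mail -s "large files" root
```

Checking for an empty report first keeps the script from sending a blank message on nights when nothing is found.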

Scripts

Scripts provide a way of combining many tools into one comprehensive tool that can be run easily or, even better, run automatically when needed.

Shell Scripts

The shell is probably the scripting language most commonly used by sysadmins. A shell script is simply a list of instructions that one could just as easily type on the command line, except that they are executed in sequential order from a file. By putting them in a file, the list of commands can be treated like a program.

A common place to find shell scripts is in the rc startup files. Since most UNIXes lay out their startup scripts differently, you can usually find them by reading the man page on init, the first process started after bootup.

A simple example of a shell script is one that rotates syslog files. If left alone, the files that syslog logs to will continue to grow until the disk runs out of space or a file hits the size limit in the kernel (which is 2 GB for many versions of UNIX).


#!/bin/sh

# Usage: rotate logfile to_num

PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/bin ; export PATH

logfile=$1			# name of the logfile without any extensions
to_num=$2			# extension that the rotated file will have

while [ $to_num -gt 1 ] ; do
    from_num=`expr $to_num - 1` # extension of the file to be rotated
    if [ -f $logfile.$from_num ] ; then	# if the logfile already exists
        mv $logfile.$from_num $logfile.$to_num
    elif [ -f $logfile.$from_num.gz ] ; then  # in case they are compressed
        mv $logfile.$from_num.gz $logfile.$to_num.gz
    fi
    to_num=$from_num
done

if [ -f $logfile ] ; then	# sanity check
    cp $logfile $logfile.1
    cp /dev/null $logfile	# truncates the file
    gzip --quiet $logfile.1	# compress it to save space
fi

Perl Scripts

perl is a programming language that has found a home with system administrators. It has much of the power of C combined with the scripting and shell interface of sh.

Here is the same logfile rotate script from above, only this time written in perl.


#!/usr/local/bin/perl

# Usage: rotate logfile to_num

$logfile = $ARGV[0] ;		# name of the logfile without any extensions
$to_num = $ARGV[1] ;		# extension that the rotated file will have

while($to_num > 1)
{
    $from_num = $to_num - 1 ;	# extension of the file to be rotated
    if(-f "$logfile.$from_num")	# if the logfile already exists
    {
	system("/bin/mv $logfile.$from_num $logfile.$to_num") ;
    }
    elsif(-f "$logfile.$from_num.gz") # in case they are compressed
    {
	system("/bin/mv $logfile.$from_num.gz $logfile.$to_num.gz") ;
    }
    $to_num-- ;
}

if(-f $logfile)			# sanity check
{
    system("/bin/cp $logfile $logfile.1") ;
    system("/bin/cp /dev/null $logfile") ; # truncates the file
    system("gzip --quiet $logfile.1") ;	# compress it to save space
}

Class Notes

debugging
  tcpdump
  strace/truss
  syslog
  ps/top
  du/df
  telnet 

automation
  cron
  scripts

other
  find
  mail/procmail
  tar cf - . | (cd /dir ; tar xfvp -)  or (GNU) cp -pR . /dir
  lsof
  fuser
  
scripts
  grep/awk/sed
  sh/perl/C

K. Scott Rowe