UNIX Unleashed, System Administrator's Edition

- 9 -

Bourne Shell

Written By Richard E. Rummel

Revised By William A. Farra

This chapter presents the fundamentals and many useful specifics of the Bourne shell, currently the most popular of the UNIX shells for execution of application programs. It also describes how to organize shell commands into a script, producing a program that you can run by name at the shell prompt or through other UNIX methods of program initiation.

The following topics are discussed in this chapter:

Shell Basics
- Invocation
- Environment
- Options
- Special Characters
Shell Variables
- User Defined Variables
- Environment Variables
- Positional Variables or Shell Arguments
Shell Script Programming
- Conditional Testing
- Repetition and loop control
Customizing The Shell

Shell Basics

Stephen Bourne wrote the Bourne shell at Bell Laboratories, the development focal point of UNIX. At the time, Bell Laboratories was a subsidiary of AT&T. Since then, many system corporations have produced hardware-specific versions of UNIX but, remarkably, have kept the Bourne shell basics consistent.

NOTE: $man sh or $man bsh on most UNIX systems will list the generalities of the Bourne Shell as well as detail the specifics to that version of UNIX. It is recommended that the reader familiarize herself with the version she is using before and after reading this chapter.

The Shell Invocation and Environment

The first level of invocation occurs when a user logs on to a UNIX system; the shell to start is specified by the user's entry in the /etc/passwd file. For example:

farrawa:!:411:102:William Farra, Systems Development,x385:/home/farrawa:/bin/bsh

This entry (which is : delimited) has the login id, encrypted password (denoted by !), user id #, default group id #, comment field, home directory, and startup shell program. In this case, it is the Bourne shell. As the shell executes, it will read the system profile /etc/profile. This may set up various environment variables, such as PATH, that the shell uses to search for executables and TERM, the terminal type being used. Then the shell continues to place the user into the associated home directory and reads the local .profile. Finally the shell displays the default prompt $.

NOTE: On UNIX systems the super-user, also referred to as root, is without restriction. When the super-user logs in, she sees the pound sign (#) as a default prompt. It is a reminder that as super-user some of the built-in protections are not available and that extra care is necessary in this mode. Since the super-user can write to any directory and can remove any file, file permissions do not apply. Normally the root login is only used for system administration and adding or deleting users. It is strongly recommended that only well-experienced UNIX users be given access to root.

Shell Invocation Options When invoking or executing the shell, you can use any of the several options available to the Bourne shell. To test a shell script for syntax, you can use the -n option, which reads the script but does not execute it. If you are debugging a script, the -x option will set the trace mode, displaying each command as it is executed.

The following is a list of Bourne shell options available on most versions of UNIX.

-a Tag all variables for export.

-c "string" Commands are read from string.

-e Exit immediately if a command fails.

-f Disable shell filename generation.

-h Locate and remember functions as defined.

-i Interactive mode.

-k Put arguments in the environment for a command.

-n Reads commands but does not execute them.

-r Restricted mode.

-s Commands are read from the standard input.

-t A single command is executed, and the shell exits.

-u Unset variables are an error during substitution.

-v Verbose mode, displays shell input lines.

-x Trace mode, displays commands as they are executed.

Many combinations of these options work together. Some obviously do not, such as -n, which prevents commands from being executed, and -x, which displays commands as they are executed. However, experience with options gives the user a multitude of alternatives in creating or modifying the shell environment.
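As a rough illustration of the -n and -x options, the following sketch writes a two-line script to a scratch file, checks its syntax, and then traces it. The file name and the TMPDIR fallback are assumptions made for the example, and the exact format of the trace lines varies between versions of UNIX.

```shell
# Write a two-line script to a scratch file (hypothetical name).
script=${TMPDIR:-/tmp}/optdemo.$$
cat > "$script" <<'EOF'
NAME=world
echo "hello $NAME"
EOF

sh -n "$script"                 # read the script without executing it
check=$?                        # 0 means no syntax errors were found
trace=`sh -x "$script" 2>&1`    # trace lines are written to standard error
echo "syntax check status: $check"
echo "$trace"
rm -f "$script"
```

Running the syntax check first is a cheap habit; the trace is more useful once the script at least parses.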

The Restricted Shell

bsh -r or /bin/rsh or /usr/bin/rsh.

Depending on the version of UNIX, this will invoke the Bourne shell in the restricted mode. With this option set, the user cannot change directories (cd), change the PATH variable, specify a full pathname to a command, or redirect output. This ensures an extra measure of control and security to the UNIX system. It is typically used for application users, who never see a shell prompt, and dialup accounts, where security is a must. Normally, a restricted user is placed, from login, into a directory in which she has no write permission. Not having write permission in this directory does not mean that the user has no write permission anywhere. It does mean that she cannot change directories or specify pathnames in commands. She also cannot write a shell script and later access it in her working directory.

NOTE: If the restricted shell calls an unrestricted shell to carry out the commands, the restrictions can be bypassed. This is also true if the user can call an unrestricted shell directly. Remember that programs like vi and more allow users to execute commands. If the command is sh, again it is possible to bypass the restrictions.

Changing Shell Options with set Once the user is at the command prompt $, she can modify her shell environment by setting or unsetting shell options with the set command. To turn on an option, use a - (hyphen) and option letter. To turn off an option, use a + (plus sign) and option letter. Most UNIX systems allow the options a, e, f, h, k, n, u, v, and x to be turned off and on. Look at the following examples:

$set -xv

This enables the trace mode in the shell so that all commands and substitutions are printed. It also displays the line input to shell.

$set +u

This disables error checking on unset variables when substitution occurs. To display which shell options have been set, type the following.

$echo $-
is

This indicates that the shell is in interactive mode and taking commands from standard input. Turning options on and off is very useful when debugging a shell script program or testing a specific shell environment.
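The same check can be made from inside a script. The sketch below, which uses illustrative variable names, turns the -u option on and off and records the contents of $- each time; the other letters that appear in $- depend on the shell and on how it was started.

```shell
set -u                 # treat unset variables as errors
with_u=$-              # the flags now include the letter u
set +u                 # turn the option off again
without_u=$-
echo "flags with -u:    $with_u"
echo "flags without -u: $without_u"
```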

The User's Shell Startup File: .profile Under each Bourne shell user's home directory is a file named .profile. This is where a system administrator or user (if given write permission) can make permanent modifications to the shell environment. To add a directory to the existing execution path, add the following line to .profile.

PATH=$PATH:/sql/bin ; export PATH

With this line in .profile, from the time the user logs in, the directory /sql/bin is searched for commands and executable programs. To create a variable that contains environment data for an applications program, follow the same procedure.
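If .profile may be read more than once in a session, a slightly more careful variant avoids adding the same directory twice. This is only a sketch; APPDIR is a name chosen for the example, and /sql/bin is the directory used above.

```shell
APPDIR=/sql/bin                       # directory to add; adjust as needed
case ":$PATH:" in
    *":$APPDIR:"*) ;;                 # already on the path; leave PATH alone
    *) PATH=$PATH:$APPDIR ; export PATH ;;
esac
echo "$PATH"
```

Wrapping PATH in colons before the comparison lets the pattern match a directory at the start, middle, or end of the list.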

Shell Environment Variables Once at the command prompt, there are several environment variables that have values. The following is a list of variables found on most UNIX systems.

CDPATH Contains search path(s) for the cd command.

HOME Contains the user's home directory.

IFS Internal field separators, normally space, tab, and newline.

MAIL Path to a special file (mail box), used by UNIX e-mail.

PATH Contains search path(s) for commands and executables.

PS1 Primary prompt string, by default $.

PS2 Secondary prompt string, by default >.

TERM Terminal type being used.

If the restricted mode is not set, these variables can be modified to accommodate the user's various needs. For instance, to change your prompt, type the following.

$PS1="your wish:" ; export PS1

Now instead of a $ for a prompt, your wish: appears. To change it back, type the following.

$PS1="\$" ; export PS1

To display the value(s) in any given variable, type the echo command, space and a $ followed by the variable name.

$echo $MAIL
/usr/spool/mail/(user id)

Care should be used when modifying environment variables. Incorrect modifications to shell environment variables may cause commands either not to function or not to function properly. If this happens, it is recommended that the user log out and log back in. Experience with these variables will give you even more control over your shell environment. Also there is a set of environment variables that are identified by special characters. These are detailed in the next section of the chapter.

Special Characters and Their Meanings

The Bourne shell uses many of the non-alphanumeric characters to define specific shell features. Most of these features fall into four basic categories: special variable names, filename generation, data/program control, and quoting/escape character control. While this notation may seem cryptic at first, this gives the shell environment the ability to accomplish complex functions with a minimal amount of coding.

Special Characters for Shell Variable Names There are special characters that denote special shell variables, automatically set by the shell. As with all variables, they are preceded by a $. The following is a list of these variables.

$# The number of arguments supplied to the command shell.

$- Flags supplied to the shell on invocation or with set.

$? The status value returned by the last command.

$$ The process number of the current shell.

$! The process number of the last child process.

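$@ All arguments; when written as "$@", each argument is individually double quoted.

$* All arguments; when written as "$*", the arguments are double quoted together as a single string.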

$n Positional argument values, where 'n' is the position.

$0 The name of the current shell.

To display the number of arguments supplied to the shell, type the following.

$echo $#
0

This indicates that no arguments were supplied to the shell on invocation. These variables are particularly useful when writing a shell script, which is described later in this chapter in the section Positional Variables or Shell Arguments.
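To see several of these variables with values in them, the following sketch uses the set command to stand in for three command-line arguments and runs a child shell that deliberately exits with status 3. The argument values and variable names are invented for the illustration.

```shell
set -- alpha beta gamma          # stand-in for three command-line arguments
argc=$#
first=$1
sh -c 'exit 3' || status=$?      # a child shell that deliberately exits with status 3
echo "number of arguments: $argc"
echo "first argument: $first"
echo "status of the last command: $status"
echo "process number of this shell: $$"
```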

Special Characters for Filename Generation The Bourne shell uses special characters or meta-characters to indicate pattern matches with existing filenames. These are those characters:

* Matches any string or portion of string

? Matches any single character

[ ] Matches any one of the enclosed characters: a range (written with -), a list, or, with a leading !, any character not listed

To match files beginning with f, type the following:

$ls f*

To match files with a prefix of invoice, any middle character and a suffix of dat, type the following:

$ls invoice?dat

To match files starting with any letter from a through e, type the following:

$ls [a-e]*

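To match files starting with a, c, or e, type the following. The characters in a list are not separated by commas; a comma inside the brackets would itself be matched.

$ls [ace]*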

To match files that do not start with the letter m, type the following:

$ls [!m]*

NOTE: To use the logical not symbol, !, it must be the first character after the left bracket, [.
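The patterns can be tried safely in a scratch directory. The sketch below creates a few empty files with invented names and lets echo show what each pattern expands to; the directory name and the TMPDIR fallback are assumptions of the example.

```shell
here=`pwd`
demo=${TMPDIR:-/tmp}/globdemo.$$
mkdir "$demo" && cd "$demo" || exit 1
touch apple cherry eggplant mango invoice1dat invoice22dat

f_a_to_e=`echo [a-e]*`        # names starting with a through e
f_not_m=`echo [!m]*`          # names not starting with m
f_invoice=`echo invoice?dat`  # exactly one character between invoice and dat
echo "$f_a_to_e"
echo "$f_not_m"
echo "$f_invoice"

cd "$here" && rm -rf "$demo"
```

Using echo rather than ls makes it plain that the shell, not the command, performs the expansion.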

Special Characters for Data/Program Control The Bourne shell uses special characters for data flow and execution control. With these characters, the user can send normal screen output to a file or device or as input to another command. These characters also allow the user to run multiple commands sequentially or independently from the command line. The following is a list of those characters.

>(file) Redirect output to a file.

>>(file) Redirect and append output to the end of a file.

<(file) Redirect standard input from a file.

; Separate commands.

| Pipe standard output to standard input.

& Place at end of command to execute in background.

` ` Command substitution; the output of the enclosed command is substituted as arguments.

There are many ways to use these controls and those are described in more detail later in this chapter in the section Entering Simple Commands.

Special Characters for Quoting and Escape The Bourne shell uses the single quotes, '', and double quotes, "", to encapsulate special characters or space delineated words to produce a single data string. The major difference between single and double quotes is that in using double quotes, variable and command substitution is active as well as the escape character.

$echo "$HOME $PATH"
/u/farrawa /bin:/etc:/usr/bin:

This example combined the values of $HOME and $PATH to produce a single string.

$echo '$HOME $PATH'
$HOME $PATH

This example simply prints the string data enclosed. The shell escape character is a backslash \, which is used to negate the special meaning or shell function of the following character.

$echo \$HOME $PATH
$HOME /bin:/etc:/usr/bin:

In this example, the backslash causes the $ of $HOME to be seen as ordinary text, so its meaning to the shell as the start of a variable is negated. $PATH is still interpreted as a variable.
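The three forms can be compared side by side by storing each result in a variable. This is a small sketch; ITEM and the other names are invented for the illustration.

```shell
ITEM=ledger
double="file: $ITEM"       # double quotes: $ITEM is substituted
single='file: $ITEM'       # single quotes: the text is taken literally
escaped="file: \$ITEM"     # backslash: the $ loses its special meaning
echo "$double"
echo "$single"
echo "$escaped"
```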

How the Shell Interprets Commands

The first exposure most people have to the Bourne shell is as an interactive shell. After logging on the system and seeing any messages from the system administrator, the user sees a shell prompt. For users other than the super-user, the default prompt for the interactive Bourne shell is a dollar sign ($). When you see the dollar sign ($), the interactive shell is ready to accept a line of input, which it interprets.

The shell sees a line of input as a string of characters terminated with a newline character, which is usually the result of pressing Enter on your keyboard. The length of the input line has nothing to do with the width of your computer display. When the shell sees the newline character, it begins to interpret the line.

Entering Simple Commands

The most common form of input to the shell is the simple command, in which a command name is followed by any number of arguments. In the example

$ ls file1 file2 file3

ls is the command and file1, file2, and file3 are the arguments. The command is any UNIX executable. It is the responsibility of the command, not the shell, to interpret the arguments. Many UNIX commands, but certainly not all, take the following form:

$ command -options filenames

Although the shell does not interpret the arguments of the command, the shell does make some interpretation of the input line before passing the arguments to the command. Special characters, when you enter them on a command line, cause the shell to redirect input and output, start a different command, search the directories for filename patterns, substitute variable data, and substitute the output of other commands.

Substitution of Variable Data Many of the previous examples in this chapter have used variables in the command line. Whenever the shell sees a dollar sign ($) that is not quoted or escaped, it interprets the text that follows as a variable name. Whether the variable is environmental or user defined, the data stored in the variable is substituted on the command line. For example, the command

$ ls $HOME

lists the contents of the user's home directory, regardless of what the current working directory is. HOME is an environment variable. Variables are discussed in more detail in the next major section of this chapter. As in filename substitution, the ls command sees only the result of the substitution, not the variable name.

You can substitute variable names anywhere in the command line, including for the command name itself. For example,

$ dir=ls
$ $dir f*
file1
file1a
form

This example points out that the shell makes its substitutions before determining what commands to execute.

Redirection of Input and Output When the shell sees the input (<) or output (>) redirection characters, the argument following the redirection symbol is sent to the subshell that controls the execution of the command. When the command opens the input or output file that has been redirected, the input or output is redirected to the file.

$ ls -l >dirfile

In this example, the only argument passed on to ls is the option -l. The filename dirfile is sent to the subshell that controls the execution of ls. To append output to an existing file, use (>>).

$ ls -l /tmp >> dirfile

This example takes the file listing from the /tmp directory and appends it to the end of the local file dirfile.
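The difference between > and >> is easy to confirm with a scratch file. In this sketch the file name and the TMPDIR fallback are assumptions, and wc reads the file through input redirection so that it reports only the count.

```shell
out=${TMPDIR:-/tmp}/dirfile.$$
echo "first line"  > "$out"     # > creates the file or overwrites it
echo "second line" >> "$out"    # >> appends to the end of the file
lines=`wc -l < "$out"`          # < supplies the file as standard input
lines=`echo $lines`             # drop any padding that wc adds
echo "$out now holds $lines lines"
rm -f "$out"
```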

Entering Multiple Commands on One Line Ordinarily, the shell interprets the first word of command input as the command name and the rest of the input as arguments to that command. The semicolon (;) is a shell special character that indicates to the shell that the preceding command text is ended and the following is a new command. For example, the command line

$ who -H; df -v; ps -e

is the equivalent of

$ who -H
$ df -v
$ ps -e

In the second case, however, the results of each command appear between the command input lines. When you use the semicolon to separate commands on a line, the commands are executed in sequence. The shell waits until one command is complete before executing the next. A command that fails does not stop the sequence; the shell goes on to execute the remaining commands on the line.

Linking Multiple Commands with Pipes One of the most powerful features of the Bourne shell is its ability to take standard output from one command and use it as standard input to another. This is accomplished with the pipe symbol (|). When the shell sees a pipe, it links the standard output of the preceding command to the standard input of the following command, in the order the commands appear on the command line. For example,

$who | grep fred

This takes the list of users logged in from the who command and searches the list for the string fred using grep. This creates output only if user fred is logged in.

$ls -ls | sort -nr | pg

This creates a list of files in the current directory with the block size as the first data item of each line. It then sorts the list in reverse numeric order and finally pages the output on the screen. This results in a paged listing of all files by size with the largest on top. This is useful when trying to determine where disk space is being consumed. Any UNIX command that takes input from standard input and sends output to standard output can be linked using the pipe.
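Because the output of who and ls differs from system to system, the following sketch substitutes fixed text produced by printf so that the pipelines give the same result anywhere. The login names and file sizes are invented for the illustration.

```shell
# printf stands in for who, so the result does not depend on who is logged in.
logins=`printf 'fred\nmary\nalfred\njoe\n'`
matches=`echo "$logins" | grep fred | wc -l`   # count the lines containing fred
matches=`echo $matches`                        # drop any padding that wc adds

# Three invented "size name" lines, sorted largest first; sed keeps the top line.
largest=`printf '3 notes\n10 ledger\n7 invoice\n' | sort -nr | sed 1q`
echo "lines matching fred: $matches"
echo "largest entry: $largest"
```

Note that grep matches the string anywhere in the line, so alfred is counted along with fred.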

Entering Commands to Process in Background To take advantage of the UNIX ability to multitask, the shell allows commands to be processed in background. This is accomplished by placing the ampersand symbol, &, at the end of a command. For example,

$find / -name "ledger" -print > find.results 2>/dev/null &

This command line searches the entire file system for files named ledger, sends its output to a local file named find.results, eliminates unwanted errors, and processes this command independent of the current shell (in background).

$wc -l < chapter5.txt > chapter5.wcl 2> chapter5.err &

In this example, the command wc takes its input from the file chapter5.txt, sends its output (a line count) to the file chapter5.wcl, sends errors to the file chapter5.err, and executes in background.

NOTE: If a user has processes running in the background and she logs off, most UNIX systems will terminate the processes owned by that login. Also when you enter a command to process in background, the shell will return and display the process ID number.
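Within a script, the process ID of a background command is available in the special variable $!, and the wait command pauses until that process finishes. The sketch below uses a one-second sleep as a stand-in for a long-running job; the variable names are illustrative.

```shell
sleep 1 &                 # start a short job in the background
bgpid=$!                  # process ID of that background job
echo "background job $bgpid started"
wait $bgpid               # pause until the job finishes
bgstatus=$?               # wait returns the exit status of the job
echo "background job $bgpid finished with status $bgstatus"
```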

Substituting the Results of Commands in a Command Line

Sometimes it is useful to pass the output or results of one command as arguments to another command. You do so by using the shell special character, the back quotation mark (`). You use the back quotation marks in pairs. When the shell sees a pair of back quotation marks, it executes the command inside the quotation marks and substitutes the output of that command in the original command line. You most commonly use this method to store the results of command executions in variables. To store the five-digit Julian date in a variable, for example, you use the following command:

$ julian=`date '+%y%j'`

The back quotation marks cause the date command to be executed before the variable assignment is made. Back quotation marks can be extremely useful when you're performing arithmetic on shell variables; see "Shell Programming" later in this chapter.
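A short sketch of the same idea as it might appear in a script follows. The second assignment, which counts the entries in the root directory, is an added illustration and not part of the example above.

```shell
julian=`date '+%y%j'`             # two-digit year followed by day of the year
entries=`ls / | wc -l`            # a pipeline can be substituted as well
entries=`echo $entries`           # drop any padding that wc adds
echo "Julian date: $julian"
echo "Entries in /: $entries"
```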

Shell Variables

In algebra, variables are symbols that stand for some value. In computer terminology, variables are symbolic names that stand for some value. Earlier in this chapter, you saw how the variable HOME stood for the name of a user's home directory. If you enter the change directory command, cd, without an argument, cd takes you to your home directory. Does a generic program like cd know the location of every user's home directory? Of course not, it merely knows to look for a variable, in this case HOME, which stands for the home directory.

Variables are useful in any computer language because they allow you to define what to do with a piece of information without knowing specifically what the data is. A program to add two and two is not very useful, but a program that adds two variables can be, especially if the value of the variables can be supplied at execution time by the user of the program. The Bourne shell has four types of variables: user-defined variables, positional variables or shell arguments, predefined or special variables, and environment variables.

Storing Data or User-Defined Variables

As the name implies, user-defined variables are whatever you want them to be. Variable names are composed of alphanumeric characters and the underscore character, with the provision that variable names do not begin with one of the digits 0 through 9. (Like all UNIX names, variables are case sensitive.) Variable names take on values when they appear in a command line to the left of an equal sign (=). For example, in the following command lines, COUNT takes on the value of 1, and NAME takes on the value of Stephanie:

$ COUNT=1
$ NAME=Stephanie

TIP: Because most UNIX commands are lowercase words, shell programs have traditionally used all capital letters in variable names. It is certainly not mandatory to use all capital letters, but using them enables you to identify variables easily within a program.

To recall the value of a variable, precede the variable name by a dollar sign ($):

$ NAME=John
$ echo Hello $NAME
Hello John

You also can assign variables to other variables, as follows:

$ JOHN=John
$ NAME=$JOHN
$ echo Goodbye $NAME
Goodbye John

Sometimes it is useful to combine variable data with other characters to form new words, as in the following example:

$ SUN=Sun
$ MON=Mon
$ TUE=Tues
$ WED=Wednes
$ THU=Thurs
$ FRI=Fri
$ SAT=Satur
$ WEEK=$SAT
$ echo Today is $WEEKday
Today is
$

What happened here? Remember that when the shell's interpreter sees a dollar sign ($), it interprets the letters, digits, and underscores that follow as the name of a variable, in this case WEEKday, which has never been assigned a value. You can escape the effect of this interpretation by enclosing the variable name in curly braces ({,}) like this:

$ echo Today is ${WEEK}day
Today is Saturday
$

You can assign more than one variable in a single line by separating the assignments with white space, as follows:

$ X=x Y=y

The variable assignment is performed from right to left:

$ X=$Y Y=y
$ echo $X
y
$ Z=z Y=$Z
$ echo $Y

$

You may notice that when a variable that has not been defined is referenced, the shell does not give you an error but instead gives you a null value.

You can remove the value of a variable using the unset command, as follows:

$ Z=hello
$ echo $Z
hello
$ unset Z
$ echo $Z

$

Conditional Variable Substitution

The most common way to retrieve the value of a variable is to precede the variable name with a dollar sign ($), causing the value of the variable to be substituted at that point. With the Bourne shell, you can cause variable substitution to take place only if certain conditions are met. This is called conditional variable substitution. You always enclose conditional variable substitutions in curly braces ({ }).

Substituting Default Values for Variables As you learned earlier, when variables that have not been previously set are referenced, a null value is substituted. The Bourne shell enables you to establish default values for variable substitution using the form

${variable:-value}

where variable is the name of the variable and value is the default substitution. For example,

$ echo Hello $UNAME
Hello
$ echo Hello ${UNAME:-there}
Hello there
$ echo $UNAME

$
$ UNAME=John
$ echo Hello ${UNAME:-there}
Hello John
$

As you can see in the preceding example, when you use this type of variable substitution, the default value is substituted in the command line, but the value of the variable is not changed. Another substitution construct not only substitutes the default value but also assigns the default value to the variable as well. This substitution has the form

${variable:=value}

which causes variable to be assigned value after the substitution has been made. For example,

$ echo Hello $UNAME
Hello
$ echo Hello ${UNAME:=there}
Hello there
$ echo $UNAME
there
$ UNAME=John
$ echo Hello ${UNAME:-there}
Hello John
$

The substitution value need not be literal; it can be a command in back quotation marks:

USERDIR=${MYDIR:-`pwd`}

A third type of variable substitution substitutes the specified value if the variable has been set, as follows:

${variable:+value}

If variable is set, then value is substituted; if variable is not set, then nothing is substituted. For example,

$ ERROPT=A
$ echo ${ERROPT:+"Error Tracking is Active"}
Error Tracking is Active
$ ERROPT=
$ echo ${ERROPT:+"Error Tracking is Active"}

$

Conditional Variable Substitution with Error Checking Another variable substitution method allows for error checking during variable substitution:

${variable:?message}

If variable is set, its value is substituted; if it is not set, message is written to the standard error file. If the substitution is made in a shell program, the program immediately terminates. For example,

$ UNAME=
$ echo ${UNAME:?"UNAME has not been set"}
UNAME has not been set
$ UNAME=Stephanie
$ echo ${UNAME:?"UNAME has not been set"}
Stephanie
$

If no message is specified, the shell displays a default message, as in the following example:

$ UNAME=
$ echo ${UNAME:?}
sh: UNAME: parameter null or not set
$
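The four forms are often used together near the top of a shell program. The following sketch, with invented variable names and paths, records what each form yields; the :? check is run inside parentheses (a subshell) so that a failure ends only the subshell and not the program making the test.

```shell
unset REPORTDIR
shown=${REPORTDIR:-/tmp}                  # default is used; REPORTDIR stays unset
still=${REPORTDIR:-"not set"}             # confirms that nothing was assigned
: ${REPORTDIR:=/tmp/reports}              # default is used and also assigned
assigned=$REPORTDIR
note=${REPORTDIR:+"directory chosen"}     # substituted only because REPORTDIR is now set

# Run the :? check in a subshell so that a failure ends only the subshell.
if (unset MISSING; : ${MISSING:?"MISSING has not been set"}) 2>/dev/null
then checked="value present"
else checked="value missing"
fi
echo "$shown / $still / $assigned / $note / $checked"
```

The colon command does nothing itself; it is there only so that the shell evaluates the substitution.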

Positional Variables or Shell Arguments

You may recall that when the shell's command-line interpreter processes a line of input, the first word of the command line is considered to be an executable file, and the remainder of the line is passed as arguments to the executable. If the executable is a shell program, the arguments are passed to the program as positional variables. The first argument passed to the program is assigned to the variable $1, the second argument is $2, and so on up to $9. Notice that the names of the variables are actually the digits 1 through 9; the dollar sign, as always, is the special character that causes variable substitution to occur.

The positional variable $0 always contains the name of the executable. Positional variables are discussed later in this chapter in the section "Shell Programming."

Preventing Variables from Being Changed If a variable has a value assigned, and you want to make sure that its value is not subsequently changed, you may designate a variable as a read-only variable with the following command:


readonly variable

From this point on, variable cannot be reassigned. This ensures that a variable won't be accidentally changed.
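As a brief sketch, the following marks a variable read-only and then attempts to change it. The name RATE is invented for the example, and the attempt is made in a subshell because some shells stop a program that tries to assign to a read-only variable.

```shell
RATE=7
readonly RATE
# Attempt the change in a subshell so that the error cannot end this shell.
if (RATE=9) 2>/dev/null
then changed=yes
else changed=no
fi
echo "RATE is $RATE (changed: $changed)"
```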

Making Variables Available to Subshells with export When a shell executes a program, it sets up a new environment for the program to execute in. This is called a subshell. In the Bourne shell, variables are considered to be local variables; in other words, they are not recognized outside the shell in which they were assigned a value. You can make a variable available to any subshells you execute by exporting it using the export command. Your variables can never be made available to other users.

Suppose you change your prompt and then start a new shell.

$ PS1="Enter Command: "
Enter Command: sh
$ exit
Enter Command:

When you started a new shell, the default shell prompt appeared. This is because the variable assignment to PS1 was made only in the current shell. To make the new shell prompt active in subshells, you must export it as in the following example.

$ PS1="Enter Command: "
Enter Command: export PS1
Enter Command: sh
Enter Command:

Now the variable PS1 is global; that is, it is available to all subshells. When a variable has been made global in this way, it remains available until you log out of the parent shell. You can make an assignment permanent by including it in your .profile; see "Customizing the Shell."
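The effect of export can also be checked without changing the prompt. In this sketch, which uses invented variable names, a child shell is asked to print two variables, only one of which has been exported.

```shell
LOCALVAR=local                 # assigned but not exported
SHAREDVAR=shared
export SHAREDVAR               # now visible to subshells
# The single quotes keep the current shell from substituting the variables,
# so the values printed are the ones the child shell sees.
child_view=`sh -c 'echo "local=[$LOCALVAR] shared=[$SHAREDVAR]"'`
echo "$child_view"
```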

Shell Script Programming

In this major section, you learn how to put commands together in such a way that the sum is greater than the parts. You learn some UNIX commands that are useful mainly in the context of shell programs. You also learn how to make your program perform functions conditionally based on logical tests that you define, and you learn how to have parts of a program repeat until its function is completed. In short, you learn how to use the common tools supplied with UNIX to create more powerful tools specific to the tasks you need to perform.

What Is a Program?

A wide assortment of definitions exists for what is a computer program, but for this discussion, a computer program is an ordered set of instructions causing a computer to perform some useful function. In other words, when you cause a computer to perform some tasks in a specific order so that the result is greater than the individual tasks, you have programmed the computer. When you enter a formula into a spreadsheet, for example, you are programming. When you write a macro in a word processor, you are programming. When you enter a complex command like

$ ls -R / | grep myname | pg

in a UNIX shell, you are programming the shell; you are causing the computer to execute a series of utilities in a specific order, which gives a result that is more useful than the result of any of the utilities taken by itself.

A Simple Program

Suppose that daily you back up your data files with the following command:

$ cd /usr/home/myname; ls * | cpio -o >/dev/rmt0

As you learned earlier, when you enter a complex command like this, you are programming the shell. One of the useful things about programs, though, is that they can be placed in a program library and used over and over, without having to do the programming each time. Shell programs are no exception. Rather than enter the lengthy backup command each time, you can store the program in a file named backup:

$ cat >backup
cd /usr/home/myname
ls * | cpio -o >/dev/rmt0
Ctrl+d

You could, of course, use your favorite editor (see UNIX Unleashed, Internet Edition, Chapter 3, "Text Editing with vi and emacs"), and in fact with larger shell programs, you almost certainly will want to. You can enter the command in a single line, as you did when typing it into the command line, but because the commands in a shell program (sometimes called a shell script) are executed in sequence, putting each command on a line by itself makes the program easier to read. Creating easy-to-read programs becomes more important as the size of the programs increases.

Now to back up your data files, you need to call up another copy of the shell program (known as a subshell) and give it the commands found in the file backup. To do so, use the following command:

$ sh backup

The program sh is the same Bourne shell that was started when you logged in, but when a filename is passed as an argument, instead of becoming an interactive shell, it takes its commands from the file.

An alternative method for executing the commands in the file backup is to make the file itself an executable. To do so, use the following command:

$ chmod +x backup

Now you can back up your data files by entering the newly created command:

$ backup

If you want to execute the commands in this manner, the file backup must reside in one of the directories specified in the environment variable $PATH.

The Shell as a Language

If all you could do in a shell program was to string together a series of UNIX commands into a single command, you would have an important tool, but shell programming is much more. Like traditional programming languages, the shell offers features that enable you to make your shell programs more useful, such as: data variables, argument passing, decision making, flow control, data input and output, subroutines, and handling interrupts.

By using these features, you can automate many repetitive functions, which is, of course, the purpose of any computer language.

Using Data Variables in Shell Programs

You usually use variables within programs as place holders for data that will be available when the program is run and that may change from execution to execution. Consider the backup program:

cd /usr/home/myname
ls * | cpio -o >/dev/rmt0

In this case, the directory to be backed up is contained in the program as a literal, or constant, value. This program is useful only to back up that one directory. The use of a variable makes the program more generic:

cd $WORKDIR
ls * | cpio -o >/dev/rmt0

With this simple change, any user can use the program to back up the directory that has been named in the variable $WORKDIR, provided that the variable has been exported to subshells. See "Making Variables Available to Subshells with export" earlier in this chapter.
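The effect of export on a subshell can be checked directly. This is a small sketch; the value /tmp is only a stand-in for a real working directory:

```shell
unset WORKDIR                      # start from a clean slate
WORKDIR=/tmp                       # set in this shell only (hypothetical value)
before=`sh -c 'echo "$WORKDIR"'`   # the subshell does not see the variable yet
export WORKDIR
after=`sh -c 'echo "$WORKDIR"'`    # now the subshell inherits it
echo "before export: [$before]"
echo "after export:  [$after]"
```

The first line of output shows empty brackets, because the subshell started before the export has no WORKDIR of its own.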

Entering Comments in Shell Programs

Quite often when you're writing programs, program code that seemed logical six months ago may be fairly obscure today. Good programmers annotate their programs with comments. You enter comments into shell programs by inserting the pound sign (#) special character. When the shell interpreter sees the pound sign, it considers all text to the end of the line as a comment.
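As a short illustration (the variable names here are made up for the example), comments can occupy a whole line or follow a command:

```shell
# greet - print a short greeting          (a whole-line comment)
name=world          # a comment can also follow a command on the same line
label="item #1"     # inside quotes, # is ordinary text and starts no comment
greeting="hello $name, this is $label"
echo "$greeting"
```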

Doing Arithmetic on Shell Variables

In most higher level programming languages, variables are typed, meaning that they are restricted to certain kinds of data, such as numbers or characters. Shell variables are always stored as characters. To do arithmetic on shell variables, you must use the expr command.

The expr command evaluates its arguments as mathematical expressions. The general form of the command is as follows:

expr integer operator integer

Because the shell stores its variables as characters, it is your responsibility as a shell programmer to make sure that the integer arguments to expr are in fact integers. Following are the valid arithmetic operators:

+ Adds the two integers.

- Subtracts the second integer from the first.

* Multiplies the two integers.

/ Divides the first integer by the second.

% Gives the modulus (remainder) of the division.

$ expr 2 + 1
3
$ expr 5 - 3
2

If the argument to expr is a variable, the value of the variable is substituted before the expression is evaluated, as in the following example:

$ int=3
$ expr $int + 4
7

You should avoid using the asterisk operator (*) alone for multiplication. If you enter

$ expr 4 * 5

you get an error because the shell sees the asterisk and performs filename substitution before sending the arguments on to expr. The proper form of the multiplication expression is

$ expr 4 \* 5
20

You also can combine arithmetic expressions, as in the following:

$ expr 5 + 7 / 3
7

The results of the preceding expression may seem odd. The first thing to remember is that division and multiplication are of a higher precedence than addition and subtraction, so the first operation performed is 7 divided by 3. Because expr deals only in integers, the result of the division is 2, which is then added to 5, giving the final result 7. Unescaped parentheses are special characters to the shell, so a simple way to override the precedence is to evaluate the part that should come first on its own. You can use back quotation marks to do so, as follows:

$ int=`expr 5 + 7`
$ expr $int / 3
4

Or you can use the more direct route:

$ expr `expr 5 + 7` / 3
4
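Inside a shell program, the result of expr is normally captured in a variable with back quotation marks. The following sketch, with variable names chosen only for illustration, gathers the techniques described above:

```shell
count=5
count=`expr $count + 1`        # increment a counter: 5 becomes 6
product=`expr 4 \* 5`          # the backslash keeps the shell from expanding *
total=`expr 5 + 7`             # evaluate the addition first...
average=`expr $total / 3`      # ...then divide; the result is an integer
echo "count=$count product=$product average=$average"
```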

Passing Arguments to Shell Programs

A program can get data in two ways: either it is passed to the program when it is executed as arguments, or the program gets data interactively. An editor such as vi is usually used in an interactive mode, whereas commands such as ls and expr get their data as arguments. Shell programs are no exception. In the section "Reading Data into a Program Interactively," you see how a shell program can get its data interactively.

Passing arguments to a shell program on a command line can greatly enhance the program's versatility. Consider the inverse of the backup program presented earlier:

$ cat >restoreall
cd $WORKDIR
cpio -i </dev/rmt0
Ctrl+d

As written, the program restoreall reloads the entire tape made by backup. But what if you want to restore only a single file from the tape? You can do so by passing the name of the file as an argument. The enhanced restore1 program is now:

# restore1 - program to restore a single file
cd $WORKDIR
cpio -i $1 </dev/rmt0

Now you can pass a parameter representing the name of the file to be restored to the restore1 program:

$ restore1 file1

Here, the filename file1 is passed to restore1 as the first positional parameter. The limitation to restore1 is that if you want to restore two files, you must run restore1 twice.

As a final enhancement, you can use the $* variable to pass any number of arguments to the program:

# restoreany - program to restore any number of files
cd $WORKDIR
cpio -i $* </dev/rmt0

$ restoreany file1 file2 file3

Shell variables that have not been assigned a value always return null, or empty. Therefore, if the restore1 or restoreany program is run with no command-line parameters, a null value is placed in the cpio command, which causes the entire archive to be restored.
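To see what the positional parameters contain without running cpio, the set command can be used to imitate a command line. This is only a sketch; set -- replaces the arguments of the current shell and is used here as a stand-in for actually invoking restoreany:

```shell
# Imitate the command line: restoreany file1 file2 file3
set -- file1 file2 file3
count=$#
first=$1
all="$*"
echo "$count arguments; first is $first; all are: $all"

# Imitate running the program with no arguments at all.
set --
none="$*"
echo "with no arguments, \$* expands to [$none]"
```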

Consider the program in listing 9.1; it calculates the length of time to travel a certain distance.

Listing 9.1. Program example with two parameters.

# traveltime - a program to calculate how long it will
# take to travel a fixed distance
# syntax: traveltime miles mph
X60=`expr $1 \* 60`
TOTMINUTES=`expr $X60 / $2`
HOURS=`expr $TOTMINUTES / 60`
MINUTES=`expr $TOTMINUTES % 60`
echo "The trip will take $HOURS hours and $MINUTES minutes"

The program in listing 9.1 takes two positional parameters: the distance in miles and the rate of travel in miles per hour. The mileage is passed to the program as $1 and the rate of travel as $2. Note that the first command in the program multiplies the mileage by 60. Because the expr command works only with integers, it is useful to calculate the travel time in minutes. The user-defined variable X60 holds an interim calculation that, when divided by the mileage rate, gives the total travel time in minutes. Then, using both integer division and modulus division, the number of hours and number of minutes of travel time is found.

Now execute traveltime for a 90-mile trip at 40 mph with the following command line:

$ traveltime 90 40
The trip will take 2 hours and 15 minutes
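The order of the operations matters because expr discards the fractional part of every division. The following sketch, using the same 90-mile, 40-mph figures with illustrative variable names, compares dividing first with multiplying first:

```shell
miles=90
mph=40
early=`expr $miles / $mph \* 60`   # 90 / 40 is truncated to 2, then 2 * 60
late=`expr $miles \* 60 / $mph`    # 90 * 60 is 5400, then 5400 / 40
echo "dividing first: $early minutes; multiplying first: $late minutes"
```

Dividing first loses 15 minutes of the trip, which is why the listing multiplies the mileage by 60 before dividing by the rate.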

Decision Making in Shell Programs

One of the things that gives computer programming languages much of their strength is their capability to make decisions. Of course, computers don't think, so the decisions that computer programs make are only in response to conditions that you have anticipated in your program. The decision making done by computer programs is in the form of conditional execution: if a condition exists, then execute a certain set of commands. In most computer languages, this setup is called an if-then construct.

The if-then Statement

The Bourne shell also has an if-then construct. The syntax of the construct is as follows:

if command_1
then
  command_2
  command_3
fi
command_4

You may recall that every program or command concludes by returning an exit status. The exit status is available in the shell variable $?. The if statement checks the exit status of its command. If that command is successful, then all the commands between the then statement and the fi statement are executed. In this program sequence, command_1 is always executed, command_2 and command_3 are executed only if command_1 is successful, and command_4 is always executed.
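A minimal way to watch this behavior is to use the standard commands true and false, which do nothing except exit with a zero or nonzero status. The variable names in this sketch are illustrative only:

```shell
ran_after_true=no
if true                  # true exits with status 0 (success)
then
    ran_after_true=yes
fi

ran_after_false=no
if false                 # false exits with a nonzero status (failure)
then
    ran_after_false=yes
fi

echo "after true: $ran_after_true, after false: $ran_after_false"
```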

Consider a variation of the backup program, except that after copying all the files to the backup media, you want to remove them from your disk. Call the program unload and allow the user to specify the directory to be unloaded on the command line, as in the following example:

# unload - program to backup and remove files
# syntax - unload directory
cd $1
ls -a | cpio -o >/dev/rmt0
rm *

At first glance, it appears that this program will do exactly what you want. But what if something goes wrong during the cpio command? In this case, the backup media is a tape device. What if the operator forgets to insert a blank tape in the tape drive? The rm command would go ahead and execute, wiping out the directory before it has been backed up! The if-then construct prevents this catastrophe from happening. A revised unload program is shown in listing 9.2.

Listing 9.2. Shell program with error checking.

# unload - program to backup and remove files
# syntax - unload directory
cd $1
if ls -a | cpio -o >/dev/rmt0
then
   rm *
fi

In the program in listing 9.2, the rm command is executed only if the cpio command is successful. Note that the if statement looks at the exit status of the last command in a pipeline.
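The point about pipelines can be checked with true and false standing in for real commands. In this sketch (variable names are illustrative), only the last command of each pipeline decides which branch runs:

```shell
if false | true          # the first command fails, the last succeeds
then
    first_fails=success
else
    first_fails=failure
fi

if true | false          # the first command succeeds, the last fails
then
    last_fails=success
else
    last_fails=failure
fi

echo "false | true is treated as $first_fails"
echo "true | false is treated as $last_fails"
```

It follows that a failure of ls alone, with cpio still exiting successfully, would not be noticed by the if statement.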

Data Output from Shell Programs

The standard output and error output of any commands within a shell program are passed on to the standard output of the user who invokes the program unless that output is redirected within the program. In the example in listing 9.2, any error messages from cpio would have been seen by the user of the program. Sometimes you may write programs that need to communicate with the user of the program. In Bourne shell programs, you usually do so by using the echo command. As the name indicates, echo simply sends its arguments to the standard output and appends a newline character at the end, as in the following example:

$ echo "Mary had a little lamb"
Mary had a little lamb

The echo command recognizes several special escape characters that assist in formatting output. They are as follows:

\b Backspace

\c Prints line without newline character

\f Form Feed: advances page on a hard copy printer; advances to new screen on a display terminal

\n Newline

\r Carriage return

\t Tab

\v Vertical Tab

\\ Backslash

\0nnn A one-, two-, or three-digit octal integer representing one of the ASCII characters

If you want to display a prompt to the user to enter the data, and you want the user response to appear on the same line as the prompt, you use the \c character, as follows:

$ echo "Enter response:\c"
Enter response:$
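Not every version of echo interprets these escape characters; some print them literally. Where the printf command is available (an assumption about the reader's system, since it is not part of the original Bourne shell), it can produce the same effect. A minimal sketch:

```shell
# printf adds no newline unless the format asks for one with \n.
printf 'Enter response:'
printf '\n'

# Capture the output to show that the following text lands on the same line.
line=`printf 'Enter response:'; echo 'typed text'`
echo "$line"
```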

The if-then-else Statement

A common desire in programming is to perform one set of commands if a condition is true and a different set of commands if the condition is false. In the Bourne shell, you can achieve this effect by using the if-then-else construct:

if command_1
then
   command_2
   command_3
else
   command_4
   command_5
fi

In this construct, command_1 is always executed. If command_1 succeeds, then command_2 and command_3 are executed; if it fails, command_4 and command_5 are executed.

You can now enhance the unload program to be more user friendly. For example,

# unload - program to backup and remove files
# syntax - unload directory
cd $1
if ls -a | cpio -o >/dev/rmt0
then
   rm *
else
   echo "A problem has occurred in creating the backup."
   echo "The directory will not be erased."
   echo "Please check the backup device and try again."
fi
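The same pattern can be tried without a tape drive. The following sketch substitutes cp and two scratch directories for cpio and the backup device; the directory and file names are hypothetical, and mktemp is assumed to be available:

```shell
src=`mktemp -d`      # stands in for the directory being unloaded
dst=`mktemp -d`      # stands in for the backup media
echo "data" >"$src/file1"
echo "data" >"$src/file2"

# The copy succeeds, so the original is removed.
if cp "$src/file1" "$dst/file1"
then
    rm "$src/file1"
    first=removed
else
    first=kept
fi

# The copy fails because the target directory does not exist,
# so the else clause runs and the original is left alone.
if cp "$src/file2" "$dst/no_such_dir/file2" 2>/dev/null
then
    rm "$src/file2"
    second=removed
else
    second=kept
fi

left=`ls "$src"`
echo "file1 $first, file2 $second; still on disk: $left"
rm -r "$src" "$dst"
```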

TIP: Because the shell ignores extra whitespace in a command line, good programmers use this fact to enhance the readability of their programs. When commands are executed within a then or else clause, indent all the commands in the clause the same distance.

Testing Conditions with test

You've seen how the if statement tests the exit status of its command to control the order in which commands are executed, but what if you want to test other conditions? A command that is used a great deal in shell programs is the test command. The test command examines some condition and returns a zero exit status if the condition is true and a nonzero exit status if the condition is false. This capability gives the if statement in the Bourne shell the same power as other languages with some enhancements that are helpful in shell programming.

The general form of the command is as follows:

test condition

The conditions that can be tested fall into four categories: 1) String operators that test the condition or relationship of character strings; 2) Integer relationships that test the numerical relationship of two integers; 3) File operators that test for the existence or state of a file; 4) Logical operators that allow for and/or combinations of the other conditions.

Testing Character Data

You learned earlier that the Bourne shell does not type cast data elements. Each word of an input line and each variable can be taken as a string of characters. Some commands, such as expr and test, have the capability to perform numeric operations on strings that can be translated to integer values, but any data element can be operated on as a character string.

You can compare two strings to see whether they are equivalent or not equivalent. You also can test a single string to see whether it has a value or not. The string operators are as follows:

str1 = str2 True if str1 is the same length and contains the same characters as str2

str1 != str2 True if str1 is not the same as str2

-n str1 True if the length of str1 is greater than 0 (is not null)

-z str1 True if str1 is null (has a length of 0)

str1 True if str1 is not null

Even though you most often use test with a shell program as a decision maker, test is a program that can stand on its own as in the following:

$ str1=abcd
$ test $str1 = abcd
$ echo $?
0
$

Notice that unlike the variable assignment statement in the first line in the preceding example, the test command must have the equal sign surrounded by white space. In this example, the shell sends three arguments to test. Strings must be equivalent in both length and content, character by character.

$ str1="abcd "
$ test "$str1" = abcd
$ echo $?
1
$

In the preceding example, str1 contains five characters, the last of which is a space. The second string in the test command contains only four characters. The nonequivalency operator returns a true value everywhere that the equivalency operator returns false.

$ str1=abcd
$ test $str1 != abcd
$ echo $?
1
$
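In a shell program, test is usually combined with if. The sketch below uses the string operators listed above with illustrative variable names. The variables are enclosed in double quotation marks so that an empty value or a value containing spaces still reaches test as a single argument:

```shell
str1="abcd "                   # five characters; the last is a space
if test "$str1" = abcd
then
    match=yes
else
    match=no
fi

empty=""
if test -z "$empty"            # true because the string has a length of 0
then
    is_null=yes
else
    is_null=no
fi

if test -n "$str1"             # true because the string is not null
then
    has_value=yes
else
    has_value=no
fi

echo "match=$match is_null=$is_null has_value=$has_value"
```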

Two of the string operations, testing of a string with no operator and testing with the

`-a`	Tag all variables for export.
`-c "string"`	Commands are read from string.
`-e`	Non-interactive mode.
`-f`	Disable shell filename generation.
`-h`	Locate and remember functions as defined.
`-i`	Interactive mode.
`-k`	Put arguments in the environment for a command.
`-n`	Reads commands but does not execute them.
`-r`	Restricted mode.
`-s`	Commands are read from the standard input.
`-t`	A single command is executed, and the shell exits.
`-u`	Unset variables are an error during substitution.
`-v`	Verbose mode, displays shell input lines.
`-x`	Trace mode, displays commands as they are executed.

`CDPATH`	Contains search path(s) for the `cd` command.
`HOME`	Contains the user's home directory.
`IFS`	Internal field separators, normally space, tab, and newline.
`MAIL`	Path to a special file (mail box), used by UNIX e-mail.
`PATH`	Contains search path(s) for commands and executables.
`PS1`	Primary prompt string, by default `$`.
`PS2`	Secondary prompt string, by default `>`.
`TERM`	Terminal type being used.

`$#`	The number of arguments supplied to the command shell.
`$-`	Flags supplied to the shell on invocation or with `set`.
`$?`	The status value returned by the last command.
`$$`	The process number of the current shell.
`$!`	The process number of the last child process.
`$@`	All arguments, individually double quoted.
`$*`	All arguments, double quoted.
`$n`	Positional argument values, where 'n' is the position.
`$0`	The name of the current shell.

`*`	Matches any string or portion of string
`?`	Matches any single character
`[-,!]`	Range, list or not matched

`>(file)`	Redirect output to a file.
`>>(file)`	Redirect and append output to the end of a file.
`<(file)`	Redirect standard input from a file.
`;`	Separate commands.
`|`	Pipe standard output to standard input.
`&`	Place at end of command to execute in background.
`` ` ` ``	Command substitution, redirect output as arguments.
