UNIX USER TRAINING


Session 2 : The UNIX Environment


Objectives


This session covers the following topics:


  1. The Shell

  2. Types of Shell

  3. The C shell

    1. What happens at login time?

    2. Everything you never wanted to know about variables

    3. Command History

    4. C shell Aliases

    5. Input/Output Redirection

    6. C shell Pipes

    7. Command Substitution

    8. Running Commands in the Background

    9. C shell Limitations


1) The Shell - A Command Interpreter


When you type commands at the UNIX prompt they are 'captured' by a program called the shell and interpreted before being passed to the operating system kernel for execution. The kernel then runs the command and displays the output (if any) on your terminal. But what is the shell and what is it doing when it interprets your commands?


The shell is a standard UNIX program like any other. When you log into a UNIX system the shell is the first program (or process) that you run. You do not need to deliberately run this first shell program; the login process does that for you. Each time you log into a UNIX system, a shell process will be set up. On UNIX workstations or PCs running some sort of X Windows emulator, each terminal window you run also constitutes a shell process.


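The shell interprets your commands. This is quite a big job for such a small program. The tasks it performs, each covered later in this session, include substituting variables, expanding file names, resolving aliases, redirecting input and output, and performing command substitution.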



2) Types of Shell


Solaris includes a number of different shells. The most important of these are described briefly below:


The Bourne Shell (sh) - The original UNIX shell. Good for writing shell scripts but poor for interactive use.


The C shell (csh) - A more advanced shell program. This has good facilities for interactive use, such as a command history mechanism, filename completion and a rudimentary command line editing facility. For writing shell scripts, csh uses a syntax similar to the C programming language.


The Korn Shell (ksh) - A re-write of the Bourne Shell, designed to combine the best programming features of the Bourne Shell with the interactive facilities of the C-shell. It has a more flexible command line editing facility (vi based!), command history and aliasing.


In addition to these, there are a number of other shells which have been developed over time. Such shells include the Bourne-Again Shell (bash) and the TENEX C shell (tcsh), which are not supported under Solaris.


Your user account, when initially set up by IT, will have the C shell as your default login shell. The remainder of this course will, therefore, concentrate on csh. It is important to understand that, nine times out of ten, the shell you are using will make no difference whatsoever to the output of commands like ls, cat, rm and all the others you will be using on a regular basis.


3) The C Shell


a) At login time


When you log in to UNIX with the C shell as your command interpreter, several scripts are run before you even see your prompt. Briefly, these scripts are:


/etc/.login - A system-wide parameter file

~/.cshrc - Owned by the user, the .cshrc file can contain your own settings

~/.login - Again owned by the user, .login can also contain your settings


Thereafter, such as when running a new shell or a command terminal under X, only the .cshrc file is run.


Why have two user-owned files to set up the C shell environment? The difference is subtle: the .login file is run only at login time; the .cshrc file is run every time a new C shell is started. Therefore, place only those commands that need to be run once in the .login file. Typically, .login contains commands to specify the terminal type and environment.


NB: the ~ (tilde) symbol is a C shell abbreviation for your UNIX home directory (e.g. /home/colinbr)


b) Everything you never wanted to know about variables


There are two types of variable.


Environment variables are exported to the environment and will be inherited by subsequent shell processes.


Shell variables are local to that shell process and are not inherited by child shells.


An example to clarify these statements will follow. But first, some basic information on variables.


Legitimate characters that may be used in variable names are


A-Z , a-z , 0-9 , _


By convention, environment variables are defined in UPPER CASE, while shell variables are usually lower or MixedCase. Variable names must start with a letter; the _ (underscore) is considered a letter.




Displaying environment variables


Use the env or setenv commands to display all the environment variables currently in force. This can generate quite a lot of output, particularly during an X Windows session (CDE or OpenWindows) where the login process sets a lot of display and library related environment variables. To refer to the value of a variable, simply precede the variable name with a $ (dollar) symbol. The examples below use the echo command to display a couple of basic variables.

mclaren% echo $HOME

/home/colinbr

mclaren% echo $PATH

/bin:/usr/bin:/usr/ucb:/usr/sbin:/etc:.:/opt/dbs/oracle/product/816/bin
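A long PATH is hard to read on a single line. As a small sketch, the commands below print the search path one directory per line and then list the whole environment in sorted order. They use only standard utilities (tr, env, sort) and no shell-specific syntax, so they should behave the same under csh and sh.

```shell
# Print each directory in the search path on its own line.
echo $PATH | tr ':' '\n'

# List every environment variable, sorted by name.
env | sort
```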


Important Environment Variables


ALL shells support these; setting them differs from shell-to-shell


NB: .login does NOT get called by CDE logins!


HOME - your home directory (e.g. /home/colinbr)

PATH - directory search path for executables

TERM - the terminal type

SHELL - the shell interpreter you are using

USER - your username


Other important variables


LD_LIBRARY_PATH - path to software library files

EDITOR - the default editor to use

VISUAL - the default full-screen editor to use

PAGER - the page-display program (normally pg or more)


Setting environment variables


Due to the vagaries of C shell start up, it's probably best to set all environment variables in your .cshrc file.


For example


setenv PATH /bin:/usr/bin:/usr/ucb:.

setenv PAGER more

setenv LD_LIBRARY_PATH /opt/dbs/oracle/product/816/lib:/usr/local/lib


Environment variables can reference themselves. Given the PATH variable shown above, the following command can append /usr/local/bin to the end of the path:


setenv PATH ${PATH}:/usr/local/bin


Note the use of ${PATH}. The curly brackets separate the variable name from the colon that follows it. In the C shell, a : placed immediately after a variable reference introduces a variable modifier (such as :h or :t). Without the brackets, the command would be


setenv PATH $PATH:/usr/local/bin


This will fail because the shell tries to treat :/ as a modifier applied to $PATH and reports a "Bad : modifier" error.


Unsetting Environment Variables


Simply use the unsetenv command, for example


unsetenv PAGER


Unsetting HOME, USER, PATH and SHELL is not recommended!


Some predefined C shell variables


Within your .cshrc file, you can set variables using the set command. Built into csh are several useful variables


filec - file name completion

prompt - the prompt displayed on the command line

history - the number of commands kept in the history buffer

ignoreeof - the shell will ignore control-D; you must type exit to exit a shell

noclobber - prevents > redirections from overwriting existing files, and >> redirections from creating new ones

savehist - the number of commands to be saved in the $HOME/.history file


Most of these are simple switches: they are either on or off and can be set as follows, typically within the .cshrc file.


set noclobber

set ignoreeof

set filec


The prompt, savehist and history variables need extra parameters. For history and savehist, this is the number of commands to be saved in the buffer, for example


set history=50

set savehist=50


For prompt, this is a string which will be displayed at the start of each command line, replacing the default % sign used by the C shell.


set prompt="Whaddyawant? "


or


set prompt="! ${USER}@`hostname`> "


which gives a prompt like:


3 colinbr@mclaren>


The values of the USER, TERM and PATH environment variables are automatically set from the corresponding C shell variables user, term and path.


Non-System Environment Variables


It is possible to set any variable in your environment - just don't override any of the important system variables. In many cases, these can save a lot of tedious typing. Consider the path


/home/colinbr/PROJECTS/ORACLE/oracle8i/SQL


Set an environment variable to hold it


setenv SQ /home/colinbr/PROJECTS/ORACLE/oracle8i/SQL


and then simply type


cd $SQ


Further, the variable can be used as the start of a longer path. If there is a "plsql" directory under the one shown, use the following


cd $SQ/plsql


A couple of simple examples


Running under the C shell, first set a shell variable called MY_NUMBER


5 colinbr@mclaren> set MY_NUMBER=12345


Spaces, or the lack thereof, are important: write name=value with no spaces around the = sign.


6 colinbr@mclaren> echo $MY_NUMBER

12345


Now run a new C shell process. This is called a child process; the original shell is referred to as the parent.


7 colinbr@mclaren> csh

1 colinbr@mclaren>


In this case, you can see the current command number is reset from 7 to 1, indicating no commands have yet been typed in this shell process. Now display the value of MY_NUMBER.


1 colinbr@mclaren> echo $MY_NUMBER

MY_NUMBER: Undefined variable

2 colinbr@mclaren>


In the new shell, the value of MY_NUMBER has not been set: we say that its value has not been inherited from the parent shell.


Exit from the child shell using ctrl-D (or type exit if ignoreeof is set). Now, back in the parent shell, confirm that MY_NUMBER is still set and then unset it


8 colinbr@mclaren> echo $MY_NUMBER

12345

9 colinbr@mclaren> unset MY_NUMBER

10 colinbr@mclaren> echo $MY_NUMBER

MY_NUMBER: Undefined variable


Then create MY_NUMBER as an environment variable using setenv


11 colinbr@mclaren> setenv MY_NUMBER 12345


Note the difference in syntax here: setenv takes no "=" sign. Now run another child shell and check the value of MY_NUMBER.


12 colinbr@mclaren> csh

1 colinbr@mclaren> echo $MY_NUMBER

12345

2 colinbr@mclaren>


And you can see that MY_NUMBER has been inherited by the child process.


What if you run a different shell?


13 colinbr@mclaren> echo $MY_NUMBER

12345

14 colinbr@mclaren> sh

$ echo $MY_NUMBER

12345

$


MY_NUMBER is still inherited, in this case by a Bourne shell child process.
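The same inheritance can be checked in one line without touching your current shell. This is only a sketch: it relies on the standard env utility, which places a variable in the environment of the single command it launches, and MY_NUMBER is simply the example name used above. The command lines contain no shell-specific syntax outside the quoted sh commands, so they should work from csh or sh alike.

```shell
# env puts MY_NUMBER into the environment of the child sh process only.
env MY_NUMBER=12345 sh -c 'echo "child sees: $MY_NUMBER"'

# A child started without it has nothing to inherit
# (assuming MY_NUMBER is not already set in your environment).
sh -c 'echo "child sees: ${MY_NUMBER-nothing}"'
```

The first command prints "child sees: 12345". The second prints "child sees: nothing" unless MY_NUMBER is already in your environment.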


c) Command History Mechanism


The C shell includes a mechanism that can store previously run commands so that they can be conveniently recalled and re-run, a feature which saves much tedious typing. Command history, however, is not enabled by default and requires two or three changes to be made in the ~/.cshrc file. Add the following to your .cshrc file


set history=50

alias h 'history'


Optionally add the following


set savehist=50


The first two commands, respectively, set the history list length to 50 commands and set an alias so the saved commands can be recalled simply by typing h. The third command specifies that 50 commands should be saved in the file ~/.history as a persistent record of commands.


When these settings are in effect, you can recall your history list and re-run commands from that list. For example:


5 colinbr@williams> h

1 cd MISC

2 ls

3 ls pci*

4 vi pcift_archive_index

5 h

6 colinbr@williams>


To then re-run the command ls pci*, simply type


6 colinbr@williams> !3


The ! symbol followed by a number instructs the C shell to re-run that command from the history list, in this case command number 3. The re-run command is itself recorded in the history list, thus:


6 colinbr@williams> !3

ls pci*

pci_cica_errors pcift_archive_index pcift_sep

pci_opt pcift_archive_readme pcift_uploads

pci_segs_cds pcift_archives pcift_useful

pciesp_useful pcift_march

7 colinbr@williams> h

1 cd MISC

2 ls

3 ls pci*

4 vi pcift_archive_index

5 h

6 ls pci*

7 h

8 colinbr@williams>


Note that the chosen command is displayed before being run.


Commands from the history list can also be appended to quite easily. Command number 2 in the above list can be recalled and piped (see the section on pipes, below) into the more command, like this:


8 colinbr@williams> !2 | more


This runs the command ls | more, where ls is derived from the substitution of command number 2 from the history list.

A very simple way of running your most recent command is to use !! (two exclamation marks) as shown below:


9 colinbr@williams> !!


There are a number of other useful history manipulations, some of which are described briefly below. In these descriptions str is any string of characters.


!str - re-run the last command that started with str


!str additional - re-run the last command starting with str and add the additional characters


!?str? - re-run the last command containing str


!?str? additional - as above but append the additional characters to the new command


d) C shell Aliases


Another method for reducing tedious typing is the command line alias, in which a long or complex command line can be reduced to a few characters. Aliases are defined in the .cshrc file. Simple examples include


alias lsl 'ls -alF'

alias lsf 'ls -aF'

alias cls 'clear'


More useful aliases include


alias cp 'cp -i'

alias mv 'mv -i'

alias rm 'rm -i'


It is not normally sensible to alias a command back to itself. The following is legitimate syntax:


alias ls 'ls -a'


However, if you then write a script whose behaviour depends on ls acting in this way and then port that script to a machine where the alias does not exist, the script will fail (or at least produce strange results).


The mv, cp and rm aliases shown above are an exception, because they can prevent horrible mistakes. Should you wish to switch off an alias on a temporary basis (for example, when copying a hundred files to a new directory, which would be tedious with cp -i), put a backslash character in front of the command to 'escape' the special meaning of the alias. For example


\cp *.tif ../Backups

e) Input/Output Redirection in the C shell


What is I/O redirection? In a standard command line UNIX session, the shell will read its input from the keyboard and send its output back to the screen. The keyboard is referred to as the standard input device (abbreviated std.in) and the screen may be called the standard output device (std.out). In addition, error messages are usually sent to the screen, so the display is also called the standard error device (std.err). It is important to realize that, even though std.out and std.err are the same device (i.e. the screen) UNIX treats them as two different output streams and handles each separately.


Within all the UNIX shells, it is possible to change std.in, std.out and std.err to be a file, another command or even a device like a tape drive. This is I/O redirection.


Simple I/O redirections in C shell are performed as follows:


< filename Redirect standard input from filename


> filename Redirect standard output to filename


>> filename Redirect standard output and append to filename


Examples of these simple cases:


13 colinbr@williams> pg < install_gcc


This will read the file install_gcc with the pg command. In reality, this is a nonsensical example as the command


14 colinbr@williams> pg install_gcc


performs the same task without the redirection. Such input redirections are more useful when pg is replaced with a shell script of your own devising.


The following command writes a directory listing to a file


15 colinbr@williams> ls -al > my_file.lst


Interestingly, if you then cat my_file.lst you will see an entry similar to that shown below:


-rw-r--r-- 1 colinbr it 0 Sep 5 11:19 my_file.lst


This indicates that the shell has created a new, empty my_file.lst, ready to receive the redirected data, before the ls command is executed by the kernel.
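The effect is easy to reproduce in an empty directory. In this sketch the directory /tmp/unix_training_demo/create is a hypothetical scratch location chosen for illustration; the commands themselves are the same in csh and sh.

```shell
# Start from an empty scratch directory (hypothetical path).
rm -rf /tmp/unix_training_demo/create
mkdir -p /tmp/unix_training_demo/create
cd /tmp/unix_training_demo/create

# The shell creates my_file.lst first, then ls runs and finds it.
ls > my_file.lst
cat my_file.lst
```

Although the directory was empty when the command line was typed, the listing contains my_file.lst itself.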


More data can then be appended to my_file.lst as follows


16 colinbr@williams> head -20 jordan >> my_file.lst


or, perhaps more simply


17 colinbr@williams> echo "hello world" >> my_file.lst


The matter of redirecting standard error output to a file is one of the areas in which the C shell differs slightly from the Bourne and Korn shells. In the C shell, use the following syntax


>& filename Redirect std.out and std.err to filename


>>& filename Redirect std.out and std.err and append to filename


To take a simple example, consider the directory listing below:


-rw-r--r-- 1 colinbr it 2415 Aug 16 13:54 binaries.html

-rw-r--r-- 1 colinbr it 7421 Aug 16 13:54 build.html


Try the following command


24 colinbr@williams> ls -l b* x

x: No such file or directory

-rw-r--r-- 1 colinbr it 2415 Aug 16 13:54 binaries.html

-rw-r--r-- 1 colinbr it 7421 Aug 16 13:54 build.html


We get the message "x: No such file or directory" because x does not exist. All this output appears on the screen. Now try


25 colinbr@williams> ls -l b* x > error1

x: No such file or directory

26 colinbr@williams> cat error1

-rw-r--r-- 1 colinbr it 2415 Aug 16 13:54 binaries.html

-rw-r--r-- 1 colinbr it 7421 Aug 16 13:54 build.html


In command number 25, the error message appears on the screen. The output of the successful part of the command line (ls -l b*) is written to the file error1 (as seen in command 26). Standard error has not been redirected. Then use the following command:


27 colinbr@williams> ls -l b* x >& error2

28 colinbr@williams> cat error2

x: No such file or directory

-rw-r--r-- 1 colinbr it 2415 Aug 16 13:54 binaries.html

-rw-r--r-- 1 colinbr it 7421 Aug 16 13:54 build.html


In command 27 we see no output at all. Both standard output and standard error have been written to the file error2 (as displayed by command 28).


It is possible, but slightly tricky in C shell, to redirect standard output to one destination and standard error to another. This involves running the original command in a sub-shell of its own, redirecting std.out within that sub-shell, and then redirecting std.err outside the sub-shell (the >& outside the parentheses covers both streams, but by then only std.err is left to capture). The description is a mouthful but the syntax is not that complex. For example:


30 colinbr@williams> (ls -l b* x > error3) >& error4

32 colinbr@williams> cat error3

-rw-r--r-- 1 colinbr it 2415 Aug 16 13:54 binaries.html

-rw-r--r-- 1 colinbr it 7421 Aug 16 13:54 build.html

33 colinbr@williams> cat error4

x: No such file or directory


In command 30, the parentheses instruct the C shell to run the ls command in a sub-shell and place its standard output in the file error3. The rest of the command line directs the standard error into the file error4. Commands 32 and 33 display the two files.


This may seem to be a nit-picking point but there may be times when you need to capture the errors along with the output, for example, of a custom written script.


Some general points about I/O redirection


With both the >> and >>& forms, if filename does not exist, it will be created (unless noclobber is set, as described next).


If the noclobber C shell variable is set, output redirection will fail if the target file already exists unless one of the following forms is used


>! >&! >>! >>&!


In effect, the ! overrides the noclobber setting


Redirecting std.out and std.err to different files is easier under the Bourne and Korn shells than under the C shell.
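For comparison, here is a sketch of the same split in Bourne shell syntax, where std.err is file descriptor 2 and can be redirected on its own with 2>. This form does not work in csh. The scratch directory is a hypothetical path, and the file names mirror those used in command 30 above.

```shell
# Bourne/Korn shell syntax; set up a scratch directory (hypothetical path).
rm -rf /tmp/unix_training_demo/redirect
mkdir -p /tmp/unix_training_demo/redirect
cd /tmp/unix_training_demo/redirect
touch binaries.html build.html

# std.out goes to error3, std.err (descriptor 2) goes to error4.
# ls exits non-zero because x does not exist; "|| true" absorbs that.
ls b* x > error3 2> error4 || true

cat error3
cat error4
```

No sub-shell is needed: error3 holds the two file names and error4 holds the complaint about x.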


f) C shell Pipes


The preceding section claimed that standard input, standard output and standard error could be changed to be, amongst other things, another command, and then went on to discuss (at exhausting length) redirection to files. When commands are connected together, such that the output of the first command becomes the input to a second command, the result is called a pipeline, or simply a pipe.


In a command line the pipe operation is represented by the | character (vertical bar) as shown in this simple example


72 colinbr@mclaren> cd /usr/bin

73 colinbr@mclaren> ls -la | more


This command directs the output of the ls -la command into the input of the more command, which then sends its output to the screen. The practical effect is to display a long directory listing a page at a time.


More complex pipelines are often best built up in stages, with a little bit of trial and error. Consider the following: the wc command prints the number of lines, words and characters in a file (or files), respectively, as shown in the following output.


85 colinbr@mclaren> wc *.txt

38 356 2196 clancy_bio.txt

36 269 1727 elf.txt

22 175 1100 news.txt

96 800 5023 total


What if we want the total number of characters these files amount to? The answer to that is 5023, the third field of the last line. For just three files, it would be a trivial matter to use


86 colinbr@mclaren> ls -l *.txt


and add up the file sizes by hand! However, this becomes tedious in cases where there are hundreds of *.txt files. So try the following, to isolate the last line of the output


86 colinbr@mclaren> wc *.txt | tail -1

96 800 5023 total


But we are still only interested in the total, the third column of this output. Getting into UNIX esoterica at this point, use


87 colinbr@mclaren> wc *.txt | tail -1 | awk '{print $3}'

5023


The awk command, in the simple form above, isolates column 3 of the output (awk can do an awful lot more than this, however).
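To try the staged build-up with known numbers, the sketch below creates two small files in a hypothetical scratch directory and runs each stage in turn. The file names and contents are invented for illustration; the commands are the same under csh and sh.

```shell
# Scratch directory and sample files (hypothetical names and contents).
rm -rf /tmp/unix_training_demo/wc
mkdir -p /tmp/unix_training_demo/wc
cd /tmp/unix_training_demo/wc
printf 'alpha beta\n' > a.txt
printf 'gamma\ndelta\n' > b.txt

# Stage 1: counts for every file plus a total line.
wc *.txt

# Stage 2: keep only the total line.
wc *.txt | tail -1

# Stage 3: keep only the third column, the character count (prints 23).
wc *.txt | tail -1 | awk '{print $3}'
```

Running each stage separately makes it easy to see which command to adjust if the final result is not what you expected.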


Why might this be useful? Apart from estimating disk usage, the value output by this command can also be assigned to a shell or environment variable. We will look at this in more detail in a later section.


Here is another example of pipes in action:


48 colinbr@mclaren> ps -ef | grep colinbr | grep -v grep

colinbr 3447 3445 0 Aug 28 pts/11 0:00 -csh

colinbr 12060 12058 0 Aug 31 pts/20 0:00 -csh

colinbr 22157 22155 0 11:03:11 pts/28 0:00 -csh

colinbr 21870 21868 0 10:40:38 pts/22 0:00 -csh


This command line is an example of a filter. It uses the ps -ef command to display all the processes on the system. The first pipe then selects any lines that contain colinbr, i.e. we are looking only for processes owned by Colin Brett. The last portion of the command line then filters out the grep command itself (the -v option to grep inverts the search). We are left with a break-down of what Colin Brett is up to on mclaren (running four C shell processes).
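The two grep stages can be tried on fixed input, which makes the effect of -v easy to see. The three input lines below are invented stand-ins for ps output, not real process data.

```shell
# Three made-up lines stand in for the output of ps -ef.
# grep colinbr keeps lines 1 and 3; grep -v grep then drops line 3.
printf 'colinbr 3447 -csh\nroot 101 init\ncolinbr 4689 grep colinbr\n' | grep colinbr | grep -v grep
```

Only the line "colinbr 3447 -csh" survives both filters.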


Finally in this section, the output of a pipeline can itself be redirected to a file or device. As a simple example try


91 colinbr@mclaren> who | grep colinbr > cols_logins

92 colinbr@mclaren> cat cols_logins

colinbr pts/22 Sep 6 10:40 (williams)

colinbr pts/28 Sep 6 11:03 (williams)

colinbr pts/11 Aug 28 14:36 (t130)

colinbr pts/20 Aug 31 14:36 (brabham)


Command 91 uses the who command to determine who is logged in and filters that output for the user colinbr before writing that output to the file cols_logins. Basically, what we get here is proof that Colin Brett has been logged into mclaren from various machines (williams, brabham and t130), in some cases for over a week!


g) Command Substitution


We have already seen one example of command substitution in the section on C shell variables. To recap, that example was:


set prompt="! ${USER}@`hostname`> "


What we see here is the shell prompt being set to the string enclosed in double quotes "", but within that string is the expression `hostname` enclosed in single backward quotes ``. These quotes, variously called grave accents, backquotes or backticks, are the mechanism by which command substitution occurs.


When the shell parses a command line, it substitutes in variables, expands file names, resolves aliases and looks for commands enclosed in backquotes. If such commands exist, a sub-shell is spawned, the backquoted command is run, and its output is substituted into the original command line. So, in the above example, the shell replaces ${USER} with the value of the USER variable, runs the hostname command in a sub-shell, and substitutes its output into the prompt string.



The final result, on screen, will look something like


100 colinbr@mclaren>


where


100 is the current command number

colinbr is the value of $USER

mclaren is the output of the hostname command

The @ and > characters are treated as literal strings


In the section on Pipes, we looked also at the command


87 colinbr@mclaren> wc *.txt | tail -1 | awk '{print $3}'

5023


Using command substitution within a shell script, we could assign the output of this command to a variable


set text_size=`wc *.txt | tail -1 | awk '{print $3}'`


The value can then be manipulated within the script, perhaps dividing it by 1024 to get sizes in kilobytes, or comparing it to a fixed value for determining file size limits.
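As a rough sketch of that kind of arithmetic, the standard expr utility can do the division, and its output can in turn be placed inside a message by command substitution. The figure 5023 is the total from the wc example; expr performs integer division, so the result is rounded down. The line works as written in both csh and sh.

```shell
# expr runs first; its output (4) replaces the backquoted text.
echo "5023 bytes is about `expr 5023 / 1024` KB"
```

This prints "5023 bytes is about 4 KB".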


Command substitution is an immensely useful facility, though perhaps more useful within a script than in a command line UNIX session.


h) Running Commands in the Background


If you have a command that you know (or suspect) will take a long time to run, consider redirecting the output to a file and running the command in the background. In this way, the C shell begins executing the command and returns your prompt so you can continue working. The syntax to run a command in the background is to add an & (ampersand) character to the end of the command line. As a simple example, consider a command to recursively list the contents of your home directory:


20 colinbr@williams> ls -alR > my_files &

[1] 4687

21 colinbr@williams> ps -ef | grep ls

colinbr 4687 3327 4 15:48:19 pts/16 0:01 ls -alR

colinbr 4689 3327 0 15:48:23 pts/16 0:00 grep ls


Command 20 is the recursive ls command, redirected into the file my_files. The & character instructs the C shell to run the ls command as a background child process. We see no output from this command (as would be expected because std.out is redirected) except for the following numbers:


[1] 4687


The [1] indicates this is job number 1 that has been spawned from the current shell and the 4687 is the process ID number (PID) of the child process. After this output is displayed we are returned to the prompt ready for the next command.


With the ps command (21) we can see PID 4687 running. While the command runs we can be doing other things, for example





23 colinbr@williams> ls -l my_files

-rw-r--r-- 1 colinbr it 434176 Sep 11 15:48 my_files

24 colinbr@williams> !!

ls -l my_files

-rw-r--r-- 1 colinbr it 458752 Sep 11 15:48 my_files


We can see the size of my_files increasing as more output is written to it. Eventually the ls command will finish. On pressing Return or Enter at the end of a subsequent command, we see output like


[1] + Done ls -alR > my_files

28 colinbr@williams>


This indicates that job number 1 is completed and the command that was running is also echoed to the screen.
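When background jobs are used in a script rather than at the prompt, the wait command pauses until they have finished. The sketch below uses sleep as a stand-in for a slow command and a hypothetical scratch path for its output; both csh and sh provide wait as a built-in.

```shell
# Scratch location for the output of the job (hypothetical path).
mkdir -p /tmp/unix_training_demo/bg
rm -f /tmp/unix_training_demo/bg/result

# Start a slow job in the background; control returns at once.
(sleep 1; echo finished > /tmp/unix_training_demo/bg/result) &
echo "free to do other work while the job runs"

# Block until every background job has completed, then read the result.
wait
cat /tmp/unix_training_demo/bg/result
```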


i) C shell limitations


As stated in the C shell man page:


Although robust enough for general use, adventures into the

esoteric periphery of the C shell may reveal unexpected

quirks.


This does not necessarily mean the C shell is full of bugs, but certain limitations do apply.


Probably the most commonly-met limit is on the number of arguments that filename expansion can produce for a single command. For example, an attempt to remove all the files in a directory with the command


32 colinbr@williams> rm *


will fail if there are more than 1706 files in that directory.
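One common way round this limit, offered here as a sketch rather than the only approach, is to let the find command locate the files and run rm on each one, so the shell never has to expand a wildcard. The scratch directory and the 2000 generated file names are invented for the demonstration.

```shell
# Build a scratch directory holding 2000 empty files (hypothetical path).
rm -rf /tmp/unix_training_demo/many
mkdir -p /tmp/unix_training_demo/many
awk 'BEGIN { for (i = 1; i <= 2000; i++) print "/tmp/unix_training_demo/many/f" i }' | xargs touch

# find hands each file to rm in turn; no wildcard is expanded by the shell.
find /tmp/unix_training_demo/many -type f -exec rm {} \;

# Count what is left in the directory.
ls /tmp/unix_training_demo/many | wc -l
```

Running rm once per file is slow for very large directories, but it avoids the argument limit entirely.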