The UNIX Computer System at ARE (Edition )
As mentioned earlier, the power of UNIX lies in its modularity. Each command does one thing well. UNIX allows you to combine several of these single-purpose commands into a command pipeline. Here's a simple, and very useful, command pipeline that allows you to page through a long directory listing:
$ ls -al | less
The `ls -al' command generates a detailed list of file names and
attributes, one file per line. The output of that command is piped
into the pager `less'. If there are more files in the directory
than can be displayed on a single screen, this command lets you page
through them at your leisure.
Pipes allow the output of one command to become the input to another. UNIX uses the terms standard output and standard input to refer to the two ends of a pipe. Commands in the pipeline don't know where the data is coming from or where it's going: they just read the standard input, do their thing with it, and send the results to the standard output. The shell is responsible for making sure the beginning and end of a pipeline are connected properly to a meaningful source and destination.
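As a small sketch of this idea, the same command can read its standard input from a pipe or from a file without being changed in any way. The sample text and the temporary file below are made up for illustration.

```shell
# A command reads standard input the same way whether the data comes
# from another command or from a file (the sample data is illustrative).
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

# Standard input connected to a pipe:
from_pipe=$(printf 'one\ntwo\nthree\n' | wc -l)

# Standard input connected to a file by the shell:
from_file=$(wc -l < "$tmpfile")

echo "pipe: $from_pipe, file: $from_file"
rm -f "$tmpfile"
```

In both cases `wc' simply counts the lines that arrive on its standard input; only the shell knows where they came from.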
Here's another, more complicated, example of a pipeline. This one prints
out the number of files in the current directory that contain the word
`hysteresis':
$ grep 'hysteresis' * | awk -F: '{print $1}' | sort | uniq | wc -l
-| 2
The first thing the shell does when it sees this command is expand the
`*' into a list of files in the current directory. The `grep'
command then searches through this list of files for the word
`hysteresis', outputting all lines from any file that contain the
word, prefixed by the file name and a colon. The shell pipes this output
into the `awk' command, which splits each line into fields
separated by colons and then prints the first such field. The printed
field is the file name part of the line printed out by the `grep'
command. So the output from the `awk' part of the pipe is a list of
file names. Since the word `hysteresis' may appear multiple times
in a file, at this point it's possible that `awk' has output a
list that contains duplicate file names. The `sort' command makes
sure the list of names is sorted for the `uniq' command, which
removes duplicates only when they are on adjacent lines (keeping
one copy of each name). Finally, this list of sorted, unique file
names is piped into the `wc' command to count how many lines are
in the list. Pretty cool.
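One way to check the pipeline above is to run it in a scratch directory. The file names and contents in this sketch are invented for illustration. It also shows a shorter route to the same count using `grep -l', a standard option that prints only the name of each matching file, once per file.

```shell
# Scratch directory with made-up files (names and contents are illustrative).
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'hysteresis loop\nmore hysteresis\n' > a.txt
printf 'nothing here\n' > b.txt
printf 'magnetic hysteresis\n' > c.txt

# The pipeline from the text: a.txt matches twice but is counted once.
count=$(grep 'hysteresis' * | awk -F: '{print $1}' | sort | uniq | wc -l)

# Shorter alternative: grep -l prints each matching file name only once,
# so no awk, sort, or uniq stage is needed.
count_l=$(grep -l 'hysteresis' * | wc -l)

echo "pipeline: $count, grep -l: $count_l"

# Return to the previous directory and clean up.
cd - > /dev/null
rm -rf "$dir"
```

Both commands should report the same number of files. The longer pipeline is still worth studying, since it shows how small commands can be combined even when no single option does the whole job.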