5.5. Objective 4: Use Streams, Pipes, and Redirects

Among the many beauties of Linux and Unix systems is the notion that everything is a file. Things such as disk drives and their partitions, tape drives, terminals, serial ports, the mouse, and even audio are mapped into the filesystem. This mapping allows programs to interact with many different devices and files in the same way, simplifying their interfaces. Each device that uses the file metaphor is given a device file, which is a special object in the filesystem that provides an interface to the device. The kernel associates device drivers with the various device files, which is how the system maintains the illusion that devices can be accessed as if they were files. Using a terminal as an example, a program reading from the terminal's device file will receive characters typed at the keyboard. Writing to the terminal causes characters to appear on the screen. While it may seem odd to think of your terminal as a file, the concept provides a unifying simplicity to Linux and Linux programming.

5.5.1. Standard I/O and Default File Descriptors

Standard I/O is a capability of the shell, used with all text-based Linux utilities to control and direct program input, output, and error information. When a program is launched, it is automatically provided with three file descriptors. File descriptors are regularly used in programming and serve as a "handle" of sorts to another file. Standard I/O creates the following file descriptors:

Standard input (abbreviated stdin)
    File descriptor 0, a text input stream. By default it is attached to your keyboard.
Standard output (abbreviated stdout)
    File descriptor 1, a text output stream for normal program output. By default it is attached to your terminal.
Standard error (abbreviated stderr)
    File descriptor 2, a text output stream for error messages. By default it is also attached to your terminal.
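As a rough illustration of the descriptor numbers in practice (0 for standard input, 1 for standard output, 2 for standard error), the sketch below sends normal output and error output to different places. The function name report and the file names are hypothetical, chosen only for this example, and the > and 2> operators are standard bash redirections.

```shell
# Hypothetical helper for this sketch: one line to standard output
# (descriptor 1) and one line to standard error (descriptor 2).
report() {
    echo "normal output"
    echo "error message" >&2
}

# Scratch directory so no existing files are touched.
scratch=$(mktemp -d)

# Send each stream to its own file.
report > "$scratch/out.txt" 2> "$scratch/err.txt"

cat "$scratch/out.txt"   # holds only the normal output
cat "$scratch/err.txt"   # holds only the error message
```

Because the two streams use separate descriptors, either one can be captured, discarded, or displayed without affecting the other.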
Standard output and standard error are separated because it is often useful to process normal program output differently than errors. The standard I/O file descriptors are used in the same way as those created during program execution to read and write disk files. They enable you to tie commands together with files and devices, managing command input and output in exactly the way you desire. The difference is that they are provided to the program by the shell by default and do not need to be explicitly created.

5.5.2. Pipes

From a program's point of view there is no difference between reading text data from a file and reading it from your keyboard. Similarly, writing text to a file and writing text to a display are equivalent operations. As an extension of this idea, it is also possible to tie the output of one program to the input of another. This is accomplished using a pipe (|) to join two or more commands together. For example:

    $ grep "01523" order* | less

This command searches through all files whose names begin with order to find lines containing the string 01523. By creating this pipe, the standard output of grep is sent to the standard input of less. The mechanics of this operation are handled by the shell and are invisible to the user. Pipes can be used in a series of many commands. When more than two commands are put together, the resulting operation is known as a pipeline or text stream, implying the flow of text from one command to the next.

As you get used to the idea, you'll find yourself building pipelines naturally to extract specific information from text data sources. For example, suppose you wish to view a sorted list of inode numbers from among the files in your current directory. There are many ways you could achieve this.
One way would be to use awk in a pipeline to extract the inode number from the output of ls, then send it on to the sort command and finally to a pager for viewing (don't worry about the syntax or function of these commands at this point):

    $ ls -i * | awk '{print $1}' | sort -nu | less

The pipeline concept in particular is a feature of Linux and Unix that draws on the fact that your system contains a diverse set of tools for operating on text. Combining their capabilities can yield quick and easy ways to extract otherwise hard-to-handle information.

5.5.3. Redirection

Each pipe symbol in the previous pipeline example instructs the shell to feed output from one command into the input of another. This action is a special form of redirection, which allows you to manage the origin of input streams and the destination of output streams. In the previous example, individual programs are unaware that their input or output is being handed to or from another program because the shell takes care of the redirection on their behalf.

Redirection can also occur to and from files. For example, rather than sending the output of an inode list to the pager less, it could easily be sent directly to a file with the > redirection operator:

    $ ls -i * | awk '{print $1}' | sort -nu > in.txt

With the last redirection operator changed from | to >, the shell creates an empty file (in.txt) and opens it for writing, and the standard output of sort goes into the file instead of to the screen. Note that, in this example, anything sent to standard error is still displayed on the screen. In addition, if your specified file, in.txt, already existed in your current directory, it would be overwritten.

Whereas the > redirection operator creates or overwrites files, the >> redirection operator can be used to append to existing files.
For example, you could use the following command to append a one-line footnote to in.txt:

    $ echo "end of list" >> in.txt

Since in.txt already exists, the text will be appended to the bottom of the existing file. If the file didn't exist, the >> operator would create the file and insert the text "end of list" as its contents.

It is important to note that when creating files, the output redirection operators are interpreted by the shell before the commands are executed. This means that any output files created through redirection are opened first. For this reason you cannot modify a file in place, like this:

    $ grep "stuff" file1 > file1    # don't do it!

If file1 contains something of importance, this command would be a disaster, because the shell would truncate file1 to an empty file before grep ever runs. The grep command would then read the now-empty file and find nothing, and the original contents of file1 would be lost completely. To avoid this problem, simply use an intermediate file and then rename it:

    $ grep "stuff" file1 > file2
    $ mv file2 file1

Standard input can also be redirected. The input redirection operator is <. Using a source other than the keyboard for a program's input may seem odd at first, but since text programs don't care about where their standard input streams originate, you can easily redirect input. For example, the following command will send a mail message with the contents of the file in.txt to user jdean:

    $ mail -s "inode list" jdean < in.txt

Normally, the mail program prompts the user for input at the terminal. However, with standard input redirected from the file in.txt, no user input is needed and the command executes silently. Table 5-4 lists the common standard I/O redirections for the bash shell, specified in the LPI Objectives.
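The operators discussed in this section can be tried safely in a scratch directory. What follows is a minimal sketch rather than a reproduction of Table 5-4: the file names and sample data are made up for illustration, and 2> (redirect standard error only) is standard bash syntax that the examples above do not show.

```shell
# Work in a throwaway directory so no real files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data (made up for this sketch).
printf 'stuff one\nother\nstuff two\n' > file1

# > creates or truncates the target; >> appends to it.
echo "first line" > notes.txt
echo "end of list" >> notes.txt

# < supplies a file as standard input.
count=$(wc -l < notes.txt)
echo "notes.txt has $count lines"

# 2> redirects standard error (descriptor 2) on its own.
ls no-such-file 2> errors.txt || echo "ls failed; see errors.txt"

# Safe "in place" filter: write to an intermediate file, then rename.
grep "stuff" file1 > file2
mv file2 file1
cat file1
```

The final three commands follow the intermediate-file pattern recommended above, so file1 ends up holding only its matching lines instead of being emptied.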
5.5.4. Using the tee Command

Sometimes you'll want to run a program and send its output to a file while at the same time viewing the output on the screen. The tee utility is helpful in this situation.
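A minimal sketch of that situation, using made-up sample data and a hypothetical file name: the pipeline below displays its output on the terminal while also saving a copy, and the -a option tells tee to append rather than overwrite.

```shell
# Scratch directory so no existing files are touched.
teedir=$(mktemp -d)

# tee copies its standard input to standard output AND to the named file.
printf 'b\na\nc\n' | sort | tee "$teedir/sorted.txt"

# -a appends to the file instead of overwriting it.
echo "end of list" | tee -a "$teedir/sorted.txt"
```

Since tee passes its input through to standard output, it can also sit in the middle of a longer pipeline to save an intermediate result.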