HowTo |
Use UNIX Standard Input and Output | Prof. Bartenstein |
Whenever a new process is started in UNIX, the operating system automatically opens three file streams that can be used by that process called standard input, standard output, and standard error. Collectively, these three streams are called "Standard Input and Output", or Standard I/O for short.
By definition, the standard input stream is used to provide input data to the process from some external source. The standard output and standard error streams are both used for the process to send output to an external destination. Originally, the standard output stream was intended to provide data for other programs to consume, and the standard error stream was intended for humans to read. Since standard output is easier to use than standard error, programmers often ignore this distinction and write messages intended for human consumption to standard output.
In order to understand standard I/O, it's easiest to think about two different perspectives: standard I/O from inside the process looking out, and standard I/O from outside the process looking in.
A process consists of an invocation of a UNIX command. Most often, the command is a program written in a programming language. Different computer languages handle input and output in different ways. This discussion will focus on how a C program uses standard I/O. Other languages are functionally similar, but have slightly different syntax.
The C standard library of I/O functions, available with #include <stdio.h>, includes special functions to read and write from standard I/O streams. The standard I/O functions are summarized in the following table:
| Stream | Stream Name | Single Character | Format Driven String | End-of-File |
|---|---|---|---|---|
| Standard Input | stdin | int ch=getchar() | scanf(fmt,...) | EOF or feof(stdin) |
| Standard Output | stdout | putchar(int ch) | printf(fmt,...) | --- |
| Standard Error | stderr | putc(int ch,stderr) | fprintf(stderr,fmt,...) | --- |
You can assume that all three streams are open and available when your code starts running. There is no need to close these streams; they will be closed automatically when the program ends. For more details on formatted I/O see HowTo Use Format Strings in C. For full details, look up specific functions in the Online Linux Man Pages.
When a process starts, by default, standard input is connected to the user's keyboard, and both standard output and standard error are connected to the user's terminal display. However, UNIX provides a capability called redirection that allows users to redirect standard input, standard output, and standard error to a file, or to the input of another program.
When standard input is connected to the user's keyboard and the program reads from standard input, the terminal shows a blinking cursor and the program waits until the user types something on the keyboard and hits enter. When the user hits enter, the entire string the user typed, including a new-line character (\n), still shows up on the terminal, but it also becomes available to the program. If the program consumes everything, including the new line, and asks for more from standard input, the program will wait again until the user types in more information and hits enter again.
⚠ Students new to C programming often start a C program that reads from standard input, and say "The program is broken... it's not doing anything!" That's because the program is waiting for the student to type something on the keyboard. If your cursor is blinking, try typing something short and hitting enter to see what happens.
The user can signal an "end of file" by typing Ctrl-D at the start of a line. The program then sees end of file on standard input and should stop reading. If the user types Ctrl-C, the terminal sends an interrupt signal (SIGINT) to the program, which by default causes it to stop altogether.
By default, if the standard input stream is connected to a keyboard, input is delivered one line at a time: the program sees nothing while the user is typing, and sees the whole line as soon as the user hits the enter key.
Both the standard output and standard error streams are connected to the user's terminal display by default. When connected to a terminal, standard output is line buffered, so output shows up each time the program writes a new-line character, and standard error is not buffered, so it shows up as soon as the program writes it. Because both standard output and standard error go to the same place, the output from standard output and from standard error may be intermixed on the terminal display.
UNIX command lines from a terminal are processed by a shell program. There are many different shell programs, and shell programs often have slightly different command line processing syntax and semantics. This page describes the most common syntax and semantics for command line processing and redirection. You can find out which shell you are running by typing the command echo $SHELL. Then, for example, if the shell is tcsh, you can google "tcsh redirection" to get the specifics for the tcsh shell.
In general, shells look at the first blank-delimited word of a command to determine which command to execute. If the first word does not have a path name, then the shell will often use the PATH environment variable to find the command. The remaining words, up to the first redirection token, are treated as command line parameters and passed in to the program to process. A redirection token is a word that starts with the character '<', '>', or '|', or a number followed by '<' or '>'. The '<' character is used for standard input redirection, the '>' is used for standard output and standard error redirection, and the '|' is used for piping, or redirecting standard output from one command to standard input for another command.
It is valid to have more than one redirection token at the end of a command in order to redirect more than one stream.
To read from a file instead of from the keyboard, add the following specification at the end of your command:
command parm1 ... <input_file
With this redirection, when the command reads from standard input, it will read the input_file instead of reading from the keyboard. The input_file specification may be fully qualified, relative to the current directory, or in the current directory. If no such file is available, the shell will issue an error message and will not run the program.
To write standard output to a file instead of to the terminal, add the following specification to the end of your command:
command parm1 ... >output_file
With this redirection, when the command writes to standard output, it will write to the output_file instead of writing to the terminal. The output_file may be fully qualified, relative to the current directory, or in the current directory. If the output_file does not exist, the shell will create a new file for you. If the output_file does exist, it will be overwritten with the output from the command, destroying what was in the file before the command was run. Most shells do not check whether you are redirecting output to the same file you are using for input redirection. The shell empties the output file before the command starts, so the command finds nothing to read. Always use a different file for output.
To append to the output file, add the following specification to the end of your command:
command parm1 ... >>output_file
When standard output is redirected to a file, the C standard I/O library (not the shell) fully buffers the output stream. This improves the efficiency of the I/O significantly, but has a side effect: when the command ends abnormally, there may be output in the buffer that does not get written to the file. This often makes debugging the code more difficult. To turn off buffering, add the following code in your main function, before the first write to standard output: setbuf(stdout,0);. Don't forget to comment this out after debugging.
To redirect standard output from one command into standard input of another command (or pipe the output from one command into another), use the following specification:
command1 parm1A ... | command2 parm2A ...
To both save the output of a command in a file and send that output on standard out so that it can, for instance, be used in a pipe in another command, use the UNIX tee command, as in:
command1 parm1A ... | tee output_file | command2 parm2A ...
This will cause the standard output from command1 to overwrite the contents of output_file, as well as being processed as input to command2. To append to the output file instead of overwriting, use:
command1 parm1A ... | tee -a output_file | command2 parm2A ...
To write standard error to a file instead of to the terminal, add the following specification to the end of your command:
command parm1 ... 2>error_file
With this redirection, when the command writes to standard error, it will write to the error_file instead of writing to the terminal. The error_file may be fully qualified, relative to the current directory, or in the current directory. If the error_file does not exist, the shell will create a new file for you. If the error_file does exist, it will be overwritten with the error messages from the command, destroying what was in the file before the command was run. Most shells do not check whether you are redirecting error output to the same file you are using for input redirection. The shell empties the error file before the command starts, so always use a different file for errors.
To append to the error file, add the following specification to the end of your command:
command parm1 ... 2>>error_file
Unlike standard output, standard error is not fully buffered by default, even when it is redirected to a file, so error messages are normally written as soon as the program produces them. If some part of the program has turned on buffering for standard error, then when the command ends abnormally there may be errors in the buffer that do not get written to the file, which makes debugging the code more difficult. To make sure buffering is off, add the following code in your main function, before the first write to standard error: setbuf(stderr,0);.
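It is possible to redirect both standard output and standard error to a single file. Different shells implement this in different ways, but the most common implementation, used by sh, bash, and other Bourne-style shells, uses the following specification. The order matters: standard output is redirected to the file first, and then 2>&1 sends standard error to the same place as standard output.
command parm1 ... >output_error_file 2>&1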
In this case, since standard output is buffered when redirected and standard error normally is not, it is common to see error messages appear in the file ahead of standard output that the program wrote earlier, or to see standard output broken at unexpected points like the middle of a line, because that's where the buffer filled up. Use setbuf to turn off standard output buffering to avoid this side effect.
To pipe a combination of standard error and standard output to another program, use a similar specification:
command1 parm1A ... 2>&1 | command2 parm2A ...