13.4. Reading and Writing

This section describes the functions that actually retrieve data from or send data to a stream. First, there is another detail to consider: an open stream can be used either for byte characters or for wide characters.

13.4.1. Byte-Oriented and Wide-Oriented Streams

In addition to the type char, C also provides a type for wide characters, named wchar_t. This type is wide enough to represent any character in the extended character sets that the implementation supports (see "Wide Characters and Multibyte Characters" in Chapter 1). Accordingly, there are two complete sets of functions for input and output of characters and strings: the byte-character I/O functions and the wide-character I/O functions. Functions in the second set operate on characters with the type wchar_t. Each stream has an orientation that determines which set of functions is appropriate.

Immediately after you open a file, the orientation of the stream associated with it is undetermined. If the first file access is performed by a byte-character I/O function, then from that point on the stream is byte-oriented. If the first access is by a wide-character function, then the stream is wide-oriented. The orientation of the standard streams, stdin, stdout, and stderr, is likewise undetermined when the program starts.

You can call the function fwide( ) at any time to ascertain a stream's orientation. Before the first I/O operation, fwide( ) can also set a new stream's orientation. To change a stream's orientation once it has been determined, you must first reopen the stream by calling the freopen( ) function.
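As a minimal sketch of how fwide( ) can be used, the following helpers query and set a stream's orientation. The names orientation_name( ) and make_wide( ) are hypothetical and serve only as illustration. The sketch relies on the standard behavior that a mode argument of 0 merely queries the orientation, while a positive mode attempts to make the stream wide-oriented.

```c
#include <stdio.h>
#include <wchar.h>

// Hypothetical helper: describe a stream's orientation without changing it.
// Calling fwide( ) with a mode of 0 only queries the orientation.
const char *orientation_name( FILE *fp )
{
  int mode = fwide( fp, 0 );
  if ( mode > 0 ) return "wide-oriented";
  if ( mode < 0 ) return "byte-oriented";
  return "undetermined";
}

// Hypothetical helper: try to make a stream wide-oriented.
// Returns 1 if the stream is wide-oriented after the call, otherwise 0.
// The attempt has no effect if the orientation is already determined.
int make_wide( FILE *fp )
{
  return fwide( fp, 1 ) > 0;
}
```

For a freshly opened stream, orientation_name( ) should report "undetermined" until the first I/O operation or a call such as make_wide( ) fixes the orientation.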

The wide characters written to a wide-oriented stream are stored in the file associated with the stream as multibyte characters. The read and write functions implicitly perform the necessary conversion between wide characters of type wchar_t and the multibyte character encoding. This conversion may be stateful. In other words, the value of a given byte in the multibyte encoding may depend on control characters that precede it, which alter the shift state or conversion state of the character sequence. For this reason, each wide-oriented stream has an associated object with the type mbstate_t, which stores the current multibyte conversion state. The functions fgetpos( ) and fsetpos( ), which get and set the value of the file position indicator, also save and restore the conversion state for the given file position.

13.4.2. Error Handling

The I/O functions can use a number of mechanisms to indicate to the caller when they incur errors, including return values, error and EOF flags in the FILE object, and the global error variable errno. To find out which mechanisms are used by a given function, see the individual function descriptions in Chapter 17. This section describes the I/O error-handling mechanisms in general.

13.4.2.1. Return values and status flags

The I/O functions generally indicate any errors that occur by their return value. In addition, they also set an error flag in the FILE object that controls the stream if an error in reading or writing occurs. To query this flag, you can call the ferror( ) function. An example:

    (void)fputc( '*', fp );           // Write an asterisk to the stream fp.
    if ( ferror(fp) )
      fprintf( stderr, "Error writing.\n" );

Furthermore, read functions set the stream's EOF flag on reaching the end of the file. You can query this flag by calling the feof( ) function. A number of read functions return the value of the macro EOF if you attempt to read beyond the last character in the file. (Wide-character functions return the value WEOF.) A return value of EOF or WEOF can also indicate an error, however. To distinguish between the two cases, you must call ferror( ) or feof( ), as the following example illustrates:

    int i, c;
    char buffer[1024];
    /* ... Open a file to read using the stream fp ... */
    i = 0;
    while ( i < 1024 &&                 // While there is space in the buffer
            ( c = fgetc( fp )) != EOF ) // ... and the stream can deliver
      buffer[i++] = (char)c;            // characters.
    if ( i < 1024 && ! feof(fp) )
      fprintf( stderr, "Error reading.\n" );

The if statement in this example prints an error message if fgetc( ) returns EOF and the EOF flag is not set.

13.4.2.2. The error variable errno

Several standard library functions support more specific error handling by setting the global error variable errno to a value that indicates the kind of error that has occurred. Stream handling functions that set errno include ftell( ), fgetpos( ), and fsetpos( ). Depending on the implementation, other functions may also set the errno variable. errno is declared in the header errno.h with the type int (see Chapter 15). errno.h also defines macros for the possible values of errno.

The perror( ) function prints a system-specific error message for the current value of errno to the stderr stream.

    long pos = ftell(fp);      // Get the current file position.
    if ( pos < 0L )            // ftell( ) returns -1L if an error occurs.
      perror( "ftell( )" );

The perror( ) function prints its string argument followed by a colon, the error message, and a newline character. The error message is the same as the string that strerror( ) would return if called with the given value of errno as its argument. In the previous example, the perror( ) function in the C library typically used with GCC on Linux prints the following output to indicate an invalid FILE pointer argument:

    ftell( ): Bad file descriptor
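The relationship between perror( ) and strerror( ) can be sketched as follows. The helper name format_errno( ) is hypothetical. It builds the same kind of line that perror( ) prints, but stores it in a caller-supplied buffer, which can be useful when the message should go somewhere other than stderr.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>

// Hypothetical helper: build the kind of line that perror( ) prints
// (prefix, colon, space, message, newline), but store it in buf instead
// of writing it to stderr. Returns the value returned by snprintf( ).
int format_errno( char *buf, size_t n, const char *prefix, int errnum )
{
  return snprintf( buf, n, "%s: %s\n", prefix, strerror( errnum ) );
}
```

A call such as format_errno( buf, sizeof buf, "ftell( )", errno ) made immediately after the failed function call captures the message before another library call can change errno. The exact message text is system-specific.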

The error variable errno is also set by functions that convert between wide characters and multibyte characters in reading from or writing to a wide-oriented stream. Such conversions are performed internally by calls to the wcrtomb( ) and mbrtowc( ) functions. When these functions are unable to supply a valid conversion, they return the value of -1 cast to size_t, and set errno to the value of EILSEQ (for "illegal sequence").

13.4.3. Unformatted I/O

The standard library provides functions to read and write unformatted data in the form of individual characters, strings, or blocks of any given size. This section describes these functions, listing the prototypes of both the byte-character and the wide-character functions. The type wint_t is an integer type capable of representing at least all the values in the range of wchar_t, and the additional value WEOF. The macro WEOF has the type wint_t and a value that is distinct from all the character codes in the extended character set.

Unlike EOF, the value of WEOF is not necessarily negative.


13.4.3.1. Reading characters

Use the following functions to read characters from a file:

    int fgetc( FILE * fp );
    int getc( FILE *fp );
    int getchar( void );
    wint_t fgetwc( FILE *fp );
    wint_t getwc( FILE *fp );
    wint_t getwchar( void );

The fgetc( ) function reads a character from the input stream referenced by fp. The return value is the character read, or EOF if the end of the file was reached or an error occurred. The macro getc( ) has the same effect as the function fgetc( ). The macro is commonly used because it can be faster than a function call. However, if the argument fp is an expression with side effects (see Chapter 5), then you should use the function instead, because a macro may evaluate its argument more than once. The macro getchar( ) reads a character from standard input. It is equivalent to getc(stdin).

fgetwc( ), getwc( ), and getwchar( ) are the corresponding functions and macros for wide-oriented streams. These functions set the global variable errno to the value EILSEQ if an error occurs in converting a multibyte character to a wide character.
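The following sketch shows one way to read a wide-oriented stream to its end. The helper name count_wide_chars( ) is hypothetical. Note that the return value of fgetwc( ) is stored in a variable of type wint_t and compared with WEOF directly, rather than tested for a negative value.

```c
#include <stdio.h>
#include <wchar.h>

// Hypothetical helper: count the wide characters remaining in a
// wide-oriented input stream.
// Returns -1 if a read error or a conversion error occurs.
long count_wide_chars( FILE *fp )
{
  long count = 0;
  wint_t wc;                              // wint_t, so that WEOF fits.

  while (( wc = fgetwc( fp )) != WEOF )   // Compare with WEOF itself.
    ++count;

  return ferror( fp ) ? -1L : count;      // WEOF may mean error or end of file.
}
```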

13.4.3.2. Putting a character back

Use one of the following functions to push a character back into the stream from whence it came:

    int ungetc( int c, FILE *fp );
    wint_t ungetwc( wint_t c, FILE *fp );

ungetc( ) and ungetwc( ) push the last character read, c, back onto the input stream referenced by fp. Subsequent read operations then read the characters put back in LIFO (last in, first out) order: that is, the last character put back is the first one to be read. You can always put back at least one character, but repeated attempts might or might not succeed. The functions return EOF (or WEOF) on failure, or the character pushed onto the stream on success.
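A typical use of ungetc( ) is to read "one character too far" while scanning and then return that character to the stream. The following is a minimal sketch under that assumption. The helper name read_digits( ) is hypothetical, and the sketch does not guard against arithmetic overflow.

```c
#include <stdio.h>
#include <ctype.h>

// Hypothetical helper: read a run of decimal digits from fp and store the
// value in *result. The first character that is not a digit is pushed back
// with ungetc( ), so that the next read operation sees it again.
// Returns the number of digits read (0 if the input does not start with one).
// Note: overflow of the accumulated value is not checked in this sketch.
int read_digits( FILE *fp, unsigned long *result )
{
  int c, ndigits = 0;
  unsigned long value = 0;

  while (( c = getc( fp )) != EOF && isdigit( c ))
  {
    value = value * 10 + (unsigned long)( c - '0' );
    ++ndigits;
  }
  if ( c != EOF )
    ungetc( c, fp );          // Put back the character that ended the run.

  *result = value;
  return ndigits;
}
```

Only one character is pushed back per call, which stays within the single character of pushback that is always available.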

13.4.3.3. Writing characters

The following functions allow you to write individual characters to a stream:

    int fputc( int c, FILE *fp );
    int putc( int c, FILE *fp);
    int putchar( int c );
    wint_t fputwc( wchar_t wc, FILE *fp );
    wint_t putwc( wchar_t wc, FILE *fp );
    wint_t putwchar( wchar_t wc );

The function fputc( ) writes the character value of the argument c to the output stream referenced by fp. The return value is the character written, or EOF if an error occurred. The macro putc( ) has the same effect as the function fputc( ). If either of its arguments is an expression with side effects (see Chapter 5), then you should use the function instead, because a macro might evaluate its arguments more than once. The macro putchar( ) writes the specified character to the standard output stream.

fputwc( ), putwc( ), and putwchar( ) are the corresponding functions and macros for wide-oriented streams. These functions set the global variable errno to the value EILSEQ if an error occurs in converting the wide character to a multibyte character.

The following example copies the contents of a file opened for reading, referenced by fpIn, to a file opened for writing, referenced by fpOut. Both streams are byte-oriented.

    _Bool error = 0;
    int c;
    rewind( fpIn );         // Set the file position indicator to the beginning
                            // of the file, and clear the error and EOF flags.
    while (( c = getc( fpIn )) != EOF )  // Read one character at a time.
      if ( putc( c, fpOut ) == EOF )     // Write each character to the output
      {                                  // stream.
        error = 1; break;                // A write error.
      }
    if ( ferror( fpIn ))                 // A read error.
      error = 1;

13.4.3.4. Reading strings

The following functions allow you to read a string from a stream:

    char *fgets( char *buf, int n, FILE *fp );
    char *gets( char *buf );
    wchar_t *fgetws( wchar_t *buf, int n, FILE *fp );

The functions fgets( ) and fgetws( ) read up to n - 1 characters from the input stream referenced by fp into the buffer addressed by buf, appending a null character to terminate the string. If the functions encounter a newline character or the end of the file before they have read the maximum number of characters, then only the characters read up to that point are read into the buffer. The newline character '\n' (or, in a wide-oriented stream, L'\n') is also stored in the buffer if read.

gets( ) reads a line of text from standard input into the buffer addressed by buf. The newline character that ends the line is replaced by the null character that terminates the string in the buffer. fgets( ) is a preferable alternative to gets( ), as gets( ) offers no way to limit the number of characters read. (For this reason, the C11 standard removed gets( ) from the library.) There is no wide-character function corresponding to gets( ).

All three functions return the value of their argument buf, or a null pointer if an error occurred, or if there were no more characters to be read before the end of the file.
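Because fgets( ) stores the newline character in the buffer, programs often remove it before processing the line. The following is a minimal sketch of that idiom. The helper name read_line( ) is hypothetical.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: read one line with fgets( ) and remove the trailing
// newline character, if one was stored. Returns buf, or a null pointer at
// the end of the file or on error, like fgets( ) itself.
char *read_line( char *buf, int n, FILE *fp )
{
  if ( fgets( buf, n, fp ) == NULL )
    return NULL;

  buf[ strcspn( buf, "\n" ) ] = '\0';   // Overwrite '\n' (if any) with '\0'.
  return buf;
}
```

If a line is longer than n - 1 characters, no newline is found in the buffer, and the rest of the line is delivered by the next call.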

13.4.3.5. Writing strings

Use the following functions to write a null-terminated string to a stream:

    int fputs( const char *s, FILE *fp );
    int puts( const char *s );
    int fputws( const wchar_t *s, FILE *fp );

The three puts functions have some features in common as well as certain differences:

  • fputs( ) and fputws( ) write the string s to the output stream referenced by fp. The null character that terminates the string is not written to the output stream.

  • puts( ) writes the string s to the standard output stream, followed by a newline character. There is no wide-character function that corresponds to puts( ).

  • All three functions return EOF (not WEOF) if an error occurred, or a non-negative value to indicate success.

The function in the following example prints all the lines of a file that contain a specified string.

    // Write to stdout all the lines containing the specified search string
    // in the file opened for reading as fpIn.
    // Return value: The number of lines containing the search string,
    //               or -1 on error.
    // ----------------------------------------------------------------
    int searchFile( FILE *fpIn, const char *keyword )
    {
      #define MAX_LINE 256
      char line[MAX_LINE] = "";
      int count = 0;

      if ( fpIn == NULL || keyword == NULL )
        return -1;
      else
        rewind( fpIn );

      while ( fgets( line, MAX_LINE, fpIn ) != NULL )
        if ( strstr( line, keyword ) != NULL )
        {
          ++count;
          fputs( line, stdout );
        }

      if ( !feof( fpIn ) )
        return -1;
      else
        return count;
    }

13.4.3.6. Reading and writing blocks

The fread( ) function reads up to n objects whose size is size from the stream referenced by fp, and stores them in the array addressed by buffer:

    size_t fread( void *buffer, size_t size, size_t n, FILE *fp );

The function's return value is the number of objects transferred. A return value less than the argument n indicates that the end of the file was reached while reading, or that an error occurred.

The fwrite( ) function sends n objects whose size is size from the array addressed by buffer to the output stream referenced by fp:

    size_t fwrite( const void *buffer, size_t size, size_t n, FILE *fp );

Again, the return value is the number of objects written. A return value less than the argument n indicates that an error occurred.

Because the fread( ) and fwrite( ) functions do not deal with characters or strings as such, there are no corresponding functions for wide-oriented streams. On systems that distinguish between text and binary streams, the fread( ) and fwrite( ) functions should be used only with binary streams.

The program in the following example assumes that records have been saved in the file records.dat by means of the fwrite( ) function. A key value of 0 indicates that a record has been marked as deleted. In copying records to a new file, the program skips over records whose key is 0.

    // Copy records to a new file, filtering out those with the key value 0.
    // ---------------------------------------------------------------
    #include <stdio.h>
    #include <stdlib.h>

    #define ARRAY_LEN 100          // Maximum number of records in the buffer.
    // A structure type for the records:
    typedef struct { long key;
                     char name[32];
                     /* ... other fields in the record ... */ } Record_t;

    char inFile[ ]  = "records.dat",                // Filenames.
         outFile[ ] = "packed.dat";

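    // Terminate the program with an error message.
    // (static gives the inline function internal linkage, so that no
    // separate external definition is needed.)
    static inline void error_exit( int status, const char *error_msg )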
    {
      fputs( error_msg, stderr );
      exit( status );
    }

    int main( )
    {
      FILE *fpIn, *fpOut;
      Record_t record, *pArray;
      unsigned int i;

      if (( fpIn = fopen( inFile, "rb" )) == NULL )         // Open to read.
        error_exit( 1, "Error on opening input file." );

      else if (( fpOut = fopen( outFile, "wb" )) == NULL )  // Open to write.
        error_exit( 2, "Error on opening output file." );

      else if (( pArray = malloc( ARRAY_LEN * sizeof(Record_t) )) == NULL )
        error_exit( 3, "Insufficient memory." );   // Create the buffer.

      i = 0;                                // Read one record at a time:
      while ( fread( &record, sizeof(Record_t), 1, fpIn ) == 1 )
      {
        if ( record.key != 0L )             // If not marked as deleted ...
        {                                   // ... then copy the record:
           pArray[i++] = record;
           if ( i == ARRAY_LEN )                    // Buffer full?
           {                                        // Yes: write to file.
             if ( fwrite( pArray, sizeof(Record_t), i, fpOut) < i )
                break;
             i = 0;
           }
        }
      }
      if ( i > 0 )                        // Write the remaining records.
        fwrite( pArray, sizeof(Record_t), i, fpOut );

      if ( ferror(fpOut) )                                  // Handle errors.
        error_exit( 4, "Error on writing to output file." );
      else if ( ferror(fpIn) )
        error_exit( 5, "Error on reading input file." );

      return 0;
    }

13.4.4. Formatted Output

C provides formatted data output by means of the printf( ) family of functions. This section illustrates commonly used formatting options with appropriate examples. A complete, tabular description of output formatting options is included in Part II: see the discussion of the printf( ) function in Chapter 17.

13.4.4.1. The printf( ) function family

The printf( ) function and its various related functions all share the same capabilities of formatting data output as specified by an argument called the format string. However, the various functions have different output destinations and ways of receiving the data intended for output. The printf( ) functions for byte-oriented streams are:


    int printf( const char * restrict format, ... );

Writes to the standard output stream, stdout.

    int fprintf( FILE * restrict fp, const char * restrict format, ... );

Writes to the output stream specified by fp. The printf( ) function can be considered to be a special case of fprintf( ).

    int sprintf( char * restrict buf, const char * restrict format, ... );

Writes the formatted output to the char array addressed by buf, and appends a terminating null character.

    int snprintf( char * restrict buf, size_t n, const char * restrict format, ... );

Like sprintf( ), but never writes more than n bytes to the output buffer.
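The return value of snprintf( ) makes it possible to detect truncated output. The following is a minimal sketch. The helper name format_label( ) and the "name-id" layout are hypothetical and chosen only for illustration.

```c
#include <stdio.h>

// Hypothetical helper: format "name-id" into buf, which holds n bytes.
// Returns 1 if the whole result fit, or 0 if it was truncated or an
// error occurred.
int format_label( char *buf, size_t n, const char *name, int id )
{
  int needed = snprintf( buf, n, "%s-%d", name, id );

  // snprintf( ) returns the length that the complete output would have had,
  // not counting the terminating null character.
  return needed >= 0 && (size_t)needed < n;
}
```

With an 8-byte buffer, a short name fits, while a name of ten characters is cut off after seven characters, leaving room for the terminating null character.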

The ellipsis (...) in these function prototypes stands for more arguments, which are optional. Another subset of the printf( ) functions takes a pointer to an argument list, rather than accepting a variable number of arguments directly in the function call. The names of these functions begin with a v for "variable argument list":

    int vprintf( const char * restrict format, va_list argptr );
    int vfprintf( FILE * restrict fp, const char * restrict format,
                  va_list argptr );
    int vsprintf( char * restrict buf, const char * restrict format,
                  va_list argptr );
    int vsnprintf( char * restrict buffer, size_t n,
                   const char * restrict format, va_list argptr );

To use the variable argument list functions, you must include stdarg.h in addition to stdio.h.
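The v functions are most useful for writing your own printf-style wrappers. The following is a minimal sketch that uses vfprintf( ). The function name log_error( ) and the "error: " prefix are hypothetical.

```c
#include <stdio.h>
#include <stdarg.h>

// Hypothetical helper: write "error: ", the formatted message, and a newline
// to fp. Returns the number of characters written, or a negative value if
// an output error occurred.
int log_error( FILE *fp, const char *format, ... )
{
  va_list args;
  int prefix, body;

  prefix = fprintf( fp, "error: " );
  if ( prefix < 0 )
    return prefix;

  va_start( args, format );             // Initialize the argument pointer.
  body = vfprintf( fp, format, args );  // Pass it on to the v function.
  va_end( args );                       // Each va_start needs a va_end.
  if ( body < 0 )
    return body;

  if ( putc( '\n', fp ) == EOF )
    return -1;

  return prefix + body + 1;
}
```

A call such as log_error( stderr, "cannot open %s", filename ) then behaves like fprintf( ) with the prefix and the newline added.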

There are counterparts to all of these functions for output to wide-oriented streams. The wide-character printf( ) functions have names containing wprintf instead of printf, as in vfwprintf( ) and swprintf( ), for example. There is one exception: there is no snwprintf( ). Instead, swprintf( ) corresponds to the function snprintf( ), with a parameter for the maximum output length.

13.4.4.2. The format string

One argument passed to every printf( ) function is a format string. This is a definition of the data output format, and contains some combination of ordinary characters and conversion specifications. Each conversion specification defines how the function should convert and format one of the optional arguments for output. The printf( ) function writes the format string to the output destination, replacing each conversion specification in the process with the formatted value of the corresponding optional argument.

A conversion specification begins with a percent sign % and ends with a letter, called the conversion specifier. (To include a percent sign % in the output, there is a special conversion specification: %%. printf( ) converts this sequence into a single percent sign.)

The syntax of a conversion specification ends with the conversion specifier. Throughout the rest of this section, we use both these terms frequently in talking about the format strings used in printf( ) and scanf( ) function calls.


The conversion specifier determines the type of conversion to be performed, and must match the corresponding optional argument. An example:

    int score = 120;
    char player[ ] = "Mary";
    printf( "%s has %d points.\n", player, score );

The format string in this printf( ) call contains two conversion specifications: %s and %d. Accordingly, two optional arguments have been specified: a string, matching the conversion specifier s (for "string"), and an int, matching the conversion specifier d (for "decimal"). The function call in the example writes the following line to standard output:

    Mary has 120 points.

All conversion specifications (with the exception of %%) have the following general format:

    %[flags][field_width][.precision][length_modifier]specifier

The parts of this syntax that are indicated in square brackets are all optional, but any of them that you include must be placed in the order shown here. The permissible conversion specifications for each argument type are described in the sections that follow. Any conversion specification can include a field width. The precision does not apply to all conversion types, however, and its significance is different depending on the specifier.

13.4.4.3. Field widths

The field width option is especially useful in formatting tabular output. If included, the field width must be a positive decimal integer (or an asterisk, as described below). It specifies the minimum number of characters in the output of the corresponding data item. The default behavior is to position the converted data right-justified in the field, padding it with spaces to the left. If the flags include a minus sign (-), then the information is left-justified, and the excess field width padded with space characters to the right.

The following example first prints a line numbering the character positions to illustrate the effect of the field width option:

    printf("1234567890123456\n");                // Character positions.
    printf( "%-10s %s\n", "Player", "Score" );   // Table headers.
    printf( "%-10s %4d\n", "John", 120 );        // Field widths: 10; 4.
    printf( "%-10s %4d\n", "Mary", 77 );

These statements produce a little table:

    1234567890123456
    Player     Score
    John        120
    Mary         77

If the output conversion results in more characters than the specified width of the field, then the field is expanded as necessary to print the complete data output.

If a field is right-justified, it can be padded with leading zeroes instead of spaces. To do so, include a 0 (that's the digit zero) in the conversion specification's flags. The following example prints a date in the format mm-dd-yyyy:

    int month = 5, day = 1, year = 1987;
    printf( "Date of birth: %02d-%02d-%04d\n", month, day, year );

This printf( ) call produces the following output:

    Date of birth: 05-01-1987

You can also use a variable to specify the field width. To do so, insert an asterisk (*) as the field width in the conversion specification, and include an additional optional argument in the printf( ) call. This argument must have the type int, and must appear immediately before the argument to be converted for output. An example:

    char str[ ] = "Variable field width";
    int width = 30;
    printf( "%-*s!\n", width, str );

The printf statement in this example prints the string str at the left end of a field whose width is determined by the variable width. The results are as follows:

    Variable field width          !

Notice the trailing spaces preceding the bang (!) character in the output. Those spaces are not present in the string used to initialize str[ ]. The spaces are generated by virtue of the fact that the printf statement specifies a 30-character width for the string.

13.4.4.4. Printing characters and strings

The printf( ) conversion specifier for strings is s, as you have already seen in the previous examples. The specifier for individual characters is c (for char). They are summarized in Table 13-2.

Table 13-2. Conversion specifiers for printing characters and strings

Specifier   Argument types             Representation
---------   ------------------------   --------------------------------------------
c           int                        A single character
s           Pointer to any char type   The string addressed by the pointer argument


The following example prints a separator character between the elements in a list of team members:

    char *team[ ] = { "Vivian", "Tim", "Frank", "Sally" };
    char separator = ';';
    for ( int i = 0;  i < sizeof(team)/sizeof(char *); ++i )
      printf( "%10s%c ", team[i], separator );
    putchar( '\n' );

The argument represented by the specification %c can also have a narrower type than int, such as char. Integer promotion automatically converts such an argument to int. The printf( ) function then converts the int argument to unsigned char, and prints the corresponding character.

For string output, you can also specify the maximum number of characters of the string that may be printed. This is a special use of the precision option in the conversion specification, which consists of a dot followed by a decimal integer. An example:

    char msg[ ] = "Every solution breeds new problems.";
    printf( "%.14s\n", msg );      // Precision: 14.
    printf( "%20.14s\n", msg );    // Field width is 20; precision is 14.
    printf( "%.8s\n", msg+6 );     // Print the string starting at the 7th
                                   // character in msg, with precision 8.

These statements produce the following output:

    Every solution
          Every solution
    solution
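As with the field width, the standard also allows an asterisk in place of the precision, so that the maximum number of characters can be supplied as an int argument at run time. The following is a minimal sketch of that technique. The helper name copy_prefix( ) is hypothetical.

```c
#include <stdio.h>

// Hypothetical helper: copy at most len characters of src into buf, using
// an asterisk to pass the precision as an int argument.
// Returns the value returned by snprintf( ).
int copy_prefix( char *buf, size_t n, const char *src, int len )
{
  return snprintf( buf, n, "%.*s", len, src );
}
```

The int argument for the precision must appear immediately before the string argument, just as in the case of a variable field width.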

13.4.4.5. Printing integers

The printf( ) functions can convert integer values into decimal, octal, or hexadecimal notation. The conversion specifiers listed in Table 13-3 are provided for this purpose.

Table 13-3. Conversion specifiers for printing integers

Specifier   Argument types   Representation
---------   --------------   -------------------------------------------
d, i        int              Decimal
u           unsigned int     Decimal
o           unsigned int     Octal
x           unsigned int     Hexadecimal with lowercase a, b, c, d, e, f
X           unsigned int     Hexadecimal with uppercase A, B, C, D, E, F


The following example illustrates different conversions of the same integer value:

    printf( "%4d %4o %4x %4X\n", 63, 63, 63, 63 );

This printf( ) call produces the following output:

      63   77   3f   3F

The specifiers u, o, x, and X interpret the corresponding argument as an unsigned integer. If the argument's type is int and its value negative, the converted output is the positive number that corresponds to the argument's bit pattern when interpreted as an unsigned int:

    printf( "%d   %u   %X\n", -1, -1, -1 );

If int is 32 bits wide, this statement yields the following output:

    -1   4294967295   FFFFFFFF

Because the arguments are subject to integer promotion, the same conversion specifiers can be used to format short and unsigned short arguments. For arguments with the type long or unsigned long, you must prefix the length modifier l (a lowercase L) to the d, i, u, o, x, and X specifiers. Similarly, the length modifier for arguments with the type long long or unsigned long long is ll (two lowercase Ls). An example:

    long bignumber = 100000L;
    unsigned long long hugenumber = 100000ULL * 1000000ULL;
    printf( "%ld   %llX\n", bignumber, hugenumber );

These statements produce the following output:

    100000   174876E800
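The length modifiers can be combined with any of the integer specifiers. As a minimal sketch, the following helper formats one unsigned long long value in both decimal and hexadecimal notation. The name format_ull( ) and the output layout are hypothetical.

```c
#include <stdio.h>

// Hypothetical helper: format an unsigned long long value in decimal and in
// uppercase hexadecimal, using the ll length modifier for both conversions.
// Returns the value returned by snprintf( ).
int format_ull( char *buf, size_t n, unsigned long long value )
{
  return snprintf( buf, n, "%llu = 0x%llX", value, value );
}
```

Omitting the ll modifier here would make the conversion specification disagree with the argument type, and the behavior would be undefined.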

13.4.4.6. Printing floating-point numbers

Table 13-4 shows the printf( ) conversion specifiers to format floating-point numbers in various ways.

Table 13-4. Conversion specifiers for printing floating-point numbers

Specifier   Argument types   Representation
---------   --------------   ------------------------------------------------------------
f           double           Decimal floating-point number
e, E        double           Exponential notation, decimal
g, G        double           Floating-point or exponential notation, whichever is shorter
a, A        double           Exponential notation, hexadecimal


The most commonly used specifiers are f and e (or E). The following example illustrates how they work:

    double x = 12.34;
    printf( "%f  %e  %E\n", x, x, x );

This printf( ) call generates the following output line:

    12.340000  1.234000e+01  1.234000E+01

The e that appears in the exponential notation in the output is lowercase or uppercase, depending on whether you use e or E for the conversion specifier. Furthermore, as the example illustrates, the default output shows precision to six decimal places. The precision option in the conversion specification modifies this behavior:

    double value = 8.765;
    printf( "Value: %.2f\n", value );        // Precision is 2: output to two
                                             // decimal places.
    printf( "Integer value:\n"
            " Rounded:     %5.0f\n"          // Field width 5; precision 0.
            " Truncated:   %5d\n", value, (int)value );

These printf( ) calls produce the following output:

    Value: 8.77
    Integer value:
     Rounded:        9
     Truncated:      8

As this example illustrates, printf( ) rounds floating-point numbers up or down in converting them for output. If you specify a precision of 0, the decimal point itself is suppressed. If you simply want to truncate the fractional part of the value, you can cast the floating-point number to an integer type.

The specifiers described can also be used with float arguments, because they are automatically promoted to double. To print arguments of type long double, however, you must insert the length modifier L before the conversion specifier, as in this example:

    #include <math.h>
    long double xxl = expl(1000);
    printf( "e to the power of 1000 is %.2Le\n", xxl );

13.4.5. Formatted Input

To read in data from a formatted source, C provides the scanf( ) family of functions. Like the printf( ) functions, the scanf( ) functions take as one of their arguments a format string that controls the conversion between the I/O format and the program's internal data. This section highlights the differences between the uses of format strings and conversion specifications in the scanf( ) and the printf( ) functions.

13.4.5.1. The scanf( ) function family

The various scanf( ) functions all process the characters in the input source in the same way. They differ in the kinds of data sources they read, however, and in the ways in which they receive their arguments. The scanf( ) functions for byte-oriented streams are:


    int scanf( const char * restrict format, ... );

Reads from the standard input stream, stdin.

    int fscanf( FILE * restrict fp, const char * restrict format, ... );

Reads from the input stream referenced by fp.

    int sscanf( const char * restrict src, const char * restrict format, ... );

Reads from the char array addressed by src.

The ellipsis (...) stands for more arguments, which are optional. The optional arguments are pointers to the variables in which the scanf( ) function stores the results of its conversions.

Like the printf( ) functions, the scanf( ) family also includes variants that take a pointer to an argument list, rather than accepting a variable number of arguments directly in the function call. The names of these functions begin with the letter v for "variable argument list": vscanf( ), vfscanf( ), and vsscanf( ). To use the variable argument list functions, you must include stdarg.h in addition to stdio.h.

There are counterparts to all of these functions for reading wide-oriented streams. The names of the wide-character functions contain the sequence wscanf in place of scanf, as in wscanf( ) and vfwscanf( ), for example.

13.4.5.2. The format string

The format string for the scanf( ) functions contains both ordinary characters and conversion specifications that define how to interpret and convert the sequences of characters read. Most of the conversion specifiers for the scanf( ) functions are similar to those defined for the printf( ) functions. However, conversion specifications in the scanf( ) functions have no flags and no precision options. The general syntax of conversion specifications for the scanf( ) functions is as follows:

    %[*][field_width][length_modifier]specifier

For each conversion specification in the format string, one or more characters are read from the input source and converted in accordance with the conversion specifier. The result is stored in the object addressed by the corresponding pointer argument. An example:

    int age = 0;
    char name[64] = "";
    printf( "Please enter your first name and your age:\n" );
    scanf( "%s%d", name, &age );

Suppose that the user enters the following line when prompted:

    Bob 27\n

The scanf( ) call writes the string Bob into the char array name, and the value 27 in the int variable age.

All conversion specifications except those with the specifiers c, [ (the scanset), and n skip over leading whitespace characters. In the previous example, the user could type any number of space, tab, or newline characters before the first word, Bob, or between Bob and 27, without affecting the results.

The sequence of characters read for a given conversion specification ends when scanf( ) reads any whitespace character, or any character that cannot be interpreted under that conversion specification. Such a character is pushed back onto the input stream, so that processing for the next conversion specification begins with that character. In the previous example, suppose the user enters this line:

    Bob 27years\n

Then on reaching the character y, which cannot be part of a decimal numeral, scanf( ) stops reading characters for the conversion specification %d. After the function call, the characters years\n would remain in the input stream's buffer.
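
One way to observe where a conversion stopped is to scan a string with sscanf( ) and add the n specifier, which stores the number of characters consumed so far and does not count toward the return value. The following is a sketch under that assumption; the function name parse_name_age is hypothetical:

```c
#include <stdio.h>

// Hypothetical helper: reads a name and an age from line.
// *consumed receives the number of characters read up to the %n specifier.
// Returns the sscanf() result (%n is not counted as a converted item).
int parse_name_age( const char *line, char name[64], int *age, int *consumed )
{
  *consumed = 0;
  return sscanf( line, "%63s%d%n", name, age, consumed );
}
```

For the input string "Bob 27years\n", the unread remainder begins at line + *consumed, which points to the character y.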

If, after skipping over any whitespace, scanf( ) doesn't find a character that matches the current conversion specification, an error occurs, and the scanf( ) function stops processing the input. We'll show you how to detect such errors in a moment.

Often the format string in a scanf( ) function call contains only conversion specifications. If not, all other characters in the format string except whitespace characters must literally match characters in corresponding positions in the input source. Otherwise, the scanf( ) function quits processing and pushes the mismatched character back on to the input stream.
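
The following sketch illustrates literal matching with a colon between two integers. The function name parse_time is hypothetical, and sscanf( ) is used so that the input is a fixed string:

```c
#include <stdio.h>

// Hypothetical helper: expects input of the form "hours:minutes".
// The ':' in the format string must match a ':' in the input literally.
// Returns the number of items converted.
int parse_time( const char *src, int *hours, int *minutes )
{
  return sscanf( src, "%d:%d", hours, minutes );
}
```

With the input "12:34" both values are converted. With "12-34" the conversion stops at the mismatched hyphen, and only the first value is stored.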

One or more consecutive whitespace characters in the format string match any number of consecutive whitespace characters in the input stream. In other words, for any whitespace in the format string, scanf( ) reads past all whitespace characters in the data source up to the first non-whitespace character. Knowing this, what's the matter with the following scanf( ) call?

    scanf( "%s%d\n", name, &age );      // Problem?

Suppose that the user enters the following line:

    Bob 27\n

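In this case, scanf( ) doesn't return after reading the newline character, but instead continues reading more input until a non-whitespace character comes along. The reason is that the \n at the end of the format string is whitespace, and therefore matches any amount of whitespace in the input.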
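
One common way to avoid waiting for extra input is to read a whole line with fgets( ) first and then convert it with sscanf( ). This is only a sketch of that approach; the function name read_name_age and the line buffer size of 256 are assumptions made for the example:

```c
#include <stdio.h>

// Hypothetical helper: reads one line from in, then extracts a name and an
// age from that line.
// Returns the number of items converted, or EOF if no line could be read.
int read_name_age( FILE *in, char name[64], int *age )
{
  char line[256];                          // Assumed maximum line length.
  if ( fgets( line, sizeof line, in ) == NULL )
    return EOF;
  return sscanf( line, "%63s%d", name, age );
}
```

Because the conversion works on a string that already ends at the newline, a missing or malformed value cannot cause the call to consume the following line.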

Sometimes you will want to read past any sequence of characters that matches a certain conversion specification without storing the result. You can achieve exactly this effect by inserting an asterisk (*) immediately after the percent sign (%) in the conversion specification. Do not include a pointer argument for a conversion specification with an asterisk.
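
As an illustration of assignment suppression, the following sketch extracts only the month from a date string. The function name month_of and the YYYY-MM-DD input layout are assumptions for this example:

```c
#include <stdio.h>

// Hypothetical helper: returns the month from a date of the form YYYY-MM-DD,
// or -1 if the input does not match.
// The year and the day are read and discarded by %*d, so only one pointer
// argument is supplied.
int month_of( const char *date )
{
  int month = 0;
  if ( sscanf( date, "%*d-%d-%*d", &month ) < 1 )
    return -1;
  return month;
}
```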

The return value of a scanf( ) function is the number of data items successfully converted and stored. If everything goes well, the return value matches the number of conversion specifications, not counting any that contain an asterisk. The scanf( ) functions return the value of EOF if a read error occurs or they reach the end of the input source before converting any data items. An example:

    if ( scanf( "%s%d", name, &age ) < 2 )
      fprintf( stderr, "Bad input.\n" );
    else
    {  /* ... Test the values stored ... */  }
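
The possible return values can be seen by applying the same format string to different inputs. The following sketch uses sscanf( ) for that purpose; the function name count_converted is hypothetical:

```c
#include <stdio.h>

// Hypothetical helper: applies the format "%63s%d" to src and returns the
// sscanf() result: 2 if both items were converted, 1 if only the name was,
// or EOF if the input ended before any item could be converted.
int count_converted( const char *src )
{
  char name[64] = "";
  int age = 0;
  return sscanf( src, "%63s%d", name, &age );
}
```

For example, "Bob 27" yields 2, "Bob x" yields 1, and an empty string yields EOF.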

13.4.5.3. Field widths

The field width is a positive decimal integer that specifies the maximum number of characters that scanf( ) reads for the given conversion specification. For string input, this item can be used to prevent buffer overflows:

    char city[32];
    printf( "Your city: ");
    if ( scanf( "%31s", city ) < 1 )  // Never read in more than 31 characters!
      fprintf( stderr, "Error reading from standard input.\n" );
    else
    /* ... */

Unlike printf( ), which exceeds the specified field width whenever the output is longer than that number of characters, scanf( ) with the s conversion specifier never reads more characters than the number specified by the field width. Because it then appends a terminating null character, the buffer must be at least one char longer than the field width.
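
The effect of the field width can be sketched with a deliberately small buffer. The function name read_city and the buffer size of 8 are chosen only for this illustration:

```c
#include <stdio.h>

// Hypothetical helper: stores at most 7 characters of the first word of src
// in city, followed by the terminating null character.
// Returns the sscanf() result.
int read_city( const char *src, char city[8] )
{
  return sscanf( src, "%7s", city );
}
```

Given the input "Springfield", the array receives the seven characters Springf plus the null character; the remaining characters are left unread.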

13.4.5.4. Reading characters and strings

The conversion specifications %c and %1c read the next character in the input stream, even if it is a whitespace character. By specifying a field width, you can read that exact number of characters, including whitespace characters, as long as the end of the input stream does not intervene. When you read more than one character in this way, the corresponding pointer argument must point to a char array that is large enough to hold all the characters read. The scanf( ) function with the c conversion specifier does not append a terminating null character. An example:

    scanf( "%*5c" );

This scanf( ) call reads and discards the next five characters in the input source.
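
When the characters are stored rather than discarded, the program must supply the terminating null character itself if it wants a string. The following is a sketch of one way to do so; the function name read_code is hypothetical:

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: copies exactly three characters from src into code,
// including any whitespace, and leaves the result null-terminated.
// Returns the sscanf() result.
int read_code( const char *src, char code[4] )
{
  memset( code, 0, 4 );                 // %c appends no '\0': clear first.
  return sscanf( src, "%3c", code );    // Leading whitespace is not skipped.
}
```

For the input " ab cd", the array then contains a space followed by ab.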

The conversion specification %s always reads just one word, as a whitespace character ends the sequence read. To read whole text lines, you can use the fgets( ) function.

The following example reads the contents of a text file word by word. Here we assume that the file pointer fp is associated with a text file that has been opened for reading:

    char word[128];
    while ( fscanf( fp, "%127s", word ) == 1 )
    {
      /* ... process the word read ... */
    }

In addition to the conversion specifier s, you can also read strings using the "scanset" specifier, which consists of an unordered set of characters between square brackets ([scanset]). The scanf( ) function then reads all characters, and saves them as a string (with a terminating null character), until it reaches a character that does not match any of those in the scanset. An example:

    char strNumber[32];
    scanf( "%31[0123456789]", strNumber );   // Field width prevents overflow.

If the user enters 345X67, then scanf( ) stores the string 345\0 in the array strNumber. The character X and all subsequent characters remain in the input buffer.
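
The same behavior can be reproduced on a fixed string with sscanf( ). In this sketch the function name leading_digits is hypothetical, and a field width is included to protect the 32-byte buffer:

```c
#include <stdio.h>

// Hypothetical helper: copies the leading run of decimal digits in src to
// digits as a string. Returns 1 if at least one digit was found, 0 if the
// first character is not a digit, or EOF if src is empty.
int leading_digits( const char *src, char digits[32] )
{
  digits[0] = '\0';
  return sscanf( src, "%31[0123456789]", digits );
}
```

Note that a scanset conversion fails if not even one character matches, so the input "X67" produces the return value 0 and stores nothing.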

To invert the scanset (that is, to match all characters except those between the square brackets), insert a caret (^) immediately after the opening bracket. The following scanf( ) call reads all characters, including whitespace, up to a punctuation character that terminates a sentence; and then reads the punctuation character itself:

    char ch, sentence[512];
    scanf( "%511[^.!?]%c", sentence, &ch );

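The following scanf( ) call can be used to read and discard all characters up to and including the end of the current line, provided that at least one character precedes the newline. If the next character is already a newline, the scanset conversion fails and the newline is not consumed: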

    scanf( "%*[^\n]%*c" );
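
A scanset conversion fails when no character matches, so a single call of this kind does not consume the newline of an empty line. One possible workaround, sketched here with the hypothetical function name skip_line, is to read the newline in a separate step:

```c
#include <stdio.h>

// Hypothetical helper: discards the rest of the current line of in,
// including the newline character, even if the line is empty.
void skip_line( FILE *in )
{
  if ( fscanf( in, "%*[^\n]" ) == EOF )   // Skip up to, but not past, '\n'.
    return;                               // End of input or read error.
  getc( in );                             // Consume the newline, if any.
}
```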

13.4.5.5. Reading integers

Like the printf( ) functions, the scanf( ) functions offer the following conversion specifiers for integers: d, i, u, o, x, and X. These allow you to read and convert decimal, octal, and hexadecimal notation to int or unsigned int variables. An example:

    // Read a non-negative decimal integer:
    unsigned int value = 0;
    if ( scanf( "%u", &value ) < 1 )
      fprintf( stderr, "Unable to read an integer.\n" );
    else
      /* ... */

For the specifier i in the scanf( ) functions, the base of the numeral read is not predefined. Instead, it is determined by the prefix of the numeric character sequence read, in the same way as for integer constants in C source code (see "Integer Constants" in Chapter 3). If the character sequence does not begin with a zero, then it is interpreted as a decimal numeral. If it does begin with a zero, and the second character is not x or X, then the sequence is interpreted as an octal numeral. A sequence that begins with 0x or 0X is read as a hexadecimal numeral.
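
The following sketch shows the i specifier applied to the same value in three notations. The function name parse_any_base is hypothetical:

```c
#include <stdio.h>

// Hypothetical helper: converts a decimal, octal (leading 0), or
// hexadecimal (leading 0x or 0X) numeral in src to an int.
// Returns the sscanf() result.
int parse_any_base( const char *src, int *value )
{
  return sscanf( src, "%i", value );
}
```

The inputs "42", "052", and "0x2A" all store the value 42.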

To assign the integer read to a short, char, long, or long long variable (or to a variable of a corresponding unsigned type), you must insert a length modifier before the conversion specifier: h for short, hh for char, l for long, or ll for long long. In the following example, the FILE pointer fp refers to a file opened for reading:

    unsigned long position = 0;
    if ( fscanf( fp, "%lX", &position ) < 1 )  // Read a hexadecimal integer.
      /* ... Handle error: unable to read a numeral ... */
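
As a further sketch, the following function combines several length modifiers in one format string. The function name parse_sizes is hypothetical:

```c
#include <stdio.h>

// Hypothetical helper: reads three decimal integers from src into a short,
// a long, and a long long. Each length modifier must match the type of the
// object that the corresponding pointer addresses.
// Returns the number of items converted.
int parse_sizes( const char *src, short *s, long *l, long long *ll )
{
  return sscanf( src, "%hd %ld %lld", s, l, ll );
}
```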

13.4.5.6. Reading floating-point numbers

To process floating-point numerals, the scanf( ) functions use the same conversion specifiers as printf( ): f, e, E, g, and G. Furthermore, C99 has added the specifiers a and A. All of these specifiers interpret the character sequence read in the same way. The character sequences that can be interpreted as floating-point numerals are the same as the valid floating-point constants in C; see "Floating-Point Constants" in Chapter 3. scanf( ) can also convert integer numerals and store them in floating-point variables.

All of these specifiers convert the numeral read into a floating-point value with the type float. If you want to convert and store the value read as a variable of type double or long double, you must insert a length modifier: either l (a lowercase L) for double, or L for long double. An example:

    float x = 0.0F;
    double xx = 0.0;
    // Read in two floating-point numbers; convert one to float and the other
    // to double:
    if ( scanf( "%f %lf", &x, &xx ) < 2 )
      /* ... */

If this scanf( ) call receives the input sequence 12.3 7\n, then it stores the value 12.3 in the float variable x, and the value 7.0 in the double variable xx.
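
The long double case works the same way with the length modifier L. The following sketch reads one value of each floating-point type; the function name parse_reals is hypothetical:

```c
#include <stdio.h>

// Hypothetical helper: reads three numerals from src and stores them as
// float, double, and long double, respectively.
// Returns the number of items converted.
int parse_reals( const char *src, float *f, double *d, long double *ld )
{
  return sscanf( src, "%f %lf %Lf", f, d, ld );
}
```

With the input "12.5 7 1e3", the three objects receive the values 12.5, 7.0, and 1000.0.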

