CSC 161 Grinnell College Fall, 2011
Imperative Problem Solving and Data Structures

Reading on File Streams


Users generally think of files in either of two ways:

  1. Streams: Files may be considered as streams of data (e.g., sequences of characters, sequences of integers, sequences of real numbers).
  2. Text Files: Files may be considered as being divided into lines, with processing proceeding line by line.

This reading focuses on streams; the next reading covers text files.

Overview of Working with Files

Working with files (either streams or text files) generally involves three main steps:

  1. Opening a file: Preparing to read or write
  2. Processing file data: Files may be considered either as streams or text files
  3. Closing the file: Wrapping up by performing final operations

Opening a File

Users typically think of a file as a logical entity — often with a descriptive name, such as myData.dat. Within the machine, however, the file has a specific location on the disk or other storage medium. Also, during processing, the computer must keep track of what part of the file has already been processed and what data will be considered next. In addition, behind the scenes, the computer must handle details of moving data between main memory and the disk or other storage device.

In C, both input and output files are declared as pointer variables to elements of type FILE. Preparation to use a file involves connecting the user's logical name with the file variable, and setting up the internal details for reading or writing. These details are handled by the fopen procedure. The basic form of the command is:

   <variable> = fopen (<file-name>, <mode>)

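In this context, <variable> is a variable of type FILE *, <file-name> is a string that names the file, and <mode> is a string that states how the file will be used: "r" to read, "w" to write (creating the file or discarding its previous contents), or "a" to append to the end. If the file cannot be opened, fopen returns NULL.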

For example, we might write:

  FILE * myfile = fopen("integers.dat", "r");

to prepare to read a file that the user called "integers.dat".
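
Since fopen returns NULL when a file cannot be opened, it is usually worth checking the result before reading or writing. The following is a minimal sketch of that check; the helper name open_or_report is hypothetical and is not part of the C library.

```c
#include <stdio.h>

/* Hypothetical helper (not a library function): open a file and
   report any failure.  Returns the FILE pointer from fopen, or
   NULL if the file could not be opened. */
FILE *open_or_report (const char *name, const char *mode)
{
  FILE *fp = fopen (name, mode);
  if (fp == NULL)
    perror (name);   /* writes the name and the system's reason to stderr */
  return fp;
}
```

A caller might write FILE * myfile = open_or_report ("integers.dat", "r"); and skip all further processing when myfile is NULL.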

Closing a File

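When we are done working with a file, we close it with fclose, which takes the file variable as its only parameter. Closing an output file also forces any data still held in memory to be written to the disk.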
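
The following sketch places fclose at the end of a short open, process, close sequence. The helper name make_empty_file is hypothetical, and the comment marks where output statements would normally appear.

```c
#include <stdio.h>

/* Hypothetical helper: create (or empty) a file by opening it for
   writing and then closing it.
   Returns 0 on success and -1 on failure. */
int make_empty_file (const char *name)
{
  FILE *myfile = fopen (name, "w");
  if (myfile == NULL)
    return -1;            /* nothing was opened, so nothing to close */

  /* ... statements that write to myfile would go here ... */

  /* fclose writes out any buffered data and releases the file;
     it returns 0 on success and EOF on failure. */
  if (fclose (myfile) != 0)
    return -1;
  return 0;
}
```

Checking the value returned by fclose matters most for output files, since a failure there can mean the last data written never reached the disk.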


Working with Streams of Numbers

A stream is a sequence of data elements (normally of the same type). Two common types of streams are sequences of numbers and sequences of characters.

Programs genfiles.c and readfiles.c illustrate processing with streams of numbers. Both programs work with a file of integers "integer.file" and a file of real numbers "real.file". For readability, these files place each number on a separate line, but file processing ignores the lines and proceeds simply number-by-number.

Pragmatically, the process of writing and reading to files is much the same as writing and reading with a terminal window, except that we must specify which file to use. Thus, we use fprintf and fscanf instead of printf and scanf. In each case, the file procedures have an additional first parameter, which indicates a file variable (for a file that has already been opened).
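
To make the parallel with printf and scanf concrete, here is a small sketch that writes integers to a file with fprintf and reads the first one back with fscanf. The helper names write_integers and read_first_integer are hypothetical and are not taken from genfiles.c or readfiles.c.

```c
#include <stdio.h>

/* Hypothetical helper: write the integers 1..n to a file, one per line.
   Returns 0 on success and -1 on failure. */
int write_integers (const char *name, int n)
{
  FILE *out = fopen (name, "w");
  if (out == NULL)
    return -1;
  for (int i = 1; i <= n; i++)
    fprintf (out, "%d\n", i);     /* compare: printf ("%d\n", i); */
  return (fclose (out) == 0) ? 0 : -1;
}

/* Hypothetical helper: read the first integer in a file into *value.
   Returns 0 on success and -1 on failure. */
int read_first_integer (const char *name, int *value)
{
  FILE *in = fopen (name, "r");
  if (in == NULL)
    return -1;
  int items = fscanf (in, "%d", value);   /* compare: scanf ("%d", value); */
  fclose (in);
  return (items == 1) ? 0 : -1;
}
```

In both helpers the format string is exactly what printf or scanf would take; the only change is the file variable passed as the first parameter.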

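When using fscanf for reading, the procedure returns an integer as follows:

  1. If data are read successfully, fscanf returns the number of values it stored. This number may be smaller than the number requested, even 0, if the next characters in the file do not match the format.
  2. If the end of the file is reached before any value is read, fscanf returns EOF.

Thus, testing (fscanf ( ... ) != EOF) provides a natural way to continue processing until the end of a file is encountered, provided the file contains only data of the expected form.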
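
As an illustration of an end-of-file loop, the sketch below adds up every integer in a file. It is one possible arrangement, not the course's readfiles.c, and the helper name sum_integers is hypothetical. The loop compares the result of fscanf against 1 rather than against EOF, so that it also stops if the file contains something that is not a number.

```c
#include <stdio.h>

/* Hypothetical helper: add up every integer in a file.
   Stores the total in *sum and returns how many numbers were read,
   or -1 if the file could not be opened. */
long sum_integers (const char *name, long *sum)
{
  FILE *in = fopen (name, "r");
  if (in == NULL)
    return -1;

  long count = 0;
  int n;
  *sum = 0;

  /* fscanf returns 1 each time it stores an integer, EOF at the end
     of the file, and 0 if the next characters are not a number. */
  while (fscanf (in, "%d", &n) == 1) {
    *sum += n;
    count++;
  }

  fclose (in);
  return count;
}
```

Because "%d" skips blanks and line breaks before each number, the loop gives the same result whether the numbers sit on separate lines or on one line.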

For example, the following code segment reads three real numbers from a file "real.file" and computes their average:

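   FILE * rFile = fopen ("real.file", "r");
   double x, y, z;
   if (rFile == NULL)
       printf ("error:  could not open real.file\n");
   else {
       /* each fscanf returns 1 when it stores a number;
          EOF or 0 means that no number was available */
       if ((fscanf (rFile, "%lf", &x) != 1)
               || (fscanf (rFile, "%lf", &y) != 1)
               || (fscanf (rFile, "%lf", &z) != 1))
           printf ("error:  three numbers needed for average\n");
       else {
           double average = (x + y + z) / 3.0;
           printf ("the average is %lf\n", average);
       }
       fclose (rFile);
   }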

Programs genfiles.c and readfiles.c provide additional examples of generating and reading files of numbers.

Working with Streams of Characters

When considering a file as a sequence of characters, one approach would be to use

   fscanf (rFile, "%c", &ch);

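However, a more common approach uses ch = fgetc(rFile) or its alternative ch = getc(rFile). Both procedures return an int rather than a char, so ch should be declared as an int; otherwise the special value EOF cannot be distinguished reliably from an ordinary character. (Differences between fgetc and getc involve some behind-the-scenes technicalities in C, and we ignore such matters here.)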
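
The two approaches can be compared side by side. The sketch below counts the characters in a file twice, once with fscanf and "%c" and once with fgetc; the helper names are hypothetical. Note the different declarations of ch: "%c" stores into a char, while fgetc returns an int.

```c
#include <stdio.h>

/* Hypothetical helper: count the characters in a file using fscanf.
   Returns the count, or -1 if the file could not be opened. */
long count_with_fscanf (const char *name)
{
  FILE *rFile = fopen (name, "r");
  if (rFile == NULL)
    return -1;
  long count = 0;
  char ch;                       /* "%c" stores into a char */
  while (fscanf (rFile, "%c", &ch) == 1)
    count++;
  fclose (rFile);
  return count;
}

/* Hypothetical helper: count the characters in a file using fgetc.
   Returns the count, or -1 if the file could not be opened. */
long count_with_fgetc (const char *name)
{
  FILE *rFile = fopen (name, "r");
  if (rFile == NULL)
    return -1;
  long count = 0;
  int ch;                        /* int, so that EOF can be recognized */
  while ((ch = fgetc (rFile)) != EOF)
    count++;
  fclose (rFile);
  return count;
}
```

Both loops deliver every character, including spaces and line breaks, so the two counts should agree for the same file.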

Program fileletters-1.c shows the use of a character stream in an application that counts the number of times various letters appear in a file. Although the program handles a range of cases, the main loop shows the key elements of using a stream of characters:

   /* processing the file; ch is declared as an int */
   while ((ch = getc (file_var)) != EOF) {
     /* process the character before going on
        to the next character in the file */
   }
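
To show the loop in a complete setting, here is a minimal letter-counting sketch. It is not the text of fileletters-1.c; the helper name count_letters is hypothetical, and the code assumes that the letters 'a' through 'z' have consecutive character codes, as they do in ASCII.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: count how often each letter appears in a file,
   treating upper and lower case alike.  counts[0] is for 'a',
   counts[25] for 'z'.  Assumes 'a'..'z' are consecutive codes (ASCII).
   Returns 0 on success and -1 if the file could not be opened. */
int count_letters (const char *name, long counts[26])
{
  FILE *file_var = fopen (name, "r");
  if (file_var == NULL)
    return -1;

  for (int i = 0; i < 26; i++)
    counts[i] = 0;

  int ch;                                  /* int, so EOF is recognized */
  while ((ch = getc (file_var)) != EOF) {
    int lower = tolower (ch);              /* fold 'A'..'Z' to 'a'..'z' */
    if (lower >= 'a' && lower <= 'z')
      counts[lower - 'a']++;               /* other characters are skipped */
  }

  fclose (file_var);
  return 0;
}
```

After a successful call, counts['e' - 'a'] holds the number of times the letter e appears in the file in either case.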