Computer Science 195:  Program Management

Program Management

This page describes the division of a binary-search-tree program into logical components, expanding upon Lab 9, Introduction to Binary Search Trees and Lab 10, Modifying Tree Structures.

Previously, program ~walker/c/trees/bst.c contained all components of the binary-search code in a single file. Specifically, this program contained:

While such a monolithic framework works fine for small projects, the use of a single file for an entire program has several drawbacks:

In C (and other languages), such problems are resolved following a two-pronged approach:

  1. A program is divided into multiple files.
  2. Compiling is automated, so that multiple files can be compiled as needed using a simple command line.

Dividing the bst.c Program Into Pieces

Since bst.c contains several independent components, a separate file could be defined for each component. The relevant files and their dependencies are shown below:

[Figure: bst program file dependencies]

As this diagram indicates, the original bst.c program may be divided into the following four components:

The source files for all of these components may be found in directory ~walker/c/prog-mgmt.

Within this structure, node.h is independent of the others. However, information about a node structure is needed elsewhere, so both bst-proc.h and main.c contain references to node.h in include statements. Similarly, both implementation files (bst-proc.c and main.c) reference tree operations, so both contain references to bst-proc.h.

Technically, you may have noted that main.c includes bst-proc.h, which in turn includes node.h, so an explicit inclusion of node.h in main.c is unnecessary. However, in such a distributed structure of files, it is not uncommon for some definitions to be referenced in several places. (A programmer could track down all possible references, but this may undermine some of the advantages of dividing the program into pieces.)
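As a rough illustration of what belongs in each kind of file, the following sketch collapses the pieces into one listing, with comments marking where each file would begin. The field layout of struct node and the function names tree_insert and tree_contains are assumptions made for this illustration; they are not taken from the files in ~walker/c/prog-mgmt.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- would live in node.h: shared type definitions (hypothetical layout) ---- */
#define strMax 20

struct node {
    char data[strMax];
    struct node *left;
    struct node *right;
};

/* ---- would live in bst-proc.h, after #include "node.h": prototypes only ---- */
struct node *tree_insert(struct node *root, const char *item);
int tree_contains(const struct node *root, const char *item);

/* ---- would live in bst-proc.c, after #include "bst-proc.h": the code ---- */

/* Insert item into the tree rooted at root and return the (possibly new) root.
   Items are compared on their first strMax - 1 characters only. */
struct node *tree_insert(struct node *root, const char *item)
{
    int cmp;

    assert(item != NULL);
    if (root == NULL) {
        struct node *fresh = malloc(sizeof *fresh);
        if (fresh == NULL)
            return NULL;                    /* out of memory */
        strncpy(fresh->data, item, strMax - 1);
        fresh->data[strMax - 1] = '\0';     /* always terminate the string */
        fresh->left = fresh->right = NULL;
        return fresh;
    }
    cmp = strncmp(item, root->data, strMax - 1);
    if (cmp < 0)
        root->left = tree_insert(root->left, item);
    else if (cmp > 0)
        root->right = tree_insert(root->right, item);
    return root;                            /* duplicates are ignored */
}

/* Return 1 if item is stored in the tree, 0 otherwise. */
int tree_contains(const struct node *root, const char *item)
{
    assert(item != NULL);
    while (root != NULL) {
        int cmp = strncmp(item, root->data, strMax - 1);
        if (cmp == 0)
            return 1;
        root = (cmp < 0) ? root->left : root->right;
    }
    return 0;
}
```

In this arrangement, a main program would need only the prototypes and the node type, which is why it includes the header files rather than the implementation file.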

Unfortunately, this multiple referencing of a file could mean that a definition is given twice in a program, and compilers take a dim view of such matters. To resolve this problem, node.h contains the lines:


#ifndef _NODE_H
#define _NODE_H

...

#endif

In C, files can define identifiers for the compiler, and the compiler can check whether an identifier has been defined previously. For example, the identifier strMax is defined as the number 20 for a global constant, just as was done in previous programs. However, in node.h, a new identifier _NODE_H also is defined. With this new identifier, when a file first references node.h, the identifier _NODE_H will not have been defined. The test #ifndef asks the compiler whether an identifier is not defined, and in this case processing continues through the lines between #ifndef and #endif. This first inclusion, therefore, defines identifier _NODE_H. With any subsequent references to node.h, identifier _NODE_H will have been defined, so the lines between #ifndef and #endif are skipped and the definitions are not processed a second time.
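The effect of the guard can be seen in a single file by writing the guarded text twice, which is roughly what the compiler sees when the same header is included by two routes. This is only a sketch: the contents of struct node and the helper node_init are assumptions for illustration, and the guard is named NODE_H_DEMO here because standard C reserves names that begin with an underscore followed by a capital letter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* First inclusion of the (hypothetical) header contents.
   NODE_H_DEMO is not yet defined, so everything up to #endif is processed. */
#ifndef NODE_H_DEMO
#define NODE_H_DEMO
#define strMax 20
struct node {
    char data[strMax];
    struct node *left;
    struct node *right;
};
#endif

/* Second inclusion, as if reached indirectly through another header.
   NODE_H_DEMO is now defined, so everything up to #endif is skipped
   and struct node is not defined a second time. */
#ifndef NODE_H_DEMO
#define NODE_H_DEMO
#define strMax 20
struct node {
    char data[strMax];
    struct node *left;
    struct node *right;
};
#endif

/* Hypothetical helper that uses the single surviving definition. */
void node_init(struct node *n, const char *item)
{
    assert(n != NULL && item != NULL);
    strncpy(n->data, item, strMax - 1);
    n->data[strMax - 1] = '\0';     /* always terminate the string */
    n->left = NULL;
    n->right = NULL;
}
```

Deleting the #ifndef, #define, and #endif lines from both copies would make the compiler report a redefinition of struct node, which is the error the guard prevents.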

Compiling

With this structure, the header files node.h and bst-proc.h contain definitions, but do not yield any code directly. Files bst-proc.c and bst.c, however, must be compiled. Since these files are independent, they can be compiled in either order, with the commands:


gcc -c bst-proc.c
gcc -c bst.c

Here the -c flag tells the compiler to produce a machine-language or "object" file, but not to expect the whole program to be present. The resulting files have a .o extension.

These pieces then can be linked together with the command:


gcc -o bst bst.o bst-proc.o

Alternatively, if bst.c is to be compiled after bst-proc.c, then compiling and linking of bst.c can be done in one step. The resulting commands are:


gcc -c bst-proc.c
gcc -o bst bst.c bst-proc.o

As the second line illustrates, the main .c file is given before any object files.

make and Makefile in Linux/Unix

While the division of software into multiple files can ease development, manually compiling all of the pieces can be tedious. Unix provides a make capability to automate this process, where instructions for compiling are given in a file called Makefile. Here is one version of such a file: Makefile.

While this Makefile is slightly more complex than is absolutely necessary, this version shows several elements common to many Makefiles. Running make twice at a workstation provides the following interaction.


$ make
gcc -ansi -c bst.c
gcc -ansi -c bst-proc.c
gcc -o bst bst.o bst-proc.o
$ make
make: Nothing to be done for `all'.

As this illustrates, make and Makefile keep track of what needs to be done to compile and link the designated files. Work occurs only as needed. Thus, the first time make was run, both programs were compiled and the resulting object files linked. However, the second time make was run, the machine detected that no files had changed from the first time, so no further work was needed. To expand on this point, if file bst-proc.c were changed, but no other changes were made, running make might produce the following:


$ make
gcc -ansi -c bst-proc.c
gcc -o bst bst.o bst-proc.o

Here, nothing related to file bst.c had changed, so that was not recompiled. More generally, make reviews the status of all relevant files and compiles and links only those that are out of date.
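The out-of-date test rests on file modification times: a target needs rebuilding if it does not exist or if any file it depends on has been modified more recently. The following C function is a minimal sketch of that comparison, assuming a POSIX system where stat is available. The name needs_rebuild is invented for this illustration, and make itself does considerably more (for example, it first brings the dependencies themselves up to date).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Return 1 if target should be rebuilt, 0 if it appears up to date.
   deps lists the files that target depends on; ndeps is their count. */
int needs_rebuild(const char *target, const char *const deps[], size_t ndeps)
{
    struct stat target_info;
    size_t i;

    assert(target != NULL);
    if (stat(target, &target_info) != 0)
        return 1;               /* target is missing, so it must be built */

    for (i = 0; i < ndeps; i++) {
        struct stat dep_info;

        if (stat(deps[i], &dep_info) != 0)
            return 1;           /* simplification: a missing dependency
                                   is treated as a reason to rebuild */
        if (dep_info.st_mtime > target_info.st_mtime)
            return 1;           /* dependency is newer than the target */
    }
    return 0;                   /* nothing is newer than the target */
}
```

For the example above, a call with target "bst-proc.o" and dependencies "bst-proc.c", "bst-proc.h", and "node.h" would report that a rebuild is needed after bst-proc.c is edited, while the corresponding call for "bst.o" would not.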

With this overview of make, we now look at the Makefile instructions more carefully. While comments are very helpful for documentation, general processing in a Makefile has three components: dependencies, rules, and macros.

Comments in a Makefile begin with the character #. The comment continues for the rest of the line, as in bash or csh shell programming.

Dependencies within a Makefile indicate which files depend on which. In the example, these dependencies are given by:


all: bst
bst:  bst.o bst-proc.o
bst.o:  bst.c node.h bst-proc.h
bst-proc.o:  bst-proc.c bst-proc.h node.h

After the first line, each line indicates which other files are needed in order to compile or link the given resulting file. The target file is given first, followed by a colon, and the required files follow.

The first line in the example actually has a similar purpose, although this first line also provides the primary target or goal for the entire process. In the case at hand, we might have moved the bst: line to the top of the file. However, we wanted to specify some other information early as well, so this placement of bst: would have been awkward. Instead, we used the dummy target all, and specified that this target would depend on our real goal: bst. (If we had wanted several final program files, all of them could have been listed here.)

Rules specify what command(s) must be given to create the desired targets. In the example, we could have used the following rules, one for each actual file to be created:


gcc -ansi -c bst.c
gcc -ansi -c bst-proc.c
gcc -o bst bst.o bst-proc.o

Typing Note: Each command line in a rule must begin with a tab character. This is a requirement of make, not just a convention, so leading spaces are not a substitute.

Macros: While such explicit specification of commands works fine within a Makefile, this approach sometimes may cause trouble if the software is to be compiled and linked on multiple platforms. To anticipate such matters, it is common to use macros to specify various compiling details. Then, if the files are moved to other systems, only the macros need be changed -- not the entire Makefile.

In the example at hand, we specify both which C compiler to use (gcc) and what flags to use for that compiler (-ansi). Such macros are defined at the start of the example Makefile.


CC = gcc
CFLAGS = -ansi

Each of these lines defines a new variable that can be used later. A variable is referenced by preceding its name with a dollar sign $, much as in C-shell programming. In a Makefile, however, a name longer than one character must also be enclosed in parentheses (or braces), as illustrated in the example.


	$(CC) -o bst bst.o bst-proc.o
	$(CC) $(CFLAGS) -c bst.c
	$(CC) $(CFLAGS) -c bst-proc.c

Beyond these basic capabilities, make and Makefile allow many additional features. However, the pieces here may be adequate for many common applications.


This document is available on the World Wide Web as

     http://www.cs.grinnell.edu/~walker/courses/195.fa01/program-mgmt.html

created December 3, 2001
last revised December 3, 2001
For more information, please contact Henry M. Walker at walker@cs.grinnell.edu.