Lab: Profiling

CSC 323: Software design · Spring, 2012

Department of Computer Science · Grinnell College

0: Setting up the source code

Create a new subdirectory for this lab somewhere within your MathLAN home directory and use cd to move into that directory. Put in it a copy of the files queue.h, array-queue.c, and test-queues.c from the /home/stone/courses/software-design/code directory.

1: Compiling for gprof

First, compile the array-queue.c library, supplying the profile-creation option -pg in addition to the usual compiler options, such as -Wall and -c. Then compile test-queues.c, linking in the array-queue.o object file, to get an executable called test-queues. Again, specify the -pg option when invoking the compiler.

2: Running the executable

Running an executable program that has been compiled with the -pg option creates a file called gmon.out, as a side effect. This file contains the information from which gprof constructs its reports.

Execute the test-queues program and confirm that the gmon.out file has been placed in your current working directory.

3: Constructing a report

By default, gprof constructs and outputs two tables. The first of these tables (the flat profile) gives execution-time statistics by function: the percentage of the total execution time spent within each function, the average duration of invocations of each function, the number of invocations of each function, and so on. There is one row of the table for each function, and the rows are arranged in descending order by the percentage of total execution time consumed in invocations.

Most of these quantities are estimated by interrupting the execution of the program at regular, short intervals and recording which function is running at each interruption point. (The invocation counts are the exception: they are exact, because the -pg option makes the compiler insert counting code at the entry to each function.) The timing estimates are reliable only to the extent that the program runs for long enough for these interruption points to constitute a representative sample. They are not useful for programs that require only a fraction of a second to execute in the first place.

The second table is the call graph analysis, which breaks down the statistics further, separating out the invocations of each function by different calling functions. This makes it easier to identify the “hot spots” in the program -- the particular function calls that are executed most frequently and contribute most to the program's running time.

To have gprof construct and output these two tables, invoke it with the executable being profiled as its command-line argument. By default, gprof includes in the output an elaborate commentary, explaining what each column in the table means and how to read the call graph analysis. You should read this commentary at least once, but thereafter you can suppress it with the -b option to gprof.

The tables are sufficiently large that it would be a good idea to redirect gprof's output into a file that you can inspect at leisure, or at least to pipe it into a pager such as less.
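The whole cycle described so far (compile with -pg, run, report) can be tried on a small stand-in program before you apply it to the lab files. The following is only a sketch under stated assumptions: spin.c, its contents, the scratch directory, and the report.txt file name are all hypothetical and are not part of the lab code.

```shell
# Work in a scratch directory so that nothing in the lab directory is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# spin.c is a hypothetical stand-in program, not one of the lab files.
cat > spin.c <<'EOF'
#include <stdio.h>

/* Burn some CPU time so that the sampler has something to observe. */
static double spin(long n) {
    double total = 0.0;
    for (long i = 1; i <= n; i++)
        total += 1.0 / (double) i;
    return total;
}

int main(void) {
    double sum = 0.0;
    for (int k = 0; k < 200; k++)
        sum += spin(1000000L);
    printf("%f\n", sum);
    return 0;
}
EOF

# Compile and link with -pg so that the executable records profiling data.
gcc -Wall -pg -o spin spin.c

# Running the instrumented executable writes gmon.out into the current directory.
./spin
ls -l gmon.out

# Build the report without the commentary (-b) and save it for later reading.
gprof -b ./spin > report.txt
head -n 20 report.txt
```

To page through the report instead of saving it, the last two commands could be replaced by gprof -b ./spin | less.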

Have gprof construct the report on the execution of test-queues. (It is not necessary to mention gmon.out on the command line; gprof uses that file as its data source by default.) Using the report, determine (0) which of the functions in the queue library took longest to execute, on average; (1) which of those functions was invoked most often; and (2) which of the function-call expressions in test-queues.c and array-queue.c was evaluated most often.

4: Statement-execution counts with gcov

Originally, gprof had another use: Programmers used a different selection of command-line options to produce annotated source listings -- copies of the source-code files in which each block of executable statements is labelled with the number of times it was executed. The infrastructure for this is still present in gprof, but it has been superseded by a separate, special-purpose utility, gcov, which does a similar job much more efficiently.

To compile a source file for gcov, add the command-line options -fprofile-arcs and -ftest-coverage to the gcc command. In addition to the usual .o or executable file, the compiler will construct a data file, with a name ending in .gcno, containing data that gcov will use in building its reports.

At this point, you should recompile array-queue.c and test-queues.c with these options added. Confirm that the compiler constructs the array-queue.gcno and test-queues.gcno files.

Running the executable that results from such compilations now creates additional files, with file names ending in .gcda. These files contain data about the execution of the program, data that gcov also uses in its report.

To direct gcov to build the annotated source-code listings for a program, list the source-code files to be annotated as command-line arguments. The output from the gcov program tells you, for each source-code file, how many lines of the file contain executable statements and what percentage of those lines were in fact executed when the program was run. The annotated source-code listings appear in files with names similar to those of the original source-code files, but with .gcov added at the end.

The .gcov files are ordinary text files. Each line of source code is prefixed with an execution count and a line number. For lines that contain no executable statements, the execution count is replaced with a hyphen; for lines containing statements that were never executed, the count is replaced with the marker #####. (The idea of using a special marker, instead of just writing the execution count as 0, is to call the programmer's attention to such lines. Code that is not being executed is not being tested, and so is particularly likely to contain undetected errors.)

Use the execution counts to determine the percentage of invocations of the queue_size procedure in which the rear queue position had “wrapped around” the end of the array while the front queue position had not, so that the former position was actually less than the latter.

Use the execution counts to determine how many times the array underlying a queue had to be resized and reallocated during the execution of the program.

Determine which statements were executed more frequently than any others in the program. Explain why those statements were executed so much more often than any of the others. Suggest a way to revise the program to avoid some of these statement executions, without affecting its correctness or reliability.

Confirm, using the output from gcov, that a third of the executable statements in test-queues.c are not executed at all when test-queues is run. Explain why not.

Determine what effect the -f option has on the output from gcov and describe some circumstances in which the additional information might be useful.