Create a new subdirectory for this lab somewhere within your MathLAN home
directory and use cd to move into that directory. Put in it a copy
of the files line-sorter.c and line-tester from the
/home/stone/courses/software-design/code directory.
The line-sorter.c file contains a standalone C program that reads in
lines from one or more text files specified on the command line, sorts them
into lexicographical order, and writes the sorted lines to standard
output. Any of the files can contain any number of lines, and any line can
contain any number of characters -- storage is allocated dynamically to
accommodate files of any length and lines of any length.
Compile line-sorter.c and run the resulting program to produce a version of
line-tester in which the lines are sorted.
Even though this is quite a short program, it allocates and frees memory in
several different places, not always in the same function. For instance,
the push_every_line function contains a call to malloc, but
no call to free, so that none of the storage that it allocates when
executed is freed before the exit from the function. And it would be an
error to add a statement like free(new_component); anywhere inside
the definition of push_every_line. Even though the pointer new_component is stored in the activation record for push_every_line on the run-time stack and so is discarded when that
function returns, the storage on the other end of that pointer remains
accessible, because push_every_line returns another copy of that
pointer, and the caller has a valid use for it.
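The ownership hand-off described above can be sketched in a few lines. This is only an illustration, not the code in line-sorter.c: the struct layout and the function name push_line are hypothetical, chosen to mirror the description of push_every_line.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical list component; the actual declarations in
   line-sorter.c may differ. */
struct component {
    char *line;
    struct component *next;
};

/* Allocates a new component holding a copy of text, links it in front
   of rest, and returns it. The function does not free what it
   allocates: ownership passes to the caller through the return value.
   Returns NULL if either allocation fails. */
struct component *push_line(const char *text, struct component *rest)
{
    struct component *new_component = malloc(sizeof *new_component);
    if (new_component == NULL)
        return NULL;

    new_component->line = malloc(strlen(text) + 1);
    if (new_component->line == NULL) {
        free(new_component);   /* nothing has been handed out yet */
        return NULL;
    }

    strcpy(new_component->line, text);
    new_component->next = rest;
    return new_component;      /* the caller must free this later */
}
```

The local variable new_component disappears when push_line returns, but the block it points to survives, and the returned copy of the pointer is the caller's only way to reach it and eventually free it.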
It's often difficult to get the timing right. You don't want to free any
dynamically allocated storage while there is still an accessible pointer to
it that might still be needed, but you also shouldn't wait around until
there are no pointers to it left at all (since then there is no way to
refer to it in order to free it). Storage allocated by the call to malloc in push_every_line is freed at the very end of the execution
of the main function, in the free(trailer) line.
But is all of the storage allocated by all of the calls to
push_every_line freed at that point? It's hard to say, at a glance.
To fully justify either an affirmative answer or a negative one, you'd
probably have to trace carefully through the operations that occur during
the intervening invocations of mergesort, making sure that no
storage locations are dropped, discarded, or accidentally made inaccessible
as the list components are split up, rearranged, and merged.
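For comparison, here is a minimal sketch of releasing an entire linked list, assuming a singly linked list in which each component owns the storage for its line. The struct layout and the name free_list are hypothetical, and the clean-up loop in line-sorter.c may be organized differently.

```c
#include <stdlib.h>

/* Hypothetical list component, assumed for this sketch only. */
struct component {
    char *line;
    struct component *next;
};

/* Frees every component of the list, together with the line it owns,
   and returns the number of components released. */
size_t free_list(struct component *head)
{
    size_t count = 0;

    while (head != NULL) {
        /* Save the link first: head->next may not be read
           after head itself has been freed. */
        struct component *next = head->next;
        free(head->line);
        free(head);
        head = next;
        count++;
    }
    return count;
}
```

A loop like this frees only what is still reachable from head. Any component that was unlinked earlier, for instance during a split or a merge, is never visited and stays allocated.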
For each call to the malloc function in line-sorter.c, find the call to the free function that frees the
storage that it allocates.
The GNU C compiler comes with a library that makes it possible for the
programmer to determine whether all dynamically allocated storage is
properly freed. The mtrace function, which takes no arguments and
returns no value, replaces the standard functions for memory allocation and
deallocation (malloc, calloc, realloc, and free) with instrumented
versions that record their operations in an auxiliary log file. The user
specifies this file by setting an environment variable, MALLOC_TRACE, in
the shell within which the program to be checked will run.
To set this variable, one might give a command of the form
export MALLOC_TRACE=log-file-name
in the shell's terminal window. The part that looks like an assignment
statement sets the environment variable; preceding it with the word export
directs the shell to pass this variable along to any programs and subshells
that it starts.
Set the MALLOC_TRACE environment variable,
using the file name of your choice for the log, and create an empty file
with that name in the directory containing the line-sorter.c
program. Edit line-sorter.c, placing a call to mtrace at the
beginning of the main function. (You'll also need to #include the header
file containing the prototype for mtrace, which is mcheck.h.) Recompile the
line-sorter program and run it again to sort the lines of line-tester.
The mtrace utility
The log file that the memory-management system constructs is not very
readable, so GNU also includes a utility program, also called mtrace,
that reformats the results more legibly. You invoke it from the command
line, giving it the name of the log file as the command-line argument:
mtrace log-file-name
If you do this now, mtrace will produce the report "No memory leaks.",
which tends to confirm the conclusion that you may have reached back in
section 1, that line-sorter eventually frees all of the storage that
it allocates.
Unfortunately, this conclusion is incorrect: The version of line-sorter that I provided to you does leak memory, though not when it is
used to sort the line-tester file.
To confirm that line-sorter leaks memory,
sort the lines of line-sorter.c itself, then use the mtrace
utility to get a readable report of the contents of the log file. You
should see a line containing three hexadecimal numbers: the memory address
of a block of memory that was allocated and never freed, the number of
bytes in that block, and the memory address of the call to malloc
(or realloc) that performed the allocation. Using this information,
try to find, diagnose, and correct the error that resulted in the leak.
mtrace's report
Of the three hexadecimal numbers that mtrace reported, the first and
last were not much help, because the programmer usually has no useful
information about where blocks of allocated memory are located and how the
executable program instructions are arranged. In particular, the caller's
address would be much more useful if it were associated with a particular
line number in the file containing the source code.
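The middle number, the block size, is easier to work with once it is converted to decimal, since it can then be compared with the sizes that the program requests. A small sketch of that conversion follows; the helper name block_size_in_bytes is hypothetical and is not part of the mtrace tools.

```c
#include <stdlib.h>

/* Converts a hexadecimal size field, such as "0x1e", into a byte count.
   Returns 0 if the field does not begin with a hexadecimal number. */
unsigned long block_size_in_bytes(const char *hex_field)
{
    char *end;
    /* With base 16, strtoul accepts an optional "0x" prefix. */
    unsigned long size = strtoul(hex_field, &end, 16);

    return (end == hex_field) ? 0 : size;
}
```

For instance, a reported size of 0x1e corresponds to a block of 30 bytes.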
To enable mtrace to include that information in its report, you need
to compile the program in which the memory is to be traced with the -g option, the same one that prepares a program for debugging. One of the
effects of this option is to embed information about the names of the
source files and the line-by-line structure of the source code into the
executable file. If you then re-run the program and invoke mtrace,
giving it both the name of the executable file and the name of the log file
as command-line arguments, mtrace will replace the hexadecimal
caller address with a reference to the source-code file and line number.
Recompile line-sorter.c with the -g option, re-run the program, and invoke
mtrace as described, to determine the exact location of the call that
allocated storage that was never deallocated. Using this additional
information, find, diagnose, and correct the error that is causing the
leak. Confirm that your solution worked (at least in this particular
case).
The GNU library that contains mtrace also contains the function
muntrace, which also takes no arguments and returns no value. The
effect of invoking muntrace is to suspend the tracing of memory
allocation and deallocation. Tracing resumes when and if another call to
mtrace is executed.
For instance, if you're investigating whether a memory leak occurs within a
particular block of code, you can put a call to mtrace at the
beginning of the block and a call to muntrace at the end, and the
memory-allocation system will track only operations inside the block.
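A minimal sketch of this bracketing technique is shown below. The function traced_copy_length is hypothetical and stands in for whatever block of code is under investigation. Tracing takes effect only if MALLOC_TRACE is set when mtrace is called, and on recent versions of the GNU library it may also require preloading a separate debugging library, a detail this sketch leaves out.

```c
#include <mcheck.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical block under investigation: copies a string into freshly
   allocated storage, measures it, and frees it again. Tracing is
   switched on only for the duration of the call. Returns the length of
   the copy, or -1 if the allocation fails. */
int traced_copy_length(const char *text)
{
    int length = -1;

    mtrace();                  /* start logging allocations */

    char *copy = malloc(strlen(text) + 1);
    if (copy != NULL) {
        strcpy(copy, text);
        length = (int) strlen(copy);
        free(copy);            /* matched free: no leak in this block */
    }

    muntrace();                /* stop logging */
    return length;
}
```

Allocations made before the call to mtrace or after the call to muntrace do not appear in the log, so a leak reported for this run must originate between the two calls.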
When a directory contains several files, perhaps even several standalone
executables, that contain calls to mtrace, it can be difficult to
keep track of which of them the log file currently pertains to. Another
way to be more selective about memory tracing is to enclose the invocation
of mtrace in compiler directives that make it conditional on the
definedness of some identifier that the preprocessor will see -- MTRACING, perhaps:
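Such a guard might look like the following sketch. It assumes the #ifdef form of the directive and wraps the call in a hypothetical helper, start_tracing_if_enabled, so that the fragment compiles on its own; the same three lines can equally be placed directly at the beginning of main.

```c
#ifdef MTRACING
#include <mcheck.h>            /* needed only when tracing is compiled in */
#endif

/* Hypothetical helper: starts memory tracing if the program was
   compiled with MTRACING defined. Returns 1 if the call to mtrace
   was compiled in, 0 otherwise. */
int start_tracing_if_enabled(void)
{
#ifdef MTRACING
    mtrace();
    return 1;
#else
    return 0;
#endif
}
```

Calling start_tracing_if_enabled() as the first statement of main has the same effect as writing the guarded call there.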
The gcc compiler leaves the call to mtrace out of the
executable that it constructs unless the identifier MTRACING has been
defined. To activate it, recompile the code, giving gcc the
command-line option -DMTRACING (“define MTRACING”).
Make the call to mtrace that you added to line-sorter.c conditional on
MTRACING, as described above. Recompile that file without defining MTRACING
and run the resulting executable on line-tester. Confirm (by checking
timestamps) that the newly compiled version of line-sorter did not create
an mtrace log file.