On MathLAN, the utility program cp is often used to create an exact
duplicate of a given text file. In a terminal-emulator window, the command
cp original duplicate
copies the file named original into a new file named duplicate. You wind up with two files that have exactly the same
contents.
At this point, we can write a Scheme program to do exactly the same thing.
The actual copying would be done by the copy-file procedure:
;;; copy-file: create a duplicate of a given file
;;; Givens:
;;;   NAME-OF-ORIGINAL and NAME-OF-DUPLICATE, both strings.
;;; Results:
;;;   None.
;;; Preconditions:
;;;   (1) NAME-OF-ORIGINAL is the operating system's name for
;;;       a file that already exists.
;;;   (2) NAME-OF-DUPLICATE is not the operating system's name
;;;       for a file that already exists.
;;; Postcondition:
;;;   A file denoted by NAME-OF-DUPLICATE has been created,
;;;   and its contents are the same as those of NAME-OF-ORIGINAL.

(define copy-file
  (lambda (name-of-original name-of-duplicate)
    (let ((source (open-input-file name-of-original))
          (target (open-output-file name-of-duplicate)))
      (let kernel ((next-character (read-char source)))
        (if (eof-object? next-character)
            (begin
              (close-input-port source)
              (close-output-port target))
            (begin
              (write-char next-character target)
              (kernel (read-char source))))))))
In other words: Let source be a port through which we can pull
characters in from the file to be copied, and let target be a port
through which we can push characters out to the new file. Try to read a
character from source. If it's the end-of-file object, close both
ports and we're done; otherwise, write the character to target, try
to read another character from source, and repeat this step. Since
every new call to the kernel procedure consumes one character from
the source file, the end of that file will ultimately be reached and the
recursive calls will cease.
After this definition, we complete the program with an appropriate call to
the copy-file procedure, giving it the file names as arguments:
(copy-file "original" "duplicate")
The kernel of the copy-file procedure exemplifies one of the common
patterns of complete-file recursion -- recursion guided by the
structure of the file from which data is read. The base case in a
complete-file recursion is the case in which the file contains no data, or
at least no more data, so that the value of a call to some input procedure
is the end-of-file object. If that base case has not yet been reached, a
complete-file recursion procedure performs some operation on the value that
has just been read in -- in copy-file, the character next-character -- and invokes itself recursively to deal with the rest of
the file, starting with an attempt to read in another datum.
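The same pattern applies when the operation performed on each datum is something other than copying. Here is a minimal sketch that counts the characters in a file. The names count-characters and count-characters-in-file are hypothetical, introduced only for this illustration, and the second procedure assumes, as copy-file does, that its argument names an existing file.

```scheme
;;; count-characters (hypothetical): count the characters that
;;; can be read through a given, already opened input port.
(define count-characters
  (lambda (source)
    (let kernel ((count 0)
                 (next-character (read-char source)))
      (if (eof-object? next-character)
          ;; Base case: no more data, so the tally is complete.
          count
          ;; Otherwise, tally the character just read and deal
          ;; with the rest of the file.
          (kernel (+ count 1) (read-char source))))))

;;; count-characters-in-file (hypothetical): open the named file,
;;; count its characters, and close the port again.
(define count-characters-in-file
  (lambda (name-of-file)
    (let* ((source (open-input-file name-of-file))
           (count (count-characters source)))
      (close-input-port source)
      count)))
```

The base case and the recursive case line up with those of copy-file: the end-of-file object ends the recursion, and any other character is processed before the next attempt to read.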
The copy-file procedure illustrates the tail-recursive version of
complete-file recursion. (It is tail-recursive because the transfer of
each character from the source input port to the target
output port takes place before the recursive call is made; after the
recursive call has been evaluated, there is no more work to be done.)
The first version of the sum-of-file procedure from the reading on files illustrates
complete-file recursion in its non-tail-recursive form: Each recursive call
to sum-of-file returns the sum of the part of the file that has not
been read yet at the time the call is made, and the current element is
added to that sum after the recursive call returns it.
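The sum-of-file procedure itself is defined in the reading on files and is not reproduced here. As a rough sketch of the non-tail-recursive shape just described, the hypothetical procedure sum-of-remaining below takes an already opened input port (an assumption made so that the sketch stands alone) and adds each number only after the recursive call has returned.

```scheme
;;; sum-of-remaining (hypothetical): compute the sum of the numbers
;;; that have not yet been read through SOURCE.  Assumes that every
;;; value that can be read through SOURCE is a number.
(define sum-of-remaining
  (lambda (source)
    (let ((next-number (read source)))
      (if (eof-object? next-number)
          ;; Base case: nothing left to read, so the sum is zero.
          0
          ;; The addition is still pending while the recursive
          ;; call runs, so this call is not a tail call.
          (+ next-number (sum-of-remaining source))))))
```

Because each pending addition must be remembered until the end of the file is reached, this form uses space proportional to the number of values in the file, whereas the tail-recursive form does not.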
The arguments that the caller supplies to the copy-file procedure
are the strings that name the files. The copy-file procedure
itself is responsible for opening and closing the ports to those files. An
alternative approach, frequently used because of its greater flexibility,
is to write the copying procedure so that it takes the ports as
arguments, making the caller responsible for opening them before the
procedure call and closing them afterwards. Here's how the file-copying
procedure looks if this approach is used:
;;; port-copy: copy characters from a given input
;;; port to a given output port
;;; Givens:
;;;   SOURCE, an input port.
;;;   TARGET, an output port.
;;; Results:
;;;   None.
;;; Precondition:
;;;   Neither SOURCE nor TARGET has yet been closed.
;;; Postcondition:
;;;   Characters have been read from SOURCE and written,
;;;   without change, to TARGET, until the end-of-file
;;;   object has been encountered in SOURCE.

(define port-copy
  (lambda (source target)
    (let kernel ((next-character (read-char source)))
      (if (not (eof-object? next-character))
          (begin
            (write-char next-character target)
            (kernel (read-char source)))))))
This is a much simpler and clearer procedure. On the other hand, whoever
calls it has to open the input and output ports before invoking
port-copy and close them afterwards, and it's easy to forget
to do this.
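One way a caller might meet that responsibility is sketched below. The names copy-file-with-ports and display-file are hypothetical, and port-copy is repeated from above only so that the sketch is self-contained. The second procedure hints at the added flexibility: the target need not be a file at all.

```scheme
;; port-copy, repeated from above so that this sketch stands alone.
(define port-copy
  (lambda (source target)
    (let kernel ((next-character (read-char source)))
      (if (not (eof-object? next-character))
          (begin
            (write-char next-character target)
            (kernel (read-char source)))))))

;;; copy-file-with-ports (hypothetical): same preconditions as
;;; copy-file, but the copying itself is delegated to port-copy.
(define copy-file-with-ports
  (lambda (name-of-original name-of-duplicate)
    (let ((source (open-input-file name-of-original))
          (target (open-output-file name-of-duplicate)))
      (port-copy source target)
      ;; The caller opened both ports, so the caller closes them.
      (close-input-port source)
      (close-output-port target))))

;;; display-file (hypothetical): copy a file's characters to the
;;; current output port, which this procedure did not open and
;;; therefore does not close.
(define display-file
  (lambda (name-of-file)
    (let ((source (open-input-file name-of-file)))
      (port-copy source (current-output-port))
      (close-input-port source))))
```

Wrapping the open, copy, and close steps in one procedure like this is a simple guard against forgetting to close a port.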
An input port operation is a Scheme procedure that takes an input
port as its only argument. For instance, it would be easy to rewrite the
sum-of-file procedure from the
reading on files as an input port operation, by requiring the caller to
create the port before invoking the procedure and to close it afterwards:
;;; port-sum: compute the sum of the numbers that can be read
;;; in through a given input port
;;; Given:
;;;   SOURCE, an input port.
;;; Result:
;;;   TOTAL, a number.
;;; Preconditions:
;;;   (1) SOURCE has not yet been closed.
;;;   (2) Every value that can be read in through SOURCE is
;;;       a number.
;;; Postcondition:
;;;   TOTAL is the sum of the numbers that can be read in
;;;   through SOURCE (before the end-of-file object is
;;;   encountered).

(define port-sum
  (lambda (source)
    (if (not (input-port? source))
        (error 'port-sum "The argument must be an input port"))
    (let kernel ((total 0)
                 (next-number (read source)))
      (if (eof-object? next-number)
          total
          (kernel (+ total next-number) (read source))))))
One advantage of writing this procedure as an input port operation is that
one can then use the primitive Scheme procedure call-with-input-file
to invoke it. The call-with-input-file procedure takes two
arguments, the first of which is a string that names an existing file and
the second an input port operation. It automatically opens the file,
invokes the input port operation (giving it the port to the input file),
collects the value that the operation returns, closes the port, and
returns that value. In other words, it works essentially as if it were
defined like this:
;;; call-with-input-file: call a given procedure, passing to
;;; it a port open to a given input file
;;; Givens:
;;;   NAME-OF-INPUT-FILE, a string.
;;;   OPERATION, a procedure that takes an input port as its
;;;   only argument.
;;; Result:
;;;   RESULT, a value.
;;; Precondition:
;;;   NAME-OF-INPUT-FILE is the operating system's name for
;;;   an existing file.
;;; Postcondition:
;;;   RESULT is the result of calling OPERATION, passing it
;;;   an input port open to the file denoted by
;;;   NAME-OF-INPUT-FILE.

(define call-with-input-file
  (lambda (name-of-input-file operation)
    (let* ((source (open-input-file name-of-input-file))
           (result (operation source)))
      (close-input-port source)
      result)))
For instance, if the file numbers.dat contains nothing but
numbers, the following expression computes the sum of those numbers:
(call-with-input-file "numbers.dat" port-sum)
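Any other input port operation can be used in the same way. As a sketch, the hypothetical procedures port-average and file-average below compute the mean of the numbers in a file. Returning 0 for an empty file is an arbitrary choice made for this illustration, and port-average assumes, like port-sum, that every value read is a number.

```scheme
;;; port-average (hypothetical): an input port operation that
;;; computes the mean of the numbers readable through SOURCE.
(define port-average
  (lambda (source)
    (let kernel ((total 0)
                 (count 0)
                 (next-number (read source)))
      (if (eof-object? next-number)
          ;; Assumption for this sketch: an empty file averages to 0.
          (if (= count 0)
              0
              (/ total count))
          (kernel (+ total next-number)
                  (+ count 1)
                  (read source))))))

;;; file-average (hypothetical): the same computation, starting
;;; from the name of an existing file.  call-with-input-file
;;; opens and closes the port, so nothing here can forget to.
(define file-average
  (lambda (name-of-file)
    (call-with-input-file name-of-file port-average)))
```

With these definitions, (file-average "numbers.dat") would compute the mean of the numbers in the file from the example above.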
Naturally, there is a corresponding notion of an output port
operation -- a procedure that takes an output port as its only argument.
Scheme provides a built-in procedure call-with-output-file that
takes as its arguments a string that names a file to be created and an
output port operation, opens a port to the specified output file, runs the
output port operation on that port, closes the port, and returns the result
of the output port operation.
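By analogy with the model definition of call-with-input-file given earlier, the behavior just described can be sketched as follows. This is only a model: the name model-call-with-output-file is hypothetical, chosen so that the built-in procedure is not redefined, and the sketch assumes that NAME-OF-OUTPUT-FILE does not name a file that already exists.

```scheme
;;; model-call-with-output-file (hypothetical): call a given
;;; procedure, passing to it a port open to a new output file.
(define model-call-with-output-file
  (lambda (name-of-output-file operation)
    (let* ((target (open-output-file name-of-output-file))
           (result (operation target)))
      ;; Close the port before handing back the operation's result.
      (close-output-port target)
      result)))
```

As with the input version, the point of the built-in procedure is that the port is opened and closed on the caller's behalf.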
Initially, call-with-output-file seems much less useful
than call-with-input-file, because it's hard to think of
plausible output port operations: the interesting output procedures,
such as write, display, and write-char, need the value to be written as
well as the port. But remember that a procedure of two or more
arguments can be curried, so that it takes its arguments separately:
given the value, the curried procedure returns an output port operation
that writes that value.
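As a minimal sketch of that idea, the hypothetical procedure curried-write below takes the value to be written first and returns an output port operation, which is exactly what call-with-output-file expects as its second argument. The file name in the usage note is likewise only an example.

```scheme
;;; curried-write (hypothetical): a curried form of write.
;;; Given a value, it returns an output port operation that
;;; writes that value through whatever port it is later given.
(define curried-write
  (lambda (value)
    (lambda (target)
      (write value target))))
```

For instance, (call-with-output-file "answer.dat" (curried-write 42)) would create the file answer.dat and write the number 42 into it, provided that no file of that name already exists.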
I am indebted to Professor Ben Gum for his contributions to the development of this reading.