The insertion sort algorithm is not always the best one to use. When sorting n values, starting from a random initial arrangement, the insertion sort has to look through, on average, half of the elements in the sorted part of the data to find the correct insertion point for each new value it places. The size of that sorted part increases linearly from 0 to n, so its average size is n/2 and the average number of comparisons needed to insert one element is n/4. Taking all n insertions together, then, the insertion sort performs about n^2/4 comparisons to sort the entire set.
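The same estimate can be reached by summing over the individual insertions. As a rough check, assume that inserting into a sorted part of k elements costs about k/2 comparisons on average, with k running from 0 to n - 1:

```latex
\sum_{k=0}^{n-1} \frac{k}{2}
  \;=\; \frac{1}{2} \cdot \frac{n(n-1)}{2}
  \;=\; \frac{n(n-1)}{4}
  \;\approx\; \frac{n^2}{4}
```

For n = 100 this comes to 2475 comparisons, close to the rounded figure of 2500.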
Accordingly, when the number of values to be sorted is large (greater than one hundred, say), it is preferable to use a sorting method that is more complicated to set up initially but performs fewer operations on each element in the process of positioning it correctly. The merge sort algorithm is often an appropriate choice. We shall again examine two variants of this algorithm -- one taking a list of values and returning a newly allocated list containing the same values, but in sorted order, and the other one taking a vector of values and arranging its elements as a ``destructive'' side effect.
One of the building blocks for the first version of the merge sort is a
procedure that takes two sorted lists as arguments and returns a sorted
list containing all of the values from both argument lists. If we abstract
out the ordering relation as may-precede?, a curried version of
the merge procedure looks like this:
;;; merge: given an ordering predicate, construct and return a procedure
;;; that merges two lists that are ordered by that predicate into one
;;; similarly ordered list

;;; Given:
;;;   MAY-PRECEDE?, a binary predicate.

;;; Result:
;;;   MERGER, a procedure that takes two lists as arguments and returns a
;;;   list.

;;; Precondition:
;;;   MAY-PRECEDE? expresses an ordering relation.

;;; Postconditions:
;;;   Given two lists, LEFT and RIGHT, of values that meet any preconditions
;;;   that MAY-PRECEDE? imposes on either of its arguments, with the
;;;   additional precondition that both lists are in order by MAY-PRECEDE?,
;;;   MERGER returns a list MERGED such that
;;;   (1) the elements of MERGED are exactly the elements of LEFT and
;;;       RIGHT, and
;;;   (2) MERGED is in order by MAY-PRECEDE?.

(define merge
  (lambda (may-precede?)
    (letrec ((merger (lambda (left right)
                       (cond ((null? left) right)
                             ((null? right) left)
                             ((may-precede? (car left) (car right))
                              (cons (car left) (merger (cdr left) right)))
                             (else
                              (cons (car right) (merger left (cdr right))))))))
      merger)))
If either of the given lists is null, the result of applying the
specialized merging procedure (here called merger) is simply the
other list. Otherwise, we split off whichever of the first elements of the
given lists may precede the other, issue a recursive call to merge the
remainders of both lists, and prepend the selected first element to the
result.
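A few sample calls may make this behavior concrete. The definition of merge is repeated so that the snippet runs on its own; merge-ascending is a name made up for this illustration.

```scheme
;; MERGE as defined above, repeated so that this snippet is self-contained.
(define merge
  (lambda (may-precede?)
    (letrec ((merger (lambda (left right)
                       (cond ((null? left) right)
                             ((null? right) left)
                             ((may-precede? (car left) (car right))
                              (cons (car left) (merger (cdr left) right)))
                             (else
                              (cons (car right) (merger left (cdr right))))))))
      merger)))

;; MERGE-ASCENDING is a hypothetical name for a merger specialized to
;; ascending numeric order.
(define merge-ascending (merge <=))

(merge-ascending '(1 4 9) '(2 3 10))   ; => (1 2 3 4 9 10)
(merge-ascending '() '(2 3 10))        ; => (2 3 10)
((merge >=) '(9 4 1) '(10 3 2))        ; => (10 9 4 3 2 1)
```

Note that both arguments must already be in order by the same predicate that was given to merge; the third call merges two descending lists with >=.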
The merge sort works by starting with short lists, merging them to form
somewhat longer lists, merging the somewhat longer lists to form still
longer ones, and so on until only one list remains -- the result of the
final merge operation. The short lists that we begin with must satisfy the
precondition for the procedure that merge constructs: they must
already be sorted. There are several ways to establish this precondition:
One approach is to put just one value into each short list. A single value
is already ``sorted,'' in a trivial or vacuous sense (and any procedure
constructed by merge will work correctly on such single-element
lists).
A second idea, sometimes called the ``natural merge sort,'' is to pre-process the data in their initial arrangement, grouping together sequences of values that are already in the correct relative order, and then merging the resulting groups.
A third alternative is to use a kind of recursion known as ``divide and
conquer.'' Sorting a list ls is trivial if ls has no
elements, or only one; those are our base cases for the recursion. If
ls has two or more elements, we can use the procedure that merge constructs to produce a sorted list, provided that we can somehow
get two lists, sorted separately, from ls. But we can use recursive
calls to sort the parts of ls, provided that we can somehow split
ls into two sublists.
We get the greatest leverage if those lists are equal in size, so that the
subproblems for the recursive calls are much smaller than the original
problem. This suggests that we need a procedure split that takes a
list ls of two or more elements and divides it into two lists, equal
or nearly equal in size. Since the splitting precedes the sorting, the
elements can be distributed into the two lists in any order.
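For comparison, the first two alternatives can be sketched as follows. This is only an illustration, not part of the implementation developed in this reading: the names merge-pairs, merge-all, runs, bottom-up-merge-sort, and natural-merge-sort are all invented here, and merge is repeated so that the snippet runs on its own.

```scheme
;; MERGE as defined above, repeated so that this snippet is self-contained.
(define merge
  (lambda (may-precede?)
    (letrec ((merger (lambda (left right)
                       (cond ((null? left) right)
                             ((null? right) left)
                             ((may-precede? (car left) (car right))
                              (cons (car left) (merger (cdr left) right)))
                             (else
                              (cons (car right) (merger left (cdr right))))))))
      merger)))

;; merge-pairs (hypothetical): merge neighboring lists two at a time,
;; roughly halving the number of lists.
(define merge-pairs
  (lambda (merger lists)
    (cond ((null? lists) '())
          ((null? (cdr lists)) lists)
          (else (cons (merger (car lists) (cadr lists))
                      (merge-pairs merger (cddr lists)))))))

;; merge-all (hypothetical): repeat MERGE-PAIRS until one list remains.
(define merge-all
  (lambda (merger lists)
    (cond ((null? lists) '())
          ((null? (cdr lists)) (car lists))
          (else (merge-all merger (merge-pairs merger lists))))))

;; runs (hypothetical): group LS into maximal sublists that are already in
;; order by MAY-PRECEDE?.
(define runs
  (lambda (may-precede? ls)
    (if (null? ls)
        '()
        (let kernel ((rest (cdr ls))
                     (current (list (car ls)))   ; current run, reversed
                     (done '()))                 ; finished runs, reversed
          (cond ((null? rest)
                 (reverse (cons (reverse current) done)))
                ((may-precede? (car current) (car rest))
                 (kernel (cdr rest) (cons (car rest) current) done))
                (else
                 (kernel (cdr rest)
                         (list (car rest))
                         (cons (reverse current) done))))))))

;; First alternative: start from one-element lists.
(define bottom-up-merge-sort
  (lambda (may-precede?)
    (let ((merger (merge may-precede?)))
      (lambda (ls)
        (merge-all merger (map list ls))))))

;; Second alternative: start from the runs already present in the data.
(define natural-merge-sort
  (lambda (may-precede?)
    (let ((merger (merge may-precede?)))
      (lambda (ls)
        (merge-all merger (runs may-precede? ls))))))

(runs <= '(3 1 4 1 5 9 2 6))                    ; => ((3) (1 4) (1 5 9) (2 6))
((bottom-up-merge-sort <=) '(3 1 4 1 5 9 2 6))  ; => (1 1 2 3 4 5 6 9)
((natural-merge-sort <=) '(3 1 4 1 5 9 2 6))    ; => (1 1 2 3 4 5 6 9)
```

The two sorters differ only in how the initial sorted lists are produced; the repeated pairwise merging is the same in both.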
Since divide-and-conquer recursion is a common and useful general strategy,
we'll take the third alternative. Here's one way to split a list into two
lists of equal or nearly equal length. This version of split
returns the two lists as the car and cdr of a pair.
;;; split: separate a given list into two lists, equal or nearly equal in
;;; length

;;; Given:
;;;   LS, a list

;;; Result:
;;;   PAIR-OF-PARTS, a pair of lists.

;;; Preconditions:
;;;   None.

;;; Postconditions:
;;;   (1) The elements of LS are exactly the elements of the car of
;;;       PAIR-OF-PARTS and the cdr of PAIR-OF-PARTS.
;;;   (2) The length of the car of PAIR-OF-PARTS is either equal to, or one
;;;       greater than, the length of the cdr of PAIR-OF-PARTS.

(define split
  (lambda (ls)
    (let kernel ((rest ls)
                 (left '())
                 (right '()))
      (if (null? rest)
          (cons left right)
          (kernel (cdr rest) (cons (car rest) right) left)))))
In the tail-recursive kernel procedure, the parameters left and
right are both ``so-far'' accumulators, each holding about half of
the elements so far encountered in the original list. At each invocation
of kernel, either left and right have the same number
of elements (as is true initially) or left has one more element than
right; prepending the element taken from rest to right
and then having the two lists swap places in each recursive call ensures
that any imbalance is immediately redressed.
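The swapping is easiest to see in a trace. The snippet below repeats the definition of split so that it runs on its own; the comment table lists the arguments of each successive invocation of kernel for a five-element list.

```scheme
;; SPLIT as defined above, repeated so that this snippet is self-contained.
(define split
  (lambda (ls)
    (let kernel ((rest ls)
                 (left '())
                 (right '()))
      (if (null? rest)
          (cons left right)
          (kernel (cdr rest) (cons (car rest) right) left)))))

;; Arguments of KERNEL at each invocation for (split '(1 2 3 4 5)):
;;
;;   rest          left      right
;;   (1 2 3 4 5)   ()        ()
;;   (2 3 4 5)     (1)       ()
;;   (3 4 5)       (2)       (1)
;;   (4 5)         (3 1)     (2)
;;   (5)           (4 2)     (3 1)
;;   ()            (5 3 1)   (4 2)

(split '(1 2 3 4 5))   ; => ((5 3 1) 4 2), that is, the pair ((5 3 1) . (4 2))
(split '(a b))         ; => ((b) a)
(split '())            ; => (())
```

The elements come out in a scrambled order, which is harmless here because each part is sorted afterwards.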
The full merge-sort procedure checks to see whether either of the
base cases holds; if not, it invokes split to create two subproblems
of the same kind, solves each one by a recursive call, and finally invokes
a procedure constructed by merge to combine the results.
;;; merge-sort: given a binary ordering predicate, construct a procedure
;;; that takes a list and arranges its elements to be consistent with the
;;; ordering

;;; Given:
;;;   MAY-PRECEDE?, a binary predicate

;;; Result:
;;;   SORTER, a procedure.

;;; Precondition:
;;;   MAY-PRECEDE? expresses an ordering relation (that is, it is connected
;;;   and transitive).

;;; Postconditions:
;;;   Given any list UNSORTED of elements that meet any preconditions that
;;;   MAY-PRECEDE? imposes on either of its arguments, SORTER returns a
;;;   list SORTED such that
;;;
;;;   (1) The elements of SORTED are exactly the elements of UNSORTED.
;;;   (2) SORTED is in order by MAY-PRECEDE?.

(define merge-sort
  (lambda (may-precede?)
    (let ((merger (merge may-precede?)))
      (letrec ((sorter (lambda (ls)
                         (if (or (null? ls) (null? (cdr ls)))
                             ls
                             (let ((halves (split ls)))
                               (merger (sorter (car halves))
                                       (sorter (cdr halves))))))))
        sorter))))
One uses the merge-sort procedure in exactly the same way as the
generalized insertion-sort procedure in the previous reading:
> ((merge-sort <=) '(3 1 4 1 5 9 2 6))
(1 1 2 3 4 5 6 9)
> ((merge-sort >=) '(3 1 4 1 5 9 2 6))
(9 6 5 4 3 2 1 1)
> (define alphabetize (merge-sort string-ci<=?))
> (alphabetize '("Brunner" "Furuta" "Romero" "Shadel" "Poulin" "Hecker"
                 "Falcon" "Rapp" "Bakyu" "Manfredi" "Benness" "Sims"
                 "Morrison" "Herrington" "Pecsok" "Chamberlain"))
("Bakyu" "Benness" "Brunner" "Chamberlain" "Falcon" "Furuta" "Hecker"
 "Herrington" "Manfredi" "Morrison" "Pecsok" "Poulin" "Rapp" "Romero"
 "Shadel" "Sims")
> (define sort-chars (merge-sort char-ci<=?))
> (define alphanagram
    (lambda (str)
      (list->string (sort-chars (string->list str)))))
> (alphanagram "conglomeration")
"acegilmnnooort"
Now suppose that the values to be sorted arrive in the form of a vector, and that the objective is to rearrange the contents of the vector so that at the end of the sorting procedure the same elements are present, but their order is the one specified by the comparison rule. As in the constructive version of the algorithm above, we want to work our way up from single-element subvectors. Instead of allocating a separate vector for each single element, however, we can take advantage of constant-time access to the elements of a vector by making the separation purely notional: we can identify a subvector of the original vector by keeping track of the positions at which it begins and ends.
Let's adopt the convention that the starting position of a
subvector is the position of the first element that is inside the
subvector, and the ending position is the position of the first
element after and outside of the subvector (or, if there is no such
element, the length of the entire vector). So, within the vector '#(a b c d e), the subvector with elements 'b and 'c has
starting position 1 and ending position 3. (This convention has the
advantage that the number of elements in the subvector is the difference
between the ending position and the starting position. The arguments to
Scheme's substring procedure are required to follow the same
convention, for the same reason.)
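The convention can be tried out with a small helper. The procedure subvector->list is hypothetical, introduced only for this illustration; it lists the elements that a given pair of positions picks out.

```scheme
;; subvector->list (hypothetical helper): list the elements of VEC from
;; position START up to (but not including) position FINISH.
(define subvector->list
  (lambda (vec start finish)
    (let kernel ((position (- finish 1))
                 (so-far '()))
      (if (< position start)
          so-far
          (kernel (- position 1)
                  (cons (vector-ref vec position) so-far))))))

(subvector->list (vector 'a 'b 'c 'd 'e) 1 3)   ; => (b c)
(- 3 1)                                         ; => 2, the number of elements
(subvector->list (vector 'a 'b 'c 'd 'e) 2 5)   ; => (c d e); 5 is the vector's length
(substring "abcde" 1 3)                         ; => "bc"
```

A subvector whose starting and ending positions are equal is empty, which is why the helper returns the empty list in that case.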
The merge! procedure takes two adjacent subvectors of the
same vector, both of which must already be sorted, and overwrites them with
the merged (and therefore sorted) version of their elements.
Unfortunately, this cannot be done completely ``in place''; there must be a
``holding area'' that provides separate storage for the elements as they
are merged, and at the end of the merging process the elements have to be
copied back from the holding area into the original vector. In this
implementation, the holding area takes the form of a second vector, of the
same size as the original. As the two adjacent subvectors of the original
vector are merged, they are placed into the positions of the holding vector
that they will eventually occupy in the original vector; at the end of the
merge, they are copied.
So we need a Scheme procedure copy-subvector! that takes four
arguments -- a source vector source, the starting position start and ending position finish of a subvector of source,
and a holding vector target of the same size as source -- and
copies the specified subvector of source into the corresponding
positions in target, using vector-set!. Here's a sequence
that shows the effect of a sample call:
> (define vector-1 (vector 'alpha 'beta 'gamma 'delta 'epsilon))
> (define vector-2 (vector 'first 'second 'third 'fourth 'fifth))
> (copy-subvector! vector-1 1 3 vector-2)
> vector-2
#(first beta gamma fourth fifth)
> vector-1       ; no change
#(alpha beta gamma delta epsilon)
This procedure is just a do-expression:
;;; copy-subvector!: copy elements in a given range of positions from one
;;; vector into another

;;; Givens:
;;;   SOURCE, a vector
;;;   START, an exact integer
;;;   FINISH, an exact integer
;;;   TARGET, a vector

;;; Results:
;;;   None.

;;; Preconditions:
;;;   (1) The length of TARGET is equal to the length of SOURCE.
;;;   (2) START is non-negative and less than or equal to FINISH.
;;;   (3) FINISH is less than or equal to the length of SOURCE.

;;; Postconditions:
;;;   (1) The contents of SOURCE are the same as they were initially.
;;;   (2) The contents of TARGET are the same as they were initially,
;;;       except that the elements from position START up to (but not
;;;       including) FINISH, are the same as the elements at the same
;;;       positions in SOURCE.

(define copy-subvector!
  (lambda (source start finish target)
    (do ((position start (+ position 1)))
        ((= position finish))
      (vector-set! target position (vector-ref source position)))))
The procedure that actually does the merging, then, needs to know the boundaries of the subvectors that it is supposed to merge. It takes five arguments: the vector whose elements are to be rearranged, a second vector that provides the holding area, the starting position of the left subvector, a boundary position (which is both the ending position of the left subvector and the starting position of the right one), and the ending position of the right subvector.
The kernel procedure keeps track of three positions. The parameter target-position counts off the positions in the holding vector as they are
filled up, from left to right. Current-left keeps track of the
position of the leftmost element in the first subvector that has not yet
been copied into the holding vector; current-right does the same for
the second subvector. There are three cases, which will be handled in
three separate cond-clauses:
If current-left has been incremented enough times to make it
equal to the boundary, then the recursion can stop and all the
elements that have been moved to the holding vector can be copied back into
the original vector. The remaining elements in the second subvector are
already in their correct sorted positions and need not be moved at all.
If no more elements remain in the second subvector, because
current-right has been incremented until it is equal to
finish-right, or if the current element from the first subvector
may precede the current element of the second subvector, copy the current
element from the first subvector into the holding area, then advance to the
next position in the first subvector and in the holding area.
Otherwise, copy the current element from the second subvector into the holding area, then advance to the next position in the second subvector and in the holding area.
Translating into Scheme, and adding the curried ordering procedure may-precede?:
;;; merge!: given an ordering predicate, construct and return a procedure
;;; that will (destructively) merge two adjacent subvectors of a given
;;; vector

;;; Given:
;;;   MAY-PRECEDE?, a binary predicate

;;; Result:
;;;   MERGER!, a procedure

;;; Precondition:
;;;   MAY-PRECEDE? is an ordering predicate.

;;; Postconditions:
;;;   (1) MERGER! takes five arguments, namely: VEC and HOLDING, vectors,
;;;       and START-LEFT, BOUNDARY, and FINISH-RIGHT, exact integers,
;;;       subject to the conditions that VEC and HOLDING are equal in
;;;       length, that the elements of VEC can satisfy any conditions that
;;;       MAY-PRECEDE? imposes on any of its arguments, that START-LEFT is
;;;       non-negative and less than or equal to BOUNDARY, that BOUNDARY is
;;;       less than or equal to FINISH-RIGHT, that FINISH-RIGHT is less
;;;       than or equal to the length of VEC, that the elements in VEC from
;;;       position START-LEFT up to (but not including) BOUNDARY are in
;;;       order by MAY-PRECEDE?, and that the elements in VEC from position
;;;       BOUNDARY up to (but not including) FINISH-RIGHT are in order by
;;;       MAY-PRECEDE?.
;;;   (2) After a call to MERGER!, the elements in positions less than
;;;       START-LEFT and the elements in positions greater than or equal to
;;;       FINISH-RIGHT in VEC are the same as they were initially; the
;;;       elements in positions from START-LEFT up to (but not including)
;;;       FINISH-RIGHT in VEC are a permutation of the elements initially
;;;       in those positions; and the elements in positions from START-LEFT
;;;       up to (but not including) FINISH-RIGHT in VEC are in order by
;;;       MAY-PRECEDE?.

(define merge!
  (lambda (may-precede?)
    (lambda (vec holding start-left boundary finish-right)
      (let kernel ((target-position start-left)
                   (current-left start-left)
                   (current-right boundary))

        ;; If there are no more elements in the left subvector, the
        ;; remaining elements in the right subvector are already in their
        ;; correct positions, so we have reached the base case of the
        ;; recursion.  Copy all of the elements in the holding area back
        ;; into VEC.

        (cond ((= current-left boundary)
               (copy-subvector! holding start-left current-right vec))

              ;; If there are no more elements in the right subvector, or
              ;; if the next element from the left subvector may precede
              ;; the next element from the right subvector, transfer the
              ;; next element from the left subvector into the holding area
              ;; and invoke KERNEL again to go on to the next step.

              ((or (= current-right finish-right)
                   (may-precede? (vector-ref vec current-left)
                                 (vector-ref vec current-right)))
               (vector-set! holding target-position (vector-ref vec current-left))
               (kernel (+ target-position 1) (+ current-left 1) current-right))

              ;; Otherwise, transfer the next element from the right
              ;; subvector and invoke KERNEL again.

              (else
               (vector-set! holding target-position (vector-ref vec current-right))
               (kernel (+ target-position 1) current-left (+ current-right 1))))))))
Because there are more things to keep track of in each recursive call in
this version of the merge sort, the customized sorter that the merge-sort! procedure returns in this case is actually a husk that accepts
the vector as its input, determines its length, and sets up a holding
vector for merge! to use, then invokes a kernel procedure (here
called subsort!) that divides the vector into subvectors, invokes
itself recursively to sort the subvectors, and merges the results.
Since we have random access to vectors, the vector analogue of the split procedure just computes the midpoint of some subvector and returns
it, so that it can be used as the boundary between the two subproblems to
be solved recursively. The index of the midpoint is the average of the
starting position and the ending position of the subvector, rounded down to an integer by quotient.
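As a small illustration of that computation, here is the midpoint calculation pulled out into a procedure of its own. The name midpoint is hypothetical; the sorting procedure performs the same arithmetic inline.

```scheme
;; midpoint (hypothetical helper): the boundary between the two halves of
;; the subvector that runs from START up to (but not including) FINISH.
(define midpoint
  (lambda (start finish)
    (quotient (+ start finish) 2)))

(midpoint 0 8)   ; => 4: halves [0, 4) and [4, 8), four elements each
(midpoint 0 5)   ; => 2: halves [0, 2) and [2, 5), two and three elements
(midpoint 2 5)   ; => 3: halves [2, 3) and [3, 5), one and two elements
```

Because quotient rounds down, the left half is never larger than the right, and for any subvector of two or more elements both halves are strictly smaller than the whole, so the recursion makes progress.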
;;; merge-sort!: given a binary ordering predicate, construct a procedure
;;; that takes a vector and destructively rearranges its elements to be
;;; consistent with the ordering

;;; Given:
;;;   MAY-PRECEDE?, a binary predicate

;;; Result:
;;;   SORTER!, a procedure.

;;; Precondition:
;;;   MAY-PRECEDE? expresses an ordering relation (that is, it is connected
;;;   and transitive).

;;; Postconditions:
;;;   Given any vector VEC of elements that meet any preconditions that
;;;   MAY-PRECEDE? imposes on either of its arguments, SORTER! destructively
;;;   modifies VEC so that the following conditions are met:
;;;
;;;   (1) The elements of VEC are the same as in its initial state.
;;;   (2) VEC is in order by MAY-PRECEDE?
;;;
;;;   SORTER! does not return any particular value; it is invoked only for
;;;   its side effect.

(define merge-sort!
  (lambda (may-precede?)
    (let ((merger! (merge! may-precede?)))
      (lambda (vec)
        (let* ((len (vector-length vec))
               (holding (make-vector len)))
          (let subsort! ((start 0)
                         (finish len))

            ;; Unless there are at least two elements in the subvector,
            ;; sorting is superfluous.

            (if (<= 2 (- finish start))
                (let ((boundary (quotient (+ start finish) 2)))
                  (subsort! start boundary)
                  (subsort! boundary finish)
                  (merger! vec holding start boundary finish)))))))))
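The sample session below shows how the destructive procedures might be used. The three definitions are repeated from above, without their documentation, so that the snippet runs on its own; the vector names pair, sample, and names are made up for the example. The vectors are built with vector rather than written as literals, since some implementations treat literal vectors as immutable.

```scheme
;; Definitions repeated from above so that this snippet is self-contained.
(define copy-subvector!
  (lambda (source start finish target)
    (do ((position start (+ position 1)))
        ((= position finish))
      (vector-set! target position (vector-ref source position)))))

(define merge!
  (lambda (may-precede?)
    (lambda (vec holding start-left boundary finish-right)
      (let kernel ((target-position start-left)
                   (current-left start-left)
                   (current-right boundary))
        (cond ((= current-left boundary)
               (copy-subvector! holding start-left current-right vec))
              ((or (= current-right finish-right)
                   (may-precede? (vector-ref vec current-left)
                                 (vector-ref vec current-right)))
               (vector-set! holding target-position (vector-ref vec current-left))
               (kernel (+ target-position 1) (+ current-left 1) current-right))
              (else
               (vector-set! holding target-position (vector-ref vec current-right))
               (kernel (+ target-position 1) current-left (+ current-right 1))))))))

(define merge-sort!
  (lambda (may-precede?)
    (let ((merger! (merge! may-precede?)))
      (lambda (vec)
        (let* ((len (vector-length vec))
               (holding (make-vector len)))
          (let subsort! ((start 0)
                         (finish len))
            (if (<= 2 (- finish start))
                (let ((boundary (quotient (+ start finish) 2)))
                  (subsort! start boundary)
                  (subsort! boundary finish)
                  (merger! vec holding start boundary finish)))))))))

;; Merging two adjacent sorted subvectors, positions 0-2 and 3-5.
(define pair (vector 1 4 9 2 3 10))
((merge! <=) pair (make-vector 6) 0 3 6)
pair     ; => #(1 2 3 4 9 10)

;; Sorting whole vectors in place.
(define sample (vector 3 1 4 1 5 9 2 6))
((merge-sort! <=) sample)
sample   ; => #(1 1 2 3 4 5 6 9)

(define names (vector "Romero" "Bakyu" "Sims" "Falcon"))
((merge-sort! string-ci<=?) names)
names    ; => #("Bakyu" "Falcon" "Romero" "Sims")
```

Unlike the list version, the value returned by the sorter is not meaningful; the sorted result is read from the vector that was passed in.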