Sorting by merging

The limitations of insertion sorting

The insertion sort algorithm is not always the best one to use. When sorting n values, starting from a random initial arrangement, the insertion sort has to look through half of the elements in the sorted part of the data to find the correct insertion point for each new value it places. The size of that sorted part increases linearly from 0 to n, so its average size is n/2 and the average number of comparisons needed to insert one element is n/4. Taking all the insertions together, then, the insertion sort performs about n^2/4 comparisons to sort the entire set.
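
As a rough check of this estimate, one can add up the cost of the individual insertions. The sum below is only a sketch: it assumes that the insertion made when the sorted part holds k elements costs about k/2 comparisons, and that k runs from 0 to n - 1.

```latex
\[
  \sum_{k=0}^{n-1} \frac{k}{2}
  \;=\; \frac{1}{2} \cdot \frac{n(n-1)}{2}
  \;=\; \frac{n(n-1)}{4}
  \;\approx\; \frac{n^2}{4}
\]
```

For n = 100, the sum comes to 2475, against the approximation's 2500.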

Accordingly, when the number of values to be sorted is large (greater than one hundred, say), it is preferable to use a sorting method that is more complicated to set up initially but performs fewer operations on each element in the process of positioning it correctly. The merge sort algorithm is often an appropriate choice. We shall again examine two variants of this algorithm -- one taking a list of values and returning a newly allocated list containing the same values, but in sorted order, and the other one taking a vector of values and arranging its elements as a ``destructive'' side effect.

Merge sort: constructing a sorted list

One of the building blocks for the first version of the merge sort is a procedure that takes two sorted lists as arguments and returns a sorted list containing all of the values from both argument lists. If we abstract out the ordering relation as may-precede?, a curried version of the merge procedure looks like this:

;;; merge: given an ordering predicate, construct and return a procedure
;;; that merges two lists that are ordered by that predicate into one
;;; similarly ordered list

;;; Given:
;;;   MAY-PRECEDE?, a binary predicate.

;;; Result:
;;;   MERGER, a procedure that takes two lists as arguments and returns a
;;;   list.

;;; Precondition:
;;;   MAY-PRECEDE? expresses an ordering relation.

;;; Postconditions:
;;;   Given two lists, LEFT and RIGHT, of values that meet any preconditions
;;;   that MAY-PRECEDE? imposes on either of its arguments, with the
;;;   additional precondition that both lists are in order by MAY-PRECEDE?,
;;;   MERGER returns a list MERGED such that
;;;     (1) the elements of MERGED are exactly the elements of LEFT and
;;;         RIGHT, and
;;;     (2) MERGED is in order by MAY-PRECEDE?.

(define merge
  (lambda (may-precede?)
    (letrec ((merger (lambda (left right)
                       (cond ((null? left) right)
                             ((null? right) left)
                             ((may-precede? (car left) (car right))
                              (cons (car left) (merger (cdr left) right)))
                             (else
                              (cons (car right)
                                    (merger left (cdr right))))))))
      merger)))

If either of the given lists is null, the result of applying the specialized merging procedure (here called merger) is simply the other list. Otherwise, we split off whichever of the first elements of the given lists may precede the other, issue a recursive call to merge the remainders of both lists, and prepend the selected first element to the result.
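
A few sample calls may make the behavior concrete. The block below repeats the definition of merge so that it can be run by itself; the names merged-numbers, merged-with-null, and merged-descending are chosen only for this illustration.

```scheme
;;; The definition of merge, repeated from above (without its
;;; documentation) so that this example can be run by itself.
(define merge
  (lambda (may-precede?)
    (letrec ((merger (lambda (left right)
                       (cond ((null? left) right)
                             ((null? right) left)
                             ((may-precede? (car left) (car right))
                              (cons (car left) (merger (cdr left) right)))
                             (else
                              (cons (car right)
                                    (merger left (cdr right))))))))
      merger)))

;;; Both arguments are already in order by <=, as the precondition
;;; requires.
(define merged-numbers ((merge <=) '(1 4 9) '(2 3 10)))
;;; merged-numbers is (1 2 3 4 9 10)

;;; If one list is null, the result is simply the other list.
(define merged-with-null ((merge <=) '() '(2 7)))
;;; merged-with-null is (2 7)

;;; A different ordering: both arguments are in order by >=.
(define merged-descending ((merge >=) '(8 5 1) '(6 6 0)))
;;; merged-descending is (8 6 6 5 1 0)
```

Note that the arguments must be ordered by the same predicate that was given to merge; merging two ascending lists with a procedure built from >= would not produce a sorted result.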

The merge sort works by starting with short lists, merging them to form somewhat longer lists, merging the somewhat longer lists to form still longer ones, and so on until only one list remains -- the result of the final merge operation. The short lists that we begin with must satisfy the precondition for the procedure that merge constructs: they must already be sorted.

There are several ways to establish this precondition. Since divide-and-conquer recursion is a common and useful general strategy, we'll use it here: split the list into two parts, sort each part by a recursive call, and merge the results. The recursion stops at lists of zero or one element, which are trivially in order. Here's one way to split a list into two lists of equal or nearly equal length. This version of split returns the two lists as the car and cdr of a pair.

;;; split: separate a given list into two lists, equal or nearly equal in
;;; length

;;; Given:
;;;   LS, a list

;;; Result:
;;;   PAIR-OF-PARTS, a pair of lists.

;;; Preconditions:
;;;   None.

;;; Postconditions:
;;;   (1) The elements of LS are exactly the elements of the car of
;;;       PAIR-OF-PARTS and the cdr of PAIR-OF-PARTS.
;;;   (2) The length of the car of PAIR-OF-PARTS is either equal to, or one
;;;       greater than, the length of the cdr of PAIR-OF-PARTS.

(define split
  (lambda (ls)
    (let kernel ((rest ls)
                 (left '())
                 (right '()))
      (if (null? rest)
          (cons left right)
          (kernel (cdr rest) (cons (car rest) right) left)))))

In the tail-recursive kernel procedure, the parameters left and right are both ``so-far'' accumulators, each holding about half of the elements so far encountered in the original list. At each invocation of kernel, either left and right have the same number of elements (as is true initially) or left has one more element than right; prepending the element taken from rest to right and then having the two lists swap places in each recursive call ensures that any imbalance is immediately redressed.
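
The swapping can be followed step by step on a five-element list. The block below repeats the definition of split so that it can be run by itself; the name parts is chosen only for this illustration.

```scheme
;;; The definition of split, repeated from above (without its
;;; documentation) so that this example can be run by itself.
(define split
  (lambda (ls)
    (let kernel ((rest ls)
                 (left '())
                 (right '()))
      (if (null? rest)
          (cons left right)
          (kernel (cdr rest) (cons (car rest) right) left)))))

(define parts (split '(a b c d e)))

;;; Successive invocations of KERNEL receive these arguments:
;;;
;;;   rest          left       right
;;;   (a b c d e)   ()         ()
;;;   (b c d e)     (a)        ()
;;;   (c d e)       (b)        (a)
;;;   (d e)         (c a)      (b)
;;;   (e)           (d b)      (c a)
;;;   ()            (e c a)    (d b)
;;;
;;; so (car parts) is (e c a) and (cdr parts) is (d b).
```

Each part holds alternate elements of the original list, in reverse order. That is harmless here, since the parts are about to be sorted anyway.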

The full merge-sort procedure checks to see whether either of the base cases holds; if not, it invokes split to create two subproblems of the same kind, solves each one by a recursive call, and finally invokes a procedure constructed by merge to combine the results.

;;; merge-sort: given a binary ordering predicate, construct a procedure
;;; that takes a list and arranges its elements to be consistent with the
;;; ordering

;;; Given:
;;;   MAY-PRECEDE?, a binary predicate

;;; Result:
;;;   SORTER, a procedure.

;;; Precondition:
;;;   MAY-PRECEDE? expresses an ordering relation (that is, it is connected
;;;   and transitive).

;;; Postconditions:
;;;   Given any list UNSORTED of elements that meet any preconditions that
;;;   MAY-PRECEDE? imposes on either of its arguments, SORTER returns a
;;;   list SORTED such that
;;;
;;;     (1) The elements of SORTED are exactly the elements of UNSORTED.
;;;     (2) SORTED is in order by MAY-PRECEDE?.

(define merge-sort
  (lambda (may-precede?)
    (let ((merger (merge may-precede?)))
      (letrec ((sorter (lambda (ls)
                         (if (or (null? ls) (null? (cdr ls)))
                             ls
                             (let ((halves (split ls)))
                               (merger (sorter (car halves))
                                       (sorter (cdr halves))))))))
        sorter))))
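
To see how many comparisons this procedure actually performs, one can wrap the ordering predicate in a procedure that counts its invocations. The block below is only a sketch: counting, comparisons, scrambled, and sorted are hypothetical names introduced for this illustration, and the definitions of merge, split, and merge-sort are repeated so that the block can be run by itself.

```scheme
;;; merge, split, and merge-sort, repeated from above (without their
;;; documentation) so that this example can be run by itself.
(define merge
  (lambda (may-precede?)
    (letrec ((merger (lambda (left right)
                       (cond ((null? left) right)
                             ((null? right) left)
                             ((may-precede? (car left) (car right))
                              (cons (car left) (merger (cdr left) right)))
                             (else
                              (cons (car right)
                                    (merger left (cdr right))))))))
      merger)))

(define split
  (lambda (ls)
    (let kernel ((rest ls)
                 (left '())
                 (right '()))
      (if (null? rest)
          (cons left right)
          (kernel (cdr rest) (cons (car rest) right) left)))))

(define merge-sort
  (lambda (may-precede?)
    (let ((merger (merge may-precede?)))
      (letrec ((sorter (lambda (ls)
                         (if (or (null? ls) (null? (cdr ls)))
                             ls
                             (let ((halves (split ls)))
                               (merger (sorter (car halves))
                                       (sorter (cdr halves))))))))
        sorter))))

;;; counting (hypothetical): wrap a binary predicate so that every
;;; invocation adds one to the global tally COMPARISONS.
(define comparisons 0)
(define counting
  (lambda (may-precede?)
    (lambda (left right)
      (set! comparisons (+ comparisons 1))
      (may-precede? left right))))

;;; scrambled: the integers from 0 to 63 in a shuffled order.  Since 37
;;; and 64 have no common factor, each integer appears exactly once.
(define scrambled
  (let kernel ((index 63)
               (so-far '()))
    (if (< index 0)
        so-far
        (kernel (- index 1) (cons (modulo (* index 37) 64) so-far)))))

(define sorted ((merge-sort (counting <=)) scrambled))
;;; sorted is now (0 1 2 ... 63), and comparisons holds the number of
;;; times <= was invoked.
```

Merging lists of lengths a and b takes at most a + b - 1 comparisons, so for these sixty-four elements the tally cannot exceed 64 * 6 = 384, whereas the estimate for insertion sort given at the beginning of this reading is about 64^2/4 = 1024.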

One uses the merge-sort procedure in exactly the same way as the generalized insertion-sort procedure in the previous reading:

> ((merge-sort <=) '(3 1 4 1 5 9 2 6))
(1 1 2 3 4 5 6 9)
> ((merge-sort >=) '(3 1 4 1 5 9 2 6))
(9 6 5 4 3 2 1 1)
> (define alphabetize (merge-sort string-ci<=?))
> (alphabetize '("Brunner" "Furuta" "Romero" "Shadel" "Poulin" "Hecker"
                 "Falcon" "Rapp" "Bakyu" "Manfredi" "Benness" "Sims"
                 "Morrison" "Herrington" "Pecsok" "Chamberlain"))
("Bakyu" "Benness" "Brunner" "Chamberlain" "Falcon" "Furuta" "Hecker"
 "Herrington" "Manfredi" "Morrison" "Pecsok" "Poulin" "Rapp" "Romero"
 "Shadel" "Sims")
> (define sort-chars (merge-sort char-ci<=?))
> (define alphanagram
    (lambda (str)
      (list->string (sort-chars (string->list str)))))
> (alphanagram "conglomeration")
"acegilmnnooort"

Merge sort: overwriting the contents of a vector

Now suppose that the values to be sorted arrive in the form of a vector, and that the objective is to rearrange the contents of the vector so that at the end of the sorting procedure the same elements are present, but their order is the one specified by the comparison rule. As in the constructive version of the algorithm above, we want to work our way up from single-element subvectors. Instead of allocating a separate vector for each single element, however, we can take advantage of constant-time access to the elements of a vector by making the separation purely notional: We can identify a subvector of the original vector by keeping track of the positions at which it begins and ends.

Let's adopt the convention that the starting position of a subvector is the position of the first element that is inside the subvector, and the ending position is the position of the first element after and outside of the subvector (or, if there is no such element, the length of the entire vector). So, within the vector '#(a b c d e), the subvector with elements 'b and 'c has starting position 1 and ending position 3. (This convention has the advantage that the number of elements in the subvector is the difference between the ending position and the starting position. The arguments to Scheme's substring procedure are required to follow the same convention, for the same reason.)
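
The convention can be tried out with a small helper that lists the elements of a subvector. This is only a sketch: subvector->list is a hypothetical procedure, introduced here for illustration and not used in the rest of the reading.

```scheme
;;; subvector->list (hypothetical): return a list of the elements of VEC
;;; from position START up to (but not including) position FINISH.
(define subvector->list
  (lambda (vec start finish)
    (let kernel ((position (- finish 1))
                 (so-far '()))
      (if (< position start)
          so-far
          (kernel (- position 1)
                  (cons (vector-ref vec position) so-far))))))

(define middle (subvector->list (vector 'a 'b 'c 'd 'e) 1 3))
;;; middle is (b c), which has (- 3 1), or 2, elements

(define nothing (subvector->list (vector 'a 'b 'c 'd 'e) 2 2))
;;; nothing is (): equal starting and ending positions denote an empty
;;; subvector

(define letters (substring "abcde" 1 3))
;;; letters is "bc": substring follows the same convention
```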

The merge! procedure takes two adjacent subvectors of the same vector, both of which must already be sorted, and overwrites them with the merged (and therefore sorted) version of their elements. Unfortunately, this cannot be done completely ``in place''; there must be a ``holding area'' that provides separate storage for the elements as they are merged, and at the end of the merging process the elements have to be copied back from the holding area into the original vector. In this implementation, the holding area takes the form of a second vector, of the same size as the original. As the two adjacent subvectors of the original vector are merged, their elements are placed into the positions of the holding vector that they will eventually occupy in the original vector; at the end of the merge, they are copied back into those positions of the original vector.

So we need a Scheme procedure copy-subvector! that takes four arguments -- a source vector source, the starting position start and ending position finish of a subvector of source, and a holding vector target of the same size as source -- and copies the specified subvector of source into the corresponding positions in target, using vector-set!. Here's a sequence that shows the effect of a sample call:

> (define vector-1 (vector 'alpha 'beta 'gamma 'delta 'epsilon))
> (define vector-2 (vector 'first 'second 'third 'fourth 'fifth))
> (copy-subvector! vector-1 1 3 vector-2)
> vector-2
#(first beta gamma fourth fifth)
> vector-1  ; no change
#(alpha beta gamma delta epsilon)

This procedure is just a do-expression:

;;; copy-subvector!: copy elements in a given range of positions from one
;;; vector into another

;;; Givens:
;;;   SOURCE, a vector
;;;   START, an exact integer
;;;   FINISH, an exact integer
;;;   TARGET, a vector

;;; Results:
;;;   None.

;;; Preconditions:
;;;   (1) The length of TARGET is equal to the length of SOURCE.
;;;   (2) START is non-negative and less than or equal to FINISH.
;;;   (3) FINISH is less than or equal to the length of SOURCE.

;;; Postconditions:
;;;   (1) The contents of SOURCE are the same as they were initially.
;;;   (2) The contents of TARGET are the same as they were initially,
;;;       except that the elements from position START up to (but not
;;;       including) FINISH, are the same as the elements at the same
;;;       positions in SOURCE.

(define copy-subvector!
  (lambda (source start finish target)
    (do ((position start (+ position 1)))
        ((= position finish))
      (vector-set! target position (vector-ref source position)))))

The procedure that actually does the merging, then, needs to know the boundaries of the subvectors that it is supposed to merge. It takes five arguments: the vector whose elements are to be rearranged, a second vector that provides the holding area, the starting position of the left subvector, a boundary position (which is both the ending position of the left subvector and the starting position of the right one), and the ending position of the right subvector.

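The kernel procedure keeps track of three positions. The parameter target-position counts off the positions in the holding vector as they are filled up, from left to right. Current-left keeps track of the position of the leftmost element in the first subvector that has not yet been copied into the holding vector; current-right does the same for the second subvector. There are three cases, which will be handled in three separate cond-clauses. First, if every element of the left subvector has been copied, any elements remaining in the right subvector are already in their correct positions in the original vector, so the merge is finished and the elements in the holding vector are copied back. Second, if every element of the right subvector has been copied, or if the next element of the left subvector may precede the next element of the right subvector, the next element of the left subvector is transferred into the holding vector. Third, in any other case, the next element of the right subvector is transferred.

Translating into Scheme, and adding the curried ordering procedure may-precede?: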

;;; merge!: given an ordering predicate, construct and return a procedure
;;; that will (destructively) merge two adjacent subvectors of a given
;;; vector

;;; Given:
;;;   MAY-PRECEDE?, a binary predicate

;;; Result:
;;;   MERGER!, a procedure

;;; Precondition:
;;;   MAY-PRECEDE? is an ordering predicate.

;;; Postconditions:
;;;   (1) MERGER! takes five arguments, namely: VEC and HOLDING, vectors,
;;;       and START-LEFT, BOUNDARY, and FINISH-RIGHT, exact integers,
;;;       subject to the conditions that VEC and HOLDING are equal in
;;;       length, that the elements of VEC can satisfy any conditions that
;;;       MAY-PRECEDE? imposes on any of its arguments, that START-LEFT is
;;;       non-negative and less than or equal to BOUNDARY, that BOUNDARY is
;;;       less than or equal to FINISH-RIGHT, that FINISH-RIGHT is less
;;;       than or equal to the length of VEC, that the elements in VEC from
;;;       position START-LEFT up to (but not including) BOUNDARY are in
;;;       order by MAY-PRECEDE?, and that the elements in VEC from position
;;;       BOUNDARY up to (but not including) FINISH-RIGHT are in order by
;;;       MAY-PRECEDE?.
;;;   (2) After a call to MERGER!, the elements in positions less than
;;;       START-LEFT and the elements in positions greater than or equal to
;;;       FINISH-RIGHT in VEC are the same as they were initially; the
;;;       elements in positions from START-LEFT up to (but not including)
;;;       FINISH-RIGHT in VEC are a permutation of the elements initially
;;;       in those positions; and the elements in positions from START-LEFT
;;;       up to (but not including) FINISH-RIGHT in VEC are in order by
;;;       MAY-PRECEDE?.

(define merge!
  (lambda (may-precede?)
    (lambda (vec holding start-left boundary finish-right)
      (let kernel ((target-position start-left)
                   (current-left start-left)
                   (current-right boundary))

        ;; If there are no more elements in the left subvector, the
        ;; remaining elements in the right subvector are already in their
        ;; correct positions, so we have reached the base case of the
        ;; recursion.  Copy all of the elements in the holding area back
        ;; into VEC.

        (cond ((= current-left boundary)
               (copy-subvector! holding start-left current-right vec))

              ;; If there are no more elements in the right subvector, or
              ;; if the next element from the left subvector may precede
              ;; the next element from the right subvector, transfer the
              ;; next element from the left subvector into the holding area
              ;; and invoke KERNEL again to go on to the next step.

              ((or (= current-right finish-right)
                   (may-precede? (vector-ref vec current-left)
                                 (vector-ref vec current-right)))
               (vector-set! holding target-position
                            (vector-ref vec current-left))
               (kernel (+ target-position 1)
                       (+ current-left 1)
                       current-right))

              ;; Otherwise, transfer the next element from the right
              ;; subvector and invoke KERNEL again.

              (else
               (vector-set! holding target-position
                            (vector-ref vec current-right))
               (kernel (+ target-position 1)
                       current-left
                       (+ current-right 1))))))))
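
A sample call shows the effect on a vector whose two middle runs are already sorted. The block below repeats the definitions of copy-subvector! and merge! so that it can be run by itself; sample and scratch are names chosen only for this illustration.

```scheme
;;; copy-subvector! and merge!, repeated from above (without their
;;; documentation) so that this example can be run by itself.
(define copy-subvector!
  (lambda (source start finish target)
    (do ((position start (+ position 1)))
        ((= position finish))
      (vector-set! target position (vector-ref source position)))))

(define merge!
  (lambda (may-precede?)
    (lambda (vec holding start-left boundary finish-right)
      (let kernel ((target-position start-left)
                   (current-left start-left)
                   (current-right boundary))
        (cond ((= current-left boundary)
               (copy-subvector! holding start-left current-right vec))
              ((or (= current-right finish-right)
                   (may-precede? (vector-ref vec current-left)
                                 (vector-ref vec current-right)))
               (vector-set! holding target-position
                            (vector-ref vec current-left))
               (kernel (+ target-position 1)
                       (+ current-left 1)
                       current-right))
              (else
               (vector-set! holding target-position
                            (vector-ref vec current-right))
               (kernel (+ target-position 1)
                       current-left
                       (+ current-right 1))))))))

;;; Positions 1 through 3 (2 5 8) and positions 4 through 6 (1 3 9) are
;;; each in order by <=.  The zeros at positions 0 and 7 lie outside both
;;; subvectors; they are there only to show that merge! leaves them alone.
(define sample (vector 0 2 5 8 1 3 9 0))
(define scratch (make-vector (vector-length sample) 0))

((merge! <=) sample scratch 1 4 7)
;;; sample is now #(0 1 2 3 5 8 9 0)
```

In this run the left subvector is exhausted first, after 8 has been transferred. At that point the 9 at position 6 has never been moved and is already where it belongs, so only positions 1 through 5 are copied back from the holding vector.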

Because there are more things to keep track of in each recursive call in this version of the merge sort, the customized sorter that the merge-sort! procedure returns is actually a husk. It accepts the vector as its input, determines its length, and sets up a holding vector for merge! to use. It then invokes a kernel procedure (here called subsort!) that divides the vector into subvectors, invokes itself recursively to sort the subvectors, and merges the results.

Since we have random access to vectors, the vector analogue of the split procedure just computes the midpoint of some subvector and returns it, so that it can be used as the boundary between the two subproblems to be solved recursively. The index of the midpoint is the average of the starting position and the ending position of the subvector, rounded down to an integer by means of quotient.

;;; merge-sort!: given a binary ordering predicate, construct a procedure
;;; that takes a vector and destructively rearranges its elements to be
;;; consistent with the ordering

;;; Given:
;;;   MAY-PRECEDE?, a binary predicate

;;; Result:
;;;   SORTER!, a procedure.

;;; Precondition:
;;;   MAY-PRECEDE? expresses an ordering relation (that is, it is connected
;;;   and transitive).

;;; Postconditions:
;;;   Given any vector VEC of elements that meet any preconditions that
;;;   MAY-PRECEDE? imposes on either of its arguments, SORTER! destructively
;;;   modifies VEC so that the following conditions are met:
;;; 
;;;     (1) The elements of VEC are a permutation of its initial elements.
;;;     (2) VEC is in order by MAY-PRECEDE?.
;;;
;;;  SORTER! does not return any particular value; it is invoked only for
;;;  its side effect.

(define merge-sort!
  (lambda (may-precede?)
    (let ((merger! (merge! may-precede?)))
      (lambda (vec)
        (let* ((len (vector-length vec))
               (holding (make-vector len)))
          (let subsort! ((start 0)
                         (finish len))

            ;; Unless there are at least two elements in the subvector,
            ;; sorting is superfluous.

            (if (<= 2 (- finish start))
                (let ((boundary (quotient (+ start finish) 2)))
                  (subsort! start boundary)
                  (subsort! boundary finish)
                  (merger! vec holding start boundary finish)))))))))
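
The destructive sorter is applied to a vector purely for its side effect, so one inspects the vector afterwards rather than the value of the call. The block below repeats the definitions of copy-subvector!, merge!, and merge-sort! so that it can be run by itself; numbers and names are chosen only for this illustration.

```scheme
;;; copy-subvector!, merge!, and merge-sort!, repeated from above (without
;;; their documentation) so that this example can be run by itself.
(define copy-subvector!
  (lambda (source start finish target)
    (do ((position start (+ position 1)))
        ((= position finish))
      (vector-set! target position (vector-ref source position)))))

(define merge!
  (lambda (may-precede?)
    (lambda (vec holding start-left boundary finish-right)
      (let kernel ((target-position start-left)
                   (current-left start-left)
                   (current-right boundary))
        (cond ((= current-left boundary)
               (copy-subvector! holding start-left current-right vec))
              ((or (= current-right finish-right)
                   (may-precede? (vector-ref vec current-left)
                                 (vector-ref vec current-right)))
               (vector-set! holding target-position
                            (vector-ref vec current-left))
               (kernel (+ target-position 1)
                       (+ current-left 1)
                       current-right))
              (else
               (vector-set! holding target-position
                            (vector-ref vec current-right))
               (kernel (+ target-position 1)
                       current-left
                       (+ current-right 1))))))))

(define merge-sort!
  (lambda (may-precede?)
    (let ((merger! (merge! may-precede?)))
      (lambda (vec)
        (let* ((len (vector-length vec))
               (holding (make-vector len)))
          (let subsort! ((start 0)
                         (finish len))
            (if (<= 2 (- finish start))
                (let ((boundary (quotient (+ start finish) 2)))
                  (subsort! start boundary)
                  (subsort! boundary finish)
                  (merger! vec holding start boundary finish)))))))))

(define numbers (vector 3 1 4 1 5 9 2 6))
((merge-sort! <=) numbers)
;;; numbers is now #(1 1 2 3 4 5 6 9)

(define names (vector "Rapp" "Falcon" "Sims" "Bakyu" "Hecker"))
((merge-sort! string-ci<=?) names)
;;; names is now #("Bakyu" "Falcon" "Hecker" "Rapp" "Sims")
```

For the five-element vector, the first boundary is (quotient (+ 0 5) 2), or 2, so the left subproblem covers positions 0 and 1 and the right one covers positions 2 through 4.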