# Laboratory: Merge Sort

Summary: In this laboratory, we consider merge sort, a more efficient technique for sorting lists of values.

## Exercise 0: Preparation

Add the code from the end of this lab to your definitions pane.

## Exercises

### Exercise 1: Merging

a. Write an expression to merge the lists `(1 2 3)` and `(1 1.5 2.3)`.

b. Write an expression to merge two identical lists of numbers. For example, you might merge the lists `(1 2 3 5 8 13 21)` and `(1 2 3 5 8 13 21)`

c. Write an expression to merge two lists of strings. (You may choose the strings yourself. Each list should have at least three elements.)

d. Assume that we represent names as lists of the form `(last-name first-name)`. Write an expression to merge the following two lists

```(define mathstats-faculty
(list (list "Chamberland" "Marc")
(list "French" "Chris")
(list "Jager" "Leah")
(list "Kuiper" "Shonda")
(list "Moore" "Emily")
(list "Moore" "Tom")
(list "Mosley" "Holly")
(list "Romano" "David")
(list "Shuman" "Karen")
(list "Wolf" "Royce")))

(define more-faculty
(list (list "Moore" "Chuck")
(list "Moore" "Ed")
(list "Moore" "Gordon")
(list "Moore" "Roger")))
```

### Exercise 2: Reflecting on Merging

a. What will happen if you call `merge` with unsorted lists as the two list parameters?

b. Check your answer by experimentation. To help you understand what is happening, you may wish to modify `merge` so that it displays the values of `sorted1` and `sorted2`.

c. What will happen if you call `merge` with sorted lists of very different lengths as the first two parameters?

### Exercise 3: Splitting

Use `split` to split:

a. A list of numbers of length 6

b. A list of numbers of length 5

c. A list of strings of length 6

### Exercise 4: Sorting

a. Run `merge-sort` on a list you design of fifteen integers.

b. Run `new-merge-sort` on a list you design of ten strings.

c. Add the following lines to the `repeat-merge` helper in `new-merge-sort`. (Add them directly after the line that contains `lambda`.)

```                 (display list-of-lists) (newline)
```

d. What output do you expect to get if you rerun `new-merge-sort` on the list from step b?

f. Rerun `new-merge-sort` on a list of twenty integers.

### Exercise 5: Special Cases

As we've seen, in exploring any algorithm, it's a good idea to check a few special cases that might cause the algorithm difficulty. Here are some to start with.

a. Run both versions of merge sort on the empty list.

b. Run both versions of merge sort on a one-element list.

c. Run both versions of merge sort on a list with duplicate elements.

### Exercise 6: Is It Sorted?

As you've probably noticed, there are two key postconditions of a procedure that sorts lists: The result is a permutation of the original list and the result is sorted.

We need a way to make sure that the result is sorted, particularly if the result is very long.

Write a procedure, ```(sorted? lst may-precede?)``` that checks whether or not `lst` is sorted by `may-precede?`.

For example,

````>` `(sorted? (list 1 3 5 7 9) <=)`
`#t`
`>` `(sorted? (list 1 3 5 4 7 9) <=)`
`#f`
`>` `(sorted? (list "alpha" "beta" "gamma") string-ci<=?)`
```

## For Those with Extra Time

### Extra 1: Splitting, Revisited

Some computer scientists prefer to define `split` something like the following.

```(define split
(lambda (ls)
(let kernel ((rest ls)
(left null)
(right null))
(if (null? rest)
(list left right)
(kernel (cdr rest) (cons (car rest) right) left)))))
```

a. How does this procedure split the list?

b. Why might you prefer one version of split over the other?

### Extra 2: Checking Permutations

We've written a procedure that checks whether a list is sorted, which is one of the postconditions of a list-based sorting routine. However, we have not yet written a procedure to check whether two lists are permutations of each other.

Write a procedure, ```(permutation? lst1 lst2)```, that determines if `lst2` is a permutation of `lst1`.

You might find the following strategy, which involves iterating through the first list, an appropriate strategy for testing for permutations.

Base case: If both lists are empty, the two lists are permutations of each other.

Base case: If the first list is empty and the second is not (i.e., it's a pair), the two lists are not permutations of each other.

Base case: If the first list is nonempty and car of the first list is not contained in the second, the two lists are not permutations of each other.

Recursive case: If the first list is not empty and the first element of the first list is in the second list, then the two lists are permutations of each other only if the cdr of the first list is a permutation of what you get by removing the car of the first list from the second list.

You may also find the following two procedures useful.

```;;; Procedure:
;;;   list-contains?
;;; Parameters:
;;;   lst, a list
;;;   val, a value
;;; Purpose:
;;;   Determines if lst contains val.
;;; Produces:
;;;   contained?, a Boolean
;;; Preconditions:
;;; Postconditions:
;;;   If there is an i such that (list-ref lst i) equals val,
;;;     then contained? is true (#t).
;;;   Otherwise,
;;;     contained? is false.
(define list-contains?
(lambda (lst val)
(and (not (null? lst))
(or (equal? (car lst) val)
(list-contains? (cdr lst) val)))))

;;; Procedure:
;;;   list-remove-one
;;; Parameters:
;;;   lst, a list
;;;   val, a value
;;; Purpose:
;;;   Remove one copy of val from lst.
;;; Produces:
;;;   newlst
;;; Preconditions:
;;;   lst contains a copy of val.
;;; Postconditions:
;;;   (length newlst) = (length lst) - 1
;;;   lst is a permutation of (cons val newlst).
;;;   If val1 and val2 appear in both lst and newlst, and val1 precedes
;;;     val2 in lst, the val1 precedes val2 in newlst.
(define list-remove-one
(lambda (lst val)
(cond
((null? lst)
null)
((equal? val (car lst))
(car lst))
(else
(cons (car lst) (list-remove-one (cdr lst) val))))))
```

## Some Useful Procedures

```;;; Procedure:
;;;   merge
;;; Parameters:
;;;   sorted1, a sorted list.
;;;   sorted2, a sorted list.
;;;   may-precede?, a binary predicate that compares values.
;;; Purpose:
;;;   Merge the two lists.
;;; Produces:
;;;   sorted, a sorted list.
;;; Preconditions:
;;;   may-precede? can be applied to any two values from
;;;     sorted1 and/or sorted2.
;;;   may-precede? represents a transitive operation.
;;;   sorted1 is sorted by may-precede? That is, for each i such that
;;;     0 <= i < (length sorted1)
;;;       (may-precede? (list-ref sorted1 i) (list-ref sorted1 (+ i 1)))
;;;   sorted2 is sorted by may-precede? That is, for each i such that
;;;     0 <= j < (length sorted2)
;;;       (may-precede? (list-ref sorted2 j) (list-ref sorted2 (+ j 1)))
;;; Postconditions:
;;;   sorted is sorted by may-precede?.
;;;     For each k, 0 <= k < (length sorted)
;;;       (may-precede? (list-ref sorted k) (list-ref sorted (+ k 1)))
;;;   sorted is a permutation of (append sorted1 sorted2)
;;;   Does not affect sorted1 or sorted2.
;;;   sorted may share cons cells with sorted1 or sorted2.
(define merge
(lambda (sorted1 sorted2 may-precede?)
(cond
; If the first list is empty, return the second
((null? sorted1) sorted2)
; If the second list is empty, return the first
((null? sorted2) sorted1)
; If the first element of the first list is smaller or equal,
; make it the first element of the result and recurse.
((may-precede? (car sorted1) (car sorted2))
(cons (car sorted1)
(merge (cdr sorted1) sorted2 may-precede?)))
; Otherwise, do something similar using the first element
; of the second list
(else
(cons (car sorted2)
(merge sorted1 (cdr sorted2) may-precede?))))))

;;; Procedure:
;;;   split
;;; Parameters:
;;;   lst, a list
;;; Purpose:
;;;   Split a list into two nearly-equal halves.
;;; Produces:
;;;   halves, a list of two lists
;;; Preconditions:
;;;   lst is a list.
;;; Postconditions:
;;;   halves is a list of length two.
;;;   Each element of halves is a list (which we'll refer to as
;;;     firsthalf and secondhalf).
;;;   lst is a permutation of (append firsthalf secondhalf).
;;;   The lengths of firsthalf and secondhalf differ by at most 1.
;;;   Does not modify lst.
;;;   Either firsthalf or secondhalf may share cons cells with lst.
(define split
(lambda (lst)
;;; kernel
;;;   Remove the first count elements of a list.  Return the
;;;   pair consisting of the removed elements (in order) and
;;;   the remaining elements.
(let kernel ((remaining lst) ; Elements remaining to be used
(removed null)  ; Accumulated initial elements
(count          ; How many elements left to use
(quotient (length lst) 2)))
; If no elements remain to be used,
(if (= count 0)
; The first half is in removed and the second half
; consists of any remaining elements.
(list (reverse removed) remaining)
; Otherwise, use up one more element.
(kernel (cdr remaining)
(cons (car remaining) removed)
(- count 1))))))

(define merge-sort
(lambda (stuff may-precede?)
; If there are only zero or one elements in the list,
; the list is already sorted.
(if (or (null? stuff) (null? (cdr stuff)))
stuff
; Otherwise,
;   split the list in half,
;   sort each half,
;   and then merge the sorted halves.
(let* ((halves (split stuff))
(some (car halves))
(merge (merge-sort some may-precede?)
(merge-sort rest may-precede?)
may-precede?)))))

(define new-merge-sort
(lambda (lst may-precede?)
(letrec (
; Merge neighboring pairs in a list of lists
(merge-pairs
(lambda (list-of-lists)
(cond
; Base case: Empty list.
((null? list-of-lists) null)
; Base case: Single-element list (nothing to merge)
((null? (cdr list-of-lists)) list-of-lists)
; Recursive case: Merge first two and continue
(else (cons (merge (car list-of-lists) (cadr list-of-lists)
may-precede?)
(merge-pairs (cddr list-of-lists)))))))
; Repeatedly merge pairs
(repeat-merge
(lambda (list-of-lists)
; Show what's happening
; (write list-of-lists) (newline)
; If there's only one list in the list of lists
(if (null? (cdr list-of-lists))
; Use that list
(car list-of-lists)
; Otherwise, merge neighboring pairs and start again.
(repeat-merge (merge-pairs list-of-lists))))))
(repeat-merge (map list lst)))))
```

Samuel A. Rebelsky, rebelsky@grinnell.edu

Copyright (c) 2007-8 Janet Davis, Matthew Kluber, and Samuel A. Rebelsky. (Selected materials copyright by John David Stone and Henry Walker and used by permission.)

This material is based upon work partially supported by the National Science Foundation under Grant No. CCLI-0633090. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License. To view a copy of this license, visit `http://creativecommons.org/licenses/by-nc/2.5/` or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.