Algorithms and OOD (CSC 207 2014F) : EBoards

# CSC 207.01 2014F: Extra Session 9 (Thursday, 6 November 2014)

Topics to discuss

• How do we use merge in the scratch version of merge sort
• How do we use scratch in merge?
• How do we design testing suites?
• What happens at runtime when we have algorithms that use generics?
• Why are we analyzing SortedArrayList in HW9?

## Using a scratch array in merge

The (conceptually) simplest version of merge: Take two lists of integers and merge them into a new list.

For merging lists, we want

;;; Procedure:
;;;   merge
;;; Parameters:
;;;   lst1, a sorted list of real numbers
;;;   lst2, a sorted list of real numbers
;;; Purpose:
;;;   Merge lst1 and lst2 into a new sorted list
;;; Produces:
;;;   newlst, a list
;;; Preconditions:
;;;   lst1 is sorted, which means that
;;;     (<= (list-ref lst1 i) (list-ref lst1 (+ i 1)))
;;;     for all reasonable i.
;;;   lst2 is sorted
;;; Postconditions:
;;;   newlst is a permutation of (append lst1 lst2)
;;;   newlst is sorted
;;; Process:
;;;   Repeatedly grab the smaller of the two cars.
;;;   When one list runs out, use the other.
(define merge
(lambda (lst1 lst2)
(cond
[(null? lst1) lst2]
[(null? lst2) lst1]
[(<= (car lst1) (car lst2))
(cons (car lst1) (merge (cdr lst1) lst2))]
[else ; car of lst2 is smaller
(cons (car lst2) (merge lst1 (cdr lst2)))])))

(merge '(1 3 7 8 9) '(2 4 5))
=> (cons 1 (merge '(3 7 8 9) '(2 4 5)))
=> (cons 1 (cons 2 (merge '(3 7 8 9) '(4 5))))
=> (cons 1 (cons 2 (cons 3 (merge '(7 8 9) '(4 5)))))
=> (cons 1 (cons 2 (cons 3 (cons 4 (merge '(7 8 9) '(5))))))
=> (cons 1 (cons 2 (cons 3 (cons 4 (cons 5 (merge '(7 8 9) '()))))))
=> (cons 1 (cons 2 (cons 3 (cons 4 (cons 5 '(7 8 9))))))

Now let's do it with arrays (still simplest version). We tend to process arrays iteratively, which gives us an opportunity for invariants.

/**
* merge arr1 and arr2 into a new array.
*
* @return newarr
*
* @pre sorted(arr1)
* @pre sorted(arr2)
* @post newarr is a permutation of the concatenation of arr1
*    and arr2
* @post sorted(newarr)
*/

Since we're working with arrays, I'll think about having an index in each array that divides the stuff already handled with the stuff left to be processed.

// arr1:    +---------+---------+
//          |processed| sorted  |
//          +---------+---------+
//          0         i1        length1

// arr2:    +---------+---------+
//          |processed| sorted  |
//          +---------+---------+
//          0         i2        length2

// newarr:  +---------+---------+
//          | sorted  | empty   |
//          +---------+---------+
//          0         i         length

// Textually
//   0 <= i1 <= length1
//   0 <= i2 <= length2
//   0 <= i <= length
//   The subarray of arr1, indexed from i1 (inclusive) to
//     length1 (exclusive) is sorted.
//     For all j, i1 <= j < length1-1, arr1[j] <= arr1[j+1]
//   The subarray of arr2, indexed from i2 (inclusive) to
//     length2 (exclusive) is sorted.
//   The subarray of newarr from 0 to i is sorted.
//   The subarray of newarr from 0 to i is a permutation of the
//     concatenation of the subarray of arr1 from 0 to i1 and
//     the subarray of arr2 from 0 to i2.
//   Everything in the subarray of newarr from 0 to i is less
//     than or equal to everything in the subarray of arr2 from
//     0 to i2.  (Similar for arr1)

public static int[] merge(int[] arr1, int[] arr2)

Next, we want to generalize.

public <T> static T[] merge(T[] arr1, T[] arr2, Comparator<T> order)

But that means that each time we call merge, we create a new array. And then my colleague criticizes us. Instead of creating a new array, we want to merge into an existing array (that we reuse over and over and over and over).

public <T> static void merge(T[] arr1, T[] arr2, T[] scratch,
Comparator<T> order)

Now, let's start thinking about merge sort.

Initial model

//          +---------+---------+
//          | ??????? | ??????? |
//          +---------+---------+

Sort the two halves

//          +---------+---------+
//          | sorted  | sorted  |
//          +---------+---------+

We want to call merge. But merge only takes two separate arrays. We want two parts of the same array. We're going to generalize: We want parts of two arrays

public <T> static void merge(T[] arr1, int lb1, ub1,
T[] arr2, int lb2, ub2,
T[] scratch,
Comparator<T> order)

Since we're indicating the portion of the two "input" arrays, we might as well indicate the portion of the "result" array. (scratch)

public <T> static void merge(T[] arr1, int lb1, ub1,
T[] arr2, int lb2, ub2,
T[] scratch, int lbs, int ubs,
Comparator<T> order)

We are now ready to think again about merge sort

// array    +---------+---------+
//          | sorted  | sorted  |
//          +---------+---------+

merge(array, 0, mid,
array, mid, length,
scratch, 0, length,
order)

// scratch  +---------+---------+
//          |  sorted           |
//          +---------+---------+

copy back

// array    +---------+---------+
//          |  sorted           |
//          +---------+---------+

Alternate view

// array    +---------+---------+
//          | sorted  | sorted  |
//          +---------+---------+

copy into scratch

// scratch  +---------+---------+
//          | sorted  | sorted  |
//          +---------+---------+

merge(scratch, 0, mid,
scratch, mid, length,
array, 0, length,
order)

// array    +---------+---------+
//          |  sorted           |
//          +---------+---------+

## Why are we analyzing SortedArrayList in HW9?

• We are implementing an alternative.
• We want to see whether the alternative is "better", or at least in what situations it's better.