Algorithms and OOD (CSC 207 2014F) : Outlines

Outline 28: Merge Sort

Held: Wednesday, 15 October 2014

Back to Outline 27 - Quadratic Sorts. On to Outline 29 - Quicksort.

Summary

We consider the merge sort algorithm, our first O(nlogn) sorting algorithm.

Related Pages

Overview

• Lower bounds on sorting.
• Divide and conquer algorithms.
• An introduction to merge sort.
• Analyzing merge sort.

• New lab partners (keep for the rest of this week)
• Office hours
• Normal this week (Th 1:45-3:15; walking 1:15-1:45); No hours Friday
• I no longer know where I am in the morning. Email me to figure it out.
• Review session Thursday at 11am.
• Note: Some of you are good about asking me for help in person. Some of you are good about asking me for help via email. Some do neither.
Students from the past two semesters of CSC 207 keep saying "Once I realized that Sam was actually helpful, I started asking for help." So, I'm going to leave the room for five minutes and let you talk to each other about asking me for help. (You can also make a list of things that I can do better, other than magically finding time for more office hours.)

Upcoming Work

• Exam 1 due Thursday.
• Required epilogue due Friday night (or before).

Interesting Things to Do

• I'm told that there is a cool concert tonight.

Extra Credit

• Thursday Extras, 4:30 p.m. Thursday, October 16, Grad School in CS
• CS Table Friday (Day PDR): Debrief on GHC

Peer Support

• Ajuna's Radio show Mondays at 8pm. Listen to African music

An introduction to merge sort

• There's a theoretical analysis that shows that O(nlogn) comparisons are necessary for a comparison-based sort.
• All of the sorting algorithms we've seen so far are O(n^2).
• Can we do better? (Can we achieve the known lower bound?)
• One strategy for writing faster algorithms is "divide and conquer". When presented with a large problem,
• split it into two parts
• solve each part
• combine the solutions
• The easiest way to split an array: first half and second half.
• We sort the two halves.
• What can we do after sorting the two halves?

Analysis

• Let's let t(n) represent the time mergesort takes on input of size n.
• To sort an array of size n, we must sort two arrays of size n/2, and then merge the two. Merging takes n steps.
• We have a simple recurrence relation: t(n) = 2*t(n/2) + n
• We can explore recurrence relations top-down or bottom up.
• Bottom up
• t(1) = 1
• t(2) = 21 + 2 = 4
• t(4) = 24 + 4 = 12
• t(8) = 212 + 8 = 32
• t(16) = 232 + 16 = 80
• Hmmm ...
• Top down
• t(n) = 2t(n/2) + n
• t(n) = 2(2t(n/4) + n/2) + n // Expand t(n/2)
• t(n) = 22t(n/4) + 2n/2 + n // Distribute
• t(n) = 4t(n/4) + 2n // Simplify
• t(n) = 4(2t(n/8) + n/4) + 2n // Expand t(n/4)
• t(n) = 42t(n/8) + 4n/4 + 2n // Distribute
• t(n) = 8t(n/8) + 3n // Simplify
• t(n) = 8(2t(n/16) + n/8) + 3n // Expand t(n/8)
• t(n) = 82t(n/16) + 8n/8 + 3n // Distribute
• t(n) = 16*t(n/16) + 4n // Distribute
• I see a pattern:
• t(n) = 2^k*t(n/(2^k)) + kn
• If we let 2^x = n, we get
• t(n) = n*t(1) + xn
• If 2^x = n, then x = log2(n)
• So t(n) = n + log2(n) * n
• The second term dominates. t(n) is in O(nlogn)

Lab

• If you haven't already done so, fork and clone the code from https://github.com/Grinnell-CSC207/sorting
• Write a loop invariant for `Utils.merge`.
• Implement `Utils.merge`.
• Extra problems to gather data and/or do bottom-up merge sort.