Algorithms and OOD (CSC 207 2013F) : Outlines
Held: Wednesday, 30 October 2013
We consider the merge sort algorithm, our first O(nlogn) sorting algorithm.
- An introduction to merge sort.
- Analyzing merge sort.
- Today we will do a quick analysis of merge sort and then follow it up
with some lab exercises.
- I'm moving the due time for the electronic and printed versions of the
exam to 10:30 pm on Friday night. Put the printed version under my
- We should discuss the project and the role of Ushahidi in this class.
Clearly, we were less successful at getting the materials ready than
we would have liked this summer, and I was as unsuccessful at getting
them ready during the semester.
- Upcoming extra credit opportunities:
- Study in Budapest Lunch, Today
- Learning from Alumni, Thursday: Jordan Shkolnick '11 (Microsoft)
- CS Table, Friday: Ambient Belonging
- One Grinnell Prize Event next week
An introduction to merge sort
- There's a theoretical analysis that shows that, in the worst case, on
the order of nlogn comparisons are necessary for a comparison-based
sort. (This is a lower bound, usually written Ω(nlogn).)
- All of the sorting algorithms we've seen so far are O(n^2).
- Can we do better? (Can we achieve the known lower bound?)
- One strategy for writing faster algorithms is "divide and conquer".
When presented with a large problem,
- split it into two parts
- solve each part
- combine the solutions
- The easiest way to split an array: first half and second half.
- We sort the two halves.
- What can we do after sorting the two halves?
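The divide-and-conquer steps above can be sketched in Python. This is a minimal sketch under some assumptions, not the version used in the course: the names merge_sort and merge are hypothetical, and the sketch returns a new list instead of sorting the array in place. After sorting the two halves, it merges them by repeatedly taking the smaller front element.

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    result = []
    i = j = 0
    # Repeatedly take the smaller front element; <= keeps equal
    # elements in their original order (a stable merge).
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # At most one of the two lists still has elements, already in order.
    result.extend(left[i:])
    result.extend(right[j:])
    return result


def merge_sort(values):
    """Return a new sorted list with the elements of values."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    # Split into halves, sort each half, then combine the results.
    return merge(merge_sort(values[:mid]), merge_sort(values[mid:]))
```

Each call to merge looks at every element of its two inputs once, which is where the "n steps" for merging comes from.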
Analyzing merge sort
- Let t(n) represent the time merge sort takes on an input of size n.
- To sort an array of size n, we must sort two arrays of size n/2, and
then merge the two. Merging takes n steps.
- We have a simple recurrence relation: t(n) = 2*t(n/2) + n
- We can explore recurrence relations top down or bottom up.
- Bottom up
- t(1) = 1
- t(2) = 2*1 + 2 = 4
- t(4) = 2*4 + 4 = 12
- t(8) = 2*12 + 8 = 32
- t(16) = 2*32 + 16 = 80
- Hmmm ...
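The bottom-up values can be reproduced mechanically. The following is a small sketch that evaluates the recurrence directly, assuming t(1) = 1 and that n is a power of two; the function name t simply mirrors the notation in these notes.

```python
def t(n):
    """Evaluate t(n) = 2*t(n/2) + n with t(1) = 1, for n a power of two."""
    if n == 1:
        return 1
    return 2 * t(n // 2) + n


for n in [1, 2, 4, 8, 16]:
    print(n, t(n))
# prints:
# 1 1
# 2 4
# 4 12
# 8 32
# 16 80
```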
- Top down
- t(n) = 2*t(n/2) + n
- t(n) = 2*(2*t(n/4) + n/2) + n // Expand t(n/2)
- t(n) = 2*2*t(n/4) + 2*n/2 + n // Distribute
- t(n) = 4*t(n/4) + 2n // Simplify
- t(n) = 4*(2*t(n/8) + n/4) + 2n // Expand t(n/4)
- t(n) = 4*2*t(n/8) + 4*n/4 + 2n // Distribute
- t(n) = 8*t(n/8) + 3n // Simplify
- t(n) = 8*(2*t(n/16) + n/8) + 3n // Expand t(n/8)
- t(n) = 8*2*t(n/16) + 8*n/8 + 3n // Distribute
- t(n) = 16*t(n/16) + 4n // Simplify
- I see a pattern:
- t(n) = 2^k*t(n/(2^k)) + kn
- If we set k = x, where 2^x = n, we get
- t(n) = n*t(1) + xn
- If 2^x = n, then x = log2(n)
- So, since t(1) = 1, t(n) = n + log2(n) * n
- The second term dominates. t(n) is in O(nlogn)
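As a sanity check on the pattern, the closed form can be compared against the recurrence for small powers of two. This is only a sketch: closed_form is a hypothetical helper name, t(1) = 1 is assumed, and the check covers powers of two only, which is the case the derivation above handles.

```python
def t(n):
    """Evaluate t(n) = 2*t(n/2) + n with t(1) = 1, for n a power of two."""
    if n == 1:
        return 1
    return 2 * t(n // 2) + n


def closed_form(n):
    """Return n + log2(n) * n, for n a power of two."""
    x = n.bit_length() - 1  # exact log2(n) when n is a power of two
    return n + x * n


# Compare the recurrence and the closed form for n = 1, 2, 4, ..., 1024.
for x in range(11):
    n = 2 ** x
    assert t(n) == closed_form(n)
```

A check like this does not prove the pattern, but it catches algebra slips in the expansion.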