Class 15: Algorithm Analysis

Back to Arrays and Sequences. On to Recursion.

Held Wednesday, February 17

Notes

• There's still a Cool Statistics Talk tomorrow at 4:15/4:30. Be there, or never know the joys of Bayesian statistics.
• Next Monday at noon, there's a CS brown bag lunch discussing summer opportunities for research in CS at Grinnell. It's unlikely that we'll be able to take students out of CS152, but come and hear what you might be able to do the following summer.
• Are there any final questions on assignment 4 or assignment 5?
• Expect yet another revision to the syllabus, as I add a lab Friday on today's stuff. (We'll probably do the lab next Monday.)
• Don't forget to turn in the lab evaluation form. One of you left one under my door (which means you'll get no credit), and didn't bother to fill in the second side.

Assignment 3

I believe that I now have assignment 3 from everyone who intended to do that assignment. We'll spend a little bit of time discussing the assignment, some common problems, and a better technique for the whole assignment.

What is that better technique? We can take advantage of inheritance. We'll design a general `Drawable` class that deals with size and color, and just have `Circle` and `Square` extend it. Now, `DrawingAssistant` need only have one copy of each method for `Drawable`, rather than two (or more).
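A minimal sketch of that design, assuming details these notes do not give: the `size` and `color` representations, the accessor names, the abstract `shapeName` method, and the `describe` method in `DrawingAssistant` are all hypothetical, chosen only to show the shape of the hierarchy.

```java
// Hypothetical sketch: field and method names are illustrative,
// not taken from the actual assignment.
abstract class Drawable {
    private final int size;      // assumed: a single integer size
    private final String color;  // assumed: color stored as a name

    Drawable(int size, String color) {
        this.size = size;
        this.color = color;
    }

    int getSize() { return size; }
    String getColor() { return color; }

    // Each shape supplies only what differs from the other shapes.
    abstract String shapeName();
}

class Circle extends Drawable {
    Circle(int size, String color) { super(size, color); }
    @Override String shapeName() { return "circle"; }
}

class Square extends Drawable {
    Square(int size, String color) { super(size, color); }
    @Override String shapeName() { return "square"; }
}

class DrawingAssistant {
    // One method handles every Drawable, so no per-shape copies are needed.
    static String describe(Drawable d) {
        return d.getColor() + " " + d.shapeName() + " of size " + d.getSize();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Circle(3, "red")));
        System.out.println(describe(new Square(5, "blue")));
    }
}
```

Adding a third shape under this design would mean writing one more small subclass; `DrawingAssistant` would not need to change.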

Algorithm Analysis

• As you may have noted, there are often multiple algorithms one can use to solve the same problem.
• In searching an ordered list, one can use linear search, binary search, or "look randomly" (as well as many others). [Don't worry if you don't know any of these.]
• In finding the minimum element of a list, you can step through the list, keeping track of the current minimum. You could also sort the list and grab the first element.
• You can come up with your own variants.
• How do we choose which algorithm is the best?
• The fastest/most efficient algorithm.
• The one that uses the fewest resources.
• The clearest.
• The shortest.
• The easiest to write.
• The most general.
• ...
• Frequently, we look at the "speed" (how long the algorithm takes to run).
• What is the best way to represent the running time of an algorithm?
• Is there an exact number we can provide? Surprisingly, no.
• Different inputs lead to different running times. For example, if there are conditionals in the algorithm (as there are in a typical minimum algorithm), different instructions will be executed depending on the input.
• Not all operations take the same time. For example, addition is typically quicker than multiplication, and integer addition is typically quicker than floating-point addition.
• The same operation may take different times on different machines.
• The same operation may appear to take different times on the same machine, particularly if other things are happening on the same machine.
• Many things are happening behind the scenes that we can't predict (e.g., caching).
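The two minimum-finding strategies mentioned above can be sketched as follows. This is only an illustration: the class and method names are made up for this example, and both methods assume a non-empty array of `int` values.

```java
import java.util.Arrays;

class MinimumDemo {
    // Step through the array once, remembering the smallest value seen so far.
    // Assumes the array is non-empty.
    static int minByScan(int[] values) {
        int min = values[0];
        for (int i = 1; i < values.length; i++) {
            if (values[i] < min) {   // whether this assignment runs depends on the input
                min = values[i];
            }
        }
        return min;
    }

    // Sort a copy of the array and take its first element.
    // Assumes the array is non-empty; the caller's array is left untouched.
    static int minBySort(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.sort(copy);
        return copy[0];
    }

    public static void main(String[] args) {
        int[] data = {7, 3, 9, 1, 4};
        System.out.println(minByScan(data));
        System.out.println(minBySort(data));
    }
}
```

Both methods return the same answer, but the sorting version orders every element just to read one of them. The conditional inside `minByScan` is also an instance of the point about inputs: how often the assignment executes depends on the order of the data.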

Asymptotic Analysis

• Because of these problems in giving a precise running time, computer scientists developed a technique often called asymptotic analysis. In asymptotic analysis of algorithms, one describes how an algorithm behaves as the size of its input grows, without delving into precise details.
• There are many issues to consider in analyzing the asymptotic behavior of a program. One particularly useful metric is an upper bound on the running time of an algorithm. We call this the "Big-O" of an algorithm.
• Big-O is defined somewhat mathematically, as a relationship between functions.
• f(n) is in O(g(n)) iff there exist a number n0 and a number d > 0 such that, for all n > n0, abs(f(n)) <= abs(d*g(n)).
• What does this say? It says that after a certain point (n0), f(n) is bounded above by a constant (d) times g(n).
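• The constant (d) helps accommodate variation that the algorithm itself does not control, such as differences between machines or in the cost of individual operations.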
• We don't usually identify the d precisely.
• For algorithms,
• n is the "size" of the input (e.g., the number of items in a list or vector to be manipulated).
• f(n) is the running time of the algorithm.
• Some common Big-O bounds
• An algorithm that is O(1) takes constant time. That is, the running time is independent of the input. Getting the size of an array should be an O(1) algorithm.
• An algorithm that is O(n) takes time linear in the size of the input. That is, we basically do constant work for each "element" of the input. Finding the smallest element in a list is often an O(n) algorithm.
• An algorithm that is O(log_2(n)) takes logarithmic time. While the running time is dependent on the size of the input, it is clear that not every element of the input is processed. Many such algorithms involve the strategy of "divide and conquer".
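To make the difference between linear and logarithmic growth concrete, here is a small sketch that counts how many elements linear search and binary search examine in a sorted array. The class and method names are hypothetical, and counting "elements examined" is only one reasonable way to measure steps.

```java
class SearchSteps {
    // Linear search: returns the number of elements examined
    // before the target is found (or the whole array is exhausted).
    static int linearSteps(int[] sorted, int target) {
        int steps = 0;
        for (int value : sorted) {
            steps++;
            if (value == target) {
                break;
            }
        }
        return steps;
    }

    // Binary search: returns the number of elements examined.
    // Each step discards about half of the remaining range.
    static int binarySteps(int[] sorted, int target) {
        int steps = 0;
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            steps++;
            if (sorted[mid] == target) {
                break;
            } else if (sorted[mid] < target) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 1000000; n *= 10) {
            int[] data = new int[n];       // data.length is a constant-time lookup
            for (int i = 0; i < n; i++) {
                data[i] = i;               // already in increasing order
            }
            int target = n - 1;            // last element: worst case for linear search
            System.out.println("n=" + n
                + " linear=" + linearSteps(data, target)
                + " binary=" + binarySteps(data, target));
        }
    }
}
```

When the target is the last element, the linear count equals n, while the binary count grows by only a few steps each time n is multiplied by ten.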

Eliminating Constants

• One of the nice things about asymptotic analysis is that it makes constants "unimportant" because they can be "hidden" in the d.
• If f(n) is 100*n seconds and g(n) is 0.5*n seconds, then f(n) is O(g(n)) [let d be 200] and g(n) is O(f(n)) [let d be 1].
• If f(n) is 100*n seconds and g(n) is n*n seconds, then f(n) is O(g(n)) [let n0 be 100 and d be 1; let n0 be 1 and d be 100; ...].
• However, g(n) is not O(f(n)). Why not? Suppose there were an n0 and a d. Consider what happens for n = 101d. d*f(n) = d*100*101*d = d*d*100*101. However, g(n) = d*d*101*101, which is larger, so the bound fails at that n. If n0 is greater than 101d, we still have this problem: every n > 100d gives n*n > d*100*n, so some n beyond n0 always breaks the bound.
• Since constants can be eliminated, we normally don't write them.
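The f(n) = 100*n and g(n) = n*n example can be checked numerically. The sketch below uses hypothetical helper names, and a finite check like this only illustrates the definition; it does not prove anything about all n.

```java
class BigOCheck {
    // f(n) = 100*n and g(n) = n*n, as in the example above.
    static long f(long n) { return 100 * n; }
    static long g(long n) { return n * n; }

    // Checks f(n) <= d*g(n) for every n with n0 < n <= limit.
    // A finite range can only illustrate the bound, not prove it.
    static boolean fBoundedByG(long n0, long d, long limit) {
        for (long n = n0 + 1; n <= limit; n++) {
            if (f(n) > d * g(n)) {
                return false;
            }
        }
        return true;
    }

    // For a proposed constant d, n = 101*d is a point where g(n) > d*f(n),
    // so d cannot serve as the constant showing g(n) is O(f(n)).
    static boolean witnessBreaksBound(long d) {
        long n = 101 * d;
        return g(n) > d * f(n);
    }

    public static void main(String[] args) {
        System.out.println(fBoundedByG(100, 1, 100000));  // n0 = 100, d = 1
        System.out.println(fBoundedByG(1, 100, 100000));  // n0 = 1, d = 100
        for (long d = 1; d <= 1000; d *= 10) {
            System.out.println("d=" + d + " fails at n=" + (101 * d)
                + ": " + witnessBreaksBound(d));
        }
    }
}
```

Both choices of n0 and d from the example pass the check, while every d tried in the loop is defeated by n = 101*d.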

Asymptotic Analysis in Practice

• We now have a theoretical grounding for asymptotic analysis. How do we do it in practice?
• At this point in your career, it's often best to "count" the steps in an algorithm and then add them up. After you've taken combinatorics, you can use recurrence relations.
• Over the next few days, we'll look at a number of examples. Some to start with:
• Finding the smallest/largest element in an array of integers.
• Finding the average of all the elements in an array of integers.
• Putting the largest element in an array at the end of the array.
• Putting the largest element in an array at the end of the array if we're only allowed to swap adjacent elements.
• Computing the nth Fibonacci number.
• ...
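As a sketch of what counting the steps might look like for two of these examples, the code below instruments a single pass that moves the largest element to the end using only swaps of neighboring elements, and computes Fibonacci numbers with a loop. The names are hypothetical, the choice to count comparisons is an assumption about which steps matter, and the convention fib(0) = 0, fib(1) = 1 is also assumed.

```java
class StepCounting {
    // Moves the largest element to the end using only swaps of neighboring
    // elements, and returns the number of comparisons performed.
    static int bubbleLargestToEnd(int[] values) {
        int comparisons = 0;
        for (int i = 0; i + 1 < values.length; i++) {
            comparisons++;
            if (values[i] > values[i + 1]) {
                int tmp = values[i];
                values[i] = values[i + 1];
                values[i + 1] = tmp;
            }
        }
        return comparisons;
    }

    // Computes the nth Fibonacci number iteratively,
    // assuming fib(0) = 0 and fib(1) = 1.
    // For n >= 2 the loop body runs n - 1 times.
    static long fibonacci(int n) {
        if (n < 2) {
            return n;
        }
        long previous = 0;
        long current = 1;
        for (int i = 2; i <= n; i++) {
            long next = previous + current;
            previous = current;
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 8, 1};
        int comparisons = bubbleLargestToEnd(data);
        System.out.println("comparisons=" + comparisons
            + " last=" + data[data.length - 1]);
        System.out.println("fib(10)=" + fibonacci(10));
    }
}
```

For an array of n elements the pass makes n - 1 comparisons and at most n - 1 swaps, and the Fibonacci loop does a constant amount of work n - 1 times, so counting suggests both are O(n).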

History

• Created Monday, January 11, 1999.
• Added short summary on Friday, January 22, 1999.
• Filled in the details on Wednesday, February 17, 1999. These details were based, in part, on outline 18 from CSC152 98S.


Disclaimer: Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.