Introduction to Statistics (MAT/SST 115.03 2008S)
Held: Friday, 15 February 2008
Summary: We consider more ways to summarize numeric data.
Notes:
I'm sick and won't be in class. Please take care of yourselves.
Overview:
We've seen at least three measures of the spread of a distribution.
Some of you observed that the book and R give different answers for some IQR computations. The IQR can be computed in slightly different ways, and the choice of method can change the result. (Different stats packages take different approaches.)
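To see how much the method matters, here is a small sketch in Python (not R, and not necessarily the rule our book or R uses). It assumes Python 3.8 or later, where the standard library's statistics.quantiles function offers two interpolation rules, "exclusive" and "inclusive". The data are the example distribution used in this outline.

```python
from statistics import quantiles

# The example distribution from this outline (18 values).
data = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]

# n=4 asks for the three cut points that split the data into quarters:
# lower quartile, median, upper quartile.
q1_exc, _, q3_exc = quantiles(data, n=4, method="exclusive")
q1_inc, _, q3_inc = quantiles(data, n=4, method="inclusive")

iqr_exclusive = q3_exc - q1_exc
iqr_inclusive = q3_inc - q1_inc

print("exclusive:", q1_exc, q3_exc, iqr_exclusive)
print("inclusive:", q1_inc, q3_inc, iqr_inclusive)
```

The two rules give different IQRs for the same data, which is the kind of disagreement some of you noticed.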
Let's use the distribution 1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9 as our example.
Officially, the lower quartile is a number such that 1/4 of the values are smaller and 3/4 of the values are larger. (It's not always possible to find such a value; for example, if all the values are the same, no number splits them that way.) Similarly, the upper quartile is a number such that 3/4 of the values are smaller and 1/4 of the values are larger.
We can start by finding those values directly.
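One way to try this directly is to pick a candidate number and count what fraction of the values fall below and above it. The sketch below does that in Python; the helper name share_below_and_above and the candidate values are my own choices for illustration.

```python
from fractions import Fraction

# The example distribution from this outline (18 values).
data = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]

def share_below_and_above(values, q):
    """Return the fraction of values strictly below q and strictly above q."""
    n = len(values)
    below = sum(1 for v in values if v < q)
    above = sum(1 for v in values if v > q)
    return Fraction(below, n), Fraction(above, n)

# Try a few candidates for the lower quartile.
for q in (2.5, 3, 3.5):
    below, above = share_below_and_above(data, q)
    print(q, "below:", below, "above:", above)
```

With 18 values, no candidate can leave exactly 1/4 below, because 18/4 is not a whole number. So for this example the official definition can only be met approximately.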
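Our book tells us a different strategy for computing the IQR. First, compute the median and then compute the median of each half. However, our book is vague on what you do when the median is repeated. In our example the median is 5, and there are two 5s. If we put one 5 in each half, the lower half (1,1,2,2,3,3,4,4,5) has median 3 and the upper half (5,6,6,7,7,8,8,9,9) has median 7, so the IQR is 4.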
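The median-of-halves strategy can be sketched in Python as follows. This is one reading of the strategy, not the book's exact rule: it assumes the sorted data are split by position, so repeated median values can land in different halves, and that for an odd number of values the middle value is left out of both halves.

```python
from statistics import median

# The example distribution from this outline, sorted.
data = sorted([1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9])

# Split by position: first half and last half of the sorted values.
# (Assumption: with an odd count, the middle value goes in neither half.)
half = len(data) // 2
lower = data[:half]
upper = data[-half:]

q1 = median(lower)
q3 = median(upper)
iqr = q3 - q1

print("lower half:", lower, "median:", q1)
print("upper half:", upper, "median:", q3)
print("IQR:", iqr)
```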
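So, how did our book end up with 4.5 as the answer? Here's my guess. We did an odd thing in the analysis above: We put one 5 in each half. Arguably, both should go on the same side. If both 5s go in the lower half, the quartiles are 3 and 7.5; if both go in the upper half, they are 2.5 and 7. Either way, the IQR is 4.5.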
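Here is a sketch of that guess in Python. It is only a guess at what the book does, and it tries both choices: every value equal to the median goes in the lower half (variant A) or in the upper half (variant B). The variable names are mine.

```python
from statistics import median

# The example distribution from this outline, sorted.
data = sorted([1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9])
m = median(data)

# Variant A: values equal to the median all go in the lower half.
lower_a = [x for x in data if x <= m]
upper_a = [x for x in data if x > m]
iqr_a = median(upper_a) - median(lower_a)

# Variant B: values equal to the median all go in the upper half.
lower_b = [x for x in data if x < m]
upper_b = [x for x in data if x >= m]
iqr_b = median(upper_b) - median(lower_b)

print("both 5s in lower half:", median(lower_a), median(upper_a), iqr_a)
print("both 5s in upper half:", median(lower_b), median(upper_b), iqr_b)
```

For this example, both variants give the same IQR, even though the quartiles themselves differ.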
