Introduction to Statistics (MAT/SST 115.03 2008S)

R Notes for Topic 9: Measures of Spread


R notes for Activity 9-3: Value of Statistics

Although the book tells you that the data are stored in a single file, I've found it easier to split them into five files: ClassF, ClassG, ClassH, ClassI, and ClassJ. You should load each one separately. For example,

ClassF = read.csv("/home/rebelsky/Stats115/Data/ClassF.csv")
ClassG = read.csv("/home/rebelsky/Stats115/Data/ClassG.csv")
ClassH = read.csv("/home/rebelsky/Stats115/Data/ClassH.csv")
ClassI = read.csv("/home/rebelsky/Stats115/Data/ClassI.csv")
ClassJ = read.csv("/home/rebelsky/Stats115/Data/ClassJ.csv")

Each of these CSV files contains a single column, titled Ratings. Hence, to make a histogram for one of them, you would write something like the following.

hist(ClassF$Ratings)

Of course, the book doesn't tell you to make your own histograms, but you might find it useful to do so.
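If you do make histograms, you may find them easier to compare when they share the same bins and the same horizontal axis. Here is a rough sketch. The two classes and their ratings are made up for illustration (they are not the book's data), and I am assuming that ratings fall between 0 and 10.

```r
# Hypothetical ratings for two made-up classes (NOT the ClassF-ClassJ data)
classes = list(
  A = c(4, 5, 5, 5, 6, 5, 4, 6, 5, 5),
  B = c(1, 9, 1, 9, 5, 1, 9, 5, 1, 9)
)

par(mfrow = c(1, 2))                 # draw two plots side by side
for (name in names(classes)) {
  # Same breaks and same x-axis limits, so the shapes are comparable.
  # The 0-to-10 scale is an assumption about the ratings.
  hist(classes[[name]], breaks = 0:10, xlim = c(0, 10),
       main = name, xlab = "Ratings")
}
par(mfrow = c(1, 1))                 # go back to one plot per window
```

With the real data, you would put ClassF$Ratings, ClassG$Ratings, and so on in the list and adjust mfrow to fit five plots.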

What the book does is ask you to compute a variety of numbers, including range, interquartile range, and standard deviation. R's range function gives you the min and the max, rather than the difference between the two. To compute the difference between the two, you need to subtract the min from the max.

max(ClassF$Ratings) - min(ClassF$Ratings)
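To see the difference between R's range and the range the book means, here is a small example. The ratings vector is made up for illustration. The diff function subtracts neighboring values, so applying it to the two numbers that range returns gives the same answer as max minus min.

```r
# Hypothetical ratings, made up for illustration (not the ClassF data)
ratings = c(1, 3, 5, 5, 7, 9)

range(ratings)                  # the min and the max: 1 9
max(ratings) - min(ratings)     # the range as a single number: 8
diff(range(ratings))            # the same thing: 8
```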

You compute interquartile range with IQR and standard deviation with sd. (And no, I do not know why they use different capitalization in different places.)

IQR(ClassF$Ratings)
sd(ClassF$Ratings)
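If you want to check what these two functions are doing, the following sketch computes each value the long way. The ratings are made up for illustration. Note that sd divides by n - 1 (the sample standard deviation). Note also that R's quantile function uses its own default rule for locating quartiles, so IQR may not exactly match quartiles you find by hand with a different rule.

```r
# Hypothetical ratings, made up for illustration (not the ClassF data)
ratings = c(2, 4, 4, 4, 5, 5, 7, 9)

# Interquartile range: upper quartile minus lower quartile
q = quantile(ratings, c(0.25, 0.75))
byHandIQR = unname(q[2] - q[1])
byHandIQR
IQR(ratings)                    # should agree with byHandIQR

# Standard deviation: square root of the sum of squared
# deviations from the mean, divided by n - 1
n = length(ratings)
byHandSD = sqrt(sum((ratings - mean(ratings))^2) / (n - 1))
byHandSD
sd(ratings)                     # should agree with byHandSD
```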

Problems i and j ask you to create a hypothetical example. Use something like the following (replacing the 0's by other numbers).

iHypotheticals = c(0,0,0,0,0,0,0,0,0,0)
sd(iHypotheticals)
jHypotheticals = c(0,0,0,0,0,0,0,0,0,0)
sd(jHypotheticals)

R notes for Activity 9-6: Marriage Ages

The data for this exercise are stored in MarriageAges.csv.

MarriageAges = read.csv("/home/rebelsky/Stats115/Data/MarriageAges.csv")

The column names in the frame are Couple, HusbandAge, WifeAge, and Difference. We might, for example, compute the median husband age with

median(MarriageAges$HusbandAge)

You may once again find it useful to ask for summaries of the different variables.

summary(MarriageAges$HusbandAge)

We can get summaries of all the variables with

summary(MarriageAges)

The problem also asks you to compute standard deviations and interquartile ranges. You use the sd function to compute standard deviations. You use the IQR function to compute interquartile ranges.

sd(MarriageAges$HusbandAge)
IQR(MarriageAges$HusbandAge)
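If you get tired of typing the same command for each variable, sapply will apply a function to every column you name. The sketch below uses a small stand-in data frame with made-up values, not the real MarriageAges data. I am also assuming that Difference is the husband's age minus the wife's age.

```r
# Hypothetical stand-in for MarriageAges; every value here is made up
Ages = data.frame(
  Couple     = 1:5,
  HusbandAge = c(25, 30, 41, 52, 33),
  WifeAge    = c(22, 31, 38, 50, 29)
)
# Assumption: Difference = husband's age - wife's age
Ages$Difference = Ages$HusbandAge - Ages$WifeAge

# Leave out Couple, which is just a label
vars = c("HusbandAge", "WifeAge", "Difference")

# One row of standard deviations, one row of interquartile ranges
spreads = rbind(
  SD  = sapply(Ages[vars], sd),
  IQR = sapply(Ages[vars], IQR)
)
spreads
```

With the real data, you would replace Ages with MarriageAges.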


Samuel A. Rebelsky, rebelsky@grinnell.edu

Copyright (c) 2007-8 Samuel A. Rebelsky.

This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.