Introduction to Statistics (MAT/SST 115.03 2008S)

R notes for Activity 22-2: Hypothetical Commuting Times


This is one of those fun times in which our data set combines a number of essentially independent columns into a single data frame. Since R pads the empty cells in the data frame with NA values, our analyses may be slightly more complicated.
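To see where the NA values come from, here is a small sketch with made-up numbers (not the activity's data; the names x1, x2, X1, X2, and Padded are hypothetical). Two samples of different sizes are padded to a common length so that they fit in one data frame, and a single NA is then enough to make an unadjusted mean come out as NA.

```r
# Hypothetical samples of different sizes (made-up values, not the activity's data)
x1 = c(28, 30, 32)
x2 = c(25, 29, 31, 33, 35)

# Indexing past the end of a vector yields NA, which pads the shorter sample
n = max(length(x1), length(x2))
Padded = data.frame(X1 = x1[1:n], X2 = x2[1:n])
Padded

# Without na.rm, the NA padding makes the mean NA
mean(Padded$X1)

# With na.rm=TRUE, the padding is dropped before averaging
mean(Padded$X1, na.rm=TRUE)
```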

Let's start by loading the data. There's little enough data that we can look at all of it.

CommuteTimes = read.csv("/home/rebelsky/Stats115/Data/HypoCommute.csv")
CommuteTimes

The columns are named A1 (for Alex's Route 1), A2 (for Alex's Route 2), B1 (for Barb's Route 1), and so on and so forth.

22-2 c. Computing Alex's route statistics

You should be able to read the sample size from the table. To get the sample mean and standard deviation, we can use mean and sd, but we need to tell those functions to ignore the NA values. (Having to handle the NA values explicitly is one of the disadvantages of combining the columns.)

mean(CommuteTimes$A1, na.rm=TRUE)
sd(CommuteTimes$A1, na.rm=TRUE)
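If you would rather have R count the sample size than read it off the table, one approach is to count the entries that are not NA. The sketch below uses a made-up vector (the name times and its values are hypothetical) standing in for a padded column such as CommuteTimes$A1.

```r
# Made-up stand-in for a padded column such as CommuteTimes$A1
times = c(28, 30, 32, NA, NA)

# length counts the NA padding too, so it overstates the sample size
length(times)

# is.na marks the padding; counting everything else gives the sample size
n = sum(!is.na(times))
n
```

With the real data, the same expression would be applied to the column itself, as in sum(!is.na(CommuteTimes$A1)).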

22-2 d. Conducting the significance test

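R makes two-sample t-tests very easy to compute. Just call t.test with the two samples. Unlike mean and sd, t.test drops the NA values on its own. By default it does not assume that the two groups have equal variances, and it reports a two-sided test.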

t.test(CommuteTimes$A1,CommuteTimes$A2)
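The printed output of t.test contains everything you need, but it can also be handy to pull out individual pieces. The sketch below uses made-up samples (route1, route2, and result are hypothetical names, and the values are not the activity's data) and saves the test result so that its components can be inspected one at a time.

```r
# Made-up samples, the first padded with NA (not the activity's data)
route1 = c(28, 30, 32, 29, 31, NA, NA)
route2 = c(25, 35, 20, 40, 30, 27, 40)

# Save the result instead of just printing it
result = t.test(route1, route2)

result$estimate    # the two sample means (NA values already dropped)
result$statistic   # the t statistic
result$parameter   # the degrees of freedom used for the test
result$p.value     # the two-sided p-value
```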

22-2 f. Confidence intervals

We repeat the t-test, telling it to use a 90% confidence level rather than the default 95%.

t.test(CommuteTimes$A1,CommuteTimes$A2, conf.level=.90)
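The interval that t.test reports is for the difference between the two population means (first sample minus second sample). As a rough illustration with made-up samples (route1, route2, ci95, and ci90 are hypothetical names, and the values are not the activity's data), the sketch below extracts the interval at two confidence levels and compares their widths.

```r
# Made-up samples, the first padded with NA (not the activity's data)
route1 = c(28, 30, 32, 29, 31, NA, NA)
route2 = c(25, 35, 20, 40, 30, 27, 40)

# The conf.int component holds the lower and upper endpoints
ci95 = t.test(route1, route2)$conf.int                   # default 95% level
ci90 = t.test(route1, route2, conf.level=.90)$conf.int   # 90% level

ci95
ci90

# Lower confidence gives a narrower interval
diff(ci90) < diff(ci95)
```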

22-2 k. More Computations

You should be able to figure out how to do these computations by revisiting the Alex examples from above.

Creative Commons License

Samuel A. Rebelsky, rebelsky@grinnell.edu

Copyright (c) 2007-8 Samuel A. Rebelsky.

This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.