Introduction to Statistics (MAT/SST 115.03 2008S)

R notes for Activity 19-3


19-3 c: Visualizing Sample Data

You can read and preview the data with

BodyTemps = read.csv("/home/rebelsky/Stats115/Data/BodyTemps.csv")
summary(BodyTemps)
head(BodyTemps)

You'll note that the data have two columns: BodyTemp and Sex. We just want the first column, which we will select with BodyTemps$BodyTemp.
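
If the $ notation is new to you, here is a minimal sketch using a tiny made-up data frame. The name Toy and its values are invented for illustration and are not taken from BodyTemps.csv.

```r
# Hypothetical stand-in for BodyTemps; the values are invented.
Toy = data.frame(
  BodyTemp = c(98.1, 98.6, 97.9, 98.4),
  Sex = c("F", "M", "F", "M")
)
# The $ operator extracts a single column as a plain vector.
temps = Toy$BodyTemp
temps
mean(temps)
```

Anything that expects a vector of numbers (hist, mean, t.test, and so on) can take Toy$BodyTemp directly.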

We can build a quick histogram of those data with the following command. (Since R and Minitab make different decisions about where to put the interval boundaries, this may look a bit different from the sample answer.)

hist(BodyTemps$BodyTemp)

But we should certainly add a title and label the x axis.

hist(BodyTemps$BodyTemp,
  main="Sample Body Temperatures",
  xlab="Body Temperature in Degrees F"
)
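
If you want to control the intervals yourself, rather than accept R's choice, hist takes a breaks argument. The following is only a sketch: it uses simulated temperatures in place of BodyTemps$BodyTemp (an assumption, so that it runs without the data file), and the half-degree spacing is an arbitrary choice.

```r
# Simulated temperatures stand in for BodyTemps$BodyTemp (an assumption).
set.seed(115)
temps = round(rnorm(130, mean=98.25, sd=0.73), 1)
# Explicit boundaries every half degree, wide enough to cover all the data.
breaks = seq(floor(min(temps)), ceiling(max(temps)), by=0.5)
# plot=FALSE computes the histogram without drawing it.
h = hist(temps, breaks=breaks, plot=FALSE)
h$breaks
h$counts
```

To draw the histogram instead of just inspecting the counts, drop plot=FALSE and add the main and xlab arguments as above.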

If we'd rather do a dot plot, we can use

library(BHH2, lib.loc="/home/rebelsky/Stats115/Packages")
dotPlot(BodyTemps$BodyTemp,
  main="Sample Body Temperatures",
  xlab="Body Temperature in Degrees F"
)

We can create the normal probability plot with

qqnorm(BodyTemps$BodyTemp, datax=TRUE, ylab="Body Temperature in Degrees F")
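
If you would like a number to go with the picture, qqnorm can return the plotted coordinates instead of drawing them. The sketch below uses simulated data in place of BodyTemps$BodyTemp (an assumption) and computes the correlation between the data and the theoretical normal quantiles. A value near 1 is consistent with a roughly straight plot, but it is only an informal check.

```r
# Simulated temperatures stand in for BodyTemps$BodyTemp (an assumption).
set.seed(115)
temps = rnorm(130, mean=98.25, sd=0.73)
# plot.it=FALSE returns the coordinates without drawing anything.
qq = qqnorm(temps, plot.it=FALSE)
# qq$x holds the theoretical normal quantiles, qq$y the data values.
cor(qq$x, qq$y)
```

To add a reference line to the plot itself, you can call qqline(BodyTemps$BodyTemp, datax=TRUE) right after the qqnorm command.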

19-3 f: Computing Confidence Intervals

Since you used some form of technology to compute these confidence intervals in activity 19-1, I'm not sure why they're asking you to do so again. But, hey, let's cooperate. One technique is to tell R the formula. We'll start by recording the values we know.

x_bar = 98.249
s = .733
n = 130

We can use qt to compute t*. Unlike the table on p. 625, qt computes the appropriate t value given the area to the left of that t. Hence, for a 95% confidence interval, we use .975. (Why .975? Because there's 0.025 to the right, and therefore 0.975 to the left.) As you should recall from the reading, the degrees of freedom should be n-1.

t_star = qt(0.975, n-1)
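
As a quick check on that reasoning, here is a small sketch that uses pt (which goes the other way, from a t value to the area on its left) and then computes the left-hand areas for all three confidence levels. The names conf_levels and left_areas are just illustrative.

```r
n = 130
t_star = qt(0.975, n-1)
# pt undoes qt: it gives the area to the left of a t value.
pt(t_star, n-1)
# Area to the left of t* for 90%, 95%, and 99% confidence.
conf_levels = c(0.90, 0.95, 0.99)
left_areas = 1 - (1 - conf_levels)/2
left_areas
# qt accepts a vector, so we get all three t* values at once.
qt(left_areas, n-1)
```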

Now we're ready to compute the lower and upper bounds of the confidence interval using the standard formula.

ci_lower = x_bar - t_star*s/sqrt(n)
ci_upper = x_bar + t_star*s/sqrt(n)
c(ci_lower, ci_upper)
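
If you expect to repeat this computation, you could wrap the formula in a function. This is just a sketch: ci_from_summary is a made-up name, not something built into R.

```r
# ci_from_summary is a hypothetical helper, not a built-in R function.
ci_from_summary = function(x_bar, s, n, conf_level=0.95) {
  # Area to the left of t* for the requested confidence level.
  t_star = qt(1 - (1 - conf_level)/2, n-1)
  margin = t_star*s/sqrt(n)
  c(x_bar - margin, x_bar + margin)
}
ci_95 = ci_from_summary(98.249, .733, 130)
ci_95
```

The other two intervals then need only a different conf_level, as in ci_from_summary(98.249, .733, 130, conf_level=0.90).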

Of course, that's a lot of work. Hence, we might want to use the built-in t.test function, which provides not just the confidence interval, but also a lot of other information. However, t.test works from the original data set, rather than from a mean and standard deviation already computed from that data set. (If you only know the mean, standard deviation, and sample size, you'll need to use the technique above.) When we call t.test, we also provide a hypothesized population mean (mu) and a desired confidence level (conf.level). The confidence interval itself does not depend on mu, but t.test also carries out a significance test, which does. (If you leave these out, R uses mu=0 and conf.level=0.95.)

t.test(BodyTemps$BodyTemp, mu=98.6, conf.level=0.95)

For the other two confidence intervals, we would use

t.test(BodyTemps$BodyTemp, mu=98.6, conf.level=0.90)
t.test(BodyTemps$BodyTemp, mu=98.6, conf.level=0.99)
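
If you only want the interval, and not the rest of the report, you can save the result of t.test and pull out its conf.int component. The sketch below uses simulated data in place of BodyTemps$BodyTemp (an assumption, so that it runs without the data file) and compares the result to the formula from earlier.

```r
# Simulated temperatures stand in for BodyTemps$BodyTemp (an assumption).
set.seed(115)
temps = rnorm(130, mean=98.25, sd=0.73)
result = t.test(temps, mu=98.6, conf.level=0.95)
# The confidence interval is stored in the conf.int component.
result$conf.int
# The same interval, computed from the formula.
n = length(temps)
manual = mean(temps) + c(-1, 1)*qt(0.975, n-1)*sd(temps)/sqrt(n)
manual
```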

19-3 j: Computing Another CI

Since you don't have the original data set, you cannot use the t.test function. Hence, you must provide R with the formulae.

x_bar = 98.249
s = .733
n = 13
t_star = qt(0.975, n-1)
ci_lower = x_bar - t_star*s/sqrt(n)
ci_upper = x_bar + t_star*s/sqrt(n)
c(ci_lower, ci_upper)
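
To see how much the sample size matters, we can compute both intervals side by side. This sketch relies on the fact that qt and sqrt work on vectors. The names sizes and margins are just illustrative.

```r
x_bar = 98.249
s = .733
# The two sample sizes used in these notes.
sizes = c(13, 130)
# Margin of error for each sample size at 95% confidence.
margins = qt(0.975, sizes-1)*s/sqrt(sizes)
margins
cbind(n=sizes, ci_lower=x_bar - margins, ci_upper=x_bar + margins)
```

With the same mean and standard deviation, the interval for n = 13 comes out more than three times as wide as the one for n = 130.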

Creative Commons License

Samuel A. Rebelsky, rebelsky@grinnell.edu

Copyright (c) 2007-8 Samuel A. Rebelsky.

This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.