Introduction to Statistics (MAT/SST 115.03 2008S)
Activity 25-2 b asks us to “use technology to compute the expected
values, test-statistic, and p-value”. Doing the
chi-square test is easy, because we can use chisq.test
once the data are in a table. So, let's put the data in a table.
Happiness = data.frame(
  row.names = c("Very Happy", "Pretty Happy", "Not Too Happy"),
  Y1972 = c(486, 855, 265),
  Y1988 = c(498, 832, 136),
  Y2004 = c(419, 738, 180))
Happiness
Okay, now let's get the important statistics.
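The call that produces those statistics is not shown here; presumably it is chisq.test applied to the data frame. Here is a minimal sketch under that assumption. It rebuilds the table so that it runs on its own, and HappinessTest is just a name chosen for illustration.

```r
# Rebuild the table of observed counts.
Happiness = data.frame(
  row.names = c("Very Happy", "Pretty Happy", "Not Too Happy"),
  Y1972 = c(486, 855, 265),
  Y1988 = c(498, 832, 136),
  Y2004 = c(419, 738, 180))

# Run the chi-square test and keep the result (HappinessTest is our own name).
HappinessTest = chisq.test(Happiness)

HappinessTest            # prints the test statistic, degrees of freedom, and p-value
HappinessTest$statistic  # the chi-square test statistic
HappinessTest$parameter  # degrees of freedom: (3-1)*(3-1)
HappinessTest$p.value    # the p-value
```

Saving the result in a variable lets us pull out the individual pieces later instead of rereading the printed summary.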
But what about the expected values? That's a little bit harder. I'll also note that they don't ask you for them and don't provide them in the answer key. But, hey, they've asked, so let's at least think about it.
First, we'll make a vector of proportions of the three response values. We could use the values given in the table for the total number of responses for each value.
ResponseProportions = c(1403/4409, 2425/4409, 581/4409)
ResponseProportions
[1] 0.3182127 0.5500113 0.1317759
However, we are probably better off asking R to compute these sums for us.
RP = c(sum(Happiness[1,]), sum(Happiness[2,]), sum(Happiness[3,]))/sum(Happiness)
RP
Once the proportions are computed, we can build a new data frame.
ExpectedHappiness = data.frame(
  row.names = c("Very Happy", "Pretty Happy", "Not Too Happy"),
  Y1972 = RP*sum(Happiness$Y1972),
  Y1988 = RP*sum(Happiness$Y1988),
  Y2004 = RP*sum(Happiness$Y2004))
ExpectedHappiness
But that's a long way to compute something. R lets us write this much
more concisely. The rowSums and
colSums procedures let us compute the sums of
rows and columns, respectively. We still want to convert the row sums
to proportions by dividing by the total number of observational units.
Then we build the expected values table by computing the
outer product of these two vectors.
ExpectedHappiness = rowSums(Happiness) %o% colSums(Happiness)/sum(Happiness)
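If we want some reassurance that the outer-product shortcut is right, one option is to compare it with the expected counts that chisq.test computes for itself, which it stores in the expected component of its result. The sketch below rebuilds the table so that it runs on its own; BuiltIn is a name made up for this check.

```r
# Rebuild the table of observed counts.
Happiness = data.frame(
  row.names = c("Very Happy", "Pretty Happy", "Not Too Happy"),
  Y1972 = c(486, 855, 265),
  Y1988 = c(498, 832, 136),
  Y2004 = c(419, 738, 180))

# Expected counts via the outer product of row totals and column totals.
ExpectedHappiness = rowSums(Happiness) %o% colSums(Happiness)/sum(Happiness)

# Expected counts as computed by chisq.test (BuiltIn is our own name).
BuiltIn = chisq.test(Happiness)$expected

# The largest disagreement between the two tables; it should be essentially zero.
max(abs(ExpectedHappiness - BuiltIn))
```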
No matter how we compute the expected value table, we can see a simple comparison of the expected values to the actual values.
Happiness - ExpectedHappiness
And we can even see the various contributions to the chi-square test.
(Happiness - ExpectedHappiness)^2/ExpectedHappiness
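Those contributions are the pieces of the test statistic, so adding them up should give the same number that chisq.test reports. A small sketch of that check follows; it rebuilds everything so that it runs on its own, and Contributions is a name chosen here.

```r
# Rebuild the observed and expected tables.
Happiness = data.frame(
  row.names = c("Very Happy", "Pretty Happy", "Not Too Happy"),
  Y1972 = c(486, 855, 265),
  Y1988 = c(498, 832, 136),
  Y2004 = c(419, 738, 180))
ExpectedHappiness = rowSums(Happiness) %o% colSums(Happiness)/sum(Happiness)

# Each cell's contribution: (observed - expected)^2 / expected.
Contributions = (Happiness - ExpectedHappiness)^2/ExpectedHappiness

# The total of the contributions ...
sum(Contributions)

# ... should match the statistic from the built-in test.
chisq.test(Happiness)$statistic
```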
We start by creating the table.
CardTips = data.frame(row.names=c("Tip","No Tip"), Joke = c(30,42), Ad = c(14,60), None = c(16,49))
Then we use the chi-square test.
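The call itself is not shown here; presumably it is again chisq.test, this time on the tips table. A minimal sketch under that assumption (TipTest is a name made up for illustration):

```r
# Rebuild the table of tipping counts.
CardTips = data.frame(row.names=c("Tip","No Tip"),
                      Joke = c(30,42), Ad = c(14,60), None = c(16,49))

# Run the chi-square test and keep the result (TipTest is our own name).
TipTest = chisq.test(CardTips)

TipTest            # prints the test statistic, degrees of freedom, and p-value
TipTest$parameter  # degrees of freedom: (2-1)*(3-1)
TipTest$p.value    # the p-value
```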
To help ourselves think about the meaning, we build a table of expected values.
ExpectedTips = rowSums(CardTips) %o% colSums(CardTips)/sum(CardTips)
We then compare that table to our original table. (Do you notice anything interesting about the columns?)
CardTips - ExpectedTips
We then see what contributed to the high chi-square value.
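The code for this step is not shown; a reasonable guess is that it follows the same pattern we used for the happiness data. Here is a sketch under that assumption, rebuilt so that it runs on its own (TipContributions is a name chosen here).

```r
# Rebuild the observed and expected tables for the tipping data.
CardTips = data.frame(row.names=c("Tip","No Tip"),
                      Joke = c(30,42), Ad = c(14,60), None = c(16,49))
ExpectedTips = rowSums(CardTips) %o% colSums(CardTips)/sum(CardTips)

# Each cell's contribution: (observed - expected)^2 / expected.
TipContributions = (CardTips - ExpectedTips)^2/ExpectedTips
TipContributions

# The contributions add up to the chi-square test statistic.
sum(TipContributions)
```

The largest entries in TipContributions mark the cells where the observed counts stray furthest from what we would expect if tipping were unrelated to the kind of card.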
PoliticalHappiness = data.frame(
  row.names = c("Very Happy", "Pretty Happy", "Not Too Happy"),
  Liberal = c(90, 177, 51),
  Moderate = c(128, 291, 76),
  Conservative = c(187, 258, 48))
ExpectedPH = rowSums(PoliticalHappiness) %o% colSums(PoliticalHappiness)/sum(PoliticalHappiness)
Copyright (c) 2007-8 Samuel A. Rebelsky.
This work is licensed under a Creative Commons
Attribution-NonCommercial 2.5 License. To view a copy of this
license, send a letter to Creative Commons, 543 Howard Street, 5th Floor,
San Francisco, California, 94105, USA.