# Class 32: Topic 25: Inference for Two-Way Tables

Back to Mini-Project Presentations. On to Time to Work on Projects.

This outline is also available in PDF.

Held: Friday, 18 April 2008

Summary: We consider yet another way in which we use samples to explore populations. We continue to emphasize categorical variables, this time considering pairs of categorical variables. We look at a more general version of the chi-square test from the previous topic.

Notes:

• Katherine will miss class today and Monday.
• I spent until midnight last night getting today's classes semi-prepared. My 151 students claim that you'll understand that the preparation meant that I did not get your exams graded.
• Extra credit for any one Pride Week event.
• Is anyone in Symphonic band? If so, EC for Sunday's concert.
• Due Monday: Mini-Project Memos.

Overview:

• Presentation Debriefing.
• Today's Topic: Inferences from Samples for Categorical Variables, Continued.
• Some R.

## Debriefing on the Presentation

• What did you see as strengths?
• What did you see as weaknesses?
• What errors did you note?
• I apologize that correcting errors may embarrass some of you. However, I think that it's important that you hear about errors.

## What Samples Tell Us About Categorical Variables, Continued

• Context: Why do we call the topics of the first few weeks descriptive statistics and the more recent topics inferential statistics?
• Topic 21: Two populations, one binary categorical variable.
• Topic 24: One population, one non-binary categorical variable.
• Topic 25: One population, two categorical variables (binary or non-binary).
• Can you think about the values for the explanatory variable as representing different populations?
• What might your null and alternative hypotheses be?
• Just as in Topic 24, we answer questions about populations using a chi-square test.
• That is, we compute expected values, subtract them from the observed values, square each difference, divide by the expected value, and sum all of those results.
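
The recipe in the last bullet can be sketched in a few lines of R. The counts below are made up purely for illustration (they do not come from any data set used in this course), and the names `observed`, `expected`, `contributions`, and `chi_square` are arbitrary.

```r
# Hypothetical 2-by-2 table of observed counts (made-up numbers, illustration only).
observed <- matrix(c(10, 20,
                     30, 40),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("Group A", "Group B"), c("Yes", "No")))

# Expected count for each cell: (row total * column total) / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Subtract expected from observed, square the difference, divide by expected.
contributions <- (observed - expected)^2 / expected

# Sum over all cells to get the chi-square statistic.
chi_square <- sum(contributions)
chi_square
```

For a 2-by-2 table, `chisq.test` applies a continuity correction by default, so a fair comparison with this hand computation is `chisq.test(observed, correct = FALSE)`.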

## Doing Chi-Square Computations in R

So, how do we do a chi-square computation in R? It ends up being fairly straightforward.

We start by making a table or data frame from our data.

```
> SC = fys[,c(1,3)]
> SCtable = table(SC)
> SCtable
   CHOICE
SEX   1   2   3   4
  1   4  11  34 108
  2  15   7  35 127
> SCframe = data.frame(row.names=c("Below Third","Third","Second","First"),
+   Male=SCtable[1,],
+   Female=SCtable[2,])
> SCframe
            Male Female
Below Third    4     15
Third         11      7
Second        34     35
First        108    127
```

We can simply apply the `chisq.test` procedure to this table to get the important values.

```
> chisq.test(SCframe)
Pearson's Chi-squared test
data:  SCframe
X-squared = 6.7122, df = 3, p-value = 0.08166
```
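
If you want the individual numbers rather than the printed summary, the object returned by `chisq.test` can be stored and inspected. The sketch below re-enters the counts by hand as a matrix named `counts` (an assumption, made so that the example runs without the `fys` data); `result` is likewise just an illustrative name.

```r
# Counts re-entered by hand from the table above (assumption: fys is not available).
counts <- matrix(c(4, 11, 34, 108,     # Male column
                   15, 7, 35, 127),    # Female column
                 ncol = 2,
                 dimnames = list(c("Below Third", "Third", "Second", "First"),
                                 c("Male", "Female")))

result <- chisq.test(counts)

result$statistic   # the X-squared value
result$parameter   # the degrees of freedom
result$p.value     # the p-value
```

Storing the result this way makes it easy to reuse the p-value in later computations instead of retyping it.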

However, we will often want to look more carefully at the differences between observed and expected values. The formula below computes every expected value at once: it takes the outer product (`%o%`) of the row totals and the column totals and divides by the grand total. It looks strange, and I don't expect you to understand it. (I do expect that you could do an individual computation by hand.)

```
> SCexpected = rowSums(SCframe) %o% colSums(SCframe)/sum(SCframe)
> SCexpected
                 Male    Female
Below Third   8.74780  10.25220
Third         8.28739   9.71261
Second       31.76833  37.23167
First       108.19648 126.80352
```
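
As a check on the formula, here is a single expected value worked by hand, for the Below Third/Male cell. This is only a sketch: it assumes the counts are re-entered as a matrix called `counts`, and the other variable names are arbitrary.

```r
# Counts re-entered by hand from the table above (assumption: fys is not available).
counts <- matrix(c(4, 11, 34, 108,     # Male column
                   15, 7, 35, 127),    # Female column
                 ncol = 2,
                 dimnames = list(c("Below Third", "Third", "Second", "First"),
                                 c("Male", "Female")))

row_total   <- sum(counts["Below Third", ])   # 4 + 15
col_total   <- sum(counts[, "Male"])          # 4 + 11 + 34 + 108
grand_total <- sum(counts)                    # every cell

# Expected count for one cell: row total * column total / grand total.
by_hand <- row_total * col_total / grand_total
by_hand

# The same cell taken from the all-at-once computation.
all_at_once <- rowSums(counts) %o% colSums(counts) / sum(counts)
all_at_once["Below Third", "Male"]
```

Both values should agree with the 8.74780 shown in the Below Third/Male cell of the table above.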

We can now compare directly.

```
> SCframe - SCexpected
                  Male     Female
Below Third -4.7478006  4.7478006
Third        2.7126100 -2.7126100
Second       2.2316716 -2.2316716
First       -0.1964809  0.1964809
```

Just as importantly, we can compute each cell's contribution to the chi-square statistic: the squared difference divided by the expected value.

```
> (SCframe-SCexpected)^2/SCexpected
                    Male       Female
Below Third 2.5768317632 2.1987097110
Third       0.8878854292 0.7575978934
Second      0.1567711671 0.1337667023
First       0.0003568024 0.0003044455
```
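
Adding up these contributions should recover the X-squared value that `chisq.test` reported, and `pchisq` turns that sum into a p-value. A sketch, again with the counts re-entered by hand in a matrix called `counts` (an assumption, not part of the original session):

```r
# Counts re-entered by hand from the table above (assumption: fys is not available).
counts <- matrix(c(4, 11, 34, 108,     # Male column
                   15, 7, 35, 127),    # Female column
                 ncol = 2,
                 dimnames = list(c("Below Third", "Third", "Second", "First"),
                                 c("Male", "Female")))

expected      <- rowSums(counts) %o% colSums(counts) / sum(counts)
contributions <- (counts - expected)^2 / expected

# The chi-square statistic is the sum of the per-cell contributions.
statistic <- sum(contributions)

# Degrees of freedom for a two-way table: (rows - 1) * (columns - 1).
df <- (nrow(counts) - 1) * (ncol(counts) - 1)

# p-value: the upper-tail area of the chi-square distribution beyond the statistic.
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)

statistic
df
p_value
```

Looking at the contributions cell by cell also shows where the statistic comes from: in this table, the Below Third row accounts for most of it.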


Disclaimer: I usually create these pages on the fly, which means that I rarely proofread them and they may contain bad grammar and incorrect details. It also means that I tend to update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This document was generated by Siteweaver on Fri May 2 13:40:57 2008.
The source to the document was last modified on Sun Apr 13 21:15:34 2008.
This document may be found at `http://www.cs.grinnell.edu/~rebelsky/Courses/MAT115/2008S/Outlines/outline.32.html`.


Samuel A. Rebelsky, rebelsky@grinnell.edu

Copyright © 2008 Samuel A. Rebelsky. This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License. To view a copy of this license, visit `http://creativecommons.org/licenses/by-nc/2.5/` or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.