Held: Friday, 18 April 2008
Summary: We consider yet another way in which we use samples to explore populations. We continue to emphasize categorical variables, this time considering pairs of categorical variables. We look at a more general version of the chi-square test from the previous topic.
So, how do we do a chi-square computation in R? It ends up being fairly straightforward.
We start by making a table or data frame from our data.
> SC = fys[,c(1,3)]
> SCtable = table(SC)
> SCtable
   CHOICE
SEX   1   2   3   4
  1   4  11  34 108
  2  15   7  35 127
> SCframe = data.frame(row.names=c("Below Third","Third","Second","First"),
+                      Male=SCtable[1,],
+                      Female=SCtable[2,])
> SCframe
            Male Female
Below Third    4     15
Third         11      7
Second        34     35
First        108    127
We can simply apply the chisq.test procedure to this table to get the important values.
> chisq.test(SCframe)

        Pearson's Chi-squared test

data:  SCframe
X-squared = 6.7122, df = 3, p-value = 0.08166
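If you do not have the fys data at hand, you can still follow along. The sketch below re-enters the counts shown above by hand and then pulls individual values out of the result of chisq.test. The names `counts` and `result` are my own, chosen for illustration.

```r
# Re-enter the observed counts from the table above (one row per category).
counts = data.frame(row.names = c("Below Third", "Third", "Second", "First"),
                    Male   = c(4, 11, 34, 108),
                    Female = c(15, 7, 35, 127))

# chisq.test returns an object with named parts; save it so we can look inside.
result = chisq.test(counts)

result$statistic   # the X-squared value
result$parameter   # the degrees of freedom
result$p.value     # the p-value
```

Saving the result in a variable is handy when you want to use one of these numbers in a later computation rather than just read it off the screen.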
However, we will often want to look more carefully at the differences between observed and expected values. The computation of the expected values uses a strange-looking formula that I don't expect you to understand. (I do expect that you could do an individual computation by hand, as the row total times the column total divided by the grand total, but this does all of them at once.)
> SCexpected = rowSums(SCframe) %o% colSums(SCframe)/sum(SCframe)
> SCexpected
                 Male    Female
Below Third   8.74780  10.25220
Third         8.28739   9.71261
Second       31.76833  37.23167
First       108.19648 126.80352
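To see what that formula is doing, it may help to work one cell by hand. The sketch below computes the expected count for Below Third males, then checks that it matches the corresponding entry from the all-at-once version. The names `counts`, `row_total`, `col_total`, `grand_total`, and `expected` are my own, and the counts are re-entered by hand from the table above.

```r
# Observed counts, re-entered from the table above.
counts = data.frame(row.names = c("Below Third", "Third", "Second", "First"),
                    Male   = c(4, 11, 34, 108),
                    Female = c(15, 7, 35, 127))

# One cell by hand: (row total) * (column total) / (grand total).
row_total   = sum(counts["Below Third", ])   # 4 + 15 = 19
col_total   = sum(counts$Male)               # 4 + 11 + 34 + 108 = 157
grand_total = sum(counts)                    # 341
row_total * col_total / grand_total          # about 8.7478

# The %o% (outer product) operator pairs every row total with every
# column total, so this one line handles all eight cells.
expected = rowSums(counts) %o% colSums(counts) / sum(counts)
expected["Below Third", "Male"]
```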
We can now compare directly.
> SCframe - SCexpected
                  Male     Female
Below Third -4.7478006  4.7478006
Third        2.7126100 -2.7126100
Second       2.2316716 -2.2316716
First       -0.1964809  0.1964809
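Just as importantly, we can compute the deviations, that is, each cell's contribution, (observed - expected)^2/expected, to the chi-square statistic.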
> (SCframe-SCexpected)^2/SCexpected
                    Male       Female
Below Third 2.5768317632 2.1987097110
Third       0.8878854292 0.7575978934
Second      0.1567711671 0.1337667023
First       0.0003568024 0.0003044455
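These cell-by-cell values add up to the X-squared statistic that chisq.test reported. The sketch below shows that connection and recovers the p-value with pchisq. The names `counts`, `expected`, `contributions`, `x_squared`, `df`, and `p_value` are my own, and the counts are again re-entered by hand.

```r
# Observed counts, re-entered from the table above.
counts = data.frame(row.names = c("Below Third", "Third", "Second", "First"),
                    Male   = c(4, 11, 34, 108),
                    Female = c(15, 7, 35, 127))

# Expected counts and each cell's contribution to the statistic.
expected = rowSums(counts) %o% colSums(counts) / sum(counts)
contributions = (counts - expected)^2 / expected

# Adding up the contributions gives the chi-square statistic.
x_squared = sum(contributions)

# Degrees of freedom for a two-way table: (rows - 1) * (columns - 1).
df = (nrow(counts) - 1) * (ncol(counts) - 1)

# The p-value is the upper-tail probability of the chi-square distribution.
p_value = pchisq(x_squared, df, lower.tail = FALSE)

x_squared
df
p_value
```

Looking at the individual contributions shows where the statistic comes from: here the Below Third row accounts for most of it.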
This document was generated by
Siteweaver on Fri May 2 13:40:57 2008.
The source to the document was last modified on Sun Apr 13 21:15:34 2008.
Samuel A. Rebelsky
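This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.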