Introduction to Statistics (MAT/SST 115.03 2008S)

R notes for Activity 6-3


This activity requires that you look at data from the class for gender and preferred lifetime achievement (which we've abbreviated as PLA). In R, you can read in the complete set of data for our classes with

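# stringsAsFactors = TRUE stores Gender and PLA as factors, so that summary() reports counts (R 4.0 and later default to FALSE)
GP = read.csv("/home/rebelsky/Stats115/Data/GenderedPLAs.csv", stringsAsFactors = TRUE)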
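
Before going further, it can help to check what read.csv produced. The class file lives on the course server, so the sketch below reads a made-up six-row stand-in instead. The name Toy, its rows, and the category labels are invented for illustration; only the column names Gender and PLA come from these notes. It also passes stringsAsFactors = TRUE so that both columns are stored as factors.

```r
# Made-up stand-in for GenderedPLAs.csv; the rows and labels are invented.
Toy = read.csv(text = "Gender,PLA
Female,Nobel
Female,Olympic
Female,Nobel
Male,Olympic
Male,Academy
Male,Olympic", stringsAsFactors = TRUE)

str(Toy)          # column names and types
head(Toy, 3)      # the first three rows
levels(Toy$PLA)   # the distinct PLA categories
```

The same three calls should work on GP once it has been read in.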

6-3.c

Activity 6-3.c asks you to determine the marginal distribution of the preferred lifetime achievement variable and then to graph it. How do you get that information from the data frame without counting manually? Recall that R provides a useful summary function. We can call that function on the whole frame.

summary(GP)

But we really care only about the PLA variable, so we can also ask for that.

summary(GP$PLA)

And, if the purpose of our graph is our own understanding, we can just bring up the simplest bar graph.

barplot(summary(GP$PLA))
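
A marginal distribution is usually reported as proportions rather than raw counts. As a rough sketch of one way to get them, the example below uses R's table and prop.table functions on a made-up vector of responses. The labels and counts are invented, and the variable names PLA, Counts, and Props are just for illustration.

```r
# Made-up PLA responses; the labels and counts are invented.
PLA = factor(c("Nobel", "Olympic", "Nobel", "Olympic",
               "Academy", "Olympic", "Nobel", "Olympic"))

Counts = table(PLA)          # number of responses in each category
Props = prop.table(Counts)   # each count divided by the total
print(Props)

# Bar graph of the proportions instead of the counts
barplot(Props, ylab = "Proportion of responses")
```

With the class data, replacing PLA with GP$PLA should give the corresponding proportions.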

6-3.d

This problem requires us to figure out the conditional distribution of PLA for each gender. You should start by counting a few lines by hand. However, you will soon find that tedious.

Do we have enough R tools to figure that out? Certainly. We simply select the rows that contain women and then the rows that contain men.

Women = GP[GP$Gender == "Female", ]
Men = GP[GP$Gender == "Male", ]

You can then fill in the table by summarizing the two parts.

summary(Women$PLA)
summary(Men$PLA)
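
As an alternative sketch, R can build the whole two-way table in one step and convert each row to proportions. This is not the method the activity asks for, just a way to check your table. The data frame Toy below is made up (its rows and labels are invented); only the column names Gender and PLA match the class data.

```r
# Made-up stand-in for the class data; the rows and labels are invented.
Toy = data.frame(
  Gender = factor(c("Female", "Female", "Female", "Female",
                    "Male", "Male", "Male", "Male")),
  PLA = factor(c("Nobel", "Nobel", "Olympic", "Academy",
                 "Olympic", "Olympic", "Nobel", "Olympic"))
)

Counts = table(Toy$Gender, Toy$PLA)     # rows are genders, columns are PLAs
Cond = prop.table(Counts, margin = 1)   # divide each row by its row total
print(Counts)
print(Cond)
```

Each row of Cond sums to 1, so it gives the distribution of PLA within one gender.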

Samuel A. Rebelsky, rebelsky@grinnell.edu

Copyright (c) 2007-8 Samuel A. Rebelsky.

This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.