Introduction to Statistics (MAT/SST 115.03 2008S)
Activity 14-2.j asks you to take 500 random samples of size n = 30 from a uniform population. (At least we think it's uniform; the graph is unclear.) As you might guess, this involves taking a sample and doing so repeatedly. In most statistics packages, taking a sample is relatively easy, but doing things repeatedly is a bit more complicated.
Let's start by building a vector of change amounts. We'll use
seq to get all the values between 0 and 99:
change = seq(from=0, to=99, by=1)
R provides two mechanisms for taking samples:
sample(vector,size) takes a sample of the specified size from the vector,
without replacing values. (That is, each value in
the vector can appear at most once in the result - once
you've sampled it, you can't sample it again.) In contrast,
sample(vector,size,replace=T) takes a sample with replacement from the vector.
Let's try both to understand how they work.
sample(change,30)
sample(change,30)
sample(change,30,replace=T)
sample(change,30,replace=T)
You may find the values easier to look at if they are sorted.
sort(sample(change,30))
sort(sample(change,30))
sort(sample(change,30,replace=T))
sort(sample(change,30,replace=T))
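One way to see the difference between the two kinds of sampling is to count repeated values. The sketch below assumes the change vector described earlier. The names noRepl and withRepl are ours, and the seed value is arbitrary, chosen only so that the results are reproducible. anyDuplicated is a base R function that returns 0 when a vector contains no repeats.

```r
# Rebuild the vector of change amounts (0 through 99).
change = seq(from=0, to=99, by=1)

# Fix the random seed so the example is reproducible (115 is an arbitrary choice).
set.seed(115)

# Without replacement: no value can appear twice.
noRepl = sample(change, 30)
anyDuplicated(noRepl)         # always 0

# With replacement: repeats are possible.
withRepl = sample(change, 30, replace=T)
length(unique(withRepl))      # at most 30; smaller whenever a value repeats
```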
Since the same change amount can occur more than once, we'll use
the sampling technique with replacement. We can compute the mean of
a single sample using
mean. Let's try a few.
mean(sample(change,30,replace=T))
mean(sample(change,30,replace=T))
mean(sample(change,30,replace=T))
The problem asks us to take 500 random samples and then to plot them. Now, how do we get a vector of 500 samples? We'll first build a vector of the appropriate size and then fill in the values one-by-one using a special command called a for loop. (You don't need to understand this command right now; we'll give you the instructions.)
samples = 1:500
for (i in 1:500) samples[i] = mean(sample(change,30,replace=T))
Let's look at the first few sample means.
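The command for this step is not shown here. One possibility, assuming the samples vector is built as described above, is to index the first ten entries or to use head, a base R function that returns the first six entries by default. This is a sketch, not necessarily the command the activity intended.

```r
# Build the population and the 500 sample means.
change = seq(from=0, to=99, by=1)
samples = 1:500
for (i in 1:500) samples[i] = mean(sample(change,30,replace=T))

# The first ten sample means, by indexing.
samples[1:10]

# head returns the first six elements unless told otherwise.
head(samples)
```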
Okay, now we need to make a dotplot.
library(BHH2, lib="/home/rebelsky/Stats115/Packages")
dotPlot(samples)
You also need to compute mean and standard deviation, but you should be able to figure that out. (Ask for help if you can't figure it out.)
For part k, you need to create a normal probability plot. Here goes ...
qqnorm(samples, datax=T, ylab="...")
qqline(samples, datax=T)
You might want to repeat the exercise for 500 samples of size n = 100 to see if the distribution looks any different.
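Here is a sketch of one way to repeat the exercise for another sample size. The helper name sampleMeans is hypothetical (it is not part of R or of the course packages), and it uses replicate, a base R function that evaluates an expression a given number of times, in place of the explicit for loop.

```r
change = seq(from=0, to=99, by=1)

# Hypothetical helper: the means of `count` samples of size `n`,
# each drawn with replacement from `population`.
sampleMeans = function(population, n, count=500) {
  replicate(count, mean(sample(population, n, replace=T)))
}

means30 = sampleMeans(change, 30)
means100 = sampleMeans(change, 100)

# Compare the spread of the two sets of sample means.
sd(means30)
sd(means100)
```

We would expect the means for n = 100 to cluster more tightly around the population mean of 49.5 than the means for n = 30 do.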
Copyright (c) 2007-8 Samuel A. Rebelsky.
This work is licensed under a Creative Commons
Attribution-NonCommercial 2.5 License. To view a copy of this
license, visit http://creativecommons.org/licenses/by-nc/2.5/
or send a letter to Creative Commons, 543 Howard Street, 5th Floor,
San Francisco, California, 94105, USA.