Here is the problem discussed in class. The data are the numbers of fumbles in each of the football games played in a single weekend of NCAA Division I football. See Case Study 4.2.3.
Here are the data:
3 1 2 4 6 2 1 3 3 1 5 5 4 4 3 3 2 1 1 5 2 4 1 3 3 4 3 3 1 2 1 3 2 2 2 2 2 3 0 2 1 0 0 2 0 4 0 5 2 1 3 2 3 2 5 2 2 4 1 2 4 4 5 1 1 4 1 2 1 6 2 3 2 2 0 7 4 1 1 3 1 2 3 5 2 1 2 2 1 3 1 3 5 4 4 0 1 4 6 1 2 4 0 3 4 1 5 4 3 5

Get into R. Then enter the data via the scan function.
    fumbles <- scan()

R will prompt you for data. Select the 110 values, paste them into the R command line, and then press Enter on a blank line to end the input.
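If pasting at the prompt is inconvenient, the same values can probably be read non-interactively. The sketch below assumes a version of R whose scan function accepts the text argument; the sanity checks at the end are additions for illustration and not part of the class exercise.

```r
# Read the 110 fumble counts from a string instead of the interactive prompt.
# quiet = TRUE only suppresses the "Read 110 items" message.
fumbles <- scan(text = "
3 1 2 4 6 2 1 3 3 1 5 5 4 4 3 3 2 1 1 5 2 4
1 3 3 4 3 3 1 2 1 3 2 2 2 2 2 3 0 2 1 0 0 2
0 4 0 5 2 1 3 2 3 2 5 2 2 4 1 2 4 4 5 1 1 4
1 2 1 6 2 3 2 2 0 7 4 1 1 3 1 2 3 5 2 1 2 2
1 3 1 3 5 4 4 0 1 4 6 1 2 4 0 3 4 1 5 4 3 5
", quiet = TRUE)

# Quick checks that nothing was lost in the copy.
length(fumbles)   # number of games read
table(fumbles)    # how many games had 0, 1, 2, ... fumbles
```

Either way of entering the data leaves a numeric vector named fumbles, so the rest of the code runs unchanged.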
    # First, draw a histogram of the data.
    # Breaks at the half-integers give natural break points for integer data.
    br <- seq(min(fumbles) - .5, max(fumbles) + .5)
    hist(fumbles, breaks = br)

    # Now, we calculate expected values. Can you figure out what the calculations
    # are doing?
    m <- mean(fumbles)
    n <- length(fumbles)
    k <- seq(min(fumbles), max(fumbles))
    expected <- n * exp(-m) * m^k / factorial(k)
    cbind(obs = table(fumbles), expected)

The results of this code, which follow, show that the observed counts agree closely with the Poisson probability model:
      obs  expected
    0   8  8.550031
    1  24 21.841443
    2  27 27.897479
    3  20 23.755126
    4  17 15.170887
    5  10  7.750944
    6   3  3.300023
    7   1  1.204294

Note: obs = observed counts

Poisson approximation to binomial
Here is another example, showing the Poisson approximation to the binomial with n = 1000 and p = 1/500. For the Poisson, lambda = np = 2.
     k   binom Poisson
     0 0.13506 0.13534
     1 0.27067 0.27067
     2 0.27094 0.27067
     3 0.18063 0.18045
     4 0.09022 0.09022
     5 0.03602 0.03609
     6 0.01197 0.01203
     7 0.00341 0.00344
     8 0.00085 0.00086
     9 0.00019 0.00019
    10 0.00004 0.00004
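A table like the one above can be produced with R's built-in dbinom and dpois functions. This is a minimal sketch, not necessarily how the table was originally generated; the name approx_tab is made up for illustration.

```r
# Binomial(n, p) probabilities next to Poisson(lambda = n * p) probabilities.
n <- 1000
p <- 1 / 500
lambda <- n * p   # Poisson mean chosen to match the binomial mean
k <- 0:10

approx_tab <- cbind(k,
                    binom   = dbinom(k, size = n, prob = p),
                    Poisson = dpois(k, lambda = lambda))

# Round to five decimals for display.
print(round(approx_tab, 5))
```

Changing n and p while holding n * p fixed is a simple way to see how the agreement improves as n grows and p shrinks.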