This example is taken from Charles Hicks's Fundamental Concepts in the Design of Experiments (3rd ed.; HBJ, 1982, pp. 66-70). The example is fictitious but illustrative.
A fleet manager wishes to compare the wearability of 4 brands of tire: A, B, C, and D. Four cars are available for the experiment and 4 tires of each brand are available, for a total of 16 tires. The idea is to mount 4 tires on each of the 4 cars, ask the driver of each car to drive it for 20,000 miles, and then measure tread loss. We will measure tread loss in mils (1 mil = 0.001 inch). We will designate the 4 cars as cars I, II, III, and IV.
Hicks considers 3 possible experimental designs:
Here is design 1:
                       Car
               I     II    III   IV
____________________________________
Brand          A     B     C     D
assignment     A     B     C     D
               A     B     C     D
               A     B     C     D
____________________________________
Can you see the obvious flaw in this design?
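One way to examine design 1 is to cross-tabulate brand against car. The following is only a sketch: the data frame name design1 and its layout are assumptions made for illustration, typed from the table above.

```r
# Design 1 as tabulated above: every tire on a given car is the same brand.
design1 <- data.frame(
  Car   = factor(rep(c("I", "II", "III", "IV"), each = 4)),
  Brand = factor(rep(c("A", "B", "C", "D"), each = 4))
)

# Cross-tabulate brand (rows) against car (columns):
# all 16 tires fall on the diagonal.
xt <- table(design1$Brand, design1$Car)
print(xt)

# Once Car is in the model, Brand adds no new information:
# the design matrix gains columns but not rank.
X_car  <- model.matrix(~ Car, data = design1)
X_both <- model.matrix(~ Car + Brand, data = design1)
rank_car  <- qr(X_car)$rank
rank_both <- qr(X_both)$rank
c(car_only = rank_car, car_and_brand = rank_both)
```

Because the two ranks agree, any difference between brands in this design cannot be separated from a difference between cars (or drivers).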
A second design, which avoids confounding brand with car, is the completely randomized design (CRD). This entails numbering the 16 tires, drawing the numbers at random, and assigning the tires to the cars in a completely random manner. The following table illustrates a possible result of doing this.
                          Car
                 I       II      III     IV
_____________________________________________
Brand
assignment       C(12)   A(14)   C(10)   A(13)
&                A(17)   A(13)   D(11)   D(9)
Loss in          D(13)   B(14)   B(14)   B(8)
thickness (mils) D(11)   C(12)   B(13)   C(9)
_____________________________________________
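The randomization step for a CRD can be sketched in a few lines. This is an illustration only, not the procedure Hicks used: the tire numbering, the seed, and the names tires, draw, and crd are all assumptions.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Number the 16 tires: tires 1-4 are brand A, 5-8 brand B, and so on.
tires <- data.frame(
  Tire  = 1:16,
  Brand = factor(rep(c("A", "B", "C", "D"), each = 4))
)

# Draw the tire numbers in random order; the first four drawn go to car I,
# the next four to car II, and so on.
draw <- sample(tires$Tire)
crd <- data.frame(
  Car   = factor(rep(c("I", "II", "III", "IV"), each = 4)),
  Tire  = draw,
  Brand = tires$Brand[draw]
)

# Brands (rows) by cars (columns). Nothing forces these counts to be
# balanced: a car may carry two tires of one brand and none of another.
table(crd$Brand, crd$Car)
```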
The appropriate analysis for such an experiment is the one-way ANOVA:
tire1.df
attach(tire1.df)
tapply(Wear, Brand, mean)
summary(aov(Wear ~ Brand))
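            Df Sum Sq Mean Sq F value Pr(>F)
Car1         3 32.687  10.896  2.7098 0.0918 .
Residuals   12 48.250   4.021
Note that the factor row in this output is labeled Car1, not Brand, so the table shown appears to come from a fit with car as the factor (Wear ~ Car1). Rerun the command above to obtain the table for Brand before answering the questions that follow.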
Do you understand the data frame? Can we infer a difference in mean wear levels between the 4 brands? Are the assumptions for the one-way ANOVA met?
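The commands above assume a data frame tire1.df that these notes never construct. Here is one way it might be built, typed directly from the CRD table. The column names Wear and Brand come from the commands above and Car1 from the output; the row order and the object names brand.tab and car.tab are assumptions.

```r
# Wear values read down each car's column of the CRD table (cars I to IV).
tire1.df <- data.frame(
  Wear  = c(12, 17, 13, 11,   14, 13, 14, 12,
            10, 11, 14, 13,   13,  9,  8,  9),
  Brand = factor(c("C", "A", "D", "D",   "A", "A", "B", "C",
                   "C", "D", "B", "B",   "A", "D", "B", "C")),
  Car1  = factor(rep(c("I", "II", "III", "IV"), each = 4))
)

# Brand means, using with() instead of attach()
brand.means <- with(tire1.df, tapply(Wear, Brand, mean))
brand.means

# One-way ANOVA tables, one factor at a time
brand.tab <- anova(aov(Wear ~ Brand, data = tire1.df))
car.tab   <- anova(aov(Wear ~ Car1,  data = tire1.df))
brand.tab
car.tab
```

With the data typed this way, the Car1 fit has sums of squares 32.6875 and 48.25, which match the output printed above, while the Brand fit has a Brand sum of squares of 30.6875 on 3 degrees of freedom and a residual sum of squares of 50.25 on 12.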
The third design considered by Hicks is the Randomized Complete Block Design. In this case, each car tests all four brands. Thus one tire from each brand is selected at random and randomly allocated to the 4 wheels of car I. Then one tire from each brand is selected and the four are randomly allocated to car II, and so forth. Here are the results of that design.
                          Car
                 I       II      III     IV
_____________________________________________
Brand
assignment       B(14)   D(11)   A(13)   C(9)
&                C(12)   C(12)   B(13)   D(9)
Loss in          A(17)   B(14)   D(11)   B(8)
thickness (mils) D(13)   A(14)   C(10)   A(13)
_____________________________________________
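The restricted randomization behind a table like this one can be sketched as follows. The wheel-position labels, the seed, and the name rcbd are hypothetical; the text only says that the four brands are randomly allocated to the 4 wheels of each car.

```r
set.seed(2)  # arbitrary seed, only so the sketch is reproducible

cars   <- c("I", "II", "III", "IV")
brands <- c("A", "B", "C", "D")

# Hypothetical wheel-position labels; the text does not name the positions.
wheels <- c("LF", "RF", "LR", "RR")

# For each car (block), shuffle the four brands independently over the wheels.
rcbd <- do.call(rbind, lapply(cars, function(car) {
  data.frame(Car = car, Wheel = wheels, Brand = sample(brands))
}))

# Brands (rows) by cars (columns): every brand appears exactly once per car.
table(rcbd$Brand, rcbd$Car)
```

Unlike the completely randomized design, the randomization here is carried out separately within each car, so the brand-by-car table is balanced by construction.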
We use a so-called additive, two-way analysis of variance model to analyze these
data, which is model 12.2.1 given on page 775 of Larsen and Marx. The
particulars of setting up the data frame are:
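# Wear in mils, listed car by car (I to IV), brands A to D within each car
data <- c(17,14,12,13,14,14,12,11,13,13,10,11,13,8,9,9)
# Brand labels cycle A, B, C, D within each car
foo1 <- factor(rep(c('A','B','C','D'),4))
# Car labels: four consecutive tires per car
foo2 <- factor(c(rep('I',4),rep('II',4),rep('III',4),rep('IV',4)))
tire2.df <- data.frame(data,foo1,foo2)
names(tire2.df) <- c("Wear","Brand","Car")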
Now plot the data.
attach(tire2.df)
boxplot(split(Wear,Brand))
boxplot(split(Wear,Car))
tapply(Wear,Brand,mean)
interaction.plot(Brand, Car, Wear)
The interaction plot is new. What do we learn from it?
Now let us look at the inference from a two-way ANOVA.
tire.aov <- aov(Wear ~ Brand + Car)
anova(tire.aov)
OUTPUT:
          Df Sum of Sq  Mean Sq  F Value       Pr(F)
Brand      3   30.6875 10.22917  7.96216 0.006684942
Car        3   38.6875 12.89583 10.03784 0.003133358
Residuals  9   11.5625  1.28472
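The sums of squares in this table can be reproduced by hand from the brand and car means, which makes the additive decomposition explicit. This is a sketch: the data are retyped so the block runs on its own, and the names ss.brand, ss.car, ss.resid, ms.blocked, and ms.unblocked are chosen here for illustration.

```r
# Same 16 observations as in tire2.df, rebuilt so this block is self-contained.
Wear  <- c(17,14,12,13, 14,14,12,11, 13,13,10,11, 13,8,9,9)
Brand <- factor(rep(c("A", "B", "C", "D"), 4))
Car   <- factor(rep(c("I", "II", "III", "IV"), each = 4))

grand    <- mean(Wear)
ss.total <- sum((Wear - grand)^2)

# Each brand mean and each car mean is based on 4 tires.
ss.brand <- 4 * sum((tapply(Wear, Brand, mean) - grand)^2)
ss.car   <- 4 * sum((tapply(Wear, Car,   mean) - grand)^2)

# Under the additive model, whatever is left over is the residual sum of squares.
ss.resid <- ss.total - ss.brand - ss.car
c(Brand = ss.brand, Car = ss.car, Residuals = ss.resid)

# Residual mean square with the Car block term (9 df) and without it (12 df)
ms.blocked   <- ss.resid / 9
ms.unblocked <- (ss.total - ss.brand) / 12
c(blocked = ms.blocked, unblocked = ms.unblocked)
```

The three sums of squares come out to 30.6875, 38.6875, and 11.5625, as in the table. Removing the car-to-car variation lowers the residual mean square from about 4.19 to about 1.28, which is why the Brand F ratio is so much larger here than it would be in an analysis that ignores car.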
Finally look at some residual plots to diagnose any problems with model
assumptions.
qqnorm(resid(tire.aov))
plot(fitted(tire.aov), resid(tire.aov))