Wednesday, December 12, 2012
Class Activity --- In Minitab,
bring up the data set calories_schools.mtw, which
contains fat calorie values for several lunches at three schools.
1. Make side-by-side boxplots of calories for the three
schools.
2. Make side-by-side boxplots of demo data for the three
schools.
Looking at the boxplots above, we want to explore the question "Are the data
contradictory to the hypothesis
that the three schools' lunches have the same mean fat calories?"
Discussion
Variation is the KEY
Conclusion --- We must analyze
the variation
- among sample averages, and
- within samples
This is formally called Analysis of Variance or ANOVA
Formal Notation
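The notation itself did not survive in these notes. What follows is a sketch of the standard one-way ANOVA notation, chosen to match the quantities computed under "Cranking it Out"; the symbols $\bar{x}_i$, $\bar{\bar{x}}$, $s_i$, and $\mu_i$ are assumed here and may differ from the ones used in class.

```latex
% k = number of populations (or treatments), n_i = size of sample i,
% N = n_1 + ... + n_k, \bar{x}_i and s_i = mean and standard deviation of sample i.
\[
\bar{\bar{x}} = \frac{1}{N}\sum_{i=1}^{k} n_i \bar{x}_i
\qquad
H_0:\ \mu_1 = \mu_2 = \cdots = \mu_k
\quad\text{vs.}\quad
H_a:\ \text{at least two of the } \mu_i \text{ differ}
\]
\[
\mathrm{SSTr} = \sum_{i=1}^{k} n_i(\bar{x}_i - \bar{\bar{x}})^2,
\qquad
\mathrm{MSTr} = \frac{\mathrm{SSTr}}{k-1}
\]
\[
\mathrm{SSE} = \sum_{i=1}^{k} (n_i - 1)s_i^2,
\qquad
\mathrm{MSE} = \frac{\mathrm{SSE}}{N-k},
\qquad
F = \frac{\mathrm{MSTr}}{\mathrm{MSE}}
\ \text{ with } k-1 \text{ and } N-k \text{ degrees of freedom}
\]
```

MSTr measures the variation among sample averages and MSE the variation within samples, so a large F is evidence against the hypothesis of equal means.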
Assumptions
1. each of the k population distributions is normal
2. the standard deviation is the same for each of the k
populations
3. the observations within each of the k samples are
independent of one another
4. when comparing populations, the k samples are selected randomly and independently of one
another;
when comparing treatment means, treatments are assigned at random to subjects
Cranking it Out
For the calories and schools data
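With $\bar{x}_i$ the average of sample $i$, $\bar{\bar{x}}$ the grand average, and $s_i$ the standard deviation of sample $i$:
\[
\begin{aligned}
\mathrm{SSTr} &= n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + n_3(\bar{x}_3 - \bar{\bar{x}})^2 \\
&= 8(145-141)^2 + 8(138-141)^2 + 8(140-141)^2 \\
&= 8(16) + 8(9) + 8(1) \\
&= 208 \\
\mathrm{MSTr} &= \mathrm{SSTr}/(k-1) = 208/2 = 104 \\
\mathrm{SSE} &= (n_1-1)s_1^2 + (n_2-1)s_2^2 + (n_3-1)s_3^2 \\
&= 7(10.88)^2 + 7(8.73)^2 + 7(8.00)^2 \\
&= 1810.11 \\
\mathrm{MSE} &= \mathrm{SSE}/(N-k) = 1810.11/(24-3) = 86.196 \\
F &= \mathrm{MSTr}/\mathrm{MSE} = 104/86.196 = 1.21 \\
\text{p-value} &= P(F_{2,21} > 1.21) = 0.318
\end{aligned}
\]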
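The hand calculation above can be double-checked with a short script. This is a minimal sketch in Python (not part of the Minitab workflow used in class) that works only from the summary statistics quoted in these notes. The variable names are illustrative, and the p-value line uses a closed-form tail probability that holds only when the numerator degrees of freedom equal 2, which is the case here because k = 3.

```python
# One-way ANOVA from summary statistics (values copied from the notes).
n = [8, 8, 8]                  # sample sizes for the three schools
means = [145.0, 138.0, 140.0]  # sample averages
sds = [10.88, 8.73, 8.00]      # sample standard deviations

k = len(n)   # number of groups
N = sum(n)   # total number of observations

# Grand average, weighted by sample size (equals 141 here).
grand_mean = sum(ni * mi for ni, mi in zip(n, means)) / N

# Variation among sample averages.
sstr = sum(ni * (mi - grand_mean) ** 2 for ni, mi in zip(n, means))
mstr = sstr / (k - 1)

# Variation within samples.
sse = sum((ni - 1) * si ** 2 for ni, si in zip(n, sds))
mse = sse / (N - k)

f_stat = mstr / mse

# Assumption: this closed form for P(F > f) is valid only when the
# numerator df is 2; for other k, use statistical software instead.
assert k - 1 == 2
df2 = N - k
p_value = (1 + 2 * f_stat / df2) ** (-df2 / 2)

print(f"SSTr = {sstr:.2f}, MSTr = {mstr:.2f}")
print(f"SSE  = {sse:.2f}, MSE  = {mse:.3f}")
print(f"F = {f_stat:.2f}, p-value = {p_value:.3f}")
```

Because the script carries F at full precision rather than rounding to 1.21 first, its p-value can differ from the 0.318 above in the third decimal place. Either way the p-value is large, so the data do not contradict the hypothesis of equal mean fat calories.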
Class Problem --- Bring up the Minitab data set solvent_sorption.mtw, which
provides the sorption rates into a test medium of three organic solvents ---
aromatics, chloroalkanes, and esters. Test to see if there is significant
evidence in the data that the mean sorption rates differ among these solvents.
Finally, check the assumptions for the validity of the ANOVA F test.
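As one illustration of an assumption check, the sketch below applies a common rule of thumb for the equal-standard-deviation assumption: the largest sample standard deviation should be no more than about twice the smallest. The rule and the function name `sd_ratio` are assumptions brought in for illustration, not something stated in these notes. The solvent values are not reproduced here, so the example uses the three school standard deviations from the calories calculation.

```python
def sd_ratio(sds):
    """Return the ratio of the largest to the smallest sample SD."""
    return max(sds) / min(sds)

# Sample standard deviations from the calories and schools data.
school_sds = [10.88, 8.73, 8.00]

ratio = sd_ratio(school_sds)
print(f"largest SD / smallest SD = {ratio:.2f}")

# Rule of thumb (an assumption, not a formal test): a ratio of at most 2
# suggests the equal-SD assumption is reasonable.
if ratio <= 2:
    print("Equal-SD assumption looks reasonable.")
else:
    print("Equal-SD assumption is questionable.")
```

Normality is usually judged separately, for example from the side-by-side boxplots or a normal probability plot of each sample.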