Wednesday April 30, 2014

Class Activity --- In Minitab, bring up the data set calories_schools.mtw, which
contains fat calorie values for several lunches at three schools.

1.  Make side-by-side boxplots of calories for the three schools.

2.  Make side-by-side boxplots of demo data for the three schools.

Looking at the boxplots above, we want to explore the question "Do the data contradict the hypothesis
that the three schools' lunches have the same mean fat calories?"

Discussion

Variation is the KEY

Conclusion --- We must analyze the variation
- among the sample averages, and
- within the samples.
This is formally called Analysis of Variance, or ANOVA.

Formal Notation

Assumptions
1. each of the k population distributions is normal
2. the standard deviation is the same for each of the k populations
3. the observations within each of the k samples are independent of one another
4. when comparing populations, the k samples are selected randomly and independently of one another;
when comparing treatment means, treatments are assigned at random to subjects
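
A quick numerical look at assumption 2 can be sketched in Python. The cutoff used here, that the largest sample standard deviation should be no more than about twice the smallest, is a common rule of thumb and an assumption of this sketch, not something stated in these notes. The three standard deviations are the ones used for the calories and schools data in the computation that follows; the function name sd_ratio is made up for illustration.

```python
def sd_ratio(sds):
    """Ratio of the largest to the smallest sample standard deviation."""
    return max(sds) / min(sds)


# Sample standard deviations for the three schools (from these notes).
school_sds = [10.88, 8.73, 8.00]

ratio = sd_ratio(school_sds)

# Assumed rule of thumb: treat the equal-SD assumption as plausible
# when the ratio is at most 2. This cutoff is a convention, not a test.
roughly_equal = ratio <= 2

print(f"largest SD / smallest SD = {ratio:.2f}")
print("equal-SD assumption looks plausible" if roughly_equal
      else "equal-SD assumption is doubtful")
```

This checks only the spread of the samples. Normality and independence have to be judged separately, for example from plots and from how the data were collected.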

Cranking it Out

For the calories and schools data

SSTr = n1(sample avg 1 - grand avg)^2 + n2(sample avg 2 - grand avg)^2 + n3(sample avg 3 - grand avg)^2
= 8(145-141)^2 + 8(138-141)^2 + 8(140-141)^2
= 8(16) + 8(9) + 8(1)
= 208

MSTr = SSTr / (k-1) = 208 / 2 = 104

SSE = (n1-1)(s1)^2 + (n2-1)(s2)^2 + (n3-1)(s3)^2
= 7(10.88)^2 + 7(8.73)^2 + 7(8.00)^2
= 1810.11

MSE = SSE / (N-k) = 1810.11 / (24-3) = 86.196

F = MSTr / MSE = 104 / 86.196 = 1.21

p-value = P(F > 1.21) = 0.318, where F has 2 and 21 degrees of freedom
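
The hand computation above can be double-checked with a short Python sketch that works from the summary statistics only (sample sizes, averages, and standard deviations). The function names are made up for illustration. The p-value helper is an assumption of this sketch: it uses the closed form of the F upper-tail probability that holds only when the numerator degrees of freedom equal 2 (that is, k = 3 groups), so it is not a general F calculator.

```python
def anova_from_summary(ns, means, sds):
    """One-way ANOVA quantities computed from per-group summary statistics."""
    k = len(ns)                       # number of groups
    N = sum(ns)                       # total number of observations
    grand = sum(n * m for n, m in zip(ns, means)) / N

    sstr = sum(n * (m - grand) ** 2 for n, m in zip(ns, means))
    sse = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))

    mstr = sstr / (k - 1)
    mse = sse / (N - k)
    return {"SSTr": sstr, "MSTr": mstr, "SSE": sse, "MSE": mse,
            "F": mstr / mse, "df1": k - 1, "df2": N - k}


def f_upper_tail_df1_is_2(f, df2):
    """P(F > f) for an F distribution with 2 and df2 degrees of freedom.

    Valid only for numerator df = 2, where the tail probability
    simplifies to (1 + 2f/df2) ** (-df2/2).
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)


# Calories and schools data, taken from the computation in these notes.
result = anova_from_summary(ns=[8, 8, 8],
                            means=[145, 138, 140],
                            sds=[10.88, 8.73, 8.00])
p_value = f_upper_tail_df1_is_2(result["F"], result["df2"])

for name in ("SSTr", "MSTr", "SSE", "MSE", "F"):
    print(f"{name} = {result[name]:.3f}")
print(f"p-value = {p_value:.3f}")
```

The sketch carries F at full precision rather than rounding it to 1.21 first, so its p-value can differ from the hand computation in the third decimal place.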

Class Problem --- Bring up the Minitab data set solvent_sorption.mtw, which provides the sorption rates
into a test medium of three organic solvents --- aromatics, chloroalkanes, and esters. Test to see if there
is significant evidence in the data that the mean sorption rates differ among these solvents.
Finally, check the assumptions for the validity of the ANOVA F test.