from http://faculty.wwu.edu/jmcl/Biostat/mm.tukey
mm.tukey JFM 2/8/2010 ANOVA using m&m positions for three kinds of m&ms, followed by Tukey multiple comparisons test (Tukey's Honest Significant Difference test). Data fabricated: random (uniform) distributions from overlapping ranges. In R. First, ANOVA: > # m&m ANOVA example > # Fabricate data > mm.pl <- sort(sample(0:40, replace=TRUE, size=10)) > mm.pl [1] 10 21 25 29 31 31 34 36 39 39 > mean(mm.pl) [1] 29.5 > sqrt(var(mm.pl)) [1] 8.947377 > > mm.pnut <- sort(sample(30:70, replace=TRUE, size=10)) > mm.pnut [1] 33 34 41 55 56 61 63 65 66 68 > mean(mm.pnut) [1] 54.2 > sqrt(var(mm.pnut)) [1] 13.35665 > > mm.al <- sort(sample(60:100, replace=TRUE, size=10)) > mm.al [1] 64 74 77 82 82 85 85 87 87 98 > mean(mm.al) [1] 82.1 > sqrt(var(mm.al)) [1] 9.048634 > > # Combine as single variable, "mm.pos" > mm.pos <- c(mm.pl, mm.pnut, mm.al) > mm.pos [1] 10 21 25 29 31 31 34 36 39 39 33 34 41 55 56 61 63 65 66 68 64 74 77 82 82 [26] 85 85 87 87 98 > > # Create factor variable (mm type) > mm.type <- rep(c("Plain", "Peanut", "Almond"), c(10, 10, 10)) > mm.type <- factor(mm.type) > > # View data in strip plot > stripplot(mm.type ~ mm.pos) > # Better: add "jitter" to view overlapping points: > stripplot(mm.type ~ jitter(mm.pos, 0.4), xlim=c(0,100), + xlab="m&m position (cm)") > > # Analysis of Variance: > # H_o: all 4 means equal > # select alpha = 0.05 > > mm.aov <- aov(mm.pos ~ mm.type) > anova(mm.aov) Analysis of Variance Table Response: mm.pos Terms added sequentially (first to last) Df Sum of Sq Mean Sq F Value Pr(F) mm.type 2 13850.87 6925.433 61.04691 9.587497e-11 Residuals 27 3063.00 113.444 > > # Conclusion: Reject H_o; conclude not all means equal (P < 10^-10) ******************************************************************** Next, Tukey multiple comparisons test (Tukey's Honest Significant Difference test). > # Apply TukeyHSD function to output of ANOVA: > TukeyHSD(mm.aov) Tukey multiple comparisons of means 95% family-wise confidence level Fit: aov(formula = mm.pos ~ mm.type) $mm.type diff lwr upr p adj Peanut-Almond -27.9 -39.71017 -16.08983 9.00e-06 Plain-Almond -52.6 -64.41017 -40.78983 0.00e+00 Plain-Peanut -24.7 -36.51017 -12.88983 5.36e-05 > # Conclusion: Means of all three populations differ from each other. > # (i.e., all pairs differ) > # To plot means and confidence intervals: > plot(TukeyHSD(mm.aov)) > > # Alternatively, perform multiple comparisons test using R's pairwise.t.test. > # Specify the response variable first ("mm.pos"), > # followed by factor variable ("mm.type"), delimited by a comma ",". > pairwise.t.test(mm.pos, mm.type) Pairwise comparisons using t tests with pooled SD data: mm.pos and mm.type Almond Peanut Peanut 6.2e-06 - Plain 4.9e-11 1.9e-05 P value adjustment method: holm > > # Conclusion: Same as above, means of all three populations differ.