Lecture 19
Comparing means of more than two groups

ABD 3e Chapter 15

Chris Merkord

Learning Objectives

Explain why multiple pairwise tests inflate Type I error and why ANOVA is needed
Describe how total variation is partitioned into between-group and within-group components
Interpret sum of squares and mean squares as measures of variation
Explain the \(F\)-statistic as a ratio of between-group to within-group variation
State and interpret the hypotheses tested in a one-way ANOVA
Interpret ANOVA output, including the meaning of a significant \(F\)-test
Identify assumptions of ANOVA and assess when they may be violated
Describe alternatives to ANOVA, including data transformation and the Kruskal–Wallis test

Many biological questions involve more than two groups

Experiments often include multiple treatments, not just two
Example: two medications and a placebo control
This allows us to ask richer questions:
- Are both treatments better than the control?
- Is one treatment better than the other?
- How large are these differences?

Figure 1: Image: La Trobe University (CC BY-NC-SA 4.0).

Comparing groups two at a time inflates false positives

One approach is to run multiple two-sample tests:
- Group 1 vs 2
- Group 2 vs 3
- Group 1 vs 3
This seems reasonable, but the number of tests grows quickly: \(k\) groups require \(k(k-1)/2\) pairwise comparisons

Three treatment groups labeled A. Placebo, B. Moderate dose, and C. High dose, each shown with a differently colored bug icon. Above them are three comparison brackets labeled A-B, B-C, and A-C. — Figure 2: Pairwise comparisons among three treatment groups. Each bracket represents one of the three two-group comparisons that would be made if the groups were analyzed two at a time.

Multiple tests inflate Type I error

Problem:
- Each test has a chance of a Type I error (false positive)
- Multiple tests increase the chance of at least one false positive
Example:
- 5 groups → 10 pairwise tests
- About a 40% chance of at least one false positive if all nulls are true (\(\alpha\) = 0.05, assuming independent tests)

Line plot showing probability of at least one Type I error on the y-axis and number of pairwise comparisons on the x-axis. The curve increases from near 0 to close to 1 as the number of comparisons increases. A dashed horizontal line marks alpha = 0.05. — Figure 3: Probability of at least one Type I error as the number of pairwise comparisons increases (\(\alpha\) = 0.05). Assuming independent tests, the probability rises rapidly with more comparisons.
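The arithmetic behind the 5-group example can be checked with a short sketch. It assumes the tests are independent and each uses \(\alpha\) = 0.05; the function names `n_pairwise` and `familywise_error` are illustrative, not from the textbook.

```python
from math import comb


def n_pairwise(k: int) -> int:
    """Number of distinct two-group comparisons among k groups: k(k-1)/2."""
    return comb(k, 2)


def familywise_error(n_tests: int, alpha: float = 0.05) -> float:
    """P(at least one Type I error) across n_tests independent tests,
    assuming every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


for k in (3, 5, 10):
    m = n_pairwise(k)
    print(f"{k} groups -> {m} tests -> P(at least one false positive) = "
          f"{familywise_error(m):.3f}")
```

With 5 groups this gives 10 tests and a probability of about 0.40, matching the example above. Real pairwise tests share data and are not fully independent, so treat the value as an approximation.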

We need a single test for all groups

Goal:
- Test for differences across all groups at once
Avoid:
- Repeated testing
- Inflated Type I error

Analysis of Variance (ANOVA)

ANOVA compares all group means simultaneously

Analysis of variance (ANOVA) tests for differences among multiple means
Uses a single overall test
Based on:
- Comparing variation among groups to variation within groups
Tests:
- Are individuals from different groups, on average, more different than individuals from the same group?

ANOVA tests variation to detect differences in means

The name (analysis of variance) can be misleading:
- We are interested in means, not variances
Key idea:
- If group means differ → there will be variation among groups
Therefore:
- Testing for variation among groups tells us whether means differ

One-way ANOVA analyzes one explanatory variable

One-way ANOVA:
- One explanatory variable (factor)
- Multiple groups defined by that factor
Examples:
- Treatment type
- Habitat type
- Species

Case Study: Does light exposure affect circadian phase shift?

Study of how light exposure shifts the body’s internal clock (circadian rhythm)
22 participants randomly assigned to one of three treatments:
- No light (control)
- Light to knees
- Light to eyes
Each person received a single 3-hour light exposure
Researchers measured how much each person’s internal clock shifted

Dot plot showing individual phase shift values for three groups: control, knees, and eyes. Each group has several open circles representing participants. Filled dots indicate group means with vertical error bars. The eyes group shows more negative values (greater delays), while control and knees groups are closer to zero. — Figure 4: Phase shift in circadian rhythm (melatonin production) for participants exposed to different light treatments (control, knees, eyes). Open circles show individual participants; filled points with error bars show group means ± standard error. Whitlock & Schluter, The Analysis of Biological Data, 3e © 2020 W. H. Freeman and Company

ANOVA asks: where does the variation come from?

Data vary across individuals, even within the same group
This variation has two sources:
- real differences among groups
- random variation among individuals within groups
Goal: Compare between-group variation to within-group variation

Total variation can be partitioned

Total variation: how much all observations vary around the overall mean
This can be split into two parts:
- Between-group variation: differences among group means
- Within-group variation: variation among individuals within groups
ANOVA works by comparing these two sources of variation

Partitioning variation in a real dataset

Same data shown three ways
- Total: differences between each observation and the overall mean
- Groups: differences between each group mean and the overall mean
- Error: differences between observations and their group mean
Total variation = Groups + Error
ANOVA asks whether:
- variation among group means is large relative to variation within groups

Three-panel plot labeled Total, Groups, and Error showing the same data for control, knees, and eyes treatments. Points represent individual observations. The Groups panel shows differences among group means, while the Error panel shows variation within each group around its mean. — Partitioning total variation into between-group (Groups) and within-group (Error) components using circadian phase shift data for three light treatments. Whitlock & Schluter, The Analysis of Biological Data, 3e © 2020 W. H. Freeman and Company.
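The identity Total = Groups + Error can be verified numerically. The sketch below uses small made-up numbers for three hypothetical groups (A, B, C), not the circadian data, and the helper `partition_ss` is an illustrative name.

```python
def mean(values):
    return sum(values) / len(values)


def partition_ss(groups):
    """Return (SS_total, SS_groups, SS_error) for a dict of group -> values."""
    all_values = [x for g in groups.values() for x in g]
    grand = mean(all_values)
    # Total: each observation vs. the overall mean
    ss_total = sum((x - grand) ** 2 for x in all_values)
    # Groups: each group mean vs. the overall mean, weighted by group size
    ss_groups = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Error: each observation vs. its own group mean
    ss_error = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    return ss_total, ss_groups, ss_error


# Hypothetical data, chosen only so the arithmetic is easy to follow
data = {"A": [1.0, 2.0, 3.0], "B": [2.0, 3.0, 4.0], "C": [3.0, 4.0, 5.0]}
ss_total, ss_groups, ss_error = partition_ss(data)
print(ss_total, ss_groups, ss_error)  # 12.0 6.0 6.0
```

Here the group and error sums of squares (6 and 6) add up to the total (12), which is the partition shown in the three panels.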

Mean squares summarize variation

ANOVA uses mean squares (MS) to measure variation
Two key quantities:
- \(MS_{groups}\): variation among group means
- \(MS_{error}\): variation within groups
Each is:
- a measure of variability
- a sum of squares scaled to an average amount of variation per degree of freedom
Larger values = more variation

Start with sum of squares (SS):
- sum of squared deviations from a mean
Convert to mean squares: \(MS = \frac{SS}{df}\)
So:
- \(MS_{groups} = \frac{SS_{groups}}{df_{groups}}\)
- \(MS_{error} = \frac{SS_{error}}{df_{error}}\)
Dividing by \(df\) puts both on the same scale so they can be compared
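Written out, the two sums of squares and their degrees of freedom take the standard one-way ANOVA form below. The notation is an assumption of this sketch: \(k\) groups, \(n_i\) observations in group \(i\), \(N\) observations in total, \(\bar{Y}_i\) the mean of group \(i\), and \(\bar{Y}\) the overall mean.

\[ SS_{groups} = \sum_{i=1}^{k} n_i (\bar{Y}_i - \bar{Y})^2, \qquad df_{groups} = k - 1 \]

\[ SS_{error} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2, \qquad df_{error} = N - k \]

For the circadian case study, \(k = 3\) and \(N = 22\), so \(df_{groups} = 2\) and \(df_{error} = 19\).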

The \(F\)-statistic compares two sources of variation

ANOVA test statistic:

\[ F = \frac{MS_{groups}}{MS_{error}} \]

Interpretation:
- \(F \approx 1\) : variation among group means is about what within-group variation alone would produce (consistent with \(H_0\))
- \(F \gg 1\) : group means differ more than expected by chance
The larger the ratio:
- the stronger the evidence for differences among groups
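As a minimal sketch, the ratio can be computed directly from the sums of squares and degrees of freedom. The input values here are hypothetical (3 groups of 3 observations), and `f_statistic` is an illustrative name.

```python
def f_statistic(ss_groups, df_groups, ss_error, df_error):
    """F = MS_groups / MS_error, where each MS = SS / df."""
    ms_groups = ss_groups / df_groups
    ms_error = ss_error / df_error
    return ms_groups / ms_error


# Hypothetical values: SS_groups = 6 on 2 df, SS_error = 6 on 6 df
f = f_statistic(ss_groups=6.0, df_groups=2, ss_error=6.0, df_error=6)
print(f)  # 3.0  (MS_groups = 3.0, MS_error = 1.0)
```

Although both sums of squares equal 6 in this example, the mean squares differ because they are spread over different degrees of freedom.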

ANOVA hypotheses

Null hypothesis:

\[ H_0: \mu_1 = \mu_2 = \cdots = \mu_k \]

Alternative hypothesis:
- Not all group means are equal
Important:
- ANOVA tests for any difference, not which groups differ

Interpreting the ANOVA result

If \(F\) is large → small p-value:
- Reject \(H_0\)
- Evidence that at least one group mean differs
If \(F\) is near 1:
- Fail to reject \(H_0\)
- Differences are consistent with random variation
Next step (if significant):
- Determine which groups differ
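The link between a large \(F\) and a small p-value can be illustrated for the special case of exactly three groups (\(df_{groups} = 2\)), where the upper tail of the \(F\) distribution has a simple closed form, \(P = (1 + 2F/df_{error})^{-df_{error}/2}\). This shortcut does not apply to other numbers of groups; in general the p-value comes from statistical software. The function name and inputs below are hypothetical.

```python
def p_value_three_groups(f_stat: float, df_error: int) -> float:
    """Upper-tail probability P(F >= f_stat) for an F distribution with
    2 numerator df (three groups) and df_error denominator df.
    Valid ONLY when df_groups = 2."""
    if f_stat < 0:
        raise ValueError("F cannot be negative")
    return (1 + 2 * f_stat / df_error) ** (-df_error / 2)


# Hypothetical result: F = 3.0 with 6 error degrees of freedom
print(p_value_three_groups(3.0, 6))  # 0.125

# Larger F with the same degrees of freedom gives a smaller p-value
for f in (1.0, 3.0, 10.0):
    print(f, round(p_value_three_groups(f, 6), 4))
```

With \(F = 3\) and only 6 error degrees of freedom, \(P = 0.125\), so \(H_0\) would not be rejected at \(\alpha\) = 0.05.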
