Lecture 15
Comparing Two Means

ABD 3e Chapter 12

Chris Merkord

Learning Objectives

  • distinguish between paired and independent two-sample study designs
  • estimate and interpret the difference between two population means
  • explain how confidence intervals for the difference in means are constructed and interpreted
  • perform and interpret a Welch two-sample t-test
  • perform and interpret a paired t-test
  • identify the assumptions of two-sample and paired t-tests

Comparing Two Means

  • In a one-sample t-test, we asked whether a sample mean differs from a known or hypothesized population mean.

  • Often, however, we do not have a known population mean.

  • Instead, we have two samples, each representing a different group.

  • Because these are samples, their means will naturally differ due to random sampling variation.

  • The key question:

    Is the difference between the sample means small enough to be explained by sampling variation if both groups come from populations with the same mean?

    Or is the difference too large to be plausibly explained by chance alone?

  • To answer this question, we test whether the two populations have the same mean.

Paired versus two independent samples

Different statistical models are used depending on the relationship between the two samples:

  • Two-sample (aka independent or unpaired) comparison:
    • Each treatment group is composed of an independent, random sample of units
  • Paired comparison:
    • Both treatments are applied to every sampled unit
  • Paired designs are often more powerful because they control for extraneous variation among sampling units
Two-panel diagram comparing sampling designs. The left panel, labeled “Two-sample,” shows red and yellow points scattered independently, representing two separate groups of observations. The right panel, labeled “Paired,” shows red and yellow points arranged in pairs, indicating matched observations where each pair represents two measurements from the same unit or matched units.
Figure 1: Illustration of independent (two-sample) and paired sampling designs. In a two-sample design, observations in the two groups are independent. In a paired design, each observation in one group is matched with a corresponding observation in the other group. Whitlock & Schluter 3e.

Examples of paired and unpaired scenarios

Unpaired (independent) samples

  • Two separate groups of individuals
  • Observations in one group are not linked to observations in the other

Examples:

  • Mean plant biomass in fertilized vs. unfertilized plots
  • Mean blood pressure in patients receiving Drug A vs. Drug B
  • Mean time spent hiding in cover for fish from ponds with predators vs. ponds without predators

Paired (dependent) samples

  • Each observation in one group is linked to a specific observation in the other group
  • Often the same individual measured twice or matched pairs

Examples:

  • Leaf nitrogen concentration in the same plants before and after fertilization
  • Blood pressure before and after treatment in the same patients
  • Time spent hiding in cover for the same fish measured with and without predator cues in an experimental tank

Estimating the difference in means of two independent samples

The best estimate of the difference between two means is the difference between the two sample means:

\[ \bar{Y}_1 - \bar{Y}_2 \]

Standard error:

\[ \operatorname{SE}_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}} \]

Where:

  • \(\bar{Y}_1\) and \(\bar{Y}_2\) are the sample means
  • \(s_1^2\) and \(s_2^2\) are the sample variances
  • \(n_1\) and \(n_2\) are the sample sizes
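
A minimal R sketch of these two formulas, using two small hypothetical samples (`y1` and `y2` are made-up data, not from the textbook):

```r
# two hypothetical independent samples
y1 <- c(5.2, 6.1, 5.8, 6.4, 5.5)
y2 <- c(4.8, 5.0, 5.6, 4.9, 5.3)

# difference between the two sample means
diff_means <- mean(y1) - mean(y2)

# standard error of the difference (unpooled variances)
se_diff <- sqrt(var(y1) / length(y1) + var(y2) / length(y2))

diff_means
se_diff
```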

Confidence interval for the difference in two means

If the populations are normally distributed (or sample sizes are large), the standardized difference has a \(t\) distribution with \(df\) degrees of freedom:

\[ t=\frac{(\bar{Y}_1 - \bar{Y}_2)-(\mu_1-\mu_2)}{\operatorname{SE}_{(\bar{Y}_1 - \bar{Y}_2)}} \]

with degrees of freedom approximated by the Welch–Satterthwaite formula (a value between the smaller of \(n_1-1\) and \(n_2-1\) and their sum, \(n_1+n_2-2\)), because the standard error above does not pool the two variances. Software computes this automatically.

Thus, the confidence interval for \(\bar{Y}_1 - \bar{Y}_2\) would be:

\[ (\bar{Y}_1 - \bar{Y}_2) \pm t_{\alpha/2,df} \times \operatorname{SE}_{(\bar{Y}_1 - \bar{Y}_2)} \]

Calculating estimates in R

  • Calculate difference in means using dplyr::summarize()
  • Calculate confidence interval using t.test()
    • Defaults to Welch’s test, confidence level = 0.95
# assume a data frame (tibble) with columns `value` and `group`

# calculate the means
summarize(example_data, group_mean = mean(value), .by = group)

# calculate the confidence interval
result <- t.test(value ~ group, data = example_data)
result$conf.int

Welch’s two-sample \(t\)-test

The test evaluates whether the difference between population means is zero.

\[ H_0: \mu_1 = \mu_2 \]

\[ H_A: \mu_1 \ne \mu_2 \]

Test statistic:

\[ t=\frac{\bar{Y}_1 - \bar{Y}_2}{\operatorname{SE}_{(\bar{Y}_1 - \bar{Y}_2)}} \]

where:

\[ \operatorname{SE}_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}} \]

The degrees of freedom are estimated using the Welch–Satterthwaite formula and computed automatically by statistical software.
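
As a check on the software, the Welch–Satterthwaite degrees of freedom can be computed by hand and compared with the value `t.test()` reports. The data here are hypothetical:

```r
# two hypothetical independent samples
y1 <- c(5.2, 6.1, 5.8, 6.4, 5.5)
y2 <- c(4.8, 5.0, 5.6, 4.9, 5.3)
n1 <- length(y1); n2 <- length(y2)
v1 <- var(y1) / n1   # s1^2 / n1
v2 <- var(y2) / n2   # s2^2 / n2

# Welch–Satterthwaite approximation
df_ws <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# t.test() stores the same value in `parameter`
fit <- t.test(y1, y2)
df_ws
unname(fit$parameter)  # should match df_ws
```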

Assumptions of Welch’s two-sample t-test

  • The two groups are independent random samples from their populations.

  • The response variable is numerical.

  • The variable is approximately normally distributed in each population (or sample sizes are large).

Note

  • Another variation of the two-sample \(t\)-test assumes that the variances of the two populations are equal (the pooled-variance t-test).

  • Welch’s test does not assume equal variances, and it maintains the correct Type I error rate when the group variances or sample sizes differ.

  • Because Welch’s test also performs well when variances are equal, most modern statistical software uses Welch’s test by default (including t.test() in R).
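
In `t.test()`, the pooled-variance version is requested with `var.equal = TRUE`; otherwise Welch's test is used. A minimal sketch with simulated data:

```r
set.seed(1)
example_data <- data.frame(
  value = c(rnorm(10, mean = 5), rnorm(10, mean = 6)),
  group = rep(c("A", "B"), each = 10)
)

# Welch's test (the default): unpooled SE, Welch-Satterthwaite df
welch <- t.test(value ~ group, data = example_data)

# pooled-variance test: assumes equal variances, df = n1 + n2 - 2
pooled <- t.test(value ~ group, data = example_data, var.equal = TRUE)

unname(welch$parameter)   # fractional df from the approximation
unname(pooled$parameter)  # exactly 10 + 10 - 2 = 18
```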

Estimating mean differences of paired data: 3 steps

  1. Calculate the differences between the values in each pair:

\[ d_i = (\text{first measurement of unit }i) - (\text{second measurement of unit }i) \]

  2. Calculate the mean \(\bar{d}\) and standard deviation \(s_d\) of the differences, and note the sample size \(n\) (the number of pairs)
  3. Calculate the confidence interval:

\[ \bar{d} \pm t_{\alpha/2,df} \times \operatorname{SE}_{\bar{d}} \]

where:

\[ \operatorname{SE}_{\bar{d}} = \frac{s_d}{\sqrt{n}} \]
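
The three steps can be sketched in R with hypothetical before/after measurements on the same six units:

```r
# hypothetical paired measurements on the same 6 units
before <- c(12.1, 14.3, 11.8, 13.5, 12.9, 14.0)
after  <- c(13.0, 14.9, 12.2, 14.1, 13.3, 14.8)

d    <- after - before   # step 1: pairwise differences
dbar <- mean(d)          # step 2: mean, SD, and sample size
s_d  <- sd(d)
n    <- length(d)
se_d <- s_d / sqrt(n)

# step 3: 95% confidence interval for the mean difference
alpha <- 0.05
ci <- dbar + c(-1, 1) * qt(1 - alpha / 2, df = n - 1) * se_d
ci
```

The same interval is returned by `t.test(after, before, paired = TRUE)$conf.int`.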

The paired \(t\)-test

The test evaluates whether the mean difference between paired observations differs from a specified value (usually zero).

\[ H_0: \mu_d=0 \]

\[ H_A: \mu_d \ne 0 \]

Steps:

  1. Calculate the differences \(d\)
  2. Calculate the mean \(\bar{d}\) and standard error \(\operatorname{SE}_{\bar{d}}\) of the differences
  3. Continue with a one-sample \(t\)-test using this \(t\)-statistic:

\[ t=\frac{\bar{d} - \mu_{d_0}}{\operatorname{SE}_{\bar{d}}} \]
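
Because the paired test is just a one-sample test on the differences, the two calls below give identical statistics (the data are hypothetical):

```r
# hypothetical paired measurements on the same 6 units
before <- c(12.1, 14.3, 11.8, 13.5, 12.9, 14.0)
after  <- c(13.0, 14.9, 12.2, 14.1, 13.3, 14.8)

# paired t-test
paired_fit <- t.test(after, before, paired = TRUE)

# equivalent one-sample t-test on the differences
onesample_fit <- t.test(after - before, mu = 0)

paired_fit$statistic
onesample_fit$statistic  # same value
```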

Assumptions of the paired \(t\)-test

Same as the assumptions for a one-sample \(t\)-test:

  1. The sampling units are randomly sampled from the population

  2. The paired differences are approximately normally distributed in the population

Important

The analysis makes no assumptions about the distribution of either of the two measurements made on each sampling unit, only their differences.

The fallacy of indirect comparison

  • Make comparisons between groups directly, not indirectly.

  • Common mistake: compare each group to the same reference value and then draw conclusions about the difference between groups.

  • Example:

    • Group 1 significantly different from reference value.
    • Group 2 not significantly different from reference value.
  • This does not imply that the two groups differ from each other.

  • Instead, test or compare the difference between their means directly

Figure 2: This figure shows the estimate of the mean for two independent groups. To tell if the means are statistically different, you must compare them to each other. Do not compare each mean to a third number (the red line) to determine if the two means are equal. Whitlock & Schluter 3e.

Interpreting overlap of confidence intervals

Comparing two means and their confidence intervals visually leads to the same conclusion as a hypothesis test (\(t\)-test) in cases (a) and (b) below. In case (c), where the intervals overlap partially, a \(t\)-test is needed to determine whether the means differ.

Three-panel figure comparing means of two groups with error bars. In panel (a), the means are well separated with small error bars, indicating a statistically significant difference. In panel (b), the means differ more but the error bars are large and overlap, indicating no significant difference due to high variability. In panel (c), the means differ moderately with overlapping error bars, illustrating an unclear or inconclusive hypothesis test result.
Figure 3: Three examples illustrating how differences in means and variability affect hypothesis test results. Panel (a) shows two group means that are clearly separated relative to their variability, producing a statistically significant difference. Panel (b) shows a larger difference in means but with high variability, resulting in no statistically significant difference. Panel (c) shows overlapping uncertainty intervals, producing an inconclusive result. Whitlock & Schluter 3e

Comparing variances: estimation

  • Sometimes you want to compare the variances of two populations

  • Estimation is one option:

    1. Estimate the variances

    2. Estimate the confidence limits

    3. Plot

Plot comparing variance estimates for two groups. Each group is represented by a point showing the estimated variance with vertical error bars indicating the confidence interval around that estimate. The estimate for Group 2 is higher than that for Group 1, though the confidence intervals overlap.
Figure 4: Estimated variance for two groups with confidence intervals showing uncertainty in each estimate. Differences in variance between groups can be evaluated by comparing the magnitude of the estimates and the overlap of their confidence intervals. Whitlock & Schluter 3e.

Comparing variances: hypothesis testing

Hypothesis testing is another option:

\[ H_0: \sigma_1^2 = \sigma_2^2 \]

\[ H_A: \sigma_1^2 \ne \sigma_2^2 \]

\(F\)-test

  • Calculate the test statistic \[F=s_1^2/s_2^2\]

  • \(F\) is near 1 when the population variances are equal

  • Under \(H_0\), \(F\) has an \(F\)-distribution with \[df_1=n_1-1\] \[df_2=n_2-1\]

  • Assumes both populations are normally distributed

Levene’s test

  • More robust than the \(F\)-test to violations of the normality assumption

  • For this reason, it is the most commonly used test of equal variances

  • Can be applied to > 2 groups, e.g. \[H_0:\sigma_1^2=\sigma_2^2=\sigma_3^2\]
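
In R, the \(F\)-test is available as `var.test()` in the base stats package; Levene's test is not in base R but is provided by add-on packages such as car (`car::leveneTest()`). A sketch with simulated data:

```r
set.seed(2)
y1 <- rnorm(15, mean = 10, sd = 1)
y2 <- rnorm(15, mean = 10, sd = 2)

# F-test of H0: sigma1^2 = sigma2^2 (assumes normality)
var.test(y1, y2)

# Levene's test requires an add-on package, e.g.:
# car::leveneTest(c(y1, y2) ~ factor(rep(c("A", "B"), each = 15)))
```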

Note

These tests are useful for exploratory analysis; they are not required before running Welch’s \(t\)-test.