ABD 3e Chapter 14
Observational studies suggested that hormone replacement therapy (HRT) reduced heart disease risk in women (Stampfer et al. 1991 New England J Med)
Women taking HRT had lower rates of heart disease
Conclusion (at the time): HRT protects against heart disease
A randomized trial (Women’s Health Initiative) assigned women to HRT or placebo at random
Result: HRT did not reduce heart disease risk (and increased some risks)
The original association was due to confounding, not causation
Outcomes can change simply because a treatment is given
A placebo mimics the treatment without an active ingredient
Good placebo: indistinguishable from the treatment
Bad placebo: differs in noticeable ways (e.g., taste, side effects)
People often seek treatment when they feel their worst.
Therefore, many are already on their way to recovery when they see their doctor, and they would improve with or without treatment.
To measure the effects of a new therapy, we need a comparable control group.
Create a tibble with an id column holding the values 1 through 10
Add a treatment variable using the sample() function, which randomly draws values
Each drawn value is either control or treatment
n() ensures the number of values in the sample matches the number of rows in the tibble
replace=TRUE is needed to draw 10 values from only 2 labels; each unit is assigned independently, so group sizes can be unequal
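The steps described above can be sketched in R as follows. This is a minimal sketch, assuming the dplyr and tibble packages are installed; the object names `assignments` and `balanced` and the seed value are illustrative choices, not taken from the original slides. The second tibble shows one possible way to force equal group sizes instead.

```r
library(dplyr)
library(tibble)

set.seed(275)  # arbitrary seed (an assumption) so the draw is reproducible

# Independent random assignment: each unit gets control or treatment
assignments <- tibble(id = 1:10) %>%
  mutate(treatment = sample(c("control", "treatment"),
                            size = n(),       # one draw per row
                            replace = TRUE))  # draws are independent

# Alternative: build 5 labels of each kind, then shuffle them,
# which guarantees a balanced design
balanced <- tibble(id = 1:10) %>%
  mutate(treatment = sample(rep(c("control", "treatment"),
                                length.out = n())))

count(assignments, treatment)  # group sizes may be unequal
count(balanced, treatment)     # always 5 and 5
```

With independent draws the two groups can differ in size by chance; shuffling a fixed set of labels keeps the assignment random while fixing the group sizes.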
Key design strategies:
Replication
Balance
Blocking
Using extreme treatments
Goal: reduce noise without sacrificing generality
\[ \operatorname{SE}_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]
Balanced design = equal sample size in each treatment
Unbalanced design = unequal sample sizes
For a fixed total sample size, the standard error is smallest when \(n_1 = n_2\):
\[ \operatorname{SE}_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]
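To see why balance matters, the term \(\frac{1}{n_1} + \frac{1}{n_2}\) in the formula above can be evaluated for every possible split of a fixed total. The sketch below assumes a hypothetical total of 20 units; the variable names are illustrative.

```r
total_n <- 20            # hypothetical fixed total sample size
n1 <- 1:(total_n - 1)    # every possible size for group 1
n2 <- total_n - n1       # group 2 gets the remaining units

# The part of the standard error that depends on the allocation
se_factor <- sqrt(1 / n1 + 1 / n2)

# Allocation with the smallest standard error
best_n1 <- n1[which.min(se_factor)]
best_n1  # 10, the balanced split

# Compare a balanced split with a lopsided one
se_factor[n1 == 10]
se_factor[n1 == 2]
```

For any given pooled variance, the balanced split gives the smallest standard error of the difference between means.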
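Blocking: group similar experimental units into blocks
Units within a block: share similar conditions (e.g., the same location or time)
Goal: remove variation among blocks from the comparison between treatments
Within each block: assign treatments to units at random
Analyze differences within blocks, not across all units
Conceptually: a paired design extended to allow more than two treatments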
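Randomizing within blocks can be sketched in R as shown here. This is an illustration under stated assumptions: the data are hypothetical (5 blocks of 2 units each), the column names `block` and `unit` are invented for the example, and the dplyr and tibble packages are assumed to be installed.

```r
library(dplyr)
library(tibble)

set.seed(14)  # arbitrary seed (an assumption) for reproducibility

# Hypothetical experimental units: 5 blocks with 2 units in each
units <- tibble(block = rep(1:5, each = 2),
                unit  = 1:10)

# Shuffle the two treatment labels separately inside every block,
# so each block receives exactly one control and one treatment
blocked <- units %>%
  group_by(block) %>%
  mutate(treatment = sample(c("control", "treatment"))) %>%
  ungroup()

# Rows are blocks, columns are treatments; every cell should be 1
table(blocked$block, blocked$treatment)
```

Because every block contains both treatments, the treatment comparison is made between units that share the same block conditions.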
Clark and Tilman (2008) studied whether nitrogen addition reduces plant diversity
Typical (background) N deposition: ~1–10 kg N ha⁻¹ yr⁻¹
Experimental treatments: Up to 100 kg N ha⁻¹ yr⁻¹ (extreme)
Why use extreme levels?
Result: Clear decline in species richness with higher N
Treatment effects are easiest to detect when they are large
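Small differences: hard to distinguish from sampling error
Large differences: easier to detect with the same sample size
Strategy: include extreme treatments first, then test intermediate levels once an effect is established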
Factorial design: two factors with two levels each

| | Variable B: B₁ | Variable B: B₂ |
|---|---|---|
| Variable A: A₁ | A₁B₁ | A₁B₂ |
| Variable A: A₂ | A₂B₁ | A₂B₂ |
Study in Cook et al. (2015) Addiction
Outcome: % reduction in cigarettes/day
4 factors, each with 2 levels (yes vs no)
Design: 2 × 2 × 2 × 2 = 16 combinations
Key idea: Effects depend on combinations of treatments (interactions)
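The full set of treatment combinations in a 2 × 2 × 2 × 2 design can be enumerated in R. In the sketch below the factor names `factor_1` through `factor_4` are placeholders, since the actual factors are not listed here; only the structure (four yes/no factors) comes from the design described above.

```r
levels_yes_no <- c("no", "yes")

# Placeholder factor names (hypothetical); each factor has two levels
design <- expand.grid(factor_1 = levels_yes_no,
                      factor_2 = levels_yes_no,
                      factor_3 = levels_yes_no,
                      factor_4 = levels_yes_no,
                      stringsAsFactors = FALSE)

nrow(design)   # 16 treatment combinations
head(design)   # first few combinations
```

Each row is one treatment combination, and participants would be assigned at random among the 16 rows.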
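Planning sample size for precision
Parameter of interest: the difference between two treatment means
\[ \mu_1 - \mu_2 \]
Estimate: the difference between the two sample means
\[ \bar{Y}_1 - \bar{Y}_2 \]
Approximate 95% confidence interval:
\[ (\bar{Y}_1 - \bar{Y}_2) \pm \text{margin of error} \]
The margin of error is roughly
\[ 2 \times \operatorname{SE} \]
With \(n\) units in each treatment and a common standard deviation \(\sigma\):
\[ \operatorname{SE} = \sqrt{\frac{2\sigma^2}{n}} \]
Solving for the sample size needed in each treatment:
\[ n \approx \frac{8\sigma^2}{(\text{margin of error})^2} \]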
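The sample size formula above can be wrapped in a small R function. This is a sketch only: the function name `n_for_precision` and the example values (σ = 10, margin of error = 5) are hypothetical, and the result is the approximate number of units needed in each treatment.

```r
# Approximate sample size per treatment for a desired margin of error,
# using n = 8 * sigma^2 / margin^2 (margin taken as 2 * SE)
n_for_precision <- function(sigma, margin) {
  stopifnot(sigma > 0, margin > 0)
  ceiling(8 * sigma^2 / margin^2)  # round up to whole units
}

# Hypothetical example: sigma = 10, desired margin of error = 5
n_per_group <- n_for_precision(sigma = 10, margin = 5)
n_per_group  # 32

# Halving the margin of error quadruples the required sample size
n_for_precision(sigma = 10, margin = 2.5)  # 128
```

Because the margin of error enters as a square, modest gains in precision require large increases in sample size.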
Experiments assign treatments and enable causal inference
Bias is reduced through controls, randomization, and blinding
Randomization balances confounding variables on average
Observational studies lack randomization and have weaker inference
Confounding in observational studies is reduced by matching and adjustment
Sampling error is reduced by replication, balance, and blocking
Extreme treatments increase the ability to detect effects
Factorial designs test multiple factors and their interactions
Sample size is planned for precision or power
Study design involves tradeoffs among precision, cost, and feasibility

BIOL 275 Biostatistics | Spring 2026