ABD 3e Chapter 10
Many numerical variables have bell-shaped frequency distributions.
Normal distribution = theoretical probability distribution describing many bell curves
Continuous numerical variables
Symmetric, unimodal
Lower probability further from mean
The normal distribution is a continuous probability distribution describing a bell-shaped curve.
It has two parameters that describe its location and spread:
Mean (location)
Standard deviation (spread)
Examples
Human body temperature, in degrees Fahrenheit (Shoemaker 1996)
University undergraduate brain size (measured in number of megapixels on an MRI scan) (Willerman et al. 1991)
The number of bristles on the fourth and fifth segments of the abdomens of fruit flies (Falconer and Mackay 1995)
The black lines show normal distributions with the same mean and standard deviation as measured in the data.
Rarely used by hand (do not memorize)
Mean can be any value
Standard deviation can be any positive value
Thus "normal distribution" is really an infinite number of distributions, each with its own
Mean
Standard deviation
\[ f(Y) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(Y-\mu)^2}{2\sigma^2}} \]
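The density formula above can be evaluated directly. The following is a minimal sketch in Python, not part of the original slides; the function name `normal_pdf` is hypothetical, and the result is cross-checked against `statistics.NormalDist` from the standard library.

```python
import math
from statistics import NormalDist


def normal_pdf(y: float, mu: float, sigma: float) -> float:
    """Probability density f(Y) of a normal distribution at the value y."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coefficient = 1.0 / math.sqrt(2.0 * math.pi * sigma**2)
    exponent = -((y - mu) ** 2) / (2.0 * sigma**2)
    return coefficient * math.exp(exponent)


# Density at the mean of a standard normal distribution (mu = 0, sigma = 1).
density_at_mean = normal_pdf(0.0, mu=0.0, sigma=1.0)

# Cross-check against the standard library's implementation.
library_value = NormalDist(mu=0.0, sigma=1.0).pdf(0.0)
print(density_at_mean, library_value)
```

The density is highest at the mean and falls off symmetrically on both sides, which matches the "lower probability further from mean" property.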
Example: What proportion of the population have values between X and Y?
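To solve: Convert X and Y to \(Z\)-scores, then subtract the cumulative probability at the smaller \(Z\)-score from the cumulative probability at the larger one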
\[ Z = \frac{Y-\mu}{\sigma} \]
Where:
\(Z\) is a standard normal value
\(Y\) is any particular value
\(\mu\) is the population mean
\(\sigma\) is the population standard deviation
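As a rough illustration of the standardization formula, the sketch below converts two values to standard normal values and takes the difference of their cumulative probabilities to get the proportion between them. The function names and the example numbers (mean 100, standard deviation 15) are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def z_score(y: float, mu: float, sigma: float) -> float:
    """Standard normal value Z for a particular value y."""
    return (y - mu) / sigma


def proportion_between(lower: float, upper: float, mu: float, sigma: float) -> float:
    """Proportion of a normal population with values between lower and upper."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    z_lower = z_score(lower, mu, sigma)
    z_upper = z_score(upper, mu, sigma)
    # Area below the upper Z minus area below the lower Z.
    return standard_normal.cdf(z_upper) - standard_normal.cdf(z_lower)


# Hypothetical population: mean 100, standard deviation 15.
# The range 85 to 115 is one standard deviation either side of the mean.
p = proportion_between(85.0, 115.0, mu=100.0, sigma=15.0)
print(round(p, 4))
```

With these made-up numbers the range covers one standard deviation on each side of the mean, so the result should be about 0.68.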
In many (most?) kinds of modeling, raw values of predictor variables should be converted to \(Z\)-scores.
Two reasons:
Effect sizes of variables can be compared directly (e.g. when doing multiple regression)
Facilitates model convergence (computers have an easier time optimizing likelihoods to estimate model parameters). In terms that matter to you: the computer goes faster and is less likely to throw an error.
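A minimal sketch of standardizing predictors before modeling follows. It assumes the sample mean and sample standard deviation stand in for \(\mu\) and \(\sigma\), which is a common practical choice but is not stated in the slides. The function name `standardize` and both data sets are hypothetical.

```python
from statistics import mean, stdev


def standardize(values: list[float]) -> list[float]:
    """Convert raw values to Z-scores using the sample mean and sample SD."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - center) / spread for v in values]


# Hypothetical predictors measured on very different scales.
elevation_m = [1200.0, 1500.0, 1800.0, 2100.0, 2400.0]
temperature_c = [12.5, 10.1, 8.4, 6.0, 3.2]

elevation_z = standardize(elevation_m)
temperature_z = standardize(temperature_c)

# After standardizing, both predictors have mean 0 and standard deviation 1,
# so their regression coefficients are on a comparable scale.
print([round(z, 3) for z in elevation_z])
print([round(z, 3) for z in temperature_z])
```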
\(\bar{Y}\) is the mean of a single sample.
If a variable \(Y\) has a normal distribution in a population, then the distribution of sample means is also normal
The standard deviation of the sampling distribution for \(\bar{Y}\) is known as the standard error of the mean
\[ \sigma_{\bar{Y}}=\frac{\sigma}{\sqrt{n}} \]
You can calculate the probability of obtaining a sample with a mean in a given range:
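To do this, calculate the \(Z\)-score for a given sample mean \(\bar{Y}\), using the standard error \(\sigma_{\bar{Y}}\) in place of \(\sigma\)
Then calculate the \(Z\)-score for another sample mean \(\bar{Y}\) in the same way
Subtract the cumulative probability at the smaller \(Z\)-score from the cumulative probability at the larger one to get the area under the curve (AUC) between the two \(Z\)-scores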
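The steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a prescribed method: the function names are hypothetical, and the population values (mean 50, standard deviation 10, sample size 25) are made up for the example.

```python
import math
from statistics import NormalDist


def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sampling distribution of the sample mean."""
    return sigma / math.sqrt(n)


def prob_sample_mean_between(
    lower: float, upper: float, mu: float, sigma: float, n: int
) -> float:
    """Probability that the mean of a sample of size n lies between lower and upper."""
    se = standard_error(sigma, n)
    # Z-scores for sample means use the standard error, not sigma itself.
    z_lower = (lower - mu) / se
    z_upper = (upper - mu) / se
    standard_normal = NormalDist()
    return standard_normal.cdf(z_upper) - standard_normal.cdf(z_lower)


# Hypothetical population: mean 50, standard deviation 10, samples of n = 25.
se = standard_error(sigma=10.0, n=25)
p = prob_sample_mean_between(48.0, 52.0, mu=50.0, sigma=10.0, n=25)
print(se, round(p, 4))
```

Here the standard error is 2, so the range 48 to 52 spans one standard error either side of the population mean.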
The natural log of growth (change in radius per year in mm) of Engelmann spruce is approximately normally distributed with mean of 0.037 log units and standard deviation 0.385.
Following these steps, determine the probability that a tree has a bad year, defined as having growth less than -0.050 log units in a year.
\[ \mu \pm 3\sigma \\ 0.037 \pm 3 \times 0.385 \\ 0.037 \pm 1.155 \\ -1.118 < Y < 1.192 \]
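As a quick check of this arithmetic, the sketch below recomputes the range three standard deviations either side of the mean and the share of the distribution it covers (roughly 99.7% for any normal distribution). The variable names are illustrative only.

```python
from statistics import NormalDist

mu = 0.037     # mean log growth, from the example
sigma = 0.385  # standard deviation of log growth, from the example

lower = mu - 3 * sigma
upper = mu + 3 * sigma

# Fraction of the distribution that falls inside the three-sigma range.
growth = NormalDist(mu=mu, sigma=sigma)
coverage = growth.cdf(upper) - growth.cdf(lower)

print(round(lower, 3), round(upper, 3), round(coverage, 4))
```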



(i.e., those values less than -0.05)
Calculate the standard normal deviate (\(Z\)) associated with the value we are interested in here, -0.05
Mean \(\mu=0.037\) log units
Standard deviation \(\sigma=0.385\) log units
Value of interest \(Y=-0.05\) log units
\[ Z = \frac{Y-\mu}{\sigma} \]
\[ Z = \frac{-0.05-0.037}{0.385} \]
\[ Z \approx -0.226 \]
We are interested in the probability of getting a value less than -0.226.
What is the probability that a random draw from a standard normal distribution will be greater than 0.226?
What is the probability that a random draw from a standard normal distribution will be less than -0.226?
What is the probability that a tree has a bad growth year, that is, less than -0.05 log units?
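One way to work through these three questions numerically is sketched below, using `statistics.NormalDist` from the Python standard library instead of a printed table. The variable names are illustrative. Because the standard normal distribution is symmetric, all three questions have the same answer, about 0.41.

```python
from statistics import NormalDist

mu = 0.037         # mean log growth
sigma = 0.385      # standard deviation of log growth
threshold = -0.05  # a "bad year" is growth below this value

# Standard normal deviate for the threshold.
z = (threshold - mu) / sigma

standard_normal = NormalDist()  # mean 0, standard deviation 1

# P(Z < z): area in the lower tail.
p_less = standard_normal.cdf(z)

# P(Z > -z): the mirror-image upper tail, equal by symmetry.
p_greater_mirror = 1.0 - standard_normal.cdf(-z)

# Same probability computed on the original scale, without standardizing.
p_bad_year = NormalDist(mu=mu, sigma=sigma).cdf(threshold)

print(round(z, 3), round(p_less, 3), round(p_greater_mirror, 3), round(p_bad_year, 3))
```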

BIOL 275 Biostatistics | Spring 2026