BIOL 275 Biostatistics – Lecture 7: Probability

Learning Objectives

By the end of this lecture, you should be able to:

Explain probability as long-run relative frequency and how it links samples to populations.
Distinguish between discrete probability distributions and continuous probability densities.
Apply the addition rule and multiplication rule of probability.
Determine whether events are mutually exclusive, independent, or dependent.
Compute and interpret conditional probabilities, including simple multi-step processes.
Use probability trees to calculate probabilities across sequential events.

Probability theory: the foundation of statistical thinking

Probability is the true relative frequency of an event
It is the proportion of times the event would occur if the same process were repeated many times
Probability values range from 0 to 1
Written as Pr(event)

A horizontal probability scale labeled from 0 (impossible) to 1 (certain), with intermediate labels unlikely, even chance, and likely, and example icons illustrating a 1-in-6 chance, a fair coin toss, and a 4-in-5 chance. — Figure 1: Conceptual probability scale from impossible to certain, showing that probability values range from 0 to 1, with examples including a 1-in-6 chance, an even chance, and a 4-in-5 chance. Source: Illustration generated by ChatGPT.

Probability theory and statistics

Probability theory links sample estimates to population parameters
Samples:
- Come from data
- Vary by chance
Populations:
- Usually unobserved
- Have fixed values
Probability explains how chance affects what we observe

Venn diagrams and sample space

A Venn diagram represents all possible outcomes of a random trial
The entire diagram corresponds to the sample space
Events are shown as regions within the sample space
Larger areas represent events with higher probability

Example: \(\operatorname{Pr}[A] > \operatorname{Pr}[B]\)

A Venn diagram with two non-overlapping circles labeled A and B. Circle A is larger than circle B, indicating a higher probability for event A than event B. The circles are different colors and partially transparent on a transparent background. — Figure 2: Venn diagram representing the sample space of a random trial, with two mutually exclusive events where event A has a larger area than event B.

Mutually exclusive vs. non-exclusive events

Mutually exclusive events

Cannot occur at the same time
Outcomes belong to only one event

\[ Pr[A \text{ and } B] = 0 \]

A Venn diagram with two non-overlapping circles labeled A and B, indicating mutually exclusive events with no shared outcomes. — Figure 3: Venn diagram of two mutually exclusive events, where events A and B do not overlap.

Not mutually exclusive events

Can occur at the same time
Outcomes may belong to both events

\[ Pr[A \text{ and } B] > 0 \]

A Venn diagram with two overlapping circles labeled A and B, showing a shared region that represents outcomes common to both events. — Figure 4: Venn diagram of two non-exclusive events, where events A and B overlap.

Probability distributions

A probability distribution gives the true relative frequency of every possible value of a random variable
Some probability distributions:
- Can be described mathematically
- Are simply a list of possible outcomes with their probabilities

Example: the outcomes from rolling a fair six-sided die

A probability distribution showing six discrete outcomes from a fair die, with each outcome assigned the same probability value. — Figure 5: Probability distribution for the outcomes of a fair six-sided die, where each possible outcome has equal probability.

Probability: distributions vs. densities

Probability distributions describe discrete variables

Each discrete outcome has its own probability, which can be greater than 0
Probabilities across all outcomes sum to 1

Probability densities describe continuous variables

The probability of any single exact value is effectively 0
Probabilities are defined over ranges, not single values
The total area under the curve integrates to 1

Two vertically stacked plots based on a normal distribution. The top plot shows a discrete probability distribution from binning a normal distribution into intervals, with bars labeled by the bin ranges. The bottom plot shows a smooth normal probability density curve over the same range. — Figure 6: Comparison of a probability distribution and a probability density based on the same normal distribution. The top panel shows a discretized probability distribution, while the bottom panel shows the corresponding continuous probability density.
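The contrast between a density and a discretized distribution can be sketched in a few lines of Python. This is only an illustration: the standard normal (mean 0, standard deviation 1) and the bins of width 1 are assumptions chosen for convenience, not values taken from the figure.

```python
from statistics import NormalDist

# Standard normal, chosen only for illustration (an assumption)
z = NormalDist(mu=0, sigma=1)

# For a continuous variable, probability belongs to a range:
# it is the area under the density curve between two values.
p_range = z.cdf(1) - z.cdf(-1)

# Binning the same curve into intervals of width 1 (from -4 to 4)
# turns the density into a discrete probability distribution.
edges = list(range(-4, 5))
bin_probs = [z.cdf(hi) - z.cdf(lo) for lo, hi in zip(edges[:-1], edges[1:])]

# The height of the curve at a point is a density, not a probability.
height_at_zero = z.pdf(0)

print(f"Pr[-1 < X < 1] = {p_range:.4f}")
print(f"Sum of bin probabilities = {sum(bin_probs):.4f}")
print(f"Density height at 0 = {height_at_zero:.4f}")
```

The bin probabilities sum to just under 1 because a tiny amount of area lies beyond the outermost bins.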

Proportions

A proportion is the number of times an event occurs divided by the number of trials
Proportions range from 0 to 1
A proportion can be viewed as a realized sample from a probability distribution
With more trials, proportions tend to get closer to the underlying probability
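A short simulation can make this last point concrete. The sketch below assumes a fair six-sided die and tracks the proportion of rolls that show a 3; the seed and the trial counts are arbitrary choices made for illustration.

```python
import random

random.seed(275)  # arbitrary fixed seed so the run is repeatable

TRUE_PROB = 1 / 6  # probability of rolling a 3 with a fair die


def proportion_of_threes(n_trials):
    """Roll a fair die n_trials times and return the proportion of 3s."""
    hits = sum(1 for _ in range(n_trials) if random.randint(1, 6) == 3)
    return hits / n_trials


for n in (10, 100, 10_000, 100_000):
    p_hat = proportion_of_threes(n)
    print(f"n = {n:>7}: proportion = {p_hat:.4f} (true probability {TRUE_PROB:.4f})")
```

With few trials the proportion can be far from 1/6; with many trials it typically settles close to it.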

The General Addition Principle

The probability of A or B counts every outcome in A and every outcome in B, but counts outcomes in both only once
When two events overlap, their shared outcomes must be subtracted once

\[ \operatorname{Pr}[A \text{ or } B] = \operatorname{Pr}[A] + \operatorname{Pr}[B] - \operatorname{Pr}[A \text{ and } B] \]

A Venn diagram with two overlapping circles labeled A and B. The diagram shows that the combined area representing A or B equals the area of A plus the area of B, with the overlapping region subtracted once to avoid double counting. — Figure 7: Visual illustration of the General Addition Principle, showing that the probability of A or B equals the probability of A plus the probability of B minus the probability of their overlap.
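The principle can be checked by counting outcomes directly. The sketch below uses one roll of a fair die; the two events (an even roll, and a roll greater than 3) are hypothetical examples chosen because they overlap.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {2, 4, 6}                      # hypothetical event: the roll is even
B = {4, 5, 6}                      # hypothetical event: the roll is greater than 3


def pr(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


p_or_direct = pr(A | B)                # count the outcomes in A or B directly
p_or_rule = pr(A) + pr(B) - pr(A & B)  # general addition principle

print(p_or_direct, p_or_rule)  # 2/3 2/3
```

Adding Pr[A] and Pr[B] without the subtraction would give 1, because the shared outcomes 4 and 6 would be counted twice.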

A special case of the addition principle

When two events are mutually exclusive, they cannot occur at the same time
There is no overlap between events

\[ \operatorname{Pr}[A \text{ or } B] = \operatorname{Pr}[A] + \operatorname{Pr}[B] \]

A diagram of ABO and Rh blood types organized into non-overlapping regions, showing that each blood type category is mutually exclusive with the others. — Figure 8: ABO and Rh blood types arranged into mutually exclusive categories, illustrating a case where events have no overlapping outcomes.

Example: probability of a range

Consider the sum of two fair six-sided dice
Possible sums range from 2 to 12
We want the probability that the sum is between 6 and 8, inclusive

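\[ \begin{aligned} \operatorname{Pr}[6 \text{ or } 7 \text{ or } 8] &= \operatorname{Pr}[6] + \operatorname{Pr}[7] + \operatorname{Pr}[8] \\ &= \frac{5}{36} + \frac{6}{36} + \frac{5}{36} = \frac{16}{36} \approx 0.44 \end{aligned} \]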

A bar chart showing the probability distribution for the sum of two dice from 2 to 12. Bars corresponding to sums of 6, 7, and 8 are highlighted, while other sums are shown in a lighter color. — Figure 9: Probability distribution of the sum of two fair six-sided dice, with outcomes 6 through 8 highlighted to illustrate calculating the probability of a range.
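The distribution of the sum can be built by listing all 36 equally likely ordered outcomes of two fair dice. The following is a minimal sketch of that enumeration; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes (first die, second die)
rolls = list(product(range(1, 7), repeat=2))

# Count how many outcomes produce each sum, then convert to probabilities
counts = Counter(a + b for a, b in rolls)
dist = {total: Fraction(n, len(rolls)) for total, n in sorted(counts.items())}

# The sums 6, 7, and 8 are mutually exclusive, so their probabilities add
p_6_to_8 = dist[6] + dist[7] + dist[8]

print(p_6_to_8)  # 4/9, which is 16/36 in lowest terms
```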

Example: probabilities sum to 1

Consider a single fair six-sided die
Group outcomes into mutually exclusive events
Together, these events include all possible outcomes
The probabilities of all such events must sum to 1

Event A: rolling 1–4
Event B: rolling 5–6

\[ \begin{aligned} \operatorname{Pr}[1\text{–}4 \text{ or } 5\text{–}6] &= \operatorname{Pr}[1\text{–}4] + \operatorname{Pr}[5\text{–}6] \\ &= \frac{4}{6} + \frac{2}{6} = 1 \end{aligned} \]

The General Multiplication Principle

The probability that A and B both occur depends on whether the events are independent
In general, the probability of A and B equals the probability of A, multiplied by the probability of B given A

\[ \operatorname{Pr}[A \text{ and } B] = \operatorname{Pr}[A] \times \operatorname{Pr}[B \mid A] \]

Read “|” as “given”

Independence

Two events are independent if the occurrence of one does not change the probability of the other
When events are independent:

\[ \operatorname{Pr}[A \mid B] = \operatorname{Pr}[A] \]

Likewise, \(\operatorname{Pr}[B \mid A] = \operatorname{Pr}[B]\), so the general multiplication rule simplifies to a simple product

A 6 by 6 grid showing all possible outcomes of rolling two dice, labeled as ordered pairs. A highlighted row shows outcomes where the first roll is 3, and a highlighted column shows outcomes where the second roll is 3, illustrating that each has probability one sixth and that the events are independent. — Figure 10: Sample space for rolling two fair dice, illustrating independence between the first and second roll. The probability of rolling a 3 on one roll remains 1/6 regardless of the outcome of the other roll.
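Independence in the two-dice grid can be verified by counting cells. This sketch assumes two fair dice and checks that knowing the first roll is a 3 leaves the probability of a 3 on the second roll unchanged.

```python
from fractions import Fraction
from itertools import product

# The 6 by 6 grid of ordered outcomes (first roll, second roll)
outcomes = list(product(range(1, 7), repeat=2))


def pr(event):
    """Probability of an event, given as a true/false test on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def first_is_3(o):
    return o[0] == 3


def second_is_3(o):
    return o[1] == 3


def both_are_3(o):
    return first_is_3(o) and second_is_3(o)


# Conditional probability: Pr[second is 3 | first is 3]
p_second_given_first = pr(both_are_3) / pr(first_is_3)

print(pr(second_is_3), p_second_given_first)  # 1/6 1/6
print(pr(both_are_3), pr(first_is_3) * pr(second_is_3))  # 1/36 1/36
```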

Probabilities for independent events

When two events are independent, one does not affect the probability of the other
The probability that both events occur is the product of their individual probabilities
This is a special case of the general multiplication principle

\[ \operatorname{Pr}[A \text{ and } B] = \operatorname{Pr}[A] \times \operatorname{Pr}[B] \]

Example: Oguchi disease

Oguchi disease is an autosomal recessive condition
The disease is expressed only if an individual inherits:
- One mutant allele from mom
- One mutant allele from dad
Parents who carry one mutant allele typically do not show symptoms

Side-by-side fundus photographs of the retina showing abnormal retinal coloration and vascular patterns associated with Oguchi disease and its progression to retinitis pigmentosa, with visible changes in retinal structure and blood vessels over long-term disease progression. — Figure 11: Fundus photographs illustrating retinal changes associated with Oguchi disease and its progression to retinitis pigmentosa, showing characteristic alterations in retinal appearance and vasculature after long-term disease progression (Nishiguchi et al. 2020).

Question: Child of Two Oguchi Carriers

Question

If both parents have one copy of the disease allele, what is the probability that a given child will have Oguchi disease?

Thinking

A child has a \(\frac{1}{2}\) chance of inheriting mom’s affected chromosome AND a \(\frac{1}{2}\) chance of inheriting dad’s affected chromosome. The two inheritances are independent, so the probabilities multiply.

Answer

The probability that a given child of heterozygotes has the disease is \(\frac{1}{2} \times \frac{1}{2} = \frac{1}{4}\).
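The same answer comes from listing the four equally likely allele combinations. In this sketch the allele labels "A" (normal) and "a" (mutant) are assumptions introduced for illustration.

```python
from fractions import Fraction
from itertools import product

# Each carrier parent has one normal allele "A" and one mutant allele "a"
# (the labels are hypothetical, chosen for this example)
mom = ("A", "a")
dad = ("A", "a")

# Four equally likely combinations: (allele from mom, allele from dad)
genotypes = list(product(mom, dad))

# The disease is expressed only with a mutant allele from both parents
affected = [g for g in genotypes if g == ("a", "a")]
p_affected = Fraction(len(affected), len(genotypes))

print(genotypes)
print(p_affected)  # 1/4
```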

Visualization: two children of Oguchi carriers

If both parents have an affected chromosome but no disease:
- What’s the probability that both of their children will have Oguchi disease?

Answer:

\[ \begin{aligned} \operatorname{Pr}[\text{A affected and B affected}] &= \operatorname{Pr}[\text{A affected}] \times \operatorname{Pr}[\text{B affected}] \\ &= \frac{1}{4} \times \frac{1}{4} = \frac{1}{16} \end{aligned} \]

A 2×2 grid representing outcomes for two children of Oguchi disease carriers. The horizontal axis indicates whether child two is affected (No, Yes), and the vertical axis indicates whether child one is affected (No, Yes). The cells are labeled with the number of affected children (0, 1, or 2), with darker shading indicating outcomes with more affected children. — Figure 12: Outcome grid showing the possible numbers of children affected by Oguchi disease when two children are born to heterozygous carrier parents, illustrating the probabilities of zero, one, or two affected children.

Probability trees

Probability tree: a diagram that can be used to calculate the probabilities of combinations of events resulting from multiple random trials

Let’s revisit the example of two children of Oguchi carriers

Probability Tree Step 1: Write down all possible outcomes for each event (one, two, and so on) and connect them

Probability Tree Step 2: Write down the probability of each outcome, conditional on the path leading to it

Probability Tree Step 3: Multiply the probabilities along each path

Probability Tree Step 4: Sum paths that lead to the same destination

\[ \operatorname{Pr}[2 \text{ affected}] = \frac{1}{16} \]

\[ \operatorname{Pr}[1 \text{ affected}] = \frac{6}{16} \]

\[ \operatorname{Pr}[0 \text{ affected}] = \frac{9}{16} \]
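The tree for two children can be walked in code: list every path, multiply the probabilities along it, and add up the paths that end with the same number of affected children. This is a sketch that assumes each child is affected independently with probability 1/4.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

P_AFFECTED = Fraction(1, 4)

# Branch probabilities for one child: True = affected, False = not affected
branch = {True: P_AFFECTED, False: 1 - P_AFFECTED}

totals = defaultdict(Fraction)  # number of affected children -> probability
for child1, child2 in product(branch, repeat=2):
    path_prob = branch[child1] * branch[child2]  # multiply along the path
    totals[child1 + child2] += path_prob         # sum paths with the same destination

for n_affected in sorted(totals):
    print(f"{n_affected} affected: {totals[n_affected]}")
```

The fractions print in lowest terms, so 6/16 appears as 3/8.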

Dependent events

Two events are dependent if the occurrence of one changes the probability of the other
Knowing that one event occurred provides information about the other
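In this case, \(\operatorname{Pr}[A]\) and \(\operatorname{Pr}[B]\) cannot simply be multiplied; the general multiplication principle with a conditional probability is needed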

\[ \operatorname{Pr}[A \text{ and } B] \neq \operatorname{Pr}[A] \times \operatorname{Pr}[B] \]

Example: surviving the Titanic

Of the \(2092\) adults on the Titanic:
- \(319\) (approximately \(0.152\)) traveled in first class (more expensive)
- \(654\) (approximately \(0.312\)) survived
If survival and traveling in first class were independent:
- We would expect about \(\frac{319}{2092} \times \frac{654}{2092} \times 2092 \approx 100\) first-class adults to survive
- We would expect about \(\frac{1773}{2092} \times \frac{654}{2092} \times 2092 \approx 554\) other adults to survive

Survivors of the RMS Titanic aboard a lifeboat, illustrating unequal survival outcomes during the disaster. Source: Public domain, via Wikimedia Commons.

Surviving the Titanic depends on class

More first-class passengers survived than expected:
- \(197\) of the \(319\) adults in first class survived
- This is much higher than the \(\approx 100\) survivors expected under independence
Fewer other passengers survived than expected:
- \(457\) of the \(1773\) other adults survived
- This is much lower than the \(\approx 554\) survivors expected under independence
Survival was therefore not independent of passenger class
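The comparison between expected and observed survivors can be reproduced from the counts given in the slides. The following is a minimal sketch; the variable names are illustrative.

```python
# Counts for adults on the Titanic, as given in the slides
adults = 2092
first_class = 319
survivors = 654
observed_first = 197  # first-class adults who survived
observed_other = 457  # other adults who survived

p_first = first_class / adults
p_survive = survivors / adults

# Under independence, Pr[class and survive] = Pr[class] * Pr[survive]
expected_first = p_first * p_survive * adults
expected_other = (1 - p_first) * p_survive * adults

print(f"First class: expected {expected_first:.1f}, observed {observed_first}")
print(f"Other:       expected {expected_other:.1f}, observed {observed_other}")
```

The expected counts still add up to 654 survivors; independence only changes how they would be split between the classes.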

Conditional probability

The conditional probability of an event is the probability that the event occurs given that a condition is met
Read the symbol | as “given”
\(\operatorname{Pr}[X \mid Y]\) means the probability of X, given that Y is true

Surviving the Titanic was conditional on class

The probability of survival depends on passenger class
These probabilities are calculated by conditioning on class membership

\[ \operatorname{Pr}[\text{survive} \mid \text{adult in first class}] = \frac{197}{319} = 0.62 \]

\[ \operatorname{Pr}[\text{survive} \mid \text{adult not in first class}] = \frac{457}{1773} = 0.26 \]

The Law of Total Probability

The total probability of an event can be calculated by summing over all possible conditions
Each term is a conditional probability, weighted by how common that condition is

\[ \operatorname{Pr}[X] = \sum_i \operatorname{Pr}[X \mid Y_i] \times \operatorname{Pr}[Y_i] \]

The total probability of surviving the Titanic

Applying this to survival on the Titanic:

\[ \operatorname{Pr}[\text{survive}] = \sum_i \operatorname{Pr}[\text{survive} \mid \text{class}_i] \times \operatorname{Pr}[\text{class}_i] \]

\[ \begin{aligned} \operatorname{Pr}[\text{survive}] = &\operatorname{Pr}[\text{survive} \mid \text{1st class}] \times \operatorname{Pr}[\text{1st class}] +\\ &\operatorname{Pr}[\text{survive} \mid \text{not 1st class}] \times \operatorname{Pr}[\text{not 1st class}] \end{aligned} \]

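\[ \operatorname{Pr}[\text{survive}] = 0.62 \times 0.152 + 0.26 \times 0.848 \approx 0.094 + 0.220 = 0.314 \]

Because the inputs are rounded, this differs slightly from the exact value \(654/2092 \approx 0.3126\)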
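The weighted sum can also be computed from the raw counts, which avoids rounding the intermediate probabilities. This sketch uses exact fractions; the dictionary layout is just one convenient way to organize the counts from the slides.

```python
from fractions import Fraction

adults = 2092
counts = {
    "1st class": {"total": 319, "survived": 197},
    "not 1st class": {"total": 1773, "survived": 457},
}

# Law of total probability: sum of Pr[survive | class] * Pr[class]
p_survive = Fraction(0)
for passenger_class, c in counts.items():
    p_class = Fraction(c["total"], adults)
    p_survive_given_class = Fraction(c["survived"], c["total"])
    p_survive += p_survive_given_class * p_class
    print(f"Pr[survive | {passenger_class}] = {float(p_survive_given_class):.2f}")

print(f"Pr[survive] = {float(p_survive):.4f}")
```

The result equals 654/2092, the overall proportion of adults who survived, as it must.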

Probability trees for conditional probabilities

Apply the probability tree to the Titanic survival example
Multiply along each path to get path probabilities
Add the survival paths to get the overall probability of survival

\[ \operatorname{Pr}[\text{Survive}] = 0.094 + 0.220 = 0.314 \]

A probability tree diagram for adult Titanic passengers, split first by class and then by survival outcome. Each complete path shows the combined probability, and the two survival paths are summed to obtain the total probability of surviving. — Figure 13: Probability tree showing survival on the Titanic by passenger class, with branch probabilities for first class versus not first class and conditional survival outcomes.

Summary: the addition principle

Use the addition principle when calculating the probability of A or B
Add the probabilities of alternative events, where either one may occur
Subtract any overlap to avoid double counting

\[ \operatorname{Pr}[A \text{ or } B] = \operatorname{Pr}[A] + \operatorname{Pr}[B] - \operatorname{Pr}[A \text{ and } B] \]

If events are mutually exclusive, the overlap term is \(0\)

Summary: the multiplication principle

Use the multiplication principle when calculating the probability of A and B
Multiply the probabilities of events that must both occur
Conditional probability is required when events are dependent

\[ \operatorname{Pr}[A \text{ and } B] = \operatorname{Pr}[A] \times \operatorname{Pr}[B \mid A] \]

If events are independent, this simplifies to
\(\operatorname{Pr}[A \text{ and } B] = \operatorname{Pr}[A] \times \operatorname{Pr}[B]\)

Bayes’ theorem

Bayes’ theorem allows us to reverse a conditional probability
It tells us how to find the probability of A given B using:
- The probability of B given A
- How common A is
- The overall probability of B

\[ \operatorname{Pr}[A \mid B] = \frac{\operatorname{Pr}[B \mid A] \times \operatorname{Pr}[A]} {\operatorname{Pr}[B]} \]

Bayes’ theorem is not a new rule, but a rearrangement of ideas you already know

Applying Bayes’ theorem: the Titanic

Find the probability that an adult survivor was in first class
Of the \(2092\) adults on the Titanic:
- \(319\) were in first class
- \(197\) of the \(319\) first-class adults survived
- \(457\) of the other adults survived

\[ \operatorname{Pr}[\text{1st class} \mid \text{survive}] = \frac{ \operatorname{Pr}[\text{survive} \mid \text{1st class}] \times \operatorname{Pr}[\text{1st class}] }{ \operatorname{Pr}[\text{survive}] } \]

\[ \operatorname{Pr}[\text{1st class} \mid \text{survive}] = \frac{ \left(\frac{197}{319}\right) \times \left(\frac{319}{2092}\right) }{ \left(\frac{197 + 457}{2092}\right) } = 0.301 \]
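The calculation above translates directly into code. The sketch below uses exact fractions built from the counts in the slides; the variable names are illustrative.

```python
from fractions import Fraction

adults = 2092
p_first = Fraction(319, adults)              # Pr[1st class]
p_survive_given_first = Fraction(197, 319)   # Pr[survive | 1st class]
p_survive = Fraction(197 + 457, adults)      # Pr[survive]

# Bayes' theorem: reverse the conditional probability
p_first_given_survive = p_survive_given_first * p_first / p_survive

print(p_first_given_survive, round(float(p_first_given_survive), 3))  # 197/654 0.301
```

The result, 197/654, is simply the share of adult survivors who were in first class, which confirms that the theorem rearranges counts already in hand.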

When Bayes’ theorem applies

Bayes’ theorem is used to reverse a conditional probability
It applies when:
- You know \(\operatorname{Pr}[B \mid A]\), and
- You want \(\operatorname{Pr}[A \mid B]\)
The condition you observe is not the condition you care about

Use Bayes’ theorem when the probability you want is reversed from the probability you know.

Bayes’ theorem: examples and takeaway

Examples where Bayes applies

Medical testing:
Known \(\operatorname{Pr}[\text{positive} \mid \text{disease}]\) →
Want \(\operatorname{Pr}[\text{disease} \mid \text{positive}]\)
Titanic:
Known \(\operatorname{Pr}[\text{survive} \mid \text{class}]\) →
Want \(\operatorname{Pr}[\text{class} \mid \text{survive}]\)
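The medical testing case can be sketched the same way. All three input values below are HYPOTHETICAL numbers invented purely to illustrate the arithmetic; they do not describe any real test or disease.

```python
from fractions import Fraction

# HYPOTHETICAL values, invented only for illustration
p_disease = Fraction(1, 100)            # Pr[disease], the prior probability
p_pos_given_disease = Fraction(9, 10)   # Pr[positive | disease]
p_pos_given_healthy = Fraction(5, 100)  # Pr[positive | no disease]

# Law of total probability gives the overall Pr[positive]
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem reverses the conditional probability
p_disease_given_pos = p_pos_given_disease * p_disease / p_positive

print(p_disease_given_pos, round(float(p_disease_given_pos), 3))  # 2/13 0.154
```

With these made-up inputs, most positive results come from healthy individuals because the disease is rare, so the probability of disease given a positive test is far lower than the probability of a positive test given disease.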

Example where Bayes is not needed

If you already know \(\operatorname{Pr}[A \mid B]\) and that is the probability you want, no reversal is needed

What to know:

Bayes’ theorem helps compute the probability of a cause given an observed outcome
It combines conditional probability, prior probability, and total probability
You should be able to recognize when Bayes’ theorem applies and interpret a worked example

Lecture 7 Probability
