Conditional Probability and Bayes' Theorem
Conditional probability is the mathematics of updating. It asks how the probability of an event changes after learning that another event occurred. In statistics, this is the bridge between prior information and evidence; in everyday reasoning, it is where many probability mistakes happen because the direction of conditioning matters.
Figure: Probability trees make the conditioning structure in Bayes' theorem explicit. Image: Wikimedia Commons, Gnathan87, CC0 1.0.
The probability chapter in Lane et al. emphasizes conditional probability through cards, disease testing, base rates, and Bayes' theorem. This page develops the same ideas formally and connects them to independence, tree diagrams, and diagnostic reasoning.
Definitions
For events $A$ and $B$ with $P(B) > 0$, the conditional probability of $A$ given $B$ is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

The vertical bar is read as "given." The condition $B$ becomes the new reference universe. The numerator keeps only the outcomes where both $A$ and $B$ occur; the denominator normalizes by the probability of being inside $B$.
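The multiplication rule follows by rearranging:

$$P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A),$$

where the second form requires $P(A) > 0$.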
Events $A$ and $B$ are independent if learning that one occurred does not change the probability of the other:

$$P(A \mid B) = P(A)$$

whenever $P(B) > 0$. Equivalently,

$$P(A \cap B) = P(A)\,P(B).$$
A collection of events $A_1, A_2, \ldots$ is mutually independent if the probability of every finite intersection factors into the product of the individual probabilities. Pairwise independence alone is weaker and does not guarantee mutual independence.
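The gap between pairwise and mutual independence can be seen in a small enumeration. The sketch below uses an illustrative setup that is not from the text: two fair coin flips, with hypothetical events `A` (first flip heads), `B` (second flip heads), and `C` (the flips agree).

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin flips, four equally likely outcomes.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    favorable = sum(1 for w in outcomes if event(w))
    return Fraction(favorable, len(outcomes))

def both(*events):
    """Intersection of several events."""
    return lambda w: all(e(w) for e in events)

A = lambda w: w[0] == "H"     # first flip is heads
B = lambda w: w[1] == "H"     # second flip is heads
C = lambda w: w[0] == w[1]    # the two flips agree

# Every pair factors ...
pairwise_ok = all(
    prob(both(X, Y)) == prob(X) * prob(Y)
    for X, Y in [(A, B), (A, C), (B, C)]
)
# ... but the triple intersection does not.
triple = prob(both(A, B, C))
product_of_three = prob(A) * prob(B) * prob(C)

print("pairwise independent:", pairwise_ok)
print("P(A and B and C) =", triple, "but product =", product_of_three)
```

Here each pair of events factors, yet the triple intersection has probability 1/4 while the product of the three probabilities is 1/8, so the collection is not mutually independent.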
A set of events $A_1, \dots, A_n$ is a partition of the sample space $\Omega$ if the events are disjoint and their union is $\Omega$. Partitions often represent competing hypotheses.
Key results
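Law of total probability. If $A_1, \dots, A_n$ partition $\Omega$ and $P(A_i) > 0$ for every $i$, then for any event $B$

$$P(B) = \sum_{i=1}^{n} P(B \mid A_i)\,P(A_i).$$

This says that the probability of evidence $B$ can be computed by splitting the world into cases $A_1, \dots, A_n$.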
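As a quick check of the law of total probability, the following sketch enumerates two fair dice. The setup is an assumption chosen for illustration: the partition splits on the value of the first die, and the evidence `B` is "the sum is 8".

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair six-sided dice, 36 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) by counting equally likely outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    p_given = prob(given)
    if p_given == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(lambda w: event(w) and given(w)) / p_given

def B(w):
    """Evidence: the two dice sum to 8."""
    return w[0] + w[1] == 8

# Partition by the value of the first die: A_1, ..., A_6.
partition = [lambda w, i=i: w[0] == i for i in range(1, 7)]

direct = prob(B)
by_cases = sum(cond_prob(B, A_i) * prob(A_i) for A_i in partition)

print("direct:  ", direct)
print("by cases:", by_cases)
```

Both routes give 5/36. The case where the first die shows 1 contributes nothing, because a sum of 8 is then impossible.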
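Bayes' theorem. For a partition $A_1, \dots, A_n$ with each $P(A_i) > 0$ and an event $B$ with $P(B) > 0$,

$$P(A_i \mid B) = \frac{P(B \mid A_i)\,P(A_i)}{\sum_{j=1}^{n} P(B \mid A_j)\,P(A_j)}.$$

For two events $A$ and $B$ with $P(A) > 0$ and $P(B) > 0$,

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}.$$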
The terms have standard interpretations:
| Term | Bayesian name | Diagnostic-test name |
|---|---|---|
| $P(A)$ | prior probability | base rate |
| $P(B \mid A)$ | likelihood | sensitivity if $B$ is a positive test |
| $P(B)$ | evidence probability | positive-test rate |
| $P(A \mid B)$ | posterior probability | positive predictive value |
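The partition form of Bayes' theorem can be sketched as a short function. The function name `posterior_over_partition` and the three-machine numbers are hypothetical, chosen only to illustrate the formula with more than two hypotheses.

```python
def posterior_over_partition(priors, likelihoods):
    """Return (posteriors, evidence) for hypotheses A_1..A_n and evidence B.

    priors[i]      = P(A_i), assumed to sum to 1
    likelihoods[i] = P(B | A_i)
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1")
    # Joint probabilities P(B and A_i) = P(B | A_i) P(A_i).
    joints = [p * l for p, l in zip(priors, likelihoods)]
    # Law of total probability gives the denominator P(B).
    evidence = sum(joints)
    if evidence == 0:
        raise ValueError("evidence has probability zero under every hypothesis")
    return [j / evidence for j in joints], evidence

# Hypothetical numbers: three machines make 50%, 30%, and 20% of all parts,
# with defect rates 1%, 2%, and 5%. B = "a randomly chosen part is defective".
priors = [0.5, 0.3, 0.2]
likelihoods = [0.01, 0.02, 0.05]
posteriors, evidence = posterior_over_partition(priors, likelihoods)

print(f"P(defective) = {evidence:.4f}")
for i, post in enumerate(posteriors, start=1):
    print(f"P(machine {i} | defective) = {post:.4f}")
```

In this made-up example the machine with the smallest share of production is the most likely source of a defective part, because its likelihood term is large enough to outweigh its low prior.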
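Independence and complements. If $A$ and $B$ are independent, then $A$ and $B^c$ are independent, $A^c$ and $B$ are independent, and $A^c$ and $B^c$ are independent. For example,

$$P(A \cap B^c) = P(A) - P(A \cap B) = P(A) - P(A)\,P(B) = P(A)\,\bigl(1 - P(B)\bigr) = P(A)\,P(B^c).$$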
Conditional independence is different. Events may be independent unconditionally but dependent given a third event, or dependent unconditionally but independent given a third event. This is one reason causal reasoning requires care.
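A minimal sketch of the first case (independent unconditionally, dependent given a third event), using an assumed setup of two fair coin flips and the hypothetical conditioning event `C` = "at least one flip is heads":

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))   # two fair coin flips

def prob(event, given=lambda w: True):
    """P(event | given) by counting equally likely outcomes.

    Assumes the conditioning event contains at least one outcome.
    """
    pool = [w for w in outcomes if given(w)]
    return Fraction(sum(1 for w in pool if event(w)), len(pool))

A = lambda w: w[0] == "H"            # first flip is heads
B = lambda w: w[1] == "H"            # second flip is heads
A_and_B = lambda w: A(w) and B(w)
C = lambda w: A(w) or B(w)           # at least one heads

# Unconditionally the joint probability factors.
unconditional = (prob(A_and_B), prob(A) * prob(B))
# Given C it no longer does.
conditional = (prob(A_and_B, C), prob(A, C) * prob(B, C))

print("unconditional: joint =", unconditional[0], " product =", unconditional[1])
print("given C:       joint =", conditional[0], " product =", conditional[1])
```

Unconditionally both values equal 1/4. Given `C`, the joint probability is 1/3 while the product is 4/9, so conditioning on a common consequence of `A` and `B` has made them dependent.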
Bayes' theorem can also be written in odds form. If $H$ is a hypothesis and $E$ is evidence, then

$$\frac{P(H \mid E)}{P(H^c \mid E)} = \frac{P(H)}{P(H^c)} \times \frac{P(E \mid H)}{P(E \mid H^c)}.$$

The first factor is the prior odds and the second factor is the likelihood ratio. This form is useful because it shows exactly how evidence changes belief: evidence multiplies prior odds by a factor. A likelihood ratio greater than $1$ supports $H$ over $H^c$; a likelihood ratio less than $1$ supports $H^c$ over $H$.
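The odds form can be sketched in a few lines. The helper names below are hypothetical, and the inputs reuse the values from this page's code listing (prior 0.02, sensitivity 0.99, false positive rate 0.09). The two-test line further assumes the tests are conditionally independent given disease status, which the text does not state.

```python
from fractions import Fraction

def prob_to_odds(p):
    """Convert a probability p (not equal to 1) to odds p : (1 - p)."""
    return p / (1 - p)

def odds_to_prob(odds):
    """Convert odds back to a probability."""
    return odds / (1 + odds)

prior = Fraction(2, 100)
# Likelihood ratio of a positive result: P(E | H) / P(E | not H).
likelihood_ratio = Fraction(99, 100) / Fraction(9, 100)

prior_odds = prob_to_odds(prior)
posterior_odds = prior_odds * likelihood_ratio
posterior = odds_to_prob(posterior_odds)

# Assumption: a second positive result that is conditionally independent
# of the first multiplies the odds by the same likelihood ratio again.
posterior_two_tests = odds_to_prob(posterior_odds * likelihood_ratio)

print("likelihood ratio:", likelihood_ratio)
print("posterior after one positive:", posterior, "=", float(posterior))
print("posterior after two positives:", posterior_two_tests)
```

Working in odds keeps the update to a single multiplication per piece of evidence, which is why the likelihood ratio is a convenient summary of a test.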
Conditional probability also depends on the information protocol. In a medical test, a positive result is generated by a known test procedure. In a card game, seeing another player's card depends on the rules of dealing and revealing. In a search problem, the fact that a match was found may depend on how many candidates were searched. These details change the conditioning event. Before applying a formula, state the event after the vertical bar in a way that includes how the information was obtained.
Another reliable habit is to draw a probability tree before writing Bayes' theorem. The first split usually represents the hidden condition or hypothesis $H$, and the second split represents the observed evidence $E$. Multiplying along branches gives joint probabilities such as $P(H \cap E)$ and $P(H^c \cap E)$. Adding the branches that end in the same observation gives the denominator. This tree method is algebraically the same as Bayes' theorem, but it makes the base rate visible and reduces the chance of reversing the conditional probabilities.
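The tree method can be mirrored directly in code. This is a sketch under assumed inputs: the nested dictionary `tree` is a hypothetical representation, and its branch probabilities reuse the values from this page's code listing.

```python
# First split: hidden condition. Second split: observation.
tree = {
    "H":     {"prior": 0.02, "obs": {"E": 0.99, "not E": 0.01}},
    "not H": {"prior": 0.98, "obs": {"E": 0.09, "not E": 0.91}},
}

# Multiply along each branch to get the joint probabilities.
joint = {
    (h, e): node["prior"] * p_e
    for h, node in tree.items()
    for e, p_e in node["obs"].items()
}

# Add the branches that end in the same observation (the Bayes denominator).
p_obs = {}
for (h, e), p in joint.items():
    p_obs[e] = p_obs.get(e, 0.0) + p

# Posterior: one branch divided by the total for its observation.
posterior_H_given_E = joint[("H", "E")] / p_obs["E"]

for branch, p in joint.items():
    print(branch, f"{p:.4f}")
print(f"P(E) = {p_obs['E']:.4f}")
print(f"P(H | E) = {posterior_H_given_E:.4f}")
```

Printing all four leaves makes the base rate hard to miss: the `("not H", "E")` branch is larger than the `("H", "E")` branch even though the test is far more likely to be positive under `H`.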
Visual
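Here $D$ is the event "has disease" and $T$ is the event "test positive."

| Quantity | Symbol | In a medical test |
|---|---|---|
| Sensitivity | $P(T \mid D)$ | positive if diseased |
| Specificity | $P(T^c \mid D^c)$ | negative if not diseased |
| False positive rate | $P(T \mid D^c)$ | positive if not diseased |
| False negative rate | $P(T^c \mid D)$ | negative if diseased |
| Positive predictive value | $P(D \mid T)$ | diseased if positive |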
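Each quantity in the table is a ratio of counts from a two-by-two table of test results. The sketch below assumes hypothetical counts for 10,000 people. The function name `diagnostic_rates` and its argument names are illustrative, not from the text.

```python
def diagnostic_rates(tp, fp, fn, tn):
    """Compute the table's quantities from confusion-matrix counts.

    tp: diseased and positive      fp: not diseased and positive
    fn: diseased and negative      tn: not diseased and negative
    """
    diseased = tp + fn
    healthy = fp + tn
    positives = tp + fp
    return {
        "sensitivity": tp / diseased,           # positive if diseased
        "specificity": tn / healthy,            # negative if not diseased
        "false_positive_rate": fp / healthy,    # positive if not diseased
        "false_negative_rate": fn / diseased,   # negative if diseased
        "ppv": tp / positives,                  # diseased if positive
    }

# Hypothetical counts: 200 diseased and 9,800 not diseased.
rates = diagnostic_rates(tp=198, fp=882, fn=2, tn=8918)
for name, value in rates.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity divide by the column of true status, while the positive predictive value divides by the row of positive results, which is why it depends on prevalence and the other two do not.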
Worked example 1: two cards without replacement
Problem. Two cards are drawn from a standard $52$-card deck without replacement. Find the probability that both cards are aces. Then find the probability that the second card is an ace given that the first card is an ace.
Method.
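- Let $A$ be "first card is an ace" and $B$ be "second card is an ace."
- The first draw has $4$ aces among $52$ cards:

  $$P(A) = \frac{4}{52} = \frac{1}{13}.$$
- If the first card is an ace, then $3$ aces remain among $51$ cards:

  $$P(B \mid A) = \frac{3}{51} = \frac{1}{17}.$$
- Apply the multiplication rule:

  $$P(A \cap B) = P(A)\,P(B \mid A) = \frac{1}{13} \cdot \frac{1}{17} = \frac{1}{221}.$$
- Check against counting. The number of unordered two-card hands is $\binom{52}{2} = 1326$. The number with two aces is $\binom{4}{2} = 6$. Thus

  $$P(A \cap B) = \frac{6}{1326} = \frac{1}{221}.$$

Checked answer. $P(A \cap B) = 1/221 \approx 0.0045$, and $P(B \mid A) = 1/17 \approx 0.0588$. The events are not independent because $P(B \mid A) = 1/17 \neq 1/13 = P(B)$.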
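The two-card answer can also be confirmed by listing every ordered draw. This is a brute-force sketch, and the card encoding as `(rank, suit)` tuples is an arbitrary choice made for the example.

```python
from fractions import Fraction
from itertools import permutations

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
deck = [(rank, suit) for rank in ranks for suit in "SHDC"]

# All ordered two-card draws without replacement: 52 * 51 of them.
draws = list(permutations(deck, 2))

first_ace = [d for d in draws if d[0][0] == "A"]
both_aces = [d for d in first_ace if d[1][0] == "A"]
second_ace = [d for d in draws if d[1][0] == "A"]

p_both = Fraction(len(both_aces), len(draws))
p_second_given_first = Fraction(len(both_aces), len(first_ace))
p_second = Fraction(len(second_ace), len(draws))

print("P(both aces) =", p_both)
print("P(second ace | first ace) =", p_second_given_first)
print("P(second ace) =", p_second)
```

The enumeration also gives the unconditional probability that the second card is an ace, 1/13, which is the value compared against 1/17 when checking independence.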
Worked example 2: base rates and a positive test
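Problem. A disease affects $2\%$ of a population. A test has sensitivity $99\%$ and false positive rate $9\%$. If a person tests positive, what is the probability that the person has the disease?
Method.
- Let $D$ be the event "has disease" and $T$ be the event "test positive."
- Translate the problem:

  $$P(D) = 0.02, \qquad P(T \mid D) = 0.99, \qquad P(T \mid D^c) = 0.09.$$
- Compute the total probability of a positive test:

  $$P(T) = P(T \mid D)\,P(D) + P(T \mid D^c)\,P(D^c) = (0.99)(0.02) + (0.09)(0.98) = 0.0198 + 0.0882 = 0.1080.$$
- Apply Bayes' theorem:

  $$P(D \mid T) = \frac{P(T \mid D)\,P(D)}{P(T)} = \frac{0.0198}{0.1080} \approx 0.1833.$$
- Check with natural frequencies. In $10{,}000$ people, about $200$ have the disease and $9{,}800$ do not. True positives are $0.99 \times 200 = 198$. False positives are $0.09 \times 9{,}800 = 882$. Among positives, the diseased count is $198$ out of $198 + 882 = 1{,}080$, so

  $$P(D \mid T) \approx \frac{198}{1080} \approx 0.1833.$$

Checked answer. The probability is about $18.3\%$, not $99\%$. The low base rate creates many false positives.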
Code
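```python
def bayes_binary(prior, sensitivity, false_positive_rate):
    """Return (posterior, p_pos) for a binary hypothesis and a positive test.

    prior               = P(disease)
    sensitivity         = P(positive | disease)
    false_positive_rate = P(positive | no disease)
    """
    # Law of total probability: P(positive).
    p_pos = sensitivity * prior + false_positive_rate * (1 - prior)
    # Bayes' theorem: P(disease | positive).
    posterior = sensitivity * prior / p_pos
    return posterior, p_pos


posterior, positive_rate = bayes_binary(
    prior=0.02,
    sensitivity=0.99,
    false_positive_rate=0.09,
)
print(f"P(positive) = {positive_rate:.4f}")
print(f"P(disease | positive) = {posterior:.4f}")

# Compare with a less rare condition.
for prior in [0.02, 0.10, 0.50]:
    post, _ = bayes_binary(prior, 0.99, 0.09)
    print(f"prior={prior:.2f}, posterior={post:.3f}")
```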
Common pitfalls
- Reversing $P(A \mid B)$ and $P(B \mid A)$. A test can be very likely positive among diseased people while a positive-testing person is not very likely diseased.
- Ignoring the denominator in Bayes' theorem. The denominator includes all ways the evidence could occur.
- Treating "independent" as meaning "disjoint." Disjoint nonempty events are usually dependent: if one occurs, the other cannot.
- Assuming pairwise independence implies mutual independence. Three events can be pairwise independent while the triple intersection does not factor.
- Conditioning on a collider or selected subgroup without noticing that the conditioning event can create dependence.
- Saying "the test is accurate" without specifying sensitivity, specificity, and prevalence.