Common Discrete Distributions
Discrete distributions model counts, categories, and waiting times measured in whole trials. They appear whenever the outcome is a finite choice, a number of successes, a number of failures before success, or a count of rare events in a fixed region. Many examples in introductory statistics, including those in Lane et al.'s probability chapter, use binomial, Poisson, multinomial, and hypergeometric models.
The main skill is not memorizing formulas in isolation. It is matching the story to the assumptions: fixed number of independent trials, sampling with or without replacement, waiting until a success, or counting events that occur at an average rate. A wrong distribution can produce a polished but wrong answer.
Figure: Binomial probability mass function. Image: Wikimedia Commons, Tayste, public domain.
Definitions
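A Bernoulli random variable records one success/failure trial with success probability $p$:
$$P(X = 1) = p, \qquad P(X = 0) = 1 - p.$$
A Binomial random variable counts successes in $n$ independent Bernoulli trials with common success probability $p$:
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}, \qquad k = 0, 1, \dots, n.$$
A Geometric random variable counts the trial number of the first success:
$$P(X = k) = (1 - p)^{k - 1} p, \qquad k = 1, 2, \dots$$
Some books define the geometric variable as the number of failures before the first success, with support $\{0, 1, 2, \dots\}$. Always check the convention.
A Negative binomial random variable counts the trial number of the $r$-th success:
$$P(X = k) = \binom{k - 1}{r - 1} p^r (1 - p)^{k - r}, \qquad k = r, r + 1, \dots$$
A Poisson random variable counts events occurring in a fixed interval when events happen independently at average rate $\lambda$:
$$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \qquad k = 0, 1, 2, \dots$$
A Hypergeometric random variable counts successes in $n$ draws without replacement from a population of size $N$ containing $K$ successes:
$$P(X = k) = \frac{\binom{K}{k} \binom{N - K}{n - k}}{\binom{N}{n}}.$$
A Multinomial random vector counts outcomes in $m$ categories across $n$ independent trials:
$$P(X_1 = x_1, \dots, X_m = x_m) = \frac{n!}{x_1! \cdots x_m!} \, p_1^{x_1} \cdots p_m^{x_m},$$
where $x_1 + \cdots + x_m = n$ and $p_1 + \cdots + p_m = 1$.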
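The definitions above translate almost line for line into code. The following is a minimal sketch using only the standard library; the function names and the parameter names (`n`, `p`, `r`, `lam`, `N`, `K`) are illustrative choices, and the geometric and negative binomial functions assume the trial-number convention described above.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(X = k): k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def geometric_pmf(k, p):
    """P(X = k): first success on trial k (k = 1, 2, ...)."""
    return (1 - p) ** (k - 1) * p


def negbin_pmf(k, r, p):
    """P(X = k): r-th success on trial k (k = r, r + 1, ...)."""
    return comb(k - 1, r - 1) * p**r * (1 - p) ** (k - r)


def poisson_pmf(k, lam):
    """P(X = k): k events when the mean count in the interval is lam."""
    return exp(-lam) * lam**k / factorial(k)


def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in n draws without replacement (K successes among N items)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# Sanity checks: finite supports sum to 1, and r = 1 recovers the geometric case.
binomial_total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
hypergeom_total = sum(hypergeom_pmf(k, 30, 6, 5) for k in range(6))
print("binomial total:", binomial_total)
print("hypergeometric total:", hypergeom_total)
print("negbin with r=1 vs geometric:", negbin_pmf(4, 1, 0.25), geometric_pmf(4, 0.25))
```

Summing a probability mass function over its support is a cheap way to catch a wrong exponent or an off-by-one in the support.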
Key results
| Distribution | Support | Mean | Variance | Typical use |
|---|---|---|---|---|
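| Bernoulli | $\{0, 1\}$ | $p$ | $p(1-p)$ | one yes/no trial |
| Binomial | $\{0, 1, \dots, n\}$ | $np$ | $np(1-p)$ | successes in $n$ fixed independent trials |
| Geometric | $\{1, 2, \dots\}$ | $1/p$ | $(1-p)/p^2$ | waiting time to first success |
| NegBin | $\{r, r+1, \dots\}$ | $r/p$ | $r(1-p)/p^2$ | waiting time to $r$ successes |
| Poisson | $\{0, 1, 2, \dots\}$ | $\lambda$ | $\lambda$ | counts at a rate |
| Hypergeometric | valid counts, $\max(0, n-(N-K)) \le k \le \min(n, K)$ | $nK/N$ | $n \frac{K}{N}\left(1-\frac{K}{N}\right)\frac{N-n}{N-1}$ | without-replacement sampling |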
Binomial as a sum. If $X_1, \dots, X_n$ are independent Bernoulli$(p)$ variables, then
$$X_1 + X_2 + \cdots + X_n \sim \text{Binomial}(n, p).$$
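The sum-of-Bernoullis claim can be checked exactly, without simulation, by convolving the Bernoulli mass function with itself. This is a sketch under illustrative values ($n = 6$, $p = 0.3$); the helper `convolve` is a hypothetical name introduced here, not part of any library.

```python
from math import comb


def convolve(a, b):
    """PMF of the sum of two independent variables whose PMFs are lists indexed by value."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


n, p = 6, 0.3              # illustrative values
bernoulli = [1 - p, p]     # P(X = 0), P(X = 1)

sum_pmf = [1.0]            # PMF of an empty sum: the value 0 with probability 1
for _ in range(n):
    sum_pmf = convolve(sum_pmf, bernoulli)

binomial = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
max_gap = max(abs(s - b) for s, b in zip(sum_pmf, binomial))
print("largest difference between the two PMFs:", max_gap)
```

The two lists agree up to floating-point rounding, which is the numerical counterpart of the identity.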
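Poisson approximation to binomial. If $n$ is large, $p$ is small, and $\lambda = np$ is moderate, then for $X \sim \text{Binomial}(n, p)$,
$$P(X = k) \approx \frac{e^{-\lambda} \lambda^k}{k!}.$$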
This is useful for rare-event counts.
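To see how the approximation tightens, one can hold the mean fixed and let the number of trials grow. The sketch below assumes a fixed mean of 2 and compares the two mass functions on small counts only; the function names and the chosen trial counts are illustrative.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)


lam = 2.0  # fixed mean, so p = lam / n shrinks as n grows
gaps = {}
for n in (20, 200, 2000):
    p = lam / n
    # Largest pointwise difference over the counts that carry most of the mass.
    gaps[n] = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))
    print(f"n={n:5d}  p={p:.4f}  max |binomial - poisson| = {gaps[n]:.6f}")
```

The largest gap shrinks roughly in proportion to $p$, which matches the rule of thumb that the approximation suits rare events.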
Hypergeometric versus binomial. The binomial assumes independent trials, which fits sampling with replacement or a very large population. The hypergeometric accounts for dependence created by sampling without replacement.
Memorylessness of the geometric distribution. For integers $m, n \ge 0$,
$$P(X > m + n \mid X > m) = P(X > n).$$
After $m$ failures, the remaining waiting-time distribution is unchanged because each further independent trial has the same success probability.
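Memorylessness is easy to confirm numerically. The sketch below assumes the trial-number convention, under which the waiting time exceeds $n$ exactly when the first $n$ trials all fail; the values of `p`, `m`, and `n` are arbitrary illustrations.

```python
def geometric_pmf(k, p):
    """P(first success on trial k), k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p


def geometric_tail(n, p):
    """P(X > n): the first n trials are all failures."""
    return (1 - p) ** n


p, m, n = 0.25, 3, 4  # illustrative values

# Conditional probability of waiting more than n further trials after m failures.
conditional = geometric_tail(m + n, p) / geometric_tail(m, p)
unconditional = geometric_tail(n, p)
print("P(X > m+n | X > m):", conditional)
print("P(X > n):          ", unconditional)

# Cross-check the closed-form tail against a (truncated) sum of the PMF.
tail_by_sum = sum(geometric_pmf(k, p) for k in range(n + 1, 2000))
print("tail by summation: ", tail_by_sum)
```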
The assumptions behind each distribution are part of the model. A binomial distribution needs a fixed number of trials, two outcome classes, constant success probability, and independence. A Poisson distribution needs a rate interpretation and is most natural when events in disjoint intervals are approximately independent. A hypergeometric distribution deliberately violates independence because the population changes after each draw. Checking these assumptions is usually more important than recognizing the formula.
There are also parameterization traps. Negative binomial distributions may count trials until the $r$-th success or failures before the $r$-th success. Geometric distributions have the same convention issue. Software libraries differ, so read the documentation and verify the support by calculating a simple probability such as the probability of success on the first trial.
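The first-trial check suggested above can be written as a small probe. This is a sketch: both mass functions and the helper `detect_geometric_convention` are hypothetical names defined here for illustration, and the probe assumes the function under test returns 0 outside its support.

```python
from math import isclose


def geom_trials_pmf(k, p):
    """Trial-number convention: support 1, 2, ..."""
    return (1 - p) ** (k - 1) * p if k >= 1 else 0.0


def geom_failures_pmf(k, p):
    """Failures-before-success convention: support 0, 1, 2, ..."""
    return (1 - p) ** k * p if k >= 0 else 0.0


def detect_geometric_convention(pmf, p=0.25):
    """Guess the convention from where 'success on the first trial' sits."""
    if pmf(0, p) == 0.0 and isclose(pmf(1, p), p):
        return "trials"
    if isclose(pmf(0, p), p):
        return "failures"
    return "unknown"


print(detect_geometric_convention(geom_trials_pmf))
print(detect_geometric_convention(geom_failures_pmf))

# The two conventions describe the same event, shifted by one.
shift_ok = all(
    isclose(geom_trials_pmf(k, 0.25), geom_failures_pmf(k - 1, 0.25))
    for k in range(1, 20)
)
print("trial k equals k - 1 failures:", shift_ok)
```

The same probe could be pointed at a library's mass function, provided it accepts the count and the success probability in that order.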
Approximation is another modeling decision. A hypergeometric distribution can be approximated by a binomial distribution when the sample is small compared with the population, because removing a few items barely changes the success probability. A binomial distribution can be approximated by a Poisson distribution when $n$ is large, successes are rare, and $np$ is moderate. These approximations are useful, but the exact distribution should remain clear.
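The effect of population size on the hypergeometric-to-binomial approximation can be seen by keeping the success fraction fixed and enlarging the population. The sketch below assumes a 20% success fraction, 5 draws, and 2 observed successes; the population sizes are illustrative.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """k successes in n draws without replacement from N items, K of them successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


draws, observed, fraction = 5, 2, 0.2
binomial_value = binomial_pmf(observed, draws, fraction)

gaps = []
for N in (30, 300, 3000):
    K = round(fraction * N)  # number of successes in the population
    exact = hypergeom_pmf(observed, N, K, draws)
    gaps.append(abs(exact - binomial_value))
    print(f"N={N:5d}  hypergeometric={exact:.5f}  binomial={binomial_value:.5f}")
```

With only 30 items the two answers differ visibly; once the sample is a tiny fraction of the population the difference becomes negligible.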
Visual
| Model cue | Distribution to try | Red flag |
|---|---|---|
| "exactly $k$ successes in $n$ trials" | Binomial | probabilities change across trials |
| "draw from a finite population" | Hypergeometric | replacement is actually used |
| "first success occurs on trial $k$" | Geometric | trials are not independent |
| "third success occurs on trial $k$" | Negative binomial | number of successes $r$ not specified |
| "calls per hour" or "defects per meter" | Poisson | events cluster strongly |
Worked example 1: binomial and Poisson approximation
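Problem. A manufacturing process produces a defective part with probability $p = 0.01$, independently from part to part. In a batch of $n = 200$ parts, find the probability of exactly $3$ defective parts using the binomial model, then approximate it with a Poisson distribution.
Method.
- Let $X$ be the number of defective parts. A fixed number of independent parts is inspected, each with the same defect probability. Thus
  $$X \sim \text{Binomial}(200, 0.01).$$
- The exact probability is
  $$P(X = 3) = \binom{200}{3} (0.01)^3 (0.99)^{197}.$$
- Compute the combination:
  $$\binom{200}{3} = \frac{200 \cdot 199 \cdot 198}{3 \cdot 2 \cdot 1} = 1{,}313{,}400.$$
- Substitute:
  $$P(X = 3) = 1{,}313{,}400 \times 10^{-6} \times (0.99)^{197} = 1.3134 \, (0.99)^{197}.$$
  Since $(0.99)^{197} \approx 0.1381$,
  $$P(X = 3) \approx 1.3134 \times 0.1381 \approx 0.1814.$$
- For the Poisson approximation, use $\lambda = np = 2$:
  $$P(X = 3) \approx \frac{e^{-2} \, 2^3}{3!} = \frac{4}{3} e^{-2} \approx 0.1804.$$
Checked answer. The exact binomial probability is about $0.1814$, and the Poisson approximation is about $0.1804$. The approximation is close because $p$ is small and $n$ is large.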
Worked example 2: hypergeometric sampling
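Problem. A lot contains $30$ components, of which $6$ are faulty. An inspector samples $5$ components without replacement. What is the probability that exactly $2$ sampled components are faulty?
Method.
- The sample is without replacement from a finite population, so use a hypergeometric model with $N = 30$, $K = 6$, and $n = 5$:
  $$P(X = k) = \frac{\binom{6}{k} \binom{24}{5 - k}}{\binom{30}{5}}.$$
- Count favorable samples:
  - choose $2$ faulty components from $6$;
  - choose $3$ good components from $24$.
  The favorable count is
  $$\binom{6}{2} \binom{24}{3}.$$
- Count all possible samples:
  $$\binom{30}{5}.$$
- Form the probability:
  $$P(X = 2) = \frac{\binom{6}{2} \binom{24}{3}}{\binom{30}{5}}.$$
- Compute:
  $$\binom{6}{2} = 15, \qquad \binom{24}{3} = 2024, \qquad \binom{30}{5} = 142{,}506.$$
  Therefore
  $$P(X = 2) = \frac{15 \cdot 2024}{142{,}506} = \frac{30{,}360}{142{,}506} \approx 0.2130.$$
- Check reasonableness. The expected number is
  $$E[X] = n \cdot \frac{K}{N} = 5 \cdot \frac{6}{30} = 1.$$
  Exactly $2$ faulty components is above the mean but not extreme.
Checked answer. The probability is approximately $0.2130$.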
Code
from math import comb
from scipy.stats import binom, poisson, hypergeom, nbinom, geom
# Example 1: binomial and Poisson approximation.
n, p, k = 200, 0.01, 3
exact = binom.pmf(k, n, p)
approx = poisson.pmf(k, n * p)
print("binomial exact:", exact)
print("poisson approximation:", approx)
# Example 2: hypergeometric.
N, K, draws, observed = 30, 6, 5, 2
manual = comb(K, observed) * comb(N - K, draws - observed) / comb(N, draws)
library = hypergeom.pmf(observed, N, K, draws)
print("hypergeometric manual:", manual)
print("hypergeometric scipy:", library)
# Geometric and negative binomial conventions in scipy:
# geom counts trial number of first success; nbinom counts failures before r successes.
print("P(first success on trial 4):", geom.pmf(4, 0.25))
print("P(2 failures before 3 successes):", nbinom.pmf(2, 3, 0.25))
Common pitfalls
- Using binomial for sampling without replacement from a small population. Use hypergeometric unless replacement or approximate independence is justified.
- Mixing geometric conventions. Some formulas count the trial of first success; others count failures before first success.
- Forgetting that Poisson mean and variance are both $\lambda$. If data are much more variable, a Poisson model may be too restrictive.
- Treating multinomial category counts as independent. The counts sum to $n$, so increasing one count forces others down.
- Using a probability such as $p$ as if it were a rate $\lambda$. A probability is unitless and lies in $[0, 1]$; a rate carries units, and the expected count it implies depends on interval length.
- Rounding early in factorial-heavy calculations. Use exact combinations or software for large counts.
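The multinomial pitfall in the list above can be made concrete by computing the covariance between two category counts exactly. The sketch below enumerates every outcome for an illustrative case (4 trials, three categories with probabilities 0.5, 0.3, 0.2) and compares the result with the standard formula $\operatorname{Cov}(X_i, X_j) = -n p_i p_j$; the function name is an illustrative choice.

```python
from math import factorial


def multinomial_pmf(counts, probs):
    """Probability of a specific vector of category counts."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # stays an integer at every step
    prob = float(coef)
    for c, q in zip(counts, probs):
        prob *= q**c
    return prob


n = 4
probs = (0.5, 0.3, 0.2)  # illustrative category probabilities

e1 = e2 = e12 = total = 0.0
for x1 in range(n + 1):
    for x2 in range(n + 1 - x1):
        x3 = n - x1 - x2  # the last count is forced by the other two
        w = multinomial_pmf((x1, x2, x3), probs)
        total += w
        e1 += x1 * w
        e2 += x2 * w
        e12 += x1 * x2 * w

covariance = e12 - e1 * e2
print("total probability:", total)
print("Cov(X1, X2) by enumeration:", covariance)
print("-n * p1 * p2:              ", -n * probs[0] * probs[1])
```

The covariance is negative, so the counts cannot be independent: a high count in one category leaves fewer trials for the others.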