Discrete Probability
Discrete probability assigns weights to finite or countable outcomes. It uses counting when outcomes are equally likely, and it uses functions on sample spaces when outcomes have different probabilities. The subject is essential for randomized algorithms, average-case analysis, hashing, data science, and risk reasoning.
In discrete mathematics, probability is often a counting problem with uncertainty attached. A clean sample space, precise events, and careful conditioning are more important than formula memorization. Random variables then turn outcomes into numbers, making expected values and variances possible.
Definitions
An experiment is a process with a set of possible outcomes. The sample space S is the set of all outcomes. An event E is a subset of S.
If S is finite and all outcomes are equally likely, then P(E) = |E| / |S|.
For a general finite or countable sample space S, a probability distribution assigns each outcome s a number p(s) such that 0 ≤ p(s) ≤ 1 and Σ_{s∈S} p(s) = 1.
Then P(E) = Σ_{s∈E} p(s).
Conditional probability is P(A | B) = P(A ∩ B) / P(B)
when P(B) > 0. Events A and B are independent if P(A ∩ B) = P(A)P(B), equivalently when P(A | B) = P(A).
A random variable X is a function from the sample space to the real numbers. Its expected value is E[X] = Σ_{s∈S} X(s) p(s), equivalently Σ_r r · P(X = r) over the values r that X takes.
Variance measures spread: Var(X) = E[(X − E[X])²] = E[X²] − (E[X])².
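As a quick check of these definitions, the mean and variance of a fair six-sided die can be computed directly from its distribution. This is a sketch, not part of the worked examples; exact arithmetic with `fractions` keeps the answers recognizable:

```python
from fractions import Fraction

# Distribution of a fair six-sided die: p(s) = 1/6 for each face s.
dist = {s: Fraction(1, 6) for s in range(1, 7)}

# E[X] = sum over s of s * p(s).
mean = sum(s * p for s, p in dist.items())

# Var(X) = E[X^2] - (E[X])^2.
var = sum(s * s * p for s, p in dist.items()) - mean ** 2

print(mean)   # 7/2
print(var)    # 35/12
```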
Key results
Complement rule: P(E^c) = 1 − P(E).
Addition rule: P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
If A and B are disjoint, this reduces to P(A ∪ B) = P(A) + P(B). Disjointness and independence are different: two disjoint events with positive probability cannot be independent, because P(A ∩ B) = 0 while P(A)P(B) > 0.
Bayes' theorem: P(A | B) = P(B | A) P(A) / P(B)
when the denominator is nonzero. If A_1, …, A_n partition the sample space, then P(B) = Σ_i P(B | A_i) P(A_i), so P(A_j | B) = P(B | A_j) P(A_j) / Σ_i P(B | A_i) P(A_i).
Linearity of expectation is one of the most useful tools: E[X + Y] = E[X] + E[Y]
for random variables X and Y, even when they are not independent. This makes indicator variables powerful. If I_A is 1 when event A occurs and 0 otherwise, then E[I_A] = P(A).
Visual
| Concept | Formula | Interpretation |
|---|---|---|
| equally likely event | P(E) = \|E\| / \|S\| | favorable outcomes over all outcomes |
| complement | P(E^c) = 1 − P(E) | often easier for "at least one" |
| union | P(A ∪ B) = P(A) + P(B) − P(A ∩ B) | corrects double counting |
| conditional | P(A \| B) = P(A ∩ B) / P(B) | restrict sample space to B |
| independence | P(A ∩ B) = P(A)P(B) | knowing one does not change the other |
| expectation | E[X] = Σ x · P(X = x) | long-run average value |
Worked example 1: Dice sums and conditional probability
Problem. Roll two fair six-sided dice. Find the probability that the sum is 8 given that at least one die shows a 3.
Method.
- Use ordered outcomes (a, b), so there are 36 equally likely outcomes.
- Let B be the event "at least one die shows a 3." Count |B|:
  - First die is 3: 6 outcomes.
  - Second die is 3: 6 outcomes.
  - Both dice are 3, so the outcome (3, 3) is counted once too many: |B| = 6 + 6 − 1 = 11.
- Let A be the event "sum is 8."
- The outcomes with sum 8 are (2, 6), (3, 5), (4, 4), (5, 3), (6, 2), so |A| = 5.
- Intersect with B. The outcomes with at least one 3 are (3, 5) and (5, 3), so |A ∩ B| = 2.
- Therefore P(A | B) = |A ∩ B| / |B| = 2/11.
Checked answer. The probability is 2/11. The denominator is 11, not 36, because conditioning restricts the sample space to outcomes where at least one die is 3.
Worked example 2: Expected fixed points in a random permutation
Problem. A random permutation of {1, 2, …, n} is chosen uniformly. Find the expected number of fixed points.
Method.
- Let X be the number of fixed points.
- Define indicator variables X_i = 1 if the permutation fixes i, and X_i = 0 otherwise.
- Then X = X_1 + X_2 + … + X_n.
- For a fixed i, exactly (n − 1)! of the n! permutations fix i, so P(X_i = 1) = (n − 1)! / n! = 1/n.
- Thus E[X_i] = 1/n.
- Use linearity of expectation: E[X] = E[X_1] + … + E[X_n] = n · (1/n) = 1.
Checked answer. The expected number of fixed points is 1, for every positive integer n. The indicators are not independent, but linearity of expectation does not require independence.
Code
```python
from collections import Counter
from itertools import product, permutations

# Ordered two-dice sample space: 36 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=2))
sums = Counter(a + b for a, b in outcomes)

# Worked example 1: condition on "at least one die shows a 3".
given = [o for o in outcomes if 3 in o]
sum8_given = [o for o in given if sum(o) == 8]

# Worked example 2: fixed points of a random permutation of {1, 2, 3, 4}.
def fixed_points(perm):
    return sum(1 for i, value in enumerate(perm, start=1) if i == value)

perms = list(permutations([1, 2, 3, 4]))
expected_fixed = sum(fixed_points(p) for p in perms) / len(perms)

print(len(sum8_given), len(given), len(sum8_given) / len(given))  # 2 11 0.1818...
print(expected_fixed)  # 1.0
```
The permutation computation verifies the fixed-point expectation for n = 4 by exhaustive enumeration.
Common pitfalls
- Using unordered dice outcomes when the sample space of ordered rolls is equally likely.
- Forgetting to change the denominator after conditioning.
- Treating disjoint events as independent. Nonempty disjoint events give information about each other.
- Assuming equally likely outcomes without checking the experiment.
- Multiplying probabilities for events that have not been shown independent.
- Believing expectation must be a possible outcome. An expected die roll is 3.5, although no face shows 3.5.
The first step in a probability problem is choosing the sample space. For two dice, ordered pairs are equally likely; unordered sums are not. The sum 7 occurs in six ordered ways, while the sum 2 occurs in one ordered way. If the sample space is changed to sums without assigning their unequal probabilities, the computation will be biased. Always ask whether the listed outcomes have equal probability before using P(E) = |E| / |S|.
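A short enumeration confirms the unequal ordered counts for two-dice sums:

```python
from collections import Counter
from itertools import product

# Ordered pairs (a, b) form the equally likely sample space for two dice.
outcomes = list(product(range(1, 7), repeat=2))
ways = Counter(a + b for a, b in outcomes)

print(ways[7], ways[2])   # 6 1 : six ordered ways to roll 7, one way to roll 2
```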
Conditioning should be read as replacing the universe. After conditioning on B, outcomes outside B are no longer possible, and probabilities must be renormalized by P(B). This is why P(A | B) may be much larger or smaller than P(A). A useful check is that P(A | B) = 1 whenever B ⊆ A, because within the restricted universe A is certain.
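This renormalization can be checked by enumeration. The event pair below is an assumed illustration chosen so that B forces A:

```python
from itertools import product

# Ordered two-dice outcomes.
outcomes = list(product(range(1, 7), repeat=2))

B = [o for o in outcomes if o[0] == 6]       # first die shows 6
A = [o for o in outcomes if sum(o) >= 7]     # sum is at least 7

p_a = len(A) / len(outcomes)
p_a_given_b = sum(1 for o in B if o in A) / len(B)   # renormalize by |B|

print(p_a)            # 0.5833... : 21 of 36 outcomes
print(p_a_given_b)    # 1.0, since B is a subset of A
```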
Independence is a statement about information, not about visual separation in a diagram. Events can overlap and still be independent, as with drawing a card that is a heart and drawing a card that is an ace from a standard deck. They can also be non-overlapping and therefore dependent if both have positive probability. The equation P(A ∩ B) = P(A)P(B) is the definitive test.
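Both claims can be verified exhaustively over a 52-card deck; the rank-and-suit encoding here is an illustrative choice:

```python
from fractions import Fraction
from itertools import product

ranks = list(range(1, 14))                 # 1 plays the role of the ace
suits = ["hearts", "spades", "diamonds", "clubs"]
deck = list(product(ranks, suits))         # 52 equally likely cards

def prob(event):
    return Fraction(sum(1 for c in deck if event(c)), len(deck))

heart = lambda c: c[1] == "hearts"
ace = lambda c: c[0] == 1
spade = lambda c: c[1] == "spades"

# Overlapping yet independent: P(heart and ace) = 1/52 = (13/52)(4/52).
both = prob(lambda c: heart(c) and ace(c))
print(both == prob(heart) * prob(ace))         # True

# Disjoint with positive probability, hence dependent.
disjoint = prob(lambda c: heart(c) and spade(c))
print(disjoint == prob(heart) * prob(spade))   # False
```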
Linearity of expectation is often easier than distribution counting. To find the expected number of occupied boxes after throwing n balls into m boxes, define indicators for each box being occupied. To find expected fixed points, define indicators for each position. The events may be dependent, but expectation still adds. This is one reason expectation problems in discrete mathematics often look simpler than probability distribution problems.
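The occupied-boxes calculation can be sketched and checked against brute force; the sizes n = 3 and m = 3 are assumed toy values that keep exhaustive enumeration small:

```python
from fractions import Fraction
from itertools import product

n, m = 3, 3   # n balls thrown uniformly and independently into m boxes

# Indicator for box j: P(box j occupied) = 1 - (1 - 1/m)^n,
# so by linearity E[occupied boxes] = m * (1 - (1 - 1/m)^n).
by_linearity = m * (1 - Fraction(m - 1, m) ** n)

# Exhaustive check over all m^n equally likely throw sequences.
throws = list(product(range(m), repeat=n))
by_enumeration = Fraction(sum(len(set(t)) for t in throws), len(throws))

print(by_linearity == by_enumeration)   # True
print(by_linearity)                     # 19/9
```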
Variance, unlike expectation, does care about dependence through cross terms. The shortcut Var(X + Y) = Var(X) + Var(Y) requires independence or at least zero covariance. When variance is requested, check whether the random variables interact before adding variances.
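A small demonstration of the cross-term issue using a fair die: taking Y = X (perfect dependence) breaks the shortcut, while an independent second die restores it:

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

def var(values):
    # Population variance of equally likely values, in exact arithmetic.
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / n

var_x = var(list(faces))                     # Var of one fair die: 35/12

# Y = X is perfectly dependent: Var(X + X) = Var(2X) = 4 Var(X), not 2 Var(X).
var_sum_dependent = var([2 * f for f in faces])
print(var_sum_dependent == 4 * var_x)        # True

# Two independent dice: variances do add, Var(X + Y) = 2 Var(X).
sums = [a + b for a, b in product(faces, repeat=2)]
print(var(sums) == 2 * var_x)                # True
```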
Bayes' theorem is easiest to use when every term has a story. P(A) is the prior probability of the explanation, P(B | A) is the likelihood of seeing the evidence under that explanation, and P(A | B) is the updated probability after observing the evidence. The denominator P(B) is not optional; it is the total probability of the evidence across all explanations.
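A minimal numeric sketch of this reading, with assumed toy numbers for the prior and likelihoods:

```python
from fractions import Fraction

# Assumed toy numbers: an explanation with prior 1/100; the evidence appears
# with probability 9/10 under it and 1/10 otherwise (the likelihoods).
prior = {"cond": Fraction(1, 100), "no_cond": Fraction(99, 100)}
likelihood = {"cond": Fraction(9, 10), "no_cond": Fraction(1, 10)}

# Total probability of the evidence across all explanations.
p_evidence = sum(prior[h] * likelihood[h] for h in prior)

# Posterior via Bayes' theorem.
posterior = prior["cond"] * likelihood["cond"] / p_evidence
print(posterior)   # 1/12
```

Even with a fairly reliable signal, the small prior keeps the posterior modest, which is exactly why dropping the denominator leads to overconfident updates.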
Tree diagrams help when an experiment happens in stages. Multiply probabilities along a branch and add probabilities of branches that lead to the event. If sampling is without replacement, branch probabilities change after each draw. If sampling is with replacement, they often stay the same. Drawing two levels of the tree can prevent the common error of treating dependent draws as independent.
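The branch-multiplication rule can be checked against brute force; the urn contents below (3 red, 2 blue, two draws without replacement) are an assumed example:

```python
from fractions import Fraction
from itertools import permutations

balls = ["R"] * 3 + ["B"] * 2   # urn: 3 red, 2 blue

# Tree computation for "both draws red": multiply along the branch.
# First draw red: 3/5; second draw red given that: 2/4.
p_tree = Fraction(3, 5) * Fraction(2, 4)

# Exhaustive check over all 20 ordered draws of two distinct balls.
draws = list(permutations(range(5), 2))
p_enum = Fraction(sum(balls[i] == balls[j] == "R" for i, j in draws), len(draws))

print(p_tree == p_enum)   # True
print(p_tree)             # 3/10
```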
For random variables, always specify the values and their probabilities. The same experiment can support many random variables: in a dice roll, X might be the sum, the maximum, an indicator for doubles, or the number of sixes. Expectation is computed from the random variable's values, not directly from the raw outcome labels unless the variable is the identity value.
A probability answer should include both the event and the denominator being used. Writing 1/6 is less informative than "six ordered dice outcomes out of thirty-six." This habit makes it easier to catch denominator errors after conditioning or after changing from ordered to unordered outcomes.
For expectation problems, check units. If X counts heads, E[X] is a number of heads. If X is a dollar payoff, E[X] is dollars. Unit checks reveal mistakes such as adding a probability directly to a payoff or averaging event labels instead of random-variable values.
Connections
- Counting principles supplies sample-space counts.
- Permutations and combinations counts hands, arrangements, and fixed-position events.
- Pigeonhole and inclusion-exclusion supports probabilities of unions and derangements.
- Algorithms and complexity uses probability in randomized and average-case analysis.