Sample Spaces, Events, and Axioms
Probability begins by separating two ideas that everyday language often mixes together: the outcomes that could happen and the numerical rules used to describe how likely those outcomes are. A sample space lists the possible outcomes of an experiment, while events are the subsets of outcomes we ask questions about. Once those objects are clear, probability is not a collection of gambling tricks; it is a measure on events.
Figure: A Venn diagram connects set operations with the same logical connectives used in proofs. Image: Wikimedia Commons, Watchduck, public domain.
This page gives the formal starting point for the rest of probability theory. Lane et al.'s statistics text introduces probability through equally likely outcomes, relative frequencies, simple compound events, and base rates. A probability-theory course keeps those examples but places them inside Kolmogorov's axioms, which work equally well for finite dice rolls, countably infinite waiting times, and continuous measurements such as lifetimes.
Definitions
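A random experiment is a process whose outcome is not known in advance but whose possible outcomes can be specified. The sample space is the set of all possible outcomes and is usually denoted by $\Omega$ or $S$.
An event is a subset of the sample space. If the observed outcome $\omega$ lies in event $A$, we say that $A$ occurred. Common event operations are:
- Complement: $A^c = \{\omega \in \Omega : \omega \notin A\}$.
- Union: $A \cup B = \{\omega \in \Omega : \omega \in A \text{ or } \omega \in B\}$.
- Intersection: $A \cap B = \{\omega \in \Omega : \omega \in A \text{ and } \omega \in B\}$.
- Difference: $A \setminus B = A \cap B^c$.
- Disjoint events: $A$ and $B$ are disjoint if $A \cap B = \emptyset$.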
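The event operations map directly onto set operations in most programming languages. The following is a minimal Python sketch, assuming a single roll of a six-sided die as the sample space; the events `A` and `B` are illustrative choices, not taken from the text.

```python
# Sample space for one roll of a six-sided die (illustrative choice).
omega = set(range(1, 7))

A = {2, 4, 6}   # "the roll is even"
B = {4, 5, 6}   # "the roll is at least 4"

complement_A = omega - A   # outcomes not in A
union = A | B              # outcomes in A or B (or both)
intersection = A & B       # outcomes in both A and B
difference = A - B         # outcomes in A but not in B


def are_disjoint(E, F):
    """Two events are disjoint when they share no outcomes."""
    return E.isdisjoint(F)


print("A^c     =", sorted(complement_A))
print("A u B   =", sorted(union))
print("A n B   =", sorted(intersection))
print("A \\ B   =", sorted(difference))
print("A, A^c disjoint:", are_disjoint(A, complement_A))
print("A, B disjoint:  ", are_disjoint(A, B))

# De Morgan's law: the complement of a union is the intersection of complements.
print("De Morgan holds:", omega - (A | B) == (omega - A) & (omega - B))
```

Representing events as Python sets keeps the code close to the mathematical notation, so each rule can be checked by direct comparison.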
A probability space is a triple $(\Omega, \mathcal{F}, P)$ where:
- $\Omega$ is the sample space.
- $\mathcal{F}$ is a collection of events, called a sigma-algebra, closed under complements and countable unions.
- $P$ assigns a number $P(A) \in [0, 1]$ to each event $A \in \mathcal{F}$.
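For a finite sample space, the closure conditions on the event collection can be checked by brute force. The sketch below is one hedged way to do that. The function name `is_sigma_algebra` and the two example collections are hypothetical. It also checks that the whole sample space belongs to the collection, which is a standard requirement beyond the wording above. In the finite case, closure under countable unions reduces to closure under pairwise unions.

```python
from itertools import combinations


def is_sigma_algebra(omega, events):
    """Check the sigma-algebra conditions for a finite sample space."""
    omega = frozenset(omega)
    events = {frozenset(e) for e in events}
    # The whole sample space must be an event.
    if omega not in events:
        return False
    # Closed under complements.
    if any(omega - e not in events for e in events):
        return False
    # Closed under unions; pairwise unions suffice when the collection is finite.
    return all(e | f in events for e, f in combinations(events, 2))


omega = {1, 2, 3, 4}
coarse = [set(), {1, 2}, {3, 4}, omega]   # can only tell "low" from "high"
broken = [set(), {1}, omega]              # missing the complement of {1}

print("coarse is a sigma-algebra:", is_sigma_algebra(omega, coarse))
print("broken is a sigma-algebra:", is_sigma_algebra(omega, broken))
```

The `coarse` collection shows that a sigma-algebra need not contain every subset: it only needs to contain the events the model is prepared to measure.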
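In a finite or countable sample space, it is common to assign probabilities to individual outcomes and then add. If $\Omega = \{\omega_1, \omega_2, \dots\}$ and $p_i = P(\{\omega_i\})$, then $p_i \ge 0$, $\sum_i p_i = 1$, and
$$P(A) = \sum_{i \,:\, \omega_i \in A} p_i.$$
When a finite sample space has equally likely outcomes, the classical rule is
$$P(A) = \frac{|A|}{|\Omega|} = \frac{\text{number of outcomes in } A}{\text{number of outcomes in } \Omega}.$$
This rule is useful but not a definition of probability in general. It applies only when the outcomes are equally likely and the sample space is finite.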
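To see why the classical rule is a special case rather than a definition, it can help to compare it with the add-the-outcome-probabilities rule on a model where outcomes are not equally likely. The sketch below assumes a hypothetical loaded die in which face 6 is as likely as all other faces combined; the weights and function names are invented for illustration.

```python
from fractions import Fraction

# Hypothetical loaded die: faces 1-5 have probability 1/10 each, face 6 has 1/2.
weights = {face: Fraction(1, 10) for face in range(1, 6)}
weights[6] = Fraction(1, 2)


def prob(event, weights):
    """General rule: add the probabilities of the outcomes in the event."""
    return sum((weights[w] for w in event), Fraction(0))


def classical_prob(event, omega):
    """Classical rule: only valid when all outcomes are equally likely."""
    return Fraction(len(event), len(omega))


even = {2, 4, 6}

print("weights sum to one:", sum(weights.values()) == 1)
print("P(even), loaded die:    ", prob(even, weights))                 # 7/10
print("P(even), classical rule:", classical_prob(even, weights.keys()))  # 1/2
```

The two answers differ because the classical rule silently assumes a fair die. On the loaded model only the summed outcome probabilities give the right value.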
Key results
The Kolmogorov axioms are the foundation:
| Axiom | Statement | Meaning |
|---|---|---|
| Nonnegativity | $P(A) \ge 0$ for every event $A$ | An event cannot have negative probability. |
| Normalization | $P(\Omega) = 1$ | Something in the sample space occurs. |
| Countable additivity | If $A_1, A_2, \dots$ are pairwise disjoint, then $P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$ | Probabilities of non-overlapping alternatives add. |
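On a small finite sample space, all three axioms can be checked exhaustively. The following sketch assumes a hypothetical three-outcome model with unequal outcome probabilities and takes every subset as an event. With finitely many events, countable additivity reduces to additivity over disjoint pairs.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical three-outcome model (labels and weights are illustrative).
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
omega = frozenset(weights)


def P(event):
    """Probability of an event: the sum of its outcome probabilities."""
    return sum((weights[w] for w in event), Fraction(0))


def all_events(outcomes):
    """Every subset of a finite sample space (the power set)."""
    items = list(outcomes)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


events = all_events(omega)

nonnegative = all(P(E) >= 0 for E in events)
normalized = P(omega) == 1
additive = all(
    P(E | F) == P(E) + P(F)
    for E in events
    for F in events
    if E.isdisjoint(F)
)

print("number of events:", len(events))
print("nonnegativity:", nonnegative)
print("normalization:", normalized)
print("additivity on disjoint pairs:", additive)
```

Exact `Fraction` arithmetic avoids floating-point rounding, so the equality checks are reliable.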
Several rules follow immediately.
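Complement rule. Since $A$ and $A^c$ are disjoint and $A \cup A^c = \Omega$,
$$P(A) + P(A^c) = P(\Omega) = 1, \quad \text{so} \quad P(A^c) = 1 - P(A).$$
Empty event. Since $\Omega$ and $\emptyset$ are disjoint and $\Omega \cup \emptyset = \Omega$,
$$P(\Omega) + P(\emptyset) = P(\Omega), \quad \text{so} \quad P(\emptyset) = 0.$$
Monotonicity. If $A \subseteq B$, then
$$B = A \cup (B \setminus A)$$
with disjoint pieces, so
$$P(B) = P(A) + P(B \setminus A) \ge P(A).$$
Addition rule. For any two events,
$$P(A \cup B) = P(A) + P(B) - P(A \cap B).$$
The subtraction is necessary because outcomes in $A \cap B$ were counted once in $P(A)$ and once in $P(B)$.
Finite inclusion-exclusion. For three events,
$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C).$$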
The pattern alternates between adding single events, subtracting pairwise intersections, and adding the triple intersection.
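These derived rules can be spot-checked by counting on a concrete finite model. The sketch below assumes two fair dice and three events chosen for illustration. The third event `C` ("first die is even") is a hypothetical addition, used only to exercise the three-event formula.

```python
from fractions import Fraction

# Two fair dice: 36 equally likely ordered pairs.
omega = {(i, j) for i in range(1, 7) for j in range(1, 7)}


def prob(event):
    """Classical rule, valid here because all 36 outcomes are equally likely."""
    return Fraction(len(event), len(omega))


A = {w for w in omega if sum(w) == 7}      # the sum is 7
B = {w for w in omega if 6 in w}           # at least one die shows 6
C = {w for w in omega if w[0] % 2 == 0}    # first die is even (illustrative)

# Complement rule and empty event.
complement_ok = prob(omega - A) == 1 - prob(A)
empty_ok = prob(set()) == 0

# Monotonicity: A & B is a subset of A, so its probability cannot be larger.
monotone_ok = (A & B) <= A and prob(A & B) <= prob(A)

# Three-event inclusion-exclusion: add singles, subtract pairs, add the triple.
lhs = prob(A | B | C)
rhs = (
    prob(A) + prob(B) + prob(C)
    - prob(A & B) - prob(A & C) - prob(B & C)
    + prob(A & B & C)
)

print("complement rule holds:", complement_ok)
print("P(empty) = 0:", empty_ok)
print("monotonicity holds:", monotone_ok)
print("inclusion-exclusion holds:", lhs == rhs, "with P(A or B or C) =", lhs)
```

A check on one example is not a proof, but it is a quick way to catch a sign error when writing out inclusion-exclusion by hand.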
The axioms also separate probability from the physical story that motivated it. A coin, a weather forecast, and a randomized algorithm can all be modeled by the same probability rules once the sample space and event class are chosen. The hard modeling work is deciding which outcomes belong in $\Omega$ and which probability assignment is appropriate. In finite examples, the phrase "equally likely" must be justified by symmetry or design. In continuous examples, the probability of a single point is usually zero, so events must be intervals, regions, or other measurable sets. This is why the sigma-algebra appears in the formal definition: probability is assigned to events that the model is prepared to measure, not to arbitrary verbal descriptions.
Visual
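| Event expression | Plain-language reading | Probability rule |
|---|---|---|
| $A^c$ | $A$ does not occur | $P(A^c) = 1 - P(A)$ |
| $A \cap B$ | both $A$ and $B$ occur | depends on the dependence between $A$ and $B$ |
| $A \cup B$ | at least one of $A$, $B$ occurs | $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ |
| $A \setminus B$ | $A$ occurs but $B$ does not | $P(A \setminus B) = P(A) - P(A \cap B)$ |
| $A \cap B = \emptyset$ | $A$ and $B$ are disjoint: no shared outcomes | $P(A \cup B) = P(A) + P(B)$ |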
Worked example 1: two dice and event algebra
Problem. Roll two fair six-sided dice. Let $A$ be the event that the sum is $7$, and let $B$ be the event that at least one die shows $6$. Find $P(A)$, $P(B)$, $P(A \cap B)$, and $P(A \cup B)$.
Method.
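- The sample space is the set of ordered pairs:
  $$\Omega = \{(i, j) : i, j \in \{1, 2, 3, 4, 5, 6\}\}.$$
  There are $36$ equally likely outcomes.
- Event $A$ contains the pairs whose sum is $7$:
  $$A = \{(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)\}.$$
  Therefore $|A| = 6$ and
  $$P(A) = \frac{6}{36} = \frac{1}{6}.$$
- Event $B$ contains outcomes with a $6$ in the first coordinate or second coordinate. There are $6$ outcomes with first die $6$ and $6$ outcomes with second die $6$, but $(6,6)$ is counted twice:
  $$|B| = 6 + 6 - 1 = 11.$$
  Hence
  $$P(B) = \frac{11}{36}.$$
- For $A \cap B$, the sum must be $7$ and at least one die must be $6$. From the list for $A$, only $(1,6)$ and $(6,1)$ qualify:
  $$P(A \cap B) = \frac{2}{36} = \frac{1}{18}.$$
- Use the addition rule:
  $$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{6}{36} + \frac{11}{36} - \frac{2}{36} = \frac{15}{36} = \frac{5}{12}.$$
Checked answer. Direct counting confirms this: $A \cup B$ has the $11$ outcomes in $B$ plus the $4$ sum-seven outcomes without a $6$, for $15$ outcomes out of $36$.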
Worked example 2: a countably infinite sample space
Problem. A coin is tossed until the first head appears. Let $N$ be the toss number on which the first head appears. For a fair coin, find $P(N \le 3)$ and $P(N > 3)$.
Method.
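- The sample space can be written as
  $$\Omega = \{H, TH, TTH, TTTH, \dots\}.$$
  Equivalently, the outcome is $N \in \{1, 2, 3, \dots\}$.
- The event $N = n$ means the first $n - 1$ tosses are tails and the $n$-th toss is heads:
  $$P(N = n) = \left(\frac{1}{2}\right)^{n-1} \cdot \frac{1}{2} = \left(\frac{1}{2}\right)^{n}.$$
- Add the disjoint cases $N = 1, 2, 3$:
  $$P(N \le 3) = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} = \frac{7}{8}.$$
- Use the complement rule:
  $$P(N > 3) = 1 - P(N \le 3) = 1 - \frac{7}{8} = \frac{1}{8}.$$
- Check directly. The event $N > 3$ means the first three tosses are all tails:
  $$P(N > 3) = \left(\frac{1}{2}\right)^{3} = \frac{1}{8}.$$
Checked answer. $P(N \le 3) = 7/8$ and $P(N > 3) = 1/8$. The probabilities over the infinite sample space still sum to one because
$$\sum_{n=1}^{\infty} \left(\frac{1}{2}\right)^{n} = 1.$$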
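The waiting-time calculation can be reproduced with exact fractions and compared against a simulation. This is a minimal sketch assuming a fair coin; the helper names `pmf` and `first_head`, the seed, and the number of trials are illustrative choices. The simulated value is only an estimate and will vary with the seed.

```python
from fractions import Fraction
import random


def pmf(n, p=Fraction(1, 2)):
    """P(first head on toss n): n - 1 tails followed by one head."""
    return (1 - p) ** (n - 1) * p


# Exact values: add the disjoint cases, then use the complement rule.
p_within_3 = sum(pmf(n) for n in range(1, 4))
p_after_3 = 1 - p_within_3

# Partial sums of the outcome probabilities approach one.
partial_20 = sum(pmf(n) for n in range(1, 21))


def first_head(rng):
    """Simulate fair tosses and return the index of the first head."""
    n = 1
    while rng.random() >= 0.5:   # treat this branch as "tails"
        n += 1
    return n


rng = random.Random(0)           # fixed seed so the run is repeatable
trials = 100_000
estimate = sum(first_head(rng) <= 3 for _ in range(trials)) / trials

print("P(first head within 3 tosses) =", p_within_3)
print("P(first head after toss 3)    =", p_after_3)
print("1 - (sum of first 20 terms)   =", 1 - partial_20)
print("simulated estimate            =", estimate)
```

The gap `1 - partial_20` equals $(1/2)^{20}$, which illustrates how the infinite sum converges to one even though no finite partial sum reaches it.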
Code
from fractions import Fraction

# Sample space: all 36 ordered pairs for two fair six-sided dice.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]

A = {(i, j) for (i, j) in omega if i + j == 7}        # the sum is 7
B = {(i, j) for (i, j) in omega if i == 6 or j == 6}  # at least one die shows 6

def prob(event):
    # Classical rule: favorable outcomes over total outcomes.
    return Fraction(len(event), len(omega))

print("P(A) =", prob(A))
print("P(B) =", prob(B))
print("P(A and B) =", prob(A & B))
print("P(A or B) =", prob(A | B))

# Verify the addition rule.
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)
print("addition rule holds:", lhs == rhs)
Common pitfalls
- Treating a sample space as equally likely without checking the modeling assumption. The eleven possible sums of two dice, for example, are not equally likely.
- Forgetting that $A \cup B$ is inclusive: it includes outcomes where both $A$ and $B$ occur.
- Adding $P(A) + P(B)$ for overlapping events without subtracting $P(A \cap B)$.
- Confusing the impossible event $\emptyset$ with events that have probability zero in continuous models. A single-point event such as $\{x\}$ can have probability zero under a continuous distribution without being logically impossible.
- Defining events vaguely. "A high value" is not an event until the cutoff is specified.
- Assuming probability one means logically guaranteed. In continuous models, a probability-one event may still exclude exceptional outcomes, such as a single point that has probability zero.
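The first pitfall can be made concrete by tabulating the sums of two fair dice. The sketch below counts how many of the 36 equally likely ordered pairs produce each sum and compares the result with the naive assumption that all eleven sums are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

# The 36 ordered pairs are equally likely; the sums they produce are not.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
sum_counts = Counter(i + j for i, j in omega)

# Naive (incorrect) model: treat the 11 possible sums as equally likely.
naive = Fraction(1, len(sum_counts))

for s in sorted(sum_counts):
    correct = Fraction(sum_counts[s], len(omega))
    print(f"sum {s:2d}: correct {correct}, naive {naive}")
```

Only by returning to the ordered-pair sample space, where symmetry justifies "equally likely", does the count for each sum give the correct probability.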