Pigeonhole and Inclusion-Exclusion
The pigeonhole principle proves that collisions are unavoidable. Inclusion-exclusion counts unions by correcting overcounting. Together they handle problems where direct counting is messy because objects overlap or constraints force repetition.
Both tools require careful modeling. For pigeonhole arguments, the hard part is choosing boxes so that sharing a box gives the desired conclusion. For inclusion-exclusion, the hard part is identifying which objects were counted more than once and correcting in the right alternating pattern.
Definitions
The pigeonhole principle says that if objects are placed into boxes and , then some box contains at least two objects.
The generalized form says that some box contains at least
objects. Equivalently, if more than objects are placed into boxes, then some box contains at least objects.
The principle of inclusion-exclusion for two sets is
For three sets:
A derangement is a permutation with no fixed points. Derangements are a classic inclusion-exclusion application because the forbidden events "position is fixed" overlap.
Key results
General inclusion-exclusion for finite sets :
Proof idea: fix an element that belongs to exactly of the sets. Its total contribution to the right side is
using the binomial identity from . So every element in the union is counted once.
The derangement formula is
To derive it, let be the set of permutations fixing position . Then , and there are choices of the fixed positions. Inclusion-exclusion gives
Factoring out yields the formula above.
Visual
| Tool | What it proves or counts | Modeling question |
|---|---|---|
| basic pigeonhole | at least one collision | what are the boxes? |
| generalized pigeonhole | at least in one box | what is box capacity? |
| complement | avoid direct "at least one" count | what does failure look like? |
| two-set inclusion-exclusion | union of two overlapping sets | what was double counted? |
| full inclusion-exclusion | union of many overlapping sets | what intersections have each size? |
| derangements | avoid all fixed points | what forbidden events overlap? |
Worked example 1: Force a shared remainder
Problem. Prove that among any integers, two have the same remainder when divided by .
Method.
- The objects are the integers.
- The boxes are the possible remainders modulo :
- There are boxes.
- Each integer goes into exactly one box, according to its remainder.
- Since , the pigeonhole principle implies that some box contains at least two integers.
Checked answer. Two of the integers have the same remainder modulo . Equivalently, their difference is divisible by .
Worked example 2: Count integers divisible by 2, 3, or 5
Problem. How many positive integers at most are divisible by , , or ?
Method.
- Let be the sets of positive integers at most divisible by respectively.
- Single counts:
- Pairwise intersections are multiples of lcms:
- Triple intersection:
- Apply inclusion-exclusion:
Checked answer. There are positive integers at most divisible by , , or . The final restores multiples of , which were subtracted once too many after being included in all three single counts.
Code
from math import factorial
def derangements(n):
total = 0
for k in range(n + 1):
total += (-1) ** k * factorial(n) // factorial(k)
return total
def count_divisible(limit, divisors):
return sum(
1
for x in range(1, limit + 1)
if any(x % d == 0 for d in divisors)
)
print([derangements(n) for n in range(1, 8)])
print(count_divisible(100, [2, 3, 5]))
The brute-force divisibility count is useful for checking the inclusion-exclusion calculation on small limits.
Common pitfalls
- Choosing boxes that do not imply the desired conclusion when two objects share a box.
- Forgetting the ceiling in the generalized pigeonhole principle.
- Treating inclusion-exclusion as simple subtraction. Triple overlaps must be added back.
- Counting intersections with products instead of least common multiples in divisibility problems.
- Applying derangement formulas to objects with repeated labels without adjusting the model.
- Assuming pigeonhole arguments identify which box is crowded. They usually prove existence only.
A reliable pigeonhole proof has three visible parts: the objects, the boxes, and the reason that two objects in one box force the desired conclusion. If any part is missing, the proof is usually incomplete. For example, to show two selected integers have the same remainder modulo , the boxes are remainders. To show two selected subsets have the same size, the boxes are possible cardinalities. The boxes should be chosen from the conclusion, not from the surface wording of the problem.
Capacity arguments are often easier in contrapositive form. To prove that some box contains at least objects, suppose every box contains at most objects. Then boxes contain at most objects total. If the actual number of objects is larger, the supposition is impossible. This form avoids rounding mistakes because it turns the ceiling statement into an integer capacity statement.
For inclusion-exclusion, organize the computation by intersection size. In a divisibility problem, single sets count multiples of one divisor, pairwise intersections count multiples of least common multiples of two divisors, and triple intersections count multiples of least common multiples of three divisors. In derangement problems, a -fold intersection means specified positions are fixed. Naming this pattern before calculating reduces the chance of omitting an intersection term.
The sign pattern also has a reason. Elements in exactly one set are counted once by the single terms. Elements in exactly two sets are counted twice and then subtracted once. Elements in exactly three sets are counted three times, subtracted three times in pairwise intersections, and added once in the triple intersection. When unsure about a formula, test one element belonging to exactly sets and compute its net contribution.
Inclusion-exclusion counts a union of forbidden events just as well as a union of desired events. Many problems ask for objects with none of several bad properties. The standard route is to count all objects and subtract the union of bad events, using inclusion-exclusion to count that union accurately.
For pigeonhole problems involving integers, remainders, parity, and intervals are common boxes. If the desired conclusion is "two numbers have a difference divisible by ," use residue classes modulo . If the conclusion is "two numbers sum to a fixed value," pair possible numbers into boxes such as and so on. If the conclusion is "two values are close," divide the number line into intervals whose lengths force closeness.
For inclusion-exclusion problems involving functions or strings, define a forbidden event for each violated condition. A password problem with missing required character types might use events for no letters, for no digits, and for no symbols. The intersections then have a concrete interpretation: means the password uses only symbols. This naming discipline prevents formulas from becoming detached from the objects counted.
The strongest check is to test a small case by enumeration. For derangements, compute , , and by listing. For divisibility, list integers up to before trusting a formula up to . Small cases expose missing intersection terms quickly because the final count may be negative, too large, or inconsistent with an obvious list.
For inclusion-exclusion with many sets, write a table of intersection sizes before substituting into the formula. The table should have rows for single sets, pair intersections, triple intersections, and so on. This separates the combinatorial modeling from the arithmetic and makes sign mistakes easier to spot.
Pigeonhole conclusions are existential. They usually do not identify which object pair collides or which box is crowded. If a problem asks for an algorithm to find the pair, the pigeonhole principle may prove the pair exists, but an additional search or constructive argument is needed to locate it.
For inclusion-exclusion, the full universe should be clear even when only a union is counted. In derangements, the universe is all permutations. In divisibility problems, it is usually integers in a bounded interval. Naming the universe clarifies whether the final answer is a raw union size, a complement count, or a probability.
Connections
- Counting principles introduces the sum, subtraction, and division rules.
- Permutations and combinations supplies factorials and binomial coefficients.
- Discrete probability uses inclusion-exclusion for probabilities of unions.
- Number theory basics supplies divisibility, remainders, gcds, and lcms.