
Association Pattern Mining

Association pattern mining finds combinations of items, events, or attribute values that occur together more often than a user-defined threshold requires, or more strongly than chance alone would predict. Aggarwal treats this as one of the four major data mining building blocks because it converts large transaction collections into interpretable co-occurrence structure. The classic setting is market basket analysis, but the same ideas apply to web paths, text terms, categorical profiles, biological markers, and many discretized data sets.

This page covers the basic frequent itemset model, association rules, Apriori, pattern growth, and interest measures. The advanced chapter adds closed and maximal patterns, querying, constraints, and downstream uses.

Definitions

Let $I=\{i_1,\dots,i_m\}$ be a set of items. A transaction $T$ is a subset of $I$. A transaction database $\mathcal{D}$ is a multiset of transactions.

An itemset $X$ is a subset of $I$. Transaction $T$ contains $X$ if $X \subseteq T$.

The support count of itemset $X$ is

$$\mathrm{supcount}(X) = |\{T \in \mathcal{D} : X \subseteq T\}|.$$

The support is often reported as the fraction $\mathrm{supcount}(X)/|\mathcal{D}|$.

An itemset is frequent if its support count is at least a threshold $\sigma$.

An association rule has the form $X \Rightarrow Y$, where $X \cap Y = \emptyset$. Its support is $\mathrm{sup}(X \cup Y)$ and its confidence is

$$\mathrm{conf}(X \Rightarrow Y) = \frac{\mathrm{sup}(X \cup Y)}{\mathrm{sup}(X)}.$$

The Apriori property states that every subset of a frequent itemset is frequent. Equivalently, if an itemset is infrequent, all of its supersets are infrequent.

FP-growth mines frequent patterns by compressing transactions into an FP-tree and recursively mining conditional pattern bases, avoiding the breadth-first candidate explosion of Apriori in many dense cases.

Key results

Downward closure enables pruning. If itemset $X$ is not frequent, then any $Y \supseteq X$ cannot be frequent, because every transaction containing $Y$ also contains $X$. Therefore $\mathrm{supcount}(Y) \le \mathrm{supcount}(X) < \sigma$.
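The pruning test can be sketched as a small helper (the function name and the example frequent-pair set are illustrative, not part of the text):

```python
from itertools import combinations

def has_infrequent_subset(candidate, frequent_prev):
    """Return True if any (k-1)-subset of a k-item candidate is not frequent."""
    k = len(candidate)
    return any(frozenset(s) not in frequent_prev
               for s in combinations(candidate, k - 1))

# Suppose {bread, beer} was found infrequent at level 2.
frequent_2 = {frozenset(p) for p in [("bread", "milk"), ("bread", "diaper"),
                                     ("milk", "diaper"), ("diaper", "beer")]}

# {bread, diaper, beer} can be pruned without counting: {bread, beer} is missing.
print(has_infrequent_subset(frozenset({"bread", "diaper", "beer"}), frequent_2))  # True
# {bread, milk, diaper} survives pruning: all three of its pairs are frequent.
print(has_infrequent_subset(frozenset({"bread", "milk", "diaper"}), frequent_2))  # False
```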

Apriori is level-wise. It first finds frequent 1-itemsets, then generates candidate 2-itemsets, counts them, prunes infrequent ones, and repeats. It is simple and exact, but candidate generation and repeated database scans can be expensive.

Rule generation follows itemset mining. First mine frequent itemsets. Then for each frequent itemset $Z$, choose a nonempty $X \subset Z$ and set $Y = Z \setminus X$. The rule $X \Rightarrow Y$ is accepted if its confidence exceeds a threshold. This separation is useful because support pruning applies to itemsets, while confidence is evaluated after itemsets are known.
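A minimal sketch of this second phase, assuming the support counts from the toy database used later on this page (the helper name is illustrative):

```python
from itertools import combinations

# Support counts for one frequent itemset Z and its subsets (from the toy data).
support = {
    frozenset({"diaper"}): 4,
    frozenset({"beer"}): 3,
    frozenset({"diaper", "beer"}): 3,
}

def rules_from_itemset(Z, support, min_conf):
    """Emit every rule X => Z\\X whose confidence meets min_conf."""
    rules = []
    for r in range(1, len(Z)):          # every nonempty proper subset size
        for X in combinations(sorted(Z), r):
            X = frozenset(X)
            conf = support[Z] / support[X]
            if conf >= min_conf:
                rules.append((X, Z - X, conf))
    return rules

Z = frozenset({"diaper", "beer"})
for X, Y, conf in rules_from_itemset(Z, support, min_conf=0.7):
    print(set(X), "=>", set(Y), round(conf, 2))
```

Note that no support counting happens here: every number the confidence calculation needs was already produced by itemset mining.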

Support and confidence are not enough for all applications. A rule can have high confidence simply because the consequent is common. Interest ratio, lift, correlation, cosine, Jaccard, and related measures compare observed co-occurrence with baselines.

For binary variables $A$ and $B$, lift is

$$\mathrm{lift}(A \Rightarrow B) = \frac{P(A \cap B)}{P(A)\,P(B)}.$$

A value above 1 suggests positive association; below 1 suggests negative association.

FP-growth trades candidates for conditional recursion. It compresses common prefixes in the transaction database. When data have many shared prefixes after frequency ordering, the tree is much smaller than the raw database.
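The compression effect can be illustrated without a full FP-growth implementation: order each transaction by descending item frequency and count the distinct prefixes the paths would create. The distinct-prefix count is a rough stand-in for FP-tree node count (an assumption for illustration, not the exact tree size):

```python
from collections import Counter

transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]

freq = Counter(i for t in transactions for i in t)

def fp_node_count(transactions, freq):
    """Count distinct prefixes after frequency ordering (~ FP-tree nodes)."""
    prefixes = set()
    for t in transactions:
        # Frequency-descending order maximizes shared prefixes.
        path = sorted(t, key=lambda i: (-freq[i], i))
        for k in range(1, len(path) + 1):
            prefixes.add(tuple(path[:k]))
    return len(prefixes)

total_items = sum(len(t) for t in transactions)
print("raw item occurrences:", total_items)           # 18
print("distinct prefixes after ordering:", fp_node_count(transactions, freq))  # 12
```

Even on this tiny database the ordered paths share prefixes; on large retail data the reduction is typically far larger.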

Frequent does not mean useful. A very common pattern may simply restate the obvious, such as every grocery basket containing a checkout item. A rare pattern may be more actionable if it is highly discriminative or expensive. Practical association mining therefore combines support thresholds with interest measures, domain constraints, redundancy control, and sometimes statistical testing. The analyst should also inspect whether patterns survive on a holdout period, because co-occurrences in small or seasonal data can be unstable.

Transaction construction is a major modeling choice. A web basket might be one session, one user-day, or one user's full history. A retail basket might include returns or exclude them. A medical transaction might represent one visit or one patient. These choices change support counts and rule meanings, so they should be documented before interpreting the mined patterns.
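A sketch of how the basket definition changes what gets counted, using a hypothetical click log (the field layout, session gap, and helper names are all assumptions for illustration):

```python
from collections import defaultdict

# Hypothetical click log: (user, timestamp in minutes, item).
events = [
    ("u1", 0, "a"), ("u1", 5, "b"),      # one browsing session
    ("u1", 120, "a"), ("u1", 125, "c"),  # a later session, same user
    ("u2", 10, "a"), ("u2", 12, "b"),
]

def baskets_per_session(events, gap=30):
    """One basket per session: split on idle gaps longer than `gap` minutes."""
    by_user = defaultdict(list)
    for user, t, item in sorted(events):
        by_user[user].append((t, item))
    baskets = []
    for user, seq in by_user.items():
        current, last_t = set(), None
        for t, item in seq:
            if last_t is not None and t - last_t > gap:
                baskets.append(current)
                current = set()
            current.add(item)
            last_t = t
        baskets.append(current)
    return baskets

def baskets_per_user(events):
    """One basket per user: the user's full history."""
    by_user = defaultdict(set)
    for user, _, item in events:
        by_user[user].add(item)
    return list(by_user.values())

print(baskets_per_session(events))  # 3 session baskets
print(baskets_per_user(events))     # 2 user baskets
```

The same log yields three baskets under one definition and two under the other, so the support of any itemset, and every rule built on it, changes with the choice.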

Rule direction is not causality. The rule $X \Rightarrow Y$ means that $Y$ is common among transactions containing $X$; it does not prove that buying $X$ caused buying $Y$. In many applications the direction is chosen for actionability, not because the data establish cause. Temporal ordering, experiments, or stronger assumptions are needed for causal claims.

Visual

| Measure | Formula | Interpretation | Limitation |
| --- | --- | --- | --- |
| Support | $P(X \cup Y)$ | How common the full pattern is | Misses rare but strong patterns |
| Confidence | $P(Y \mid X)$ | Reliability of the implication | Inflated by common consequents |
| Lift | $P(X \cup Y)/(P(X)\,P(Y))$ | Departure from independence | Unstable for rare events |
| Cosine | $P(X \cup Y)/\sqrt{P(X)\,P(Y)}$ | Symmetric co-occurrence strength | Less direct than confidence |
| Jaccard | $P(X \cup Y)/(P(X) + P(Y) - P(X \cup Y))$ | Overlap among occurrences | Ignores joint absence |

Here $X \cup Y$ denotes the combined itemset, so $P(X \cup Y)$ is the probability that a transaction contains every item of both $X$ and $Y$.
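All five measures are functions of just three probabilities; a minimal sketch, evaluated on the diaper/beer numbers from the worked examples below (the function name is illustrative):

```python
from math import sqrt

def measures(p_x, p_y, p_xy):
    """Interest measures for X => Y from marginal and joint probabilities."""
    return {
        "support": p_xy,
        "confidence": p_xy / p_x,
        "lift": p_xy / (p_x * p_y),
        "cosine": p_xy / sqrt(p_x * p_y),
        "jaccard": p_xy / (p_x + p_y - p_xy),
    }

# diaper => beer in the toy database: P(diaper)=4/5, P(beer)=3/5, joint=3/5.
m = measures(p_x=4/5, p_y=3/5, p_xy=3/5)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```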

Worked example 1: Apriori support counting

Problem. Mine frequent itemsets with minimum support count $\sigma = 3$.

| transaction | items |
| --- | --- |
| 1 | bread, milk |
| 2 | bread, diaper, beer, eggs |
| 3 | milk, diaper, beer, cola |
| 4 | bread, milk, diaper, beer |
| 5 | bread, milk, diaper, cola |

Method.

  1. Count 1-itemsets:

    • bread: transactions 1,2,4,5 -> 4
    • milk: 1,3,4,5 -> 4
    • diaper: 2,3,4,5 -> 4
    • beer: 2,3,4 -> 3
    • eggs: 2 -> 1
    • cola: 3,5 -> 2

    Frequent 1-itemsets: bread, milk, diaper, beer.

  2. Generate 2-item candidates from frequent items and count:

    • bread,milk: transactions 1,4,5 -> 3
    • bread,diaper: 2,4,5 -> 3
    • bread,beer: 2,4 -> 2
    • milk,diaper: 3,4,5 -> 3
    • milk,beer: 3,4 -> 2
    • diaper,beer: 2,3,4 -> 3

    Frequent 2-itemsets: {bread,milk}, {bread,diaper}, {milk,diaper}, {diaper,beer}.

  3. Generate 3-item candidates. Apriori allows {bread,milk,diaper} because all of its 2-subsets, {bread,milk}, {bread,diaper}, and {milk,diaper}, are frequent. Its support count is only 2 (transactions 4 and 5), so it is not frequent.

    Candidate {bread,diaper,beer} is pruned because {bread,beer} is infrequent.

Checked answer. The frequent itemsets are the four frequent singletons and four frequent pairs listed above. No 3-itemset reaches support count 3.
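The hand counts above can be double-checked by brute-force enumeration, which is affordable at this scale:

```python
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]

items = sorted(set().union(*transactions))
frequent = {}
for k in range(1, len(items) + 1):
    for combo in combinations(items, k):
        # Count transactions containing every item of the candidate.
        count = sum(1 for t in transactions if set(combo) <= t)
        if count >= 3:
            frequent[frozenset(combo)] = count

print(len(frequent))                      # 8: four singletons, four pairs
print(max(len(s) for s in frequent))      # 2: no frequent 3-itemset
```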

Worked example 2: Rule confidence and lift

Problem. From the same data, evaluate the rule

$$\{\text{diaper}\} \Rightarrow \{\text{beer}\}.$$

Method.

  1. Support count of diaper is 4, so $P(\text{diaper}) = 4/5$.
  2. Support count of beer is 3, so $P(\text{beer}) = 3/5$.
  3. Support count of diaper and beer together is 3, so $P(\text{diaper}, \text{beer}) = 3/5$.
  4. Confidence:
$$\mathrm{conf} = \frac{P(\text{diaper}, \text{beer})}{P(\text{diaper})} = \frac{3/5}{4/5} = \frac{3}{4} = 0.75.$$
  5. Lift:
$$\mathrm{lift} = \frac{3/5}{(4/5)(3/5)} = \frac{0.6}{0.48} = 1.25.$$

Checked answer. The rule has confidence $0.75$ and lift $1.25$. It is stronger than independence in this toy database, but the database is small, so the estimate is not stable.
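The arithmetic can be verified directly from the transactions:

```python
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]
n = len(transactions)

def prob(itemset):
    """Fraction of transactions containing every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / n

p_d, p_b = prob({"diaper"}), prob({"beer"})
p_db = prob({"diaper", "beer"})
confidence = p_db / p_d
lift = p_db / (p_d * p_b)
print(confidence, lift)  # ≈ 0.75 and ≈ 1.25
```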

Code

Pseudocode for Apriori:

INPUT: transaction database D, minimum support sigma
OUTPUT: all frequent itemsets F

F1 = frequent 1-itemsets
k = 2
while F(k-1) is not empty:
    Ck = join pairs of itemsets in F(k-1) that share all but one item
    remove any candidate whose (k-1)-subset is not in F(k-1)
    count support of each candidate in one scan of D
    Fk = candidates with support at least sigma
    k = k + 1
return union of all Fk
A runnable Python version of the same algorithm:

from itertools import combinations
from collections import Counter

transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]

def support_counts(candidates):
    # One scan of the database per level.
    counts = Counter()
    for t in transactions:
        for c in candidates:
            if c.issubset(t):
                counts[c] += 1
    return counts

def apriori(min_support):
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items]
    all_freq = {}
    while level:
        counts = support_counts(level)
        freq = {c: n for c, n in counts.items() if n >= min_support}
        all_freq.update(freq)
        # Join step: unite pairs of frequent k-itemsets that differ in one item.
        prev = sorted(freq, key=sorted)
        candidates = set()
        for a, b in combinations(prev, 2):
            union = a | b
            if len(union) == len(a) + 1:
                # Prune step: every k-subset of the candidate must be frequent.
                if all(frozenset(s) in freq for s in combinations(union, len(a))):
                    candidates.add(union)
        level = sorted(candidates, key=sorted)
    return all_freq

for itemset, count in apriori(3).items():
    print(set(itemset), count)

Common pitfalls

  • Setting minimum support so high that only obvious patterns remain.
  • Setting minimum support so low that the output explodes into millions of patterns.
  • Interpreting confidence as causation.
  • Ignoring base rates; high confidence may only reflect a common consequent.
  • Treating every generated rule as actionable without statistical validation or business constraints.
  • Forgetting to sort or order items consistently in candidate generation.
  • Mining continuous attributes without meaningful discretization.
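On the last pitfall: continuous attributes must be binned into categorical items before mining. A minimal equal-width-style sketch (the attribute name and bin edges are assumptions for illustration):

```python
def discretize(value, name, edges):
    """Map a numeric value to a categorical item like 'age=30-45'."""
    for lo, hi in zip(edges, edges[1:]):
        if lo <= value < hi:
            return f"{name}={lo}-{hi}"
    return f"{name}>={edges[-1]}"

age_edges = [0, 18, 30, 45, 65]
print(discretize(34, "age", age_edges))   # age=30-45
print(discretize(70, "age", age_edges))   # age>=65
```

Bin boundaries change which items exist and therefore which patterns can be found, so they deserve the same scrutiny as the support threshold.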

Connections