Algorithms and Complexity

An algorithm is a finite, precise procedure for solving a class of problems. Discrete mathematics studies algorithms at two levels: correctness and cost. Correctness asks whether the algorithm always returns the intended answer. Complexity asks how resource use grows as the input size grows.

The subject is not just about writing code. It is about describing a procedure independently of a programming language, proving that the procedure terminates with the right result, and estimating the amount of work needed for large inputs. This is why algorithms connect proof techniques, functions, sums, recurrence relations, number theory, graphs, and computation theory.

Definitions

An algorithm has specified input, specified output, definiteness of each step, correctness, finiteness, and effectiveness. Pseudocode abstracts away machine details while keeping the logical structure. A step is effective when it can be carried out exactly in a finite amount of time by the intended model of computation.

The input size is the parameter used to measure growth. For a list, it may be the number of elements $n$. For an integer $N$, it is often the number of bits, about $\lfloor\log_2 N\rfloor+1$. This distinction matters: trial division up to $\sqrt N$ is roughly $O(\sqrt N)$ arithmetic divisions as a function of the integer value, but that is exponential in the number of input bits.
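
As a rough illustration, here is a minimal trial-division sketch. The helper name trial_division_factor and the choice of $N=10^9+7$ (a well-known prime) are illustrative, not from the text; for this 30-bit input the worst case already costs tens of thousands of divisions.

def trial_division_factor(N):
    # Naive trial division: try every candidate divisor d with d*d <= N.
    divisions = 0
    d = 2
    while d * d <= N:
        divisions += 1
        if N % d == 0:
            return d, divisions
        d += 1
    return None, divisions   # no divisor found, so N is prime (for N >= 2)

N = 10**9 + 7                                  # a well-known prime: the worst case for trial division
factor, divisions = trial_division_factor(N)
print("input size in bits:", N.bit_length())   # about floor(log2 N) + 1 = 30
print("divisions performed:", divisions)       # about sqrt(N), roughly 31622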

Asymptotic notation compares functions for large input:

  • $f(n)=O(g(n))$ if $|f(n)| \le C|g(n)|$ for some constant $C>0$ and all sufficiently large $n$.
  • $f(n)=\Omega(g(n))$ if $|f(n)| \ge C|g(n)|$ for some constant $C>0$ and all sufficiently large $n$.
  • $f(n)=\Theta(g(n))$ if both $f(n)=O(g(n))$ and $f(n)=\Omega(g(n))$.
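
As a quick numerical sanity check (not a proof): for the hypothetical polynomial $f(n)=3n^2+5n+2$, the witnesses $C=4$ for the $O(n^2)$ bound and $C=1$ for the $\Omega(n^2)$ bound work once $n\ge10$.

def f(n):
    # A hypothetical polynomial used only to illustrate witness constants.
    return 3 * n**2 + 5 * n + 2

# Check n^2 <= f(n) <= 4*n^2 on a sample of large n (holds for all n >= 10).
for n in range(10, 10_001, 990):
    assert n**2 <= f(n) <= 4 * n**2
print("witnesses C = 1 (lower) and C = 4 (upper) hold on the sampled range")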

Worst-case complexity bounds the maximum cost over all inputs of size $n$. Best-case complexity bounds the minimum. Average-case complexity uses a probability distribution over inputs. Space complexity measures memory use rather than time.

A loop invariant is a statement that holds before the loop starts and is preserved by each iteration. It is the standard tool for proving iterative algorithms correct. A recursive algorithm is usually proved correct by induction on the input size or on the recursive structure.
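
A minimal sketch of a loop invariant on a simple running-sum loop; the helper running_total is illustrative and not an algorithm from this section.

def running_total(nums):
    # Invariant: before processing nums[i], total equals sum(nums[:i]).
    total = 0
    for i, x in enumerate(nums):
        assert total == sum(nums[:i])   # invariant holds at the top of each iteration
        total += x
    assert total == sum(nums)           # invariant at exit is exactly the correctness claim
    return total

print(running_total([3, 1, 4, 1, 5]))   # 14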

Key results

If

$$f(n)=a_k n^k+a_{k-1}n^{k-1}+\cdots+a_0$$

with $a_k\ne0$, then $f(n)=\Theta(n^k)$. The leading term dominates because each lower power is eventually bounded by a constant multiple of $n^k$.
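
A quick numerical illustration, using the hypothetical polynomial $p(n)=2n^3+7n^2+5$: the ratio $p(n)/n^3$ settles near the leading coefficient, which is the content of the $\Theta(n^k)$ claim.

def p(n):
    # A hypothetical polynomial with degree 3 and leading coefficient 2.
    return 2 * n**3 + 7 * n**2 + 5

# p(n) / n^3 approaches 2 as n grows.
for n in (10, 100, 1000, 10_000):
    print(n, p(n) / n**3)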

For a simple nested loop

for i = 1 to n:
    for j = 1 to n:
        do constant work

the number of constant-work operations is $n^2$, so the time is $\Theta(n^2)$. If the inner loop instead runs from $1$ to $i$, the count is

$$\sum_{i=1}^{n}i=\frac{n(n+1)}{2}=\Theta(n^2).$$
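
A short empirical check of this count; the helper triangular_count is illustrative.

def triangular_count(n):
    # Count how many times the constant-work step runs when the inner loop goes from 1 to i.
    count = 0
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            count += 1
    return count

n = 200
print(triangular_count(n), n * (n + 1) // 2)   # both are 20100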

Binary search on a sorted list takes $O(\log n)$ comparisons. Each comparison discards about half the remaining candidates. After $k$ comparisons, at most $n/2^k$ candidates remain. The search ends when $n/2^k\le1$, so $k\ge\log_2 n$.

Merge sort has recurrence

$$T(n)=2T(n/2)+cn.$$

For powers of two, the recursion tree has $\log_2 n$ nontrivial levels, and each level costs $\Theta(n)$ for merging. Thus $T(n)=\Theta(n\log n)$. This style of reasoning prepares for recurrence relations and the master theorem.
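
A minimal sketch that unrolls the recurrence for powers of two. The base case $T(1)=c$ is an assumption made for the sketch; with it the closed form is exactly $cn(\log_2 n+1)$.

import math

def T(n, c=1):
    # Unroll T(n) = 2*T(n/2) + c*n for n a power of two, assuming T(1) = c.
    if n == 1:
        return c
    return 2 * T(n // 2, c) + c * n

for n in (8, 64, 1024):
    # With this base case, T(n) equals c*n*(log2(n) + 1).
    print(n, T(n), n * (math.log2(n) + 1))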

Polynomial time is usually treated as feasible in theoretical computer science, while exponential time often becomes infeasible quickly. This is a classification, not a stopwatch. Constants, memory locality, implementation details, and input distribution still matter in real programs.

Visual

Growth          Example source                     Typical description
$O(1)$          array index, fixed arithmetic      constant time
$O(\log n)$     binary search, repeated halving    logarithmic
$O(n)$          scan a list                        linear
$O(n \log n)$   merge sort                         near linear
$O(n^2)$        all unordered pairs                quadratic
$O(2^n)$        all subsets                        exponential
$O(n!)$         all permutations                   factorial

Worked example 1: Count binary-search comparisons

Problem. A sorted list has $1000$ elements. What is the maximum number of comparisons binary search needs to decide whether a target is present?

Method.

  1. After $k$ comparisons, binary search leaves at most $1000/2^k$ candidates.
  2. We need this number to be at most $1$:
$$\frac{1000}{2^k}\le1.$$
  3. Rearranging gives
$$1000\le2^k.$$
  4. Compare powers of two:
$$2^9=512,\qquad 2^{10}=1024.$$
  5. Therefore $k=10$ comparisons are enough in the worst case. Nine are not always enough, because $512<1000$.

Checked answer. The worst-case number of comparisons is $\lceil\log_2 1000\rceil=10$. This count is independent of the target value; it depends on the repeated-halving structure.
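
A one-line numerical confirmation of this count:

import math

# Worst-case comparison count for a 1000-element sorted list.
print(math.ceil(math.log2(1000)))   # 10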

Worked example 2: Analyze a brute-force closest-pair algorithm

Problem. Given $n$ points in the plane, analyze the brute-force algorithm that computes the squared distance between every pair and returns a closest pair.

Method.

  1. The unordered pairs of distinct points correspond to two-element subsets of an $n$-element set.
  2. The number of pairs is
$$\binom{n}{2}=\frac{n(n-1)}{2}.$$
  3. For each pair $(x_i,y_i),(x_j,y_j)$, compute
$$d_{ij}^2=(x_i-x_j)^2+(y_i-y_j)^2.$$
  4. Squared distance is enough because $d\mapsto d^2$ is increasing for nonnegative distances, so the pair with smallest squared distance also has smallest distance.
  5. Each pair costs constant arithmetic operations if coordinate arithmetic is treated as constant time.
  6. Keeping the smallest distance seen so far costs one comparison per pair after the first.

Checked answer. The algorithm performs $\binom{n}{2}$ distance computations, so the time is $\Theta(n^2)$ under the usual unit-cost model. It uses $O(1)$ extra space beyond the input if it stores only the current best pair and current best squared distance.

Code

def binary_search(xs, target):
    # Return the index of target in the sorted list xs, or None if it is absent.
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2          # midpoint of the remaining candidates
        if xs[mid] == target:
            return mid
        if xs[mid] < target:
            lo = mid + 1              # discard the left half
        else:
            hi = mid - 1              # discard the right half
    return None

def closest_pair(points):
    # Brute force: check every unordered pair and keep the best pair seen so far.
    best = None
    best_dist = None
    for i in range(len(points)):
        x1, y1 = points[i]
        for j in range(i + 1, len(points)):
            x2, y2 = points[j]
            dist = (x1 - x2) ** 2 + (y1 - y2) ** 2   # squared distance suffices
            if best_dist is None or dist < best_dist:
                best_dist = dist
                best = (points[i], points[j])
    return best, best_dist

print(binary_search([2, 5, 8, 13, 21, 34], 21))
print(closest_pair([(0, 0), (3, 4), (1, 1), (8, 8)]))

The binary-search loop invariant is that if the target is present, it is always within xs[lo:hi+1]. The closest-pair loop invariant is that after each completed pair check, best stores a pair with minimum squared distance among all pairs checked so far.

Common pitfalls

  • Measuring integer algorithms by the value of the integer when the input size is the number of bits.
  • Ignoring loop bounds that depend on the outer loop index.
  • Dropping the dominant term by mistake while simplifying, when only lower-order terms may be dropped.
  • Calling $O(g(n))$ an exact runtime. Big-O is an upper-bound class, not equality of operation counts.
  • Forgetting to prove correctness. A fast procedure is not an algorithmic solution until it is shown to return the right answer.
  • Assuming average case without specifying an input distribution.

Correctness proofs usually have two parts: partial correctness and termination. Partial correctness says that if the algorithm terminates, its output is correct. Termination says it actually stops on every valid input. For loops, a variant such as a shrinking interval or decreasing counter often proves termination. For recursion, the input size or another well-founded measure should decrease.
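
As a small illustration, here is the binary search from the Code section instrumented with such a variant; the assertion is added here for exposition and is not part of the original code.

def binary_search_with_variant(xs, target):
    # Termination variant: the interval width hi - lo strictly decreases
    # on every iteration that does not return, so the loop must stop.
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        width = hi - lo
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
        assert hi - lo < width   # the variant decreased
    return None

print(binary_search_with_variant([1, 3, 5, 7, 9], 8))   # None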

Big-O notation suppresses constants and lower-order terms, but it does not justify ignoring the input model. Sorting $n$ integers and multiplying two $n$-bit integers both involve a parameter called $n$, but the primitive operations being counted may differ. State whether the analysis counts comparisons, arithmetic operations, bit operations, memory cells, or graph edge scans.

Worst-case analysis is deliberately pessimistic. It gives a guarantee for every input of size $n$. Average-case analysis can be more predictive, but only after specifying a distribution over inputs. Amortized analysis is different again: it bounds the average cost per operation over a worst-case sequence of operations, without assuming random inputs.

Lower bounds are as important as upper bounds. Showing an algorithm runs in $O(n\log n)$ does not prove no faster algorithm exists. A lower bound such as $\Omega(n\log n)$ for comparison sorting explains why merge sort is asymptotically optimal in that model. Always ask whether a bound describes an algorithm or the inherent difficulty of the problem.

When comparing algorithms, keep the problem fixed. Linear search and binary search both solve search, but binary search assumes sorted input. Sorting first and then using binary search changes the task cost if only one lookup is needed. Complexity comparisons are meaningful only when input assumptions and required outputs match.

For loops, derive counts from bounds rather than from visual nesting alone. Two nested loops are not automatically $\Theta(n^2)$: the inner loop might halve a variable, stop early, or depend on sparse graph degree. Summation notation is the bridge between pseudocode and asymptotic classification.
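
A minimal sketch of that caveat, using a hypothetical nested loop whose inner variable halves; the total work is $\Theta(n\log n)$, not $\Theta(n^2)$.

import math

def count_inner_steps(n):
    # The outer loop runs n times; the inner loop halves j, so it runs about log2(n) times.
    count = 0
    for i in range(n):
        j = n
        while j > 1:
            count += 1
            j //= 2
    return count

n = 1024
print(count_inner_steps(n), n * int(math.log2(n)))   # both 10240 for this power of two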

Connections