Graph Algorithms
Graphs model relationships: roads between cities, prerequisites between courses, calls between functions, dependencies between packages, links between web pages, and states connected by transitions. Graph algorithms are therefore a central toolkit for computer science. They combine data structures, proof invariants, and traversal order into procedures for reachability, connectivity, ordering, spanning trees, and shortest paths [1], [3].
This page gives a course-scale map rather than a single algorithm. Breadth-first search and depth-first search provide the basic traversal grammar. Topological sorting, connected components, strongly connected components, bridges, articulation points, MSTs, and shortest paths layer additional invariants on top. The important distinction is what the edge weights and graph direction allow: unweighted graphs favor BFS, nonnegative weighted graphs allow Dijkstra, negative edges require Bellman-Ford or reweighting, and arbitrary all-pairs shortest paths use Floyd-Warshall or Johnson's algorithm [1], [6].

Figure: Dijkstra's algorithm repeatedly finalizes the unsettled vertex with minimum tentative distance. Image: Wikimedia Commons, public domain or CC-BY-SA via Wikimedia Commons.

Figure: BFS expands a graph layer by layer from the source. Image: Wikimedia Commons, public domain or CC-BY-SA via Wikimedia Commons.
Figure: Kruskal's algorithm grows a minimum spanning forest by safe light edges. Image: Wikimedia Commons, public domain or CC-BY-SA via Wikimedia Commons.

Figure: Tarjan's algorithm detects strongly connected components using DFS low-link values. Image: Wikimedia Commons, public domain or CC-BY-SA via Wikimedia Commons.
Definitions
A graph is , where is a set of vertices and is a set of edges. In an undirected graph, an edge has no direction. In a directed graph, an edge goes from to . A weighted graph has a function assigning edge costs, lengths, or capacities.
The most common representations are adjacency lists, adjacency matrices, and edge lists. An adjacency list stores, for each vertex, its outgoing neighbors and optional weights; it uses space and is ideal for sparse graphs. An adjacency matrix stores whether every pair is adjacent; it uses space and supports constant-time edge existence tests. An edge list is a flat list of edges; it is natural for Kruskal's algorithm and Bellman-Ford.
A path is a sequence of vertices connected by edges. A graph is connected if every vertex can reach every other vertex in the undirected sense. A directed graph is strongly connected if every vertex can reach every other vertex following edge direction. A DAG is a directed acyclic graph. A topological order of a DAG lists every vertex before its outgoing successors.
During DFS in a directed graph, edges can be classified as tree, back, forward, or cross edges. A back edge to an ancestor indicates a directed cycle. Discovery time, finish time, and low-link values are timestamps that encode ancestor reachability.
Key results
BFS explores in increasing number of edges from a source. With a queue, it runs in time on adjacency lists and computes shortest-path distances in unweighted graphs. The proof is by layer invariant: when a vertex is first discovered, all shorter paths would have had to pass through an earlier layer already processed [1].
DFS explores one branch until it cannot continue, then backtracks. It also runs in and produces discovery and finish times. In undirected graphs, DFS identifies connected components. In directed graphs, reverse finish-time order is central to Kosaraju's SCC algorithm, while Tarjan's algorithm computes SCCs in one pass using a stack and low-link values [10].
Topological sorting has two standard implementations. Kahn's algorithm repeatedly removes vertices of indegree zero. DFS topological sort appends vertices after exploring all outgoing edges, then reverses the finish list. Both are correct because a DAG always has at least one source, and DFS finish times place every vertex before those reachable from it.
Bridges and articulation points in undirected graphs use low-link values. For a DFS tree edge , if low[v] > disc[u], then no back edge from 's subtree reaches or an ancestor, so is a bridge. A non-root vertex is an articulation point if some child has low[v] >= disc[u]; removing separates that child subtree.
Minimum spanning tree algorithms assume an undirected connected graph with edge weights. Kruskal sorts edges and adds a light edge when it connects two different components. Prim grows a single tree using the lightest edge crossing the current cut. Boruvka repeatedly adds the cheapest outgoing edge of every component, merging many components per phase. The cut property proves all three: the lightest edge crossing any cut is safe for some MST [8], [9].
Shortest-path algorithms depend on weight assumptions. Dijkstra's algorithm works with nonnegative weights and uses relaxation:
With a binary heap it runs in ; with Fibonacci heaps, the theoretical bound is , though constants often make binary heaps preferable [7]. Bellman-Ford relaxes every edge times, handles negative edges, and detects a negative cycle if any edge can still relax afterward [11]. Floyd-Warshall computes all-pairs shortest paths by allowing intermediate vertices one by one:
Johnson's algorithm adds a new source, runs Bellman-Ford to compute potentials, reweights edges to nonnegative values, then runs Dijkstra from every vertex [13].
search augments Dijkstra with a heuristic estimating distance to the goal. If is admissible and consistent, remains optimal while often expanding far fewer states. Eulerian tours, which use every edge exactly once, have clean degree characterizations and linear algorithms. Hamiltonian tours, which visit every vertex exactly once, are NP-complete in general, so graph algorithms also mark the boundary between efficient and intractable problems.
Visual
| Algorithm | Graph type | Main structure | Time with adjacency lists |
|---|---|---|---|
| BFS | unweighted directed or undirected | queue | |
| DFS | directed or undirected | recursion stack | |
| Kahn topological sort | DAG | indegree queue | |
| Tarjan SCC | directed | DFS stack and low-link | |
| Kruskal MST | undirected weighted | sorting plus union-find | |
| Prim MST | undirected weighted | heap | |
| Dijkstra | nonnegative weighted | priority queue | |
| Bellman-Ford | weighted with negative edges | edge relaxations | |
| Floyd-Warshall | all-pairs | matrix DP |
Worked example 1: Dijkstra on a five-node graph
Problem. Find shortest paths from with edges:
All weights are nonnegative.
Method.
- Initialize and all others to . Heap contains .
- Extract . Relax so . Relax so .
- Extract with distance 1. Relax : , so . Relax : , so .
- Extract with distance 3. Relax : , so . Relax : , so .
- Extract with distance 4. Relax : , so .
- Extract with distance 7. It has no outgoing improvements.
Checked answer. Distances are , , , , . The shortest path to is with length .
Worked example 2: Bellman-Ford detecting a negative cycle
Problem. Consider edges:
Does a reachable negative cycle exist?
Method. There are four vertices, so Bellman-Ford performs full relaxation passes, then checks one more pass.
- Initialize , others .
- Pass 1: , , , then gives .
- Pass 2: gives ; gives ; gives .
- Pass 3: , , .
- Extra check: can still improve from to .
Checked answer. The cycle has total weight , so distances are not well-defined below. Bellman-Ford correctly reports a reachable negative cycle.
Code
from heapq import heappop, heappush
def dijkstra_heap(graph, source):
dist = {source: 0}
parent = {source: None}
heap = [(0, source)]
while heap:
d, u = heappop(heap)
if d != dist.get(u, float("inf")):
continue
for v, weight in graph.get(u, []):
nd = d + weight
if nd < dist.get(v, float("inf")):
dist[v] = nd
parent[v] = u
heappush(heap, (nd, v))
return dist, parent
def bellman_ford(vertices, edges, source):
dist = {v: float("inf") for v in vertices}
dist[source] = 0
for _ in range(len(vertices) - 1):
changed = False
for u, v, w in edges:
if dist[u] + w < dist[v]:
dist[v] = dist[u] + w
changed = True
if not changed:
break
for u, v, w in edges:
if dist[u] + w < dist[v]:
raise ValueError("negative cycle reachable from source")
return dist
def kruskal_mst(vertices, edges):
parent = {v: v for v in vertices}
rank = {v: 0 for v in vertices}
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(a, b):
ra, rb = find(a), find(b)
if ra == rb:
return False
if rank[ra] < rank[rb]:
ra, rb = rb, ra
parent[rb] = ra
if rank[ra] == rank[rb]:
rank[ra] += 1
return True
mst = []
for w, u, v in sorted(edges):
if union(u, v):
mst.append((u, v, w))
return mst
def scc_tarjan(graph):
index = 0
stack, on_stack = [], set()
idx, low = {}, {}
comps = []
def dfs(u):
nonlocal index
idx[u] = low[u] = index
index += 1
stack.append(u)
on_stack.add(u)
for v in graph.get(u, []):
if v not in idx:
dfs(v)
low[u] = min(low[u], low[v])
elif v in on_stack:
low[u] = min(low[u], idx[v])
if low[u] == idx[u]:
comp = []
while True:
v = stack.pop()
on_stack.remove(v)
comp.append(v)
if v == u:
break
comps.append(comp)
for u in graph:
if u not in idx:
dfs(u)
return comps
Common pitfalls
- Using an adjacency matrix for a sparse graph with millions of vertices.
- Marking BFS vertices only when popped from the queue, causing duplicate enqueues.
- Treating DFS finish order as a topological order without reversing it.
- Running Dijkstra on a graph with negative edges.
- Forgetting to ignore stale priority-queue entries in Dijkstra implementations without decrease-key.
- Stopping Bellman-Ford after passes without the extra negative-cycle check.
- Confusing strongly connected components with weakly connected components.
- Computing bridge low-link values with directed-graph logic.
- Implementing Kruskal without path compression and union by rank on large inputs.
- Assuming MST is unique when edge weights tie.
- Using Floyd-Warshall when is too large for time and memory.
- Treating Hamiltonian cycle like Eulerian cycle; their difficulty is very different.
- Using an inadmissible heuristic while still claiming optimality.
Connections
- Greedy Algorithms for MST cut properties and Dijkstra's stay-ahead proof.
- Dynamic Programming for Floyd-Warshall and shortest paths on DAGs.
- Network Flow and Matching for augmenting paths and residual graphs.
- Backtracking and Branch & Bound for Hamiltonian paths, graph coloring, and exact search.
- Data Structures for heaps, queues, stacks, and union-find.
- Theory of Computation for NP-completeness of Hamiltonian cycle.
- Discrete Math for graph terminology and proof techniques.
References
[1] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 4th ed. MIT Press, 2022.
[2] R. Sedgewick and K. Wayne, Algorithms, 4th ed. Addison-Wesley, 2011.
[3] J. Kleinberg and E. Tardos, Algorithm Design. Pearson, 2005.
[4] S. S. Skiena, The Algorithm Design Manual, 3rd ed. Springer, 2020.
[5] K. Mehlhorn and P. Sanders, Algorithms and Data Structures: The Basic Toolbox. Springer, 2008.
[6] D. B. West, Introduction to Graph Theory, 2nd ed. Prentice Hall, 2001.
[7] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numerische Mathematik, vol. 1, pp. 269-271, 1959. https://doi.org/10.1007/BF01386390
[8] J. B. Kruskal, "On the shortest spanning subtree of a graph and the traveling salesman problem," Proceedings of the American Mathematical Society, vol. 7, no. 1, pp. 48-50, 1956.
[9] R. C. Prim, "Shortest connection networks and some generalizations," Bell System Technical Journal, vol. 36, no. 6, pp. 1389-1401, 1957.
[10] R. E. Tarjan, "Depth-first search and linear graph algorithms," SIAM Journal on Computing, vol. 1, no. 2, pp. 146-160, 1972.
[11] R. Bellman, "On a routing problem," Quarterly of Applied Mathematics, vol. 16, no. 1, pp. 87-90, 1958.
[12] R. W. Floyd, "Algorithm 97: Shortest path," Communications of the ACM, vol. 5, no. 6, p. 345, 1962. https://doi.org/10.1145/367766.368168
[13] D. B. Johnson, "Efficient algorithms for shortest paths in sparse networks," Journal of the ACM, vol. 24, no. 1, pp. 1-13, 1977. https://doi.org/10.1145/321992.321993
[14] M. L. Fredman and R. E. Tarjan, "Fibonacci heaps and their uses in improved network optimization algorithms," Journal of the ACM, vol. 34, no. 3, pp. 596-615, 1987.