Skip to main content

Graph Algorithms

Graphs model relationships: roads between cities, prerequisites between courses, calls between functions, dependencies between packages, links between web pages, and states connected by transitions. Graph algorithms are therefore a central toolkit for computer science. They combine data structures, proof invariants, and traversal order into procedures for reachability, connectivity, ordering, spanning trees, and shortest paths [1], [3].

This page gives a course-scale map rather than a single algorithm. Breadth-first search and depth-first search provide the basic traversal grammar. Topological sorting, connected components, strongly connected components, bridges, articulation points, MSTs, and shortest paths layer additional invariants on top. The important distinction is what the edge weights and graph direction allow: unweighted graphs favor BFS, nonnegative weighted graphs allow Dijkstra, negative edges require Bellman-Ford or reweighting, and arbitrary all-pairs shortest paths use Floyd-Warshall or Johnson's algorithm [1], [6].

Dijkstra animation showing tentative distances.

Figure: Dijkstra's algorithm repeatedly finalizes the unsettled vertex with minimum tentative distance. Image: Wikimedia Commons, public domain or CC-BY-SA via Wikimedia Commons.

Breadth-first search animation.

Figure: BFS expands a graph layer by layer from the source. Image: Wikimedia Commons, public domain or CC-BY-SA via Wikimedia Commons.

Kruskal MST example.

Figure: Kruskal's algorithm grows a minimum spanning forest by safe light edges. Image: Wikimedia Commons, public domain or CC-BY-SA via Wikimedia Commons.

Tarjan SCC animation.

Figure: Tarjan's algorithm detects strongly connected components using DFS low-link values. Image: Wikimedia Commons, public domain or CC-BY-SA via Wikimedia Commons.

Definitions

A graph is G=(V,E)G=(V,E), where VV is a set of vertices and EE is a set of edges. In an undirected graph, an edge {u,v}\{u,v\} has no direction. In a directed graph, an edge (u,v)(u,v) goes from uu to vv. A weighted graph has a function w:ERw:E\to\mathbb{R} assigning edge costs, lengths, or capacities.

The most common representations are adjacency lists, adjacency matrices, and edge lists. An adjacency list stores, for each vertex, its outgoing neighbors and optional weights; it uses O(V+E)O(V+E) space and is ideal for sparse graphs. An adjacency matrix stores whether every pair is adjacent; it uses O(V2)O(V^2) space and supports constant-time edge existence tests. An edge list is a flat list of edges; it is natural for Kruskal's algorithm and Bellman-Ford.

A path is a sequence of vertices connected by edges. A graph is connected if every vertex can reach every other vertex in the undirected sense. A directed graph is strongly connected if every vertex can reach every other vertex following edge direction. A DAG is a directed acyclic graph. A topological order of a DAG lists every vertex before its outgoing successors.

During DFS in a directed graph, edges can be classified as tree, back, forward, or cross edges. A back edge to an ancestor indicates a directed cycle. Discovery time, finish time, and low-link values are timestamps that encode ancestor reachability.

Key results

BFS explores in increasing number of edges from a source. With a queue, it runs in O(V+E)O(V+E) time on adjacency lists and computes shortest-path distances in unweighted graphs. The proof is by layer invariant: when a vertex is first discovered, all shorter paths would have had to pass through an earlier layer already processed [1].

DFS explores one branch until it cannot continue, then backtracks. It also runs in O(V+E)O(V+E) and produces discovery and finish times. In undirected graphs, DFS identifies connected components. In directed graphs, reverse finish-time order is central to Kosaraju's SCC algorithm, while Tarjan's algorithm computes SCCs in one pass using a stack and low-link values [10].

Topological sorting has two standard implementations. Kahn's algorithm repeatedly removes vertices of indegree zero. DFS topological sort appends vertices after exploring all outgoing edges, then reverses the finish list. Both are correct because a DAG always has at least one source, and DFS finish times place every vertex before those reachable from it.

Bridges and articulation points in undirected graphs use low-link values. For a DFS tree edge (u,v)(u,v), if low[v] > disc[u], then no back edge from vv's subtree reaches uu or an ancestor, so (u,v)(u,v) is a bridge. A non-root vertex uu is an articulation point if some child vv has low[v] >= disc[u]; removing uu separates that child subtree.

Minimum spanning tree algorithms assume an undirected connected graph with edge weights. Kruskal sorts edges and adds a light edge when it connects two different components. Prim grows a single tree using the lightest edge crossing the current cut. Boruvka repeatedly adds the cheapest outgoing edge of every component, merging many components per phase. The cut property proves all three: the lightest edge crossing any cut is safe for some MST [8], [9].

Shortest-path algorithms depend on weight assumptions. Dijkstra's algorithm works with nonnegative weights and uses relaxation:

if dist[v]>dist[u]+w(u,v),dist[v]dist[u]+w(u,v).\text{if } dist[v] > dist[u]+w(u,v),\quad dist[v]\leftarrow dist[u]+w(u,v).

With a binary heap it runs in O((V+E)logV)O((V+E)\log V); with Fibonacci heaps, the theoretical bound is O(E+VlogV)O(E+V\log V), though constants often make binary heaps preferable [7]. Bellman-Ford relaxes every edge V1V-1 times, handles negative edges, and detects a negative cycle if any edge can still relax afterward [11]. Floyd-Warshall computes all-pairs shortest paths by allowing intermediate vertices one by one:

D(k)[i][j]=min(D(k1)[i][j],D(k1)[i][k]+D(k1)[k][j]).D^{(k)}[i][j]=\min(D^{(k-1)}[i][j],D^{(k-1)}[i][k]+D^{(k-1)}[k][j]).

Johnson's algorithm adds a new source, runs Bellman-Ford to compute potentials, reweights edges to nonnegative values, then runs Dijkstra from every vertex [13].

AA^* search augments Dijkstra with a heuristic h(v)h(v) estimating distance to the goal. If hh is admissible and consistent, AA^* remains optimal while often expanding far fewer states. Eulerian tours, which use every edge exactly once, have clean degree characterizations and linear algorithms. Hamiltonian tours, which visit every vertex exactly once, are NP-complete in general, so graph algorithms also mark the boundary between efficient and intractable problems.

Visual

AlgorithmGraph typeMain structureTime with adjacency lists
BFSunweighted directed or undirectedqueueO(V+E)O(V+E)
DFSdirected or undirectedrecursion stackO(V+E)O(V+E)
Kahn topological sortDAGindegree queueO(V+E)O(V+E)
Tarjan SCCdirectedDFS stack and low-linkO(V+E)O(V+E)
Kruskal MSTundirected weightedsorting plus union-findO(ElogE)O(E\log E)
Prim MSTundirected weightedheapO(ElogV)O(E\log V)
Dijkstranonnegative weightedpriority queueO((V+E)logV)O((V+E)\log V)
Bellman-Fordweighted with negative edgesedge relaxationsO(VE)O(VE)
Floyd-Warshallall-pairsmatrix DPO(V3)O(V^3)

Worked example 1: Dijkstra on a five-node graph

Problem. Find shortest paths from AA with edges:

AB:4, AC:1, CB:2, BD:1, CD:5, DE:3, BE:7.A\to B:4,\ A\to C:1,\ C\to B:2,\ B\to D:1,\ C\to D:5,\ D\to E:3,\ B\to E:7.

All weights are nonnegative.

Method.

  1. Initialize dist[A]=0dist[A]=0 and all others to \infty. Heap contains (0,A)(0,A).
  2. Extract AA. Relax ABA\to B so dist[B]=4dist[B]=4. Relax ACA\to C so dist[C]=1dist[C]=1.
  3. Extract CC with distance 1. Relax CBC\to B: 1+2=3<41+2=3\lt 4, so dist[B]=3dist[B]=3. Relax CDC\to D: 1+5=61+5=6, so dist[D]=6dist[D]=6.
  4. Extract BB with distance 3. Relax BDB\to D: 3+1=4<63+1=4\lt 6, so dist[D]=4dist[D]=4. Relax BEB\to E: 3+7=103+7=10, so dist[E]=10dist[E]=10.
  5. Extract DD with distance 4. Relax DED\to E: 4+3=7<104+3=7\lt 10, so dist[E]=7dist[E]=7.
  6. Extract EE with distance 7. It has no outgoing improvements.

Checked answer. Distances are A:0A:0, C:1C:1, B:3B:3, D:4D:4, E:7E:7. The shortest path to EE is ACBDEA\to C\to B\to D\to E with length 1+2+1+3=71+2+1+3=7.

Worked example 2: Bellman-Ford detecting a negative cycle

Problem. Consider edges:

SA:1, AB:1, BC:3, CA:1.S\to A:1,\ A\to B:1,\ B\to C:-3,\ C\to A:1.

Does a reachable negative cycle exist?

Method. There are four vertices, so Bellman-Ford performs V1=3V-1=3 full relaxation passes, then checks one more pass.

  1. Initialize dist[S]=0dist[S]=0, others \infty.
  2. Pass 1: A=1A=1, B=2B=2, C=1C=-1, then CAC\to A gives A=0A=0.
  3. Pass 2: ABA\to B gives B=1B=1; BCB\to C gives C=2C=-2; CAC\to A gives A=1A=-1.
  4. Pass 3: B=0B=0, C=3C=-3, A=2A=-2.
  5. Extra check: ABA\to B can still improve BB from 00 to 1-1.

Checked answer. The cycle ABCAA\to B\to C\to A has total weight 13+1=11-3+1=-1, so distances are not well-defined below. Bellman-Ford correctly reports a reachable negative cycle.

Code

from heapq import heappop, heappush

def dijkstra_heap(graph, source):
dist = {source: 0}
parent = {source: None}
heap = [(0, source)]
while heap:
d, u = heappop(heap)
if d != dist.get(u, float("inf")):
continue
for v, weight in graph.get(u, []):
nd = d + weight
if nd < dist.get(v, float("inf")):
dist[v] = nd
parent[v] = u
heappush(heap, (nd, v))
return dist, parent

def bellman_ford(vertices, edges, source):
dist = {v: float("inf") for v in vertices}
dist[source] = 0
for _ in range(len(vertices) - 1):
changed = False
for u, v, w in edges:
if dist[u] + w < dist[v]:
dist[v] = dist[u] + w
changed = True
if not changed:
break
for u, v, w in edges:
if dist[u] + w < dist[v]:
raise ValueError("negative cycle reachable from source")
return dist

def kruskal_mst(vertices, edges):
parent = {v: v for v in vertices}
rank = {v: 0 for v in vertices}

def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x

def union(a, b):
ra, rb = find(a), find(b)
if ra == rb:
return False
if rank[ra] < rank[rb]:
ra, rb = rb, ra
parent[rb] = ra
if rank[ra] == rank[rb]:
rank[ra] += 1
return True

mst = []
for w, u, v in sorted(edges):
if union(u, v):
mst.append((u, v, w))
return mst

def scc_tarjan(graph):
index = 0
stack, on_stack = [], set()
idx, low = {}, {}
comps = []

def dfs(u):
nonlocal index
idx[u] = low[u] = index
index += 1
stack.append(u)
on_stack.add(u)
for v in graph.get(u, []):
if v not in idx:
dfs(v)
low[u] = min(low[u], low[v])
elif v in on_stack:
low[u] = min(low[u], idx[v])
if low[u] == idx[u]:
comp = []
while True:
v = stack.pop()
on_stack.remove(v)
comp.append(v)
if v == u:
break
comps.append(comp)

for u in graph:
if u not in idx:
dfs(u)
return comps

Common pitfalls

  • Using an adjacency matrix for a sparse graph with millions of vertices.
  • Marking BFS vertices only when popped from the queue, causing duplicate enqueues.
  • Treating DFS finish order as a topological order without reversing it.
  • Running Dijkstra on a graph with negative edges.
  • Forgetting to ignore stale priority-queue entries in Dijkstra implementations without decrease-key.
  • Stopping Bellman-Ford after V1V-1 passes without the extra negative-cycle check.
  • Confusing strongly connected components with weakly connected components.
  • Computing bridge low-link values with directed-graph logic.
  • Implementing Kruskal without path compression and union by rank on large inputs.
  • Assuming MST is unique when edge weights tie.
  • Using Floyd-Warshall when VV is too large for O(V3)O(V^3) time and O(V2)O(V^2) memory.
  • Treating Hamiltonian cycle like Eulerian cycle; their difficulty is very different.
  • Using an inadmissible AA^* heuristic while still claiming optimality.

Connections

References

[1] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 4th ed. MIT Press, 2022.

[2] R. Sedgewick and K. Wayne, Algorithms, 4th ed. Addison-Wesley, 2011.

[3] J. Kleinberg and E. Tardos, Algorithm Design. Pearson, 2005.

[4] S. S. Skiena, The Algorithm Design Manual, 3rd ed. Springer, 2020.

[5] K. Mehlhorn and P. Sanders, Algorithms and Data Structures: The Basic Toolbox. Springer, 2008.

[6] D. B. West, Introduction to Graph Theory, 2nd ed. Prentice Hall, 2001.

[7] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numerische Mathematik, vol. 1, pp. 269-271, 1959. https://doi.org/10.1007/BF01386390

[8] J. B. Kruskal, "On the shortest spanning subtree of a graph and the traveling salesman problem," Proceedings of the American Mathematical Society, vol. 7, no. 1, pp. 48-50, 1956.

[9] R. C. Prim, "Shortest connection networks and some generalizations," Bell System Technical Journal, vol. 36, no. 6, pp. 1389-1401, 1957.

[10] R. E. Tarjan, "Depth-first search and linear graph algorithms," SIAM Journal on Computing, vol. 1, no. 2, pp. 146-160, 1972.

[11] R. Bellman, "On a routing problem," Quarterly of Applied Mathematics, vol. 16, no. 1, pp. 87-90, 1958.

[12] R. W. Floyd, "Algorithm 97: Shortest path," Communications of the ACM, vol. 5, no. 6, p. 345, 1962. https://doi.org/10.1145/367766.368168

[13] D. B. Johnson, "Efficient algorithms for shortest paths in sparse networks," Journal of the ACM, vol. 24, no. 1, pp. 1-13, 1977. https://doi.org/10.1145/321992.321993

[14] M. L. Fredman and R. E. Tarjan, "Fibonacci heaps and their uses in improved network optimization algorithms," Journal of the ACM, vol. 34, no. 3, pp. 596-615, 1987.