Theory of Computation

Theory of computation studies computation before it becomes a particular programming language, operating system, or machine architecture. It asks which problems can be solved at all, which ones can be solved by limited devices such as finite automata or pushdown automata, and which solvable problems demand unreasonable resources. The subject is abstract, but its abstractions are practical: scanners in compilers behave like finite automata, parsers are organized around context-free grammars, termination questions explain why fully automatic program verification is limited, and complexity theory explains why many optimization problems need reductions, approximations, heuristics, or randomized methods.

These notes follow the scope of Michael Sipser's Introduction to the Theory of Computation, third edition, without reproducing its text. They use the textbook as a map of topics: mathematical preliminaries, automata and languages, computability, decidability, reducibility, time complexity, space complexity, and selected advanced topics. Each page is written as a study note with definitions, key results, visual anchors, worked examples, code, pitfalls, and links to neighboring pages.

Definitions

A model of computation is a mathematical object that describes what counts as a computation. In this course the main models are deterministic finite automata, nondeterministic finite automata, regular expressions, context-free grammars, pushdown automata, Turing machines, and variants of Turing machines. A model is useful when it is precise enough for proof and simple enough to reveal structure.
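
To see what "precise enough for proof" means, here is a minimal sketch of a DFA written out as explicit data with a few-line simulator; the helper name run_dfa and the even-number-of-1s example are illustrative choices, not from the textbook.

def run_dfa(delta, start, accept, w):
    """Simulate a DFA given by a transition table delta[(state, symbol)]."""
    q = start
    for symbol in w:
        q = delta[(q, symbol)]  # exactly one move per (state, symbol) pair
    return q in accept

# A two-state DFA accepting binary strings with an even number of 1s.
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}
print(run_dfa(delta, "even", {"even"}, "1001"))  # True: two 1s
print(run_dfa(delta, "even", {"even"}, "10"))    # False: one 1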

A language is a set of strings over an alphabet. Many questions in the course are language-recognition questions: given an input string, should a machine accept it or reject it? The same view can encode graphs, formulas, programs, arithmetic expressions, and scheduling instances by choosing a reasonable string representation.
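
As a small illustration of encoding (the string format and helper names here are invented for this sketch), a question about graphs becomes a language once a representation is fixed:

def encode_graph(n, edges):
    """Encode an undirected graph as a string: vertex count, then edge list."""
    return f"{n};" + ",".join(f"{u}-{v}" for u, v in edges)

def in_connected(w):
    """Membership test for { <G> : G is connected } under this encoding."""
    head, _, tail = w.partition(";")
    n = int(head)
    if n <= 1:
        return True
    adj = {v: set() for v in range(n)}
    for edge in filter(None, tail.split(",")):
        u, v = map(int, edge.split("-"))
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]          # depth-first search from vertex 0
    while stack:
        for v in adj[stack.pop()] - seen:
            seen.add(v)
            stack.append(v)
    return len(seen) == n

print(in_connected(encode_graph(3, [(0, 1), (1, 2)])))  # True
print(in_connected(encode_graph(3, [(0, 1)])))          # False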

A decision problem is a yes-or-no problem represented as a language. A problem is decidable if some algorithm halts on every input and answers correctly. It is recognizable if some algorithm accepts exactly the yes-instances, possibly running forever on no-instances. These two notions separate total algorithms from semi-decision procedures.
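
The difference is easiest to see with ordinary programs standing in for machines. In this sketch (all names are ours), the recognizer inherits any non-termination of the program it runs:

def recognizer(prog, x):
    """Recognizes { (prog, x) : prog(x) halts and returns True }.
    It accepts exactly the yes-instances, but if prog loops on x,
    this call loops too -- a recognizer, not a decider."""
    return prog(x) is True

def always_halts(x):          # decider-style: halts on every input
    return len(x) % 2 == 0

def loops_forever(x):         # no halting guarantee
    while True:
        pass

print(recognizer(always_halts, "ab"))   # True, and the call halted
# recognizer(loops_forever, "ab") would never return: the no-instance
# is not rejected, it is simply never answered.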

A complexity class is a collection of languages decidable within a resource bound such as polynomial time, polynomial space, logarithmic space, nondeterministic polynomial time, or randomized polynomial time. Complexity theory treats resource bounds as mathematical objects so that hard problems can be compared by reductions.

The page list is:

| Page | Main scope |
| --- | --- |
| mathematical preliminaries | sets, functions, graphs, strings, languages, Boolean logic |
| proof methods and countability | construction, contradiction, induction, diagonalization |
| finite automata and DFAs | deterministic finite automata and regular languages |
| nondeterminism and closure | NFAs, subset construction, regular operations |
| regular expressions and nonregularity | regex, GNFA conversion, pumping lemma |
| context-free grammars and normal forms | CFGs, derivations, parse trees, ambiguity, CNF |
| pushdown automata and deterministic CFLs | PDAs, CFG equivalence, DPDAs |
| non-context-free languages | pumping lemma for CFLs and separation examples |
| Turing machines and the Church-Turing thesis | formal TMs, configurations, algorithms |
| Turing machine variants and decidable problems | multitape, nondeterminism, enumerators, language algorithms |
| decidability and the halting problem | diagonalization, acceptance, halting, recognizability |
| reductions and the recursion theorem | mapping reductions, language-theory undecidability, self-reference |
| time complexity, P, and NP | asymptotic time, P, NP, verifiers |
| NP-completeness and classic reductions | Cook-Levin, SAT, CLIQUE, HAMPATH, SUBSET-SUM |
| space complexity | PSPACE, Savitch, L, NL, NL-completeness |
| advanced complexity topics | hierarchy theorems, oracles, randomized complexity, IP, PCP |

Key results

The first organizing result is that weak models can still express nontrivial computation. DFAs have no memory except a finite state, yet they exactly recognize regular languages. NFAs and regular expressions look different, but they define the same class. The proof of equivalence is constructive: convert a regular expression to an NFA by structural induction, convert an NFA to a DFA by the subset construction, and convert a finite automaton to a regular expression by state elimination.
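
The subset construction itself is short enough to write out. The sketch below assumes an NFA without epsilon moves to stay brief; adding epsilon closures is a routine extension, and the encoding is our own.

from itertools import chain

def subset_construction(nfa_delta, start, accept):
    """Convert an NFA, given as (state, symbol) -> set of states with no
    epsilon moves, into an equivalent DFA whose states are frozensets."""
    symbols = {a for (_, a) in nfa_delta}
    start_set = frozenset([start])
    dfa_delta, seen, todo = {}, {start_set}, [start_set]
    while todo:
        S = todo.pop()
        for a in symbols:
            T = frozenset(chain.from_iterable(
                nfa_delta.get((q, a), ()) for q in S))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    return dfa_delta, start_set, {S for S in seen if S & accept}

def run(dfa_delta, q, accept, w):
    for a in w:
        q = dfa_delta[(q, a)]
    return q in accept

# NFA for "strings over {0,1} ending in 01": guess where the suffix starts.
nfa = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"}, ("q1", "1"): {"q2"}}
dfa, s0, acc = subset_construction(nfa, "q0", {"q2"})
print(len({S for (S, _) in dfa}))   # 3 reachable subset-states
print(run(dfa, s0, acc, "1101"))    # True: ends in 01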

The second organizing result is that adding a stack strictly increases expressive power. Context-free grammars and pushdown automata recognize the same class of languages, the context-free languages. The stack can match nested structure such as parentheses or balanced blocks, which finite automata cannot do. However, a single stack still cannot coordinate several independent unbounded counts, which is why languages such as $\{a^n b^n c^n : n \ge 0\}$ are not context-free.
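
A few lines of code capture the stack discipline of a PDA for $\{0^n 1^n : n \ge 0\}$ (an illustrative sketch, not a formal PDA): push while reading 0s, pop while reading 1s, accept exactly when the input ends with an empty stack. The same single stack has no way to rewind and check a third block, which is the intuition, though not the proof, behind the non-context-freeness above.

def match_0n1n(w):
    """Stack-based check for { 0^n 1^n : n >= 0 }, mimicking a PDA."""
    stack, i = [], 0
    while i < len(w) and w[i] == "0":   # push one marker per leading 0
        stack.append("X")
        i += 1
    while i < len(w) and w[i] == "1":   # pop one marker per following 1
        if not stack:
            return False                # more 1s than 0s
        stack.pop()
        i += 1
    return i == len(w) and not stack    # input consumed, counts matched

print([w for w in ["", "01", "0011", "001", "010"] if match_0n1n(w)])
# ['', '01', '0011']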

The third organizing result is that Turing machines provide a robust mathematical account of algorithms. Many variations of Turing machines have the same decidability power: multitape machines, nondeterministic machines, and enumerators do not decide fundamentally more languages than the basic model. This robustness supports the Church-Turing thesis: any effectively calculable procedure can be carried out by a Turing machine.
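
A minimal simulator makes the basic single-tape model concrete. The encoding below and the particular machine, which decides $\{0^n 1^n : n \ge 0\}$ by crossing off one 0 and one 1 per pass, are our own sketch; note that the simulator halts only if the simulated machine does, which is the recognizer-versus-decider distinction again.

def run_tm(delta, w, start="q0", accept="qa", blank="_"):
    """Simulate a single-tape TM; delta maps (state, symbol) to
    (new state, symbol written, head move). A missing entry means reject.
    If the machine loops, so does this simulator."""
    tape, q, head = dict(enumerate(w)), start, 0
    while q != accept:
        key = (q, tape.get(head, blank))
        if key not in delta:
            return False                  # implicit reject
        q, tape[head], move = delta[key]
        head += 1 if move == "R" else -1
    return True

# Cross off a 0 (as X), walk right, cross off a matching 1 (as Y), walk
# back; finally check that only X's and Y's remain.
delta = {
    ("q0", "0"): ("q1", "X", "R"), ("q0", "Y"): ("q3", "Y", "R"),
    ("q0", "_"): ("qa", "_", "R"),
    ("q1", "0"): ("q1", "0", "R"), ("q1", "Y"): ("q1", "Y", "R"),
    ("q1", "1"): ("q2", "Y", "L"),
    ("q2", "0"): ("q2", "0", "L"), ("q2", "Y"): ("q2", "Y", "L"),
    ("q2", "X"): ("q0", "X", "R"),
    ("q3", "Y"): ("q3", "Y", "R"), ("q3", "_"): ("qa", "_", "R"),
}
print([w for w in ["", "01", "0011", "001", "10"] if run_tm(delta, w)])
# ['', '01', '0011']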

The fourth organizing result is negative: some problems are not algorithmically solvable. The halting problem and the acceptance problem for Turing machines are undecidable. Reductions spread these impossibility results to grammar ambiguity, language equivalence questions, and other program-analysis tasks. In complexity theory the same reduction idea becomes a comparison tool for feasible computation, leading to NP-completeness and PSPACE-completeness.

One way to read the whole sequence is as a series of controlled increases in memory. A DFA has only finite state, so its languages are exactly those whose relevant past can be compressed into one of finitely many summaries. A PDA adds a stack, so it can manage nested obligations but still sees only the top of that stack. A Turing machine adds a rewritable tape, so it can implement general algorithms. Complexity theory then keeps the Turing-machine level of generality but asks how much time, space, nondeterminism, randomness, or interaction is required.

Another way to read the sequence is as a series of equivalence theorems followed by separation theorems. DFA, NFA, and regular expressions are equivalent; CFGs and PDAs are equivalent; many Turing-machine variants are equivalent. These equivalences make the models trustworthy because a language family is not an artifact of one notation. Separations then explain the limits: regular languages are strictly weaker than context-free languages, context-free languages are strictly weaker than decidable languages, recognizable languages are not all decidable, and polynomial-time computation is not known to capture all efficiently verifiable search.

The most reusable skill is reduction design. In automata, reductions are often disguised as closure constructions or encodings. In computability, they transfer undecidability from $A_{TM}$ or $HALT_{TM}$ to a new semantic question. In complexity, they transfer NP-hardness or PSPACE-hardness while preserving polynomial or log-space resource bounds. Each reduction has the same core checklist: define the input transformation, prove it is computable within the allowed resource, prove the yes direction, and prove the no direction.

Visual

| Theme | Central question | Typical tool |
| --- | --- | --- |
| Automata | What can restricted machines recognize? | constructions and closure proofs |
| Computability | What can any algorithm decide? | diagonalization and reductions |
| Complexity | What can be decided with limited resources? | asymptotic analysis and completeness |

Worked example 1: Classifying a language by required memory

Problem. Decide which model naturally recognizes each language: $L_1 = \{w \in \{0,1\}^* : w \text{ ends in } 01\}$, $L_2 = \{0^n 1^n : n \ge 0\}$, and $L_3 = \{\langle M, w \rangle : M \text{ accepts } w\}$.

Method. Look for the amount and shape of memory required.

  1. For $L_1$, the machine only needs to remember the last two input symbols. There are finitely many possibilities: no symbols yet, last symbol 0, last symbol 1, last two symbols 00, 01, 10, 11. A DFA can keep this information in its state; a code sketch after the checked answer makes this concrete.
  2. For $L_2$, the machine must compare an arbitrary number of leading zeros with an arbitrary number of following ones. A finite state cannot store the count, but a stack can push one marker per 0 and pop one marker per 1. A PDA or CFG is natural.
  3. For $L_3$, the input encodes a machine and another input. Recognizing it requires simulating the encoded machine. A Turing machine can do that, but no decider can always halt because the acceptance problem for Turing machines is undecidable.

Checked answer. $L_1$ is regular, $L_2$ is context-free but not regular, and $L_3$ is Turing-recognizable but not decidable.
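
Item 1 can be made concrete in a few lines. Three states turn out to suffice, because all that matters is how much of the suffix 01 the machine has just seen; the function and state names are our own.

def ends_in_01(w):
    """DFA for L1 with three states tracking progress toward suffix '01'."""
    q = "start"                    # no useful progress yet
    for c in w:
        if c == "0":
            q = "saw0"             # a fresh 0 may begin the suffix
        else:                      # c == "1"
            q = "saw01" if q == "saw0" else "start"
    return q == "saw01"

print([w for w in ["01", "1101", "011", "10"] if ends_in_01(w)])
# ['01', '1101']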

Worked example 2: Reading the course through reductions

Problem. Explain how a single undecidability result can imply that another language is undecidable.

Method. Use the reduction pattern. Suppose $A$ is known undecidable and we want to prove $B$ undecidable.

  1. Assume for contradiction that $B$ has a decider $D_B$.
  2. Given an input $x$ for $A$, compute an instance $f(x)$ for $B$.
  3. Arrange the construction so that $x \in A$ exactly when $f(x) \in B$.
  4. Decide $A$ by running $D_B$ on $f(x)$ and copying its answer.
  5. This contradicts the known undecidability of $A$.

Checked answer. The proof succeeds when $f$ is computable and preserves membership both ways. In notation, $A \le_m B$. If $A$ is undecidable and $A \le_m B$, then $B$ is undecidable.
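
The template translates directly into code. In this sketch, decide_B and f are hypothetical placeholders for the assumed decider and the computable map; composing them is steps 2 through 4, and the impossibility of the result is step 5.

def decider_for_A_from(f, decide_B):
    """Given A <=_m B via computable f and a hypothetical decider for B,
    build a decider for A by composition."""
    def decide_A(x):
        return decide_B(f(x))  # x in A  iff  f(x) in B
    return decide_A

# If A is known undecidable, no actual decide_B can exist; assuming one
# and obtaining this decider for A is the contradiction.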

Code

def classify_by_memory(description):
    """Heuristic study aid: guess which model fits an informal description."""
    cues = description.lower()
    # bounded look-back or counting modulo a constant: finite memory suffices
    if "last" in cues or "end" in cues or "substring" in cues or "mod" in cues:
        return "try a finite automaton"
    # matching or nesting of unbounded depth: one stack is the natural memory
    if "balanced" in cues or "nested" in cues or "same number" in cues:
        return "try a context-free grammar or pushdown automaton"
    # questions about program behavior: expect full Turing-machine power
    if "program" in cues or "machine" in cues or "halts" in cues:
        return "expect a Turing-machine or undecidability argument"
    return "identify the encoding, then choose a model"

for text in ["strings ending with 01", "balanced parentheses", "program halts"]:
    print(text, "->", classify_by_memory(text))

Common pitfalls

  • Treating a model as a programming convenience rather than a mathematical definition. A DFA is not just a simple program; it has no unbounded storage.
  • Forgetting that a language is a set of strings. Problems about graphs, machines, and formulas must be encoded before they become languages.
  • Confusing recognizability with decidability. A recognizer may loop on nonmembers; a decider must halt on every input.
  • Using examples as proofs. A few accepted and rejected strings can guide a construction, but a proof must cover all strings.
  • Thinking reductions only prove hardness in complexity theory. The same idea first appears as a way to transfer undecidability.

Connections