
Matrix Inverses and Elementary Matrices

An inverse matrix undoes a linear process. Elementary matrices make this idea concrete: every row operation is the same as multiplying by a simple invertible matrix. This connects row reduction, solving systems, and algebraic invertibility into one framework.

The main point is not merely that some square matrices have formulas for inverses. The deeper point is that invertibility is an equivalence of many ideas: no information is lost, the system $A\mathbf{x}=\mathbf{b}$ has exactly one solution for every $\mathbf{b}$, the columns form a basis, the determinant is nonzero, and row reduction reaches the identity. Elementary matrices are the bridge between the algorithm and these structural statements.

Definitions

A square matrix $A$ is invertible if there is a matrix $A^{-1}$ such that

$$AA^{-1}=I \qquad\text{and}\qquad A^{-1}A=I.$$

If no such matrix exists, $A$ is singular.

An elementary matrix is obtained by performing one elementary row operation on an identity matrix. Left multiplication by an elementary matrix performs the same row operation on any compatible matrix. For example, if $E$ is created by swapping rows 1 and 2 of $I$, then $EA$ swaps rows 1 and 2 of $A$.
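This can be checked directly in NumPy. The sketch below (with an arbitrary illustrative matrix) builds the row-swap elementary matrix and confirms that left multiplication swaps the same rows:

```python
import numpy as np

# Build E by swapping the first two rows of the 3x3 identity.
E = np.eye(3)
E[[0, 1]] = E[[1, 0]]

# An arbitrary matrix to act on (entries chosen only for illustration).
A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])

# Left multiplication by E performs the same row swap on A.
print(E @ A)  # rows 1 and 2 of A exchanged
```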

For a square matrix $A$, the augmented matrix

$$\left[ \begin{array}{c|c} A&I \end{array} \right]$$

can be row-reduced to compute $A^{-1}$. If the left side becomes $I$, the right side becomes $A^{-1}$. If the left side cannot become $I$, then $A$ is not invertible.
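The augmented-matrix procedure can be sketched in code. The helper below, `inverse_by_row_reduction`, is a hypothetical name, and for clarity it assumes every pivot is nonzero so no row swaps are needed; it is a teaching sketch, not a production solver:

```python
import numpy as np

def inverse_by_row_reduction(A):
    """Gauss-Jordan on [A | I]; assumes nonzero pivots (no row swaps)."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for i in range(n):
        M[i] = M[i] / M[i, i]                 # scale so the pivot is 1
        for j in range(n):
            if j != i:
                M[j] = M[j] - M[j, i] * M[i]  # clear the rest of column i
    return M[:, n:]                           # right half is now A^{-1}

A = np.array([[1., 2.], [3., 7.]])
print(inverse_by_row_reduction(A))  # [[ 7. -2.], [-3.  1.]]
```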

The inverse of a $2\times2$ matrix has the special formula

$$\begin{bmatrix} a&b\\ c&d \end{bmatrix}^{-1} = \frac{1}{ad-bc} \begin{bmatrix} d&-b\\ -c&a \end{bmatrix},$$

provided $ad-bc\neq0$.
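The formula translates directly into a short function. The name `inv2x2` is illustrative; the singular case is rejected exactly when $ad-bc=0$:

```python
import numpy as np

def inv2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via the 2x2 formula; requires ad - bc != 0."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular: ad - bc = 0")
    return (1.0 / det) * np.array([[d, -b],
                                   [-c, a]])

print(inv2x2(1, 2, 3, 7))  # [[ 7. -2.], [-3.  1.]]
```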

Key results

The inverse of an invertible matrix is unique. If $B$ and $C$ both satisfy the inverse equations for $A$, then

$$B=BI=B(AC)=(BA)C=IC=C.$$

Products of invertible matrices are invertible, and the inverse reverses order:

$$(AB)^{-1}=B^{-1}A^{-1}.$$

The order reversal is the same phenomenon as function composition: to undo "first $B$, then $A$," one must undo $A$ first and $B$ second.
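A quick numerical sanity check of the order reversal, using random matrices (which are invertible with probability 1, so this is fine for a demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)   # correct order
wrong = np.linalg.inv(A) @ np.linalg.inv(B) # wrong order

print(np.allclose(lhs, rhs))    # True: (AB)^{-1} = B^{-1} A^{-1}
print(np.allclose(lhs, wrong))  # False in general
```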

Elementary matrices are invertible. Their inverses are also elementary matrices: swap the same rows again, multiply by the reciprocal scalar, or add the opposite multiple of one row to another. If row operations reduce $A$ to $I$, then

$$E_k\cdots E_2E_1A=I.$$

Thus

$$E_k\cdots E_2E_1=A^{-1}.$$

This explains why applying the same operations to $I$ produces $A^{-1}$.
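For the matrix $A=\begin{bmatrix}1&2\\3&7\end{bmatrix}$ used in the worked examples below, the two elimination steps can be written out as elementary matrices and multiplied to recover the inverse:

```python
import numpy as np

A = np.array([[1., 2.], [3., 7.]])

# E1 performs R2 <- R2 - 3 R1; E2 performs R1 <- R1 - 2 R2.
E1 = np.array([[1., 0.], [-3., 1.]])
E2 = np.array([[1., -2.], [0., 1.]])

print(E2 @ E1 @ A)  # the identity
print(E2 @ E1)      # [[ 7. -2.], [-3.  1.]] = A^{-1}
```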

For a square matrix $A$, the following are equivalent:

  1. $A$ is invertible.
  2. The reduced row echelon form of $A$ is $I$.
  3. $A\mathbf{x}=\mathbf{0}$ has only the trivial solution.
  4. $A\mathbf{x}=\mathbf{b}$ has a unique solution for every $\mathbf{b}$.
  5. The columns of $A$ form a basis of $\mathbb{R}^n$.

These equivalences are often called the invertible matrix theorem in an introductory course.
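Several of these equivalent conditions can be probed numerically at once. This sketch reuses the $2\times2$ matrix from the worked examples; the rank test stands in for "the columns form a basis," and the determinant test is the nonzero-determinant criterion mentioned in the introduction:

```python
import numpy as np

A = np.array([[1., 2.], [3., 7.]])
n = A.shape[0]

full_rank = np.linalg.matrix_rank(A) == n           # columns form a basis
det_nonzero = not np.isclose(np.linalg.det(A), 0.0) # determinant is nonzero
x = np.linalg.solve(A, np.array([5., 17.]))         # unique solution for this b

print(full_rank, det_nonzero, x)  # True True [1. 2.]
```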

Visual

| Row operation | Elementary matrix action | Inverse operation |
| --- | --- | --- |
| Swap $R_i$ and $R_j$ | swap rows $i$ and $j$ of $I$ | swap the same rows |
| Scale $R_i$ by $c\neq0$ | scale row $i$ of $I$ by $c$ | scale by $1/c$ |
| Replace $R_i$ by $R_i+cR_j$ | add $c$ times row $j$ to row $i$ | add $-c$ times row $j$ to row $i$ |

Worked example 1: Compute an inverse by row reduction

Problem: compute the inverse of

$$A= \begin{bmatrix} 1&2\\ 3&7 \end{bmatrix}.$$

Step 1: augment with the identity.

$$\left[ \begin{array}{rr|rr} 1&2&1&0\\ 3&7&0&1 \end{array} \right]$$

Step 2: eliminate below the first pivot with $R_2\leftarrow R_2-3R_1$.

$$\left[ \begin{array}{rr|rr} 1&2&1&0\\ 0&1&-3&1 \end{array} \right]$$

Step 3: eliminate above the second pivot with $R_1\leftarrow R_1-2R_2$.

$$\left[ \begin{array}{rr|rr} 1&0&7&-2\\ 0&1&-3&1 \end{array} \right]$$

Thus

$$A^{-1}= \begin{bmatrix} 7&-2\\ -3&1 \end{bmatrix}.$$

Step 4: check by multiplication.

$$\begin{bmatrix} 1&2\\ 3&7 \end{bmatrix} \begin{bmatrix} 7&-2\\ -3&1 \end{bmatrix} = \begin{bmatrix} 1&0\\ 0&1 \end{bmatrix}.$$

The product is the identity, so the computed inverse is correct.

Worked example 2: Use an inverse to solve a system

Problem: solve

$$\begin{aligned} x+2y&=5,\\ 3x+7y&=17. \end{aligned}$$

Step 1: write the system as $A\mathbf{x}=\mathbf{b}$:

$$\begin{bmatrix} 1&2\\ 3&7 \end{bmatrix} \begin{bmatrix} x\\y \end{bmatrix} = \begin{bmatrix} 5\\17 \end{bmatrix}.$$

Step 2: use the inverse from the previous example.

$$\begin{bmatrix} x\\y \end{bmatrix} = A^{-1}\mathbf{b} = \begin{bmatrix} 7&-2\\ -3&1 \end{bmatrix} \begin{bmatrix} 5\\17 \end{bmatrix}.$$

Step 3: multiply.

$$\begin{aligned} x&=7(5)-2(17)=35-34=1,\\ y&=-3(5)+17=-15+17=2. \end{aligned}$$

Checked answer: $(x,y)=(1,2)$. Substitution gives $1+4=5$ and $3+14=17$.

Code

```python
import numpy as np

# Coefficient matrix and right-hand side from the worked examples.
A = np.array([[1, 2],
              [3, 7]], dtype=float)
b = np.array([5, 17], dtype=float)

A_inv = np.linalg.inv(A)  # explicit inverse (fine for small examples)
x = A_inv @ b             # solve A x = b via the inverse

print(A_inv)                  # [[ 7. -2.], [-3.  1.]]
print(x)                      # [1. 2.]
print(A @ A_inv)              # the identity, up to rounding
print(np.allclose(A @ x, b))  # True
```

In numerical work, directly forming an inverse is usually less stable and less efficient than solving Ax=bA\mathbf{x}=\mathbf{b} with a factorization. The inverse is still conceptually important, especially for proving structural facts.
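The preferred direct solve and the explicit-inverse route can be compared side by side; `np.linalg.solve` uses an LU-style factorization internally rather than forming $A^{-1}$:

```python
import numpy as np

A = np.array([[1., 2.], [3., 7.]])
b = np.array([5., 17.])

# Preferred in numerical work: a direct factorization-based solve.
x_solve = np.linalg.solve(A, b)

# Conceptually equivalent, but generally less stable and more work:
x_inv = np.linalg.inv(A) @ b

print(x_solve)                      # [1. 2.]
print(np.allclose(x_solve, x_inv))  # True here; they can differ for ill-conditioned A
```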

Common pitfalls

  • Writing $A^{-1}=1/A$. Matrix inversion is not entrywise division.
  • Reversing the inverse product rule incorrectly. The correct identity is $(AB)^{-1}=B^{-1}A^{-1}$.
  • Assuming every square matrix has an inverse. A square matrix can be singular.
  • Row-reducing only $A$ and forgetting to apply the same operations to the identity side of $[A\mid I]$.
  • Using the $2\times2$ inverse formula for larger matrices.
  • Concluding that a matrix is invertible just because all entries are nonzero. Pivot structure, not entrywise nonzero status, controls invertibility.

When computing an inverse by row reduction, every row operation should be interpreted as left multiplication by an elementary matrix. This perspective explains why the method works: the sequence of operations that turns $A$ into $I$ is exactly the sequence whose product is $A^{-1}$. The right side of $[A\mid I]$ records that product as it is built.

The inverse is best understood as a structural guarantee, not always as a computational tool. If $A$ is invertible, then $A\mathbf{x}=\mathbf{b}$ has the unique solution $\mathbf{x}=A^{-1}\mathbf{b}$. That formula proves uniqueness and makes the dependence on $\mathbf{b}$ explicit. In numerical computation, however, solving the system directly with a factorization is usually better than forming $A^{-1}$ explicitly.

A quick singularity check is to look for dependence among columns or rows. If one column is a scalar multiple or linear combination of others, then the matrix cannot be invertible. Row reduction turns this observation into pivots: a missing pivot means one direction in the domain is collapsed, so the process cannot be reversed.
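A small example of that check, using a matrix whose second column is twice its first (the entries are illustrative):

```python
import numpy as np

# Second column is 2x the first, so the columns are dependent.
S = np.array([[1., 2.],
              [3., 6.]])

print(np.linalg.matrix_rank(S))         # 1: a pivot is missing
print(np.isclose(np.linalg.det(S), 0))  # True: S is singular
```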

For products, imagine undoing actions. If a vector is first transformed by $B$ and then by $A$, the combined action is $AB$. To reverse it, undo $A$ first and then $B$, so the inverse is $B^{-1}A^{-1}$. This mental model is often safer than memorizing the formula alone.

It is also worth distinguishing left inverses and right inverses in rectangular settings. A square matrix inverse satisfies both $AA^{-1}=I$ and $A^{-1}A=I$. For a non-square matrix, one may have a matrix that reverses the action on one side but not the other. This connects directly to one-to-one and onto behavior: full column rank supports left-inverse behavior, while full row rank supports right-inverse behavior. The square invertible case is the special situation where both happen at once.
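A sketch of the full-column-rank case, using an illustrative tall matrix and the standard left inverse $(A^{\mathsf T}A)^{-1}A^{\mathsf T}$, which is well defined whenever the columns are independent:

```python
import numpy as np

# A tall matrix with full column rank (entries chosen for illustration).
A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])

# One left inverse: (A^T A)^{-1} A^T.
L = np.linalg.inv(A.T @ A) @ A.T

print(np.allclose(L @ A, np.eye(2)))  # True: L undoes A on the left
print(np.allclose(A @ L, np.eye(3)))  # False: this A has no right inverse
```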

Elementary matrices give a compact way to reason about algorithms. If elimination reduces $A$ to an upper triangular matrix $U$, then a product of elementary matrices has transformed $A$ into $U$. Rearranging that statement is the beginning of LU factorization. Thus inverse theory is not isolated from numerical methods; it is the algebraic background behind practical solvers.
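That rearrangement can be seen on the $2\times2$ matrix from the worked examples: one elimination step $EA=U$ gives $A=E^{-1}U=LU$ with $L$ lower triangular:

```python
import numpy as np

A = np.array([[1., 2.], [3., 7.]])

# E performs R2 <- R2 - 3 R1 and produces upper triangular U.
E = np.array([[1., 0.], [-3., 1.]])
U = E @ A

# Rearranging E A = U gives A = E^{-1} U = L U, with L lower triangular.
L = np.linalg.inv(E)  # [[1. 0.], [3. 1.]]

print(U)                      # [[1. 2.], [0. 1.]]
print(np.allclose(L @ U, A))  # True: A = L U
```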

When checking an inverse candidate $B$, multiply on both sides if the problem is theoretical:

$$AB=I \qquad\text{and}\qquad BA=I.$$

For square matrices over ordinary finite-dimensional spaces, one side actually implies the other, but checking both sides is a good habit in introductory work because it reveals order mistakes. If $AB$ and $BA$ are not even both defined, then the proposed inverse cannot be a two-sided inverse.

The inverse also clarifies equations with transformed variables. If $\mathbf{y}=A\mathbf{x}$ and $A$ is invertible, then no information has been lost: $\mathbf{x}=A^{-1}\mathbf{y}$. If $A$ is singular, two different inputs may produce the same output, or some outputs may be unreachable. Thus invertibility is both an algebraic property and an information-preservation property.

In proofs, it is often better to use the defining inverse equations than to compute the inverse. For example, to show that an equation has at most one solution, suppose $A\mathbf{x}=A\mathbf{y}$ and multiply by $A^{-1}$ to get $\mathbf{x}=\mathbf{y}$. This kind of argument is the real reason inverse notation is powerful.

Before using an inverse in a solution, ask what it proves: existence, uniqueness, reversibility, or a concrete formula. Those are related but distinct purposes, and separating them keeps both computations and proofs cleaner.

When the purpose is only to solve one system, a direct solve is usually the better computational expression of the same idea.

Connections