Matrices and Matrix Algebra
Matrices organize linear information. A matrix can be a table of data, a coefficient array for a system, or the rule for a linear transformation. Matrix algebra is designed so that these interpretations agree: matrix-vector multiplication applies a linear rule, and matrix-matrix multiplication composes such rules.
The main surprise is that matrix arithmetic resembles ordinary arithmetic only partly. Addition behaves as expected, but multiplication is order-sensitive and dimension-sensitive. This is not a defect. It records the fact that applying one linear process after another depends on which process happens first and whether the output size of the first matches the input size of the second.
Definitions
An $m \times n$ matrix $A$ has $m$ rows and $n$ columns, with entry $a_{ij}$ in row $i$ and column $j$. Two matrices are equal if they have the same size and the same corresponding entries.
Matrix addition and scalar multiplication are entrywise:

$$(A + B)_{ij} = a_{ij} + b_{ij}, \qquad (cA)_{ij} = c\,a_{ij}.$$
If $A$ is $m \times n$ and $B$ is $n \times p$, then $AB$ is $m \times p$ with entries

$$(AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}.$$
Equivalently, the $j$th column of $AB$ is $A$ times the $j$th column of $B$. This column viewpoint is often the cleanest way to remember multiplication:

$$AB = A \begin{pmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \end{pmatrix} = \begin{pmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_p \end{pmatrix}.$$
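The column viewpoint is easy to test numerically. Here is a minimal NumPy sketch, using the same matrices as the Code section below, confirming that each column of $AB$ is $A$ applied to the corresponding column of $B$:

```python
import numpy as np

A = np.array([[1, 2, 0],
              [-1, 3, 4]])   # 2x3
B = np.array([[2, 1],
              [0, -3],
              [5, 2]])       # 3x2

AB = A @ B
# Column j of AB equals A times column j of B.
for j in range(B.shape[1]):
    assert np.array_equal(AB[:, j], A @ B[:, j])
print(AB)
```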
The transpose $A^{T}$ is formed by interchanging rows and columns, so $(A^{T})_{ij} = a_{ji}$. A square matrix has a main diagonal. Important square matrices include the identity matrix $I$, diagonal matrices, triangular matrices, and symmetric matrices satisfying $A^{T} = A$.
Key results
Matrix addition is commutative and associative. Matrix multiplication is associative and distributes over addition:

$$(AB)C = A(BC), \qquad A(B + C) = AB + AC, \qquad (A + B)C = AC + BC.$$
However, multiplication is usually not commutative: $AB$ and $BA$ may differ, or one product may not even be defined. This is one of the most important habits to build early. When matrices represent functions, $AB$ means "first apply $B$, then apply $A$," so reversing the order usually changes the result.
Transpose rules are:

$$(A + B)^{T} = A^{T} + B^{T}, \qquad (cA)^{T} = cA^{T}, \qquad (AB)^{T} = B^{T}A^{T}, \qquad (A^{T})^{T} = A.$$
The reversed order in $(AB)^{T} = B^{T}A^{T}$ is forced by dimensions and by entries. The $(i, j)$ entry of $(AB)^{T}$ is the $(j, i)$ entry of $AB$, which is row $j$ of $A$ dotted with column $i$ of $B$. That same scalar is the $(i, j)$ entry of $B^{T}A^{T}$.
For compatible matrices, multiplication by the identity leaves a matrix unchanged: if $A$ is $m \times n$, then

$$I_m A = A = A I_n.$$
Zero matrices absorb under multiplication when dimensions allow: $A0 = 0$ and $0A = 0$.
But the cancellation law can fail. From $AB = AC$, one cannot conclude $B = C$ unless $A$ has an appropriate inverse.
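A small numerical sketch makes the failure concrete; the matrices below are illustrative choices, with $A$ singular:

```python
import numpy as np

A = np.array([[1, 0],
              [0, 0]])   # not invertible: it kills the second coordinate
B = np.array([[1, 1],
              [1, 1]])
C = np.array([[1, 1],
              [2, 2]])   # differs from B only in the second row

print(np.array_equal(A @ B, A @ C))   # True: AB = AC
print(np.array_equal(B, C))           # False: yet B != C
```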
Matrix multiplication can be understood in several equivalent ways, and each viewpoint is useful in a different setting. The entry formula is best for direct arithmetic. The column viewpoint says that the columns of $AB$ are the images of the columns of $B$ under $A$. The row viewpoint says that the rows of $AB$ are linear combinations of the rows of $B$, weighted by the rows of $A$. The transformation viewpoint says that $AB$ represents composition.
Block matrices extend the same ideas. If the sizes match, a matrix can be partitioned into submatrices and multiplied by blocks. For example,

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} E & F \\ G & H \end{pmatrix} = \begin{pmatrix} AE + BG & AF + BH \\ CE + DG & CF + DH \end{pmatrix}.$$
This notation is common in systems, least squares, numerical linear algebra, and theoretical proofs because it keeps related groups of variables together.
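As a sketch of blockwise multiplication, the snippet below builds two matrices from randomly chosen, dimensionally compatible blocks with NumPy's `np.block` and checks that multiplying block by block agrees with the full product:

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative blocks; sizes are arbitrary but must be compatible.
A, B = rng.standard_normal((2, 2)), rng.standard_normal((2, 3))
C, D = rng.standard_normal((4, 2)), rng.standard_normal((4, 3))
E, F = rng.standard_normal((2, 5)), rng.standard_normal((2, 1))
G, H = rng.standard_normal((3, 5)), rng.standard_normal((3, 1))

M = np.block([[A, B], [C, D]])   # 6x5 matrix assembled from blocks
N = np.block([[E, F], [G, H]])   # 5x6 matrix assembled from blocks

# Multiply as full matrices, then block by block, and compare.
blockwise = np.block([[A @ E + B @ G, A @ F + B @ H],
                      [C @ E + D @ G, C @ F + D @ H]])
print(np.allclose(M @ N, blockwise))   # True
```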
Special matrix classes carry extra structure. Diagonal matrices scale coordinate axes independently. Triangular matrices are easy to solve by substitution. Symmetric matrices satisfy $A^{T} = A$ and have real eigenvalues with orthogonal eigenvectors. Orthogonal matrices satisfy $Q^{T}Q = I$ and preserve dot products. Recognizing these classes often tells you which theorem or algorithm is appropriate before doing any arithmetic.
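These defining properties are cheap to verify numerically. The sketch below uses an illustrative rotation matrix as the orthogonal example and a small symmetric matrix:

```python
import numpy as np

theta = np.pi / 3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation: orthogonal
S = np.array([[2, 1],
              [1, 3]])                            # symmetric

print(np.allclose(Q.T @ Q, np.eye(2)))   # orthogonality: Q^T Q = I
print(np.allclose(S, S.T))               # symmetry: S^T = S

x, y = np.array([1.0, 2.0]), np.array([-3.0, 0.5])
print(np.isclose((Q @ x) @ (Q @ y), x @ y))   # Q preserves dot products
```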
The trace of a square matrix is the sum of its diagonal entries:

$$\operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii}.$$
It satisfies $\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$ and $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ when both products are defined as square matrices. Trace later connects to eigenvalues, quadratic forms, and matrix inner products.
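The identity $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ holds even for rectangular $A$ and $B$, as long as both products are square; a quick check with arbitrary random matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))   # 3x5
B = rng.standard_normal((5, 3))   # 5x3

# AB is 3x3 and BA is 5x5, yet their traces agree.
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # True
```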
Dimension discipline is a major part of matrix algebra. Before performing any calculation, identify the shape of each matrix and the shape of the result. This habit prevents many errors and also clarifies meaning: an $m \times n$ matrix maps coordinate vectors in $\mathbb{R}^{n}$ to coordinate vectors in $\mathbb{R}^{m}$.
Visual
| Operation | Condition | Result size | Key warning |
|---|---|---|---|
| $A + B$ | same size | same as $A$ and $B$ | entrywise only |
| $cA$ | $c$ scalar | same as $A$ | scales every entry |
| $AB$ | columns of $A$ = rows of $B$ | rows of $A$ by columns of $B$ | order matters |
| $A^{T}$ | any matrix | columns of $A$ by rows of $A$ | reverses products |
| $A\mathbf{x}$ | $\mathbf{x}$ has one entry per column of $A$ | one entry per row of $A$ | linear combination of columns |
Worked example 1: Compute a product and interpret columns
Problem: let

$$A = \begin{pmatrix} 1 & 2 & 0 \\ -1 & 3 & 4 \end{pmatrix}, \qquad B = \begin{pmatrix} 2 & 1 \\ 0 & -3 \\ 5 & 2 \end{pmatrix}.$$

Compute $AB$.
Step 1: check dimensions. $A$ is $2 \times 3$ and $B$ is $3 \times 2$, so $AB$ is defined and has size $2 \times 2$.
Step 2: compute entries by row-column dot products.

$$(AB)_{11} = (1)(2) + (2)(0) + (0)(5) = 2, \qquad (AB)_{12} = (1)(1) + (2)(-3) + (0)(2) = -5,$$
$$(AB)_{21} = (-1)(2) + (3)(0) + (4)(5) = 18, \qquad (AB)_{22} = (-1)(1) + (3)(-3) + (4)(2) = -2.$$

Thus

$$AB = \begin{pmatrix} 2 & -5 \\ 18 & -2 \end{pmatrix}.$$
Step 3: check with the column viewpoint. The first column of $B$ is $(2, 0, 5)^{T}$, and

$$A \begin{pmatrix} 2 \\ 0 \\ 5 \end{pmatrix} = \begin{pmatrix} (1)(2) + (2)(0) + (0)(5) \\ (-1)(2) + (3)(0) + (4)(5) \end{pmatrix} = \begin{pmatrix} 2 \\ 18 \end{pmatrix},$$

which matches the first column of $AB$.
Worked example 2: Show multiplication is not commutative
Problem: compare $AB$ and $BA$ for

$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Step 1: compute $AB$. Multiplying by $B$ on the right swaps the columns of $A$:

$$AB = \begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix}.$$

Step 2: compute $BA$. Multiplying by $B$ on the left swaps the rows of $A$:

$$BA = \begin{pmatrix} 3 & 4 \\ 1 & 2 \end{pmatrix}.$$

Step 3: compare. Since

$$\begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix} \neq \begin{pmatrix} 3 & 4 \\ 1 & 2 \end{pmatrix},$$

the products are not equal. Checked answer: both products exist, but $AB \neq BA$.
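The same comparison takes a few lines in NumPy:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])   # swaps coordinates

print(A @ B)   # [[2 1] [4 3]]: B on the right swaps the columns of A
print(B @ A)   # [[3 4] [1 2]]: B on the left swaps the rows of A
print(np.array_equal(A @ B, B @ A))   # False
```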
Code
```python
import numpy as np

# The matrices from worked example 1: A is 2x3, B is 3x2.
A = np.array([[1, 2, 0],
              [-1, 3, 4]])
B = np.array([[2, 1],
              [0, -3],
              [5, 2]])

print(A @ B)              # the 2x2 product AB
print((A @ B).T)          # transpose of the product
print(B.T @ A.T)          # transposes multiplied in reversed order
print(np.allclose((A @ B).T, B.T @ A.T))  # True: the two agree
```
The `@` operator performs matrix multiplication in NumPy. The final line checks the transpose product rule $(AB)^{T} = B^{T}A^{T}$.
Common pitfalls
- Multiplying corresponding entries and calling the result $AB$. Entrywise (Hadamard) multiplication is a different operation; see the sketch after this list.
- Forgetting the dimension check before multiplying.
- Reversing product order when translating composition. If $T(\mathbf{x}) = A\mathbf{x}$ and $S(\mathbf{x}) = B\mathbf{x}$, then $S(T(\mathbf{x}))$ corresponds to $BA$.
- Assuming $AB = BA$ because ordinary numbers commute.
- Distributing transpose over a product without reversing order: $(AB)^{T}$ is $B^{T}A^{T}$, not $A^{T}B^{T}$.
- Treating the identity matrix as one fixed size. The size of $I$ depends on context.
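The first pitfall is especially easy to hit in NumPy, where `*` is entrywise and `@` is matrix multiplication. A minimal sketch with illustrative matrices:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

print(A * B)   # entrywise (Hadamard) product: [[ 5 12] [21 32]]
print(A @ B)   # matrix product:               [[19 22] [43 50]]
```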
A practical way to avoid multiplication mistakes is to annotate shapes before multiplying. If $A$ is $m \times n$ and $B$ is $n \times p$, then $AB$ is $m \times p$ because the inner dimensions match and disappear, while the outer dimensions remain. If the inner dimensions do not match, the product is not defined. This simple check should happen before any entry arithmetic begins.
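In code, the same shape discipline can be made explicit before any multiplication happens; a minimal sketch:

```python
import numpy as np

A = np.zeros((2, 3))   # m x n
B = np.zeros((3, 4))   # n x p

# Annotate and verify shapes before multiplying.
m, n = A.shape
n2, p = B.shape
assert n == n2, "inner dimensions must match"
print((A @ B).shape)   # (2, 4): only the outer dimensions remain
```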
When a product represents composition, read it from right to left as an action on vectors. The expression $AB\mathbf{x}$ means first compute $B\mathbf{x}$, then apply $A$. This convention explains both the order of matrix multiplication and the transpose rule. It also prevents a common modeling error: writing transformations in the order they are described verbally rather than in the order they act on the vector.
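Associativity guarantees that grouping does not matter, even though order does. A short check, with illustrative choices of a swap matrix and a scaling matrix:

```python
import numpy as np

A = np.array([[0, 1],
              [1, 0]])   # swap the two coordinates
B = np.array([[2, 0],
              [0, 2]])   # scale by 2
x = np.array([1, 3])

# ABx: first apply B, then A; associativity lets us group either way.
print(A @ (B @ x))   # [6 2]
print((A @ B) @ x)   # [6 2], the same vector
```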
Special matrices should trigger special expectations. A diagonal matrix scales coordinates independently. A triangular matrix leads naturally to substitution. A symmetric matrix is connected to quadratic forms and orthogonal eigenvectors. An orthogonal matrix preserves dot products and lengths. Identifying these structures early can reduce computation and suggest the correct theorem.
For block matrices, always check that each block operation is dimensionally valid. Blocks behave like entries only when the block sizes match the intended algebra. Used correctly, block notation makes large systems easier to read. Used carelessly, it hides dimension errors that would have been obvious in entry notation.
Matrix powers are defined only for square matrices. If $A^{2}$ appears, then $A$ must be square because the output of the first application must be a valid input for the second. Powers describe repeated application of the same linear transformation, which is why they occur in Markov chains, recurrence relations, and dynamical systems.
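As an illustration, here is a hypothetical two-state Markov chain (column-stochastic convention, entries chosen only for the example) advanced ten steps with `np.linalg.matrix_power`:

```python
import numpy as np

# Illustrative 2-state transition matrix; each column sums to 1.
P = np.array([[0.9, 0.5],
              [0.1, 0.5]])
x = np.array([1.0, 0.0])   # start entirely in state 1

# P^10 applies the same transition ten times.
P10 = np.linalg.matrix_power(P, 10)
print(P10 @ x)             # distribution after ten steps
```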
The identity and zero matrices play different algebraic roles. The identity matrix leaves vectors and compatible matrices unchanged, so it behaves like the number $1$ under multiplication. The zero matrix sends every vector to zero, so it destroys information. Unlike ordinary arithmetic, it is possible for nonzero matrices $A$ and $B$ to have $AB = 0$ if the range of $B$ lies in the null space of $A$.
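A classic illustrative pair: $A$ and $B$ below are both nonzero, yet every column of $B$ lies in the null space of $A$, so $AB = 0$:

```python
import numpy as np

A = np.array([[0, 1],
              [0, 0]])
B = np.array([[1, 0],
              [0, 0]])

print(A @ B)                      # the zero matrix, although A != 0 and B != 0
print(A @ B[:, 0], A @ B[:, 1])   # A sends each column of B to zero
```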
Transpose connects row and column viewpoints. The columns of $A$ become the rows of $A^{T}$, and symmetric matrices are exactly those where these viewpoints match across the diagonal. This simple operation becomes important in dot products, normal equations, orthogonal matrices, and quadratic forms.