Matrices and Linear Systems
Matrices organize linear relationships. In engineering mathematics they represent systems of algebraic equations, coordinate transformations, discretized differential equations, least-squares fits, network laws, and linearized models. A matrix is not only a table of numbers; it is a linear map whose algebra reflects geometry and computation.
Solving is one of the core tasks in applied mathematics. Exact row reduction teaches structure, while numerical factorization teaches reliability and cost. The same ideas support eigenvalue problems, finite-difference PDE solvers, regression, optimization, and control.
Definitions
A matrix $A \in \mathbb{R}^{m \times n}$ maps vectors in $\mathbb{R}^n$ to vectors in $\mathbb{R}^m$ by $x \mapsto Ax$.
A linear system is $Ax = b$, where $x \in \mathbb{R}^n$ is unknown and $b \in \mathbb{R}^m$ is given.
The column space of $A$ is the set of all linear combinations of its columns. The nullspace is $N(A) = \{\,x : Ax = 0\,\}$.
The rank of $A$ is the dimension of its column space. A square matrix $A$ is invertible when there exists $A^{-1}$ such that $A^{-1}A = AA^{-1} = I$.
For an augmented matrix $[A \mid b]$, Gaussian elimination uses elementary row operations to reach echelon form. Pivot columns identify basic variables, while nonpivot columns identify free variables.
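As a sketch, pivot and free columns can be read off mechanically with SymPy (assuming it is available): `Matrix.rref()` returns the reduced echelon form together with the tuple of pivot column indices. The augmented matrix below is illustrative.

```python
from sympy import Matrix

# Illustrative augmented matrix [A | b] for a 3x3 system
aug = Matrix([[1, 2, -1, 3],
              [2, 3, 1, 7],
              [1, -1, 2, 0]])

# rref() returns (reduced matrix, pivot column indices)
R, pivots = aug.rref()
print(pivots)   # pivot columns; any remaining columns would be free
print(R[:, 3])  # last column of the RREF holds the solution here
```

With three pivots in the coefficient columns, every variable is basic and the solution can be read directly from the last column.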
An LU factorization writes $A = LU$, where $L$ is lower triangular and $U$ is upper triangular, often with a permutation matrix $P$ so that $PA = LU$.
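Assuming SciPy is available, the factorization can be checked numerically. Note the convention: `scipy.linalg.lu` returns factors with the permutation on the left, $A = P L U$, rather than the textbook $PA = LU$.

```python
import numpy as np
from scipy.linalg import lu  # assumes SciPy is installed

A = np.array([[1.0, 2.0, -1.0],
              [2.0, 3.0, 1.0],
              [1.0, -1.0, 2.0]])

# SciPy's convention: A = P @ L @ U (permutation applied on the left)
P, L, U = lu(A)
print(np.allclose(A, P @ L @ U))  # True
```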
Key results
For a square matrix $A \in \mathbb{R}^{n \times n}$, the following are equivalent: $A$ is invertible, $\det A \neq 0$, $\operatorname{rank}(A) = n$, the nullspace contains only $0$, the columns form a basis of $\mathbb{R}^n$, and $Ax = b$ has a unique solution for every $b$.
The rank-nullity theorem states
$$\operatorname{rank}(A) + \dim N(A) = n,$$
where $n$ is the number of columns. This balances the number of independent output directions against the number of degrees of freedom lost to the nullspace.
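The theorem can be verified numerically on a small rank-deficient example (illustrative data; SciPy is assumed for the nullspace basis).

```python
import numpy as np
from scipy.linalg import null_space  # assumes SciPy is installed

# Illustrative rank-1 matrix: the second row is twice the first
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

rank = np.linalg.matrix_rank(A)
nullity = null_space(A).shape[1]  # number of basis vectors for N(A)
print(rank, nullity)              # 1 and 2: they sum to n = 3 columns
```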
Gaussian elimination is systematic. Row swaps improve pivoting, scaling is optional for exact arithmetic, and elimination below pivots produces triangular form. Back substitution then solves the system. For numerical work, pivoting is not merely cosmetic; it reduces the effect of dividing by small numbers.
The determinant measures signed volume scaling for a square linear map. It is useful for invertibility tests and theoretical formulas, but it is usually not the best numerical way to solve systems. Computing an inverse explicitly is also often unnecessary. To solve $Ax = b$, factor $A$ and use triangular solves.
Conditioning measures sensitivity. A matrix with a large condition number can turn small perturbations in data into large changes in the solution. The residual $r = b - A\hat{x}$ of a computed solution $\hat{x}$ may be small even when the error $\hat{x} - x$ is large if $A$ is ill-conditioned. This is a major theme in numerical linear algebra and engineering computation.
Least-squares problems solve inconsistent systems approximately. If $A$ has more rows than columns and measurements are noisy, one minimizes
$$\|b - Ax\|_2^2 .$$
The normal equations are
$$A^\top A\,\hat{x} = A^\top b,$$
but QR factorization is often more stable in numerical work.
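A sketch comparing the two routes on a small, well-conditioned problem (illustrative data): `np.linalg.lstsq` uses an SVD-based solver, so it never forms $A^\top A$, while the normal-equations route does.

```python
import numpy as np

# Illustrative tall system: fit an intercept and a slope to three points
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

# SVD-based least squares (avoids forming A^T A)
coef, *_ = np.linalg.lstsq(A, b, rcond=None)

# Normal equations: same answer here because A is well conditioned
coef_ne = np.linalg.solve(A.T @ A, A.T @ b)
print(np.allclose(coef, coef_ne))  # True
```

For badly conditioned $A$, the two routes diverge: squaring $A$ roughly squares its condition number.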
Matrix structure should be preserved. Symmetric, sparse, banded, positive definite, and triangular matrices have specialized algorithms. Ignoring structure can waste computation and reduce accuracy. Finite-difference discretizations of ODEs and PDEs often produce sparse banded systems, so engineering solvers rely heavily on sparse matrix methods.
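A 1-D finite-difference Laplacian gives a tridiagonal system; a minimal sketch with SciPy's sparse machinery (assumed available) shows the banded structure being exploited instead of densified.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import spsolve  # assumes SciPy is installed

# Tridiagonal matrix from a second-difference stencil (illustrative size)
n = 100
A = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# Sparse direct solve: never forms the dense n x n matrix
x = spsolve(A, b)
print(np.allclose(A @ x, b))  # True
```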
Consistency has a geometric interpretation. The equation $Ax = b$ asks whether $b$ lies in the column space of $A$. If it does, at least one solution exists. If it does not, no exact solution exists and least squares becomes a natural substitute. When the columns are independent, the coordinates of $b$ in that column basis are unique. When the columns are dependent, different coefficient vectors can produce the same output.
Elementary row operations preserve the solution set because they replace equations by equivalent linear combinations of equations. They do not preserve the column space in an obvious visual way, but they preserve consistency and the relationships among variables. Column operations are different: they change the variables themselves unless carefully tracked. For solving systems by hand, row operations are the standard safe operations.
The inverse matrix $A^{-1}$ is best understood as the linear map that undoes $A$. It exists only when $A$ is square and one-to-one onto the whole space. If $A$ is rectangular, the right replacement depends on the problem: a left inverse may exist for full-column-rank matrices, a right inverse may exist for full-row-rank matrices, and the pseudoinverse $A^{+}$ gives the least-squares or minimum-norm solution under appropriate conditions.
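For a full-column-rank rectangular matrix, the pseudoinverse reproduces the least-squares solution; a minimal check with illustrative data:

```python
import numpy as np

# Illustrative tall matrix with full column rank
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

# Pseudoinverse solution vs. direct least-squares solve
x_pinv = np.linalg.pinv(A) @ b
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x_pinv, x_lstsq))  # True
```

In this full-column-rank case $A^{+} = (A^\top A)^{-1} A^\top$, i.e. the pseudoinverse coincides with the left inverse.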
Norms provide a language for error. The vector norm $\|x\|$ measures the size of a vector, and a compatible matrix norm $\|A\|$ measures how much the matrix can stretch vectors. The condition number $\kappa(A) = \|A\|\,\|A^{-1}\|$ estimates worst-case relative sensitivity for invertible systems. A matrix can be exactly invertible and still numerically troublesome if $\kappa(A)$ is large.
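The definition $\kappa(A) = \|A\|\,\|A^{-1}\|$ can be checked directly against NumPy's built-in condition number (illustrative matrix; spectral norm throughout):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# kappa(A) = ||A|| * ||A^{-1}|| in the 2-norm
kappa = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)
print(np.isclose(kappa, np.linalg.cond(A, 2)))  # True
```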
Scaling can improve numerical behavior. If one equation is measured in millions and another in thousandths, pivot choices and residual norms may be dominated by units rather than mathematics. Row and column scaling, nondimensional variables, and careful unit choices can make a system easier to solve and interpret. Engineering models should not hide unit inconsistency inside a matrix.
In large systems, algorithmic cost matters. Dense Gaussian elimination for an $n \times n$ matrix costs on the order of $n^3$ operations, while triangular solves cost on the order of $n^2$. If the same matrix is used with many right-hand sides, factoring once and solving many times is efficient. Sparse direct solvers and iterative methods can reduce cost dramatically when the matrix structure permits.
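Factoring once and reusing the factors looks like this with SciPy (assumed available): `lu_factor` does the $O(n^3)$ work once, and each `lu_solve` is a pair of $O(n^2)$ triangular solves.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve  # assumes SciPy is installed

A = np.array([[1.0, 2.0, -1.0],
              [2.0, 3.0, 1.0],
              [1.0, -1.0, 2.0]])

lu_piv = lu_factor(A)      # O(n^3): done once
B = np.eye(3)              # three right-hand sides at once
X = lu_solve(lu_piv, B)    # O(n^2) per right-hand side
print(np.allclose(A @ X, B))  # True: X is A^{-1} here
```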
Visual
| Concept | Algebraic meaning | Computational note |
|---|---|---|
| Pivot | Leading variable in elimination | Small pivots can amplify roundoff |
| Rank | Number of independent columns | Determines consistency and degrees of freedom |
| Nullspace | Solutions of $Ax = 0$ | Describes nonuniqueness |
| Determinant | Volume scaling for square maps | Poor tool for large numerical solves |
| Condition number | Sensitivity measure | Large values warn of unreliable solutions |
Worked example 1: Gaussian elimination
Problem. Solve
$$\begin{aligned} x + 2y - z &= 3,\\ 2x + 3y + z &= 7,\\ x - y + 2z &= 0. \end{aligned}$$
Method.
- Write the augmented matrix:
$$\left[\begin{array}{rrr|r} 1 & 2 & -1 & 3 \\ 2 & 3 & 1 & 7 \\ 1 & -1 & 2 & 0 \end{array}\right]$$
- Eliminate below the first pivot: $R_2 \leftarrow R_2 - 2R_1$ and $R_3 \leftarrow R_3 - R_1$. This gives
$$\left[\begin{array}{rrr|r} 1 & 2 & -1 & 3 \\ 0 & -1 & 3 & 1 \\ 0 & -3 & 3 & -3 \end{array}\right]$$
- Eliminate below the second pivot: $R_3 \leftarrow R_3 - 3R_2$. Then
$$\left[\begin{array}{rrr|r} 1 & 2 & -1 & 3 \\ 0 & -1 & 3 & 1 \\ 0 & 0 & -6 & -6 \end{array}\right]$$
- Back substitute: $-6z = -6$, so $z = 1$.
- Use row 2: $-y + 3z = 1$, so $y = 3z - 1 = 2$.
- Use row 1: $x + 2y - z = 3$, so $x = 3 - 2(2) + 1 = 0$.
Answer. $x = 0$, $y = 2$, $z = 1$.
Check. Substitution gives $3$, $7$, and $0$ in the three equations.
The pivot pattern shows that all three variables are basic. There are no free variables and no inconsistent row, so the solution is unique. If the last row had become $[\,0 \;\; 0 \;\; 0 \mid c\,]$ with $c \neq 0$, the system would have been inconsistent. If the last row had become all zeros, one variable would have remained free.
Worked example 2: Least-squares line fit
Problem. Fit $y = c_0 + c_1 x$ to the points $(0, 1)$, $(1, 2)$, and $(2, 2)$ by least squares.
Method.
- Build the design matrix:
$$A = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$$
- Compute
$$A^\top A = \begin{bmatrix} 3 & 3 \\ 3 & 5 \end{bmatrix}, \qquad A^\top b = \begin{bmatrix} 5 \\ 6 \end{bmatrix}$$
- Solve the normal equations: $3c_0 + 3c_1 = 5$ and $3c_0 + 5c_1 = 6$.
- Subtract the first equation from the second: $2c_1 = 1$, so $c_1 = \tfrac{1}{2}$.
- Substitute: $3c_0 + \tfrac{3}{2} = 5$, so $c_0 = \tfrac{7}{6}$.
Answer. $c_0 = \tfrac{7}{6}$ and $c_1 = \tfrac{1}{2}$, so the fitted line is $y = \tfrac{7}{6} + \tfrac{1}{2}x$.
Check. The residuals are $-\tfrac{1}{6}$, $\tfrac{1}{3}$, and $-\tfrac{1}{6}$, whose sum is zero, consistent with fitting an intercept.
The residual vector $r = b - A\hat{x}$ is orthogonal to the columns of $A$. That is the geometric meaning of the normal equations: the error is perpendicular to every direction in the model space. The fitted line is therefore the closest line in the least-squares sense among all lines of the form $y = c_0 + c_1 x$.
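The orthogonality claim is easy to check numerically: the residual of a least-squares fit should be (up to roundoff) perpendicular to every column of $A$. The data below is illustrative.

```python
import numpy as np

# Illustrative intercept-and-slope design matrix and observations
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

x = np.linalg.lstsq(A, b, rcond=None)[0]
r = b - A @ x

# A^T r = 0 is exactly the normal equations rearranged
print(np.allclose(A.T @ r, 0.0))  # True
```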
Code
```python
import numpy as np

# Coefficient matrix and right-hand side from worked example 1
A = np.array([[1.0, 2.0, -1.0],
              [2.0, 3.0, 1.0],
              [1.0, -1.0, 2.0]])
b = np.array([3.0, 7.0, 0.0])

x = np.linalg.solve(A, b)        # LU-based direct solve
residual = b - A @ x

print(x)                         # solution vector
print(np.linalg.norm(residual))  # residual norm, near machine precision
print(np.linalg.cond(A))         # 2-norm condition number
```
The code solves the exact system and reports a condition number. In a real measurement problem, the condition number helps decide how much trust to place in computed digits. A residual near machine precision is good, but it is not a complete accuracy certificate for an ill-conditioned system.
Common pitfalls
- Computing $A^{-1}$ just to solve one system when a factorization or direct solve is better.
- Confusing row space, column space, and nullspace.
- Assuming a small residual always means a small solution error.
- Ignoring pivoting in floating-point Gaussian elimination.
- Using normal equations for badly conditioned least-squares problems without considering QR.
- Forgetting that free variables mean infinitely many solutions only when the system is consistent.
- Treating determinant formulas as practical algorithms for large systems.
- Destroying sparsity by using dense operations on sparse engineering matrices.