Quadratic Forms and Spectral Theorems

Quadratic forms turn symmetric matrices into scalar-valued geometry. They describe conic sections, surfaces, energy functions, constrained extrema, and optimization tests. The spectral theorem is the key simplifier: a real symmetric matrix can be diagonalized by an orthogonal change of coordinates.

The reason symmetry matters is that symmetric matrices have especially clean eigenstructure. Their eigenvalues are real, eigenvectors from different eigenspaces are orthogonal, and an orthonormal eigenbasis exists. That lets quadratic forms be rotated into a coordinate system with no cross terms.

Definitions

A quadratic form in \mathbb{R}^n is an expression

Q(\mathbf{x})=\mathbf{x}^TA\mathbf{x},

where A is symmetric. For two variables,

Q(x,y)=ax^2+2bxy+cy^2

corresponds to

A=\begin{bmatrix} a&b\\ b&c \end{bmatrix}.

A matrix Q is orthogonal if

Q^TQ=I.

Orthogonal matrices preserve lengths and dot products. A matrix A is orthogonally diagonalizable if

Q^TAQ=D

for some orthogonal Q and diagonal D.

A quadratic form is positive definite if Q(\mathbf{x})>0 for every nonzero \mathbf{x}. It is negative definite if Q(\mathbf{x})<0 for every nonzero \mathbf{x}. It is indefinite if it takes both positive and negative values.

Key results

Spectral theorem for real symmetric matrices: if A is real and symmetric, then A is orthogonally diagonalizable. That is, there is an orthogonal matrix Q and a real diagonal matrix D such that

A=QDQ^T.

The columns of Q are orthonormal eigenvectors of A, and the diagonal entries of D are the corresponding eigenvalues.

Principal axes theorem: if A is symmetric and \mathbf{x}=Q\mathbf{y}, then

\mathbf{x}^TA\mathbf{x} = \mathbf{y}^TQ^TAQ\mathbf{y} = \mathbf{y}^TD\mathbf{y}.

Thus the quadratic form becomes

\lambda_1y_1^2+\cdots+\lambda_ny_n^2,

with no cross terms.
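The principal axes theorem can be checked numerically. The following sketch uses NumPy on a randomly generated symmetric matrix (the random seed and the 4×4 size are illustrative choices, not anything from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random symmetric matrix: symmetrizing any square matrix works.
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2

# eigh returns real eigenvalues and orthonormal eigenvectors for symmetric input.
lam, Q = np.linalg.eigh(A)

# Change of variables x = Q y, so y = Q^T x.
x = rng.standard_normal(4)
y = Q.T @ x

# x^T A x equals sum_i lambda_i y_i^2: the form has no cross terms in y.
lhs = x @ A @ x
rhs = np.sum(lam * y**2)
assert np.isclose(lhs, rhs)
```

The assertion holds for any x because Q^T A Q = D exactly diagonalizes the form.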

Eigenvalues classify definiteness:

Eigenvalues of symmetric A                Type of quadratic form
all positive                              positive definite
all nonnegative and at least one zero     positive semidefinite
all negative                              negative definite
all nonpositive and at least one zero     negative semidefinite
both positive and negative                indefinite

For a twice-differentiable function of two variables, the Hessian matrix at a critical point is symmetric under common smoothness assumptions (continuous second partials, so the mixed partials agree). The definiteness of the Hessian gives the second-derivative test: positive definite means local minimum, negative definite means local maximum, and indefinite means saddle point.

The cross term in a quadratic form measures how the original coordinate axes fail to align with the natural axes of the form. For

Q(x,y)=ax^2+2bxy+cy^2,

the term 2bxy mixes the variables. Orthogonal diagonalization chooses new perpendicular axes so that the same form is written without a mixed term:

Q=\lambda_1y_1^2+\lambda_2y_2^2.

The eigenvectors of the symmetric matrix point along these principal axes, and the eigenvalues measure curvature or scaling in those directions.

Positive definiteness has several equivalent tests. For symmetric matrices, the eigenvalue test is conceptually clean: all eigenvalues must be positive. For a 2\times 2 symmetric matrix

\begin{bmatrix} a&b\\ b&c \end{bmatrix},

positive definiteness is equivalent to

a>0 \quad\text{and}\quad ac-b^2>0.

The second condition is the determinant being positive. These are the leading principal minor conditions in the 2\times 2 case.

Quadratic forms also encode constrained extrema on spheres. If A is symmetric and \|\mathbf{x}\|=1, then the Rayleigh quotient

R(\mathbf{x})=\mathbf{x}^TA\mathbf{x}

takes values between the smallest and largest eigenvalues of A. The maximum is the largest eigenvalue, achieved at a corresponding unit eigenvector; the minimum is the smallest eigenvalue. This result explains why eigenvalues appear in optimization, mechanics, statistics, and numerical methods.
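The minor test and the eigenvalue test agree, which is easy to confirm numerically. A small sketch (the helper function names are mine, not standard API):

```python
import numpy as np

def pd_by_minors(a, b, c):
    """Leading principal minor test for [[a, b], [b, c]]."""
    return a > 0 and a * c - b**2 > 0

def pd_by_eigenvalues(a, b, c):
    """Eigenvalue test: positive definite iff all eigenvalues are positive."""
    vals = np.linalg.eigvalsh(np.array([[a, b], [b, c]], dtype=float))
    return bool(np.all(vals > 0))

# Both tests pass for the matrix of Q = 5x^2 + 4xy + 2y^2.
assert pd_by_minors(5, 2, 2) and pd_by_eigenvalues(5, 2, 2)
# Both fail for the indefinite matrix [[2, 4], [4, 2]]: det = 4 - 16 < 0.
assert not pd_by_minors(2, 4, 2) and not pd_by_eigenvalues(2, 4, 2)
```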
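The Rayleigh bounds can be observed by sampling. This sketch draws random unit vectors for a random symmetric matrix (seed and size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# A random 5x5 symmetric test matrix.
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2

lam = np.linalg.eigvalsh(A)          # eigenvalues, sorted ascending
lam_min, lam_max = lam[0], lam[-1]

# The Rayleigh quotient of every unit vector lies in [lam_min, lam_max].
for _ in range(200):
    x = rng.standard_normal(5)
    x /= np.linalg.norm(x)
    R = x @ A @ x
    assert lam_min - 1e-12 <= R <= lam_max + 1e-12
```

Random sampling gets close to the bounds but only a unit eigenvector attains them exactly.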

In applications, a positive definite matrix often represents energy, variance, or squared distance. The positivity condition guarantees that the quantity is genuinely zero only at the zero vector and positive otherwise. Indefinite forms represent saddle geometry: there are directions of increase and directions of decrease through the same point.

Visual

ASCII picture of removing a cross term:

     xy-coordinate axes                 principal axes

          y                                  y2
          |  tilted ellipse                  |   ellipse aligned
          |   /-----/                        |      -----
    ------+---/-----/---- x            ------+---------- y1
          |  /-----/                         |      -----
          |

Worked example 1: Diagonalize a quadratic form

Problem: diagonalize

Q(x,y)=5x^2+4xy+2y^2.

Step 1: write the symmetric matrix. Since the cross term is 2bxy, we have 2b=4, so b=2:

A=\begin{bmatrix} 5&2\\ 2&2 \end{bmatrix}.

Step 2: compute eigenvalues.

\det(A-\lambda I) = \det \begin{bmatrix} 5-\lambda&2\\ 2&2-\lambda \end{bmatrix} =(5-\lambda)(2-\lambda)-4.

Expand:

(5-\lambda)(2-\lambda)-4 =10-7\lambda+\lambda^2-4 =\lambda^2-7\lambda+6.

Factor:

\lambda^2-7\lambda+6=(\lambda-6)(\lambda-1).

Step 3: find orthogonal eigenvectors. For \lambda=6,

A-6I= \begin{bmatrix} -1&2\\ 2&-4 \end{bmatrix},

so -x+2y=0 and an eigenvector is \begin{bmatrix}2\\1\end{bmatrix}.

For \lambda=1,

A-I= \begin{bmatrix} 4&2\\ 2&1 \end{bmatrix},

so 2x+y=0 and an eigenvector is \begin{bmatrix}1\\-2\end{bmatrix}.

Step 4: normalize:

\mathbf{q}_1=\frac{1}{\sqrt5}\begin{bmatrix}2\\1\end{bmatrix}, \qquad \mathbf{q}_2=\frac{1}{\sqrt5}\begin{bmatrix}1\\-2\end{bmatrix}.

Then in principal coordinates,

Q=6y_1^2+y_2^2.

Checked answer: both eigenvalues are positive, so the form is positive definite.

Worked example 2: Classify a critical point with a Hessian

Problem: classify the critical point (0,0) of

f(x,y)=x^2+4xy+y^2.

Step 1: compute the gradient.

f_x=2x+4y, \qquad f_y=4x+2y.

At (0,0) both derivatives are zero, so it is a critical point.

Step 2: compute the Hessian.

H= \begin{bmatrix} f_{xx}&f_{xy}\\ f_{yx}&f_{yy} \end{bmatrix} = \begin{bmatrix} 2&4\\ 4&2 \end{bmatrix}.

Step 3: find eigenvalues.

\det(H-\lambda I) = \det \begin{bmatrix} 2-\lambda&4\\ 4&2-\lambda \end{bmatrix} =(2-\lambda)^2-16.

Set this equal to zero:

(2-\lambda)^2=16 \quad\Longrightarrow\quad 2-\lambda=\pm4.

Thus \lambda=-2 or \lambda=6.

Step 4: classify. The Hessian has one positive and one negative eigenvalue, so it is indefinite. Checked answer: (0,0) is a saddle point.
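The hand computation above can be double-checked with NumPy:

```python
import numpy as np

# Hessian of f(x, y) = x^2 + 4xy + y^2 at the critical point (0, 0).
H = np.array([[2.0, 4.0],
              [4.0, 2.0]])

# eigvalsh returns the eigenvalues of a symmetric matrix, sorted ascending.
vals = np.linalg.eigvalsh(H)
assert np.allclose(vals, [-2.0, 6.0])

# One negative and one positive eigenvalue: indefinite, hence a saddle point.
assert vals[0] < 0 < vals[1]
```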

Code

import numpy as np

# Symmetric matrix of Q(x, y) = 5x^2 + 4xy + 2y^2 from worked example 1.
A = np.array([[5, 2],
              [2, 2]], dtype=float)

# eigh handles symmetric/Hermitian input: real eigenvalues, orthonormal eigenvectors.
values, Q = np.linalg.eigh(A)
D = np.diag(values)

print(values)              # eigenvalues in ascending order: [1. 6.]
print(Q.T @ A @ Q)         # approximately D: no cross terms in the eigenbasis
print(np.all(values > 0))  # True, so the form is positive definite

np.linalg.eigh is specialized for symmetric or Hermitian matrices. It returns real eigenvalues and orthonormal eigenvectors, matching the spectral theorem.

Common pitfalls

  • Forgetting that the matrix of ax^2+2bxy+cy^2 has off-diagonal entries b, not 2b.
  • Applying the real spectral theorem to a nonsymmetric matrix.
  • Assuming diagonalizable automatically means orthogonally diagonalizable. Orthogonal diagonalization is stronger and is guaranteed for real symmetric matrices.
  • Classifying definiteness from diagonal entries of A instead of eigenvalues or a valid criterion.
  • Ignoring zero eigenvalues when distinguishing positive definite from positive semidefinite.
  • Losing the coordinate change: \mathbf{x}=Q\mathbf{y}, so the new variables are coordinates in the orthonormal eigenbasis.

A safe classification routine is to start with symmetry. The standard eigenvalue definiteness test applies to symmetric matrices. If the matrix is not symmetric, first rewrite the quadratic form using its symmetric part, because

\mathbf{x}^TA\mathbf{x} = \mathbf{x}^T\left(\frac{A+A^T}{2}\right)\mathbf{x}.

The skew-symmetric part contributes nothing to the quadratic form. After the symmetric matrix is identified, use eigenvalues or a valid principal-minor test.
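A quick numerical sanity check of this identity (the random matrix and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# A nonsymmetric matrix and its symmetric part.
A = rng.standard_normal((3, 3))
S = (A + A.T) / 2

# The quadratic form only sees the symmetric part: the skew part
# K = (A - A^T)/2 satisfies x^T K x = 0 for every x.
for _ in range(100):
    x = rng.standard_normal(3)
    assert np.isclose(x @ A @ x, x @ S @ x)
```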

When diagonalizing a quadratic form, keep track of which variables are old and which are new. The equation \mathbf{x}=Q\mathbf{y} means the columns of Q are the new orthonormal axes written in old coordinates. The vector \mathbf{y} contains coordinates along those axes. Losing this distinction can lead to correct eigenvalue arithmetic but wrong geometric interpretation.

The signs of eigenvalues describe the shape. In two variables, two positive eigenvalues give ellipses for level curves Q=c>0. One positive and one negative eigenvalue gives hyperbola-like level curves and saddle behavior. A zero eigenvalue creates a flat direction. These pictures are often more memorable than the terminology.

In optimization, positive definite Hessians mean the function curves upward in every direction near the critical point. Negative definite Hessians mean it curves downward in every direction. Indefinite Hessians mean there are both upward and downward directions, so the point is a saddle. This is the multivariable version of the one-variable second derivative test.

Orthogonal changes of coordinates are preferred because they do not distort lengths. If Q is orthogonal, then \|\mathbf{x}\|=\|\mathbf{y}\| when \mathbf{x}=Q\mathbf{y}. Thus diagonalizing a quadratic form by an orthogonal matrix rotates or reflects the coordinate system without stretching it. The shape changes its description, not its underlying geometry.
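The length-preserving property is easy to verify for the Q produced by eigendecomposition (random test matrix and seed are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)

# Take Q from the orthogonal diagonalization of a random symmetric matrix.
M = rng.standard_normal((4, 4))
_, Q = np.linalg.eigh((M + M.T) / 2)

# Q is orthogonal: Q^T Q = I.
assert np.allclose(Q.T @ Q, np.eye(4))

# Hence x = Q y has the same length as y.
y = rng.standard_normal(4)
assert np.isclose(np.linalg.norm(Q @ y), np.linalg.norm(y))
```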

Positive semidefinite forms deserve separate attention. They never take negative values, but they can be zero for nonzero vectors. Geometrically, this means there are flat directions. In statistics, covariance matrices are positive semidefinite because variances cannot be negative, but zero variance can occur in directions with exact linear dependence.

The spectral theorem is also the reason symmetric matrices are central in applications. Energy matrices, covariance matrices, Hessians, and many graph matrices are symmetric. Their orthonormal eigenvectors provide stable axes for analysis, and their real eigenvalues can be ordered and interpreted as curvatures, variances, frequencies, or connectivity measures depending on the model.

A useful final check is to evaluate the quadratic form on eigenvectors. If A\mathbf{q}_i=\lambda_i\mathbf{q}_i and \|\mathbf{q}_i\|=1, then

\mathbf{q}_i^TA\mathbf{q}_i=\lambda_i.

So eigenvalues are the values of the quadratic form on its principal unit directions.
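This check is one line per eigenpair with the matrix from worked example 1:

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [2.0, 2.0]])

lam, Q = np.linalg.eigh(A)

# Each column of Q is a unit eigenvector q_i, and q_i^T A q_i = lambda_i.
for i in range(2):
    q = Q[:, i]
    assert np.isclose(np.linalg.norm(q), 1.0)
    assert np.isclose(q @ A @ q, lam[i])
```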

That sentence is often the simplest interpretation of the spectral theorem for quadratic forms.

Connections