
Functions of Random Variables

Many useful random variables are built from other random variables. If $X$ is a measurement, then $X^2$, $\log X$, $aX+b$, $\max(X_1,\ldots,X_n)$, and $X+Y$ are all transformations. Probability theory gives systematic ways to derive the distribution of the transformed variable rather than guessing from simulation.

A Galton box diagram shows balls falling through pegs into bins.

Figure: A Galton box turns repeated random left-right choices into an approximate bell-shaped distribution. Image: Wikimedia Commons, Marcin Floryan, CC BY-SA 3.0.

Transformations are also the engine behind standardization, change of variables in continuous distributions, sums of independent variables, and many sampling distributions. The main tools are the CDF method, one-to-one density transformations, Jacobians, and convolution.

Definitions

If $Y=g(X)$, then $Y$ is a function of a random variable. Its distribution is determined by

$$F_Y(y)=P(Y\le y)=P(g(X)\le y).$$

This is called the CDF method. It is often the safest method because it works even when $g$ is not one-to-one.

If $X$ is continuous with density $f_X$ and $Y=g(X)$, where $g$ is differentiable and one-to-one with inverse $x=g^{-1}(y)$, then

$$f_Y(y)=f_X(g^{-1}(y))\left|\frac{d}{dy}g^{-1}(y)\right|.$$
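As a rough numerical check of the one-to-one formula, the sketch below takes $X\sim N(0,1)$ and $Y=e^X$. This choice is an illustrative assumption, not an example from the text; it gives $g^{-1}(y)=\log y$ with derivative $1/y$. The helper names `normal_pdf` and `density_of_y` are hypothetical.

```python
import numpy as np

def normal_pdf(x):
    """Standard normal density f_X."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def density_of_y(y):
    """Density of Y = exp(X) via f_X(g^{-1}(y)) * |d/dy g^{-1}(y)|."""
    y = np.asarray(y, dtype=float)
    inverse = np.log(y)       # g^{-1}(y) = log y
    derivative = 1.0 / y      # d/dy log y
    return normal_pdf(inverse) * np.abs(derivative)

rng = np.random.default_rng(0)
samples = np.exp(rng.standard_normal(200_000))

# Compare P(1 <= Y <= 2) from simulation with a trapezoid-rule integral
# of the density given by the formula.
grid = np.linspace(1.0, 2.0, 2001)
vals = density_of_y(grid)
formula_prob = np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(grid))
sim_prob = np.mean((samples >= 1.0) & (samples <= 2.0))
print("formula:", formula_prob, "simulation:", sim_prob)
```

The two numbers should agree up to simulation noise. Dropping the `derivative` factor would make the formula value visibly wrong.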

For a vector transformation $(U,V)=g(X,Y)$ with inverse $(x,y)=h(u,v)$, the joint density is

$$f_{U,V}(u,v)=f_{X,Y}(h(u,v))\left|J_h(u,v)\right|,$$

where $J_h$ is the determinant of the Jacobian matrix of the inverse transformation.
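To make the Jacobian formula concrete, here is a small sketch under assumed inputs: $X$ and $Y$ are independent standard normals and $(U,V)=(X+Y,\,X-Y)$. For this particular choice, $U$ and $V$ are known to be independent $N(0,2)$ variables, which gives a closed form to compare against. The function names are hypothetical.

```python
import numpy as np

def joint_xy(x, y):
    """Joint density of two independent standard normals."""
    return np.exp(-0.5 * (x**2 + y**2)) / (2.0 * np.pi)

def joint_uv(u, v):
    """Density of (U, V) = (X + Y, X - Y) via the inverse map and its Jacobian."""
    x = (u + v) / 2.0    # inverse transformation h(u, v)
    y = (u - v) / 2.0
    # Jacobian matrix of h is [[1/2, 1/2], [1/2, -1/2]], determinant -1/2.
    jacobian_det = -0.5
    return joint_xy(x, y) * abs(jacobian_det)

def normal_pdf(t, variance):
    """Density of N(0, variance)."""
    return np.exp(-0.5 * t**2 / variance) / np.sqrt(2.0 * np.pi * variance)

u, v = np.meshgrid(np.linspace(-3, 3, 7), np.linspace(-3, 3, 7))
# Product of two N(0, 2) densities, the known answer for this example.
expected = normal_pdf(u, 2.0) * normal_pdf(v, 2.0)
print("formula matches closed form:", np.allclose(joint_uv(u, v), expected))
```

Note that the determinant is negative here; only its absolute value enters the density.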

If $X$ and $Y$ are independent continuous random variables, the density of their sum $S=X+Y$ is the convolution

$$f_S(s)=\int_{-\infty}^{\infty} f_X(x)f_Y(s-x)\,dx.$$

For discrete variables,

$$P(S=s)=\sum_x P(X=x)P(Y=s-x).$$
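The discrete formula can be illustrated with the total of two fair six-sided dice, an example chosen here for illustration. `np.convolve` computes exactly the sum over $x$ of $P(X=x)P(Y=s-x)$ when the two PMFs are stored as arrays.

```python
import numpy as np

die = np.full(6, 1.0 / 6.0)        # P(X = 1), ..., P(X = 6)
pmf_sum = np.convolve(die, die)    # P(S = 2), ..., P(S = 12)
totals = np.arange(2, 13)

for total, prob in zip(totals, pmf_sum):
    print(total, round(float(prob), 4))
```

Index `k` of `pmf_sum` corresponds to the total `k + 2`, because each input array starts at face value 1.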

Key results

Linear transformations. If $Y=aX+b$ and $a\ne 0$, then

$$F_Y(y)=\begin{cases} F_X\left(\dfrac{y-b}{a}\right), & a>0,\\[4pt] 1-F_X\left(\dfrac{y-b}{a}\right), & a<0 \text{ (with endpoint care)}. \end{cases}$$

For $a<0$ the exact expression is $1-P\left(X<\frac{y-b}{a}\right)$, which equals the second line whenever $X$ has no point mass at $(y-b)/a$, in particular when $X$ is continuous.

For densities,

$$f_Y(y)=\frac{1}{|a|}f_X\left(\frac{y-b}{a}\right).$$

Means and variances transform as

$$E[aX+b]=aE[X]+b,\qquad \operatorname{Var}(aX+b)=a^2\operatorname{Var}(X).$$
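A quick simulation can sanity-check the mean and variance rules. The constants `a`, `b` and the Exponential input below are arbitrary illustrative choices, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
a, b = -3.0, 5.0                                # illustrative constants
x = rng.exponential(scale=2.0, size=200_000)    # E[X] = 2, Var(X) = 4
y = a * x + b

print("mean:", y.mean(), "theory:", a * 2.0 + b)
print("variance:", y.var(), "theory:", a**2 * 4.0)
```

The shift `b` moves the mean but leaves the variance untouched, and the sign of `a` disappears in the variance because it is squared.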

Monotone transformations. If $g$ is strictly increasing, then

$$F_Y(y)=F_X(g^{-1}(y)).$$

If $g$ is strictly decreasing, the inequality reverses: $F_Y(y)=P(X\ge g^{-1}(y))$, which equals $1-F_X(g^{-1}(y))$ when $X$ is continuous.

Order statistics. If $X_1,\ldots,X_n$ are independent, each with the same CDF $F$, then the maximum $M=\max_i X_i$ has CDF

$$F_M(m)=P(X_1\le m,\ldots,X_n\le m)=F(m)^n.$$

The minimum $L=\min_i X_i$ satisfies

$$P(L>l)=P(X_1>l,\ldots,X_n>l)=(1-F(l))^n,$$

so $F_L(l)=1-(1-F(l))^n$.
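The max and min formulas are easy to check by simulation. The sketch below assumes $n=5$ independent $\operatorname{Uniform}(0,1)$ draws, so $F(x)=x$ on $[0,1]$; the cutoffs `m_cut` and `l_cut` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 5, 100_000
draws = rng.uniform(0.0, 1.0, size=(reps, n))   # each row is one sample of size n
maxima = draws.max(axis=1)
minima = draws.min(axis=1)

m_cut, l_cut = 0.8, 0.1
p_max = np.mean(maxima <= m_cut)    # estimates F(m)^n
p_min = np.mean(minima > l_cut)     # estimates (1 - F(l))^n
print("P(M <= 0.8):", p_max, "theory:", m_cut**n)
print("P(L > 0.1):", p_min, "theory:", (1 - l_cut)**n)
```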

A practical transformation workflow is:

  1. Find the support of the new variable before doing algebra.
  2. Decide whether the transformation is one-to-one on that support.
  3. If it is one-to-one, use the inverse and derivative formula.
  4. If it is not one-to-one, split the support into one-to-one branches or use the CDF method.
  5. Check that the resulting density integrates to $1$.

For example, $Y=X^2$ is not one-to-one on $(-1,1)$, but it is one-to-one on $(-1,0)$ and $(0,1)$. The transformed density receives contributions from both branches. The CDF method automatically accounts for both branches, which is why it is often safer.
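Written out for this case, and assuming $X$ has a density $f_X$ on $(-1,1)$: the two inverse branches of $y=x^2$ are $x=\sqrt{y}$ and $x=-\sqrt{y}$, each with a derivative of absolute value $1/(2\sqrt{y})$. Applying the one-to-one formula on each branch and adding gives, for $0<y<1$,

```latex
f_Y(y) = f_X(\sqrt{y})\,\frac{1}{2\sqrt{y}} + f_X(-\sqrt{y})\,\frac{1}{2\sqrt{y}}
       = \frac{f_X(\sqrt{y}) + f_X(-\sqrt{y})}{2\sqrt{y}}.
```

If $X\sim\operatorname{Uniform}(-1,1)$, then $f_X=1/2$ on both branches and the expression reduces to $1/(2\sqrt{y})$.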

In multivariable transformations, the Jacobian factor measures local area distortion. A transformation may stretch a small rectangle in $(u,v)$-space into a larger or smaller region in $(x,y)$-space. The absolute determinant corrects the density so that probability mass is preserved. The sign of the determinant is irrelevant for probability, which is why the absolute value is used.

For sums, convolution is a distribution-level version of adding all possible ways to reach the same total. In the continuous case, the integral sweeps over every possible value $x$ of the first variable and pairs it with $s-x$ for the second variable.

Transformations are also how simulation turns simple random numbers into useful samples. Many pseudorandom generators first produce values that are approximately $\operatorname{Uniform}(0,1)$. If $F$ is a continuous CDF that is strictly increasing on its support (so that $F^{-1}$ exists) and $U\sim\operatorname{Uniform}(0,1)$, then

$$X=F^{-1}(U)$$

has CDF $F$. This is the inverse-transform method. It works especially well when the inverse CDF is available in closed form or can be computed numerically. For example, if $U\sim\operatorname{Uniform}(0,1)$, then $X=-\log(1-U)/\lambda$ is $\operatorname{Exponential}(\lambda)$.
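The Exponential case can be sketched in a few lines. The function name `sample_exponential` and the rate value are illustrative assumptions; the formula itself is the one stated above.

```python
import numpy as np

def sample_exponential(rate, size, rng):
    """Draw Exponential(rate) samples by inverting F(x) = 1 - exp(-rate * x)."""
    u = rng.uniform(0.0, 1.0, size=size)    # values in [0, 1)
    return -np.log1p(-u) / rate             # log1p(-u) computes log(1 - u)

rng = np.random.default_rng(4)
rate = 2.0
x = sample_exponential(rate, 200_000, rng)
print("sample mean:", x.mean(), "theory:", 1.0 / rate)
print("P(X <= 1):", np.mean(x <= 1.0), "theory:", 1.0 - np.exp(-rate))
```

Using `log1p` rather than `log(1 - u)` is a numerical-accuracy choice for small `u`; both express the same inverse CDF.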

Another common transformation is standardization. If $X$ has mean $\mu$ and standard deviation $\sigma>0$, then

$$Z=\frac{X-\mu}{\sigma}$$

has mean $0$ and variance $1$. Standardization does not usually make a variable normal; it only changes location and scale. $Z$ is standard normal only when the original $X$ was normal.
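To see that standardizing changes only location and scale, the sketch below standardizes an Exponential variable (an assumed example with $\mu=\sigma=3$) and looks at features that a standard normal would not share.

```python
import numpy as np

rng = np.random.default_rng(5)
mu, sigma = 3.0, 3.0                             # mean and sd of Exponential(scale=3)
x = rng.exponential(scale=3.0, size=200_000)
z = (x - mu) / sigma                             # standardized variable

skewness = np.mean(z**3)                         # third moment of z; 0 for a normal
left_tail = np.mean(z < -1.0)                    # about 0.159 for a standard normal
print("mean:", z.mean(), "variance:", z.var())
print("skewness:", skewness)
print("P(Z < -1):", left_tail)
```

The mean and variance land near 0 and 1, but the skewness stays near 2 and the left tail below $-1$ is empty, because $X\ge 0$ forces $Z\ge -1$.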

Absolute values, squares, maxima, and ratios deserve special care because they can merge many original outcomes into the same transformed value. Whenever a transformation collapses information, expect multiple inverse branches or support boundaries. A quick sketch of the function often prevents algebraic mistakes.

For ratios, also check where the denominator can be zero or close to zero, since this often creates heavy tails and may destroy moments.
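A standard illustration of this effect, used here as an assumed example, is the ratio of two independent standard normals, which follows a standard Cauchy distribution and has no finite mean.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200_000
numerator = rng.standard_normal(n)
denominator = rng.standard_normal(n)    # has positive density near zero
ratio = numerator / denominator         # standard Cauchy distributed

# Tail probability compared with the Cauchy value 1 - (2/pi) * arctan(10).
tail_sim = np.mean(np.abs(ratio) > 10.0)
tail_theory = 1.0 - (2.0 / np.pi) * np.arctan(10.0)
print("P(|R| > 10):", tail_sim, "theory:", tail_theory)

# Running means at increasing sample sizes; with no finite mean they
# need not settle down as n grows.
running_mean = np.cumsum(ratio) / np.arange(1, n + 1)
print(running_mean[[999, 9_999, 99_999, n - 1]])
```

Roughly 6% of the ratios exceed 10 in absolute value, whereas a standard normal variable essentially never does.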

Visual

| Task | Tool | Warning |
|---|---|---|
| $Y=aX+b$ | linear density formula | divide by $\vert a\vert$ |
| $Y=g(X)$ monotone | inverse transformation | support changes |
| $Y=X^2$ | CDF method or split branches | two preimages for $y>0$ |
| $S=X+Y$ independent | convolution | independence required |
| $(U,V)=g(X,Y)$ | Jacobian | use inverse Jacobian |

Worked example 1: squaring a uniform variable

Problem. Let $X\sim\operatorname{Uniform}(-1,1)$ and let $Y=X^2$. Find the CDF and PDF of $Y$.

Method.

  1. The support of $Y$ is $0\le Y\le 1$.

  2. For $0\le y\le 1$,

     $$F_Y(y)=P(Y\le y)=P(X^2\le y).$$

  3. Rewrite the event:

     $$X^2\le y \quad \Longleftrightarrow \quad -\sqrt{y}\le X\le \sqrt{y}.$$

  4. Since $X$ is uniform on an interval of length $2$,

     $$P(-\sqrt{y}\le X\le \sqrt{y})=\frac{2\sqrt{y}}{2}=\sqrt{y}.$$

  5. Therefore

     $$F_Y(y)=\begin{cases} 0, & y<0,\\ \sqrt{y}, & 0\le y\le 1,\\ 1, & y>1. \end{cases}$$

  6. Differentiate on $(0,1)$:

     $$f_Y(y)=\frac{d}{dy}\sqrt{y}=\frac{1}{2\sqrt{y}}.$$

Checked answer. $Y$ has density $f_Y(y)=1/(2\sqrt{y})$ on $0<y<1$. The density is high near zero because $x\mapsto x^2$ is flat near $x=0$: a comparatively wide range of $X$ values around zero is compressed into a narrow range of very small squares.

Worked example 2: sum of two independent uniforms

Problem. Let $X,Y$ be independent $\operatorname{Uniform}(0,1)$ random variables. Find the density of $S=X+Y$.

Method.

  1. Use convolution:

     $$f_S(s)=\int_{-\infty}^{\infty} f_X(x)f_Y(s-x)\,dx.$$

  2. Since both densities equal $1$ on $(0,1)$, the integrand is $1$ when

     $$0<x<1 \quad\text{and}\quad 0<s-x<1.$$

  3. The second inequality means

     $$s-1<x<s.$$

  4. Therefore $x$ must lie in the overlap

     $$(0,1)\cap(s-1,s).$$

  5. For $0<s<1$, the overlap is $(0,s)$, length $s$, so

     $$f_S(s)=s.$$

  6. For $1\le s<2$, the overlap is $(s-1,1)$, length $2-s$, so

     $$f_S(s)=2-s.$$

  7. Outside $0<s<2$, there is no overlap, so $f_S(s)=0$.

Checked answer.

$$f_S(s)=\begin{cases} s, & 0<s<1,\\ 2-s, & 1\le s<2,\\ 0, & \text{otherwise}. \end{cases}$$

The density is triangular and integrates to $1$.
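The triangular answer can also be checked by discretizing the convolution integral on a grid. This is a minimal sketch using a midpoint grid with an assumed step `dx`; for the constant Uniform density the grid sum happens to be exact up to rounding, which would not hold for general densities.

```python
import numpy as np

dx = 0.001
x = (np.arange(1000) + 0.5) * dx       # midpoints of 1000 cells covering (0, 1)
f = np.ones_like(x)                    # Uniform(0, 1) density on the grid

# Discrete approximation of the convolution integral: sum of f(x) f(s - x) dx.
f_sum = np.convolve(f, f) * dx
s = dx * (np.arange(f_sum.size) + 1)   # totals x_i + x_j that each entry represents

triangle = np.where(s < 1.0, s, 2.0 - s)
max_error = np.max(np.abs(f_sum - triangle))
total_mass = f_sum.sum() * dx
print("max abs error:", max_error)
print("integral:", total_mass)
```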

Code

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
n = 200_000

# Example 1: Y = X^2 for X uniform(-1, 1).
x = rng.uniform(-1, 1, size=n)
y = x**2
print("P(Y <= 0.25) simulation:", np.mean(y <= 0.25))
print("P(Y <= 0.25) theory:", np.sqrt(0.25))

# Example 2: sum of two uniforms.
u = rng.uniform(0, 1, size=n)
v = rng.uniform(0, 1, size=n)
s = u + v
print("mean of sum:", s.mean(), "(theory: 1.0)")
print("P(0.5 <= S <= 1.5):", np.mean((s >= 0.5) & (s <= 1.5)), "(theory: 0.75)")

# Optional plot when running locally.
plt.hist(s, bins=80, density=True, alpha=0.5)
grid = np.linspace(0, 2, 200)
density = np.where(grid < 1, grid, 2 - grid)
plt.plot(grid, density, color="black")
plt.show()

Common pitfalls

  • Forgetting the derivative factor in one-to-one transformations.
  • Using the inverse transformation formula when the function is not one-to-one without splitting branches.
  • Losing support restrictions after a transformation.
  • Treating convolution as valid without independence.
  • Confusing the Jacobian of the forward transformation with the Jacobian of the inverse transformation.
  • Forgetting endpoint behavior for decreasing transformations and CDF calculations.

Connections