
Sensors, Cameras, LiDAR, Radar, and IMU

Autonomous driving is built on imperfect measurements. Cameras see texture and color, lidar gives accurate 3D range, radar detects velocity and survives some weather, IMUs measure short-term motion, GNSS anchors the vehicle globally, and ultrasonic sensors cover close-range parking distances. No single sensor solves the problem alone; each has blind spots, failure modes, timing issues, and calibration requirements that shape the rest of the stack.

This page is the sensor-level foundation for perception, depth estimation, sensor fusion, localization, and safety analysis. The practical lesson is that sensor choice is never just a bill-of-materials decision; it changes which algorithms are credible and which hazards must be mitigated.

Definitions

A camera measures image irradiance on a pixel grid. Automotive cameras are usually monocular RGB or RCCB sensors with wide dynamic range. A rolling shutter exposes rows at slightly different times, which can bend fast-moving objects or distort images during vibration. A global shutter exposes all pixels at once, reducing motion distortion but often with cost or sensitivity trade-offs.

LiDAR stands for light detection and ranging. It estimates distance by measuring the time of flight, phase shift, or frequency difference of emitted light. Mechanical spinning lidar, MEMS lidar, flash lidar, optical phased-array lidar, and FMCW lidar differ in scanning pattern, range, velocity measurement, cost, and robustness.

Radar uses radio waves, usually frequency-modulated continuous-wave signals in automotive systems. FMCW radar estimates range from beat frequency, radial velocity from Doppler shift, and angle from antenna arrays. MIMO radar uses multiple transmit and receive antennas to synthesize a larger virtual aperture.

An IMU contains accelerometers and gyroscopes. It measures specific force and angular velocity at high rate, but bias and noise cause dead-reckoning drift. IMUs are powerful in the short term and weak in the long term unless fused with GNSS, wheel odometry, lidar, camera, or map constraints.

GNSS is the satellite navigation family that includes GPS, Galileo, GLONASS, BeiDou, and related augmentation systems. It provides global position but can be degraded by urban canyons, multipath, tunnels, foliage, spoofing, or jamming.

Ultrasonic sensors use acoustic time of flight at short range. They are cheap and useful for parking, curb detection, and near-field obstacle sensing, but they have low angular resolution and are sensitive to surface geometry.

Extrinsic calibration estimates the rigid transform between sensor coordinate frames. Intrinsic calibration models a sensor's internal geometry, such as camera focal length and distortion. Time calibration aligns timestamps. A technically good sensor suite can still fail if space and time alignment are poor.
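As a small illustration of how an extrinsic calibration is used, the sketch below assembles a hypothetical 4x4 rigid transform from a rotation and a translation and moves a lidar point into the camera frame; the rotation and offsets are illustrative values, not a real calibration.

```python
import numpy as np

def make_extrinsic(R, t):
    """Assemble a 4x4 rigid transform from a 3x3 rotation and a 3-vector translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical extrinsics: the camera is yawed 2 degrees relative to the lidar
# and mounted 0.10 m to the side and 0.20 m lower.
yaw = np.radians(2.0)
R_cam_lidar = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
                        [np.sin(yaw),  np.cos(yaw), 0.0],
                        [0.0,          0.0,         1.0]])
T_cam_lidar = make_extrinsic(R_cam_lidar, np.array([0.10, 0.0, -0.20]))

point_lidar = np.array([15.0, 1.0, 0.5, 1.0])  # homogeneous point in the lidar frame
point_cam = T_cam_lidar @ point_lidar          # the same point in the camera frame
print(point_cam[:3])
```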

Key results

Range sensors often use time-of-flight reasoning. If a lidar pulse returns after round-trip time $\Delta t$, then range is approximately:

$$r = \frac{c \Delta t}{2},$$

where $c$ is the speed of light in air. The division by two appears because the light travels to the object and back.
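A minimal sketch of this calculation, assuming the round-trip time is already expressed in seconds (the function name is illustrative):

```python
def lidar_range_from_tof(delta_t_s, c=3.0e8):
    """Range from a pulsed time-of-flight measurement: out-and-back path divided by two."""
    return c * delta_t_s / 2.0

print(lidar_range_from_tof(80e-9))  # an 80 ns round trip corresponds to about 12 m
```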

FMCW radar and FMCW lidar use a frequency sweep. For a linear chirp with slope $S$ in hertz per second, the beat frequency $f_b$ is related to range:

$$r \approx \frac{c f_b}{2S}.$$

For radar radial velocity, the Doppler shift is approximately:

$$v_r \approx \frac{\lambda f_D}{2},$$

where $\lambda$ is wavelength and $f_D$ is Doppler frequency. The sign depends on convention. Radar measures radial velocity directly, which is one reason it remains valuable even when lidar and cameras are strong.
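A minimal sketch of both relations, with chirp slope and wavelength values chosen purely for illustration rather than taken from any particular radar:

```python
def fmcw_range(beat_freq_hz, chirp_slope_hz_per_s, c=3.0e8):
    """Range from the beat frequency of a linear FMCW chirp: r ~ c * f_b / (2 S)."""
    return c * beat_freq_hz / (2.0 * chirp_slope_hz_per_s)

def doppler_radial_velocity(doppler_freq_hz, wavelength_m):
    """Radial velocity from the Doppler shift: v_r ~ lambda * f_D / 2."""
    return wavelength_m * doppler_freq_hz / 2.0

# Illustrative numbers: a 20 MHz/us chirp slope and a 77 GHz radar (~3.9 mm wavelength).
print(fmcw_range(beat_freq_hz=2.0e6, chirp_slope_hz_per_s=20e12))            # ~15 m
print(doppler_radial_velocity(doppler_freq_hz=5.1e3, wavelength_m=3.9e-3))   # ~10 m/s
```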

Camera geometry maps a 3D point $P = (X, Y, Z)$ into pixel coordinates through intrinsics:

$$\begin{aligned} u &= f_x \frac{X}{Z} + c_x, \\ v &= f_y \frac{Y}{Z} + c_y. \end{aligned}$$

This projection gives rich semantics but loses metric depth unless paired with stereo, motion, learned priors, lidar, or other constraints. Camera-only systems must therefore recover 3D structure indirectly.

IMU integration illustrates drift. With acceleration bias $b_a$, even a small constant error accumulates quadratically in position:

$$\Delta x(t) \approx \frac{1}{2} b_a t^2.$$

That is why IMU-only navigation is not enough for road driving over long intervals, even though IMUs are essential for high-rate motion propagation.
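A minimal sketch of that quadratic growth, using an assumed accelerometer bias:

```python
def imu_position_drift(bias_mps2, t_s):
    """Position error from double-integrating a constant accelerometer bias: 0.5 * b_a * t^2."""
    return 0.5 * bias_mps2 * t_s ** 2

# An assumed 0.01 m/s^2 bias is tiny, yet after a minute of dead reckoning
# it has grown into an 18 m position error.
for t in (1.0, 10.0, 60.0):
    print(t, imu_position_drift(0.01, t))
```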

Sensor failure modes must be treated as first-class design inputs. Cameras can saturate under glare, struggle at night, or lose contrast in fog. Lidar can degrade in heavy rain, snow, dust, or with low-reflectivity targets. Radar can produce ghost detections, multipath, and poor elevation discrimination. GNSS can jump or drift under multipath. IMUs drift silently. Ultrasonic sensors can miss soft or angled surfaces. The stack must know both what was measured and how much to trust it.

Visual

| Sensor | Measures | Strengths | Weaknesses | Weather sensitivity | Typical AV role |
| --- | --- | --- | --- | --- | --- |
| Camera | Appearance, texture, color, bearing | Semantic detail, lane markings, signs, low cost | Depth ambiguity, glare, low light, rolling shutter | Fog, rain, snow, sun glare | Detection, segmentation, traffic lights, lanes |
| LiDAR | 3D range and intensity | Accurate geometry, direct depth | Cost, sparse returns, adverse weather artifacts | Rain, snow, fog, dust | 3D detection, occupancy, localization |
| Radar | Range, radial velocity, angle | Velocity, long range, some weather robustness | Lower angular resolution, ghosts, multipath | Usually better than camera and lidar | Tracking, cut-in detection, redundancy |
| IMU | Acceleration, angular velocity | High rate, works in darkness and tunnels | Bias drift, no absolute position | Low weather sensitivity | Dead reckoning, stabilization, localization fusion |
| GNSS | Global position and time | Absolute reference, common infrastructure | Multipath, outages, spoofing, jamming | Urban canyons and foliage matter | Global pose prior, map alignment |
| Ultrasonic | Short-range distance | Cheap, useful near vehicle | Low resolution, short range | Rain, dirt, angled surfaces | Parking, curb and near-field checks |

Worked example 1: Computing lidar range from time of flight

Problem: A pulsed lidar emits light and receives a return after $\Delta t = 80$ ns. Estimate the object range. Use $c = 3.0 \times 10^8$ m/s.

  1. Write the time-of-flight equation:
     $$r = \frac{c \Delta t}{2}.$$
  2. Convert nanoseconds to seconds:
     $$80\ \mathrm{ns} = 80 \times 10^{-9}\ \mathrm{s}.$$
  3. Substitute:
     $$\begin{aligned} r &= \frac{(3.0 \times 10^8)(80 \times 10^{-9})}{2} = \frac{24}{2} = 12\ \mathrm{m}. \end{aligned}$$

Answer: the target is approximately 12 m away.

Check: Light travels 24 m in 80 ns. Because the path is out and back, the one-way distance is 12 m.

Worked example 2: Estimating camera rolling-shutter displacement

Problem: A forward camera has a rolling-shutter readout time of 20 ms from top row to bottom row. During the readout, the vehicle yaws at $15^\circ$/s over a bumpy road. Estimate the yaw difference between the top and bottom rows.

  1. Convert the yaw rate to degrees per millisecond:
     $$15^\circ/\mathrm{s} = 0.015^\circ/\mathrm{ms}.$$
  2. Multiply by the readout time:
     $$\Delta \psi = 0.015^\circ/\mathrm{ms} \times 20\ \mathrm{ms} = 0.30^\circ.$$
  3. Interpret the result. A third of a degree may look small, but for long-range perception a tiny angular error can shift lane boundaries, signs, or vehicles by several pixels. The displacement grows with focal length and range.

Answer: the top and bottom rows can represent scene geometry under poses that differ by about $0.30^\circ$.

Check: If the yaw rate doubles or the readout time doubles, the distortion doubles. This linear dependence is why global shutter and motion compensation matter for fast motion and long-range detection.
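A minimal sketch of how that yaw difference translates into pixels, assuming a pinhole camera and the same 900 px horizontal focal length used in the Code section below:

```python
import math

def yaw_to_pixel_shift(delta_yaw_deg, fx_px):
    """Approximate horizontal pixel shift near the image center for a small yaw change.

    For a pinhole camera, a point near the optical axis moves by roughly
    fx * tan(delta_yaw) pixels when the camera yaws by delta_yaw.
    """
    return fx_px * math.tan(math.radians(delta_yaw_deg))

# 0.30 degrees of yaw across the readout with fx = 900 px (assumed)
print(yaw_to_pixel_shift(0.30, 900.0))  # roughly 4.7 pixels between top and bottom rows
```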

Code

```python
import numpy as np

def project_points(points_vehicle, K, T_cam_vehicle):
    """Project vehicle-frame 3D points into a camera image."""
    n = points_vehicle.shape[0]
    # Lift to homogeneous coordinates and transform into the camera frame.
    homog = np.hstack([points_vehicle, np.ones((n, 1))])
    points_cam = (T_cam_vehicle @ homog.T).T[:, :3]

    # Clamp depth to avoid division by zero, then apply the pinhole intrinsics.
    z = np.maximum(points_cam[:, 2], 1e-6)
    pixels_h = (K @ points_cam.T).T
    uv = pixels_h[:, :2] / z[:, None]
    # Return pixel coordinates and a mask of points in front of the camera.
    return uv, points_cam[:, 2] > 0.0

K = np.array([[900.0, 0.0, 640.0],
              [0.0, 900.0, 360.0],
              [0.0, 0.0, 1.0]])
T_cam_vehicle = np.eye(4)
points = np.array([[5.0, 0.0, 20.0],
                   [2.0, -1.0, 10.0],
                   [1.0, 0.5, -3.0]])

uv, in_front = project_points(points, K, T_cam_vehicle)
print(uv)
print(in_front)
```

Common pitfalls

  • Comparing sensors by one headline range number. Field of view, angular resolution, latency, false positives, calibration stability, and failure modes matter as much as range.
  • Ignoring timestamp alignment. A 50 ms timing error at 20 m/s is a 1 m spatial error before perception even begins (see the sketch after this list).
  • Treating radar as low-quality lidar. Radar provides Doppler velocity and has different noise statistics, so it should not be forced into a lidar-only mental model.
  • Assuming GNSS is always globally accurate. Multipath can produce plausible but wrong positions, especially near tall buildings.
  • Trusting calibration forever. Sensor mounts move under thermal cycling, vibration, impacts, maintenance, and manufacturing variation.
  • Forgetting occlusion. A perfect sensor cannot see through a truck, building, hill crest, or parked vehicle without cooperative perception or prior map information.
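A minimal sketch of the timestamp-alignment arithmetic from the second pitfall, assuming constant speed and a single per-sensor time offset:

```python
def timestamp_error_to_position_error(speed_mps, time_offset_s):
    """Spatial error from a stale timestamp: at constant speed, the observation
    is misplaced by roughly speed * offset along the direction of travel."""
    return speed_mps * time_offset_s

# 50 ms of misalignment at 20 m/s, the numbers quoted in the pitfall above
print(timestamp_error_to_position_error(20.0, 0.050))  # 1.0 m
```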

Connections