Sensor Models#
NCore supports multiple sensor models for representing camera and lidar intrinsics and distortion parameters. Each model provides different parameterizations suited for various sensor types and use cases.
Camera Models#
The following camera models are supported:
FTheta Camera Model - NVIDIA’s FTheta camera model with polynomial distortion parameterization, suitable for both perspective and wide field-of-view cameras
Ideal Pinhole Camera Model - Efficient distortion-free pinhole camera model, also used as the target for rectification
OpenCV Pinhole Camera Model - Standard pinhole camera model with rational radial, tangential, and thin prism distortion coefficients
OpenCV Fisheye Camera Model - OpenCV’s fisheye camera model with polynomial distortion for ultra-wide angle and fisheye lenses
Camera Model Parameterizations#
Camera models may optionally include external distortion parameters to account for distortion sources outside the camera itself (e.g., windshields). See External Distortion Models for details.
The camera_model_parameters dictionary will unconditionally contain:
resolution- width and height of the image in pixels (uint32, [2,])shutter_type- shutter type of the camera’s imaging sensor (str, one of [ROLLING_TOP_TO_BOTTOM, ROLLING_LEFT_TO_RIGHT, ROLLING_BOTTOM_TO_TOP, ROLLING_RIGHT_TO_LEFT, GLOBAL])
FTheta Camera Model
If camera_model_type = 'ftheta' the following intrinsic parameters
will additionally be available in camera_model_parameters:
principal_point- u and v coordinate of the principal point, following the NVIDIA default convention for FTheta camera models in which the pixel indices represent the center of the pixel (not the top-left corners). NOTE: principal point coordinates will be adapted internally in camera model APIs to reflect the image coordinate conventions (float32, [2,])reference_poly- indicating which of the two polynomials is the reference polynomial - the other polynomial is only an approximation of the inverse of the reference polynomial (str, one of [PIXELDIST_TO_ANGLE, ANGLE_TO_PIXELDIST])pixeldist_to_angle_poly- coefficients of the backward distortion polynomial (conditionally approximate, depending onreference_poly), mapping pixel-distances to angles [rad] (float32, [6,])angle_to_pixeldist_poly- coefficients of the forward distortion polynomial (conditionally approximate, depending onreference_poly), mapping angles [rad] to pixel-distances (float32, [6,])max_angle- maximal extrinsic ray angle [rad] with the principal direction (float32)linear_cde- Coefficients of the constrained linear term \(\begin{bmatrix} c & d \\ e & 1 \end{bmatrix}\) transforming between sensor coordinates (in mm) to image coordinates (in px) (float32, [3,])
Mathematical Model:
The FTheta model projects 3D camera rays to image coordinates through the following steps:
Ray to Angle: Convert camera ray direction [x, y, z] to angle θ:
\[\theta = \arctan2(\sqrt{x^2 + y^2}, z)\]Angle to Pixel Distance: Apply the forward polynomial f(θ):
\[\delta = f(\theta) = c_0 + c_1\theta + c_2\theta^2 + c_3\theta^3 + c_4\theta^4 + c_5\theta^5\]Linear Transformation: Apply constrained linear transformation A and add principal point:
\[\begin{split}\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} c & d \\ e & 1 \end{bmatrix} \frac{\delta}{\sqrt{x^2 + y^2}} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} u_0 \\ v_0 \end{bmatrix}\end{split}\]
The inverse operation uses the backward polynomial (either reference or approximate inverse).
Example: For a typical FTheta model, both forward and backward polynomials contain 6 coefficients each (float32, [6,]).
Ideal Pinhole Camera Model
If camera_model_type = 'ideal-pinhole' the following intrinsic parameters
will additionally be available in camera_model_parameters:
principal_point- u and v coordinate of the principal point, following the image coordinate conventions (float32, [2,])focal_length- focal lengths in u and v direction, resp., mapping normalized camera coordinates to image coordinates relative to the principal point (float32, [2,])
Mathematical Model:
The ideal pinhole model is a distortion-free perspective projection. It is the most efficient camera model (no distortion polynomial evaluation, no iterative undistortion) and the natural target for rectification.
Normalize: Convert camera ray
[x, y, z]to normalized coordinates:\[x' = \frac{x}{z}, \quad y' = \frac{y}{z}\]Project to Image: Apply focal length and principal point:
\[\begin{split}\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} f_u x' + u_0 \\ f_v y' + v_0 \end{bmatrix}\end{split}\]
Unprojection is the closed-form inverse \(x' = (u - u_0) / f_u\),
\(y' = (v - v_0) / f_v\), followed by normalization of [x', y', 1].
Note
An OpenCV Pinhole whose distortion
coefficients are all zero (and which carries no external distortion) is
equivalent to an ideal pinhole and is evaluated through the same closed-form
path internally. The ideal-pinhole and opencv-pinhole model types are
siblings: an OpenCV pinhole is not an instance of an ideal pinhole.
OpenCV Pinhole Camera Model
If camera_model_type = 'opencv-pinhole' the following intrinsic parameters
will additionally be available in camera_model_parameters:
principal_point- u and v coordinate of the principal point, following the image coordinate conventions (float32, [2,])focal_length- focal lengths in u and v direction, resp., mapping (distorted) normalized camera coordinates to image coordinates relative to the principal point (float32, [2,])radial_coeffs- radial distortion coefficients[k1,k2,k3,k4,k5,k6]parameterizing the rational radial distortion factor \(\frac{1 + k_1r^2 + k_2r^4 + k_3r^6}{1 + k_4r^2 + k_5r^4 + k_6r^6}\) for squared norms \(r^2\) of normalized camera coordinates (float32, [6,])tangential_coeffs- tangential distortion coefficients[p1,p2]parameterizing the tangential distortion components \(\begin{bmatrix} 2p_1x'y' + p_2 \left(r^2 + 2{x'}^2 \right) \\ p_1 \left(r^2 + 2{y'}^2 \right) + 2p_2x'y' \end{bmatrix}\) for normalized camera coordinates \(\begin{bmatrix} x' \\ y' \end{bmatrix}\) (float32, [2,])thin_prism_coeffs- thins prism distortion coefficients[s1,s2,s3,s4]parameterizing the thin prism distortion components \(\begin{bmatrix} s_1r^2 + s_2r^4 \\ s_3r^2 + s_4r^4 \end{bmatrix}\) for squared norms \(r^2\) of normalized camera coordinates (float32, [4,])
Mathematical Model:
The OpenCV Pinhole model applies radial, tangential, and thin prism distortions:
Normalize: Convert camera ray [x, y, z] to normalized coordinates:
\[x' = \frac{x}{z}, \quad y' = \frac{y}{z}, \quad r^2 = {x'}^2 + {y'}^2\]Radial Distortion: Apply rational radial distortion factor:
\[d_r = \frac{1 + k_1r^2 + k_2r^4 + k_3r^6}{1 + k_4r^2 + k_5r^4 + k_6r^6}\]Tangential Distortion:
\[\begin{split}\begin{bmatrix} \Delta x_t \\ \Delta y_t \end{bmatrix} = \begin{bmatrix} 2p_1x'y' + p_2(r^2 + 2{x'}^2) \\ p_1(r^2 + 2{y'}^2) + 2p_2x'y' \end{bmatrix}\end{split}\]Thin Prism Distortion:
\[\begin{split}\begin{bmatrix} \Delta x_p \\ \Delta y_p \end{bmatrix} = \begin{bmatrix} s_1r^2 + s_2r^4 \\ s_3r^2 + s_4r^4 \end{bmatrix}\end{split}\]Project to Image: Apply focal length and principal point:
\[\begin{split}\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} f_u(x'd_r + \Delta x_t + \Delta x_p) + u_0 \\ f_v(y'd_r + \Delta y_t + \Delta y_p) + v_0 \end{bmatrix}\end{split}\]
Example: Typical parameter sizes: radial_coeffs [6,], tangential_coeffs [2,], thin_prism_coeffs [4,], all float32.
OpenCV Fisheye Camera Model
If camera_model_type = 'opencv-fisheye' the following intrinsic parameters
will additionally be available in camera_model_parameters:
principal_point- u and v coordinate of the principal point, following the image coordinate conventions (float32, [2,])focal_length- focal lengths in u and v direction, resp., mapping (distorted) normalized camera coordinates to image coordinates relative to the principal point (float32, [2,])radial_coeffs- radial distortion coefficients representing OpenCV-style[k1,k2,k3,k4]parameters of the fisheye distortion polynomial \(\theta(1 + k_1\theta^2 + k_2\theta^4 + k_3\theta^6 + k_4\theta^8)\) for extrinsic camera ray angles \(\theta\) with the principal direction (float32, [4,])max_angle- maximal extrinsic ray angle [rad] with the principal direction (float32)
Mathematical Model:
The OpenCV Fisheye model uses a fisheye distortion polynomial:
Ray to Angle: Convert camera ray direction [x, y, z] to angle θ and normalized direction:
\[\theta = \arctan2(\sqrt{x^2 + y^2}, z)\]\[x_n = \frac{x}{\sqrt{x^2 + y^2}}, \quad y_n = \frac{y}{\sqrt{x^2 + y^2}}\]Fisheye Distortion Polynomial: Apply distortion to angle:
\[\delta(\theta) = \theta(1 + k_1\theta^2 + k_2\theta^4 + k_3\theta^6 + k_4\theta^8)\]Project to Image: Scale by focal length and add principal point:
\[\begin{split}\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} f_u \delta(\theta) x_n + u_0 \\ f_v \delta(\theta) y_n + v_0 \end{bmatrix}\end{split}\]
The inverse operation uses Newton-Raphson iteration to invert the ninth-degree polynomial. Example: The radial_coeffs parameter contains [k₁, k₂, k₃, k₄] (float32, [4,]).
External Distortion Models#
Camera models can optionally include external distortion sources that affect rays before they reach the camera lens (e.g., windshields). If present, the camera_model_parameters dictionary will contain:
external_distortion_parameters- parameters for the external distortion model (optional, may be None)
When external_distortion_parameters is present, it will contain:
external_distortion_type- type of external distortion (str)
Bivariate Windshield Model
If external_distortion_type = 'bivariate-windshield', this models optical distortion caused by a vehicle’s windshield. This model is only applicable when the camera’s entire field of view projects through the windshield.
Mathematical Model:
The distortion operates on spherical angle representations of sensor rays with direction [x, y, z]:
Convert to Spherical Angles:
\[\phi = \arcsin\left(\frac{x}{\sqrt{x^2 + y^2 + z^2}}\right) \quad \text{(horizontal angle)}\]\[\theta = \arcsin\left(\frac{y}{\sqrt{x^2 + y^2 + z^2}}\right) \quad \text{(vertical angle)}\]Apply Bivariate Polynomial Distortion:
Distortion is applied via separate polynomials for horizontal and vertical components:
\[\phi' = P_h(\phi, \theta)\]\[\theta' = P_v(\phi, \theta)\]Where each polynomial has the bivariate form (example for order N=2):
\[P(\phi, \theta) = c_0 + c_1\phi + c_2\phi^2 + (c_3 + c_4\phi)\theta + c_5\theta^2\]General form for order N:
\[P(\phi, \theta) = \sum_{j=0}^{N} \left(\sum_{i=0}^{N-j} c_{i,j}\phi^i\right)\theta^j\]Reconstruct Distorted Ray:
\[[x', y', z'] = \left[\sin(\phi'), \sin(\theta'), \sqrt{1 - \sin^2(\phi') - \sin^2(\theta')}\right]\]
Parameters:
The following parameters will be available in external_distortion_parameters:
reference_poly- indicates which polynomial is the reference (str, one of [FORWARD, BACKWARD]). The reference polynomial is exact; the other is an approximate inverse computed during calibration.horizontal_poly- coefficients for forward horizontal distortion polynomial P_h (float32, [(N+1)·(N+2)/2,])vertical_poly- coefficients for forward vertical distortion polynomial P_v (float32, [(M+1)·(M+2)/2,])horizontal_poly_inverse- coefficients for backward horizontal distortion (float32, [(N+1)·(N+2)/2,])vertical_poly_inverse- coefficients for backward vertical distortion (float32, [(M+1)·(M+2)/2,])
Example: For a 2nd-order bivariate polynomial (N=2), the coefficient arrays contain 6 elements: [c₀, c₁, c₂, c₃, c₄, c₅]. For a 3rd-order polynomial (N=3), they contain 10 elements. The number of coefficients for order N is given by: (N+1)·(N+2)/2
Direction:
Forward distortion (distort_camera_rays): transforms rays from external (world) to internal (camera sensor)
Backward distortion (undistort_camera_rays): transforms rays from internal (camera sensor) to external (world)
Rectification#
Rectification maps imagery from a source camera into a target camera’s image domain. It is a generic “from-camera -> to-camera” intrinsic remap within a shared camera frame; the most common use is mapping a distorted source camera to a distortion-free ideal pinhole target, but the target may itself be distorted.
Deriving an ideal target (``IdealPinholeCameraModelParameters.from_source``)
The IdealPinholeCameraModelParameters.from_source(source, target_fov=None)
factory builds an ideal pinhole approximating
any supported source camera. It extracts the source’s paraxial pinhole geometry
(focal length, principal point, resolution), preserving the source focal aspect
ratio:
Ideal / OpenCV pinhole / OpenCV fisheye: the model’s own
focal_length(anisotropicfx != fypreserved; OpenCV pinhole distortion is dropped).FTheta: the first-order coefficient of the angles-to-pixeldistances (forward) polynomial scaled by the linear term, i.e.
[c1 * c, c1]with \(c_1 = \mathrm{angle\_to\_pixeldist\_poly}[1]\) and \(c = \mathrm{linear\_cde}[0]\), since \(\mathrm{d}\,\delta/\mathrm{d}\theta|_{\theta=0} = c_1\) and \(r = f\tan\theta \approx f\theta\) near \(\theta = 0\) (the linear term’s shear components are not representable by a pinhole and are dropped).
The target_fov argument selects the angular extent of the rectified pinhole
as a full field-of-view angle [rad]:
None: use the source’s natural field of view (the paraxial focal). A pinhole focal always yields a field of view strictly below 180 degrees, so for wide fisheye / omnidirectional cameras this is a narrow rectilinear central window of the captured scene.float: an isotropic target full field of view, preserving the source focal aspect ratio (the most binding image axis lands exactly at this value).np.ndarrayof shape[2,]: per-axis target full field of view[fov_x, fov_y].
target_fov selects which rays the pinhole covers, not a magnification:
because a pinhole maps angle to pixel distance as \(r = f\tan\theta\),
different values yield genuinely different views (the periphery stretches
increasingly for wider fields of view; the optical axis stays fixed). Widening
past the source’s captured field of view is allowed and yields invalid
(out-of-source) regions, which the Rectificator zeroes.
from_source raises ValueError if a requested field of view cannot be
represented by a pinhole (any axis at or beyond 180 degrees), and TypeError
for an unsupported source model.
Use IdealPinholeCameraModelParameters.natural_fov(source) (per-axis full FOV
np.ndarray) to discover a sensible value to scale, e.g.:
import numpy as np
from ncore.data import IdealPinholeCameraModelParameters
from ncore.sensors import CameraModel, Rectificator
source_params = source_model.get_parameters()
# Default rectification (source's natural field of view)
target_params = IdealPinholeCameraModelParameters.from_source(source_params)
# Widen by 20% (e.g. to recover more of a wide lens) or set an explicit FOV
target_params = IdealPinholeCameraModelParameters.from_source(
source_params, target_fov=1.2 * IdealPinholeCameraModelParameters.natural_fov(source_params)
)
target_params = IdealPinholeCameraModelParameters.from_source(source_params, target_fov=np.radians(90.0))
target = CameraModel.from_parameters(target_params)
rect = Rectificator(source_model, target)
Rectificator
The Rectificator (ncore.sensors.Rectificator) is constructed from a
source/target CameraModel pair and builds a backward sample map
(target -> source) once:
each target pixel is unprojected to a ray with the target model,
the ray is re-projected to a source image coordinate with the source model.
It exposes:
sample_map-[H, W, 2]source image coordinates to sample for each target pixel (usable directly with custom / GPU / OpenCV resampling).valid_mask-[H, W]boolean mask of target pixels whose ray projects inside the source image.apply(image, mode="bilinear", padding_mode="zeros")- resamples a source image ([H, W],[H, W, C]or[N, H, W, C], channels-last) into the target domain, zeroing invalid pixels.modeselects the interpolation ("bilinear","nearest"or"bicubic") andpadding_modethe out-of-bounds sampling behaviour ("zeros","border"or"reflection"); both are forwarded totorch.nn.functional.grid_sample.source_points_to_target(points)/target_points_to_source(points)- map sparse continuous image points between the two cameras.
Lidar Models#
NCore supports the following lidar models:
Row-Offset Structured Spinning Lidar - Structured spinning lidar model with per-row azimuth offsets (compatible with sensors like Hesai P128)
Lidar Model Parameterizations#
The lidar_model_parameters dictionary contains model-specific parameters for lidar sensors. Currently, the supported model type is row-offset-spinning.
Row-Offset Structured Spinning Lidar Model
If lidar_model_type = 'row-offset-spinning', this models structured spinning lidar sensors with per-row azimuth offsets (compatible with sensors like Hesai P128).
Mathematical Model:
The Row-Offset Structured Spinning Lidar model parameterizes lidar returns using per-row elevation angles and per-column azimuth angles with row-specific offsets.
Notation:
α (alpha): elevation angle in radians
β (beta): azimuth angle in radians
E: row_elevations_rad array - elevation angles per row
B: column_azimuths_rad array - base azimuth angles per column
Δ (Delta): row_azimuth_offsets_rad array - azimuth offsets per row (optional, can be zero if no row offsets)
i: row index
j: column index
Element to Angles: For element (i, j):
\[\alpha = E[i]\]\[\beta = B[j] + \Delta[i]\]Angles to Sensor Ray: Convert spherical angles to 3D ray direction:
\[x = \cos(\beta) \cdot \cos(\alpha)\]\[y = \sin(\beta) \cdot \cos(\alpha)\]\[z = \sin(\alpha)\]Sensor Ray to Angles: Convert 3D ray [x, y, z] to angles:
\[\alpha = \arcsin(z)\]\[\beta = \arctan2(y, x)\]
Parameters:
The following parameters will be available in lidar_model_parameters:
row_elevations_rad- elevation angle for each row (float32, [n_rows,])column_azimuths_rad- base azimuth angle for each column (float32, [n_columns,])row_azimuth_offsets_rad- azimuth offset for each row (float32, [n_rows,])spinning_frequency_hz- rotation frequency in Hz (float32)spinning_direction- rotation direction (str, one of [‘cw’, ‘ccw’])n_rows- number of vertical laser channels (int)n_columns- number of azimuth divisions per revolution (int)
Example: A Hesai P128 lidar would have n_rows=128 (for 128 laser channels), with row_elevations_rad containing 128 elevation angles and row_azimuth_offsets_rad containing 128 azimuth offsets.