nuScenes Dataset#
The NCore nuScenes tool converts data from the nuScenes autonomous driving dataset into NCore V4 format. All dataset versions are supported (v1.0-mini, v1.0-trainval, v1.0-test).
Conventions#
The nuScenes dataset provides data from 6 cameras, 1 lidar, and 5 radars. The converter handles all sensor modalities and 3D annotations.
Camera Sensors#
Front (camera_front) – 1600x900, 70 deg FOV
Front Left (camera_front_left) – 1600x900, 70 deg FOV
Front Right (camera_front_right) – 1600x900, 70 deg FOV
Back (camera_back) – 1600x900, 110 deg FOV
Back Left (camera_back_left) – 1600x900, 70 deg FOV
Back Right (camera_back_right) – 1600x900, 70 deg FOV
All cameras use Basler acA1600-60gc sensors (global shutter). Images are
provided undistorted with zero distortion coefficients. Camera intrinsics
are stored using OpenCVPinholeCameraModelParameters
with ShutterType.GLOBAL.
LiDAR Sensor#
Top LiDAR (lidar_top) – Velodyne HDL-32E, 32 layers, ~34k points/frame
Point clouds in nuScenes are motion-compensated to the sensor frame at the sweep reference timestamp. The converter decompensates them back to per-point-time sensor frames (raw measurements) before storing, since NCore V4 expects non-motion-compensated ray-bundle data.
Per-point timestamps are derived from the column structure of the .bin
file: each file contains 32-point columns (one point per beam) stored in
sequential firing order. Each point's offset from the frame start is
column_index / n_model_columns * frame_duration. This follows a fencepost
convention: frame_end marks the start of the next frame, so the last column
falls one column period before frame_end rather than on it.
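The timestamp formula can be illustrated with a minimal sketch. The function name, the microsecond units, and the assumption that points are stored column by column with exactly 32 points per column are illustrative and not taken from the converter's code.

```python
import numpy as np


def per_point_timestamps(n_points, frame_start_us, frame_end_us,
                         n_model_columns, n_beams=32):
    """Hypothetical helper: one timestamp per point from its column index."""
    if n_points % n_beams != 0:
        raise ValueError("expected a whole number of columns")
    frame_duration = frame_end_us - frame_start_us
    # Points are assumed to be stored column by column, so point i
    # belongs to column i // n_beams.
    column_index = np.arange(n_points) // n_beams
    # Fencepost convention: divide by n_model_columns (not n_model_columns - 1),
    # so no column is ever stamped with frame_end itself.
    offsets = column_index / n_model_columns * frame_duration
    return frame_start_us + offsets


# Two columns in a 50 ms frame: column 0 at the start, column 1 at the midpoint.
ts = per_point_timestamps(64, 0.0, 50_000.0, n_model_columns=2)
```

All 32 points of a column share one timestamp in this sketch; intra-column firing offsets are handled separately by the lidar model.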
A structured lidar model (RowOffsetStructuredSpinningLidarModelParameters)
is stored as intrinsics. Two derivation modes are available:
Nominal (--lidar-model-source nominal, recommended): Model parameters from the HDL-32E spec – uniform column azimuths, spec elevation angles, analytical firing offsets. No circular data dependency.
Empirical (--lidar-model-source empirical): Model derived from a decompensated reference frame. Row offsets are blended with analytical values for beams that lack far-range observations (steep downward-facing beams).
Resolution upsampling (--lidar-model-resolution 4, recommended) interpolates
column azimuths to 4x native resolution (4340 columns). This compensates for
per-revolution azimuth drift in the mechanical spinning and reduces alignment
quantization error from ~0.10 deg to ~0.03 deg.
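As a rough illustration of the upsampling step, the sketch below linearly interpolates a set of native column azimuths to a finer grid. The function name and the choice of linear interpolation are assumptions; the converter's actual interpolation scheme is not described here.

```python
import numpy as np


def upsample_column_azimuths(azimuths_rad, factor=4):
    """Hypothetical helper: interpolate column azimuths to factor x resolution."""
    az = np.unwrap(np.asarray(azimuths_rad, dtype=np.float64))
    n = az.size
    if n < 2:
        raise ValueError("need at least two columns")
    # Close the revolution: the column after the last one is the first
    # column again, one full turn later in the spinning direction.
    direction = np.sign(az[-1] - az[0])
    az_closed = np.append(az, az[0] + direction * 2.0 * np.pi)
    # Fractional column positions at the finer resolution.
    fine_index = np.arange(n * factor) / factor
    return np.interp(fine_index, np.arange(n + 1), az_closed)


# 1085 uniform native columns, decreasing azimuth for a clockwise spin.
native = -np.linspace(0.0, 2.0 * np.pi, 1085, endpoint=False)
fine = upsample_column_azimuths(native, factor=4)  # 4340 columns
```

Every fourth upsampled column coincides with a native column, so the native model remains a subset of the finer one.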
Optional multi-frame optimization (--lidar-model-optimization-passes 1)
adjusts column azimuths and row offsets from median residuals across all frames,
further reducing systematic error.
Model parameters stored:
row_elevations_rad: per-beam elevation angles
column_azimuths_rad: per-column azimuth angles (n_columns depends on resolution)
row_azimuth_offsets_rad: per-beam azimuth offsets from intra-column firing sequence (~0.25 deg total range; two 16-beam banks at 1.152 us pair intervals)
spinning_direction: clockwise ("cw")
spinning_frequency_hz: derived from inter-sweep timestamps (~20 Hz)
Accuracy (4x nominal + optimization): 0.029 deg far-range angular error, comparable to PAI-level extraction quality.
The model_element field is populated with [ring_index, column_index]
per point, where column_index addresses the (possibly upsampled) model.
Column indices are assigned via iterative per-column alignment with fine-grained
sub-column refinement at the model’s resolution.
The minimum distance filter (1.0 m) matches the remove_close
default used by the nuScenes devkit to discard sensor housing
reflections.
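A minimum distance filter of this kind might look like the sketch below. It assumes a Euclidean range test in the sensor frame; the exact criterion the converter applies is not stated here, and the function name is hypothetical.

```python
import numpy as np


def filter_close_points(points_xyz, min_distance_m=1.0):
    """Hypothetical helper: drop returns closer than min_distance_m."""
    points = np.asarray(points_xyz, dtype=np.float32)
    # Assumption: distance is the Euclidean range from the sensor origin.
    ranges = np.linalg.norm(points[:, :3], axis=1)
    keep = ranges >= min_distance_m
    return points[keep], keep


points = np.array([[0.5, 0.0, 0.0],   # housing reflection, dropped
                   [3.0, 4.0, 0.0],   # 5 m away, kept
                   [0.0, 0.0, 1.0]])  # exactly 1 m away, kept
filtered, keep_mask = filter_close_points(points)
```

Returning the boolean mask alongside the filtered points lets per-point arrays such as timestamps be filtered consistently.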
Radar Sensors#
Front (radar_front) – Continental ARS 408
Front Left (radar_front_left) – Continental ARS 408
Front Right (radar_front_right) – Continental ARS 408
Back Left (radar_back_left) – Continental ARS 408
Back Right (radar_back_right) – Continental ARS 408
Radar detections are sparse (typically 10-100 per sweep). Each detection provides position (x, y, z), ego-motion-compensated velocity, and radar cross section (RCS). Per-frame generic data fields:
radial_velocity_m_s (float32, [N]) – radial velocity in m/s (positive = moving away from sensor), computed by projecting the ego-motion-compensated velocity vector onto the detection direction.
rcs_dBsm (float32, [N]) – radar cross section in dBsm.
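The projection that yields the radial velocity can be sketched as follows. The function name is hypothetical, and positions and velocities are assumed to be expressed in the same sensor frame with the same number of components.

```python
import numpy as np


def radial_velocity(positions_m, velocities_m_s):
    """Hypothetical helper: project velocity onto the sensor-to-detection ray."""
    positions = np.asarray(positions_m, dtype=np.float32)
    velocities = np.asarray(velocities_m_s, dtype=np.float32)
    # Unit vector from the sensor origin to each detection; the small
    # floor avoids division by zero for a detection at the origin.
    ranges = np.linalg.norm(positions, axis=1, keepdims=True)
    directions = positions / np.maximum(ranges, 1e-6)
    # Dot product per detection: positive means moving away from the sensor.
    return np.sum(velocities * directions, axis=1).astype(np.float32)


positions = [[10.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
velocities = [[2.0, 0.0, 0.0], [1.0, -3.0, 0.0]]
v_radial = radial_velocity(positions, velocities)  # receding, then approaching
```

The tangential part of the velocity (the 1.0 m/s along x in the second detection) does not contribute, since it is perpendicular to the detection direction.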
Radar is not a spinning sensor; all detections in a frame share a single timestamp.
Ego Poses#
Ego poses are derived from the per-sweep ego_pose records in the
nuScenes database (GPS/INS-based). Poses are stored as dynamic
("rig", "world") poses relative to the first frame. The absolute
first-frame pose is preserved as a static ("world", "world_global")
transform.
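The split into a static first-frame transform and relative dynamic poses can be sketched with 4x4 homogeneous matrices. The function name and the convention that each input matrix maps rig coordinates into the nuScenes global frame are assumptions made for illustration.

```python
import numpy as np


def split_ego_poses(T_rig_to_global):
    """Hypothetical helper: separate the first-frame pose from relative poses.

    T_rig_to_global: [N, 4, 4] rig-to-global transforms, one per sweep.
    Returns ([N, 4, 4] rig-to-world poses, [4, 4] world-to-world_global).
    """
    poses = np.asarray(T_rig_to_global, dtype=np.float64)
    # The local "world" frame is anchored at the first rig pose.
    T_world_to_global = poses[0]
    T_global_to_world = np.linalg.inv(T_world_to_global)
    # Broadcasts the [4, 4] inverse over all N poses.
    T_rig_to_world = T_global_to_world @ poses
    return T_rig_to_world, T_world_to_global


def make_pose(x, y, yaw):
    """Build a planar pose for the example."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0, x],
                     [s, c, 0.0, y],
                     [0.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])


poses = np.stack([make_pose(100.0, 200.0, 0.3), make_pose(101.0, 200.5, 0.35)])
relative, first = split_ego_poses(poses)
```

The first relative pose is the identity, and composing the static transform with any relative pose recovers the original global pose, so no information is lost by the split.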
3D Annotations#
Cuboid annotations are stored in the world_global coordinate frame
(the nuScenes global map frame) as
CuboidsComponent observations. Only keyframe
annotations are included. The transform()
method can re-project them to any sensor frame at runtime via the pose
graph.
Category mapping from nuScenes to NCore class IDs:
vehicle.car -> car
vehicle.truck -> truck
vehicle.bus.* -> bus
vehicle.construction -> construction_vehicle
vehicle.motorcycle -> motorcycle
vehicle.bicycle -> bicycle
vehicle.trailer -> trailer
vehicle.emergency.* -> emergency_vehicle
human.pedestrian.* -> pedestrian
movable_object.barrier -> barrier
movable_object.trafficcone -> traffic_cone
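A wildcard lookup over this table could be written as below. This is a sketch using shell-style pattern matching from the Python standard library; the function name is hypothetical, and returning None for categories outside the table is an assumption, since the handling of unmapped categories is not described here.

```python
from fnmatch import fnmatchcase

# Patterns copied from the mapping table above, checked in order.
CATEGORY_PATTERNS = [
    ("vehicle.car", "car"),
    ("vehicle.truck", "truck"),
    ("vehicle.bus.*", "bus"),
    ("vehicle.construction", "construction_vehicle"),
    ("vehicle.motorcycle", "motorcycle"),
    ("vehicle.bicycle", "bicycle"),
    ("vehicle.trailer", "trailer"),
    ("vehicle.emergency.*", "emergency_vehicle"),
    ("human.pedestrian.*", "pedestrian"),
    ("movable_object.barrier", "barrier"),
    ("movable_object.trafficcone", "traffic_cone"),
]


def map_category(nuscenes_name):
    """Hypothetical helper: return the NCore class for a nuScenes category."""
    for pattern, ncore_class in CATEGORY_PATTERNS:
        # fnmatchcase is case-sensitive and treats "*" as a wildcard.
        if fnmatchcase(nuscenes_name, pattern):
            return ncore_class
    return None  # assumption: unmapped categories yield no class


bus_class = map_category("vehicle.bus.bendy")
pedestrian_class = map_category("human.pedestrian.adult")
```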
Usage#
bazel run //tools/data_converter/nuscenes -- \
--root-dir /path/to/nuscenes \
--output-dir /path/to/output \
nuscenes-v4 \
--version v1.0-trainval
Convert a single scene by name:
bazel run //tools/data_converter/nuscenes -- \
--root-dir /path/to/nuscenes \
--output-dir /path/to/output \
nuscenes-v4 \
--version v1.0-mini \
--scene-name scene-0061
See tools/data_converter/nuscenes/README.md for full option documentation.