Data Formats#

The current NCore data format is V4 (Component-based Format) – a modular format that separates data into independent component stores. Each component (poses, intrinsics, sensors, labels, etc.) is stored as a separate zarr component that can be independently managed, versioned, and combined. This format enables:

Flexible data composition from multiple sources
Independent component updates without reprocessing entire sequences
Parallel access and distributed storage optimization
Extensibility through custom component types
Fine-grained access control and data sharing

The format uses coordinate system conventions and transformations described in Specification.

For details on how component stores are serialized to disk (.zarr.itar indexed tar archives vs. directory-based .zarr stores) and how to access them from local or cloud storage, see Storage and Access.

V4: Component Store Hierarchy (Component-Based Format)#

The component-based V4 data format represents sequences as collections of component groups. V4 distributes data across modular components that can be independently managed, versioned, and combined to form virtual sequences.

Component Architecture#

Each component group is a zarr store containing a specific number of data component instances. The NCore library provides the following default component types:

PosesComponent - Static and dynamic pose transformations between named coordinate frames
IntrinsicsComponent - Camera and lidar intrinsic calibration parameters
MasksComponent - Static masks associated with sensors
CameraSensorComponent - Camera frame data including images
LidarSensorComponent - Lidar frame data including point clouds
RadarSensorComponent - Radar frame data including detections
CuboidsComponent - 3D cuboid track observations and annotations
PointCloudsComponent - Pre-computed point clouds with optional typed per-point attributes
CameraLabelsComponent - Per-camera image-aligned labels (depth, flow, segmentation, masks, normals, features)

The component architecture is extensible, allowing custom component types to be defined for application-specific data.

Component Group Structure#

Each component group has the following root-level structure:

ncore4[-{component_group_name}].zarr[.itar]/
│
├── {sequence_meta_data}
│   ├── sequence_id: str
│   ├── version: str (currently "v4")
│   ├── sequence_timestamp_interval_us: {start, stop}
│   ├── generic_meta_data: {...}
│   └── component_group_name: str
│
└── {component_type}/
    └── {component_instance_name}/
         ├── {component_meta_data}
         │   ├── component_name: str
         │   ├── component_instance_name: str
         │   ├── component_version: str
         │   └── generic_meta_data: {...}
         │
         └── {component_specific_data}...

Component-Level Generic Data#

Components may optionally include a generic_data/ subgroup containing named numpy arrays. This allows converters to attach auxiliary data to any component without defining a custom component type. For example, a converter might attach raw GPS/IMU measurements to the poses component:

{component_type}/
└── {component_instance_name}/
    ├── {component_meta_data}
    │   └── generic_meta_data: {...}  (includes keys added via set_generic_data)
    │
    ├── {component_specific_data}...
    │
    └── generic_data/           (optional, only if generic data was set)
        ├── {dataset_name}      [shape] dtype  (lz4-compressed zarr dataset)
        └── ...

Writers use set_generic_data() to provide generic data arrays and optional additional metadata before finalization. Metadata keys passed via set_generic_data(meta_data={...}) replace keys with the same name from the initial register_component_writer(generic_meta_data={...}) call.

Readers access generic data via has_generic_data(), get_generic_data_names(), and get_generic_data().

Poses Component#

The poses component stores both static (time-invariant) and dynamic (time-dependent) rigid transformations between named coordinate frames:

poses/
└── {component_instance_name}/
    ├── static_poses/
    │   └── {attrs}
    │       └── ("source_frame", "target_frame"):
    │           ├── pose: [[4,4]] float32/64
    │           └── dtype: str
    │
    └── dynamic_poses/
        └── {attrs}
            └── ("source_frame", "target_frame"):
                ├── poses: [[N,4,4]] float32/64
                ├── timestamps_us: [N] uint64
                └── dtype: str

For ego-vehicle trajectories, the rig-to-world transformation is typically stored as a dynamic pose under the key ("rig", "world"). Transformations from local world to global world frames (like ECEF) are represented by a ("world", "world_global") record, if applicable.

Static poses are used for sensor extrinsic calibrations. For example, a camera-to-rig transformation would be stored under the key ("camera_front_wide_120fov", "rig").

Intrinsics Component#

Camera and lidar intrinsic model parameters:

intrinsics/
└── {component_instance_name}/
     ├── cameras/
     │   └── {camera_id}/
     │       └── {attrs}
     │           ├── camera_model_type: str
     │           ├── camera_model_parameters: {...}
     │           └── external_distortion_type: str  (optional)
    │
    └── lidars/
        └── {lidar_id}/
            └── {attrs}
                ├── lidar_model_type: str
                └── lidar_model_parameters: {...}

Model types include ftheta, opencv-pinhole, and opencv-fisheye for camera sensors, and row-offset-spinning for lidar sensors. For detailed model parameterizations and mathematical specifications, see Sensor Models.

Masks Component#

Static masks for sensors are stored per sensor instance (currently only cameras are supported):

masks/
└── {component_instance_name}/
    └── cameras/
        └── {camera_id}/
            ├── {attrs}
            │   └── mask_names: [str, ...]
            └── {mask_name} () |Sx  (encoded image, attrs: format: str)

Sensor Components#

Sensor components (cameras, lidars, radars) share a common frame-based structure:

{sensor_type}/
└── {sensor_id}/
    ├── {component_meta_data}
    │
    └── frames/
        ├── {attrs}
        │   └── frames_timestamps_us: [N, 2] uint64  (start, end per frame)
        │
        └── {frame_name}/  (keyed by end-of-frame timestamp)
            ├── {sensor_specific_data}
            └── generic_data/
                ├── {attrs: generic_meta_data}
                └── {named datasets}...

Camera Sensor Frames:

cameras/{camera_id}/frames/{frame_name}/
├── image () |Sx  (encoded image, attrs: format: str)
└── generic_data/...

Lidar Sensor Frames:

Lidar and radar data structures separate ray geometry (ray_bundle/) from multi-return properties (ray_bundle_returns/) for flexible data organization. Non-existing values are indicated via NaNs and must be consistent across all return datasets to define a coherent [R,N] valid-return mask. This mask is stored in bit-packed form as ray_bundle_returns_valid_mask_packed.

lidars/{lidar_id}/frames/{frame_name}/
├── ray_bundle/
│   ├── {attrs: n_rays: int}
│   ├── direction: [N,3] float32     (per-ray normalized ray directions in sensor coordinates)
│   ├── timestamp_us: [N] uint64     (per-ray timestamps of ray measurement time in us)
│   └── model_element: [N,2] uint16  (optional: model-element indices of each ray)
│
├── ray_bundle_returns/
│   ├── {attrs: n_returns: int}
│   ├── distance_m: [R,N] float32    (per-return measured metric distances along rays)
│   ├── intensity: [R,N] float32     (per-return measured return intensity values [0,1])
│   └── ...                          (may include additional return datasets)
│
└── ray_bundle_returns_valid_mask_packed () uint8  (bit-packed [R,N] valid mask, attrs: n_returns, n_rays)

Radar Sensor Frames:

radars/{radar_id}/frames/{frame_name}/
├── ray_bundle/
│   ├── {attrs: n_rays: int}
│   ├── direction: [N,3] float32  (per-ray normalized ray directions in sensor coordinates)
│   └── timestamp_us: [N] uint64  (per-ray timestamps of ray measurement time in us)
│
├── ray_bundle_returns/
│   ├── {attrs: n_returns: int}
│   ├── distance_m: [R,N] float32  (per-return measured metric distances along rays)
│   └── ...                        (may include radial velocities, RCS)
│
└── ray_bundle_returns_valid_mask_packed () uint8  (bit-packed [R,N] valid mask, attrs: n_returns, n_rays)

Cuboids Component#

3D cuboid track observations are stored in a structured format:

cuboids/
└── {component_instance_name}/
    └── cuboids/
        └── {attrs}
            └── cuboid_track_observations: [N] (JSON-serialized list)

Each observation is a JSON-serializable object containing:

track_id - Unique track identifier (str)
class_id - Object class label (str)
timestamp_us - Observation timestamp in us (int)
reference_frame_id - Reference frame identifier (str)
reference_frame_timestamp_us - Reference frame timestamp in us (int)
bbox3 - 3D bounding box in reference frame coordinates
source - Label source (e.g., AUTOLABEL, GT_SYNTHETIC)
source_version - Optional source version identifier (str)

Observations can be transformed between reference frames using the pose graph and support motion compensation across different sensor frames.

Point Clouds Component#

Pre-computed point clouds (e.g., SfM reconstructions, dense MVS outputs) are stored with optional typed per-point attributes. Unlike sensor components, point clouds store XYZ coordinates directly – no ray-bundle representation is used.

point_clouds/
└── {component_instance_name}/
    ├── {attrs}
    │   ├── coordinate_unit: str            ("METERS" | "UNITLESS")
    │   └── attribute_schemas: {             per-attribute metadata (extensible)
    │         "{name}": {
    │           "transform_type": str,      ("INVARIANT" | "DIRECTION" | "POINT")
    │           "dtype": str,               (e.g. "float32", "uint8")
    │           "shape_suffix": [int, ...]  (per-point shape, e.g. [3] for (N,3))
    │         }, ...
    │       }
    │
    ├── pc_timestamps_us  [M] uint64        derived from per-pc reference frame timestamps
    │
    └── pcs/
        └── {pc_index}/
            ├── {attrs}
            │   ├── reference_frame_id: str
            │   └── generic_meta_data: {...}
            │
            ├── xyz            [N,3] float32
            ├── {attr_name}    [N,...] dtype   (per attribute_schemas)
            └── generic_data/
                └── {name}     arbitrary per-pc arrays

Each point cloud snapshot (“pc”) carries its own reference frame and timestamp. The reference_frame_id is stored per-pc in .zattrs, while the reference_frame_timestamp_us is materialized in the source-level pc_timestamps_us array for efficient temporal lookups. Coordinate transforms are supported via transform() (analogous to transform()).

The attribute_schemas declare typed per-point attributes with explicit transformation semantics:

INVARIANT – unchanged by rigid transforms (e.g., RGB, intensity)
DIRECTION – rotation only (e.g., surface normals)
POINT – full rigid transform (e.g., secondary xyz positions)

The coordinate_unit field indicates whether coordinates are metric ("METERS") or at arbitrary scale ("UNITLESS", e.g., SfM reconstructions). All enum values are serialized as their uppercase name strings.

Lidar and radar point clouds can also be accessed through the unified PointCloudsSourceProtocol via the RayBundleSensorPointCloudsSourceAdapter.

Camera Labels Component#

Per-camera image-aligned labels (depth maps, optical flow, segmentation, masks, surface normals, material properties, feature embeddings) are stored as independently-timestamped label instances. Each instance stores labels of one type for one camera, enabling sparse coverage and multiple label sources per camera.

camera_labels/
└── {instance_name}/                         (e.g., "depth.z@front_50fov")
    │
    ├── timestamps_us  [N] uint64            (sorted label timestamps)
    │
    └── labels/
         ├── {descriptor}
         │     ├── camera_id: str                   (associated camera identifier)
         │     ├── label_type: {                    (tagged-union type descriptor)
         │     │     "category": str,               ("DEPTH", "FLOW", ...)
         │     │     "qualifier": str,              ("z", "optical_forward", ...)
         │     │     "unit": str | null             ("METERS", "PIXELS", ...)
         │     │   }
         │     ├── label_source: str                ("GT_ANNOTATION", "EXTERNAL", ...)
         │     ├── label_schema: {                  (storage format descriptor)
         │     │     "dtype": str,                  (e.g., "float32", "uint8")
         │     │     "shape_suffix": [int, ...],    (trailing dims after [H, W])
         │     │     "encoding": str,               ("RAW" | "IMAGE_ENCODED")
         │     │     "encoded_format": str | null,  ("png", "jpeg", null)
         │     │     "quantization": {...} | null   (optional dequant params)
         │     │   }
         │     └── generic_meta_data: {...}         (metadata common for all labels)
         │
         └── {timestamp_us}/                  (keyed by camera end-of-frame timestamp)
             ├── data  [H, W, ...] or |Sx     (label array or encoded bytes)
             └── {attrs}
                 ├── generic_meta_data: {...} (per-label metadata)
                 └── format: str              (IMAGE_ENCODED only)

Label Type System:

Labels use a tagged-union type consisting of a high-level LabelCategory enum and a free-form qualifier string. Well-known types are provided as constants (e.g., LabelType.DEPTH_Z_M, LabelType.SEGMENTATION_SEMANTIC), while project-specific labels use custom qualifiers without any code changes.

Supported categories:

DEPTH – Per-pixel distance measures ("z", "ray", "relative", …)
FLOW – Motion displacement fields ("optical_forward", "scene_backward", …)
SEGMENTATION – Per-pixel classification ("semantic", "instance", "logits")
MASK – Binary or multi-level masks ("background", "dynamic", "ego", …)
GEOMETRY – Per-pixel geometric vectors ("normal_camera", "ray_direction", …)
MATERIAL – Surface material properties ("albedo", "roughness", …)
FEATURE – Per-pixel feature embeddings ("dinov2", "clip", …)
OTHER – Catch-all for uncategorised labels

Encoding:

RAW – Numpy array stored as a zarr dataset regular compression. Shape is [H, W] + shape_suffix (e.g., [H, W, 2] for optical flow). Transparent quantization of raw labels is supported optionally (e.g., float32 depth quantized to uint16 with scale/offset).
IMAGE_ENCODED – Pre-encoded image bytes (PNG, JPEG) stored as a 1-D zarr uint8 dataset with no compression. Consumers can call get_encoded_data() for raw bytes (GPU-based decoding) or get_data() for Pillow-decoded numpy arrays.

Instance naming convention:

Instance names are opaque identifiers. The recommended convention is category.qualifier@camera_id (e.g., depth.z@front_50fov). The component does not parse or validate instance names.

Compat layer access:

Labels are accessed through CameraLabelsProtocol via get_camera_labels() (by ID) or query_camera_labels() (by camera and optional type/category filter).

Component Groups#

Multiple component instances can coexist using different component instance names. This enables scenarios such as:

Multiple calibrations (e.g., “factory”, “online_refined”)
Multiple label sources (e.g., “auto_labels”, “human_verified”)
Different processing versions (e.g., “v1”, “v2”)

The default component group name is default. Component stores with different group names are stored in separate zarr archives following the naming pattern: ncore4-{component_group_name}.zarr[.itar].

Custom Components#

The component architecture is extensible: define a ComponentWriter / ComponentReader pair with a unique COMPONENT_NAME and version string, then register instances through SequenceComponentGroupsWriter.

To avoid name clashes with built-in or third-party components, use a reverse-domain naming convention for custom component names, e.g. com.myorg.velocity.

A minimal custom component looks like:

from ncore.data.v4 import ComponentWriter, ComponentReader

class VelocityComponent:
    COMPONENT_NAME = "com.myorg.velocity"

    class Writer(ComponentWriter):
        @staticmethod
        def get_component_name() -> str:
            return VelocityComponent.COMPONENT_NAME

        @staticmethod
        def get_component_version() -> str:
            return "v1"

        def store_velocity(self, velocity, timestamp_us):
            ...  # collect data

        def finalize(self):
            ...  # write zarr datasets to self._group

    class Reader(ComponentReader):
        @staticmethod
        def get_component_name() -> str:
            return VelocityComponent.COMPONENT_NAME

        @staticmethod
        def supports_component_version(version: str) -> bool:
            return version == "v1"

        def get_velocities(self):
            return self._group["velocities"][:], self._group["timestamps_us"][:]

Writers must ensure that all stored timestamps fall within the sequence’s sequence_timestamp_interval_us time range. Existing datasets can be extended with new components by creating a writer via SequenceComponentGroupsWriter.from_reader() and finalizing the additional stores. For a complete working example (including component versioning and backward-compatible readers), see TestDataNewComponent.