drivaerml

DrivAerML dataset source for mesh pipelines.

Reads the DrivAerML dataset — 500 parametrically morphed variants of the DrivAer notchback vehicle with high-fidelity scale-resolving CFD (OpenFOAM v2212).

The dataset provides three mesh types per run:

  • boundary — surface mesh with flow fields (VTP, ~660 MB each)

  • volume — volumetric field data (VTU, ~50 GB each, split into parts)

  • slices — x/y/z-normal slice planes with flow fields (VTP)

File discovery and caching are handled internally using fsspec.
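Because downloaded files are cached locally, it helps to estimate the disk footprint before iterating over many runs. The sketch below only does the arithmetic on the approximate per-run sizes quoted above. It assumes decimal units and that the ~50 GB volume figure covers all parts of one run. The helper name estimate_cache_gb is hypothetical and not part of the library.

```python
# Approximate per-run file sizes quoted in this page, in decimal bytes.
# Assumption: the ~50 GB volume figure is the total for one run (all parts).
APPROX_BYTES_PER_RUN = {
    "boundary": 660e6,  # ~660 MB VTP surface mesh
    "volume": 50e9,     # ~50 GB VTU volume data
}


def estimate_cache_gb(mesh_type: str, n_runs: int) -> float:
    """Return a rough cache footprint in gigabytes for n_runs runs."""
    if n_runs < 0:
        raise ValueError("n_runs must be non-negative")
    return APPROX_BYTES_PER_RUN[mesh_type] * n_runs / 1e9


if __name__ == "__main__":
    for mesh_type in APPROX_BYTES_PER_RUN:
        gb = estimate_cache_gb(mesh_type, 10)
        print(f"{mesh_type}: roughly {gb:,.1f} GB for 10 runs")
```

Slice planes are left out because no size is given for them here.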

Attributes

Classes

DrivAerMLSource

Read meshes from the DrivAerML dataset on HuggingFace Hub.

Module Contents

class physicsnemo_curator.domains.mesh.sources.drivaerml.DrivAerMLSource(
mesh_type: MeshType = 'boundary',
url: str = _DRIVAERML_HF_URL,
storage_options: dict[str, object] | None = None,
cache_storage: str | None = None,
cache: bool = True,
manifold_dim: int | Literal['auto'] = 'auto',
point_source: Literal['vertices', 'cell_centroids'] = 'vertices',
warn_on_lost_data: bool = True,
mesh_parts: list[str] | None = None,
backend: Backend = 'pyvista',
)

Bases: physicsnemo_curator.core.base.Source[physicsnemo.mesh.Mesh]

Read meshes from the DrivAerML dataset on HuggingFace Hub.

Each index maps to one simulation run. The mesh_type parameter selects which mesh to load for each run:

  • "boundary" — surface mesh (VTP) with flow fields

  • "volume" — volumetric mesh (VTU, reconstructed from split parts)

  • "slices" — x/y/z-normal slice planes (VTP); yields multiple meshes per index

  • "multi" — yields domain (as DomainMesh), stl, and/or single_solid meshes per run (all converted to float32). The domain mesh combines volume interior, boundary surface, and global data.

Parameters:
  • mesh_type ({"boundary", "volume", "slices", "multi"}) – Which mesh to read from each run directory.

  • url (str) – Base HuggingFace Hub URL. Override only for testing.

  • storage_options (dict[str, object] | None) – Extra fsspec keyword arguments (e.g. {"token": "hf_..."}).

  • cache_storage (str | None) – Local cache directory. If None, a temporary directory is used.

  • cache (bool) – Persist downloaded files across sessions.

  • manifold_dim (int or {"auto"}) – Target manifold dimension for from_pyvista conversion. Only used with the "pyvista" backend.

  • point_source ({"vertices", "cell_centroids"}) – Point source mode for from_pyvista conversion. Only used with the "pyvista" backend.

  • warn_on_lost_data (bool) – Warn when data arrays are discarded during conversion. Only used with the "pyvista" backend.

  • mesh_parts (list[str] or None) – Which mesh parts to yield in "multi" mode. Valid parts are "domain", "stl", and "single_solid". When None (the default), all three parts are yielded. Ignored for other mesh types.

  • backend ({"pyvista", "rust"}) –

    VTK reading backend:

    • "pyvista" (default): uses PyVista + from_pyvista for full conversion (manifold_dim, point_source options respected).

    • "rust": uses the native Rust VTK reader for faster I/O. Constructs Mesh directly from raw arrays. The manifold_dim, point_source, and warn_on_lost_data options are ignored; data is returned as-is from the file.
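The parameter rules above can be sketched as a small pre-flight check. The helpers build_source_kwargs and make_source are hypothetical and not part of the library; they only encode the documented values ("boundary", "volume", "slices", "multi"; parts "domain", "stl", "single_solid"; backends "pyvista", "rust"). Whether the real constructor performs the same validation is not stated here, so treat this as an illustration under that assumption.

```python
from typing import Any, Optional

# Values taken from the parameter documentation above.
VALID_MESH_TYPES = {"boundary", "volume", "slices", "multi"}
VALID_MESH_PARTS = {"domain", "stl", "single_solid"}
VALID_BACKENDS = {"pyvista", "rust"}


def build_source_kwargs(
    mesh_type: str = "boundary",
    mesh_parts: Optional[list] = None,
    backend: str = "pyvista",
    cache_storage: Optional[str] = None,
) -> dict:
    """Check options against the documented values and return constructor kwargs."""
    if mesh_type not in VALID_MESH_TYPES:
        raise ValueError(f"unknown mesh_type: {mesh_type!r}")
    if backend not in VALID_BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    if mesh_parts is not None:
        unknown = set(mesh_parts) - VALID_MESH_PARTS
        if unknown:
            raise ValueError(f"unknown mesh_parts: {sorted(unknown)}")

    kwargs: dict = {"mesh_type": mesh_type, "backend": backend}
    # mesh_parts is documented as ignored outside "multi" mode, so drop it there.
    if mesh_type == "multi" and mesh_parts is not None:
        kwargs["mesh_parts"] = list(mesh_parts)
    if cache_storage is not None:
        kwargs["cache_storage"] = cache_storage
    return kwargs


def make_source(**options: Any):
    """Construct the source; requires physicsnemo_curator to be installed."""
    # Imported lazily so build_source_kwargs works without the package.
    from physicsnemo_curator.domains.mesh.sources.drivaerml import DrivAerMLSource

    return DrivAerMLSource(**build_source_kwargs(**options))
```

For example, make_source(mesh_type="multi", mesh_parts=["domain", "stl"]) would request only the domain and STL meshes for each run.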

Examples

>>> from physicsnemo_curator.domains.mesh.sources.drivaerml import DrivAerMLSource
>>> source = DrivAerMLSource(mesh_type="boundary")
>>> len(source)
484
>>> mesh = next(source[0])  # one boundary mesh per run
>>> source = DrivAerMLSource(mesh_type="slices")
>>> for mesh in source[0]:  # yields multiple slice planes
...     print(mesh.n_points)

mesh_name(index: int, seq: int) → str

Return the canonical output name for mesh at (index, seq).

Parameters:
  • index (int) – Source index (which run).

  • seq (int) – Sequence number within this index (which part).

Returns:

Resolved name like "domain_1" or "drivaer_5.stl".

Return type:

str

Raises:

IndexError – If seq is out of range for the active mesh_parts.

classmethod params() → list[physicsnemo_curator.core.base.Param]

Return parameter descriptors for the DrivAerML source.

Returns:

Parameter list for CLI configuration.

Return type:

list[Param]

run_id(index: int) → int

Return the dataset run ID for the given source index.

Parameters:

index (int) – Zero-based index into the sorted run list.

Returns:

The dataset run ID (e.g. 1, 5, 12).

Return type:

int
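The distinction between a source index and a run ID matters when run IDs are not contiguous. The sketch below illustrates the idea of a zero-based index into a sorted run list using made-up IDs. The names discovered_run_ids, run_id_at, and index_of are hypothetical; this is not the library's implementation.

```python
from bisect import bisect_left

# Hypothetical run IDs found during file discovery; note the gaps and the order.
discovered_run_ids = [12, 1, 5, 2]
sorted_run_ids = sorted(discovered_run_ids)  # [1, 2, 5, 12]


def run_id_at(index: int) -> int:
    """Map a zero-based source index to the run ID at that sorted position."""
    if not 0 <= index < len(sorted_run_ids):
        raise IndexError(f"index {index} out of range")
    return sorted_run_ids[index]


def index_of(run: int) -> int:
    """Inverse lookup: find the source index for a given run ID."""
    pos = bisect_left(sorted_run_ids, run)
    if pos == len(sorted_run_ids) or sorted_run_ids[pos] != run:
        raise KeyError(f"run {run} is not in the run list")
    return pos


if __name__ == "__main__":
    print(run_id_at(2))   # run ID stored at sorted position 2
    print(index_of(12))   # source index of run 12
```

In this toy list, index 2 refers to run 5 rather than run 3, which is why output names should be derived from the run ID and not from the index.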

description: ClassVar[str] = 'DrivAerML dataset 500 DrivAer notchback variants with scale-resolving CFD'
name: ClassVar[str] = 'DrivAerML'
physicsnemo_curator.domains.mesh.sources.drivaerml.Backend
physicsnemo_curator.domains.mesh.sources.drivaerml.MeshType
physicsnemo_curator.domains.mesh.sources.drivaerml.logger