ahmedml#

AhmedML dataset source for mesh pipelines.

Reads the AhmedML dataset — 500 geometric variations of the Ahmed Car Body with transient hybrid RANS-LES CFD (OpenFOAM v2212, ~20 M cells per case).

Four mesh types are available per run:

  • boundary — surface mesh with flow fields (VTP, ~83 MB each)

  • volume — volumetric field data (VTU, ~5.6 GB each)

  • slices — x/y/z-normal slice planes with flow fields (VTP)

  • multi — domain mesh combining interior + boundary + STL with global data

Each run also includes CSV metadata (force/moment coefficients and geometric parameters) which is attached as global_data on every yielded mesh, regardless of mesh type.

File discovery and caching are handled internally using fsspec.
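How the source wires fsspec together internally is not shown on this page. As a rough sketch only, assuming the third-party fsspec package and its built-in "simplecache" wrapper, the cache_storage and storage_options parameters could map onto a caching filesystem as shown below. The helper names resolve_cache_dir and make_cached_fs are hypothetical and are not part of physicsnemo_curator. Using the "hf" protocol would additionally require huggingface_hub to be installed.

```python
import os
import tempfile


def resolve_cache_dir(cache_storage=None):
    """Return a usable cache directory (hypothetical helper).

    Mirrors the documented default: None means a temporary directory.
    """
    if cache_storage is None:
        return tempfile.mkdtemp(prefix="ahmedml_cache_")
    os.makedirs(cache_storage, exist_ok=True)
    return cache_storage


def make_cached_fs(target_protocol, storage_options=None, cache_storage=None):
    """Wrap a remote filesystem in fsspec's whole-file local cache (hypothetical helper)."""
    # Imported lazily so resolve_cache_dir stays usable without fsspec installed.
    import fsspec

    return fsspec.filesystem(
        "simplecache",
        target_protocol=target_protocol,       # e.g. "hf" for HuggingFace Hub
        target_options=storage_options or {},  # e.g. {"token": "hf_..."}
        cache_storage=resolve_cache_dir(cache_storage),
    )
```

With a persistent cache_storage directory, repeated reads of the same file can be served from local disk, which matters for multi-gigabyte volume files.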

Attributes#

Classes#

AhmedMLSource

Read meshes from the AhmedML dataset on HuggingFace Hub.

Module Contents#

class physicsnemo_curator.domains.mesh.sources.ahmedml.AhmedMLSource(
mesh_type: MeshType = 'multi',
mesh_parts: list[MeshPart] | None = None,
url: str = _AHMEDML_HF_URL,
storage_options: dict[str, object] | None = None,
cache_storage: str | None = None,
manifold_dim: int | Literal['auto'] = 'auto',
point_source: Literal['vertices', 'cell_centroids'] = 'vertices',
warn_on_lost_data: bool = True,
backend: Backend = 'pyvista',
)#

Bases: physicsnemo_curator.core.base.Source[physicsnemo.mesh.Mesh]

Read meshes from the AhmedML dataset on HuggingFace Hub.

Each index maps to one simulation run. The mesh_type parameter selects which mesh to load for each run:

  • "boundary" — surface mesh (VTP) with flow fields

  • "volume" — volumetric mesh (VTU, single file per run)

  • "slices" — x/y/z-normal slice planes (VTP); yields multiple meshes per index

  • "multi" — yields domain mesh (DomainMesh) and/or STL mesh

All modes attach CSV metadata (force/moment coefficients and geometric parameters) as global_data on each yielded mesh.

Parameters:
  • mesh_type ({"boundary", "volume", "slices", "multi"}) – Which mesh to read from each run directory.

  • mesh_parts (list[str] | None) – Mesh parts to yield in "multi" mode. Valid parts are "domain" and "stl". Defaults to ["domain"]. Ignored for non-multi modes.

  • url (str) – Base HuggingFace Hub URL. Override only for testing.

  • storage_options (dict[str, object] | None) – Extra fsspec keyword arguments (e.g. {"token": "hf_..."}).

  • cache_storage (str | None) – Local cache directory. If None, a temporary directory is used.

  • manifold_dim (int or {"auto"}) – Target manifold dimension for from_pyvista conversion. Only used with the "pyvista" backend.

  • point_source ({"vertices", "cell_centroids"}) – Point source mode for from_pyvista conversion. Only used with the "pyvista" backend.

  • warn_on_lost_data (bool) – Warn when data arrays are discarded during conversion. Only used with the "pyvista" backend.

  • backend ({"pyvista", "rust"}) –

    VTK reading backend:

    • "pyvista" (default): use PyVista for full-featured reading.

    • "rust": use the native Rust backend for faster reading. Note: The Rust backend only supports ASCII VTU/VTP files and does not support manifold_dim or point_source options.
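The backend restrictions above can be checked before constructing a source. The following is a minimal sketch, not the library's own validation: check_backend_options is a hypothetical helper, and this page does not say whether the real class raises, warns, or silently ignores unsupported combinations.

```python
from typing import List, Union


def check_backend_options(
    backend: str = "pyvista",
    manifold_dim: Union[int, str] = "auto",
    point_source: str = "vertices",
) -> List[str]:
    """Return a list of problems with a backend/option combination (hypothetical helper)."""
    problems = []
    if backend not in ("pyvista", "rust"):
        problems.append(f"unknown backend {backend!r}")
    if backend == "rust":
        # Per the parameter notes, these options only apply to the PyVista path,
        # so any non-default value is flagged here.
        if manifold_dim != "auto":
            problems.append("manifold_dim is not supported by the rust backend")
        if point_source != "vertices":
            problems.append("point_source is not supported by the rust backend")
    return problems
```

An empty list means the combination is consistent with the documented constraints. The ASCII-only restriction of the Rust backend cannot be checked this way because it depends on the files themselves.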

Examples

>>> source = AhmedMLSource(mesh_type="boundary")
>>> len(source)
500
>>> mesh = next(source[0])
>>> mesh.global_data["cd"]  # force coefficient from CSV
tensor([0.2405])
>>> source = AhmedMLSource(mesh_type="multi", mesh_parts=["domain", "stl"])
>>> for mesh in source[0]:
...     print(type(mesh).__name__)
DomainMesh
Mesh
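The calls shown above can be combined with run_id and mesh_name to build a small table of drag coefficients. This is a sketch only: summarize_drag is a hypothetical helper, it needs physicsnemo_curator and network access to HuggingFace Hub, and it assumes global_data["cd"] is a one-element tensor as in the example above.

```python
def summarize_drag(num_runs=3):
    """Collect (run_id, output name, cd) for the first few boundary meshes."""
    # Imported lazily: requires physicsnemo_curator to be installed.
    from physicsnemo_curator.domains.mesh.sources.ahmedml import AhmedMLSource

    source = AhmedMLSource(mesh_type="boundary")
    rows = []
    for index in range(min(num_runs, len(source))):
        # source[index] yields one or more meshes for the run.
        for seq, mesh in enumerate(source[index]):
            # Assumes a one-element tensor, so float() conversion is valid.
            cd = float(mesh.global_data["cd"])
            rows.append((source.run_id(index), source.mesh_name(index, seq), cd))
    return rows
```

Keeping num_runs small is advisable, since each boundary file is roughly 83 MB.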

Using the fast Rust backend:

>>> source = AhmedMLSource(mesh_type="boundary", backend="rust")


mesh_name(index: int, seq: int) → str#

Return the canonical output name for mesh at (index, seq).

Parameters:
  • index (int) – Source index (which run).

  • seq (int) – Sequence number within this index (ignored for boundary/volume).

Returns:

Resolved name like "boundary_1" or "volume_5".

Return type:

str
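As an illustration of the naming scheme, the sketch below reproduces the two documented examples. It is not the library's implementation. It assumes the number in the name is the dataset run ID, and the seq suffix used for "slices" and "multi" is purely an assumption, since this page only documents the boundary and volume forms. sketch_mesh_name is a hypothetical name.

```python
def sketch_mesh_name(mesh_type, run_id, seq=0):
    """Build an output name in the documented 'boundary_1' / 'volume_5' style."""
    if mesh_type in ("boundary", "volume"):
        # One mesh per run, so seq is ignored (as documented).
        return f"{mesh_type}_{run_id}"
    # Assumption: modes that yield several meshes disambiguate with seq.
    return f"{mesh_type}_{run_id}_{seq}"


print(sketch_mesh_name("boundary", 1))        # boundary_1
print(sketch_mesh_name("volume", 5, seq=3))   # volume_5
```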

classmethod params() → list[physicsnemo_curator.core.base.Param]#

Return parameter descriptors for the AhmedML source.

Returns:

Parameter list for CLI configuration.

Return type:

list[Param]

run_id(index: int) → int#

Return the dataset run ID for the given source index.

Parameters:

index (int) – Zero-based index into the sorted run list.

Returns:

The dataset run ID (e.g. 1, 5, 12).

Return type:

int
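Because run IDs need not be contiguous (e.g. 1, 5, 12), the source index and the run ID generally differ. The sketch below shows one way such a mapping could be built from discovered directory names. The "run_<N>" naming pattern and both helper names are assumptions for illustration, not taken from the library.

```python
import re


def sorted_run_ids(dir_names):
    """Extract integer run IDs from names like 'run_12' and sort them numerically."""
    ids = []
    for name in dir_names:
        match = re.fullmatch(r"run_(\d+)", name)  # assumed directory pattern
        if match:
            ids.append(int(match.group(1)))
    # Numeric sort: a plain string sort would place 'run_12' before 'run_5'.
    return sorted(ids)


def sketch_run_id(dir_names, index):
    """Map a zero-based source index to a run ID (hypothetical helper)."""
    return sorted_run_ids(dir_names)[index]


names = ["run_12", "run_1", "README.md", "run_5"]
print(sorted_run_ids(names))    # [1, 5, 12]
print(sketch_run_id(names, 2))  # 12
```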

description: ClassVar[str] = 'AhmedML dataset 500 Ahmed Car Body variants with hybrid RANS-LES CFD'#
name: ClassVar[str] = 'AhmedML'#
physicsnemo_curator.domains.mesh.sources.ahmedml.Backend#
physicsnemo_curator.domains.mesh.sources.ahmedml.MeshPart#
physicsnemo_curator.domains.mesh.sources.ahmedml.MeshType#