base#
Base classes for pipeline execution backends.
This module defines the abstract interface that all execution backends must implement, along with common utilities.
Classes#
Abstract base class for pipeline execution backends. |
|
Configuration for pipeline execution. |
|
Multi-line progress display showing per-worker activity. |
Functions#
|
Return a tqdm progress bar or None. |
|
Process a single pipeline index. |
|
Process a single pipeline index (packed arguments for map functions). |
Module Contents#
- class physicsnemo_curator.run.base.RunBackend[source]#
Bases:
abc.ABCAbstract base class for pipeline execution backends.
Subclasses implement different parallelization strategies (threading, multiprocessing, distributed computing, workflow orchestrators, etc.).
Class Attributes#
- namestr
Unique identifier for this backend (e.g., “sequential”, “thread_pool”).
- descriptionstr
Human-readable description of the backend.
- requirestuple[str, …]
Optional package dependencies required by this backend.
- classmethod is_available() bool[source]#
Check if this backend’s dependencies are installed.
- Returns:
True if all required packages are available.
- Return type:
- class physicsnemo_curator.run.base.RunConfig[source]#
Configuration for pipeline execution.
- Parameters:
n_jobs (int) – Number of parallel workers.
1forces sequential execution.-1uses all available CPUs. Values<= 0follow the conventioncpu_count + 1 + n_jobs.progress (bool) – Whether to show a progress indicator (if supported by backend).
indices (list[int] | None) – Specific source indices to process.
Noneprocesses all indices.backend_options (dict[str, Any]) – Additional backend-specific options.
- class physicsnemo_curator.run.base.WorkerProgressDisplay( )[source]#
Multi-line progress display showing per-worker activity.
Renders an overall progress bar plus one bar per active worker (up to
_MAX_WORKER_BARS). Falls back gracefully when tqdm is not installed or progress is disabled.- Parameters:
- complete_item() None[source]#
Increment the overall bar without worker tracking.
Use this for backends where individual worker identity is not available.
- worker_done(worker_id: int) None[source]#
Mark a worker as idle and update the overall bar.
- Parameters:
worker_id (int) – Zero-based worker identifier.
- physicsnemo_curator.run.base.process_single_index(
- pipeline: physicsnemo_curator.core.base.Pipeline[Any],
- index: int,
Process a single pipeline index.
This is a module-level function to support pickling for multiprocess backends. After processing, any stateful filters with
flushmethods are automatically flushed to shard files.