process_pool#
Process pool execution backend.
Uses concurrent.futures.ProcessPoolExecutor for parallel execution.
Suitable for CPU-bound workloads that benefit from true parallelism.
Classes#
Execute pipeline items using a process pool. |
Module Contents#
- class physicsnemo_curator.run.process_pool.ProcessPoolBackend[source]#
Bases:
physicsnemo_curator.run.base.RunBackendExecute pipeline items using a process pool.
This backend uses Python’s
concurrent.futures.ProcessPoolExecutor. It provides true parallelism for CPU-bound workloads by spawning separate processes that bypass the GIL.Warning
Stateful filters accumulate per-process state that is not merged back into the parent process. Use sequential execution when filter side-effects must be aggregated.
Backend Options#
- max_workersint | None
Maximum number of processes. Defaults to
config.resolved_n_jobs.- mp_contextstr | None
Multiprocessing context (“spawn”, “fork”, “forkserver”).
- initializerCallable | None
Callable to run at the start of each worker process.
- initargstuple
Arguments for the initializer.
- run(
- pipeline: physicsnemo_curator.core.base.Pipeline[Any],
- config: physicsnemo_curator.run.base.RunConfig,
Execute pipeline indices using a process pool.