thread_pool#

Thread pool execution backend.

Uses concurrent.futures.ThreadPoolExecutor for parallel execution. Suitable for I/O-bound workloads where the GIL is not a bottleneck.

Classes#

ThreadPoolBackend

Execute pipeline items using a thread pool.

Module Contents#

class physicsnemo_curator.run.thread_pool.ThreadPoolBackend

Bases: physicsnemo_curator.run.base.RunBackend

Execute pipeline items using a thread pool.

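This backend uses Python’s concurrent.futures.ThreadPoolExecutor. It suits I/O-bound workloads, where threads spend most of their time waiting. CPU-bound tasks typically see little or no speedup because the GIL allows only one thread to execute Python bytecode at a time.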
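To illustrate why a thread pool helps with I/O-bound work, here is a minimal sketch that uses only the standard library. The function `fake_io_task` is a hypothetical stand-in for a pipeline item and is not part of physicsnemo_curator. It uses `time.sleep` to simulate a blocking read or network call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(index: int) -> str:
    """Hypothetical stand-in for an I/O-bound pipeline item."""
    # Sleeping releases the GIL, much like a blocking read or request would.
    time.sleep(0.05)
    return f"item-{index}"


indices = range(4)

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as executor:
    # map() yields results in input order, regardless of completion order.
    results = list(executor.map(fake_io_task, indices))
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Because the four waits overlap, the elapsed time should be close to one task's wait rather than the sum of all four. Replacing the sleep with a pure-Python computation would be expected to remove most of that gain.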

Backend Options#

max_workers : int | None

Maximum number of threads. Defaults to config.resolved_n_jobs.

thread_name_prefix : str

Prefix for thread names.
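Both option names match constructor arguments of concurrent.futures.ThreadPoolExecutor. The sketch below shows their effect on the standard-library executor directly. How the backend receives these options is not shown on this page, so passing them straight through to the executor is an assumption made for illustration. The prefix `"curator"` is also a made-up value.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def current_thread_name(_: int) -> str:
    """Return the name of the worker thread that runs this task."""
    return threading.current_thread().name


# max_workers caps the number of threads; thread_name_prefix labels them.
with ThreadPoolExecutor(max_workers=2, thread_name_prefix="curator") as executor:
    names = list(executor.map(current_thread_name, range(8)))

# At most two distinct worker names appear, each starting with the prefix.
print(sorted(set(names)))
```

A recognizable prefix makes worker threads easier to spot in log records and debugger thread lists.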

run(
pipeline: physicsnemo_curator.core.base.Pipeline[Any],
config: physicsnemo_curator.run.base.RunConfig,
) → list[list[str]]

Execute pipeline indices using a thread pool.

Parameters:
  • pipeline (Pipeline) – The pipeline to execute.

  • config (RunConfig) – Execution configuration.

Returns:

Sink outputs, one list per index.

Return type:

list[list[str]]
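The return shape can be sketched with the standard library alone. This is not the backend's actual implementation: `process_index` and `run_indices` are hypothetical helpers, the output paths are invented, and the sketch assumes results are returned in index order.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


def process_index(index: int) -> list[str]:
    """Hypothetical: process one pipeline index and return what the sink wrote."""
    return [f"out/{index}/part-{k}.txt" for k in range(2)]


def run_indices(n_items: int, max_workers: Optional[int] = None) -> list[list[str]]:
    """Submit one task per index and collect one list of outputs per index."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(process_index, i) for i in range(n_items)]
        # Reading futures in submission order keeps outputs aligned with indices.
        return [future.result() for future in futures]


outputs = run_indices(3, max_workers=2)
print(outputs)
```

Collecting results in submission order means `outputs[i]` always belongs to index `i`, even when tasks finish out of order.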

description: ClassVar[str] = 'Thread pool executor (good for I/O-bound tasks)'
name: ClassVar[str] = 'thread_pool'
requires: ClassVar[tuple[str, ...]] = ()