loky#

Loky (joblib) execution backend.

Uses joblib.Parallel with the loky backend for robust parallel execution. Loky provides better process management than the standard multiprocessing module, including automatic worker restart on crashes and better memory cleanup.

Classes#

LokyBackend

Execute pipeline items using joblib with the loky backend.

Module Contents#

class physicsnemo_curator.run.loky.LokyBackend[source]#

Bases: physicsnemo_curator.run.base.RunBackend

Execute pipeline items using joblib with the loky backend.

Loky is a robust process executor that handles worker crashes gracefully and provides better memory management than standard multiprocessing. Key advantages over ProcessPoolExecutor:

  • Automatic worker restart: If a worker crashes or is killed, loky automatically restarts it without failing the entire job.

  • Memory cleanup: Workers are recycled after processing a configurable number of tasks to prevent memory leaks.

  • Robust serialization: Uses cloudpickle for better serialization of complex objects including lambdas and closures.

  • Timeout handling: Built-in timeout support per task.

Warning

Stateful filters accumulate per-process state that is not merged back into the parent process.

Backend Options#

preferstr

Soft hint for parallelization (“processes” or “threads”).

requirestr

Hard constraint for parallelization.

verboseint

Verbosity level (0-50). If not set, uses 0 (quiet).

batch_sizeint | str

Number of tasks per batch (“auto” or int).

pre_dispatchstr | int

Number of batches to pre-dispatch.

temp_folderstr | None

Folder for memmapping large arrays.

timeoutfloat | None

Timeout in seconds for retrieving results.

max_nbytesstr | int | None

Threshold for automatic memmapping (e.g., “1M”).

run(
pipeline: physicsnemo_curator.core.base.Pipeline[Any],
config: physicsnemo_curator.run.base.RunConfig,
) list[list[str]][source]#

Execute pipeline indices using joblib/loky.

Parameters:
  • pipeline (Pipeline) – The pipeline to execute.

  • config (RunConfig) – Execution configuration.

Returns:

Sink outputs, one list per index.

Return type:

list[list[str]]

Raises:

ImportError – If joblib is not installed.

description: ClassVar[str] = 'Joblib with loky backend (robust process management)'#
name: ClassVar[str] = 'loky'#
requires: ClassVar[tuple[str, Ellipsis]] = ('joblib',)#