Rank Filter

class nvidia_resiliency_ext.inprocess.rank_filter.ActiveWorldSizeDivisibleBy(divisor=1)[source]

Note

ActiveWorldSizeDivisibleBy is deprecated and will be removed in the next release. Its functionality has moved to inprocess.rank_assignment.ActiveWorldSizeDivisibleBy.

ActiveWorldSizeDivisibleBy ensures that the active world size is divisible by divisor. Ranks whose indices fall within the adjusted world size are marked active and call the wrapped function; ranks outside this range are marked inactive.

Parameters:

divisor (int) – the value that the active world size must be divisible by
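The selection rule described above can be sketched in plain Python. This is an illustrative model, not the library's implementation: the helper name active_ranks_divisible_by is hypothetical, and the sketch assumes the world size is rounded down to the nearest multiple of divisor.

```python
from typing import List


def active_ranks_divisible_by(world_size: int, divisor: int = 1) -> List[int]:
    """Hypothetical helper: return the rank indices that would be active.

    Assumption: the active world size is the largest multiple of
    `divisor` that does not exceed `world_size`.
    """
    if divisor < 1:
        raise ValueError("divisor must be a positive integer")
    active_world_size = (world_size // divisor) * divisor
    # Ranks with indices below the adjusted size are active; the rest are inactive.
    return list(range(active_world_size))


if __name__ == "__main__":
    # 10 ranks, divisor 4: ranks 0..7 are active, ranks 8 and 9 are inactive.
    print(active_ranks_divisible_by(10, 4))
```

Under this assumption, the default divisor=1 leaves every rank active.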

class nvidia_resiliency_ext.inprocess.rank_filter.MaxActiveWorldSize(max_active_world_size=None)[source]

Note

MaxActiveWorldSize is deprecated and will be removed in the next release. Its functionality has moved to inprocess.rank_assignment.MaxActiveWorldSize.

MaxActiveWorldSize ensures that the active world size does not exceed max_active_world_size. Ranks with indices less than the active world size are active and call the wrapped function; all other ranks are inactive.

Parameters:

max_active_world_size (int | None) – maximum active world size; no limit if None
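As a rough illustration of the capping rule, the following plain-Python sketch models which rank indices stay active. The helper name active_ranks_capped is hypothetical and does not call the library.

```python
from typing import List, Optional


def active_ranks_capped(
    world_size: int, max_active_world_size: Optional[int] = None
) -> List[int]:
    """Hypothetical helper: return the rank indices that would be active."""
    if max_active_world_size is None:
        # No limit: every rank is active.
        active_world_size = world_size
    else:
        active_world_size = min(world_size, max_active_world_size)
    # Ranks with indices below the active world size are active.
    return list(range(active_world_size))


if __name__ == "__main__":
    # 10 ranks capped at 6: ranks 0..5 are active, ranks 6..9 are inactive.
    print(active_ranks_capped(10, 6))
```

A cap larger than the current world size has no effect in this model, because the active world size cannot exceed the number of available ranks.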

class nvidia_resiliency_ext.inprocess.rank_filter.RankFilter[source]

Note

RankFilter is deprecated and will be removed in the next release. Its functionality has been merged into inprocess.rank_assignment.RankAssignment.

RankFilter selects which ranks are active in the current restart iteration of inprocess.Wrapper.

Active ranks call the wrapped function. Inactive ranks wait idle and can serve as a pool of static, preallocated, and preinitialized reserve ranks. Reserve ranks are activated in a subsequent restart iteration if previously active ranks were terminated or became unhealthy.

Multiple instances of RankFilter can be composed with inprocess.Compose to achieve the desired behavior.
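When filters are combined, the order of application can change the result. The sketch below models this in plain Python with hypothetical stand-ins (divisible_by, max_active, chain) that operate on a world size only. It does not use inprocess.Compose, and the left-to-right order of chain is an assumption of this sketch, so check the order in which inprocess.Compose applies its arguments before relying on a particular combination.

```python
from typing import Callable, Optional

# Hypothetical stand-in type: maps a world size to an active world size.
SizeFilter = Callable[[int], int]


def divisible_by(divisor: int) -> SizeFilter:
    """Round the size down to a multiple of `divisor` (assumed behavior)."""
    return lambda size: (size // divisor) * divisor


def max_active(limit: Optional[int]) -> SizeFilter:
    """Cap the size at `limit`; None means no limit."""
    return lambda size: size if limit is None else min(size, limit)


def chain(*filters: SizeFilter) -> SizeFilter:
    """Apply filters left to right (an assumption of this sketch)."""
    def apply(size: int) -> int:
        for size_filter in filters:
            size = size_filter(size)
        return size
    return apply


cap_then_round = chain(max_active(6), divisible_by(4))
round_then_cap = chain(divisible_by(4), max_active(6))

if __name__ == "__main__":
    print(cap_then_round(10))  # 4: capped to 6, then rounded down to 4
    print(round_then_cap(10))  # 6: rounded down to 8, then capped to 6
```

In this model only the first ordering guarantees a result divisible by 4. With 10 ranks it leaves ranks 4 through 9 inactive, so up to six reserve ranks are available to replace active ranks that fail in a later restart iteration.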

class nvidia_resiliency_ext.inprocess.rank_filter.WorldSizeDivisibleBy(*args, **kwargs)[source]