Script.mbarrier.alloc

Script.mbarrier.alloc(counts)

Allocate and initialize one or more mbarriers in shared memory.

Each barrier is a 64-bit object in shared memory, initialized with:

  • phase = 0

  • pending_arrivals = counts[i] (the expected arrival count)

  • expected_arrivals = counts[i] (used to reset pending_arrivals on phase flip)

  • tx-count = 0

A single value allocates one barrier; a sequence allocates multiple barriers.
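The initial state listed above can be illustrated with a small host-side model. This is only a sketch for building intuition: `MBarrierModel`, `alloc_model`, and `arrive` are hypothetical names that are not part of this API, the model does not reflect the real 64-bit layout, and treating the phase as a single parity bit is an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass
class MBarrierModel:
    """Toy model of one barrier's logical state (hypothetical, not the hardware layout)."""

    expected_arrivals: int
    pending_arrivals: int
    phase: int = 0      # initialized to 0, as documented
    tx_count: int = 0   # initialized to 0; this sketch never changes it

    def arrive(self) -> None:
        """Record one arrival; flip the phase once no arrivals remain pending."""
        self.pending_arrivals -= 1
        if self.pending_arrivals == 0 and self.tx_count == 0:
            # Assumption: the phase is modeled as a parity bit.
            self.phase ^= 1
            # expected_arrivals resets pending_arrivals on the phase flip.
            self.pending_arrivals = self.expected_arrivals


def alloc_model(counts: Union[int, Sequence[int]]) -> List[MBarrierModel]:
    """Mimic the single-value vs. sequence behavior of `counts`."""
    if isinstance(counts, int):
        counts = [counts]  # a single value allocates one barrier
    barriers = []
    for count in counts:
        if count <= 0:
            raise ValueError("each count must be a positive integer")
        barriers.append(
            MBarrierModel(expected_arrivals=count, pending_arrivals=count)
        )
    return barriers
```

In this model, `alloc_model(4)` yields one barrier and `alloc_model([2, 1])` yields two, with barrier i initialized from `counts[i]`.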

Parameters:

counts (Sequence[Expr | int] | Expr | int) – Expected arrival count for each barrier, given as a single value or a sequence of values. Each count must evaluate to a positive int32.

Returns:

ret – A register tensor of dtype uint32 containing the shared memory address(es) of the allocated barrier(s). Element i holds the address for counts[i].

Return type:

RegisterTensor

Notes

  • Thread group: Can be executed by a thread group of any size.

  • Hardware: Requires compute capability 8.0+ (sm_80).

  • PTX: mbarrier.init.shared::cta.b64