Script.mbarrier.arrive
- Script.mbarrier.arrive(barrier, count=1, sem='release', scope='cta')
Arrive at a barrier.
Each thread in the current thread group decrements the barrier’s pending arrival count by
count. When the pending arrival count and the tx-count both reach zero, the hardware flips the phase and resets the counters for the next phase.
With sem='release', all prior memory writes by the arriving thread are guaranteed visible to any thread that later completes a successful wait with acquire semantics on this barrier.
- Parameters:
barrier (RegisterTensor) – A single-element uint32 register tensor holding the barrier’s shared memory address.
count (Expr | int) – The number of arrivals contributed by each thread. Must evaluate to a positive int32. Default is 1.
sem (str) – Memory ordering semantics. 'release' ensures prior writes are visible to waiters; 'relaxed' provides no ordering guarantees. Candidates: 'relaxed', 'release'.
scope (str) – Synchronization scope. Candidates: 'cta', 'cluster'.
- Return type:
None
Notes
Thread group: Can be executed by a thread group of any size.
Hardware: Requires compute capability 8.0+ (sm_80).
PTX:
mbarrier.arrive.shared::cta.b64
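Example
The arrival counting described above can be illustrated with a small pure-Python model. This is only a sketch for reasoning about phases: ToyBarrier, its fields, and group_arrive are hypothetical names that are not part of this library, the model ignores memory ordering and scope entirely, and raising an error on over-arrival is a choice made for this sketch rather than documented hardware behavior.

```python
from dataclasses import dataclass


@dataclass
class ToyBarrier:
    """Hypothetical software model of an mbarrier's arrival counting.

    Not the hardware or the library implementation; it models counters only.
    """

    expected: int      # arrivals required to complete one phase
    pending: int = 0   # arrivals still outstanding in the current phase
    tx_count: int = 0  # outstanding transaction count (left at zero here)
    phase: int = 0     # current phase parity, 0 or 1

    def __post_init__(self) -> None:
        # A freshly initialized barrier waits for `expected` arrivals.
        self.pending = self.expected

    def arrive(self, count: int = 1) -> None:
        """Model one thread arriving with `count` arrivals."""
        if count <= 0:
            raise ValueError("count must be positive")
        if count > self.pending:
            # Assumption of this sketch: reject over-arrival outright.
            raise ValueError("more arrivals than are pending in this phase")
        self.pending -= count
        if self.pending == 0 and self.tx_count == 0:
            # Phase completes: flip parity and reset the counter.
            self.phase ^= 1
            self.pending = self.expected


def group_arrive(barrier: ToyBarrier, num_threads: int, count: int = 1) -> None:
    """Model every thread of a thread group arriving with the same count."""
    for _ in range(num_threads):
        barrier.arrive(count)


# A barrier expecting 8 arrivals; 4 threads each contribute count=2.
bar = ToyBarrier(expected=8)
group_arrive(bar, num_threads=3, count=2)
state_before = (bar.phase, bar.pending)  # still in phase 0, 2 arrivals pending
bar.arrive(2)
state_after = (bar.phase, bar.pending)   # phase flipped, counter reset to 8
```

In this model the total contribution of a thread group is the number of threads multiplied by count, so the barrier's expected arrival count has to be chosen with both in mind.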