Script.wgmma.mma
- Script.wgmma.mma(a, b, d)
Perform warp group matrix multiply-accumulate (MMA) operation.
Computes d = a @ b + d, where a is in shared or register memory, b is in shared memory, and d is in register memory (both input accumulator and output).
All tensors must be 2D with compatible shapes: a is [M, K], b is [K, N], and d is [M, N].
A wgmma.fence() must be called before this instruction, and a wgmma.commit_group() followed by wgmma.wait_group() after it.
- Parameters:
a (SharedTensor | RegisterTensor) – The left-hand operand of the matrix multiplication. Shape [M, K].
b (SharedTensor) – The right-hand operand of the matrix multiplication. Shape [K, N].
d (RegisterTensor) – The accumulator tensor, used as both input and output. Shape [M, N].
- Return type:
None
Notes
Thread group: Must be executed by a warp group (4 warps).
Hardware: Requires compute capability 9.0a+ (sm_90a).
PTX: wgmma.mma_async.sync.aligned
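Example: the accumulate semantics described above (d = a @ b + d, with d updated in place and nothing returned) can be illustrated with a small host-side model. This is only a sketch in plain Python on nested lists. The function name wgmma_mma_reference is hypothetical and is not part of this API. It does not model shared or register memory, warp groups, or the fence/commit/wait ordering.

```python
def wgmma_mma_reference(a, b, d):
    """Hypothetical host-side model of d = a @ b + d (not the real API).

    a is [M, K], b is [K, N], d is [M, N], all given as nested lists.
    d is updated in place, mirroring its role as input accumulator and output.
    """
    m, k = len(a), len(a[0])
    n = len(b[0])
    # Shape checks: a is [M, K], b is [K, N], d is [M, N].
    if any(len(row) != k for row in a):
        raise ValueError("a must be a rectangular [M, K] matrix")
    if len(b) != k or any(len(row) != n for row in b):
        raise ValueError("b must have shape [K, N] with K matching a")
    if len(d) != m or any(len(row) != n for row in d):
        raise ValueError("d must have shape [M, N]")
    for i in range(m):
        for j in range(n):
            acc = d[i][j]  # start from the existing accumulator value
            for p in range(k):
                acc += a[i][p] * b[p][j]
            d[i][j] = acc  # write the result back into the accumulator
    return None


a = [[1, 2, 3], [4, 5, 6]]      # [M=2, K=3]
b = [[1, 0], [0, 1], [1, 1]]    # [K=3, N=2]
d = [[10, 20], [30, 40]]        # [M=2, N=2], holds prior accumulator values
wgmma_mma_reference(a, b, d)
print(d)  # [[14, 25], [40, 51]]
```

Because d is both read and written, calling the operation repeatedly with successive K-slices of a and b accumulates the full product into the same d.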