Script.atomic.shared_sub
- Script.atomic.shared_sub(dst, values, *, sem='relaxed', scope='cta', output=None)
Element-wise dst[i] = dst[i] - values[i], performed atomically on shared memory.
PTX has no native atom.sub; the codegen lowers this to atom.add with the negated operand. See shared_add() for the full parameter description.
Notes
Thread group: Can be executed by any sized thread group.
Hardware: Requires compute capability 7.0+ (sm_70).
PTX: atom.{sem}.{scope}.shared.add.s32 with a negated input.
- Parameters:
dst (SharedTensor)
values (RegisterTensor)
sem (str)
scope (str)
output (RegisterTensor | None)
- Return type:
RegisterTensor | None
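The lowering described above (subtract implemented as add with a negated operand) can be illustrated with a small pure-Python reference model. This is only a sketch of the arithmetic, assuming signed 32-bit wraparound as suggested by the .s32 suffix; it does not use the library's API, it does not model atomicity, memory ordering, or the output parameter, and the names to_s32 and shared_sub_reference are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_s32(x):
    """Wrap a Python int to a signed 32-bit value (two's complement)."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def shared_sub_reference(dst, values):
    """Hypothetical reference model, not the library implementation.

    Updates dst in place so that dst[i] = dst[i] + (-values[i]),
    with both the negation and the addition wrapped to s32.
    """
    for i, v in enumerate(values):
        negated = to_s32(-v)              # negate the operand first
        dst[i] = to_s32(dst[i] + negated)  # then perform a plain add


# Example: the last element shows wraparound at the s32 minimum.
dst = [10, 0, -2147483648]
shared_sub_reference(dst, [3, 5, 1])
```

With 32-bit wraparound, adding the negated operand gives the same result as a direct subtraction for every input, including the edge case where the operand is the minimum s32 value.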