Script.tcgen05.alloc

Script.tcgen05.alloc(dtype, shape, cta_group=1)

Allocate a tensor in tensor memory (TMEM).

Tensor memory is a high-bandwidth on-chip memory space accessible by the tensor core on Blackwell GPUs. The allocated tensor can be used as an accumulator for MMA operations or for load/store operations.

Parameters:
  • dtype (DataType) – The data type of the tensor elements (e.g., float32, float16).

  • shape (Sequence[int]) – The shape of the tensor. Must have at least 2 dimensions. The second-to-last dimension (shape[-2]) must be 32, 64, or 128.

  • cta_group (int) – The CTA group size for the allocation. Must be 1 or 2. When 2, the tensor is shared across two CTAs in the same cluster.
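The parameter constraints above can be sketched as a small host-side check. This is a minimal illustration in plain Python, not part of the library: the helper name `check_tmem_alloc_args` and the constants are hypothetical, and the real `alloc` may report violations differently or enforce additional rules not listed here.

```python
from typing import Sequence

# Values taken from the parameter descriptions above.
ALLOWED_ROWS = (32, 64, 128)     # permitted values for shape[-2]
ALLOWED_CTA_GROUPS = (1, 2)      # permitted values for cta_group


def check_tmem_alloc_args(shape: Sequence[int], cta_group: int = 1) -> None:
    """Hypothetical helper: raise ValueError if the documented constraints fail."""
    if len(shape) < 2:
        raise ValueError(f"shape must have at least 2 dimensions, got {len(shape)}")
    if shape[-2] not in ALLOWED_ROWS:
        raise ValueError(f"shape[-2] must be one of {ALLOWED_ROWS}, got {shape[-2]}")
    if cta_group not in ALLOWED_CTA_GROUPS:
        raise ValueError(f"cta_group must be 1 or 2, got {cta_group}")


# A 128 x 256 tensor with the default cta_group satisfies every documented rule.
check_tmem_alloc_args((128, 256))
```

Running such a check before building the kernel is one way to surface a bad shape early; whether the library performs the same validation itself is not stated here.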

Returns:

ret – The tensor allocated in tensor memory.

Return type:

TMemoryTensor

Notes

  • Thread group: Must be executed by a thread group with at least 32 threads (one warp).

  • Hardware: Requires compute capability 10.0+ (sm_100).

  • PTX: tcgen05.alloc
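The thread-group and hardware notes can likewise be expressed as a simple predicate. The sketch below assumes the caller already knows the device's compute capability as a `(major, minor)` tuple and the size of the executing thread group; the function name `can_issue_tmem_alloc` is hypothetical and does not come from the library.

```python
from typing import Tuple

MIN_COMPUTE_CAPABILITY = (10, 0)  # sm_100, per the Hardware note
MIN_THREADS = 32                  # one warp, per the Thread group note


def can_issue_tmem_alloc(compute_capability: Tuple[int, int], num_threads: int) -> bool:
    """Hypothetical helper: True if both documented requirements are met."""
    # Tuples compare lexicographically, so (10, 0) <= (10, 3) and (9, 0) < (10, 0).
    return compute_capability >= MIN_COMPUTE_CAPABILITY and num_threads >= MIN_THREADS
```

For example, `can_issue_tmem_alloc((9, 0), 128)` is false because the device predates sm_100, and `can_issue_tmem_alloc((10, 0), 16)` is false because the thread group is smaller than a warp.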