Script.tcgen05.alloc
- Script.tcgen05.alloc(dtype, shape, cta_group=1)
Allocate a tensor in tensor memory (TMEM).
Tensor memory is a high-bandwidth on-chip memory space accessible by the tensor core on Blackwell GPUs. The allocated tensor can be used as an accumulator for MMA operations or for load/store operations.
- Parameters:
dtype (DataType) – The data type of the tensor elements (e.g., float32, float16).
shape (Sequence[int]) – The shape of the tensor. Must have at least 2 dimensions. The second-to-last dimension (shape[-2]) must be 32, 64, or 128.
cta_group (int) – The CTA group size for the allocation. Must be 1 or 2. When 2, the tensor is shared across two CTAs in the same cluster.
- Returns:
ret – The tensor allocated in tensor memory (TMEM).
- Return type:
Notes
Thread group: Must be executed by a thread group with at least 32 threads (one warp).
Hardware: Requires compute capability 10.0+ (sm_100).
PTX: tcgen05.alloc
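The argument constraints documented above (rank of at least 2, shape[-2] in {32, 64, 128}, cta_group of 1 or 2, and a thread group of at least one warp) can be checked before an allocation is attempted. The following is a minimal sketch in plain Python. The helper name check_tmem_alloc_args and its num_threads parameter are hypothetical and are not part of the Script API; the helper only mirrors the rules stated in this entry and does not call tcgen05.alloc or touch the GPU.

```python
from typing import Sequence

# Values taken from the parameter and Notes text of this entry.
ALLOWED_ROWS = (32, 64, 128)   # permitted values for shape[-2]
ALLOWED_CTA_GROUPS = (1, 2)    # permitted values for cta_group
MIN_THREADS = 32               # one warp


def check_tmem_alloc_args(
    shape: Sequence[int],
    cta_group: int = 1,
    num_threads: int = MIN_THREADS,
) -> None:
    """Hypothetical pre-check: raise ValueError if a documented rule is violated."""
    if len(shape) < 2:
        raise ValueError(f"shape must have at least 2 dimensions, got {len(shape)}")
    if shape[-2] not in ALLOWED_ROWS:
        raise ValueError(f"shape[-2] must be one of {ALLOWED_ROWS}, got {shape[-2]}")
    if cta_group not in ALLOWED_CTA_GROUPS:
        raise ValueError(f"cta_group must be 1 or 2, got {cta_group}")
    if num_threads < MIN_THREADS:
        raise ValueError(f"thread group needs at least {MIN_THREADS} threads, got {num_threads}")


# Passes silently: 128 rows, default cta_group, one warp.
check_tmem_alloc_args((128, 256))

# Rejected: 48 is not a permitted value for shape[-2].
try:
    check_tmem_alloc_args((48, 256))
except ValueError as err:
    print(err)
```

A check of this kind only covers the rules written in this entry; any further limits imposed by the hardware or the compiler (for example, total TMEM capacity) are outside its scope.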