cuda.core.experimental.LaunchConfig¶
- class cuda.core.experimental.LaunchConfig(grid: tuple | int = None, cluster: tuple | int = None, block: tuple | int = None, shmem_size: int | None = None, cooperative_launch: bool | None = False)¶
Customizable launch options.
Note
When cluster is specified, the grid parameter represents the number of clusters (not blocks). The hierarchy is: grid (clusters) -> cluster (blocks) -> block (threads). Each dimension in grid specifies clusters in the grid, each dimension in cluster specifies blocks per cluster, and each dimension in block specifies threads per block.
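The hierarchy above can be checked with plain arithmetic (no GPU required). The dimension values below are chosen for illustration only:

```python
import math

# With cluster specified, grid counts clusters, not blocks.
grid = (4, 1, 1)      # 4 clusters in the grid
cluster = (2, 1, 1)   # 2 blocks per cluster
block = (128, 1, 1)   # 128 threads per block

total_blocks = math.prod(grid) * math.prod(cluster)   # 4 * 2 = 8 blocks
total_threads = total_blocks * math.prod(block)       # 8 * 128 = 1024 threads
print(total_blocks, total_threads)
```

Without cluster, the same kernel footprint would be expressed directly as grid=(8, 1, 1), block=(128, 1, 1).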
- grid¶
Collection of threads that will execute a kernel function. When cluster is not specified, this represents the number of blocks, otherwise this represents the number of clusters.
- cluster¶
Group of blocks (Thread Block Cluster) that will execute on the same GPU Processing Cluster (GPC). Blocks within a cluster have access to distributed shared memory and can be explicitly synchronized.
- block¶
Group of threads (Thread Block) that will execute on the same streaming multiprocessor (SM). Threads within a thread block have access to shared memory and can be explicitly synchronized.
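The signature accepts each dimension as either a tuple or an int. As a sketch of how a scalar maps onto CUDA's three-dimensional launch space (a hypothetical stand-in based on CUDA's dim3 semantics, not cuda.core's actual implementation):

```python
# Hypothetical helper: pad a scalar or partial dimension to a 3-tuple,
# mirroring CUDA's dim3 convention where missing dimensions default to 1.
def as_dim3(dim):
    if isinstance(dim, int):
        return (dim, 1, 1)   # a scalar describes a 1-D launch
    return tuple(dim) + (1,) * (3 - len(dim))

print(as_dim3(32))      # (32, 1, 1)
print(as_dim3((8, 8)))  # (8, 8, 1)
```

Under this mapping, LaunchConfig(grid=32, block=128) would describe 32 one-dimensional blocks of 128 threads each.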