cuda.core.LaunchConfig

class cuda.core.LaunchConfig(
grid: int | tuple[int, ...] | None = None,
cluster: int | tuple[int, ...] | None = None,
block: int | tuple[int, ...] | None = None,
shmem_size: int | None = None,
is_cooperative: bool = False,
)

Customizable launch options.

Note

When cluster is specified, the grid parameter represents the number of clusters (not blocks). The hierarchy is: grid (clusters) -> cluster (blocks) -> block (threads). Each dimension in grid specifies clusters in the grid, each dimension in cluster specifies blocks per cluster, and each dimension in block specifies threads per block.
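The hierarchy in this note can be worked through with plain arithmetic. The sketch below is illustrative only: `launch_extents` is a hypothetical helper, not part of cuda.core, and it assumes every argument has already been written as a 3-element tuple.

```python
from math import prod


def launch_extents(grid, block, cluster=None):
    """Return (blocks_per_dim, total_blocks, total_threads).

    Hypothetical helper for illustration; all arguments are 3-tuples.
    """
    if cluster is None:
        # Without a cluster, grid already counts blocks.
        blocks_per_dim = tuple(grid)
    else:
        # With a cluster, grid counts clusters, so each dimension is
        # scaled by the number of blocks per cluster in that dimension.
        blocks_per_dim = tuple(g * c for g, c in zip(grid, cluster))
    total_blocks = prod(blocks_per_dim)
    total_threads = total_blocks * prod(block)
    return blocks_per_dim, total_blocks, total_threads


# 4 x 2 clusters, 2 x 1 blocks per cluster, 128 threads per block.
with_cluster = launch_extents(grid=(4, 2, 1), cluster=(2, 1, 1), block=(128, 1, 1))

# Same grid and block values, but no cluster: grid now counts blocks.
without_cluster = launch_extents(grid=(4, 2, 1), block=(128, 1, 1))
```

With these inputs, the clustered case covers 8 x 2 x 1 = 16 blocks and 2048 threads, while the same grid value without a cluster covers 8 blocks and 1024 threads.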

grid

Collection of threads that will execute a kernel function. When cluster is not specified, this is the number of blocks; otherwise, it is the number of clusters.

Type:

Union[tuple, int]
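Because the value may be either an int or a tuple, it can help to picture both forms as a three-dimensional extent. The following is a minimal sketch under an assumption: an int is treated as a one-dimensional extent and missing trailing dimensions are filled with 1, which is the usual CUDA convention. Whether LaunchConfig normalizes its inputs in exactly this way is not stated here, and `as_3_tuple` is a hypothetical name.

```python
def as_3_tuple(dim):
    """Expand an int or a 1- to 3-element tuple into an (x, y, z) tuple.

    Hypothetical helper; it mirrors the common CUDA convention of
    padding unspecified dimensions with 1.
    """
    if isinstance(dim, int):
        dim = (dim,)
    dim = tuple(dim)
    if not 1 <= len(dim) <= 3:
        raise ValueError(f"expected 1 to 3 dimensions, got {len(dim)}")
    if any(not isinstance(d, int) or d < 1 for d in dim):
        raise ValueError(f"dimensions must be positive integers, got {dim}")
    # Pad the missing trailing dimensions with 1.
    return dim + (1,) * (3 - len(dim))


grid_1d = as_3_tuple(256)
grid_2d = as_3_tuple((16, 16))
```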

cluster

Group of blocks (Thread Block Cluster) that will execute on the same GPU Processing Cluster (GPC). Blocks within a cluster have access to distributed shared memory and can be explicitly synchronized.

Type:

Union[tuple, int]

block

Group of threads (Thread Block) that will execute on the same streaming multiprocessor (SM). Threads within a thread block have access to shared memory and can be explicitly synchronized.

Type:

Union[tuple, int]

shmem_size

Dynamic shared-memory size per thread block, in bytes. Defaults to 0.

Type:

int, optional
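Since the size is given in bytes per thread block, it is typically derived from the block shape and the element type the kernel stages in shared memory. A minimal sketch, assuming each thread stages a fixed number of elements; `dynamic_shmem_bytes` and its parameters are hypothetical and not part of cuda.core:

```python
def dynamic_shmem_bytes(block, elements_per_thread=1, itemsize=4):
    """Bytes of dynamic shared memory needed by one thread block.

    Hypothetical helper: `block` is a tuple of thread extents and
    `itemsize` is the element size in bytes (4 for float32).
    """
    threads = 1
    for extent in block:
        threads *= extent
    return threads * elements_per_thread * itemsize


# A 16 x 16 block where each thread stages one float32 value.
tile_bytes = dynamic_shmem_bytes((16, 16, 1))
```

The resulting value is what one might pass as shmem_size; kernels that use only statically sized shared memory can leave it at the default.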

is_cooperative

Whether this config can be used to launch a cooperative kernel.

Type:

bool, optional
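In CUDA, a cooperative launch generally requires that every block of the grid can be resident on the device at the same time. The sketch below shows that check as plain arithmetic. It is an assumption-laden illustration: `fits_cooperative_launch` is a hypothetical helper, and the occupancy and SM-count figures are made-up inputs that would normally come from an occupancy query and the device attributes.

```python
def fits_cooperative_launch(total_blocks, max_active_blocks_per_sm, sm_count):
    """Return True if all blocks can be co-resident on the device.

    Hypothetical helper; the two device figures are supplied by the
    caller rather than queried from a real GPU.
    """
    return total_blocks <= max_active_blocks_per_sm * sm_count


# Made-up device: 80 SMs, 2 resident blocks per SM for this kernel.
small_grid_ok = fits_cooperative_launch(16, max_active_blocks_per_sm=2, sm_count=80)
large_grid_ok = fits_cooperative_launch(200, max_active_blocks_per_sm=2, sm_count=80)
```

Here the 16-block grid fits within the 160 co-resident blocks of the made-up device, while the 200-block grid does not.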