PTX Instructions
- ld
- st
- shr
- shl
- bmsk
- prmt
- barrier.cluster
- bfind
- clusterlaunchcontrol
- cp.async.bulk
- cp.async.bulk.commit_group
- cp.async.bulk.wait_group
- cp.async.bulk.tensor
- cp.async.mbarrier.arrive
- cp.reduce.async.bulk
- cp.reduce.async.bulk.tensor
- exit
- fence
- getctarank
- mapa
- mbarrier.init
- mbarrier.arrive
- mbarrier.expect_tx
- mbarrier.test_wait
- mbarrier.try_wait
- multimem.ld_reduce
- multimem.red
- multimem.st
- red.async
- shfl.sync
- st.async
- st.bulk
- tcgen05.alloc
- tcgen05.commit
- tcgen05.cp
- tcgen05.fence
- tcgen05.ld
- tcgen05.mma
- tcgen05.mma.ws
- tcgen05.shift
- tcgen05.st
- tcgen05.wait
- tensormap.replace
- tensormap.cp_fenceproxy
- trap
- Special registers
Instructions by section

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | CCCL 3.0.0 |
| | No |
| | No |
| | No |
| | No |
| | No |
| | Yes, CCCL 3.0.0 / CUDA 13.0 |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | Yes, CCCL 3.0.0 / CUDA 13.0 |
| | Yes, CCCL 3.0.0 / CUDA 13.0 |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | Yes, CCCL 2.9.0 / CUDA 12.9 |
| | Yes, CCCL 3.0.0 / CUDA 13.0 |
| | Yes, CCCL 3.0.0 / CUDA 13.0 |
| | Yes, CCCL 3.0.0 / CUDA 13.0 |
| | No |
| | Yes, CCCL 3.0.0 / CUDA 13.0 |
| | CCCL 2.3.0 / CUDA 12.4 |
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.8 / CUDA 12.9 |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | CCCL 2.4.0 / CUDA 12.5 |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | CCCL 2.4.0 / CUDA 12.5 |
| | CCCL 2.4.0 / CUDA 12.5 |
| | No |
| | CCCL 2.4.0 / CUDA 12.5 |
| | CCCL 2.4.0 / CUDA 12.5 |
| | No |
| | CCCL 2.4.0 / CUDA 12.5 |
| | CCCL 2.4.0 / CUDA 12.5 |
| | CCCL 2.4.0 / CUDA 12.5 |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | CCCL 3.0.0 |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | CCCL 2.4.0 / CUDA 12.5 |
| | No |
| | CCCL 2.4.0 / CUDA 12.5 |
| | No |
| | No |
| | CCCL 2.3.0 / CUDA 12.4 |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | CCCL 2.5.0 / CUDA Future |
| | No |
| | No |
| | No |
| | CCCL 2.3.0 / CUDA 12.4 |
| | No |
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.3.0 / CUDA 12.4 |
| | CCCL 2.3.0 / CUDA 12.4 |
| | No |
| | CCCL 2.4.0 / CUDA 12.5 |
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.8 / CUDA 12.9 |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.8 / CUDA 12.9 |
| | CCCL 2.8 / CUDA 12.9 |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | No |

| Instruction | Available in libcu++ |
|---|---|
| | No |
| | No |
| | No |
| | CCCL 3.0.0 |
| | No |

| Instruction | PTX ISA | SM Version | Available in libcu++ |
|---|---|---|---|
| | 20 | All | CCCL 2.4.0 / CUDA 12.5 |
| | 20 | All | CCCL 2.4.0 / CUDA 12.5 |
| | 13 | All | CCCL 2.4.0 / CUDA 12.5 |
| | 13 | All | CCCL 2.4.0 / CUDA 12.5 |
| | 20 | 20 | CCCL 2.4.0 / CUDA 12.5 |
| | 20 | All | CCCL 2.4.0 / CUDA 12.5 |
| | 20 | All | CCCL 2.4.0 / CUDA 12.5 |
| | 13 | All | CCCL 2.4.0 / CUDA 12.5 |
| | 20 | 20 | CCCL 2.4.0 / CUDA 12.5 |
| | 30 | 30 | CCCL 2.4.0 / CUDA 12.5 |
| | 78 | 90 | CCCL 2.4.0 / CUDA 12.5 |
| | 78 | 90 | CCCL 2.4.0 / CUDA 12.5 |
| | 78 | 90 | CCCL 2.4.0 / CUDA 12.5 |
| | 78 | 90 | CCCL 2.4.0 / CUDA 12.5 |
| | 78 | 90 | CCCL 2.4.0 / CUDA 12.5 |
| | 78 | 90 | CCCL 2.4.0 / CUDA 12.5 |
| | 78 | 90 | CCCL 2.4.0 / CUDA 12.5 |
| | 20 | 20 | CCCL 2.4.0 / CUDA 12.5 |
| | 20 | 20 | CCCL 2.4.0 / CUDA 12.5 |
| | 20 | 20 | CCCL 2.4.0 / CUDA 12.5 |
| | 20 | 20 | CCCL 2.4.0 / CUDA 12.5 |
| | 20 | 20 | CCCL 2.4.0 / CUDA 12.5 |
| | 10 | All | CCCL 2.4.0 / CUDA 12.5 |
| | 20 | 20 | CCCL 2.4.0 / CUDA 12.5 |
| | No | | |
| | No | | |
| | No | | |
| | 31 | 31 | CCCL 2.4.0 / CUDA 12.5 |
| | No | | |
| | 41 | 20 | CCCL 2.4.0 / CUDA 12.5 |
| | 81 | 90 | CCCL 2.4.0 / CUDA 12.5 |
| | 41 | 20 | CCCL 2.4.0 / CUDA 12.5 |
| | 80 | 50 | CCCL 2.4.0 / CUDA 12.5 |