CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
Functions

template<typename WarpShape>
constexpr int simt_get_warp_threads_m()

constexpr int simt_transpose_padding(int threads, int crosswise, int size_in_bits)
    Computes padding in shared memory to perform efficient transpose without bank conflicts.
template<typename WarpShape>
constexpr int cutlass::gemm::threadblock::detail::simt_get_warp_threads_m()

constexpr int cutlass::gemm::threadblock::detail::simt_transpose_padding(int threads, int crosswise, int size_in_bits)
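
The description of simt_transpose_padding mentions padding shared memory to avoid bank conflicts during a transpose. The sketch below illustrates that general idea only; it is not the CUTLASS formula, and it does not model the threads, crosswise, or size_in_bits parameters. The helper max_threads_per_bank and the constant kBanks are hypothetical names introduced for this example. It assumes 32 banks that are each one 32-bit word wide, a row-major tile of 32-bit words, and a warp in which thread t reads the first element of row t (a column access, as happens in a transpose). It requires C++14 or later for the loops inside a constexpr function.

```cpp
#include <cassert>

// Assumption for this sketch: 32 shared-memory banks, each one 32-bit word wide.
constexpr int kBanks = 32;

// Hypothetical helper, not part of CUTLASS.
// Thread t reads word offset t * row_pitch_words (element (t, 0) of a
// row-major tile). Returns the largest number of threads that land in the
// same bank; 1 means the access is conflict-free.
constexpr int max_threads_per_bank(int warp_threads, int row_pitch_words) {
  int worst = 0;
  for (int bank = 0; bank < kBanks; ++bank) {
    int hits = 0;
    for (int t = 0; t < warp_threads; ++t) {
      if ((t * row_pitch_words) % kBanks == bank) {
        ++hits;
      }
    }
    if (hits > worst) {
      worst = hits;
    }
  }
  return worst;
}

// Unpadded 32-word rows: every thread in the warp maps to bank 0.
static_assert(max_threads_per_bank(32, 32) == 32, "unpadded column access");

// One extra word per row shifts each row by one bank, so no two threads collide.
static_assert(max_threads_per_bank(32, 33) == 1, "padded column access");
```

In this model a single word of padding per row is enough to remove the conflict. A function such as simt_transpose_padding presumably generalizes the amount of padding to the thread count, the crosswise extent, and the element width, but the exact rule should be taken from the CUTLASS source rather than from this sketch.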