CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
#include <threadblock_swizzle.h>
Public Member Functions
CUTLASS_HOST_DEVICE | DefaultBlockSwizzle ()
    Constructor.
CUTLASS_DEVICE dim3 | swizzle ()
    Swizzle the block index.
CUTLASS_HOST_DEVICE dim3 | get_grid_layout (Coord< 3 > const &problem_size, Coord< 3 > const &OutputTile)
CUTLASS_DEVICE Coord< 3 > | get_threadblock_offset (Coord< 3 > const &SubTile)
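The member list above gives only signatures, so the following is a minimal host-side sketch of how an identity-style block swizzle could behave. It is not the CUTLASS implementation. `Dim3` and `Coord3` are hypothetical stand-ins for `dim3` and `Coord< 3 >`. The block index is passed in explicitly because `blockIdx` is unavailable on the host. The mapping of coordinate element i to grid dimension i and the use of ceiling division are assumptions made for illustration.

```cpp
#include <array>

// Hypothetical stand-ins for dim3 and cutlass::Coord<3>; not the CUTLASS types.
struct Dim3 {
  unsigned x, y, z;
};
using Coord3 = std::array<int, 3>;

// Assumed behavior: one threadblock per output tile, rounding up so that
// partial tiles at the problem boundary still get a block.
// Assumed mapping: coordinate element i -> grid dimension i.
inline Dim3 get_grid_layout(Coord3 const &problem_size, Coord3 const &output_tile) {
  Dim3 grid;
  grid.x = static_cast<unsigned>((problem_size[0] + output_tile[0] - 1) / output_tile[0]);
  grid.y = static_cast<unsigned>((problem_size[1] + output_tile[1] - 1) / output_tile[1]);
  grid.z = static_cast<unsigned>((problem_size[2] + output_tile[2] - 1) / output_tile[2]);
  return grid;
}

// Identity swizzle: the block index is returned unchanged. On the device the
// argument would typically come from blockIdx.
inline Dim3 swizzle(Dim3 block_idx) { return block_idx; }

// Assumed behavior: scale the (swizzled) block index by the tile extents to
// obtain the element offset of the tile this block is responsible for.
inline Coord3 get_threadblock_offset(Dim3 block_idx, Coord3 const &sub_tile) {
  Dim3 b = swizzle(block_idx);
  return Coord3{static_cast<int>(b.x) * sub_tile[0],
                static_cast<int>(b.y) * sub_tile[1],
                static_cast<int>(b.z) * sub_tile[2]};
}
```

Under these assumptions, a problem of extent {1000, 512, 1} with a {128, 128, 1} tile needs an 8 x 4 x 1 grid, and block (3, 2, 0) starts at element offset {384, 256, 0}. A non-identity swizzle would change only `swizzle()`, remapping block indices (for example to improve cache locality) while leaving the grid size computation intact.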