CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
#include <mma_simt_tile_iterator.h>
Iterates over the operands of warp-level matrix multiply operations that target SIMT instructions.
concept: MutableRandomAccessContiguousTileIteratorConcept
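
As a rough illustration of what a mutable, random-access tile iterator does, the host-side sketch below walks fixed-size tiles of a row-major matrix, loads a tile into a fragment, and writes it back. It is not the CUTLASS class: `ToyTileIterator`, `scale_tile_demo`, and the parameter names are hypothetical stand-ins chosen for this example, and the real iterator runs on the device and partitions elements across the threads of a warp.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for a mutable random-access tile iterator.
// It is NOT the CUTLASS implementation; it only mirrors the general idea.
template <typename T, int kTileRows, int kTileCols>
class ToyTileIterator {
public:
  // A fragment holds one tile's worth of elements in row-major order.
  using Fragment = std::array<T, kTileRows * kTileCols>;

  // base: pointer to element (0, 0); ld: elements per matrix row.
  ToyTileIterator(T* base, int ld) : base_(base), ld_(ld) {
    assert(base != nullptr);
    assert(ld >= kTileCols);
  }

  // Random access: move by a whole number of tiles in each dimension.
  ToyTileIterator& add_tile_offset(int tile_row, int tile_col) {
    row_ += tile_row * kTileRows;
    col_ += tile_col * kTileCols;
    return *this;
  }

  // Advance by one tile along the column dimension.
  ToyTileIterator& operator++() { return add_tile_offset(0, 1); }

  // Copy the current tile from memory into the fragment.
  void load(Fragment& frag) const {
    for (int r = 0; r < kTileRows; ++r)
      for (int c = 0; c < kTileCols; ++c)
        frag[static_cast<std::size_t>(r * kTileCols + c)] =
            base_[(row_ + r) * ld_ + (col_ + c)];
  }

  // Mutable access: copy the fragment back over the current tile.
  void store(const Fragment& frag) const {
    for (int r = 0; r < kTileRows; ++r)
      for (int c = 0; c < kTileCols; ++c)
        base_[(row_ + r) * ld_ + (col_ + c)] =
            frag[static_cast<std::size_t>(r * kTileCols + c)];
  }

private:
  T* base_;
  int ld_;
  int row_ = 0;
  int col_ = 0;
};

// Fills a 4x4 row-major matrix with 0..15, doubles the 2x2 tile at
// (tile_row, tile_col), and returns the whole matrix.
inline std::vector<float> scale_tile_demo(int tile_row, int tile_col) {
  std::vector<float> m(16);
  for (int i = 0; i < 16; ++i) m[static_cast<std::size_t>(i)] = static_cast<float>(i);

  ToyTileIterator<float, 2, 2> it(m.data(), 4);
  it.add_tile_offset(tile_row, tile_col);

  ToyTileIterator<float, 2, 2>::Fragment frag;
  it.load(frag);
  for (auto& x : frag) x *= 2.0f;
  it.store(frag);
  return m;
}
```

Calling `scale_tile_demo(1, 1)` touches only the bottom-right 2x2 tile of the matrix. The load, modify, store round trip is what distinguishes a mutable iterator from a read-only one in this sketch.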