CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
Launches a kernel calling a functor for each element in a tensor's index space.
#include <tensor_foreach.h>
Public Member Functions
inline TensorForEach (Coord< Rank > size, Params params=Params(), int grid_size=0, int block_size=0)
Constructor performs the operation.