CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
Kernel performing a reduction over densely packed tensors in global memory.
#include "cutlass/cutlass.h"
#include "cutlass/tensor_ref.h"
#include "cutlass/numeric_types.h"
#include "cutlass/array.h"
#include "cutlass/functional.h"
#include "cutlass/numeric_conversion.h"
Classes
struct cutlass::reduction::thread::ReduceAdd< ElementAccumulator_, Element_, Count >
    Mixed-precision reduction.
struct cutlass::reduction::thread::ReduceAdd< ElementAccumulator_, Element_, Count >::Params
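To illustrate what "mixed-precision reduction" means for a functor with these template parameters, here is a minimal host-side sketch in plain C++. It is not the CUTLASS source: `ReduceAddSketch`, `reduce_add_sketch_demo`, the use of `std::array` as the fragment type, and the `operator()` signature are all assumptions made for illustration. The idea shown is that each incoming `Element` value is converted to the wider `ElementAccumulator` type before being added, so the running sum stays in accumulator precision.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the documented struct; NOT the CUTLASS implementation.
// ElementAccumulator: wider type that holds the running sum.
// Element:            narrower type of the values being reduced.
// Count:              number of elements processed per call.
template <typename ElementAccumulator, typename Element, std::size_t Count = 1>
struct ReduceAddSketch {
  using FragmentAccumulator = std::array<ElementAccumulator, Count>;
  using FragmentElement = std::array<Element, Count>;

  // Empty parameter struct, mirroring the nested Params listed above.
  struct Params {};

  Params params;

  ReduceAddSketch(Params params_ = Params()) : params(params_) {}

  // Converts each element to the accumulator type, then adds it element-wise.
  FragmentAccumulator operator()(FragmentAccumulator accumulator,
                                 FragmentElement const &element) const {
    for (std::size_t i = 0; i < Count; ++i) {
      accumulator[i] += static_cast<ElementAccumulator>(element[i]);
    }
    return accumulator;
  }
};

// Usage: accumulate two float fragments into a double accumulator.
inline std::array<double, 4> reduce_add_sketch_demo() {
  ReduceAddSketch<double, float, 4> reduce;
  std::array<double, 4> acc = {0.0, 0.0, 0.0, 0.0};
  std::array<float, 4> a = {1.0f, 2.0f, 3.0f, 4.0f};
  std::array<float, 4> b = {0.5f, 0.25f, 0.125f, 8.0f};
  acc = reduce(acc, a);
  acc = reduce(acc, b);
  return acc;
}
```

In this sketch the accumulator is passed and returned by value, which keeps the functor stateless apart from `params`. The header itself includes `cutlass/array.h` and `cutlass/numeric_conversion.h`, so the real struct presumably relies on CUTLASS's own array and conversion utilities rather than `std::array` and `static_cast`.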
Namespaces
cutlass
cutlass::reduction
cutlass::reduction::thread