CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
thread Directory Reference
[Figure: directory dependency graph for thread]

Files

file  reduce.h
 Defines basic thread-level reduction with specializations for Array<T, N>.
 
file  reduction_operators.h
 Kernel performing a reduction over densely packed tensors in global memory.
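
As a rough illustration of what "thread-level reduction" over a fixed-size array means, the sketch below folds the N elements held by a single thread into one value. It is plain C++ and does not use CUTLASS: the name thread_reduce is hypothetical, and std::array stands in for Array<T, N> as an assumption. The actual interfaces are in reduce.h.

```cpp
#include <array>
#include <cstddef>
#include <functional>

// Hypothetical helper (not the CUTLASS API): combines the N elements owned by
// one thread into a single value, starting from `init` and applying `op`
// left to right.
template <typename T, std::size_t N, typename Op = std::plus<T>>
T thread_reduce(const std::array<T, N>& fragment, T init, Op op = Op()) {
    T accum = init;
    for (std::size_t i = 0; i < N; ++i) {
        // Each step folds one more element into the running result.
        accum = op(accum, fragment[i]);
    }
    return accum;
}
```

For example, thread_reduce(frag, 0) sums an std::array<int, 4>, and passing std::multiplies<int>() with an initial value of 1 computes the product instead. Because N is a compile-time constant, a compiler can fully unroll the loop, which is one plausible reason for specializing on a fixed-size array type.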