CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
kernel Directory Reference

Files

file  default_gemm.h [code]
 Default kernel-level GEMM definitions combine threadblock-scoped matrix multiply-add with the appropriate threadblock-scoped epilogue.
 
file  default_gemm_splitk_parallel.h [code]
 Default kernel-level GEMM definitions combine threadblock-scoped matrix multiply-add with the appropriate threadblock-scoped epilogue.
 
file  default_gemv.h [code]
 
file  gemm.h [code]
 Template for a pipelined GEMM kernel. Does not compute batching or support split-K.
 
file  gemm_batched.h [code]
 Template for a pipelined GEMM kernel. Does not compute batching or support split-K.
 
file  gemm_pipelined.h [code]
 Template for a pipelined GEMM kernel. Does not compute batching or support split-K.
 
file  gemm_splitk_parallel.h [code]
 Template for GEMM performing a reduction over K partitions in parallel.
 
file  gemv_batched_strided.h [code]
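
The description of gemm_splitk_parallel.h mentions a reduction over K partitions. As a rough illustration of that idea only, the following is a minimal CPU sketch in plain C++, not CUTLASS code. The function name `splitk_gemm`, its parameters, and the row-major layout are assumptions made for this example; the actual kernel templates in this directory have different interfaces and run the partitions concurrently on the GPU.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CUTLASS): computes row-major C = A * B
// by splitting the K dimension into `slices` partitions.
std::vector<float> splitk_gemm(const std::vector<float>& A,  // M x K
                               const std::vector<float>& B,  // K x N
                               std::size_t M, std::size_t N, std::size_t K,
                               std::size_t slices) {
  if (slices == 0) slices = 1;
  const std::size_t chunk = (K + slices - 1) / slices;  // ceil(K / slices)

  // Phase 1: each partition accumulates its partial product into its own
  // workspace. On a GPU these partitions could run in parallel; this sketch
  // simply runs them one after another.
  std::vector<std::vector<float>> partial(
      slices, std::vector<float>(M * N, 0.0f));
  for (std::size_t s = 0; s < slices; ++s) {
    const std::size_t k_begin = std::min(s * chunk, K);
    const std::size_t k_end = std::min(k_begin + chunk, K);
    for (std::size_t i = 0; i < M; ++i)
      for (std::size_t j = 0; j < N; ++j)
        for (std::size_t k = k_begin; k < k_end; ++k)
          partial[s][i * N + j] += A[i * K + k] * B[k * N + j];
  }

  // Phase 2: reduce the per-partition workspaces into the final output.
  std::vector<float> C(M * N, 0.0f);
  for (std::size_t s = 0; s < slices; ++s)
    for (std::size_t idx = 0; idx < M * N; ++idx)
      C[idx] += partial[s][idx];
  return C;
}
```

Splitting along K tends to help when M and N are small relative to K, since the output tiles alone may not expose enough parallel work; the cost is the extra workspace and the second reduction pass.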