CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
cutlass Directory Reference
[Figure: directory dependency graph for cutlass]

Directories

directory  arch
 
directory  epilogue
 
directory  gemm
 
directory  layout
 
directory  platform
 
directory  reduction
 
directory  thread
 
directory  transform
 
directory  util
 

Files

file  aligned_buffer.h [code]
 AlignedBuffer is a container for trivially copyable elements suitable for use in unions and shared memory.
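The idea of a union-safe, aligned container can be illustrated with a minimal sketch. The names `AlignedBufferSketch` and `SharedStorageSketch` and the default alignment of 16 bytes are assumptions made for illustration; this is not the CUTLASS class itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in, not the CUTLASS class: raw, aligned storage for
// N trivially copyable elements of type T.
template <typename T, std::size_t N, std::size_t Align = 16>
struct AlignedBufferSketch {
  static_assert(std::is_trivially_copyable<T>::value,
                "elements must be trivially copyable");
  static_assert(Align % alignof(T) == 0,
                "buffer alignment must satisfy the element alignment");

  // Raw bytes rather than T[N]: no element constructors run, so the type
  // stays trivial and may be placed in a union.
  alignas(Align) unsigned char storage[N * sizeof(T)];

  T* data() { return reinterpret_cast<T*>(storage); }
  const T* data() const { return reinterpret_cast<const T*>(storage); }
  static constexpr std::size_t size() { return N; }
};

// Two buffers overlapping the same bytes, as one might do for memory that
// is reused across phases of a computation.
union SharedStorageSketch {
  AlignedBufferSketch<float, 8> accumulators;
  AlignedBufferSketch<std::int32_t, 8> indices;
};
```

Because the storage is an array of bytes, the union has a trivial default constructor, which is the property that makes this kind of container usable in shared memory declarations.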
 
file  array.h [code]
 Statically sized array of elements that accommodates all CUTLASS-supported numeric types and is safe to use in a union.
 
file  array_subbyte.h [code]
 Statically sized array of elements smaller than one byte, packed into wider storage words and safe to use in a union.
 
file  complex.h [code]
 
file  coord.h [code]
 A Coord is a coordinate of arbitrary rank into a tensor or matrix.
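A rank-generic coordinate can be sketched as a fixed-size index array with element-wise arithmetic. `CoordSketch` and its `dot` member are hypothetical names for illustration and do not reproduce the CUTLASS interface.

```cpp
#include <cassert>

// Hypothetical stand-in for a rank-N coordinate; not the CUTLASS Coord class.
template <int Rank, typename Index = int>
struct CoordSketch {
  static_assert(Rank > 0, "rank must be positive");

  Index idx[Rank];  // one index per dimension

  Index& operator[](int dim) {
    assert(dim >= 0 && dim < Rank);
    return idx[dim];
  }
  const Index& operator[](int dim) const {
    assert(dim >= 0 && dim < Rank);
    return idx[dim];
  }

  // Element-wise sum of two coordinates of the same rank.
  CoordSketch operator+(const CoordSketch& other) const {
    CoordSketch result = *this;
    for (int i = 0; i < Rank; ++i) result.idx[i] += other.idx[i];
    return result;
  }

  // Inner product; with a stride vector this gives a linear offset.
  long long dot(const CoordSketch& other) const {
    long long sum = 0;
    for (int i = 0; i < Rank; ++i) {
      sum += static_cast<long long>(idx[i]) * other.idx[i];
    }
    return sum;
  }
};
```

Taking the inner product of a coordinate with a stride vector is one common way to turn a logical position into a memory offset.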
 
file  core_io.h [code]
 Helpers for printing cutlass/core objects.
 
file  cutlass.h [code]
 Basic include for CUTLASS.
 
file  device_kernel.h [code]
 Template for generic CUTLASS kernel.
 
file  fast_math.h [code]
 Math utilities.
 
file  functional.h [code]
 Define basic numeric operators with specializations for Array<T, N>. SIMD-ize where possible.
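The pattern of a scalar functor plus an array specialization can be sketched with `std::array` standing in for `Array<T, N>`. The name `plus_sketch` is hypothetical, and the element loop marks where a real implementation might use packed instructions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical functor; it mirrors the idea, not the CUTLASS definitions.
template <typename T>
struct plus_sketch {
  T operator()(const T& a, const T& b) const { return a + b; }
};

// Partial specialization for fixed-size arrays: apply the scalar functor
// to each pair of elements. std::array is an assumption used here in place
// of the library's own array type.
template <typename T, std::size_t N>
struct plus_sketch<std::array<T, N>> {
  std::array<T, N> operator()(const std::array<T, N>& a,
                              const std::array<T, N>& b) const {
    std::array<T, N> result{};
    plus_sketch<T> scalar_op;
    for (std::size_t i = 0; i < N; ++i) {
      result[i] = scalar_op(a[i], b[i]);
    }
    return result;
  }
};
```

Calling code can then use the same functor type for scalars and arrays and let template specialization select the element-wise version.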
 
file  half.h [code]
 Defines a class for using IEEE half-precision floating-point types in host or device code.
 
file  integer_subbyte.h [code]
 Defines a class for using integer types smaller than one byte in host or device code.
 
file  kernel_launch.h [code]
 Defines structures and helpers to launch CUDA kernels within CUTLASS.
 
file  matrix_coord.h [code]
 Defines a canonical coordinate for rank=2 matrices offering named indices.
 
file  matrix_shape.h [code]
 Defines a Shape template for matrix tiles.
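A compile-time tile shape can be sketched as a template whose dimensions are exposed as static constants. `MatrixShapeSketch` and `tile_count` are hypothetical names chosen for this illustration.

```cpp
#include <cassert>

// Hypothetical compile-time shape of a matrix tile.
template <int Row, int Column>
struct MatrixShapeSketch {
  static constexpr int kRow = Row;             // rows in the tile
  static constexpr int kColumn = Column;       // columns in the tile
  static constexpr int kCount = Row * Column;  // elements in the tile
};

// Number of tiles of the given shape needed to cover a rows x cols matrix,
// rounding up in each dimension.
template <typename Shape>
constexpr int tile_count(int rows, int cols) {
  return ((rows + Shape::kRow - 1) / Shape::kRow) *
         ((cols + Shape::kColumn - 1) / Shape::kColumn);
}
```

Encoding the shape in the type lets tile sizes take part in constant expressions, so loops over a tile can be fully unrolled by the compiler.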
 
file  matrix_traits.h [code]
 Defines properties of matrices used to denote layout and operands to GEMM kernels.
 
file  numeric_conversion.h [code]
 Boost-like numeric conversion operator for CUTLASS numeric types.
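A converter functor parameterized on destination and source types might look like the sketch below. `ConverterSketch` is a hypothetical name, and the round-to-nearest, saturating policy of the `float` to `int8_t` specialization is an assumption for illustration, not a statement of the library's behavior.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical generic converter: falls back to static_cast.
template <typename To, typename From>
struct ConverterSketch {
  To operator()(const From& source) const { return static_cast<To>(source); }
};

// Specialization with an explicit policy (an assumption of this sketch):
// round to nearest, then saturate to the int8_t range. NaN maps to zero.
template <>
struct ConverterSketch<std::int8_t, float> {
  std::int8_t operator()(float source) const {
    if (std::isnan(source)) return 0;
    // nearbyint uses the current rounding mode, round-to-nearest-even by default.
    float rounded = std::nearbyint(source);
    if (rounded > 127.0f) rounded = 127.0f;
    if (rounded < -128.0f) rounded = -128.0f;
    return static_cast<std::int8_t>(rounded);
  }
};
```

Expressing conversions as functor types lets kernels take the conversion policy as a template argument instead of hard-coding casts.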
 
file  numeric_types.h [code]
 Top-level include for all CUTLASS numeric types.
 
file  predicate_vector.h [code]
 Defines container classes and iterators for managing a statically sized vector of boolean predicates.
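A statically sized predicate vector can be sketched as a bit set packed into 32-bit words. `PredicateVectorSketch` and its member names are hypothetical and only illustrate the storage scheme.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit-packed vector of N boolean predicates.
template <int N>
class PredicateVectorSketch {
  static_assert(N > 0, "need at least one predicate");
  static constexpr int kWords = (N + 31) / 32;  // 32 predicates per word
  std::uint32_t words_[kWords];

 public:
  explicit PredicateVectorSketch(bool value = true) { fill(value); }

  // Set every predicate to the same value.
  void fill(bool value) {
    for (int w = 0; w < kWords; ++w) words_[w] = value ? 0xFFFFFFFFu : 0u;
  }

  void set(int i, bool value) {
    assert(i >= 0 && i < N);
    std::uint32_t mask = 1u << (i % 32);
    if (value) {
      words_[i / 32] |= mask;
    } else {
      words_[i / 32] &= ~mask;
    }
  }

  bool at(int i) const {
    assert(i >= 0 && i < N);
    return ((words_[i / 32] >> (i % 32)) & 1u) != 0;
  }

  // Number of predicates that are true (only the first N bits count).
  int count() const {
    int total = 0;
    for (int i = 0; i < N; ++i) total += at(i) ? 1 : 0;
    return total;
  }
};
```

Packing predicates into words keeps the guard state for many memory accesses in a handful of registers.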
 
file  real.h [code]
 
file  relatively_equal.h [code]
 
file  semaphore.h [code]
 Implementation of a CTA-wide semaphore for inter-CTA synchronization.
 
file  subbyte_reference.h [code]
 Provides a mechanism for packing and unpacking elements smaller than one byte.
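Packing and unpacking can be illustrated for the 4-bit case with a small proxy object that reads and writes one nibble inside a byte array. `NibbleRefSketch` is a hypothetical name, and the low-nibble-first ordering is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical proxy reference to the index-th unsigned 4-bit element
// stored in a byte array, two elements per byte, low nibble first.
class NibbleRefSketch {
  std::uint8_t* byte_;  // byte that contains the element
  int shift_;           // bit offset of the element within that byte

 public:
  NibbleRefSketch(std::uint8_t* base, int index)
      : byte_(base + index / 2), shift_((index % 2) * 4) {}

  // Unpack: shift the element down and mask off its neighbor.
  std::uint8_t get() const {
    return static_cast<std::uint8_t>((*byte_ >> shift_) & 0xF);
  }

  // Pack: clear the element's four bits, then merge in the new value.
  void set(std::uint8_t value) {
    int cleared = *byte_ & ~(0xF << shift_);
    *byte_ = static_cast<std::uint8_t>(cleared | ((value & 0xF) << shift_));
  }
};
```

A proxy is needed because C++ cannot form an ordinary reference or pointer to a quantity narrower than one byte.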
 
file  tensor_coord.h [code]
 Defines a canonical coordinate for rank=4 tensors offering named indices.
 
file  tensor_ref.h [code]
 Defines a structure containing strides and a pointer to tensor data.
 
file  tensor_view.h [code]
 Defines a structure containing strides, bounds, and a pointer to tensor data.
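The difference between a strided pointer and a bounded view can be sketched for the rank-2, row-major case. `StridedRefSketch` and `BoundedViewSketch` are hypothetical names; the real structures are generic over layout and are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical strided reference: a pointer plus a leading dimension.
// It can address elements but does not know where the data ends.
template <typename T>
class StridedRefSketch {
 public:
  StridedRefSketch(T* ptr, std::ptrdiff_t leading_dim)
      : ptr_(ptr), ld_(leading_dim) {}

  T& at(int row, int col) const { return ptr_[row * ld_ + col]; }

 private:
  T* ptr_;
  std::ptrdiff_t ld_;
};

// Hypothetical bounded view: the same addressing plus an extent, so
// accesses can be checked against the logical bounds.
template <typename T>
class BoundedViewSketch {
 public:
  BoundedViewSketch(T* ptr, std::ptrdiff_t leading_dim, int rows, int cols)
      : ref_(ptr, leading_dim), rows_(rows), cols_(cols) {}

  bool contains(int row, int col) const {
    return row >= 0 && row < rows_ && col >= 0 && col < cols_;
  }

  T& at(int row, int col) const {
    assert(contains(row, col));
    return ref_.at(row, col);
  }

 private:
  StridedRefSketch<T> ref_;
  int rows_;
  int cols_;
};
```

Keeping the extent separate from the stride allows a view to describe a sub-matrix: the leading dimension stays that of the parent allocation while the bounds shrink.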
 
file  wmma_array.h [code]
 Statically sized array of WMMA fragments that is safe to use in a union.