CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
cutlass::gemm::threadblock::DefaultMmaCore< Shape, WarpShape, InstructionShape, ElementA, LayoutA, ElementB, LayoutB, ElementC, LayoutC, OperatorClass, Stages, Operator, AccumulatorsInRowMajor > Struct Template Reference

#include <default_mma_core.h>

Detailed Description

template<
  typename Shape,
  typename WarpShape,
  typename InstructionShape,
  typename ElementA,
  typename LayoutA,
  typename ElementB,
  typename LayoutB,
  typename ElementC,
  typename LayoutC,
  typename OperatorClass,
  int Stages = 2,
  typename Operator = typename platform::conditional<
    (platform::is_same<OperatorClass, cutlass::arch::OpClassTensorOp>::value) &&
    (platform::is_same<ElementA, int8_t>::value ||
     platform::is_same<ElementA, int4b_t>::value ||
     platform::is_same<ElementA, uint8_t>::value ||
     platform::is_same<ElementA, uint4b_t>::value),
    cutlass::arch::OpMultiplyAddSaturate,
    cutlass::arch::OpMultiplyAdd>::type,
  bool AccumulatorsInRowMajor = false>
struct cutlass::gemm::threadblock::DefaultMmaCore< Shape, WarpShape, InstructionShape, ElementA, LayoutA, ElementB, LayoutB, ElementC, LayoutC, OperatorClass, Stages, Operator, AccumulatorsInRowMajor >

Template defining default matrix multiply operators inferred from threadblock tile size, global memory data layout, and target math instruction.
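The default for the Operator parameter in the declaration above is chosen at compile time: a saturating multiply-add when the operator class is Tensor Op and ElementA is an 8-bit or 4-bit integer type, and a plain multiply-add otherwise. The following is a minimal standalone sketch of that selection rule, not CUTLASS code. It uses std::conditional and std::is_same in place of the cutlass::platform equivalents, and the tag types, the 4-bit element types, and the DefaultOperator helper are all hypothetical stand-ins introduced for illustration only.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for the CUTLASS tag and element types.
// These are NOT the real definitions; they exist only so the sketch compiles alone.
struct OpClassSimt {};            // stand-in for cutlass::arch::OpClassSimt
struct OpClassTensorOp {};        // stand-in for cutlass::arch::OpClassTensorOp
struct OpMultiplyAdd {};          // stand-in for cutlass::arch::OpMultiplyAdd
struct OpMultiplyAddSaturate {};  // stand-in for cutlass::arch::OpMultiplyAddSaturate
struct int4b_t {};                // stand-in for cutlass::int4b_t
struct uint4b_t {};               // stand-in for cutlass::uint4b_t

// Hypothetical helper that mirrors the default-argument expression for Operator.
template <typename OperatorClass, typename ElementA>
struct DefaultOperator {
  // True only for the Tensor Op operator class.
  static constexpr bool kIsTensorOp =
      std::is_same<OperatorClass, OpClassTensorOp>::value;

  // True for the four narrow integer element types named in the declaration.
  static constexpr bool kIsNarrowInt =
      std::is_same<ElementA, std::int8_t>::value ||
      std::is_same<ElementA, int4b_t>::value ||
      std::is_same<ElementA, std::uint8_t>::value ||
      std::is_same<ElementA, uint4b_t>::value;

  // Saturating multiply-add only when both conditions hold.
  using type = typename std::conditional<kIsTensorOp && kIsNarrowInt,
                                         OpMultiplyAddSaturate,
                                         OpMultiplyAdd>::type;
};

// Tensor Op with int8_t operands selects the saturating operator.
static_assert(
    std::is_same<DefaultOperator<OpClassTensorOp, std::int8_t>::type,
                 OpMultiplyAddSaturate>::value,
    "Tensor Op + int8_t should default to saturating multiply-add");

// Tensor Op with a non-integer element type selects plain multiply-add.
static_assert(
    std::is_same<DefaultOperator<OpClassTensorOp, float>::type,
                 OpMultiplyAdd>::value,
    "Tensor Op + float should default to plain multiply-add");

// A non-Tensor-Op class selects plain multiply-add even for int8_t.
static_assert(
    std::is_same<DefaultOperator<OpClassSimt, std::int8_t>::type,
                 OpMultiplyAdd>::value,
    "SIMT + int8_t should default to plain multiply-add");
```

Because the rule only supplies a default, a caller can still pass an explicit Operator argument to override it.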


The documentation for this struct was generated from the following file: