typename InstructionShape_,
typename Operator_ = arch::OpMultiplyAdd,
bool AccumulatorsInRowMajor = false,
typename InstructionShape_,
bool AccumulatorsInRowMajor,
cutlass::layout::RowMajor, Operator_>,
WarpShape_, ElementA, LayoutA, ElementB, LayoutB, ElementC, LayoutC,
Policy, PartitionsK, AccumulatorsInRowMajor, PartitionsN>;
Describes the size of a matrix tile.
Definition: matrix_shape.h:42
Definition: aligned_buffer.h:35
Partial specialization for m-by-n-by-kgroup.
Definition: default_mma_tensor_op.h:67
Structure to compute the matrix product targeting Tensor Cores.
Definition: mma_tensor_op.h:82
Mapping function for column-major matrices.
Definition: layout/matrix.h:142
Policy.
Definition: mma_tensor_op_policy.h:48
Mapping function for row-major matrices.
Definition: layout/matrix.h:50
cutlass::gemm::warp::MmaTensorOpPolicy< cutlass::arch::Mma< InstructionShape_, 32, ElementA, cutlass::layout::RowMajor, ElementB, cutlass::layout::ColumnMajor, ElementC, cutlass::layout::RowMajor, Operator_ >, cutlass::MatrixShape< 1, 1 > > Policy
Definition: default_mma_tensor_op.h:104
Matrix multiply-add operation.
Definition: arch/mma.h:92
Templates implementing warp-level matrix multiply-accumulate operations targeting Tensor Cores...
Basic include for CUTLASS.
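The Policy entry above shows the pattern this header follows: a default template fixes the architecture-level operation to 32 threads with row-major A, column-major B and row-major C, and pairs it with a MatrixShape< 1, 1 >. The following is a minimal, self-contained sketch of that pattern only. Every type in the `sketch` namespace is a simplified stand-in written for illustration, not the real CUTLASS definition, and the tile sizes and `float` element types are arbitrary assumptions.

```cpp
#include <type_traits>

// Stand-in types only: simplified mock-ups, NOT the real CUTLASS definitions.
namespace sketch {

template <int Row, int Column>
struct MatrixShape {
  static constexpr int kRow = Row;
  static constexpr int kColumn = Column;
};

template <int M, int N, int K>
struct GemmShape {
  static constexpr int kM = M;
  static constexpr int kN = N;
  static constexpr int kK = K;
};

struct RowMajor {};
struct ColumnMajor {};
struct OpMultiplyAdd {};

// Mock of an architecture-level matrix multiply-add descriptor.
template <typename Shape, int Threads, typename ElementA, typename LayoutA,
          typename ElementB, typename LayoutB, typename ElementC,
          typename LayoutC, typename Operator>
struct Mma {
  using InstructionShape = Shape;
  static constexpr int kThreads = Threads;
};

// Mock policy: pairs the instruction descriptor with a second shape parameter.
template <typename Operator_, typename OpDelta_>
struct MmaTensorOpPolicy {
  using Operator = Operator_;
  using OpDelta = OpDelta_;
};

// Mock default: callers pass shapes and element types; the layouts,
// thread count and policy are filled in here (hypothetical, reduced
// parameter list compared with the listing above).
template <typename WarpShape_, typename InstructionShape_, typename ElementA,
          typename ElementB, typename ElementC,
          typename Operator_ = OpMultiplyAdd,
          bool AccumulatorsInRowMajor = false>
struct DefaultMmaTensorOp {
  using Policy = MmaTensorOpPolicy<
      Mma<InstructionShape_, 32, ElementA, RowMajor, ElementB, ColumnMajor,
          ElementC, RowMajor, Operator_>,
      MatrixShape<1, 1>>;
  static constexpr bool kAccumulatorsInRowMajor = AccumulatorsInRowMajor;
};

}  // namespace sketch

// Example instantiation with assumed tile sizes and element types.
using Default =
    sketch::DefaultMmaTensorOp<sketch::GemmShape<64, 64, 32>,
                               sketch::GemmShape<16, 8, 8>, float, float,
                               float>;

static_assert(std::is_same<Default::Policy::OpDelta,
                           sketch::MatrixShape<1, 1>>::value,
              "default policy uses a 1x1 shape");
```

Because the selection happens entirely in the type system, the `static_assert` checks the chosen policy at compile time and no runtime code is generated.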