namespace threadblock {
ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_,
Shape::kM / WarpShape::kM,
Shape::kN / WarpShape::kN,
Shape::kK / WarpShape::kK
!(Shape::kM % WarpShape::kM) &&
!(Shape::kN % WarpShape::kN),
"Threadblock-scoped GEMM should be divisible by warp-scoped GEMM size."
static int const kThreads = WarpCount::kCount * kWarpSize;
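The fragments above divide the threadblock tile by the warp tile to get a warp count, check divisibility, and derive the thread count. The following is a minimal, self-contained sketch of that arithmetic, not the CUTLASS implementation. The `GemmShape` struct here is a hypothetical stand-in for `cutlass::gemm::GemmShape`, the tile sizes are chosen only for illustration, and `kWarpSize = 32` is an assumption standing in for the value CUTLASS queries from its warp-level headers.

```cpp
// Hypothetical stand-in for cutlass::gemm::GemmShape; it only carries tile extents.
template <int M, int N, int K>
struct GemmShape {
  static int const kM = M;
  static int const kN = N;
  static int const kK = K;
  static int const kCount = M * N * K;  // product of the three extents
};

// Illustrative tile sizes (assumed, not taken from the source).
using Shape = GemmShape<128, 128, 8>;    // threadblock-scoped tile
using WarpShape = GemmShape<32, 64, 8>;  // warp-scoped tile

// Number of warps along each GEMM dimension.
using WarpCount = GemmShape<Shape::kM / WarpShape::kM,
                            Shape::kN / WarpShape::kN,
                            Shape::kK / WarpShape::kK>;

// Same divisibility condition as in the listing: M and N must divide evenly.
static_assert(!(Shape::kM % WarpShape::kM) && !(Shape::kN % WarpShape::kN),
              "Threadblock-scoped GEMM should be divisible by warp-scoped GEMM size.");

// Assumed warp width; CUTLASS obtains this from its own definitions.
static int const kWarpSize = 32;

// Threads per threadblock: one warp per warp tile.
static int const kThreads = WarpCount::kCount * kWarpSize;
```

With these assumed sizes the threadblock tile splits into 4 x 2 x 1 warps, giving 8 warps and 256 threads. Changing `WarpShape` to a size that does not divide `Shape` in M or N makes the `static_assert` fail at compile time.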
Describes the size of a matrix tile.
Definition: matrix_shape.h:42
Definition: aligned_buffer.h:35
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::WarpShape WarpShape_ WarpShape
Definition: default_mma_core_sm50.h:84
Query the number of threads per warp.
Definition: gemm/warp/mma.h:43
Definition: default_mma_core.h:90
Templates implementing how threads are mapped to a given tile.
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::MmaPolicy MmaPolicy< WarpMma, MatrixShape< 0, 0 >, MatrixShape< 0, 0 >, WarpCount::kK > MmaPolicy
Policy used to define MmaPipelined.
Definition: default_mma_core_sm50.h:190
Structure to compute the matrix product targeting CUDA cores and SIMT math instructions.
Definition: mma_simt.h:74
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::ElementB ElementB_ ElementB
Definition: default_mma_core_sm50.h:88
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::OperatorClass arch::OpClassSimt OperatorClass
Definition: default_mma_core_sm50.h:92
Mapping function for column-major matrices.
Definition: layout/matrix.h:142
Template defining a shape used by pitch-linear operators.
Definition: pitch_linear.h:43
Statically sized array of elements that accommodates all CUTLASS-supported numeric types and is safe ...
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::ElementC ElementC_ ElementC
Definition: default_mma_core_sm50.h:90
Describes the arrangement and configuration of per-lane operations in warp-level matrix multiply...
Definition: mma_simt_policy.h:46
Defines a Shape template for matrix tiles.
Defines the size of an element in bits.
Definition: numeric_types.h:42
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::InstructionShape InstructionShape_ InstructionShape
Definition: default_mma_core_sm50.h:85
Defines basic properties needed by CTA-level GEMMs assuming expectations about data layout of the glo...
Top-level include for all CUTLASS numeric types.
Shape of a matrix multiply-add operation.
Definition: include/cutlass/gemm/gemm.h:57
Mapping function for row-major matrices.
Definition: layout/matrix.h:50
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::ElementA ElementA_ ElementA
Definition: default_mma_core_sm50.h:86
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::LayoutC LayoutC_ LayoutC
Definition: default_mma_core_sm50.h:91
Templates implementing storing of tiles from pitch-linear rank=2 tensors.
Defines layout functions used by TensorRef and derived classes.
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::Shape Shape_ Shape
Definition: default_mma_core_sm50.h:83
cutlass::gemm::threadblock::DefaultMmaCore< Shape_, WarpShape_, GemmShape< 1, 1, 1 >, ElementA_, layout::ColumnMajor, ElementB_, layout::RowMajor, ElementC_, LayoutC_, arch::OpClassSimt, 2, Operator_, >::WarpMma cutlass::gemm::warp::MmaSimt< WarpShape, ElementA, SmemLayoutA, ElementB, SmemLayoutB, ElementC, LayoutC, warp::MmaSimtPolicy< MatrixShape< 4, 8 >, layout::RowMajorInterleaved< 2 >, GemmShape< 128/sizeof_bits< ElementA >::value, 128/sizeof_bits< ElementB >::value, 1 > > > > WarpMma
Definition: default_mma_core_sm50.h:182
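The `WarpMma` entry above arranges lanes as `MatrixShape< 4, 8 >` and sizes the per-lane shape as `128/sizeof_bits< Element >::value`. A small sketch of how those expressions evaluate for a few element types follows. All three templates below are hypothetical stand-ins written for this example; in particular, the real `cutlass::sizeof_bits` also handles sub-byte types, which this simplified version does not.

```cpp
#include <cstdint>

// Hypothetical stand-in for cutlass::sizeof_bits: element width in bits.
template <typename T>
struct sizeof_bits {
  static int const value = int(sizeof(T) * 8);
};

// Hypothetical stand-in for cutlass::MatrixShape.
template <int Row, int Column>
struct MatrixShape {
  static int const kRow = Row;
  static int const kColumn = Column;
  static int const kCount = Row * Column;
};

// Hypothetical stand-in for cutlass::gemm::GemmShape.
template <int M, int N, int K>
struct GemmShape {
  static int const kM = M;
  static int const kN = N;
  static int const kK = K;
};

// Lane arrangement from the WarpMma entry: 4 rows by 8 columns of lanes.
using WarpArrangement = MatrixShape<4, 8>;

// Per-lane shape from the WarpMma entry: 128 bits of A along M, 128 bits of B along N.
template <typename ElementA, typename ElementB>
using LaneMmaShape = GemmShape<128 / sizeof_bits<ElementA>::value,
                               128 / sizeof_bits<ElementB>::value,
                               1>;

using LaneFloat = LaneMmaShape<float, float>;
using LaneDouble = LaneMmaShape<double, double>;
using LaneMixed = LaneMmaShape<std::int8_t, float>;

// The 4 x 8 arrangement covers exactly one 32-lane warp.
static_assert(WarpArrangement::kCount == 32, "arrangement should cover one warp");
```

Under these assumptions, `float` operands give a 4 x 4 x 1 lane shape, `double` operands give 2 x 2 x 1, and an 8-bit A operand paired with a `float` B operand gives 16 x 4 x 1.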
Templates implementing warp-level matrix multiply-accumulate operations.
Basic include for CUTLASS.
Definition: layout/matrix.h:237