typename Enable = bool
typename platform::conditional< platform::is_same< layout::RowMajorInterleaved<4>, LayoutB >::value,
Shape::kM / Policy::WarpShape::kRow,
Shape::kN / Policy::WarpShape::kColumn,
Policy::LaneMmaShape::kK>,
FragmentC const &c, int group_idx = 0) const {
Describes the lane policy used by warp-level matrix multiply operators targeting SIMT instructions...
Describes the size of a matrix tile.
Definition: matrix_shape.h:42
ElementC_ ElementC
Data type of accumulator matrix C.
Definition: mma_simt.h:92
Definition: aligned_buffer.h:35
typename ThreadMma::FragmentC FragmentC
Storage for C tile.
Definition: mma_simt.h:180
Shape_ Shape
Shape of warp-level matrix operation (concept: GemmShape)
Definition: mma_simt.h:77
Structure to compute the matrix product targeting CUDA cores and SIMT math instructions.
Definition: mma_simt.h:74
Defines common types used for all GEMM-like operators.
CUTLASS_DEVICE void operator()(FragmentC &d, FragmentA const &a, FragmentB const &b, FragmentC const &c, int group_idx=0) const
Performs a warp-level matrix multiply-accumulate operation.
Definition: mma_simt.h:194
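The multiply-accumulate that this operator performs can be illustrated on a tiny tile. The following is a minimal host-side sketch, not the CUTLASS implementation: `TinyMma`, its `std::array` fragments, and `tiny_mma_demo` are hypothetical stand-ins, and the row-major fragment layout is an assumption made for readability. It only mirrors the documented argument order (d, a, b, c) and computes d = a * b + c.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for a multiply-accumulate operator; not a CUTLASS type.
template <int M, int N, int K>
struct TinyMma {
  using FragmentA = std::array<float, M * K>;  // assumed row-major M x K
  using FragmentB = std::array<float, K * N>;  // assumed row-major K x N
  using FragmentC = std::array<float, M * N>;  // assumed row-major M x N

  // Computes d = a * b + c, following the (d, a, b, c) argument order.
  void operator()(FragmentC &d, FragmentA const &a, FragmentB const &b,
                  FragmentC const &c) const {
    d = c;  // start from the accumulator
    for (int k = 0; k < K; ++k) {
      for (int m = 0; m < M; ++m) {
        for (int n = 0; n < N; ++n) {
          d[m * N + n] += a[m * K + k] * b[k * N + n];
        }
      }
    }
  }
};

// Hypothetical helper that runs a 2x2x2 example and returns the result tile.
inline std::array<float, 4> tiny_mma_demo() {
  using Mma = TinyMma<2, 2, 2>;
  Mma::FragmentA a{1, 2, 3, 4};
  Mma::FragmentB b{5, 6, 7, 8};
  Mma::FragmentC c{1, 1, 1, 1};
  Mma::FragmentC d{};
  Mma mma;
  mma(d, a, b, c);
  return d;
}
```

In the real operator each thread of the warp holds only its own fragments, so the loop above corresponds to the work of a single lane rather than the whole warp tile.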
static constexpr bool use_dp4a
Definition: mma_simt.h:117
LayoutC_ LayoutC
Layout of accumulator matrix C.
Definition: mma_simt.h:95
Mapping function for column-major matrices.
Definition: layout/matrix.h:142
Statically sized array of elements that accommodates all CUTLASS-supported numeric types and is safe ...
Templates exposing architecture support for warp-level multiply-add operations.
Definition: mma_simt_tile_iterator.h:69
Defines a Shape template for matrix tiles.
arch::OpClassSimt OperatorClass
Indicates class of matrix operator.
Definition: mma_simt.h:101
typename platform::conditional< platform::is_same< layout::ColumnMajorInterleaved< 4 >, LayoutB >::value, layout::ColumnMajor, typename platform::conditional< platform::is_same< layout::RowMajorInterleaved< 4 >, LayoutB >::value, layout::RowMajor, LayoutB >::type >::type ThreadLayoutB
Definition: mma_simt.h:115
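The nested conditional above maps the 4-way interleaved layouts to their canonical counterparts and leaves every other layout unchanged; ThreadLayoutA follows the same pattern. A minimal sketch of that mapping, assuming empty tag types in place of the CUTLASS layout classes and `std::` traits in place of `platform::` (the alias `ThreadLayout` is a hypothetical name):

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the CUTLASS layout tags; only the names mirror the source.
struct ColumnMajor {};
struct RowMajor {};
template <int Interleave> struct ColumnMajorInterleaved {};
template <int Interleave> struct RowMajorInterleaved {};

// Same nested-conditional structure as ThreadLayoutB, written with std:: traits.
template <typename Layout>
using ThreadLayout = typename std::conditional<
    std::is_same<ColumnMajorInterleaved<4>, Layout>::value,
    ColumnMajor,
    typename std::conditional<
        std::is_same<RowMajorInterleaved<4>, Layout>::value,
        RowMajor,
        Layout>::type>::type;

// Interleaved-by-4 layouts collapse to the canonical layouts.
static_assert(std::is_same<ThreadLayout<ColumnMajorInterleaved<4>>, ColumnMajor>::value,
              "ColumnMajorInterleaved<4> maps to ColumnMajor");
static_assert(std::is_same<ThreadLayout<RowMajorInterleaved<4>>, RowMajor>::value,
              "RowMajorInterleaved<4> maps to RowMajor");
// Any other layout, including other interleave factors, passes through unchanged.
static_assert(std::is_same<ThreadLayout<RowMajor>, RowMajor>::value,
              "RowMajor is unchanged");
static_assert(std::is_same<ThreadLayout<ColumnMajorInterleaved<8>>,
                           ColumnMajorInterleaved<8>>::value,
              "only the interleave factor 4 is special-cased");
```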
LayoutA_ LayoutA
Layout of multiplicand A.
Definition: mma_simt.h:83
Templates exposing architecture support for warp-level multiply-add operations.

Top-level include for all CUTLASS numeric types.
Shape of a matrix multiply-add operation.
Definition: include/cutlass/gemm/gemm.h:57
Policy_ Policy
Shape of the warp in units of threads (concept: MmaLanePolicySimt)
Definition: mma_simt.h:98
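The expressions Shape::kM / Policy::WarpShape::kRow, Shape::kN / Policy::WarpShape::kColumn, and Policy::LaneMmaShape::kK in the listing show how the warp-level tile is divided among threads. A minimal sketch of that arithmetic, assuming simplified stand-ins for the shape and policy templates (`LanePolicy` and `ThreadShape` are hypothetical names, and the example sizes are chosen only for illustration):

```cpp
#include <cassert>

// Hypothetical stand-ins; the real CUTLASS templates carry more members.
template <int M, int N, int K>
struct GemmShape {
  static constexpr int kM = M;
  static constexpr int kN = N;
  static constexpr int kK = K;
};

template <int Row, int Column>
struct MatrixShape {
  static constexpr int kRow = Row;
  static constexpr int kColumn = Column;
};

// Simplified policy: arrangement of threads in the warp plus the per-lane MMA shape.
template <typename WarpShape_, typename LaneMmaShape_>
struct LanePolicy {
  using WarpShape = WarpShape_;
  using LaneMmaShape = LaneMmaShape_;
};

// Per-thread tile: warp tile extents divided by the thread arrangement,
// with the K extent taken from the lane-level MMA shape.
template <typename Shape, typename Policy>
using ThreadShape = GemmShape<Shape::kM / Policy::WarpShape::kRow,
                              Shape::kN / Policy::WarpShape::kColumn,
                              Policy::LaneMmaShape::kK>;

// Illustrative sizes: a 32x64x8 warp tile over a 4x8 arrangement of 32 threads.
using ExampleWarpTile = GemmShape<32, 64, 8>;
using ExamplePolicy = LanePolicy<MatrixShape<4, 8>, GemmShape<4, 4, 1>>;
using ExampleThreadShape = ThreadShape<ExampleWarpTile, ExamplePolicy>;

static_assert(ExampleThreadShape::kM == 8, "32 rows / 4 thread rows");
static_assert(ExampleThreadShape::kN == 8, "64 columns / 8 thread columns");
static_assert(ExampleThreadShape::kK == 1, "K comes from the lane MMA shape");
```

With these sizes each thread accumulates an 8x8 block of the warp tile, one K step at a time.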
typename IteratorA::Fragment FragmentA
Storage for A tile.
Definition: mma_simt.h:154
typename platform::conditional< use_dp4a, int8_t, bool >::type dp4a_type
Definition: mma_simt.h:122
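dp4a_type is selected by the compile-time flag use_dp4a: int8_t when the flag is true, bool otherwise. The condition that sets use_dp4a is not shown in this listing, so the sketch below takes it as a template parameter; `Dp4aSelect` is a hypothetical wrapper, and `std::conditional` stands in for `platform::conditional`.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical wrapper: the flag is supplied directly instead of being derived
// from the element and layout types as the real class would do.
template <bool UseDp4a>
struct Dp4aSelect {
  static constexpr bool use_dp4a = UseDp4a;
  // Same selection as the documented dp4a_type typedef.
  using dp4a_type = typename std::conditional<use_dp4a, std::int8_t, bool>::type;
};

static_assert(std::is_same<Dp4aSelect<true>::dp4a_type, std::int8_t>::value,
              "flag set: dp4a_type is int8_t");
static_assert(std::is_same<Dp4aSelect<false>::dp4a_type, bool>::value,
              "flag clear: dp4a_type is bool");
```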
Mapping function for row-major matrices.
Definition: layout/matrix.h:50
Structure to compute the matrix product.
Definition: gemm/thread/mma.h:66
ElementA_ ElementA
Data type of multiplicand A.
Definition: mma_simt.h:80
typename platform::conditional< platform::is_same< layout::ColumnMajorInterleaved< 4 >, LayoutA >::value, layout::ColumnMajor, typename platform::conditional< platform::is_same< layout::RowMajorInterleaved< 4 >, LayoutA >::value, layout::RowMajor, LayoutA >::type >::type ThreadLayoutA
Definition: mma_simt.h:108
ElementB_ ElementB
Data type of multiplicand B.
Definition: mma_simt.h:86
Basic include for CUTLASS.
LayoutB_ LayoutB
Layout of multiplicand B.
Definition: mma_simt.h:89
CUTLASS_DEVICE MmaSimt()
Constructor.
Definition: mma_simt.h:190
typename IteratorB::Fragment FragmentB
Storage for B tile.
Definition: mma_simt.h:168