CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
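cutlass::transform::PitchLinearWarpRakedThreadMap< Shape_, Threads, WarpThreadArrangement_, ElementsPerAccess > Struct Template Reference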
#include <pitch_linear_thread_map.h>
Classes
struct | Detail
Internal details made public to facilitate introspection.
Public Types
using | TensorCoord = layout::PitchLinearCoord
Tensor coordinate.
using | Shape = Shape_
Tile shape.
using | ThreadAccessShape = layout::PitchLinearShape< kElementsPerAccess, 1 >
Shape of access by each thread.
using | Iterations = layout::PitchLinearShape< Detail::WarpAccessIterations::kContiguous / Detail::kWarpsContiguous, Detail::WarpAccessIterations::kStrided / Detail::kWarpsStrided >
Iterations along each dimension (concept: PitchLinearShape).
using | Delta = layout::PitchLinearShape< Detail::WarpThreadArrangement::kContiguous * kElementsPerAccess, Detail::WarpThreadArrangement::kStrided >
Delta between accesses (units of elements, concept: PitchLinearShape).
Static Public Member Functions
static CUTLASS_HOST_DEVICE TensorCoord | initial_offset (int thread_id)
Maps thread ID to a coordinate offset within the tensor's logical coordinate space.
Static Public Attributes
static int const | kThreads = Threads
Number of threads total.
static int const | kElementsPerAccess = ElementsPerAccess
Vector length of each access, in elements (taken from the ElementsPerAccess template parameter).
Policy defining a warp-raked arrangement in which a shape is partitioned into contiguous elements.
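To make the warp-raked arrangement concrete, the following standalone C++ sketch models how a thread ID could be split into a warp index and a lane index and then mapped to a starting coordinate. It does not use CUTLASS: `Coord`, `WarpRakedParams`, `initial_offset_sketch`, and `example_params` are hypothetical names. The decomposition is an assumption chosen to be consistent with the `Delta` and `Iterations` definitions on this page, since the body of `initial_offset` is not reproduced here.

```cpp
#include <cassert>

// Hypothetical stand-in for layout::PitchLinearCoord.
struct Coord {
  int contiguous;
  int strided;
};

// Hypothetical runtime mirror of the compile-time quantities named on this page.
struct WarpRakedParams {
  int lanes_contiguous;       // WarpThreadArrangement::kContiguous
  int lanes_strided;          // WarpThreadArrangement::kStrided
  int warps_contiguous;       // Detail::kWarpsContiguous
  int iterations_contiguous;  // Iterations::kContiguous
  int iterations_strided;     // Iterations::kStrided
  int elements_per_access;    // kElementsPerAccess
};

// ASSUMED mapping (not taken from this page): each warp owns one contiguous
// footprint that covers all of its iterations, and lanes are laid out
// contiguous-first inside the warp.
inline Coord initial_offset_sketch(const WarpRakedParams& p, int thread_id) {
  const int warp_size = p.lanes_contiguous * p.lanes_strided;
  const int warp_id = thread_id / warp_size;
  const int lane_id = thread_id % warp_size;

  // Footprint of one warp over all of its iterations, in units of accesses.
  const int footprint_c = p.lanes_contiguous * p.iterations_contiguous;
  const int footprint_s = p.lanes_strided * p.iterations_strided;

  // Position of this warp in the grid of warps.
  const int warp_c = warp_id % p.warps_contiguous;
  const int warp_s = warp_id / p.warps_contiguous;

  // Position of this lane inside the warp's thread arrangement.
  const int lane_c = lane_id % p.lanes_contiguous;
  const int lane_s = lane_id / p.lanes_contiguous;

  // The contiguous index is converted from accesses to elements.
  return Coord{(footprint_c * warp_c + lane_c) * p.elements_per_access,
               footprint_s * warp_s + lane_s};
}

// Example configuration: 64x32 element tile, 128 threads, 8x4 lanes per warp,
// 8 elements per access, one warp along the contiguous dimension,
// 1x2 iterations per thread.
inline WarpRakedParams example_params() {
  return WarpRakedParams{8, 4, 1, 1, 2, 8};
}
```

Under this configuration, thread 32 (lane 0 of warp 1) starts at strided offset 8 rather than 4: each warp owns a block of 8 rows and covers it in two strided iterations, stepping by `Delta::kStrided = 4` between them.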
using cutlass::transform::PitchLinearWarpRakedThreadMap< Shape_, Threads, WarpThreadArrangement_, ElementsPerAccess >::Delta = layout::PitchLinearShape< Detail::WarpThreadArrangement::kContiguous * kElementsPerAccess, Detail::WarpThreadArrangement::kStrided >
using cutlass::transform::PitchLinearWarpRakedThreadMap< Shape_, Threads, WarpThreadArrangement_, ElementsPerAccess >::Iterations = layout::PitchLinearShape< Detail::WarpAccessIterations::kContiguous / Detail::kWarpsContiguous, Detail::WarpAccessIterations::kStrided / Detail::kWarpsStrided >
using cutlass::transform::PitchLinearWarpRakedThreadMap< Shape_, Threads, WarpThreadArrangement_, ElementsPerAccess >::Shape = Shape_
using cutlass::transform::PitchLinearWarpRakedThreadMap< Shape_, Threads, WarpThreadArrangement_, ElementsPerAccess >::TensorCoord = layout::PitchLinearCoord
using cutlass::transform::PitchLinearWarpRakedThreadMap< Shape_, Threads, WarpThreadArrangement_, ElementsPerAccess >::ThreadAccessShape = layout::PitchLinearShape< kElementsPerAccess, 1 >
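As a worked illustration of how `Iterations` and `Delta` follow from the template arguments, the sketch below re-creates the arithmetic in standalone C++. `PitchLinearShape` here is a simplified stand-in for `layout::PitchLinearShape`, and `WarpRakedSketch` and `ExampleMap` are hypothetical names. The members of `Detail` (`WarpAccessIterations`, `kWarpsContiguous`, `kWarpsStrided`) are defined under assumptions stated in the comments, because this page names them without showing their definitions.

```cpp
#include <cassert>

// Simplified stand-in for layout::PitchLinearShape: a compile-time
// (contiguous, strided) pair.
template <int Contiguous, int Strided>
struct PitchLinearShape {
  static constexpr int kContiguous = Contiguous;
  static constexpr int kStrided = Strided;
  static constexpr int kCount = Contiguous * Strided;
};

// Hypothetical model of the thread map's compile-time arithmetic.
template <typename Shape_, int Threads, typename WarpThreadArrangement_,
          int ElementsPerAccess = 1>
struct WarpRakedSketch {
  using Shape = Shape_;
  static constexpr int kThreads = Threads;
  static constexpr int kElementsPerAccess = ElementsPerAccess;
  using ThreadAccessShape = PitchLinearShape<kElementsPerAccess, 1>;

  struct Detail {
    using WarpThreadArrangement = WarpThreadArrangement_;

    // ASSUMPTION: the warp size is the thread count of the arrangement.
    static constexpr int kWarpSize = WarpThreadArrangement::kCount;
    static constexpr int kWarpCount = kThreads / kWarpSize;

    // ASSUMPTION: the tile is first measured in vector accesses, then divided
    // by the warp's thread arrangement to get per-warp access iterations.
    using ShapeInAccesses =
        PitchLinearShape<Shape::kContiguous / kElementsPerAccess,
                         Shape::kStrided>;
    using WarpAccessIterations = PitchLinearShape<
        ShapeInAccesses::kContiguous / WarpThreadArrangement::kContiguous,
        ShapeInAccesses::kStrided / WarpThreadArrangement::kStrided>;

    // ASSUMPTION: warps are stacked along the strided dimension first; any
    // warps left over are placed along the contiguous dimension.
    static constexpr int kWarpsStrided =
        WarpAccessIterations::kStrided >= kWarpCount
            ? kWarpCount
            : WarpAccessIterations::kStrided;
    static constexpr int kWarpsContiguous =
        kWarpCount > WarpAccessIterations::kStrided
            ? kWarpCount / kWarpsStrided
            : 1;
  };

  // These two definitions follow the formulas given on this page.
  using Iterations = PitchLinearShape<
      Detail::WarpAccessIterations::kContiguous / Detail::kWarpsContiguous,
      Detail::WarpAccessIterations::kStrided / Detail::kWarpsStrided>;
  using Delta = PitchLinearShape<
      Detail::WarpThreadArrangement::kContiguous * kElementsPerAccess,
      Detail::WarpThreadArrangement::kStrided>;
};

// Example: 64x32 element tile, 128 threads, 8x4 lanes per warp,
// 8 elements per access.
using ExampleMap =
    WarpRakedSketch<PitchLinearShape<64, 32>, 128, PitchLinearShape<8, 4>, 8>;

static_assert(ExampleMap::Iterations::kContiguous == 1, "one contiguous step");
static_assert(ExampleMap::Iterations::kStrided == 2, "two strided steps");
static_assert(ExampleMap::Delta::kContiguous == 64, "8 lanes * 8 elements");
static_assert(ExampleMap::Delta::kStrided == 4, "4 lanes along strided");
```

In this example each thread issues 1 x 2 accesses of 8 elements, so the 128 threads together touch 128 * 2 * 8 = 2048 elements, which equals the 64 x 32 tile.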