CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
Thread map that presents a 2D thread-tiled mapping as a transposed PitchLinear2DThreadTile mapping.
#include <pitch_linear_thread_map.h>
Public Types | |
using | ThreadMap = ThreadMap_ |
Underlying ThreadMap. More... | |
using | TensorCoord = typename ThreadMap::TensorCoord |
Tensor coordinate. More... | |
using | Shape = typename ThreadMap::Shape |
Tile shape. More... | |
using | Iterations = layout::PitchLinearShape< ThreadMap::Iterations::kStrided, ThreadMap::Iterations::kContiguous > |
Iterations along each dimension (concept: PitchLinearShape) More... | |
using | ThreadAccessShape = typename ThreadMap::ThreadAccessShape |
using | Delta = layout::PitchLinearShape< ThreadMap::Delta::kStrided, ThreadMap::Delta::kContiguous > |
Delta between accesses (units of elements, concept: PitchLinearShape) More... | |
Static Public Member Functions | |
static CUTLASS_HOST_DEVICE TensorCoord | initial_offset (int thread_id) |
Static Public Attributes | |
static int const | kThreads = ThreadMap::kThreads |
Number of threads total. More... | |
static int const | kElementsPerAccess = ThreadMap::kElementsPerAccess |
Extract vector length from Layout. More... | |
using cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::Delta = layout::PitchLinearShape<ThreadMap::Delta::kStrided, ThreadMap::Delta::kContiguous> |
using cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::Iterations = layout::PitchLinearShape<ThreadMap::Iterations::kStrided, ThreadMap::Iterations::kContiguous> |
using cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::Shape = typename ThreadMap::Shape |
using cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::TensorCoord = typename ThreadMap::TensorCoord |
using cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::ThreadAccessShape = typename ThreadMap::ThreadAccessShape |
using cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::ThreadMap = ThreadMap_ |
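The aliases above swap the contiguous and strided extents of Iterations and Delta, while Shape and ThreadAccessShape are forwarded unchanged. The following is a minimal sketch of that pattern and does not use CUTLASS itself: PitchLinearShape here is a simplified stand-in for layout::PitchLinearShape, and ExampleThreadMap, TransposedMapSketch, and all numeric values are hypothetical names and numbers chosen for illustration.

```cpp
// Simplified stand-in for cutlass::layout::PitchLinearShape (illustration only).
template <int Contiguous, int Strided>
struct PitchLinearShape {
  static int const kContiguous = Contiguous;
  static int const kStrided = Strided;
  static int const kCount = Contiguous * Strided;
};

// Hypothetical underlying thread map; the extents below are made up.
struct ExampleThreadMap {
  using Shape = PitchLinearShape<64, 8>;
  using Iterations = PitchLinearShape<2, 4>;
  using Delta = PitchLinearShape<32, 1>;
  using ThreadAccessShape = PitchLinearShape<4, 1>;
  static int const kThreads = 32;
  static int const kElementsPerAccess = 4;
};

// Sketch of the transposing wrapper: same alias pattern as documented above.
template <typename ThreadMap_>
struct TransposedMapSketch {
  using ThreadMap = ThreadMap_;
  using Shape = typename ThreadMap::Shape;  // forwarded unchanged
  using ThreadAccessShape = typename ThreadMap::ThreadAccessShape;

  // Contiguous and strided extents trade places.
  using Iterations = PitchLinearShape<ThreadMap::Iterations::kStrided,
                                      ThreadMap::Iterations::kContiguous>;
  using Delta = PitchLinearShape<ThreadMap::Delta::kStrided,
                                 ThreadMap::Delta::kContiguous>;

  static int const kThreads = ThreadMap::kThreads;
  static int const kElementsPerAccess = ThreadMap::kElementsPerAccess;
};

using Transposed = TransposedMapSketch<ExampleThreadMap>;

// Iterations <2, 4> in the underlying map becomes <4, 2> in the wrapper.
static_assert(Transposed::Iterations::kContiguous == 4, "strided -> contiguous");
static_assert(Transposed::Iterations::kStrided == 2, "contiguous -> strided");
```

Because only the two extents are exchanged, the total iteration count (kCount) is the same for the wrapper and the underlying map.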
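static CUTLASS_HOST_DEVICE TensorCoord cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::initial_offset (int thread_id) | inline, static |
Maps a thread ID to a coordinate offset within the tensor's logical coordinate space. Note that this differs slightly from the initial_offset of PitchLinearWarpRakedThreadMap.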
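One way to picture the mapping is sketched below. It assumes the transposed map obtains the underlying map's offset and exchanges its contiguous and strided components, which matches the transposed Iterations and Delta types but is not stated on this page. Coord, ExampleThreadMap, TransposedMapSketch, and the 8-threads-by-4-elements arrangement are hypothetical stand-ins, and the functions are marked constexpr here only so the result can be checked at compile time.

```cpp
// Simplified stand-in for a pitch-linear coordinate (illustration only).
struct Coord {
  int contiguous_;
  int strided_;
  constexpr Coord(int contiguous, int strided)
      : contiguous_(contiguous), strided_(strided) {}
  constexpr int contiguous() const { return contiguous_; }
  constexpr int strided() const { return strided_; }
};

// Hypothetical underlying map: 8 threads along the contiguous dimension,
// each thread starting 4 elements after its neighbour.
struct ExampleThreadMap {
  using TensorCoord = Coord;
  static constexpr TensorCoord initial_offset(int thread_id) {
    return TensorCoord((thread_id % 8) * 4, thread_id / 8);
  }
};

// Sketch of the transposed wrapper.
template <typename ThreadMap_>
struct TransposedMapSketch {
  using ThreadMap = ThreadMap_;
  using TensorCoord = typename ThreadMap::TensorCoord;

  // Assumption: the offset of the underlying map with its two components swapped.
  static constexpr TensorCoord initial_offset(int thread_id) {
    return TensorCoord(ThreadMap::initial_offset(thread_id).strided(),
                       ThreadMap::initial_offset(thread_id).contiguous());
  }
};

using Transposed = TransposedMapSketch<ExampleThreadMap>;

// Thread 10 starts at (contiguous = 8, strided = 1) in the underlying map,
// so the sketch places it at (contiguous = 1, strided = 8).
static_assert(Transposed::initial_offset(10).contiguous() == 1, "swapped");
static_assert(Transposed::initial_offset(10).strided() == 8, "swapped");
```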