CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ > Struct Template Reference

Thread map that presents a 2D thread-tiled mapping as a transposed Pitchlinear2DThreadTile mapping.

#include <pitch_linear_thread_map.h>

Public Types

using ThreadMap = ThreadMap_
 Underlying ThreadMap.
 
using TensorCoord = typename ThreadMap::TensorCoord
 Tensor coordinate.
 
using Shape = typename ThreadMap::Shape
 Tile shape.
 
using Iterations = layout::PitchLinearShape< ThreadMap::Iterations::kStrided, ThreadMap::Iterations::kContiguous >
 Iterations along each dimension (concept: PitchLinearShape)
 
using ThreadAccessShape = typename ThreadMap::ThreadAccessShape
 
using Delta = layout::PitchLinearShape< ThreadMap::Delta::kStrided, ThreadMap::Delta::kContiguous >
 Delta between accesses (units of elements, concept: PitchLinearShape)
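
The Iterations and Delta aliases above take the underlying map's strided extent as their contiguous extent and vice versa. The following is a minimal standalone sketch of that swap. The names `PitchLinearShape` (as defined here), `ExampleThreadMap`, and `TransposedView` are simplified, hypothetical stand-ins written for illustration, not the CUTLASS definitions, and the extents are arbitrary.

```cpp
// Simplified stand-in for a pitch-linear shape; NOT the CUTLASS type.
template <int Contiguous, int Strided>
struct PitchLinearShape {
  static constexpr int kContiguous = Contiguous;
  static constexpr int kStrided = Strided;
  static constexpr int kCount = Contiguous * Strided;
};

// Hypothetical underlying thread map exposing the members the wrapper reads.
struct ExampleThreadMap {
  using Iterations = PitchLinearShape<4, 2>;  // 4 contiguous, 2 strided
  using Delta = PitchLinearShape<8, 16>;      // element steps between accesses
};

// Transposed view: the contiguous and strided extents trade places.
template <typename ThreadMap_>
struct TransposedView {
  using ThreadMap = ThreadMap_;
  using Iterations = PitchLinearShape<ThreadMap::Iterations::kStrided,
                                      ThreadMap::Iterations::kContiguous>;
  using Delta = PitchLinearShape<ThreadMap::Delta::kStrided,
                                 ThreadMap::Delta::kContiguous>;
};

using Transposed = TransposedView<ExampleThreadMap>;

// The extents are swapped; the total number of iterations is unchanged.
static_assert(Transposed::Iterations::kContiguous == 2, "strided becomes contiguous");
static_assert(Transposed::Iterations::kStrided == 4, "contiguous becomes strided");
static_assert(Transposed::Iterations::kCount == ExampleThreadMap::Iterations::kCount,
              "transposing preserves the iteration count");
```

Because the swap happens entirely in type aliases, it adds no run-time cost in this sketch; code that consumes the transposed view only sees different compile-time constants.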
 

Static Public Member Functions

static CUTLASS_HOST_DEVICE TensorCoord initial_offset (int thread_id)
 

Static Public Attributes

static int const kThreads = ThreadMap::kThreads
 Number of threads total.
 
static int const kElementsPerAccess = ThreadMap::kElementsPerAccess
 Extract vector length from Layout.
 

Member Typedef Documentation

template<typename ThreadMap_ >
using cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::Delta = layout::PitchLinearShape<ThreadMap::Delta::kStrided, ThreadMap::Delta::kContiguous>
template<typename ThreadMap_ >
using cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::Iterations = layout::PitchLinearShape<ThreadMap::Iterations::kStrided, ThreadMap::Iterations::kContiguous>
template<typename ThreadMap_ >
using cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::Shape = typename ThreadMap::Shape
template<typename ThreadMap_ >
using cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::TensorCoord = typename ThreadMap::TensorCoord
template<typename ThreadMap_ >
using cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::ThreadAccessShape = typename ThreadMap::ThreadAccessShape
template<typename ThreadMap_ >
using cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::ThreadMap = ThreadMap_

Member Function Documentation

template<typename ThreadMap_ >
static CUTLASS_HOST_DEVICE TensorCoord cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::initial_offset ( int  thread_id)
inline static

Maps a thread ID to a coordinate offset within the tensor's logical coordinate space. Note that this differs slightly from the corresponding function of PitchLinearWarpRakedThreadMap.
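
This page does not show the function body. One plausible reading, consistent with the transposed Iterations and Delta types, is that the wrapper asks the underlying map for its offset and then swaps the contiguous and strided components. The sketch below illustrates that idea under that assumption. `Coord`, `ExampleOffsetMap`, and `TransposedOffsetView` are hypothetical names introduced here, and the 4-wide thread layout is arbitrary.

```cpp
#include <cassert>

// Hypothetical stand-in for a pitch-linear coordinate; not the CUTLASS type.
struct Coord {
  int contiguous;
  int strided;
};

// Hypothetical underlying map: 8 threads arranged 4 wide along the contiguous
// dimension, each thread covering 2 contiguous elements.
struct ExampleOffsetMap {
  using TensorCoord = Coord;
  static constexpr int kThreads = 8;
  static constexpr int kElementsPerAccess = 2;

  static TensorCoord initial_offset(int thread_id) {
    assert(thread_id >= 0 && thread_id < kThreads);
    return TensorCoord{(thread_id % 4) * kElementsPerAccess, thread_id / 4};
  }
};

// Transposed wrapper (assumed behavior): forward to the underlying map,
// then exchange the contiguous and strided components of the result.
template <typename ThreadMap_>
struct TransposedOffsetView {
  using ThreadMap = ThreadMap_;
  using TensorCoord = typename ThreadMap::TensorCoord;
  static constexpr int kThreads = ThreadMap::kThreads;
  static constexpr int kElementsPerAccess = ThreadMap::kElementsPerAccess;

  static TensorCoord initial_offset(int thread_id) {
    TensorCoord coord = ThreadMap::initial_offset(thread_id);
    return TensorCoord{coord.strided, coord.contiguous};
  }
};
```

With these stand-ins, thread 5 sits at (contiguous 2, strided 1) in the underlying map and at (contiguous 1, strided 2) in the transposed view, while kThreads and kElementsPerAccess are forwarded unchanged.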

Member Data Documentation

template<typename ThreadMap_ >
int const cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::kElementsPerAccess = ThreadMap::kElementsPerAccess
static
template<typename ThreadMap_ >
int const cutlass::transform::TransposePitchLinearThreadMap2DThreadTile< ThreadMap_ >::kThreads = ThreadMap::kThreads
static

The documentation for this struct was generated from the following file: