CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
cutlass::transform::TransposePitchLinearThreadMap< ThreadMap_, WarpThreadArrangement_ > Struct Template Reference

#include <pitch_linear_thread_map.h>

Classes

struct  Detail
 Internal details made public to facilitate introspection.
 

Public Types

using ThreadMap = ThreadMap_
 Underlying ThreadMap.
 
using TensorCoord = typename ThreadMap::TensorCoord
 Tensor coordinate.
 
using Shape = typename ThreadMap::Shape
 Tile shape.
 
using ThreadAccessShape = layout::PitchLinearShape< kElementsPerAccess, 1 >
 Shape of access by each thread.
 
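using Iterations = layout::PitchLinearShape< ThreadMap::Iterations::kStrided, ThreadMap::Iterations::kContiguous >
 Iterations along each dimension (concept: PitchLinearShape); the contiguous and strided counts of the underlying ThreadMap are swapped.
 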
using Delta = layout::PitchLinearShape< Detail::WarpThreadArrangement::kContiguous *kElementsPerAccess, Detail::WarpThreadArrangement::kStrided >
 Delta between accesses (units of elements, concept: PitchLinearShape).
 

Static Public Member Functions

static CUTLASS_HOST_DEVICE TensorCoord initial_offset (int thread_id)
 

Static Public Attributes

static int const kThreads = ThreadMap::kThreads
 Number of threads total.
 
static int const kElementsPerAccess = ThreadMap::kElementsPerAccess
 Extract vector length from Layout.
 

Detailed Description

template<typename ThreadMap_, typename WarpThreadArrangement_>
struct cutlass::transform::TransposePitchLinearThreadMap< ThreadMap_, WarpThreadArrangement_ >

Transposes an existing ThreadMap. For example, an interleaved layout is congruous in global memory but crosswise in shared memory, so the thread coordinates must be transposed when moving data between the two.
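
As the typedefs on this page show, the transposition swaps the contiguous and strided iteration counts of the underlying map and derives the access deltas from the warp thread arrangement. The following is a minimal standalone sketch of that relationship, not the CUTLASS source. `PitchLinearShape` here is a local stand-in for `layout::PitchLinearShape`; `BaseThreadMap`, `WarpArrangement`, and `TransposedMapSketch` are hypothetical names with made-up sizes; and the sketch assumes `Detail::WarpThreadArrangement` is simply `WarpThreadArrangement_`.

```cpp
// Local stand-in for cutlass::layout::PitchLinearShape (assumption, not the real type).
template <int Contiguous, int Strided>
struct PitchLinearShape {
  static constexpr int kContiguous = Contiguous;
  static constexpr int kStrided = Strided;
  static constexpr int kCount = Contiguous * Strided;
};

// Hypothetical underlying thread map: 128 threads, 4-element vector accesses,
// iterating 4 times along the contiguous dimension and once along the strided one.
struct BaseThreadMap {
  using Shape = PitchLinearShape<64, 32>;
  using Iterations = PitchLinearShape<4, 1>;
  static constexpr int kThreads = 128;
  static constexpr int kElementsPerAccess = 4;
};

// Sketch of the member typedefs documented on this page.
template <typename ThreadMap_, typename WarpThreadArrangement_>
struct TransposedMapSketch {
  using ThreadMap = ThreadMap_;
  using Shape = typename ThreadMap::Shape;  // tile shape is passed through unchanged

  static constexpr int kThreads = ThreadMap::kThreads;
  static constexpr int kElementsPerAccess = ThreadMap::kElementsPerAccess;

  using ThreadAccessShape = PitchLinearShape<kElementsPerAccess, 1>;

  // Contiguous and strided iteration counts are swapped.
  using Iterations = PitchLinearShape<ThreadMap::Iterations::kStrided,
                                      ThreadMap::Iterations::kContiguous>;

  // Distance between successive accesses, in elements.
  using Delta = PitchLinearShape<WarpThreadArrangement_::kContiguous * kElementsPerAccess,
                                 WarpThreadArrangement_::kStrided>;
};

// Hypothetical warp arrangement: 8 threads contiguous by 4 threads strided.
using WarpArrangement = PitchLinearShape<8, 4>;
using Transposed = TransposedMapSketch<BaseThreadMap, WarpArrangement>;

static_assert(Transposed::Iterations::kContiguous == 1, "strided count of the base map");
static_assert(Transposed::Iterations::kStrided == 4, "contiguous count of the base map");
static_assert(Transposed::Delta::kContiguous == 8 * 4, "warp threads x vector length");
static_assert(Transposed::Delta::kStrided == 4, "warp threads along the strided dimension");
```

With these illustrative sizes, the base map's 4 x 1 iterations become 1 x 4, while `Shape`, `kThreads`, and `kElementsPerAccess` are inherited unchanged.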

Member Typedef Documentation

template<typename ThreadMap_ , typename WarpThreadArrangement_ >
using cutlass::transform::TransposePitchLinearThreadMap< ThreadMap_, WarpThreadArrangement_ >::Delta = layout::PitchLinearShape<Detail::WarpThreadArrangement::kContiguous * kElementsPerAccess, Detail::WarpThreadArrangement::kStrided>
template<typename ThreadMap_ , typename WarpThreadArrangement_ >
using cutlass::transform::TransposePitchLinearThreadMap< ThreadMap_, WarpThreadArrangement_ >::Iterations = layout::PitchLinearShape<ThreadMap::Iterations::kStrided, ThreadMap::Iterations::kContiguous>
template<typename ThreadMap_ , typename WarpThreadArrangement_ >
using cutlass::transform::TransposePitchLinearThreadMap< ThreadMap_, WarpThreadArrangement_ >::Shape = typename ThreadMap::Shape
template<typename ThreadMap_ , typename WarpThreadArrangement_ >
using cutlass::transform::TransposePitchLinearThreadMap< ThreadMap_, WarpThreadArrangement_ >::TensorCoord = typename ThreadMap::TensorCoord
template<typename ThreadMap_ , typename WarpThreadArrangement_ >
using cutlass::transform::TransposePitchLinearThreadMap< ThreadMap_, WarpThreadArrangement_ >::ThreadAccessShape = layout::PitchLinearShape<kElementsPerAccess, 1>
template<typename ThreadMap_ , typename WarpThreadArrangement_ >
using cutlass::transform::TransposePitchLinearThreadMap< ThreadMap_, WarpThreadArrangement_ >::ThreadMap = ThreadMap_

Member Function Documentation

template<typename ThreadMap_ , typename WarpThreadArrangement_ >
static CUTLASS_HOST_DEVICE TensorCoord cutlass::transform::TransposePitchLinearThreadMap< ThreadMap_, WarpThreadArrangement_ >::initial_offset ( int  thread_id)
inline static

Maps a thread ID to a coordinate offset within the tensor's logical coordinate space. Note that this mapping differs slightly from that of PitchLinearWarpRakedThreadMap.
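
To make the idea of a thread-ID-to-coordinate mapping concrete, here is one plausible decomposition, written as a standalone sketch rather than copied from the library. `Coord` and `OffsetSketch` are hypothetical names, all template parameters are illustrative, and the warp ordering (consecutive warps advance along the strided dimension first) is an assumption of this sketch. The actual `initial_offset` may order warps and lanes differently.

```cpp
// Hypothetical stand-in for a pitch-linear coordinate (not the CUTLASS type).
struct Coord {
  int contiguous;
  int strided;
};

// All parameters are illustrative; thread and warp counts are in units of threads
// and warps, and iteration counts are in units of vector accesses.
template <int WarpThreadsContiguous, int WarpThreadsStrided,
          int WarpsStrided,
          int IterationsContiguous, int IterationsStrided,
          int ElementsPerAccess>
struct OffsetSketch {
  static constexpr int kWarpSize = WarpThreadsContiguous * WarpThreadsStrided;

  static constexpr Coord initial_offset(int thread_id) {
    int warp_id = thread_id / kWarpSize;
    int lane_id = thread_id % kWarpSize;

    // Area covered by one warp over all of its iterations (units of vectors).
    int footprint_contiguous = WarpThreadsContiguous * IterationsContiguous;
    int footprint_strided = WarpThreadsStrided * IterationsStrided;

    // Assumed warp ordering: the strided index varies fastest.
    int warp_contiguous = warp_id / WarpsStrided;
    int warp_strided = warp_id % WarpsStrided;

    // Position of the lane inside its warp (units of vectors).
    int lane_contiguous = lane_id % WarpThreadsContiguous;
    int lane_strided = lane_id / WarpThreadsContiguous;

    // Combine, then convert the contiguous index from vectors to elements.
    return Coord{
        (footprint_contiguous * warp_contiguous + lane_contiguous) * ElementsPerAccess,
        footprint_strided * warp_strided + lane_strided};
  }
};

// Example: 8 x 4 threads per warp, 4 warps stacked along the strided dimension,
// 1 x 4 iterations per thread, 4 elements per access.
using Map = OffsetSketch<8, 4, 4, 1, 4, 4>;

static_assert(Map::initial_offset(0).contiguous == 0, "first thread starts at the origin");
static_assert(Map::initial_offset(9).contiguous == 4, "lane 9 is one vector along");
static_assert(Map::initial_offset(9).strided == 1, "lane 9 is on the second row");
```

In this configuration thread 37 (warp 1, lane 5) lands at contiguous offset 20 and strided offset 16: five vectors of four elements along the contiguous dimension, and one warp footprint of 16 rows along the strided dimension.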

Member Data Documentation

template<typename ThreadMap_ , typename WarpThreadArrangement_ >
int const cutlass::transform::TransposePitchLinearThreadMap< ThreadMap_, WarpThreadArrangement_ >::kElementsPerAccess = ThreadMap::kElementsPerAccess
static
template<typename ThreadMap_ , typename WarpThreadArrangement_ >
int const cutlass::transform::TransposePitchLinearThreadMap< ThreadMap_, WarpThreadArrangement_ >::kThreads = ThreadMap::kThreads
static

The documentation for this struct was generated from the following file: