CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > > Struct Template Reference

#include <pitch_linear_thread_map.h>

Classes

struct  Detail
 Internal implementation details.
 

Public Types

using TensorCoord = layout::PitchLinearCoord
 Tensor coordinate.
 
using Shape = Shape_
 Tile shape.
 
using ThreadAccessShape = cutlass::layout::PitchLinearShape< 4, 4 >
 Access shape of each thread.
 
using Iterations = typename platform::conditional< Threads >= Detail::ShapeVec::kContiguous, layout::PitchLinearShape< 1, (Threads >= Detail::ShapeVec::kContiguous ? Detail::ShapeVec::kStrided / (kThreads / Detail::ShapeVec::kContiguous) : 0) >, layout::PitchLinearShape< Detail::ShapeVec::kContiguous / kThreads, Detail::ShapeVec::kStrided > >::type
 Number of iterations performed by each thread.
 
using Delta = typename platform::conditional< Threads >= Detail::ShapeVec::kContiguous, layout::PitchLinearShape< Shape::kContiguous, kThreads * ThreadAccessShape::kStrided / Detail::ShapeVec::kContiguous >, layout::PitchLinearShape< kThreads * ThreadAccessShape::kContiguous, 1 > >::type
 Interval between accesses along each dimension of the tensor's logical coordinate space (in units of elements).
 

Static Public Member Functions

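static CUTLASS_HOST_DEVICE TensorCoord initial_offset (int thread_id)
 Maps thread ID to a coordinate offset within the tensor's logical coordinate space (in units of elements).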
 

Static Public Attributes

static int const kThreads = Threads
 Total number of threads.
 
static int const kElementsPerAccess = ThreadAccessShape::kContiguous
 Length of each access, taken from ThreadAccessShape::kContiguous.
 

Member Typedef Documentation

template<typename Shape_ , int Threads>
using cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > >::Delta = typename platform::conditional< Threads >= Detail::ShapeVec::kContiguous, layout::PitchLinearShape< Shape::kContiguous, kThreads * ThreadAccessShape::kStrided / Detail::ShapeVec::kContiguous >, layout::PitchLinearShape< kThreads * ThreadAccessShape::kContiguous, 1 > >::type

Interval between successive accesses along each dimension of the tensor's logical coordinate space, in units of elements.

template<typename Shape_ , int Threads>
using cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > >::Iterations = typename platform::conditional< Threads >= Detail::ShapeVec::kContiguous, layout::PitchLinearShape< 1, (Threads >= Detail::ShapeVec::kContiguous ? Detail::ShapeVec::kStrided / (kThreads / Detail::ShapeVec::kContiguous) : 0) >, layout::PitchLinearShape< Detail::ShapeVec::kContiguous / kThreads, Detail::ShapeVec::kStrided > >::type

Number of iterations performed by each thread.

template<typename Shape_ , int Threads>
using cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > >::Shape = Shape_
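
The conditional typedefs Iterations and Delta can be checked by hand for a concrete tile. The following is a minimal standalone sketch, not the CUTLASS implementation: `PitchLinearShapeSketch` and `ThreadMapSketch` are hypothetical stand-ins, `std::conditional` replaces `platform::conditional`, and `ShapeVec` is assumed to be the tile shape divided by the 4x4 per-thread access shape, which this page does not state.

```cpp
#include <type_traits>

// Hypothetical stand-in for cutlass::layout::PitchLinearShape (not the real header).
template <int Contiguous, int Strided>
struct PitchLinearShapeSketch {
  static constexpr int kContiguous = Contiguous;
  static constexpr int kStrided = Strided;
};

// Sketch of the Iterations / Delta selection for ThreadAccessShape = <4, 4>.
template <typename Shape, int Threads>
struct ThreadMapSketch {
  using ThreadAccessShape = PitchLinearShapeSketch<4, 4>;
  static constexpr int kThreads = Threads;

  // ASSUMPTION: the tile shape expressed in units of per-thread 4x4 accesses.
  using ShapeVec = PitchLinearShapeSketch<
      Shape::kContiguous / ThreadAccessShape::kContiguous,
      Shape::kStrided / ThreadAccessShape::kStrided>;

  // Same expression as the documented Iterations typedef. The inner ternary
  // keeps the division from being evaluated when Threads < ShapeVec::kContiguous.
  using Iterations = typename std::conditional<
      (Threads >= ShapeVec::kContiguous),
      PitchLinearShapeSketch<
          1,
          (Threads >= ShapeVec::kContiguous
               ? ShapeVec::kStrided / (kThreads / ShapeVec::kContiguous)
               : 0)>,
      PitchLinearShapeSketch<ShapeVec::kContiguous / kThreads,
                             ShapeVec::kStrided>>::type;

  // Same expression as the documented Delta typedef.
  using Delta = typename std::conditional<
      (Threads >= ShapeVec::kContiguous),
      PitchLinearShapeSketch<
          Shape::kContiguous,
          kThreads * ThreadAccessShape::kStrided / ShapeVec::kContiguous>,
      PitchLinearShapeSketch<kThreads * ThreadAccessShape::kContiguous, 1>>::type;
};

// Example tile: 64 contiguous x 32 strided elements.
using TileSketch = PitchLinearShapeSketch<64, 32>;
using Map32 = ThreadMapSketch<TileSketch, 32>;  // Threads >= ShapeVec::kContiguous
using Map8 = ThreadMapSketch<TileSketch, 8>;    // Threads <  ShapeVec::kContiguous
```

Under these assumptions, a 64x32 tile with 32 threads gives Iterations = <1, 4> and Delta = <64, 8>, while 8 threads select the other branch and give Iterations = <2, 8> and Delta = <32, 1>.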

Member Function Documentation

template<typename Shape_ , int Threads>
static CUTLASS_HOST_DEVICE TensorCoord cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > >::initial_offset ( int  thread_id)
inline static

Maps a thread ID to a coordinate offset within the tensor's logical coordinate space, in units of elements.
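
This page does not show the body of initial_offset, so the following is only a sketch of one plausible mapping. It assumes that threads are laid out along the contiguous dimension first and that each thread owns one 4x4 access block. `CoordSketch` and `initial_offset_sketch` are hypothetical names, not part of CUTLASS.

```cpp
// Hypothetical stand-in for layout::PitchLinearCoord.
struct CoordSketch {
  int contiguous;
  int strided;
};

// ASSUMPTION: thread IDs advance along the contiguous dimension first, one
// access block (AccessContiguous x AccessStrided elements) per thread, and
// wrap to the next row of blocks after ShapeContiguous / AccessContiguous threads.
template <int ShapeContiguous, int AccessContiguous = 4, int AccessStrided = 4>
constexpr CoordSketch initial_offset_sketch(int thread_id) {
  return CoordSketch{
      (thread_id % (ShapeContiguous / AccessContiguous)) * AccessContiguous,
      (thread_id / (ShapeContiguous / AccessContiguous)) * AccessStrided};
}
```

With a tile that is 64 elements wide, this sketch places thread 0 at (0, 0), thread 15 at (60, 0), and thread 17 at (4, 4), with coordinates given as (contiguous, strided) in elements.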

Member Data Documentation

template<typename Shape_ , int Threads>
int const cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > >::kElementsPerAccess = ThreadAccessShape::kContiguous
static

Length of each access, taken from ThreadAccessShape::kContiguous.

template<typename Shape_ , int Threads>
int const cutlass::transform::PitchLinear2DThreadTileStripminedThreadMap< Shape_, Threads, cutlass::layout::PitchLinearShape< 4, 4 > >::kThreads = Threads
static

Total number of threads.

The documentation for this struct was generated from the following file: