CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
transform/threadblock/predicated_tile_iterator.h File Reference

Templates implementing loading of tiles from pitch-linear rank=2 tensors.

#include "cutlass/arch/memory.h"
#include "cutlass/transform/threadblock/predicated_tile_access_iterator.h"

Classes

class  cutlass::transform::threadblock::PredicatedTileIterator< Shape, Element, Layout, AdvanceRank, ThreadMap, AccessSize >
 
class  cutlass::transform::threadblock::PredicatedTileIterator< Shape_, Element_, layout::PitchLinear, AdvanceRank, ThreadMap_, AccessSize >
 
class  cutlass::transform::threadblock::PredicatedTileIterator< Shape_, Element_, layout::PitchLinear, AdvanceRank, ThreadMap_, AccessSize >::Params
 Parameters object is precomputed state and is host-constructible.
 
class  cutlass::transform::threadblock::PredicatedTileIterator< Shape_, Element_, layout::ColumnMajor, AdvanceRank, ThreadMap_, AccessSize >
 
class  cutlass::transform::threadblock::PredicatedTileIterator< Shape_, Element_, layout::ColumnMajor, AdvanceRank, ThreadMap_, AccessSize >::Params
 Parameters object is precomputed state and is host-constructible.
 
class  cutlass::transform::threadblock::PredicatedTileIterator< Shape_, Element_, layout::RowMajor, AdvanceRank, ThreadMap_, AccessSize >
 
class  cutlass::transform::threadblock::PredicatedTileIterator< Shape_, Element_, layout::RowMajor, AdvanceRank, ThreadMap_, AccessSize >::Params
 Parameters object is precomputed state and is host-constructible.
 
class  cutlass::transform::threadblock::PredicatedTileIterator< Shape_, Element_, layout::ColumnMajorInterleaved< InterleavedK >, AdvanceRank, ThreadMap_, AccessSize >
 
class  cutlass::transform::threadblock::PredicatedTileIterator< Shape_, Element_, layout::ColumnMajorInterleaved< InterleavedK >, AdvanceRank, ThreadMap_, AccessSize >::Params
 Parameters object is precomputed state and is host-constructible.
 
class  cutlass::transform::threadblock::PredicatedTileIterator< Shape_, Element_, layout::RowMajorInterleaved< InterleavedK >, AdvanceRank, ThreadMap_, AccessSize >
 
class  cutlass::transform::threadblock::PredicatedTileIterator< Shape_, Element_, layout::RowMajorInterleaved< InterleavedK >, AdvanceRank, ThreadMap_, AccessSize >::Params
 Parameters object is precomputed state and is host-constructible.
 

Namespaces

 cutlass
 
 cutlass::transform
 
 cutlass::transform::threadblock
 

Detailed Description

This iterator uses masks to guard out-of-bounds accesses and visits the last "residue" tile first, with the objective of minimizing predicate mask updates during steady-state operation.
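The residue-first ordering can be illustrated with a small host-side sketch. This is not the CUTLASS implementation: `GuardedSumResult` and `residue_first_sum` are hypothetical names, the tensor is reduced to a one-dimensional extent, and the mask is a plain `std::vector<bool>` rather than packed predicate registers. It only shows why handling the partial tile first leaves every later tile fully in bounds, so the mask needs at most one further update.

```cpp
#include <cassert>
#include <vector>

// Illustrative only: these names are hypothetical and are not CUTLASS APIs.
struct GuardedSumResult {
  float sum;         // sum of all in-bounds elements
  int tiles;         // number of tiles visited
  int mask_updates;  // how many times the predicate mask was (re)computed
};

GuardedSumResult residue_first_sum(const std::vector<float>& data, int tile) {
  assert(tile > 0);
  const int extent = static_cast<int>(data.size());
  GuardedSumResult result{0.0f, 0, 0};
  if (extent == 0) return result;

  // Size of the first ("residue") tile: the leftover when the extent is not
  // a multiple of the tile size, or a full tile when it divides evenly.
  int residue = extent % tile;
  if (residue == 0) residue = tile;

  // One predicate per element slot in the tile, computed for the residue tile.
  std::vector<bool> mask(tile);
  for (int i = 0; i < tile; ++i) mask[i] = (i < residue);
  ++result.mask_updates;

  // Visit the residue tile first; only the first `residue` slots are in bounds.
  int offset = 0;
  for (int i = 0; i < tile; ++i) {
    if (mask[i]) result.sum += data[offset + i];
  }
  ++result.tiles;
  offset += residue;

  // Steady state: every remaining tile is full, so the mask is set to
  // all-true once and never touched again.
  if (offset < extent) {
    mask.assign(tile, true);
    ++result.mask_updates;
  }
  for (; offset < extent; offset += tile) {
    for (int i = 0; i < tile; ++i) {
      if (mask[i]) result.sum += data[offset + i];
    }
    ++result.tiles;
  }
  return result;
}
```

In this sketch the number of mask updates is at most two regardless of how many tiles follow. Visiting the partial tile last would instead require a mask recomputation at the end of the main loop.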

A precomputed "Params" object minimizes the amount of state that must be stored in registers, and integer addition is used to advance the pointer through memory.
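A minimal sketch of the precomputed-parameters idea follows, assuming a row-major tile walked along its strided dimension. `SketchParams` and `SketchIterator` are hypothetical stand-ins, not the CUTLASS `Params` class or iterator; the member names and the choice of byte increments are assumptions made for illustration. The point is that stride arithmetic is done once on the host, and advancing the iterator is then a single integer addition on a byte pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a host-constructible parameters object.
// All multiplications happen here, once, outside the main loop.
struct SketchParams {
  std::int64_t inc_strided;  // bytes between consecutive rows inside a tile
  std::int64_t inc_advance;  // bytes from the start of one tile to the next

  SketchParams(std::int64_t stride_elements, int tile_rows,
               std::int64_t element_bytes)
      : inc_strided(stride_elements * element_bytes),
        inc_advance(stride_elements * tile_rows * element_bytes) {
    assert(stride_elements > 0 && tile_rows > 0 && element_bytes > 0);
  }
};

// Hypothetical iterator: its only mutable state is one byte pointer.
class SketchIterator {
 public:
  SketchIterator(const SketchParams& params, const void* base)
      : params_(params), pointer_(static_cast<const std::uint8_t*>(base)) {}

  // Advance to the next tile with a single integer addition.
  SketchIterator& operator++() {
    pointer_ += params_.inc_advance;
    return *this;
  }

  // Address of row `r` within the current tile.
  template <typename T>
  const T* row(int r) const {
    return reinterpret_cast<const T*>(pointer_ + r * params_.inc_strided);
  }

 private:
  SketchParams params_;          // copied here for simplicity
  const std::uint8_t* pointer_;  // current tile origin, in bytes
};
```

For an 8x4 row-major `float` matrix with two-row tiles, one would construct `SketchParams params(4, 2, sizeof(float))` on the host and pass it to the iterator; each `++it` then moves the pointer forward by two rows without recomputing any strides.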