CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
Templates exposing SIMD operators for SM60.
#include "simd.h"
Namespaces | |
cutlass | |
cutlass::arch | |
Functions | |
template<> | |
CUTLASS_HOST_DEVICE Array< half_t, 2 > | cutlass::arch::operator* (Array< half_t, 2 > const &a, Array< half_t, 2 > const &b) |
template<> | |
CUTLASS_HOST_DEVICE Array< half_t, 2 > | cutlass::arch::operator+ (Array< half_t, 2 > const &a, Array< half_t, 2 > const &b) |
template<> | |
CUTLASS_HOST_DEVICE Array< half_t, 2 > | cutlass::arch::operator- (Array< half_t, 2 > const &a, Array< half_t, 2 > const &b) |
template<> | |
CUTLASS_HOST_DEVICE Array< half_t, 2 > | cutlass::arch::mac (Array< half_t, 2 > const &a, Array< half_t, 2 > const &b, Array< half_t, 2 > const &c) |
Multiply-accumulate operators - specialized for half_t x 2. | |
template<> | |
CUTLASS_HOST_DEVICE half_t | cutlass::arch::dot (Array< half_t, 2 > const &a, Array< half_t, 2 > const &b, half_t accum) |
Dot product operator - specialized for half_t <- (half_t * half_t) x 2 + half_t. | |
template<> | |
CUTLASS_HOST_DEVICE float | cutlass::arch::dot (Array< half_t, 2 > const &a, Array< half_t, 2 > const &b, float accum) |
Dot product operator - specialized for float <- (half_t * half_t) x 2 + float. | |
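
The brief descriptions above suggest element-wise multiply-accumulate for mac and a two-element dot product added to an accumulator for dot. The following host-only sketch illustrates that arithmetic under stated assumptions: Vec2, mac_ref, dot_ref, and check_sketch are hypothetical names that are not part of CUTLASS, and float stands in for half_t so the code compiles without CUTLASS or CUDA. It is a reading of the documented signatures, not the library's implementation, and real half_t arithmetic may round differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cutlass::Array<half_t, 2>; float replaces half_t
// (an assumption made so the sketch builds without CUTLASS or CUDA).
using Vec2 = std::array<float, 2>;

// Element-wise multiply-accumulate: d[i] = a[i] * b[i] + c[i].
inline Vec2 mac_ref(Vec2 const &a, Vec2 const &b, Vec2 const &c) {
  Vec2 d{};
  for (std::size_t i = 0; i < d.size(); ++i) {
    d[i] = a[i] * b[i] + c[i];
  }
  return d;
}

// Two-element dot product with accumulator:
// result = accum + a[0] * b[0] + a[1] * b[1].
inline float dot_ref(Vec2 const &a, Vec2 const &b, float accum) {
  for (std::size_t i = 0; i < a.size(); ++i) {
    accum += a[i] * b[i];
  }
  return accum;
}

// Usage check with inputs whose products and sums are exact in float.
inline void check_sketch() {
  Vec2 a{1.0f, 2.0f};
  Vec2 b{3.0f, 4.0f};
  Vec2 c{0.5f, 0.25f};

  Vec2 d = mac_ref(a, b, c);
  assert(d[0] == 3.5f && d[1] == 8.25f);

  assert(dot_ref(a, b, 10.0f) == 21.0f);
}
```

The two dot overloads listed above differ only in accumulator type. With a float accumulator the sum is kept at higher precision than the half_t inputs, which is the usual reason to prefer that overload when many products are accumulated.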