cub::DeviceAdjacentDifference
Defined in cub/device/device_adjacent_difference.cuh
-
struct DeviceAdjacentDifference
DeviceAdjacentDifference provides device-wide, parallel operations for computing the differences of adjacent elements residing within device-accessible memory.
Overview
DeviceAdjacentDifference calculates the differences of adjacent elements in d_input. Because the binary operation could be noncommutative, there are two sets of methods. Methods named SubtractLeft subtract the left element *(i - 1) of the input sequence from the current element *i. Methods named SubtractRight subtract the right element *(i + 1) from the current element *i:

    int *d_values;                // [1, 2, 3, 4]
    // ...
    int *d_subtract_left_result;  // <-- [  1,  1,  1,  1 ]
    int *d_subtract_right_result; // <-- [ -1, -1, -1,  4 ]

For SubtractLeft, if the left element is out of bounds, the input element *i is assigned to *(result + (i - first)) without modification. For SubtractRight, if the right element is out of bounds, the input element *i is assigned to *(result + (i - first)) without modification. Here first and result denote the beginnings of the input and output ranges.
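The semantics above can be sketched on the host. The following is a sequential reference model only, not CUB's parallel implementation; the function names are hypothetical and it assumes int elements with plain subtraction as the difference operation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// SubtractLeft semantics: out[i] = in[i] - in[i - 1].
// The first element has no left neighbor and is copied unchanged.
std::vector<int> subtract_left_reference(const std::vector<int>& in)
{
  std::vector<int> out(in);
  for (std::size_t i = 1; i < in.size(); ++i)
  {
    out[i] = in[i] - in[i - 1];
  }
  return out;
}

// SubtractRight semantics: out[i] = in[i] - in[i + 1].
// The last element has no right neighbor and is copied unchanged.
std::vector<int> subtract_right_reference(const std::vector<int>& in)
{
  std::vector<int> out(in);
  for (std::size_t i = 0; i + 1 < in.size(); ++i)
  {
    out[i] = in[i] - in[i + 1];
  }
  return out;
}

// Reproduces the [1, 2, 3, 4] example from the overview.
void check_overview_example()
{
  const std::vector<int> values{1, 2, 3, 4};
  assert((subtract_left_reference(values) == std::vector<int>{1, 1, 1, 1}));
  assert((subtract_right_reference(values) == std::vector<int>{-1, -1, -1, 4}));
}
```

Calling check_overview_example() from a host program confirms both result rows of the example.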
Snippet
The code snippet below illustrates how to use DeviceAdjacentDifference to compute the left difference between adjacent elements.

    #include <cub/cub.cuh>
    // or equivalently <cub/device/device_adjacent_difference.cuh>

    // Declare, allocate, and initialize device-accessible pointers
    int num_items; // e.g., 8
    int *d_values; // e.g., [1, 2, 1, 2, 1, 2, 1, 2]
    // ...

    // Determine temporary device storage requirements
    void *d_temp_storage = nullptr;
    size_t temp_storage_bytes = 0;
    cub::DeviceAdjacentDifference::SubtractLeft(
      d_temp_storage, temp_storage_bytes, d_values, num_items);

    // Allocate temporary storage
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Run operation
    cub::DeviceAdjacentDifference::SubtractLeft(
      d_temp_storage, temp_storage_bytes, d_values, num_items);

    // d_values <-- [1, 1, -1, 1, -1, 1, -1, 1]

Public Static Functions
-
template<typename InputIteratorT, typename OutputIteratorT, typename DifferenceOpT = ::cuda::std::minus<>, typename NumItemsT = std::uint32_t>
static inline cudaError_t SubtractLeftCopy(void *d_temp_storage, std::size_t &temp_storage_bytes, InputIteratorT d_input, OutputIteratorT d_output, NumItemsT num_items, DifferenceOpT difference_op = {}, cudaStream_t stream = 0)
Subtracts the left element of each adjacent pair of elements residing within device-accessible memory.
Overview
Calculates the differences of adjacent elements in d_input. That is, *d_input is assigned to *d_output, and, for each iterator i in the range [d_input + 1, d_input + num_items), the result of difference_op(*i, *(i - 1)) is assigned to *(d_output + (i - d_input)).

Note that the behavior is undefined if the input and output ranges overlap in any way.
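The element-wise rule described here matches what std::adjacent_difference does on the host: it copies the first element and writes op(*i, *(i - 1)) for each later one. The sketch below uses that as a reference model, assuming int elements; subtract_left_copy_reference is a hypothetical helper name, not part of CUB.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Host-side analogue of the SubtractLeftCopy rule. The operator receives the
// current element first and its left neighbor second.
template <typename DifferenceOp = std::minus<>>
std::vector<int> subtract_left_copy_reference(const std::vector<int>& input,
                                              DifferenceOp difference_op = {})
{
  std::vector<int> output(input.size());
  std::adjacent_difference(input.begin(), input.end(), output.begin(), difference_op);
  return output;
}

// Shows that argument order matters for a noncommutative operator.
void example_subtract_left_copy()
{
  const std::vector<int> input{1, 2, 1, 2, 1, 2, 1, 2};

  // Default operator: current - left.
  assert((subtract_left_copy_reference(input)
          == std::vector<int>{1, 1, -1, 1, -1, 1, -1, 1}));

  // Swapped operator: left - current. The first element is still copied.
  const auto swapped =
    subtract_left_copy_reference(input, [](int current, int left) { return left - current; });
  assert((swapped == std::vector<int>{1, -1, 1, -1, 1, -1, 1, -1}));
}
```

A host model like this can serve as an expected-value generator when checking results copied back from the device.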
Snippet
The code snippet below illustrates how to use DeviceAdjacentDifference to compute the difference between adjacent elements.

    #include <cub/cub.cuh>
    // or equivalently <cub/device/device_adjacent_difference.cuh>

    // The operator is invoked from device code, so it must be marked __device__
    struct CustomDifference
    {
      template <typename DataType>
      __host__ __device__ DataType operator()(const DataType &lhs, const DataType &rhs) const
      {
        return lhs - rhs;
      }
    };

    // Declare, allocate, and initialize device-accessible pointers
    int num_items; // e.g., 8
    int *d_input;  // e.g., [1, 2, 1, 2, 1, 2, 1, 2]
    int *d_output;
    // ...

    // Determine temporary device storage requirements
    void *d_temp_storage = nullptr;
    size_t temp_storage_bytes = 0;
    cub::DeviceAdjacentDifference::SubtractLeftCopy(
      d_temp_storage, temp_storage_bytes,
      d_input, d_output, num_items, CustomDifference());

    // Allocate temporary storage
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Run operation
    cub::DeviceAdjacentDifference::SubtractLeftCopy(
      d_temp_storage, temp_storage_bytes,
      d_input, d_output, num_items, CustomDifference());

    // d_input  <-- [1, 2, 1, 2, 1, 2, 1, 2]
    // d_output <-- [1, 1, -1, 1, -1, 1, -1, 1]

- Template Parameters
InputIteratorT – is a model of Input Iterator. If x and y are objects of InputIteratorT’s value_type, then x - y is defined, InputIteratorT’s value_type is convertible to a type in OutputIteratorT’s set of value_types, and the return type of x - y is convertible to a type in OutputIteratorT’s set of value_types.
OutputIteratorT – is a model of Output Iterator.
DifferenceOpT – Its result_type is convertible to a type in OutputIteratorT’s set of value_types.
NumItemsT – [inferred] Type of num_items
- Parameters
d_temp_storage – [in] Device-accessible allocation of temporary storage. When nullptr, the required allocation size is written to temp_storage_bytes and no work is done.
temp_storage_bytes – [inout] Reference to size in bytes of d_temp_storage allocation
d_input – [in] Pointer to the input sequence
d_output – [out] Pointer to the output sequence
num_items – [in] Number of items in the input sequence
difference_op – [in] The binary function used to compute differences
stream – [in] [optional] CUDA stream to launch kernels within. Default is stream 0.
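The convertibility requirements on DifferenceOpT and OutputIteratorT for SubtractLeftCopy matter most when the input type is narrower than the output type. The host-side sketch below illustrates the idea with std::adjacent_difference standing in for the device call; WideningDifference and widened_left_difference are hypothetical names, and the choice of std::uint8_t input with int output is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical operator: widens both operands before subtracting, so its
// return type (int) can hold negative differences and converts to the
// output's value type without loss.
struct WideningDifference
{
  int operator()(std::uint8_t lhs, std::uint8_t rhs) const
  {
    return static_cast<int>(lhs) - static_cast<int>(rhs);
  }
};

// Left differences of 8-bit input written to a wider int output.
std::vector<int> widened_left_difference(const std::vector<std::uint8_t>& input)
{
  std::vector<int> output(input.size());
  std::adjacent_difference(input.begin(), input.end(), output.begin(), WideningDifference{});
  return output;
}

void example_widened_left_difference()
{
  const std::vector<std::uint8_t> input{10, 3, 200};
  // 10 is copied, 3 - 10 = -7, 200 - 3 = 197.
  assert((widened_left_difference(input) == std::vector<int>{10, -7, 197}));
}
```

An operator that returned std::uint8_t instead would wrap the -7 to 249 before it ever reached the output, which is why the operator's return type, and not only the output type, needs to be wide enough.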
-
template<typename RandomAccessIteratorT, typename DifferenceOpT = ::cuda::std::minus<>, typename NumItemsT = std::uint32_t>
static inline cudaError_t SubtractLeft(void *d_temp_storage, std::size_t &temp_storage_bytes, RandomAccessIteratorT d_input, NumItemsT num_items, DifferenceOpT difference_op = {}, cudaStream_t stream = 0)
Subtracts the left element of each adjacent pair of elements residing within device-accessible memory.
Overview
Calculates the differences of adjacent elements in d_input. That is, for each iterator i in the range [d_input + 1, d_input + num_items), the result of difference_op(*i, *(i - 1)) is assigned to *(d_input + (i - d_input)), which is *i itself, so the input sequence is overwritten in place.
Snippet
The code snippet below illustrates how to use DeviceAdjacentDifference to compute the difference between adjacent elements.

    #include <cub/cub.cuh>
    // or equivalently <cub/device/device_adjacent_difference.cuh>

    // The operator is invoked from device code, so it must be marked __device__
    struct CustomDifference
    {
      template <typename DataType>
      __host__ __device__ DataType operator()(const DataType &lhs, const DataType &rhs) const
      {
        return lhs - rhs;
      }
    };

    // Declare, allocate, and initialize device-accessible pointers
    int num_items; // e.g., 8
    int *d_data;   // e.g., [1, 2, 1, 2, 1, 2, 1, 2]
    // ...

    // Determine temporary device storage requirements
    void *d_temp_storage = nullptr;
    size_t temp_storage_bytes = 0;
    cub::DeviceAdjacentDifference::SubtractLeft(
      d_temp_storage, temp_storage_bytes,
      d_data, num_items, CustomDifference());

    // Allocate temporary storage
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Run operation
    cub::DeviceAdjacentDifference::SubtractLeft(
      d_temp_storage, temp_storage_bytes,
      d_data, num_items, CustomDifference());

    // d_data <-- [1, 1, -1, 1, -1, 1, -1, 1]

- Template Parameters
RandomAccessIteratorT – is a model of Random Access Iterator, and RandomAccessIteratorT is mutable. If x and y are objects of RandomAccessIteratorT’s value_type, and x - y is defined, then the return type of x - y should be convertible to a type in RandomAccessIteratorT’s set of value_types.
DifferenceOpT – Its result_type is convertible to a type in RandomAccessIteratorT’s set of value_types.
NumItemsT – [inferred] Type of num_items
- Parameters
d_temp_storage – [in] Device-accessible allocation of temporary storage. When nullptr, the required allocation size is written to temp_storage_bytes and no work is done.
temp_storage_bytes – [inout] Reference to size in bytes of d_temp_storage allocation
d_input – [inout] Pointer to the input sequence and the result
num_items – [in] Number of items in the input sequence
difference_op – [in] The binary function used to compute differences
stream – [in] [optional] CUDA stream to launch kernels within. Default is stream 0.
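Because SubtractLeft overwrites its input, every result must be computed from the original value of the left neighbor. A sequential host model can guarantee this by walking from the back, as sketched below. This only illustrates the required result; it says nothing about how CUB arranges the work in parallel, and subtract_left_in_place_reference is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential in-place left difference. Iterating from the last element down
// to index 1 means data[i - 1] still holds its original value whenever
// data[i] is overwritten. Index 0 is left untouched.
void subtract_left_in_place_reference(std::vector<int>& data)
{
  for (std::size_t i = data.size(); i-- > 1;)
  {
    data[i] = data[i] - data[i - 1];
  }
}

void example_subtract_left_in_place()
{
  std::vector<int> data{1, 2, 1, 2, 1, 2, 1, 2};
  subtract_left_in_place_reference(data);
  assert((data == std::vector<int>{1, 1, -1, 1, -1, 1, -1, 1}));
}
```

Walking front to back instead would feed already-differenced values into later steps and produce a different sequence.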
-
template<typename InputIteratorT, typename OutputIteratorT, typename DifferenceOpT = ::cuda::std::minus<>, typename NumItemsT = std::uint32_t>
static inline cudaError_t SubtractRightCopy(void *d_temp_storage, std::size_t &temp_storage_bytes, InputIteratorT d_input, OutputIteratorT d_output, NumItemsT num_items, DifferenceOpT difference_op = {}, cudaStream_t stream = 0)
Subtracts the right element of each adjacent pair of elements residing within device-accessible memory.
Overview
Calculates the right differences of adjacent elements in d_input. That is, *(d_input + num_items - 1) is assigned to *(d_output + num_items - 1), and, for each iterator i in the range [d_input, d_input + num_items - 1), the result of difference_op(*i, *(i + 1)) is assigned to *(d_output + (i - d_input)).

Note that the behavior is undefined if the input and output ranges overlap in any way.
Snippet
The code snippet below illustrates how to use DeviceAdjacentDifference to compute the difference between adjacent elements.

    #include <cub/cub.cuh>
    // or equivalently <cub/device/device_adjacent_difference.cuh>

    // The operator is invoked from device code, so it must be marked __device__
    struct CustomDifference
    {
      template <typename DataType>
      __host__ __device__ DataType operator()(const DataType &lhs, const DataType &rhs) const
      {
        return lhs - rhs;
      }
    };

    // Declare, allocate, and initialize device-accessible pointers
    int num_items; // e.g., 8
    int *d_input;  // e.g., [1, 2, 1, 2, 1, 2, 1, 2]
    int *d_output;
    // ...

    // Determine temporary device storage requirements
    void *d_temp_storage = nullptr;
    size_t temp_storage_bytes = 0;
    cub::DeviceAdjacentDifference::SubtractRightCopy(
      d_temp_storage, temp_storage_bytes,
      d_input, d_output, num_items, CustomDifference());

    // Allocate temporary storage
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Run operation
    cub::DeviceAdjacentDifference::SubtractRightCopy(
      d_temp_storage, temp_storage_bytes,
      d_input, d_output, num_items, CustomDifference());

    // d_input  <-- [1, 2, 1, 2, 1, 2, 1, 2]
    // d_output <-- [-1, 1, -1, 1, -1, 1, -1, 2]

- Template Parameters
InputIteratorT – is a model of Input Iterator. If x and y are objects of InputIteratorT’s value_type, then x - y is defined, InputIteratorT’s value_type is convertible to a type in OutputIteratorT’s set of value_types, and the return type of x - y is convertible to a type in OutputIteratorT’s set of value_types.
OutputIteratorT – is a model of Output Iterator.
DifferenceOpT – Its result_type is convertible to a type in OutputIteratorT’s set of value_types.
NumItemsT – [inferred] Type of num_items
- Parameters
d_temp_storage – [in] Device-accessible allocation of temporary storage. When nullptr, the required allocation size is written to temp_storage_bytes and no work is done.
temp_storage_bytes – [inout] Reference to size in bytes of d_temp_storage allocation
d_input – [in] Pointer to the input sequence
d_output – [out] Pointer to the output sequence
num_items – [in] Number of items in the input sequence
difference_op – [in] The binary function used to compute differences
stream – [in] [optional] CUDA stream to launch kernels within. Default is stream 0.
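For SubtractRightCopy the operator receives the current element first and its right neighbor second, and the last element is copied through. The host-side sketch below models that rule with a user-supplied operator, assuming int elements; subtract_right_copy_reference is a hypothetical helper, not a CUB function.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Host-side model of the SubtractRightCopy rule:
// output[i] = difference_op(input[i], input[i + 1]) for all but the last
// element, which is copied unchanged.
template <typename DifferenceOp = std::minus<>>
std::vector<int> subtract_right_copy_reference(const std::vector<int>& input,
                                               DifferenceOp difference_op = {})
{
  std::vector<int> output(input); // the last element keeps its copied value
  for (std::size_t i = 0; i + 1 < input.size(); ++i)
  {
    output[i] = difference_op(input[i], input[i + 1]);
  }
  return output;
}

void example_subtract_right_copy()
{
  const std::vector<int> input{1, 2, 1, 2, 1, 2, 1, 2};

  // Default operator: current - right.
  assert((subtract_right_copy_reference(input)
          == std::vector<int>{-1, 1, -1, 1, -1, 1, -1, 2}));

  // Swapped operator: right - current. The last element is still copied.
  const auto swapped =
    subtract_right_copy_reference(input, [](int current, int right) { return right - current; });
  assert((swapped == std::vector<int>{1, -1, 1, -1, 1, -1, 1, 2}));
}
```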
-
template<typename RandomAccessIteratorT, typename DifferenceOpT = ::cuda::std::minus<>, typename NumItemsT = std::uint32_t>
static inline cudaError_t SubtractRight(void *d_temp_storage, std::size_t &temp_storage_bytes, RandomAccessIteratorT d_input, NumItemsT num_items, DifferenceOpT difference_op = {}, cudaStream_t stream = 0)
Subtracts the right element of each adjacent pair of elements residing within device-accessible memory.
Overview
Calculates the right differences of adjacent elements in d_input. That is, for each iterator i in the range [d_input, d_input + num_items - 1), the result of difference_op(*i, *(i + 1)) is assigned to *(d_input + (i - d_input)), which is *i itself, so the input sequence is overwritten in place.
Snippet
The code snippet below illustrates how to use DeviceAdjacentDifference to compute the difference between adjacent elements.

    #include <cub/cub.cuh>
    // or equivalently <cub/device/device_adjacent_difference.cuh>

    // Declare, allocate, and initialize device-accessible pointers
    int num_items; // e.g., 8
    int *d_data;   // e.g., [1, 2, 1, 2, 1, 2, 1, 2]
    // ...

    // Determine temporary device storage requirements
    void *d_temp_storage = nullptr;
    size_t temp_storage_bytes = 0;
    cub::DeviceAdjacentDifference::SubtractRight(
      d_temp_storage, temp_storage_bytes, d_data, num_items);

    // Allocate temporary storage
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Run operation
    cub::DeviceAdjacentDifference::SubtractRight(
      d_temp_storage, temp_storage_bytes, d_data, num_items);

    // d_data <-- [-1, 1, -1, 1, -1, 1, -1, 2]

- Template Parameters
RandomAccessIteratorT – is a model of Random Access Iterator, and RandomAccessIteratorT is mutable. If x and y are objects of RandomAccessIteratorT’s value_type, and x - y is defined, then the return type of x - y should be convertible to a type in RandomAccessIteratorT’s set of value_types.
DifferenceOpT – Its result_type is convertible to a type in RandomAccessIteratorT’s set of value_types.
NumItemsT – [inferred] Type of num_items
- Parameters
d_temp_storage – [in] Device-accessible allocation of temporary storage. When nullptr, the required allocation size is written to temp_storage_bytes and no work is done.
temp_storage_bytes – [inout] Reference to size in bytes of d_temp_storage allocation
d_input – [inout] Pointer to the input sequence and the result
num_items – [in] Number of items in the input sequence
difference_op – [in] The binary function used to compute differences
stream – [in] [optional] CUDA stream to launch kernels within. Default is stream 0.
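Since SubtractRight overwrites its input, checking a result requires a copy of the original data taken before the call. The sketch below is one way to do that check on the host after copying the device buffer back; is_right_difference_of is a hypothetical helper, and it assumes int elements and the default subtraction operator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if `result` holds the right differences of `original`:
// result[i] == original[i] - original[i + 1] for all but the last element,
// and the last element is unchanged.
bool is_right_difference_of(const std::vector<int>& original, const std::vector<int>& result)
{
  if (original.size() != result.size())
  {
    return false;
  }
  if (original.empty())
  {
    return true;
  }
  const std::size_t last = original.size() - 1;
  if (result[last] != original[last])
  {
    return false;
  }
  for (std::size_t i = 0; i < last; ++i)
  {
    if (result[i] != original[i] - original[i + 1])
    {
      return false;
    }
  }
  return true;
}

void example_is_right_difference_of()
{
  const std::vector<int> original{1, 2, 1, 2, 1, 2, 1, 2};
  // Values as documented in the snippet above.
  assert(is_right_difference_of(original, {-1, 1, -1, 1, -1, 1, -1, 2}));
  // A left-difference result must be rejected.
  assert(!is_right_difference_of(original, {1, 1, -1, 1, -1, 1, -1, 1}));
}
```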