cub::DeviceAdjacentDifference
struct DeviceAdjacentDifference
DeviceAdjacentDifference provides device-wide, parallel operations for computing the differences of adjacent elements residing within device-accessible memory.
Overview
DeviceAdjacentDifference calculates the differences of adjacent elements in d_input. Because the binary operation may be noncommutative, there are two sets of methods. Methods named SubtractLeft subtract the left element *(i - 1) of the input sequence from the current element *i. Methods named SubtractRight subtract the right element *(i + 1) from the current element *i:

    int *d_values;                 // [1, 2, 3, 4]
    // ...
    int *d_subtract_left_result;   // <-- [ 1,  1,  1, 1]
    int *d_subtract_right_result;  // <-- [-1, -1, -1, 4]

For SubtractLeft, if the left element is out of bounds, the current element *i is assigned to *(result + (i - first)) without modification. For SubtractRight, if the right element is out of bounds, the current element *i is likewise assigned to *(result + (i - first)) without modification. Here first and result denote the beginnings of the input and output sequences.
Snippet
The code snippet below illustrates how to use DeviceAdjacentDifference to compute the left difference between adjacent elements.

    #include <cub/cub.cuh>
    // or equivalently <cub/device/device_adjacent_difference.cuh>

    // Declare, allocate, and initialize device-accessible pointers
    int num_items; // e.g., 8
    int *d_values; // e.g., [1, 2, 1, 2, 1, 2, 1, 2]
    // ...

    // Determine temporary device storage requirements
    void *d_temp_storage = nullptr;
    size_t temp_storage_bytes = 0;
    cub::DeviceAdjacentDifference::SubtractLeft(
      d_temp_storage, temp_storage_bytes, d_values, num_items);

    // Allocate temporary storage
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Run operation
    cub::DeviceAdjacentDifference::SubtractLeft(
      d_temp_storage, temp_storage_bytes, d_values, num_items);

    // d_values <-- [1, 1, -1, 1, -1, 1, -1, 1]
Public Static Functions

    template<typename InputIteratorT, typename OutputIteratorT, typename DifferenceOpT = ::cuda::std::minus<>, typename NumItemsT = uint32_t>
    static inline cudaError_t SubtractLeftCopy(
        void *d_temp_storage,
        size_t &temp_storage_bytes,
        InputIteratorT d_input,
        OutputIteratorT d_output,
        NumItemsT num_items,
        DifferenceOpT difference_op = {},
        cudaStream_t stream = 0)

Subtracts the left element of each adjacent pair of elements residing within device-accessible memory.
Overview
Calculates the differences of adjacent elements in d_input. That is, *d_input is assigned to *d_output, and, for each iterator i in the range [d_input + 1, d_input + num_items), the result of difference_op(*i, *(i - 1)) is assigned to *(d_output + (i - d_input)).
Note that the behavior is undefined if the input and output ranges overlap in any way.
Snippet
The code snippet below illustrates how to use DeviceAdjacentDifference to compute the difference between adjacent elements.

    #include <cub/cub.cuh>
    // or equivalently <cub/device/device_adjacent_difference.cuh>

    struct CustomDifference
    {
      // __device__ is needed because the operator is invoked from device code
      template <typename DataType>
      __host__ __device__ DataType operator()(const DataType &lhs, const DataType &rhs) const
      {
        return lhs - rhs;
      }
    };

    // Declare, allocate, and initialize device-accessible pointers
    int num_items; // e.g., 8
    int *d_input;  // e.g., [1, 2, 1, 2, 1, 2, 1, 2]
    int *d_output;
    // ...

    // Determine temporary device storage requirements
    void *d_temp_storage = nullptr;
    size_t temp_storage_bytes = 0;
    cub::DeviceAdjacentDifference::SubtractLeftCopy(
      d_temp_storage, temp_storage_bytes,
      d_input, d_output, num_items, CustomDifference());

    // Allocate temporary storage
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Run operation
    cub::DeviceAdjacentDifference::SubtractLeftCopy(
      d_temp_storage, temp_storage_bytes,
      d_input, d_output, num_items, CustomDifference());

    // d_input  <-- [1, 2, 1, 2, 1, 2, 1, 2]
    // d_output <-- [1, 1, -1, 1, -1, 1, -1, 1]
- Template Parameters:
  InputIteratorT – is a model of Input Iterator. If x and y are objects of InputIteratorT’s value_type, then x - y is defined, InputIteratorT’s value_type is convertible to a type in OutputIteratorT’s set of value_types, and the return type of x - y is convertible to a type in OutputIteratorT’s set of value_types.
  OutputIteratorT – is a model of Output Iterator.
  DifferenceOpT – Its result_type is convertible to a type in OutputIteratorT’s set of value_types.
  NumItemsT – [inferred] Type of num_items
- Parameters:
  d_temp_storage – [in] Device-accessible allocation of temporary storage. When nullptr, the required allocation size is written to temp_storage_bytes and no work is done.
  temp_storage_bytes – [inout] Reference to size in bytes of d_temp_storage allocation
  d_input – [in] Pointer to the input sequence
  d_output – [out] Pointer to the output sequence
  num_items – [in] Number of items in the input sequence
  difference_op – [in] The binary function used to compute differences
  stream – [in] [optional] CUDA stream to launch kernels within. Default is stream 0.

    template<typename RandomAccessIteratorT, typename DifferenceOpT = ::cuda::std::minus<>, typename NumItemsT = uint32_t>
    static inline cudaError_t SubtractLeft(
        void *d_temp_storage,
        size_t &temp_storage_bytes,
        RandomAccessIteratorT d_input,
        NumItemsT num_items,
        DifferenceOpT difference_op = {},
        cudaStream_t stream = 0)

Subtracts the left element of each adjacent pair of elements residing within device-accessible memory.
Overview
Calculates the differences of adjacent elements in d_input. That is, for each iterator i in the range [d_input + 1, d_input + num_items), the result of difference_op(*i, *(i - 1)) is assigned to *(d_input + (i - d_input)), that is, to *i itself, so the input is overwritten in place.
Snippet
The code snippet below illustrates how to use DeviceAdjacentDifference to compute the difference between adjacent elements.

    #include <cub/cub.cuh>
    // or equivalently <cub/device/device_adjacent_difference.cuh>

    struct CustomDifference
    {
      // __device__ is needed because the operator is invoked from device code
      template <typename DataType>
      __host__ __device__ DataType operator()(const DataType &lhs, const DataType &rhs) const
      {
        return lhs - rhs;
      }
    };

    // Declare, allocate, and initialize device-accessible pointers
    int num_items; // e.g., 8
    int *d_data;   // e.g., [1, 2, 1, 2, 1, 2, 1, 2]
    // ...

    // Determine temporary device storage requirements
    void *d_temp_storage = nullptr;
    size_t temp_storage_bytes = 0;
    cub::DeviceAdjacentDifference::SubtractLeft(
      d_temp_storage, temp_storage_bytes,
      d_data, num_items, CustomDifference());

    // Allocate temporary storage
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Run operation
    cub::DeviceAdjacentDifference::SubtractLeft(
      d_temp_storage, temp_storage_bytes,
      d_data, num_items, CustomDifference());

    // d_data <-- [1, 1, -1, 1, -1, 1, -1, 1]
- Template Parameters:
  RandomAccessIteratorT – is a model of Random Access Iterator and is mutable. If x and y are objects of RandomAccessIteratorT’s value_type, and x - y is defined, then the return type of x - y should be convertible to a type in RandomAccessIteratorT’s set of value_types.
  DifferenceOpT – Its result_type is convertible to a type in RandomAccessIteratorT’s set of value_types.
  NumItemsT – [inferred] Type of num_items
- Parameters:
  d_temp_storage – [in] Device-accessible allocation of temporary storage. When nullptr, the required allocation size is written to temp_storage_bytes and no work is done.
  temp_storage_bytes – [inout] Reference to size in bytes of d_temp_storage allocation
  d_input – [inout] Pointer to the input sequence and the result
  num_items – [in] Number of items in the input sequence
  difference_op – [in] The binary function used to compute differences
  stream – [in] [optional] CUDA stream to launch kernels within. Default is stream 0.

    template<typename InputIteratorT, typename OutputIteratorT, typename DifferenceOpT = ::cuda::std::minus<>, typename NumItemsT = uint32_t>
    static inline cudaError_t SubtractRightCopy(
        void *d_temp_storage,
        size_t &temp_storage_bytes,
        InputIteratorT d_input,
        OutputIteratorT d_output,
        NumItemsT num_items,
        DifferenceOpT difference_op = {},
        cudaStream_t stream = 0)

Subtracts the right element of each adjacent pair of elements residing within device-accessible memory.
Overview
Calculates the right differences of adjacent elements in d_input. That is, *(d_input + num_items - 1) is assigned to *(d_output + num_items - 1), and, for each iterator i in the range [d_input, d_input + num_items - 1), the result of difference_op(*i, *(i + 1)) is assigned to *(d_output + (i - d_input)).
Note that the behavior is undefined if the input and output ranges overlap in any way.
Snippet
The code snippet below illustrates how to use DeviceAdjacentDifference to compute the difference between adjacent elements.

    #include <cub/cub.cuh>
    // or equivalently <cub/device/device_adjacent_difference.cuh>

    struct CustomDifference
    {
      // __device__ is needed because the operator is invoked from device code
      template <typename DataType>
      __host__ __device__ DataType operator()(const DataType &lhs, const DataType &rhs) const
      {
        return lhs - rhs;
      }
    };

    // Declare, allocate, and initialize device-accessible pointers
    int num_items; // e.g., 8
    int *d_input;  // e.g., [1, 2, 1, 2, 1, 2, 1, 2]
    int *d_output;
    // ...

    // Determine temporary device storage requirements
    void *d_temp_storage = nullptr;
    size_t temp_storage_bytes = 0;
    cub::DeviceAdjacentDifference::SubtractRightCopy(
      d_temp_storage, temp_storage_bytes,
      d_input, d_output, num_items, CustomDifference());

    // Allocate temporary storage
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Run operation
    cub::DeviceAdjacentDifference::SubtractRightCopy(
      d_temp_storage, temp_storage_bytes,
      d_input, d_output, num_items, CustomDifference());

    // d_input  <-- [1, 2, 1, 2, 1, 2, 1, 2]
    // d_output <-- [-1, 1, -1, 1, -1, 1, -1, 2]
- Template Parameters:
  InputIteratorT – is a model of Input Iterator. If x and y are objects of InputIteratorT’s value_type, then x - y is defined, InputIteratorT’s value_type is convertible to a type in OutputIteratorT’s set of value_types, and the return type of x - y is convertible to a type in OutputIteratorT’s set of value_types.
  OutputIteratorT – is a model of Output Iterator.
  DifferenceOpT – Its result_type is convertible to a type in OutputIteratorT’s set of value_types.
  NumItemsT – [inferred] Type of num_items
- Parameters:
  d_temp_storage – [in] Device-accessible allocation of temporary storage. When nullptr, the required allocation size is written to temp_storage_bytes and no work is done.
  temp_storage_bytes – [inout] Reference to size in bytes of d_temp_storage allocation
  d_input – [in] Pointer to the input sequence
  d_output – [out] Pointer to the output sequence
  num_items – [in] Number of items in the input sequence
  difference_op – [in] The binary function used to compute differences
  stream – [in] [optional] CUDA stream to launch kernels within. Default is stream 0.

    template<typename RandomAccessIteratorT, typename DifferenceOpT = ::cuda::std::minus<>, typename NumItemsT = uint32_t>
    static inline cudaError_t SubtractRight(
        void *d_temp_storage,
        size_t &temp_storage_bytes,
        RandomAccessIteratorT d_input,
        NumItemsT num_items,
        DifferenceOpT difference_op = {},
        cudaStream_t stream = 0)

Subtracts the right element of each adjacent pair of elements residing within device-accessible memory.
Overview
Calculates the right differences of adjacent elements in d_input. That is, for each iterator i in the range [d_input, d_input + num_items - 1), the result of difference_op(*i, *(i + 1)) is assigned to *(d_input + (i - d_input)), that is, to *i itself, so the input is overwritten in place.
Snippet
The code snippet below illustrates how to use DeviceAdjacentDifference to compute the difference between adjacent elements.

    #include <cub/cub.cuh>
    // or equivalently <cub/device/device_adjacent_difference.cuh>

    // Declare, allocate, and initialize device-accessible pointers
    int num_items; // e.g., 8
    int *d_data;   // e.g., [1, 2, 1, 2, 1, 2, 1, 2]
    // ...

    // Determine temporary device storage requirements
    void *d_temp_storage = nullptr;
    size_t temp_storage_bytes = 0;
    cub::DeviceAdjacentDifference::SubtractRight(
      d_temp_storage, temp_storage_bytes, d_data, num_items);

    // Allocate temporary storage
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Run operation
    cub::DeviceAdjacentDifference::SubtractRight(
      d_temp_storage, temp_storage_bytes, d_data, num_items);

    // d_data <-- [-1, 1, -1, 1, -1, 1, -1, 2]
- Template Parameters:
  RandomAccessIteratorT – is a model of Random Access Iterator and is mutable. If x and y are objects of RandomAccessIteratorT’s value_type, and x - y is defined, then the return type of x - y should be convertible to a type in RandomAccessIteratorT’s set of value_types.
  DifferenceOpT – Its result_type is convertible to a type in RandomAccessIteratorT’s set of value_types.
  NumItemsT – [inferred] Type of num_items
- Parameters:
  d_temp_storage – [in] Device-accessible allocation of temporary storage. When nullptr, the required allocation size is written to temp_storage_bytes and no work is done.
  temp_storage_bytes – [inout] Reference to size in bytes of d_temp_storage allocation
  d_input – [inout] Pointer to the input sequence and the result
  num_items – [in] Number of items in the input sequence
  difference_op – [in] The binary function used to compute differences
  stream – [in] [optional] CUDA stream to launch kernels within. Default is stream 0.