Parallel Execution Policies
template <typename DerivedPolicy> struct thrust::host_execution_policy;
template <typename DerivedPolicy> struct thrust::device_execution_policy;
static const detail::host_t thrust::host;
THRUST_INLINE_CONSTANT detail::device_t thrust::device;
Member Classes
Struct thrust::host_execution_policy
Inherits From: thrust::system::__THRUST_HOST_SYSTEM_NAMESPACE::execution_policy< DerivedPolicy >
Struct thrust::device_execution_policy
Inherits From: thrust::system::__THRUST_DEVICE_SYSTEM_NAMESPACE::execution_policy< DerivedPolicy >
Variables
Variable thrust::host
static const detail::host_t host;
thrust::host is the default parallel execution policy associated with Thrust’s host backend system, which is configured by the THRUST_HOST_SYSTEM macro.
Instead of relying on implicit algorithm dispatch through iterator system tags, users may directly target algorithm dispatch at Thrust’s host system by providing thrust::host as an algorithm parameter.
Explicit dispatch can be useful for avoiding copies of data into containers such as thrust::host_vector.
Note that even though thrust::host targets the host CPU, it is a parallel execution policy. That is, the order in which an algorithm invokes functors or dereferences iterators is not defined.
The type of thrust::host is implementation-defined.
The following code snippet demonstrates how to use thrust::host to explicitly dispatch an invocation of thrust::for_each to the host backend system:
#include <thrust/for_each.h>
#include <thrust/execution_policy.h>
#include <cstdio>

struct printf_functor
{
  __host__ __device__
  void operator()(int x)
  {
    printf("%d\n", x);
  }
};
...
int vec[] = { 0, 1, 2 };
thrust::for_each(thrust::host, vec, vec + 3, printf_functor());
// 0 1 2 is printed to standard output in some unspecified order
See:
- host_execution_policy
- thrust::device
Variable thrust::device
THRUST_INLINE_CONSTANT detail::device_t device;
thrust::device is the default parallel execution policy associated with Thrust’s device backend system, which is configured by the THRUST_DEVICE_SYSTEM macro.
Instead of relying on implicit algorithm dispatch through iterator system tags, users may directly target algorithm dispatch at Thrust’s device system by providing thrust::device as an algorithm parameter.
Explicit dispatch can be useful for avoiding copies of data into containers such as thrust::device_vector, or for avoiding the need to wrap raw pointers (for example, those allocated by the CUDA API) in types such as thrust::device_ptr.
The user must take care to guarantee that the iterators provided to an algorithm are compatible with the device backend system. For example, raw pointers allocated by std::malloc typically cannot be dereferenced by a GPU. For this reason, raw pointers allocated by host APIs should not be mixed with a thrust::device algorithm invocation when the device backend is CUDA.
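The raw-pointer case described above can be sketched as follows, assuming the device backend is CUDA and the file is compiled with nvcc. The helper name sum_sequence_on_device is hypothetical. The buffer comes from cudaMalloc, so it is dereferenceable on the device and can be passed to Thrust algorithms as a plain int* once thrust::device is supplied, with no thrust::device_ptr wrapper.

```cuda
#include <thrust/execution_policy.h>
#include <thrust/reduce.h>
#include <thrust/sequence.h>
#include <cuda_runtime.h>

// Hypothetical helper: fills a raw device buffer with 0, 1, ..., n-1 and
// returns the sum of its elements. Returns -1 if the allocation fails.
int sum_sequence_on_device(int n)
{
  int* raw = nullptr;
  if (cudaMalloc(reinterpret_cast<void**>(&raw), n * sizeof(int)) != cudaSuccess)
  {
    return -1;
  }

  // thrust::device tells Thrust that these raw pointers refer to device
  // memory; without it, raw pointers would be dispatched to the host system.
  thrust::sequence(thrust::device, raw, raw + n);
  int total = thrust::reduce(thrust::device, raw, raw + n, 0);

  cudaFree(raw);
  return total;
}
```

Passing a pointer obtained from std::malloc to the same calls would compile but would not be valid on a CUDA device backend, which is the incompatibility the preceding paragraph warns about.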
The type of thrust::device is implementation-defined.
The following code snippet demonstrates how to use thrust::device to explicitly dispatch an invocation of thrust::for_each to the device backend system:
#include <thrust/for_each.h>
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <cstdio>

struct printf_functor
{
  __host__ __device__
  void operator()(int x)
  {
    printf("%d\n", x);
  }
};
...
thrust::device_vector<int> vec(3);
vec[0] = 0; vec[1] = 1; vec[2] = 2;
thrust::for_each(thrust::device, vec.begin(), vec.end(), printf_functor());
// 0 1 2 is printed to standard output in some unspecified order
See:
- device_execution_policy
- thrust::host