cuda::access_property#

Defined in header <cuda/annotated_ptr>.

The class cuda::access_property provides an opaque encoding for L2 cache memory residence control and memory space properties. It is used in combination with cuda::annotated_ptr, cuda::associate_access_property and cuda::apply_access_property to request the application of properties to memory operations.

namespace cuda {

class access_property {
  public:
    // Static memory space property:
    struct shared {};
    struct global {};

    // Static global memory residence control property:
    struct normal {
        __host__ __device__ constexpr operator cudaAccessProperty() const noexcept;
    };
    struct persisting {
        __host__ __device__ constexpr operator cudaAccessProperty() const noexcept;
    };
    struct streaming {
        __host__ __device__ constexpr operator cudaAccessProperty() const noexcept;
    };

    access_property() noexcept = default;

    // Constructors from static global memory residence control properties:
    __host__ __device__ constexpr access_property(global)     noexcept;
    __host__ __device__ constexpr access_property(normal)     noexcept;
    __host__ __device__ constexpr access_property(streaming)  noexcept;
    __host__ __device__ constexpr access_property(persisting) noexcept;

    // Dynamic interleaved global memory residence control property constructors:
    __host__ __device__ constexpr access_property(normal,     float probability) noexcept;
    __host__ __device__ constexpr access_property(streaming,  float probability) noexcept;
    __host__ __device__ constexpr access_property(persisting, float probability) noexcept;
    __host__ __device__ constexpr access_property(normal,     float probability, streaming) noexcept;
    __host__ __device__ constexpr access_property(persisting, float probability, streaming) noexcept;

    // Dynamic range global memory residence control property constructors:
    __host__ __device__ access_property(void* ptr, size_t primary_bytes, size_t total_bytes, normal) noexcept;
    __host__ __device__ access_property(void* ptr, size_t primary_bytes, size_t total_bytes, streaming) noexcept;
    __host__ __device__ access_property(void* ptr, size_t primary_bytes, size_t total_bytes, persisting) noexcept;
    __host__ __device__ access_property(void* ptr, size_t primary_bytes, size_t total_bytes, global,     streaming) noexcept;
    __host__ __device__ access_property(void* ptr, size_t primary_bytes, size_t total_bytes, normal,     streaming) noexcept;
    __host__ __device__ access_property(void* ptr, size_t primary_bytes, size_t total_bytes, persisting, streaming) noexcept;
    __host__ __device__ access_property(void* ptr, size_t primary_bytes, size_t total_bytes, streaming,  streaming) noexcept;
};

} // namespace cuda

Kinds of Access Properties#

Access properties are either static compile-time values or dynamic runtime values. The following properties of a memory access are provided:

Shared Memory property:

  • cuda::access_property::shared: memory access to the shared memory space.

Global Memory properties:

  • cuda::access_property::global: memory access to the global memory space without indicating an expected frequency of access to that memory, namely the access behavior is not modified.

  • cuda::access_property::normal: memory access to the global memory space expecting the memory to be accessed as frequent as other memory.

  • cuda::access_property::persisting: memory access to the global memory space expecting the memory to be accessed more frequently than other memory; this priority is suitable for data that should remain persistent in cache.

  • cuda::access_property::streaming: memory access to the global memory space expecting the memory to be accessed infrequently; this priority is suitable for streaming data.

Note: The difference between cuda::access_property::global and cuda::access_property::normal is subtle. The cuda::access_property::normal hints that the pointer points to the global address space and the memory will be accessed with “normal frequency”, while cuda::access_property::global only hints that the pointer points to the global address-space, it does not hint about how frequent the accesses will be.

Warning

The behavior of requesting the application of cuda::access_property to memory accesses, or their association with memory addresses, outside of the corresponding address space is undefined (note: even if that address is not used). The correctness of the input pointer and memory properties are verified in debug mode.


Global Memory Property Definition#

The L2 residence control can be specified in two ways:

  • Interleaved: A memory address is accessed with a property with a given probability, while the remaining 1 - probability accesses are performed with a second one.

  • Range: The first primary_bytes of a memory address is accessed with one property and the remaining total_bytes - primary_bytes addresses with a second one.

Default constructor#

access_property() noexcept = default;

Effects: as if access_property(global) (unchanged).

Static global memory residence control property constructors#

__host__ __device__ constexpr access_property::access_property(global) noexcept;
__host__ __device__ constexpr access_property::access_property(normal) noexcept;
__host__ __device__ constexpr access_property::access_property(streaming) noexcept;
__host__ __device__ constexpr access_property::access_property(persisting) noexcept;

Effects: as if access_property(PROPERTY, 1.0) where PROPERTY is one of global (unchanged), normal, streaming, or persisting.

Dynamic interleaved global memory residence control property constructors#

__host__ __device__ constexpr access_property::access_property(normal,     float probability) noexcept;
__host__ __device__ constexpr access_property::access_property(streaming,  float probability) noexcept;
__host__ __device__ constexpr access_property::access_property(persisting, float probability) noexcept;
__host__ __device__ constexpr access_property::access_property(normal,     float probability, streaming) noexcept;
__host__ __device__ constexpr access_property::access_property(persisting, float probability, streaming) noexcept;

Preconditions: 0 < probability <= 1.0.

Effects: constructs an interleaved access property that requests the first and third arguments - access properties - to be applied with probability and 1 - probability to memory accesses. The overloads without a third argument request applying global (unchanged) with 1 - probability.

Dynamic range global memory residence control property constructors#

__host__ __device__ access_property::access_property(void* ptr, size_t leading_bytes, size_t total_bytes, normal) noexcept;
__host__ __device__ access_property::access_property(void* ptr, size_t leading_bytes, size_t total_bytes, streaming) noexcept;
__host__ __device__ access_property::access_property(void* ptr, size_t leading_bytes, size_t total_bytes, persisting) noexcept;
__host__ __device__ access_property::access_property(void* ptr, size_t leading_bytes, size_t total_bytes, normal,     streaming) noexcept;
__host__ __device__ access_property::access_property(void* ptr, size_t leading_bytes, size_t total_bytes, persisting, streaming) noexcept;

note: pointer arithmetic below performed char* ptr instead of void* ptr

Preconditions:

  • ptr is a generic pointer that is valid to cast to a pointer to the global memory address space.

  • 0 < leading_bytes <= total_bytes <= 4GB.

Postconditions: memory accesses requesting the application of this property must be in range [ptr, ptr + total_bytes).

Effects: the fourth and fifth arguments, access properties, are called primary and secondary properties. The overloads without a fifth argument use global as the secondary property. Constructs a range access property requesting the properties to be approximately applied to memory accesses as follows:

  • primary property to accesses in address-range: [ptr, ptr + leading_bytes)

  • secondary property to accesses in address-range: [ptr + leading_bytes, ptr + total_bytes)

Note: This property enables two main use cases:

  1. Unary range [ptr, ptr + total_bytes) with primary property by using leading_bytes == total_bytes.

  2. Binary range [ptr, ptr + leading_bytes) and [ptr + leading_bytes, ptr + total_bytes) with primary and secondary properties respectively.

Conversion operators#

__host__ __device__ constexpr access_property::normal::operator     cudaAccessProperty() const noexcept;
__host__ __device__ constexpr access_property::streaming::operator  cudaAccessProperty() const noexcept;
__host__ __device__ constexpr access_property::persisting::operator cudaAccessProperty() const noexcept;

Allows constexpr cuda::access_property::normal{}, cuda::access_property::streaming{}, and cuda::access_property::persisting{} to be used in lieu of the corresponding CUDA Runtime cudaAccessProperty. See also L2 Policy for Persisting Accesses.

Example#

#include <cuda/access_property>

__global__ void kernel(int* global_ptr, size_t num_bytes) {
    __shared__ int smem;
    cuda::access_property shared_prop{&smem, cuda::access_property::shared{}};
    cuda::access_property streaming_prop{global_ptr, sizeof(int), sizeof(int), cuda::access_property::streaming{}};
    cuda::access_property streaming_interleaved_prop{cuda::access_property::streaming{}, 1.0};
    cuda::access_property persisting_prop{global_ptr, num_bytes, num_bytes, cuda::access_property::persisting{});
}

__global__ void undefined_behavior(int* global_ptr) { // verified in debug mode
    __shared__ int smem;
    // Associating pointers with mismatching address spaces is undefined:
    cuda::access_property{global_ptr, cuda::access_property::shared{}}; // undefined behavior
    cuda::access_property{&smem, cuda::access_property::normal{}};      // undefined behavior
    cuda::access_property{&smem, cuda::access_property::streaming{}};   // undefined behavior
    cuda::access_property{&smem, cuda::access_property::persisting{}};  // undefined behavior

    // Using a zero probability or probability out-of-range (0, 1] is undefined:
    cuda::access_property{cuda::access_property::streaming{}, 0.0f};    // undefined behavior
    cuda::access_property{cuda::access_property::streaming{}, 2.0f};    // undefined behavior

    // Providing size values out-of-range is undefined:
    cuda::access_property{global_ptr, 0, 0, cuda::access_property::streaming{}, 0.0f}; // undefined behavior
    cuda::access_property{global_ptr, 8, 4, cuda::access_property::streaming{}, 2.0f}; // undefined behavior
}