CCCL Internal Macros#
This document describes the main internal macros used by CCCL. They are not intended for end users; they exist only for the development of CCCL features. We reserve the right to change them at any time without warning.
Compiler Macros#
Host compiler macros:
_CCCL_COMPILER(CLANG) | Clang
_CCCL_COMPILER(GCC) | GCC
_CCCL_COMPILER(NVHPC) | Nvidia HPC compiler
_CCCL_COMPILER(MSVC) | Microsoft Visual Studio
_CCCL_COMPILER(MSVC2019) | Microsoft Visual Studio 2019
_CCCL_COMPILER(MSVC2022) | Microsoft Visual Studio 2022
The _CCCL_COMPILER function-like macro can also be used to check the version of a compiler.
_CCCL_COMPILER(MSVC, <, 19, 24)
_CCCL_COMPILER(GCC, >=, 9)
Note: When used without specifying a minor version number, the macro will only test against
the compiler’s major version number. For example, when the compiler is gcc-9.1, the macro
_CCCL_COMPILER(GCC, >, 9) will be false even though 9.1 is greater than 9.
CUDA compiler macros:
_CCCL_CUDA_COMPILER(NVCC) | Nvidia compiler
_CCCL_CUDA_COMPILER(NVHPC) | Nvidia HPC compiler
_CCCL_CUDA_COMPILER(NVRTC) | Nvidia Runtime Compiler
_CCCL_CUDA_COMPILER(CLANG) | Clang
The _CCCL_CUDA_COMPILER function-like macro can also be used to check the version of a CUDA compiler.
_CCCL_CUDA_COMPILER(NVCC, <, 12, 3)
_CCCL_CUDA_COMPILER(CLANG, >=, 14)
Note: a _CCCL_CUDA_COMPILER(...) check may evaluate to true even while a plain C++ source file is being compiled. Use _CCCL_CUDA_COMPILATION() to check whether a CUDA source file is being compiled.
CUDA identification/version macros:
_CCCL_CUDA_COMPILATION() | CUDA code is being compiled
_CCCL_HOST_COMPILATION() | Compiling host code
_CCCL_DEVICE_COMPILATION() | Compiling device code
_CCCL_CUDACC_BELOW(12, 7) | CUDA version below 12.7 when compiling a CUDA source file
_CCCL_CUDACC_AT_LEAST(12, 7) | CUDA version at least 12.7 when compiling a CUDA source file
Note: When compiling CUDA code with nvc++ both _CCCL_HOST_COMPILATION() and _CCCL_DEVICE_COMPILATION() result in a true value.
PTX macros:
_CCCL_PTX_ARCH() | Alias of __CUDA_ARCH__ that is always defined (0 in host code)
__cccl_ptx_isa | PTX ISA version available with the current CUDA compiler, e.g. PTX ISA 8.4 (840)
Note: When compiling CUDA code with nvc++ the _CCCL_PTX_ARCH() macro expands to 0.
Architecture Macros#
The following macros are used to check the target architecture. They follow the behavior of the compilers supported by the CUDA toolkit; compilers outside the CUDA toolkit may define such macros differently.
_CCCL_ARCH(ARM64) | ARM 64-bit, including MSVC emulation
_CCCL_ARCH(X86_64) | X86 64-bit. False on ARM 64-bit MSVC emulation
OS Macros#
_CCCL_OS(WINDOWS) | Windows, including NVRTC LLP64
_CCCL_OS(LINUX) | Linux, including NVRTC LP64
_CCCL_OS(ANDROID) | Android
_CCCL_OS(QNX) | QNX
Execution Space#
Functions
_CCCL_HOST | Host function
_CCCL_DEVICE | Device function
_CCCL_HOST_DEVICE | Host/Device function
In addition, _CCCL_EXEC_CHECK_DISABLE disables the execution space check for the NVHPC compiler.
Target Macros
NV_IF_TARGET(TARGET, (CODE)) | Enable a single code block depending on TARGET
NV_IF_ELSE_TARGET(TARGET, (IF_CODE), (ELSE_CODE)) | Enable one of two code blocks depending on TARGET
NV_DISPATCH_TARGET(TARGET1, (CODE1), TARGET2, (CODE2), ...) | Enable a single code block if any of the listed targets is satisfied
Possible TARGET values:
NV_ANY_TARGET | Any target
NV_IS_HOST | Host-code target
NV_IS_DEVICE | Device-code target
NV_PROVIDES_SM_<VER> | SM architecture is at least <VER>
NV_IS_EXACTLY_SM_<VER> | SM architecture is exactly <VER>
Usage example:
NV_IF_TARGET(NV_IS_DEVICE, (auto x = threadIdx.x; return x;));
NV_IF_ELSE_TARGET(NV_IS_HOST, (return 0;), (auto x = threadIdx.x; return x;));
NV_DISPATCH_TARGET(NV_PROVIDES_SM_90, (return "Hopper+";),
NV_IS_EXACTLY_SM_75, (return "Turing";),
NV_IS_HOST, (return "Host";))
Pitfalls:
- All target macros generate the code in a local scope, i.e. { code }.
- NV_DISPATCH_TARGET is NOT a switch statement. It enables the code associated with the first condition satisfied.
- The target macros take code as an argument, so it is not possible to use any conditional compilation, e.g. #if _CCCL_STD_VER >= 20, within a target macro.
CUDA attributes#
_CCCL_GRID_CONSTANT | Grid constant kernel parameter
_CCCL_GLOBAL_CONSTANT | Host/device global scope constant (inline constexpr)
CUDA Toolkit macros#
_CCCL_HAS_CTK() | CUDA toolkit is available
_CCCL_CTK_BELOW(12, 7) | CUDA toolkit version below 12.7
_CCCL_CTK_AT_LEAST(12, 7) | CUDA toolkit version at least 12.7
Non-standard Types Support#
_CCCL_HAS_INT128() | __int128 is supported
_CCCL_HAS_FLOAT128() | __float128 is supported
_CCCL_HAS_NVFP16() | __half is supported
_CCCL_HAS_NVBF16() | __nv_bfloat16 is supported
_CCCL_HAS_NVFP8() | FP8 types are supported
CCCL_DISABLE_INT128_SUPPORT | Disable __int128 support
CCCL_DISABLE_FLOAT128_SUPPORT | Disable __float128 support
CCCL_DISABLE_NVFP16_SUPPORT | Disable __half support
CCCL_DISABLE_NVBF16_SUPPORT | Disable __nv_bfloat16 support
CCCL_DISABLE_FP8_SUPPORT | Disable FP8 support
C++ Language Macros#
The following macros are required only if the target C++ version does not support the corresponding feature.
_CCCL_STD_VER | C++ standard version, e.g. 2017, 2020
_CCCL_CONSTEXPR_CXX20 | Enable constexpr from C++20 onwards
_CCCL_CONSTEXPR_CXX23 | Enable constexpr from C++23 onwards
_CCCL_HAS_EXCEPTIONS() | Features can use exceptions
Concept-like Macros:
_CCCL_TEMPLATE(PARAMS) | Template parameter list declaration
_CCCL_REQUIRES(CONSTRAINTS) | Constraints on the template parameters
_CCCL_AND | Traits conjunction, only used with _CCCL_REQUIRES
Usage example:
_CCCL_TEMPLATE(typename T)
_CCCL_REQUIRES(is_integral_v<T> _CCCL_AND(sizeof(T) > 1))
_CCCL_TEMPLATE(typename T)
_CCCL_REQUIRES(is_arithmetic_v<T> _CCCL_AND (!is_integral_v<T>))
Portable feature testing:
_CCCL_HAS_BUILTIN(x) | Portable __has_builtin(x)
_CCCL_HAS_FEATURE(x) | Portable __has_feature(x)
_CCCL_HAS_INCLUDE(<header>) | Portable __has_include(<header>)
Portable attributes:
_CCCL_NO_UNIQUE_ADDRESS | Portable [[no_unique_address]]
_CCCL_FALLTHROUGH() | Portable [[fallthrough]]
_CCCL_NODISCARD_FRIEND | Portable [[nodiscard]] for friend functions
_CCCL_ASSUME(EXPR) | Portable [[assume(EXPR)]]
_CCCL_FORCEINLINE | Portable “always inline” attribute
_CCCL_PURE | Portable “pure” function attribute
_CCCL_CONST | Portable “constant” function attribute
_CCCL_LIFETIMEBOUND | Portable “lifetime bound” function attribute
Portable Builtin Macros:
_CCCL_BUILTIN_EXPECT(EXPR, VALUE) | Portable __builtin_expect
_CCCL_BUILTIN_ASSUME(EXPR) | Portable __builtin_assume
_CCCL_BUILTIN_UNREACHABLE() | Portable __builtin_unreachable
Portable Keyword Macros
_CCCL_RESTRICT | Portable restrict keyword
_CCCL_ALIGNAS(N) | Portable alignas with a numeric value
_CCCL_ALIGNAS_TYPE(T) | Portable alignas with a type
Portable Pragma Macros
_CCCL_PRAGMA(ARGS) | Portable #pragma, implemented via _Pragma
_CCCL_PRAGMA_UNROLL_FULL() | Portable #pragma unroll
_CCCL_PRAGMA_NOUNROLL() | Portable #pragma nounroll
Conditional Constant Evaluation Macros
In C++23, the if consteval statement was introduced. CCCL mimics this behaviour with a set of macros that expand to an implementation supported by the compiler. If the compiler doesn’t support any kind of conditional constant evaluation, the macros expand to predefined fallback values.
_CCCL_IF_CONSTEVAL | Equivalent to if consteval
_CCCL_IF_NOT_CONSTEVAL | Equivalent to if !consteval
Exception Macros
CUDA doesn’t support exceptions in device code; however, we sometimes need to write host/device functions that use exceptions on the host and __trap() on the device. CCCL provides a set of macros that should be used in place of the standard C++ keywords to make the code compile in both host and device code.
_CCCL_TRY | Replacement for the try keyword
_CCCL_CATCH | Replacement for the catch clause
_CCCL_CATCH_ALL | Replacement for the catch (...) clause
_CCCL_THROW | Replacement for the throw expression
_CCCL_RETHROW | Replacement for the plain throw; statement (rethrow)
Note: The _CCCL_CATCH clause must always introduce a named variable, like: _CCCL_CATCH(const exception_type& var).
Note: _CCCL_THROW requires including the <stdexcept> header, regardless of whether exceptions are enabled.
Example:
__host__ __device__ void* alloc(cuda::std::size_t nbytes)
{
if (void* ptr = cuda::std::malloc(nbytes))
{
return ptr;
}
_CCCL_THROW(std::bad_alloc{}); // on device calls cuda::std::terminate()
}
__host__ __device__ void do_something(int* buff)
{
_CCCL_THROW(std::runtime_error{"Something went wrong"}); // on device calls cuda::std::terminate()
}
__host__ __device__ void fn(cuda::std::size_t n)
{
int* buff{};
_CCCL_TRY
{
buff = reinterpret_cast<int*>(alloc(n * sizeof(int)));
do_something(buff);
}
_CCCL_CATCH ([[maybe_unused]] const std::bad_alloc& e) // must always be named
{
std::fprintf(stderr, "Failed to allocate memory\n"); // We can directly call host-only functions
cuda::std::terminate();
}
_CCCL_CATCH_ALL // or _CCCL_CATCH_FALLTHOUGH
{
cuda::std::free(buff);
_CCCL_RETHROW;
}
}
__global__ void kernel()
{
fn(10);
}
int main()
{
fn(10);
return 0;
}
Visibility Macros#
_CCCL_VISIBILITY_HIDDEN | Hidden visibility attribute (e.g. __attribute__((visibility("hidden"))))
_CCCL_HIDE_FROM_ABI | Hidden visibility (i.e. excluded from the ABI)
_CCCL_API | Host/device function with hidden visibility. Most CCCL functions are hidden with this attribute
Other Common Macros#
_CCCL_DOXYGEN_INVOKED | Defined during Doxygen parsing
Debugging Macros#
_CCCL_ASSERT(COND, MSG) | Portable, conditional CCCL assert() macro. Requires assertions to be enabled (see the switches below)
_CCCL_VERIFY(COND, MSG) | Portable, always-on assert() reserved for critical checks that are always required
CCCL_ENABLE_ASSERTIONS | Enable assertions
CCCL_ENABLE_HOST_ASSERTIONS | Enable host-side assertions
CCCL_ENABLE_DEVICE_ASSERTIONS | Enable device-side assertions
CCCL_ENABLE_DEBUG_MODE | Enable debug mode (and assertions)
Warning Suppression Macros#
_CCCL_DIAG_PUSH | Portable warning-suppression push (pragma diagnostic push)
_CCCL_DIAG_POP | Portable warning-suppression pop (pragma diagnostic pop)
Compiler-specific Suppression Macros:
_CCCL_DIAG_SUPPRESS_CLANG(...) | Suppress a clang warning, e.g. "-Wattributes"
_CCCL_DIAG_SUPPRESS_GCC(...) | Suppress a gcc warning, e.g. "-Wattributes"
_CCCL_DIAG_SUPPRESS_NVHPC(...) | Suppress an nvhpc warning
_CCCL_DIAG_SUPPRESS_MSVC(...) | Suppress an msvc warning
_CCCL_BEGIN_NV_DIAG_SUPPRESS(...) | Start suppressing an nvcc warning
_CCCL_END_NV_DIAG_SUPPRESS() | Stop suppressing nvcc warnings
Usage example:
_CCCL_DIAG_PUSH
_CCCL_DIAG_SUPPRESS_GCC("-Wattributes")
// code ..
_CCCL_DIAG_POP