Quick Start

To add NVTX ranges to your code, use the nvtx3::scoped_range RAII object. A range begins when the object is created, and ends when the object is destroyed.

#include "nvtx3/nvtx3.hpp"
#include <chrono>
#include <thread>

void some_function() {
   // Begins an NVTX range with the message "some_function"
   // The range ends when some_function() returns and `r` is destroyed
   nvtx3::scoped_range r{"some_function"};

   for(int i = 0; i < 5; ++i) {
      // Each iteration gets its own range, nested inside `r`
      nvtx3::scoped_range loop{"loop range"};
      std::this_thread::sleep_for(std::chrono::seconds{1});
   }
} // Range ends when `r` is destroyed

The example code above generates the following timeline view in Nsight Systems:

Alternatively, use a convenience macro such as NVTX3_FUNC_RANGE() to add a range that automatically uses the name of the enclosing function as its message. See Convenience Macros.

#include "nvtx3/nvtx3.hpp"
void some_function() {
   // Creates a range with a message "some_function" that ends when the
   // enclosing function returns
   NVTX3_FUNC_RANGE();
   ...
}

Overview

The NVTX library provides a set of functions for users to annotate their code to aid in performance profiling and optimization. These annotations provide information to tools like Nsight Systems to improve visualization of application timelines.

Ranges are one of the most commonly used NVTX constructs for annotating a span of time. For example, imagine a user wanted to see every time a function, my_function, is called and how long it takes to execute. This can be accomplished with an NVTX range created on the entry to the function and terminated on return from my_function using the push/pop C APIs:

void my_function(...) {
   nvtxRangePushA("my_function"); // Begins NVTX range
   // do work
   nvtxRangePop(); // Ends NVTX range
}

One of the challenges with using the NVTX C API is that every range must be ended manually with nvtxRangePop. This is error-prone if my_function() has multiple return points or can throw exceptions, because nvtxRangePop() must be called before every possible exit.

NVTX C++ solves this inconvenience through the "RAII" technique by providing an nvtx3::scoped_range class that begins a range at construction and ends the range on destruction. The above example then becomes:

void my_function(...) {
   nvtx3::scoped_range r{"my_function"}; // Begins NVTX range
   // do work
} // Range ends on exit from `my_function` when `r` is destroyed

The range object r is deterministically destroyed whenever my_function returns—ending the NVTX range without manual intervention. For more information, see Ranges and nvtx3::scoped_range_in.
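
To make the multiple-exit problem concrete, the following is a minimal sketch of the RAII idea that does not use NVTX at all. The names range_guard, g_events, work, and count_pops_after_all_paths are hypothetical and exist only for this illustration; the guard appends to an in-memory log instead of emitting profiler events.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical event log standing in for a profiler backend.
static std::vector<std::string> g_events;

// Hypothetical RAII guard: logs "push" on construction, "pop" on destruction.
class range_guard {
public:
   explicit range_guard(std::string const& name) { g_events.push_back("push " + name); }
   ~range_guard() { g_events.push_back("pop"); }
   range_guard(range_guard const&) = delete;
   range_guard& operator=(range_guard const&) = delete;
};

// Three different exits, and no manual "pop" before any of them.
int work(int x) {
   range_guard r{"work"};
   if (x < 0) { return -1; }                             // early return
   if (x == 0) { throw std::invalid_argument{"zero"}; }  // exception
   return x * 2;                                         // normal return
}

// Exercises all three exit paths and counts the recorded "pop" events.
int count_pops_after_all_paths() {
   g_events.clear();
   work(-1);
   try { work(0); } catch (std::invalid_argument const&) {}
   work(21);
   int pops = 0;
   for (auto const& e : g_events) {
      if (e == "pop") { ++pops; }
   }
   return pops;
}
```

Each call to work leaves exactly one "pop" in the log regardless of how it exits, which is the property that a scoped range relies on.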

Another inconvenience of the NVTX C API is that several constructs expect the user to initialize an object at the beginning of an application and reuse that object throughout the lifetime of the application. For example, see domains, categories, and registered messages.

Example:

nvtxDomainHandle_t D = nvtxDomainCreateA("my domain");

// Reuse `D` throughout the rest of the application

This can be problematic if the user application or library does not have an explicit initialization function called before all other functions to ensure that these long-lived objects are initialized before being used.

NVTX C++ makes use of the "construct on first use" technique to alleviate this inconvenience. In short, a function-local static object is constructed upon the first invocation of a function, and the function returns a reference to that same object on every invocation. See the documentation for nvtx3::domain, nvtx3::named_category, nvtx3::registered_string, and https://isocpp.org/wiki/faq/ctors#static-init-order-on-first-use for more information.

Using construct on first use, the above example becomes:

struct my_domain{ static constexpr char const* name{"my domain"}; };
 
// The first invocation of `domain::get` for the type `my_domain` will
// construct a `nvtx3::domain` object and return a reference to it. Future
// invocations simply return a reference.
nvtx3::domain const& D = nvtx3::domain::get<my_domain>();
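
The mechanics of construct on first use can be sketched without NVTX. In the sketch below, handle, my_tag, and get_handle are hypothetical names chosen for illustration, and the construction counter exists only to make the behavior observable; this is not the library's implementation.

```cpp
#include <string>

// Hypothetical stand-in for a long-lived object such as a domain.
struct handle {
   std::string name;
   explicit handle(char const* n) : name{n} { ++constructions(); }
   // Counts how many handles were ever constructed (illustration only).
   static int& constructions() {
      static int n = 0;
      return n;
   }
};

// Tag type carrying the name as a static member.
struct my_tag { static constexpr char const* name{"my domain"}; };

// Construct on first use: the function-local static is initialized during the
// first call (thread-safe since C++11); later calls return the same object.
template <typename Tag>
handle const& get_handle() {
   static handle const h{Tag::name};
   return h;
}
```

Because the object lives inside the function, no explicit initialization call is needed before the first use, and every caller sees the same instance.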

For more information about NVTX and how it can be used, see https://docs.nvidia.com/cuda/profiler-users-guide/index.html#nvtx and https://devblogs.nvidia.com/cuda-pro-tip-generate-custom-application-profile-timelines-nvtx/.

Ranges

Ranges are used to describe a span of time during the execution of an application. Common examples are using ranges to annotate the time it takes to execute a function or an iteration of a loop.

NVTX C++ uses RAII to automate the generation of ranges that are tied to the lifetime of objects, similar to std::lock_guard in the C++ Standard Library.

Scoped Range

nvtx3::scoped_range_in is a class that begins a range upon construction and ends the range at destruction. This is one of the most commonly used constructs in NVTX C++ and is useful for annotating spans of time on a particular thread. These ranges can be nested to arbitrary depths.

nvtx3::scoped_range is an alias for an nvtx3::scoped_range_in in the global NVTX domain. For more information about domains, see Domains.

Various attributes of a range can be configured by constructing an nvtx3::scoped_range_in with an nvtx3::event_attributes object. For more information, see Event Attributes.

Example:

void some_function() {
   // Creates a range for the duration of `some_function`
   nvtx3::scoped_range r{};
 
   while(true) {
      // Creates a range for every loop iteration
      // `loop_range` is nested inside `r`
      nvtx3::scoped_range loop_range{};
   }
}

Unique Range

nvtx3::unique_range is similar to nvtx3::scoped_range, with a few key differences:

unique_range objects can be destroyed in any order whereas scoped_range objects must be destroyed in exact reverse creation order
unique_range can start and end on different threads
unique_range is movable
unique_range objects can be constructed as heap objects

unique_range carries extra overhead compared to scoped_range, so nvtx3::scoped_range_in should be preferred unless one of the capabilities above is needed.
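
The "movable" property is what lets a range outlive the scope that started it. The sketch below shows one way a move-only handle can guarantee that a range ends exactly once; movable_range, start_work, g_started, and g_ended are hypothetical names for this illustration and do not reflect how nvtx3::unique_range is implemented.

```cpp
#include <utility>

// Hypothetical counters standing in for a profiler backend.
static int g_started = 0;
static int g_ended = 0;

// Hypothetical move-only handle: the range is ended exactly once, by
// whichever object currently owns it.
class movable_range {
public:
   movable_range() : active_{true} { ++g_started; }
   movable_range(movable_range&& other) noexcept : active_{other.active_} {
      other.active_ = false;  // the moved-from object no longer ends the range
   }
   movable_range& operator=(movable_range&& other) noexcept {
      if (this != &other) {
         end();  // end the range currently owned, if any
         active_ = other.active_;
         other.active_ = false;
      }
      return *this;
   }
   movable_range(movable_range const&) = delete;
   movable_range& operator=(movable_range const&) = delete;
   ~movable_range() { end(); }

private:
   void end() {
      if (active_) {
         ++g_ended;
         active_ = false;
      }
   }
   bool active_;
};

// Starts a range in one scope and hands ownership to the caller.
movable_range start_work() {
   movable_range r;
   return r;
}
```

The extra ownership flag and the checks around it are a small example of the kind of bookkeeping that a purely scope-bound guard does not need.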

Marks

nvtx3::mark annotates an instantaneous point in time with a "marker".

Unlike a "range" which has a beginning and an end, a marker is a single event in an application, such as detecting a problem:

bool success = do_operation(...);
if (!success) {
   nvtx3::mark("operation failed!");
}

Domains

Similar to C++ namespaces, domains allow for scoping NVTX events. By default, all NVTX events belong to the "global" domain. Libraries and applications should scope their events to use a custom domain to differentiate where the events originate from.

It is common for a library or application to have only a single domain and for the name of that domain to be known at compile time. Therefore, Domains in NVTX C++ are represented by tag types.

For example, to define a custom domain, simply define a new concrete type (a class or struct) with a static member called name that contains the desired name of the domain.

struct my_domain{ static constexpr char const* name{"my domain"}; };

For any NVTX C++ construct that can be scoped to a domain, the type my_domain can be passed as an explicit template argument to scope it to the custom domain.

The tag type nvtx3::domain::global represents the global NVTX domain.

// By default, `scoped_range_in` belongs to the global domain
nvtx3::scoped_range_in<> r0{};
 
// Alias for a `scoped_range_in` in the global domain
nvtx3::scoped_range r1{};
 
// `r` belongs to the custom domain
nvtx3::scoped_range_in<my_domain> r{};

When using a custom domain, it is recommended to define type aliases for NVTX constructs in the custom domain.

using my_scoped_range = nvtx3::scoped_range_in<my_domain>;
using my_registered_string = nvtx3::registered_string_in<my_domain>;
using my_named_category = nvtx3::named_category_in<my_domain>;
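
To illustrate how a tag type can select a domain at compile time, here is a minimal sketch that is independent of NVTX. The names global_tag, lib_tag, guard_in, and g_domain_log are hypothetical; the guard writes to an in-memory log rather than to a profiler.

```cpp
#include <string>
#include <vector>

// Hypothetical event log standing in for a profiler backend.
static std::vector<std::string> g_domain_log;

// Hypothetical tag types: each carries a domain name as a static member.
struct global_tag { static constexpr char const* name{"global"}; };
struct lib_tag { static constexpr char const* name{"my library"}; };

// Hypothetical guard scoped to a domain that is chosen by its tag type.
// The default template argument plays the role of the global domain.
template <typename Tag = global_tag>
struct guard_in {
   guard_in() { g_domain_log.push_back(std::string{Tag::name} + ":begin"); }
   ~guard_in() { g_domain_log.push_back(std::string{Tag::name} + ":end"); }
};

// Aliases keep call sites short.
using guard = guard_in<>;
using lib_guard = guard_in<lib_tag>;
```

Since the domain is part of the type, a call site cannot accidentally mix domains at run time, and the alias makes the custom domain the path of least resistance inside a library.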

See nvtx3::domain for more information.

Event Attributes

NVTX events can be customized with various attributes to provide additional information (such as a custom message) or to control visualization of the event (such as the color used). These attributes can be specified per-event via arguments to an nvtx3::event_attributes object.

Four attributes are available:

color : color used to visualize the event in tools.
message : Custom message string.
payload : User-defined numerical value.
category : Intra-domain grouping.

An nvtx3::event_attributes can be constructed from any number of attribute objects (nvtx3::color, nvtx3::message, nvtx3::payload, nvtx3::category) in any order. If an attribute is not specified, a tool-specific default value is used. See nvtx3::event_attributes for more information.

// Set message, same as passing nvtx3::message{"message"}
nvtx3::event_attributes attr1{"message"};

// Set message and color
nvtx3::event_attributes attr2{"message", nvtx3::rgb{127, 255, 0}};

// Set message, color, payload, category
nvtx3::event_attributes attr3{"message",
                              nvtx3::rgb{127, 255, 0},
                              nvtx3::payload{42},
                              nvtx3::category{1}};

// Same as above -- can use any order of arguments
nvtx3::event_attributes attr4{nvtx3::payload{42},
                              nvtx3::category{1},
                              "message",
                              nvtx3::rgb{127, 255, 0}};

// Multiple arguments of the same type are allowed, but only the first is
// used -- in this example, payload is set to 42:
nvtx3::event_attributes attr5{ nvtx3::payload{42}, nvtx3::payload{7} };

// Using the nvtx3 namespace in a local scope makes the syntax more succinct:
using namespace nvtx3;
event_attributes attr6{"message", rgb{127, 255, 0}, payload{42}, category{1}};
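
The "any order, first one wins" behavior can be obtained with variadic delegating constructors. The following sketch shows one possible approach using hypothetical types (color_attr, payload_attr, attributes) defined only for this illustration; it is not presented as the actual definition of nvtx3::event_attributes.

```cpp
#include <cstdint>
#include <string>

// Hypothetical attribute types, loosely modeled on the list above.
struct color_attr { std::uint32_t argb; };
struct payload_attr { std::int64_t value; };

// Hypothetical attribute bundle. Each constructor peels off the leading
// argument, delegates the remaining ones, and applies its own argument last,
// so the first occurrence of a type overrides any later occurrence.
class attributes {
public:
   attributes() = default;

   template <typename... Rest>
   explicit attributes(color_attr c, Rest const&... rest) : attributes(rest...) {
      color = c.argb;
   }
   template <typename... Rest>
   explicit attributes(payload_attr p, Rest const&... rest) : attributes(rest...) {
      payload = p.value;
   }
   template <typename... Rest>
   explicit attributes(char const* m, Rest const&... rest) : attributes(rest...) {
      message = m;
   }

   std::uint32_t color = 0;
   std::int64_t payload = 0;
   std::string message;
};
```

With this design, attributes{payload_attr{42}, payload_attr{7}} first applies 7 through the delegated constructor and then overwrites it with 42, which matches the first-wins rule described in the comments above.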

message

nvtx3::message sets the message string for an NVTX event.

Example:

// Create an `event_attributes` with the message "my message"
nvtx3::event_attributes attr1{nvtx3::message{"my message"}};

// Strings and string literals are implicitly converted to a `nvtx3::message`
nvtx3::event_attributes attr2{"my message"};

Registered Messages

Associating an nvtx3::message with an event requires copying the entire message string every time the message is used. This may cause non-trivial overhead in performance-sensitive code.

To eliminate this overhead, NVTX allows registering a message string, which yields a "handle" that is inexpensive to copy and may be used in place of the message string. When visualizing the events, tools such as Nsight Systems take care of mapping the message handle back to its string.

A message should be registered once and the handle reused throughout the rest of the application. This can be done either by explicitly creating static nvtx3::registered_string objects, or by using the nvtx3::registered_string::get construct-on-first-use helper (recommended).

Similar to Domains, nvtx3::registered_string::get requires defining a custom tag type with a static message member whose value will be the contents of the registered string.

Example:

// Explicitly constructed, static `registered_string` in my_domain:
static nvtx3::registered_string_in<my_domain> static_message{"my message"};
 
// Or use construct on first use:
// Define a tag type with a `message` member string to register
struct my_message{ static constexpr char const* message{ "my message" }; };
 
// Uses construct on first use to register the contents of
// `my_message::message`
auto& msg = nvtx3::registered_string_in<my_domain>::get<my_message>();
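
The trade-off behind registration can be sketched in a few lines: the string is stored once, and events carry only a small handle. The names string_registry and event below are hypothetical and the index-based handle is an assumption made for illustration; NVTX's own handle representation is not shown here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical registry: stores each string once and hands out a small handle.
class string_registry {
public:
   using handle = std::size_t;

   // Copies the string into the registry once and returns its handle.
   handle register_string(std::string const& s) {
      strings_.push_back(s);
      return strings_.size() - 1;
   }

   // What a visualization tool would do: map the handle back to its text.
   std::string const& lookup(handle h) const { return strings_.at(h); }

private:
   std::vector<std::string> strings_;
};

// Hypothetical event that carries only the handle, not the characters.
struct event {
   string_registry::handle message;
};
```

Creating many events from the same handle copies a single integer each time, while the characters of the message are copied only at registration.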

color

Associating a nvtx3::color with an event allows controlling how the event is visualized in a tool such as Nsight Systems. This is a convenient way to visually differentiate among different events.

// Define a color via rgb color values
nvtx3::color c{nvtx3::rgb{127, 255, 0}};
nvtx3::event_attributes attr{c};
 
// rgb color values can be passed directly to an `event_attributes`
nvtx3::event_attributes attr1{nvtx3::rgb{127,255,0}};

payload

nvtx3::payload associates a user-defined numerical value with an event.

// Constructs a payload from the `int32_t` value 42
nvtx3::event_attributes attr{nvtx3::payload{42}};

Example

Putting it all together:

// Define a custom domain tag type
struct my_domain{ static constexpr char const* name{"my domain"}; };
 
// Define a named category tag type
struct my_category{
   static constexpr char const* name{"my category"};
   static constexpr uint32_t id{42};
};
 
// Define a registered string tag type
struct my_message{ static constexpr char const* message{"my message"}; };
 
// For convenience, use aliases for domain scoped objects
using my_scoped_range = nvtx3::scoped_range_in<my_domain>;
using my_registered_string = nvtx3::registered_string_in<my_domain>;
using my_named_category = nvtx3::named_category_in<my_domain>;
 
// Default values for all attributes
nvtx3::event_attributes attr{};
my_scoped_range r0{attr};
 
// Custom (unregistered) message, and unnamed category
nvtx3::event_attributes attr1{"message", nvtx3::category{2}};
my_scoped_range r1{attr1};
 
// Alternatively, pass arguments of `event_attributes` constructor directly
// to `my_scoped_range`
my_scoped_range r2{"message", nvtx3::category{2}};
 
// construct on first use a registered string
auto& msg = my_registered_string::get<my_message>();
 
// construct on first use a named category
auto& cat = my_named_category::get<my_category>();
 
// Use registered string and named category with a custom payload
my_scoped_range r3{msg, cat, nvtx3::payload{42}};
 
// Any number of arguments in any order
my_scoped_range r{nvtx3::rgb{127, 255, 0}, msg};

Convenience Macros

Oftentimes users want to quickly and easily add NVTX ranges to their library or application to aid in profiling and optimization.

A convenient way to do this is to use the NVTX3_FUNC_RANGE and NVTX3_FUNC_RANGE_IN macros. These macros take care of constructing an nvtx3::scoped_range_in with the name of the enclosing function as the range's message.

void some_function() {
   // Automatically generates an NVTX range for the duration of the function
   // using "some_function" as the event's message.
   NVTX3_FUNC_RANGE();
}
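
For readers curious how a macro can pick up the enclosing function's name, the sketch below shows the general idea using the standard __func__ identifier. MY_FUNC_GUARD, func_guard, g_log, and load_data are hypothetical names for this illustration, and the sketch is not the definition of NVTX3_FUNC_RANGE.

```cpp
#include <string>
#include <vector>

// Hypothetical event log standing in for a profiler backend.
static std::vector<std::string> g_log;

// Hypothetical RAII guard that logs a name on entry and "end" on exit.
struct func_guard {
   explicit func_guard(char const* name) { g_log.push_back(name); }
   ~func_guard() { g_log.push_back("end"); }
};

// Hypothetical macro: `__func__` is evaluated inside the enclosing function,
// so the guard is named after whichever function the macro appears in.
#define MY_FUNC_GUARD() func_guard my_func_guard_instance{__func__}

void load_data() {
   MY_FUNC_GUARD();
   // do work
}
```

A macro is used here rather than a function because __func__ must be expanded at the call site; inside a helper function it would name the helper instead.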
