NVTX C++ API Reference 1.0
C++ convenience wrappers for NVTX v3 C API
To add NVTX ranges to your code, use the nvtx3::scoped_range RAII object. A range begins when the object is created and ends when the object is destroyed.
In Nsight Systems, these ranges appear in the timeline view.
Alternatively, use the Convenience Macros, such as NVTX3_FUNC_RANGE(), to add ranges that automatically use the name of the enclosing function as the range's message.
The NVTX library provides a set of functions for users to annotate their code to aid in performance profiling and optimization. These annotations provide information to tools like Nsight Systems to improve visualization of application timelines.
Ranges are one of the most commonly used NVTX constructs for annotating a span of time. For example, imagine a user wants to see every time a function, my_function, is called and how long it takes to execute. This can be accomplished with an NVTX range created on entry to my_function and terminated on return from it, using the push/pop C APIs:
One of the challenges with using the NVTX C API is that it requires manually terminating the range with nvtxRangePop. This can be challenging if my_function() has multiple returns or can throw exceptions, as it requires calling nvtxRangePop() before every possible return point.
NVTX C++ solves this inconvenience through the "RAII" technique by providing an nvtx3::scoped_range class that begins a range at construction and ends the range on destruction. The above example then becomes:
The range object r is deterministically destroyed whenever my_function returns, ending the NVTX range without manual intervention. For more information, see Ranges and nvtx3::scoped_range_in.
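As a minimal sketch of the RAII technique itself, the following self-contained code uses stand-in functions range_push and range_pop in a hypothetical sketch namespace. These are not NVTX calls, and sketch::scoped_range is not the library's class. The sketch only illustrates why a guard object covers every exit path of my_function, including the exceptional one.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {

// Stand-in for the tool-side stack of open ranges; NOT part of NVTX.
inline std::vector<std::string>& open_ranges() {
    static std::vector<std::string> stack;
    return stack;
}

// Stand-ins for the push/pop C calls.
inline void range_push(char const* message) { open_ranges().emplace_back(message); }
inline void range_pop() {
    assert(!open_ranges().empty());
    open_ranges().pop_back();
}

// RAII guard: pushes in the constructor, pops in the destructor.
class scoped_range {
public:
    explicit scoped_range(char const* message) { range_push(message); }
    ~scoped_range() { range_pop(); }
    scoped_range(scoped_range const&) = delete;
    scoped_range& operator=(scoped_range const&) = delete;
};

// Three exit paths (throw, early return, normal return) and no manual pop.
inline int my_function(int x) {
    scoped_range r{"my_function"};
    if (x < 0) { throw std::invalid_argument{"negative input"}; }
    if (x == 0) { return 0; }
    return 2 * x;
}

}  // namespace sketch
```

With the real header, the guard line would read nvtx3::scoped_range r{"my_function"}; and the bookkeeping shown here is left to the profiling tool.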
Another inconvenience of the NVTX C API is that several constructs expect the user to initialize an object at the beginning of an application and reuse that object throughout the lifetime of the application. Examples include domains, categories, and registered messages.
Example:
This can be problematic if the user application or library does not have an explicit initialization function called before all other functions to ensure that these long-lived objects are initialized before being used.
NVTX C++ makes use of the "construct on first use" technique to alleviate this inconvenience. In short, a function-local static object is constructed upon the first invocation of a function, and a reference to that object is returned on that and all future invocations. See the documentation for nvtx3::domain, nvtx3::named_category, nvtx3::registered_string, and https://isocpp.org/wiki/faq/ctors#static-init-order-on-first-use for more information.
Using construct on first use, the above example becomes:
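A minimal, self-contained sketch of the construct-on-first-use technique follows. It uses a stand-in sketch::domain type and a hypothetical annotate function rather than the real NVTX classes, and it counts constructor calls only so that the "constructed once" property can be observed.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical long-lived object standing in for a domain handle.
class domain {
public:
    explicit domain(char const* name) : name_{name} {
        assert(name_ != nullptr);
        ++constructions();
    }
    std::string name() const { return name_; }

    // Counts constructor calls so the "only once" property is observable.
    static int& constructions() {
        static int count = 0;
        return count;
    }

private:
    char const* name_;
};

// Construct on first use: the function-local static is initialized the first
// time control passes through its declaration (thread-safe since C++11), and
// every later call returns a reference to that same object.
inline domain const& my_domain() {
    static domain const d{"my domain"};
    return d;
}

// Any function can use the domain without a prior initialization call.
inline std::string annotate(char const* message) {
    return my_domain().name() + ": " + message;
}

}  // namespace sketch
```

Because the object lives inside my_domain(), no caller needs to know whether it is the first to arrive, which is the property an explicit initialization function would otherwise have to guarantee.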
For more information about NVTX and how it can be used, see https://docs.nvidia.com/cuda/profiler-users-guide/index.html#nvtx and https://devblogs.nvidia.com/cuda-pro-tip-generate-custom-application-profile-timelines-nvtx/.
Ranges are used to describe a span of time during the execution of an application. Common examples are using ranges to annotate the time it takes to execute a function or an iteration of a loop.
NVTX C++ uses RAII to automate the generation of ranges that are tied to the lifetime of objects, similar to std::lock_guard in the C++ Standard Library.
nvtx3::scoped_range_in is a class that begins a range upon construction and ends the range at destruction. This is one of the most commonly used constructs in NVTX C++ and is useful for annotating spans of time on a particular thread. These ranges can be nested to arbitrary depths.
nvtx3::scoped_range is an alias for an nvtx3::scoped_range_in in the global NVTX domain. For more information, see Domains.
Various attributes of a range can be configured by constructing an nvtx3::scoped_range_in with an nvtx3::event_attributes object. For more information, see Event Attributes.
Example:
nvtx3::unique_range is similar to nvtx3::scoped_range, with a few key differences:
- unique_range objects can be destroyed in any order, whereas scoped_range objects must be destroyed in exact reverse creation order
- unique_range can start and end on different threads
- unique_range is movable
- unique_range objects can be constructed as heap objects
There is extra overhead associated with unique_range constructs, and therefore use of nvtx3::scoped_range_in should be preferred.
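The differences listed above can be illustrated with a small stand-in. The sketch below is not the library's implementation: range_start, range_end, open_ids, and start_on_heap are hypothetical helpers, and the assumption is simply that each range is tracked by an id rather than by stack position, which is what makes out-of-order destruction and moves possible.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <utility>

namespace sketch {

// Stand-in for the tool-side set of currently open ranges; NOT part of NVTX.
inline std::set<int>& open_ids() {
    static std::set<int> ids;
    return ids;
}

// Stand-ins for "start a range, get an id" and "end the range with that id".
inline int range_start() {
    static int next_id = 0;
    int const id = ++next_id;
    open_ids().insert(id);
    return id;
}
inline void range_end(int id) {
    assert(open_ids().count(id) == 1);
    open_ids().erase(id);
}

// Move-only owner of one open range. Because each range is tracked by id
// rather than by stack position, owners may be destroyed in any order.
class unique_range {
public:
    unique_range() : id_{range_start()} {}
    ~unique_range() { release(); }

    unique_range(unique_range&& other) noexcept : id_{other.id_} { other.id_ = 0; }
    unique_range& operator=(unique_range&& other) noexcept {
        if (this != &other) {
            release();
            id_ = other.id_;
            other.id_ = 0;
        }
        return *this;
    }
    unique_range(unique_range const&) = delete;
    unique_range& operator=(unique_range const&) = delete;

    int id() const { return id_; }

private:
    void release() {
        if (id_ != 0) { range_end(id_); id_ = 0; }
    }
    int id_;  // 0 means "moved from, owns no range"
};

// Heap construction: the range ends whenever the owning pointer is reset.
inline std::unique_ptr<unique_range> start_on_heap() {
    return std::unique_ptr<unique_range>{new unique_range};
}

}  // namespace sketch
```

The id bookkeeping and the moved-from check are the kind of extra work a stack-ordered guard does not need, which is consistent with preferring the scoped form when its restrictions are acceptable.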
nvtx3::mark annotates an instantaneous point in time with a "marker".
Unlike a "range" which has a beginning and an end, a marker is a single event in an application, such as detecting a problem:
Similar to C++ namespaces, domains allow for scoping NVTX events. By default, all NVTX events belong to the "global" domain. Libraries and applications should scope their events to use a custom domain to differentiate where the events originate from.
It is common for a library or application to have only a single domain and for the name of that domain to be known at compile time. Therefore, Domains in NVTX C++ are represented by tag types.
For example, to define a custom domain, simply define a new concrete type (a class or struct) with a static member called name that contains the desired name of the domain.
For any NVTX C++ construct that can be scoped to a domain, the type my_domain can be passed as an explicit template argument to scope it to the custom domain.
The tag type nvtx3::domain::global represents the global NVTX domain.
When using a custom domain, it is recommended to define type aliases for NVTX constructs in the custom domain.
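The tag-type pattern described above can be sketched without the library. In the code below, sketch::domain and sketch::scoped_range_in are simplified stand-ins (assumptions for illustration, not the real classes); only the shape of my_domain, a struct with a static member called name, and the use of a type alias follow the text.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Stand-in for a domain handle; NOT the library's class.
class domain {
public:
    explicit domain(char const* name) : name_{name} { assert(name_ != nullptr); }
    std::string name() const { return name_; }

    // One domain object per tag type D, constructed on first use from D::name.
    template <typename D>
    static domain const& get() {
        static domain const d{D::name};
        return d;
    }

private:
    char const* name_;
};

// A construct that is scoped to a domain through its template argument.
template <typename D>
class scoped_range_in {
public:
    explicit scoped_range_in(char const* message) : message_{message} {}
    std::string message() const { return message_; }
    std::string domain_name() const { return domain::get<D>().name(); }

private:
    char const* message_;
};

}  // namespace sketch

// Tag type for a custom domain: a struct with a static member called name.
struct my_domain {
    static constexpr char const* name{"my domain"};
};

// Alias so call sites do not repeat the template argument.
using my_scoped_range = sketch::scoped_range_in<my_domain>;
```

A call site then writes my_scoped_range r{"step"}; and the domain is selected entirely at compile time. With the real library, the alias would presumably target nvtx3::scoped_range_in<my_domain> instead.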
See nvtx3::domain for more information.
NVTX events can be customized with various attributes to provide additional information (such as a custom message) or to control visualization of the event (such as the color used). These attributes can be specified per-event via arguments to an nvtx3::event_attributes object.
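NVTX events can be customized via four "attributes":
- color: controls how the event is visualized in a tool
- message: a custom message string
- payload: a user-defined numerical value
- category: an integer id for fine-grained grouping of events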
It is possible to construct an nvtx3::event_attributes from any number of attribute objects (nvtx3::color, nvtx3::message, nvtx3::payload, nvtx3::category) in any order. If an attribute is not specified, a tool-specific default value is used. See nvtx3::event_attributes for more information.
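One way to accept "any number of attributes in any order" is a family of delegating variadic constructors, each of which consumes one attribute and forwards the rest. The sketch below shows that technique with stand-in types; the field names, accessors, and zero defaults are assumptions made for illustration and do not describe the library's actual layout or defaults.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

namespace sketch {

// Hypothetical attribute wrappers; field names are illustrative only.
struct color    { std::uint32_t argb; };
struct category { std::uint32_t id; };
struct payload  { std::int64_t value; };
struct message  { std::string text; };

class event_attributes {
public:
    // Base case: nothing specified, so every field keeps its default.
    event_attributes() = default;

    // Each constructor consumes one attribute and delegates the rest.
    template <typename... Rest>
    explicit event_attributes(color const& c, Rest const&... rest)
        : event_attributes(rest...) { color_ = c.argb; }

    template <typename... Rest>
    explicit event_attributes(category const& c, Rest const&... rest)
        : event_attributes(rest...) { category_ = c.id; }

    template <typename... Rest>
    explicit event_attributes(payload const& p, Rest const&... rest)
        : event_attributes(rest...) { payload_ = p.value; }

    template <typename... Rest>
    explicit event_attributes(message const& m, Rest const&... rest)
        : event_attributes(rest...) { message_ = m.text; }

    std::uint32_t color_argb() const { return color_; }
    std::uint32_t category_id() const { return category_; }
    std::int64_t payload_value() const { return payload_; }
    std::string const& message_text() const { return message_; }

private:
    std::uint32_t color_{0};
    std::uint32_t category_{0};
    std::int64_t payload_{0};
    std::string message_{};
};

// Usage: the same attributes in a different order give the same result.
inline void demo() {
    event_attributes const a{message{"load"}, color{0xFF00FF00u}, payload{42}};
    event_attributes const b{payload{42}, message{"load"}, color{0xFF00FF00u}};
    assert(a.message_text() == b.message_text());
    assert(a.color_argb() == b.color_argb());
    assert(a.payload_value() == b.payload_value());
    assert(a.category_id() == 0u);  // unspecified, so the sketch's default applies
}

}  // namespace sketch
```

In this sketch, if the same attribute type appears twice, the first occurrence wins because the outermost constructor body runs last. That is a property of the sketch and should not be read as a statement about the library.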
nvtx3::message sets the message string for an NVTX event.
Example:
Associating an nvtx3::message with an event requires copying the entire message string every time the message is used. This may cause non-trivial overhead in performance-sensitive code.
To eliminate this overhead, NVTX allows registering a message string, yielding a "handle" that is inexpensive to copy that may be used in place of a message string. When visualizing the events, tools such as Nsight Systems will take care of mapping the message handle to its string.
A message should be registered once and the handle reused throughout the rest of the application. This can be done either by explicitly creating static nvtx3::registered_string objects or by using the nvtx3::registered_string::get construct-on-first-use helper (recommended).
Similar to Domains, nvtx3::registered_string::get requires defining a custom tag type with a static message member whose value will be the contents of the registered string.
Example:
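A self-contained sketch of the idea: the string is copied into a table once, and afterwards only a small handle is passed around. Here sketch::registered_string and string_table are stand-ins (the handle is assumed to be a plain index, which is not how the library represents it), while the tag type with a static message member follows the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Stand-in for the tool-side string table; NOT part of NVTX.
inline std::vector<std::string>& string_table() {
    static std::vector<std::string> table;
    return table;
}

// The handle is just an index here: cheap to copy, resolved to text later.
class registered_string {
public:
    explicit registered_string(char const* msg) : handle_{string_table().size()} {
        assert(msg != nullptr);
        string_table().emplace_back(msg);  // the only copy of the text
    }
    std::size_t handle() const { return handle_; }

    // Construct-on-first-use helper keyed by the tag type M.
    template <typename M>
    static registered_string const& get() {
        static registered_string const s{M::message};
        return s;
    }

private:
    std::size_t handle_;
};

}  // namespace sketch

// Tag type with a static message member holding the string to register.
struct my_message {
    static constexpr char const* message{"my message"};
};
```

Every call to sketch::registered_string::get<my_message>() returns the same object, so the text is stored once no matter how many events refer to it.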
Associating an nvtx3::color with an event allows controlling how the event is visualized in a tool such as Nsight Systems. This is a convenient way to visually differentiate among different events.
An nvtx3::category is simply an integer id that allows for fine-grained grouping of NVTX events. For example, one might use separate categories for IO, memory allocation, compute, etc.
nvtx3::named_category associates a name string with a category id to help differentiate among categories.
For any given category id Id, a named_category{Id, "name"} should be constructed only once and reused throughout an application. This can be done either by explicitly creating static nvtx3::named_category objects or by using the nvtx3::named_category::get construct-on-first-use helper (recommended).
Similar to Domains, nvtx3::named_category::get requires defining a custom tag type with static name and id members.
nvtx3::payload allows associating a user-defined numerical value with an event.
Putting it all together:
Oftentimes users want to quickly and easily add NVTX ranges to their library or application to aid in profiling and optimization.
A convenient way to do this is to use the NVTX3_FUNC_RANGE and NVTX3_FUNC_RANGE_IN macros. These macros take care of constructing an nvtx3::scoped_range_in with the name of the enclosing function as the range's message.
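To show how a macro can pick up the enclosing function's name, the sketch below defines a hypothetical SKETCH_FUNC_RANGE macro (deliberately not named like the real macros) that builds a stand-in guard from the standard __func__ identifier. The real macros may be implemented quite differently; only the general idea is illustrated, and events() is an assumed log used to make the behavior visible.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Stand-in event log so the begin/end order can be inspected; NOT part of NVTX.
inline std::vector<std::string>& events() {
    static std::vector<std::string> log;
    return log;
}

// Stand-in guard that records when its range begins and ends.
class scoped_range {
public:
    explicit scoped_range(char const* message) : message_{message} {
        assert(message_ != nullptr);
        events().push_back(std::string{"begin "} + message_);
    }
    ~scoped_range() { events().push_back(std::string{"end "} + message_); }
    scoped_range(scoped_range const&) = delete;
    scoped_range& operator=(scoped_range const&) = delete;

private:
    char const* message_;  // points at __func__, which has static storage
};

}  // namespace sketch

// Hypothetical macro: declares a guard named after the enclosing function.
#define SKETCH_FUNC_RANGE() \
    ::sketch::scoped_range const sketch_func_range_guard{__func__}

// The range covers the whole body and is labeled "compute".
inline int compute(int x) {
    SKETCH_FUNC_RANGE();
    return x + 1;
}
```

Since the macro declares a local variable with a fixed name, it can be used at most once per scope in this sketch.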