Time Library
See the documentation of the standard header <chrono>
| Header | Content | Availability |
|---|---|---|
| | Times, dates, and clocks | libcu++ 1.1.0 / CCCL 2.0.0 / CUDA 12.3 |
Implementation-Defined Behavior
std::chrono::system_clock
std::chrono::system_clock is a clock that tracks real-world time. In the C++ Standard, it is unspecified whether or not this clock is monotonically increasing. In our implementation, it is not.
To implement std::chrono::system_clock, we use:
GetSystemTimePreciseAsFileTime and GetSystemTimeAsFileTime for host code on Windows.
clock_gettime(CLOCK_REALTIME, …) and gettimeofday for host code on Linux, Android, and QNX.
PTX’s %globaltimer for device code.
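As a rough illustration of the Linux host path listed above, the sketch below reads CLOCK_REALTIME with clock_gettime and converts the result to a chrono duration. It is not the library's actual source: the function name realtime_since_epoch is hypothetical, it assumes a POSIX host, and it uses the host standard library <chrono> rather than the NVIDIA C++ Standard Library.

```cpp
#include <chrono>
#include <time.h>  // POSIX clock_gettime, CLOCK_REALTIME

// Hypothetical helper (not a library function): wall-clock time since the
// Unix epoch in nanoseconds, or zero if the clock could not be read.
std::chrono::nanoseconds realtime_since_epoch() {
    timespec ts{};
    if (clock_gettime(CLOCK_REALTIME, &ts) != 0) {
        return std::chrono::nanoseconds::zero();
    }
    // tv_sec holds whole seconds; tv_nsec holds the sub-second remainder.
    return std::chrono::seconds(ts.tv_sec) + std::chrono::nanoseconds(ts.tv_nsec);
}
```

Because CLOCK_REALTIME can be adjusted while the program runs, two successive results from a helper like this are not guaranteed to be ordered.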
PTX’s %globaltimer is a system clock which also happens to be monotonically increasing on today’s NVIDIA GPUs (e.g. it cannot be updated and is not changed when the host system clock changes). However, this is not necessarily the case with respect to host threads, where updates of the system clock may occur during the execution of the program.
PTX’s %globaltimer is initialized from the host system clock upon device attach; that may be at program start, but it could be earlier (for example, due to CUDA persistence mode). Since PTX’s %globaltimer is a system clock, it counts real-world time, and thus it has the same tick rate as the host system clock.
There is potential for logical inconsistencies between the time that host threads and device threads observe from our std::chrono::system_clock. However, this is perfectly fine; it is an inherent property of system clocks. In fact, it is not even guaranteed that a system clock remains consistent between different host threads, or even within the same host thread. Such inconsistencies can occur, for example, due to Daylight Saving Time or a time zone change.
The requirements for Clock state:

C1 denotes a clock type. t1 and t2 are values returned by C1::now() where the call returning t1 happens before the call returning t2 and both of these calls occur before C1::time_point::max().

C1::is_steady is true if t1 <= t2 is always true and the time between clock ticks is constant, otherwise false.
The property is true for our std::chrono::system_clock within device code, but it is not true for all threads. Therefore, in the NVIDIA C++ Standard Library today, the value of the is_steady member of std::chrono::system_clock is false.
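The is_steady requirement can be checked at compile time, and the t1 <= t2 ordering can be spot-checked at run time. The following is a minimal sketch using the host standard library <chrono>; the names observed_non_decreasing and steady_elapsed are hypothetical helpers, not part of any library. It assumes the same pattern carries over to the cuda::std::chrono clocks, where steady_elapsed would be rejected for system_clock because is_steady is false.

```cpp
#include <chrono>

// Returns true if `samples` consecutive readings of Clock never go backwards.
// This is only an observation; it cannot prove that a clock is steady.
template <class Clock>
bool observed_non_decreasing(int samples) {
    auto previous = Clock::now();
    for (int i = 0; i < samples; ++i) {
        const auto current = Clock::now();
        if (current < previous) {
            return false;
        }
        previous = current;
    }
    return true;
}

// Compile-time guard: refuses to instantiate for clocks that do not
// promise t1 <= t2 with a constant tick period.
template <class Clock>
typename Clock::duration steady_elapsed(typename Clock::time_point start) {
    static_assert(Clock::is_steady, "Clock must be steady to measure intervals safely");
    return Clock::now() - start;
}
```

On the host, std::chrono::steady_clock satisfies the guard; a clock whose is_steady is false produces a compile error instead of a silently unreliable interval.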
std::chrono::high_resolution_clock
The std::chrono::high_resolution_clock specification states:
Objects of class high_resolution_clock represent clocks with the shortest tick period. high_resolution_clock may be a synonym for system_clock or steady_clock.
In the NVIDIA C++ Standard Library, std::chrono::high_resolution_clock is an alias for std::chrono::system_clock. This means that it counts real-world time and that is_steady is false for our std::chrono::high_resolution_clock.
While our std::chrono::high_resolution_clock is not heterogeneously steady, it is steady within device code, so it is suitable for performance measurement within device code.
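A timing helper for such measurements can be written generically over the clock type. The sketch below is host-side standard C++ and only illustrates the pattern; time_call is a hypothetical name. It assumes that an equivalent helper for device code would take cuda::std::chrono::high_resolution_clock as the Clock and be marked for device execution.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `f` once and returns the time it took,
// expressed in the clock's native duration type.
template <class Clock, class F>
typename Clock::duration time_call(F&& f) {
    const auto start = Clock::now();
    std::forward<F>(f)();
    return Clock::now() - start;
}
```

The result is only meaningful when both readings come from the same side (host or device) and the clock is steady there; mixing a host reading with a device reading gives no such guarantee.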
Omissions
The following facilities in section time.syn of ISO/IEC IS 14882 (the C++ Standard) are not available in the NVIDIA C++ Standard Library today:
std::chrono::steady_clock - a monotonically increasing clock.
std::chrono::steady_clock
std::chrono::steady_clock is, by definition, a monotonically increasing clock (i.e. is_steady is true). We do not currently have a heterogeneous steady clock.
While we have a monotonically increasing clock in host code, and our system clock (PTX’s %globaltimer) is monotonically increasing in device code, it is not guaranteed that the host clocks and the device clocks are monotonically increasing with respect to each other, due to how PTX’s %globaltimer is initialized. Additionally, %globaltimer and the host steady clock may tick at different rates.
It may be technically possible to synchronize the clocks and to compute and adjust for the difference in tick rates. However, it would be challenging to do so, and doing so may introduce substantial overhead in the initialization and access of the heterogeneous clock.
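To give a feel for what synchronizing two clocks involves, the sketch below estimates the offset between two host clocks by bracketing a reading of one clock between two readings of the other. This is purely illustrative and host-only: OffsetEstimate and estimate_offset are hypothetical names, the technique is a generic one rather than anything the library does, and it does not address differing tick rates or the cost of crossing between host and device.

```cpp
#include <chrono>

// Hypothetical result type for the sketch.
struct OffsetEstimate {
    std::chrono::nanoseconds offset;       // ClockA reading minus ClockB reading
    std::chrono::nanoseconds uncertainty;  // width of the bracketing interval on ClockB
};

// Reads ClockA between two ClockB readings and keeps the tightest bracket
// seen over `rounds` attempts.
template <class ClockA, class ClockB>
OffsetEstimate estimate_offset(int rounds) {
    static_assert(ClockB::is_steady, "the bracketing clock must be steady");
    using ns = std::chrono::nanoseconds;
    OffsetEstimate best{ns::zero(), ns::max()};
    for (int i = 0; i < rounds; ++i) {
        const auto b0 = ClockB::now();
        const auto a = ClockA::now();
        const auto b1 = ClockB::now();
        const ns width = std::chrono::duration_cast<ns>(b1 - b0);
        if (width < best.uncertainty) {
            // Assume the ClockA reading happened at the midpoint of the bracket.
            const ns mid = std::chrono::duration_cast<ns>(b0.time_since_epoch()) + width / 2;
            best.offset = std::chrono::duration_cast<ns>(a.time_since_epoch()) - mid;
            best.uncertainty = width;
        }
    }
    return best;
}
```

Even in this simple host-only form, every estimate costs three clock reads and is only as good as its bracket, which hints at the overhead a heterogeneous version would carry.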
As such, today we do not provide std::chrono::steady_clock, as we cannot easily provide an efficient implementation that is truly heterogeneous and conforms to the specification.
std::chrono::duration I/O Operators
Implementing a heterogeneous C++ I/O streams library involves many challenges that we cannot overcome today.
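Without stream operators, a duration can still be printed by converting it to a known unit and formatting its count() value by hand. The following host-side sketch shows one way to do that; to_milliseconds_string is a hypothetical helper, and it relies on std::string, so it assumes host code (in device code, passing count() to printf would be the analogous approach).

```cpp
#include <chrono>
#include <string>

// Hypothetical helper: renders any duration as whole milliseconds, e.g. "2000ms".
// duration_cast truncates toward zero, so sub-millisecond remainders are dropped.
template <class Rep, class Period>
std::string to_milliseconds_string(std::chrono::duration<Rep, Period> d) {
    const auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(d);
    return std::to_string(ms.count()) + "ms";
}
```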