std::chrono::system_clock is a clock that tracks real-world time. In the C++ Standard, it is unspecified whether or not this clock is monotonically increasing. In our implementation, it is not.
To implement std::chrono::system_clock, we use:
GetSystemTimeAsFileTime for host code on Windows.
gettimeofday for host code on Linux, Android, and QNX.
%globaltimer for device code.
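As an illustration of the host-side Linux case only, the following is a minimal sketch of turning a gettimeofday reading into a std::chrono duration. It is not the library's actual implementation; the helper name wall_time_since_epoch is hypothetical, and the code assumes a POSIX system where <sys/time.h> is available.

```cpp
#include <sys/time.h>  // gettimeofday; POSIX only (assumption: Linux-like host)
#include <chrono>

// Hypothetical helper, not the library's code: reads the wall clock with
// gettimeofday and returns the time since the Unix epoch in microseconds.
inline std::chrono::microseconds wall_time_since_epoch() {
    timeval tv{};
    gettimeofday(&tv, nullptr);  // second argument (time zone) is obsolete
    return std::chrono::seconds(tv.tv_sec) +
           std::chrono::microseconds(tv.tv_usec);
}
```

Because the value comes from the wall clock, two consecutive calls are not guaranteed to be ordered if the host time is adjusted in between.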
PTX’s %globaltimer is a system clock which also happens to be monotonically increasing on today’s NVIDIA GPUs (i.e. it cannot be updated and is not changed when the host system clock changes). However, this is not necessarily the case with respect to host threads, where updates of the system clock may occur during the execution of the program. PTX’s %globaltimer is initialized from the host system clock upon device attach; that may be at program start, but it could be earlier (for example, due to CUDA persistence mode). Since PTX’s %globaltimer is a system clock, it counts real-world time, and thus it has the same tick rate as the host system clock.
There is potential for logical inconsistencies between the time that host threads and device threads observe from our std::chrono::system_clock. However, this is perfectly fine; it is an inherent property of system clocks. In fact, it is not even guaranteed that a system clock remains consistent between different host threads, or even within the same host thread. This can occur, for example, due to Daylight Saving Time or a time zone change.
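To make the consequence concrete, here is a small host-only sketch, written against std::chrono rather than our library. Both helpers, count_backward_steps and clamped_elapsed, are hypothetical names introduced for illustration. The first counts how often a later reading of a clock is smaller than the previous one; the second computes an elapsed time that never goes negative, which may be useful when a non-steady clock has to be used for timing.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of any library): samples Clock::now()
// `samples` times on the calling thread and counts how often a later
// reading is smaller than the one before it.
template <class Clock>
std::size_t count_backward_steps(std::size_t samples) {
    std::size_t backward = 0;
    auto previous = Clock::now();
    for (std::size_t i = 0; i < samples; ++i) {
        auto current = Clock::now();
        if (current < previous) {
            ++backward;  // the clock was set back between the two calls
        }
        previous = current;
    }
    return backward;
}

// Hypothetical helper: elapsed time that is clamped to zero when the stop
// reading is earlier than the start reading.
template <class Clock>
typename Clock::duration clamped_elapsed(typename Clock::time_point start,
                                         typename Clock::time_point stop) {
    return stop < start ? Clock::duration::zero() : stop - start;
}
```

For a steady clock, count_backward_steps is expected to return 0. For a system clock it usually does too, but nothing guarantees it.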
The requirements for Clock state:
C1 denotes a clock type. t1 and t2 are values returned by C1::now() where the call returning t1 happens before the call returning t2 and both of these calls occur before C1::time_point::max().
C1::is_steady is true if t1 <= t2 is always true and the time between clock ticks is constant, otherwise false.
The property is true for our std::chrono::system_clock within device code, but it is not true for all threads. Therefore, in the NVIDIA C++ Standard Library today, the value of the is_steady member of std::chrono::system_clock is false.
The std::chrono::high_resolution_clock specification states:
Objects of class high_resolution_clock represent clocks with the shortest tick period. high_resolution_clock may be a synonym for system_clock or steady_clock.
In the NVIDIA C++ Standard Library, std::chrono::high_resolution_clock is an alias for std::chrono::system_clock. This means that it counts real-world time and that is_steady is false for our std::chrono::high_resolution_clock. Although our std::chrono::high_resolution_clock is not heterogeneously steady, it is steady within device code, so it is suitable for performance measurement within device code.
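The aliasing and the timing use case can be sketched on the host with plain std::chrono. This is an illustration under assumptions, not our device implementation: whether the alias holds depends on the standard library being compiled against, and time_call is a hypothetical helper. In device code the corresponding cuda::std::chrono names would be used instead.

```cpp
#include <chrono>
#include <type_traits>

// True when high_resolution_clock is the same type as system_clock in the
// standard library this file is compiled against. The answer differs between
// implementations.
constexpr bool high_res_is_system_clock =
    std::is_same<std::chrono::high_resolution_clock,
                 std::chrono::system_clock>::value;

// Hypothetical helper: times one invocation of a callable with
// high_resolution_clock and returns the elapsed time in nanoseconds.
template <class F>
std::chrono::nanoseconds time_call(F&& f) {
    using clock = std::chrono::high_resolution_clock;
    const auto start = clock::now();
    f();
    const auto stop = clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

When the clock is not steady, the returned value is only meaningful if the clock was not adjusted during the call.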
The following facilities in section time.syn of ISO/IEC IS 14882 (the C++ Standard) are not available in the NVIDIA C++ Standard Library today:
std::chrono::steady_clock is, by definition, a monotonically increasing clock (i.e. is_steady is true). We do not currently have a heterogeneous steady clock. While we have a monotonically increasing clock in host code, and our system clock (PTX’s %globaltimer) is monotonically increasing in device code, it is not guaranteed that the host clocks and the device clocks are monotonically increasing with respect to each other, due to how PTX’s %globaltimer is initialized. Additionally, %globaltimer and the host steady clock may tick at different rates.
It may be technically possible to synchronize the clocks and to compute and adjust for the difference in tick rates. However, it would be challenging to do so, and may introduce substantial overhead in the initialization and access of the heterogeneous clock.
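To give a sense of what such a synchronization involves, the sketch below estimates the offset between two host clocks by bracketing a reading of one clock between two readings of the other. It is a host-only illustration with hypothetical helper names (estimate_offset, to_reference_timeline). It assumes both clocks tick at the same rate, so it does not address the tick-rate problem described above, and it says nothing about the extra cost of reading a device clock from the host.

```cpp
#include <chrono>

// Hypothetical helper: estimates the offset of Other relative to Reference.
// Reads Reference, then Other, then Reference again, and pairs the Other
// reading with the midpoint of the two Reference readings.
template <class Reference, class Other>
std::chrono::nanoseconds estimate_offset() {
    using std::chrono::duration_cast;
    using std::chrono::nanoseconds;
    const auto r0 = Reference::now();
    const auto o  = Other::now();
    const auto r1 = Reference::now();
    const auto ref_mid = duration_cast<nanoseconds>(r0.time_since_epoch()) +
                         duration_cast<nanoseconds>(r1 - r0) / 2;
    return duration_cast<nanoseconds>(o.time_since_epoch()) - ref_mid;
}

// Hypothetical helper: maps a reading of Other onto Reference's timeline
// using a previously estimated offset. Assumes equal tick rates.
inline std::chrono::nanoseconds to_reference_timeline(
    std::chrono::nanoseconds other_reading, std::chrono::nanoseconds offset) {
    return other_reading - offset;
}
```

The estimate is only as good as the bracketing interval r1 - r0, and it becomes stale as soon as either clock is adjusted or the clocks drift apart.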
As such, today we do not provide std::chrono::steady_clock, as we cannot easily provide an efficient implementation that is truly heterogeneous and conforms to the specification.
Implementing a heterogeneous C++ I/O streams library involves many challenges that we cannot overcome today.