Stopwatch Example

Overview

The stopwatch can be used to measure time for different steps in the implementation, across different parts of the implementation. It supports warm-up handling and optional CPU usage measurement. It is a singleton, so you can enable & configure it once (e.g. in the main script) and then use it anywhere else assuming that it is enabled/configured if needed.

Overview:
  • The stopwatch is a singleton. Use Stopwatch() to retrieve the instance anywhere.

  • Call enable(...) once to activate measurements; otherwise calls become no-ops with minimal overhead.

  • You can measure repeated and one-time measurements across the code parts.

  • Fore repeated measurements, the name identifies the measurement (i.e. if the name is used multiple times, this is considered multiple instances of the same measurement).

  • Repeated measurements are performed after a warm-up phase is finished; One-time measurements can be performed at any time.

  • The stopwatch relies on tracking the number of finished iterations (for ending the warm-up phase, computing average times per iteration, knowing when to print outputs). To ensure that this information is correct, you need to call finish_iter() at the end of each iteration of the outer loop of interest (e.g. training loop).

  • The output is printed in a formatted way with multiple sections. It contains the following information:

    • For repeated measurements, multiple values are displayed:

      • Average runtime per measured interval (an interval is started by calling start_meas() and ended by calling end_meas())

      • Average iteration runtime (for the loop of interest). Here, if a measurement is performed multiple times in a single iteration, it is accumulated for the result. Similarly, if a measurement is performed in only some of the iterations, this will be reflected in the average iteration runtime (e.g. if a measurement is performed every 2nd iteration, the average iteration runtime will be half the average runtime per measured interval).

      • Total runtime

      • Mean CPU usage during the measurement which was selected (by calling set_cpu_usage_meas_name()).

    • For one-time measurements

      • Values are displayed in own section

      • Measurements are not influenced by the warm-up phase, and can be performed at any time.

      • A measurement with a given name can be performed only once.

See also

The code of this example can be found in the repository under packages/optim_test_tools/examples/stopwatch_example.py.

Example

Example Implementation

Please see the notes in the code for more details.

packages/optim_test_tools/examples/stopwatch_example.py
  1# Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
  2#
  3# Licensed under the Apache License, Version 2.0 (the "License");
  4# you may not use this file except in compliance with the License.
  5# You may obtain a copy of the License at
  6#
  7#     http://www.apache.org/licenses/LICENSE-2.0
  8#
  9# Unless required by applicable law or agreed to in writing, software
 10# distributed under the License is distributed on an "AS IS" BASIS,
 11# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12# See the License for the specific language governing permissions and
 13# limitations under the License.
 14
 15'''
 16The stopwatch can be used to measure time for different steps in the implementation, across different parts of
 17the implementation. It supports warm-up handling and optional CPU usage measurement. It is a singleton, so you
 18can enable & configure it once (e.g. in the main script) and then use it anywhere else assuming that it is
 19enabled/configured if needed.
 20
 21Overview:
 22  - The stopwatch is a singleton. Use ``Stopwatch()`` to retrieve the instance anywhere.
 23  - Call ``enable(...)`` once to activate measurements; otherwise calls become no-ops with minimal overhead.
 24  - You can measure repeated and one-time measurements across the code parts.
 25  - Fore repeated measurements, the name identifies the measurement (i.e. if the name is used multiple times,
 26    this is considered multiple instances of the same measurement).
 27  - Repeated measurements are performed after a warm-up phase is finished; One-time measurements can be
 28    performed at any time.
 29  - The stopwatch relies on tracking the number of finished iterations (for ending the warm-up phase,
 30    computing average times per iteration, knowing when to print outputs). To ensure that this information is
 31    correct, you need to call :meth:`finish_iter` at the end of each iteration of the outer loop of interest
 32    (e.g. training loop).
 33  - The output is printed in a formatted way with multiple sections. It contains the following information:
 34
 35    - For repeated measurements, multiple values are displayed:
 36
 37        - Average runtime per measured interval (an interval is started by calling :meth:`start_meas` and ended
 38          by calling :meth:`end_meas`)
 39        - Average iteration runtime (for the loop of interest). Here, if a measurement is performed multiple
 40          times in a single iteration, it is accumulated for the result. Similarly, if a measurement is
 41          performed in only some of the iterations, this will be reflected in the average iteration runtime
 42          (e.g. if a measurement is performed every 2nd iteration, the average iteration runtime will be half
 43          the average runtime per measured interval).
 44        - Total runtime
 45        - Mean CPU usage during the measurement which was selected (by calling :meth:`set_cpu_usage_meas_name`).
 46
 47    - For one-time measurements
 48
 49      - Values are displayed in own section
 50      - Measurements are not influenced by the warm-up phase, and can be performed at any time.
 51      - A measurement with a given name can be performed only once.
 52'''
 53
 54import time
 55from accvlab.optim_test_tools import Stopwatch
 56
 57# @NOTE
 58# In this example, the individual sections (such as Main Script, Code Part I, Code Part II, etc.),
 59# indicate parts of the code which, in the actual use case, would be in different files, and would rely
 60# on the `Stopwatch` being a singleton to enable its use throughout the files as shown here.
 61
 62
 63# ---------------------------- Main Script ----------------------------
 64
 65stopwatch = Stopwatch()
 66# @NOTE
 67# To activate the stop watch, it needs to be enabled.
 68# If the following step is omitted, calls to method of the stopwatch will have no effect
 69# (and `get_num_nonwarmup_iters_measured()`) will always return `0``.
 70# Try commenting out the following line:
 71stopwatch.enable(num_warmup_iters=5, print_every_n_iters=2, do_cuda_sync=False)
 72# @NOTE
 73# Note that if the stopwatch is not enabled, calling its methods has minimal overhead
 74# (call to an empty method).
 75
 76# ---------------------------------------------------------------------
 77
 78
 79# ---------------------------- Code Part I ----------------------------
 80
 81# @NOTE
 82# Note that if a code part (such as this one) is used in isolation (meaning that there is no other code which
 83# already enabled the stopwatch), the stopwatch will be disabled and any related calls will have no effect.
 84# The overhead for the stopwatch is minimal in this case (call to an empty function).
 85
 86# @NOTE
 87# Obtain the stopwatch instance (assuming that this is a separate file without access to the previously
 88# obtained one).
 89stopwatch = Stopwatch()
 90
 91# @NOTE: Set up "meas1" to also measure mean CPU usage (can e.g. be the training loop in actual use case).
 92stopwatch.set_cpu_usage_meas_name("meas1")
 93
 94# @NOTE
 95# One-time measurements can be performed at any point in the code. They are not affected by warm-up
 96# iterations and can be performed only once per used name, and are reported as such
 97# (i.e. in own section and without averages etc.).
 98stopwatch.start_one_time_measurement("complete_run")
 99
100# @NOTE: Another one-time measurement (start and end).
101stopwatch.start_one_time_measurement("preparation")
102time.sleep(0.1)
103stopwatch.end_one_time_measurement("preparation")
104
105
106num_iters = 16
107for i in range(num_iters):
108
109    # -------------------------- Code Part II --------------------------
110
111    # @NOTE
112    # Obtain the stopwatch instance and use it to start a measurement (ended in another part of the code).
113    stopwatch = Stopwatch()
114    stopwatch.start_meas("meas1")
115    # Dummy workload to simulate some work.
116    k = 0
117    for j in range(1000000):
118        k += float(j) ** 0.3 + 1.0 / (float(j) + 1.0)
119    # ... continue and at some point call code part III
120
121    # ------------------------------------------------------------------
122
123    # -------------------------- Code Part III -------------------------
124
125    # @NOTE: Here, we perform a measurement in another part of the code.
126    stopwatch = Stopwatch()
127    stopwatch.start_meas("meas2")
128    time.sleep(0.05)
129    stopwatch.end_meas("meas2")
130    # ... continue and at some point call code part IV
131
132    # ------------------------------------------------------------------
133
134    # ------------------------- Code Part IV  --------------------------
135
136    # @NOTE: End the measurement started in part II.
137    stopwatch.end_meas("meas1")
138    # ... continue and at some point call code part V
139
140    # ------------------------------------------------------------------
141
142    # --------------------------- Code Part V --------------------------
143
144    stopwatch = Stopwatch()
145    # @NOTE
146    # Measurements may also be performed in only some of the iterations. In this case,
147    # the average run-time per iteration will be lower than the average run-time
148    # per measured interval.
149    if i % 3 == 2:
150        stopwatch.start_meas("meas3")
151        time.sleep(0.01)
152        stopwatch.end_meas("meas3")
153    # @NOTE
154    # We can perform a measurement with the same name multiple times in a
155    # single iteration. This can be in the same or different code parts. In this case,
156    # the average run-time per iteration will be higher than the average run-time
157    # per measured interval.
158    stopwatch.start_meas("meas2")
159    time.sleep(0.01)
160    stopwatch.end_meas("meas2")
161    # ... continue and at some point call code part VI
162
163    # ------------------------------------------------------------------
164
165    # -------------------------- Code Part VI ---------------------------
166
167    # @NOTE
168    # Finish the iteration. This is very important to
169    #   - Ensure that the warm-up iterations are counted down correctly and measurements are
170    #     started after the warm-up iterations.
171    #   - Ensure that the measurements are correctly accumulated and printed.
172    stopwatch = Stopwatch()
173    stopwatch.finish_iter()
174
175    # ------------------------------------------------------------------
176
177
178# ---------------------------- Code Part VI ----------------------------
179
180# @NOTE
181# End the one-time measurement
182stopwatch.end_one_time_measurement("complete_run")
183# @NOTE
184# Print the final result
185print("Final measurement:")
186stopwatch.print_eval_times()
187
188# ----------------------------------------------------------------------

Resulting Output

Here, we show an example output for the last time that the stopwatch was used to print the results. Note that the actual result will vary depending on the actual runtime of the code parts. Note that for meas2 and meas3, the average runtime per iteration is different from the average runtime per measured interval. This is because these measurements were performed multiple times per iteration / only in some of the iterations, respectively.

######################### Stopwatch #########################
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Num. measured iterations: 11
. . . . . . . . . . . . . . . . . . . . . . . . .
Average runtime per iteration:
  `meas1`: 0.16802375966852362
  `meas2`: 0.060098127885298294
  `meas3`: 0.0036563439802689986
. . . . . . . . . . . . . . . . . . . . . . . . .
Average runtime per measured interval:
  `meas1`: 0.16802375966852362
  `meas2`: 0.030049063942649147
  `meas3`: 0.010054945945739746
. . . . . . . . . . . . . . . . . . . . . . . . .
Total runtime:
  `meas1`: 1.8482613563537598
  `meas2`: 0.6610794067382812
  `meas3`: 0.040219783782958984
. . . . . . . . . . . . . . . . . . . . . . . . .
Mean CPU usage during `meas1`: 2.655618891226586
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
One-time measurements:
  `complete_run`: 3.0058188438415527
  `preparation`: 0.10008645057678223
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#############################################################