perf

Utility functions for performance measurement.

Classes

Timer

A timer that can be used as a context manager or as a decorator.

AccumulatingTimer

A timer that accumulates time across multiple calls and works for both CUDA and non-CUDA operations.

Functions

clear_cuda_cache

Clear the CUDA cache.

get_cuda_memory_stats

Get the memory usage of the specified GPU, in bytes.

report_memory

Simple GPU memory report.

class AccumulatingTimer

Bases: ContextDecorator

A timer that accumulates time across multiple calls and works for both CUDA and non-CUDA operations.

__init__(name='')

Initialize AccumulatingTimer.

Parameters:
  • name – Name of the timer for reporting
  • use_cuda – Whether to synchronize CUDA before timing

classmethod get_call_count(name)

Get the number of calls for a timer.

classmethod get_total_time(name)

Get the total accumulated time for a timer in milliseconds.

classmethod report()

Report the accumulated times and call counts.

classmethod reset()

Reset the accumulated times and call counts.

start()

Start the timer.

Return type:

None

stop()

End the timer and return the elapsed time in milliseconds.

Return type:

float
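The accumulation pattern described above can be sketched with a small stand-in class. This is not the module's implementation: SketchAccumulatingTimer and its internals are hypothetical, it times on the CPU only with time.perf_counter, and the comment about CUDA synchronization is an assumption about how a CUDA-aware version might behave. It only illustrates how per-name totals and call counts can be kept at class level while instances act as context managers or decorators.

```python
import time
from contextlib import ContextDecorator


class SketchAccumulatingTimer(ContextDecorator):
    """Hypothetical stand-in; not the module's actual implementation."""

    # Class-level state shared by every instance, keyed by timer name.
    _total_ms = {}
    _calls = {}

    def __init__(self, name=""):
        self.name = name
        self._start = None

    def start(self):
        # Assumption: a CUDA-aware timer would synchronize the device here
        # (for example with torch.cuda.synchronize()) before reading the clock.
        self._start = time.perf_counter()

    def stop(self):
        elapsed_ms = (time.perf_counter() - self._start) * 1000.0
        cls = type(self)
        cls._total_ms[self.name] = cls._total_ms.get(self.name, 0.0) + elapsed_ms
        cls._calls[self.name] = cls._calls.get(self.name, 0) + 1
        return elapsed_ms

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, *exc):
        self.stop()
        return False  # never swallow exceptions

    @classmethod
    def get_call_count(cls, name):
        return cls._calls.get(name, 0)

    @classmethod
    def get_total_time(cls, name):
        return cls._total_ms.get(name, 0.0)

    @classmethod
    def reset(cls):
        cls._total_ms.clear()
        cls._calls.clear()


@SketchAccumulatingTimer("square")
def square(x):
    return x * x


# Decorator form: three timed calls under the name "square".
results = [square(i) for i in range(3)]

# Context-manager form: a fourth measurement accumulated under the same name.
with SketchAccumulatingTimer("square"):
    total = sum(results)

call_count = SketchAccumulatingTimer.get_call_count("square")
total_ms = SketchAccumulatingTimer.get_total_time("square")
```

Keying the totals by name rather than by instance is what lets separate decorator and with-block uses contribute to the same measurement.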

class Timer

Bases: ContextDecorator

A timer that can be used as a context manager or as a decorator.

__init__(name='')

Initialize Timer.

start()

Start the timer.

stop()

End the timer and return the elapsed time.

Return type:

float
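Because Timer derives from ContextDecorator, it can in principle be used in three ways: explicit start/stop calls, a with block, or a decorator. The sketch below shows those three forms on a hypothetical stand-in class, SketchTimer. The class body, the elapsed attribute, and the choice of seconds as the unit are assumptions made for illustration; the source does not state the unit returned by Timer.stop().

```python
import time
from contextlib import ContextDecorator


class SketchTimer(ContextDecorator):
    """Hypothetical stand-in used only to illustrate the usage patterns."""

    def __init__(self, name=""):
        self.name = name
        self.elapsed = None  # hypothetical attribute holding the last result
        self._start = None

    def start(self):
        self._start = time.perf_counter()

    def stop(self):
        # Seconds are an assumption; the documented Timer does not state units.
        self.elapsed = time.perf_counter() - self._start
        return self.elapsed

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, *exc):
        self.stop()
        return False  # let exceptions propagate


# 1) Explicit start/stop.
manual = SketchTimer("manual")
manual.start()
sum(range(1000))
manual_elapsed = manual.stop()

# 2) Context manager.
with SketchTimer("block") as block_timer:
    sum(range(1000))


# 3) Decorator: every call to work() is timed.
@SketchTimer("decorated")
def work(n):
    return sum(range(n))


value = work(10)
```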

clear_cuda_cache()

Clear the CUDA cache.
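As a rough illustration of what clearing the CUDA cache can involve, the sketch below assumes a PyTorch backend, which the source does not confirm. sketch_clear_cuda_cache is a hypothetical helper, not this module's function; it relies only on torch.cuda.is_available() and torch.cuda.empty_cache(), and it does nothing when PyTorch or a GPU is missing.

```python
import gc
import importlib.util


def sketch_clear_cuda_cache():
    """Hypothetical sketch: return True only if a CUDA cache was cleared."""
    if importlib.util.find_spec("torch") is None:
        return False  # PyTorch is not installed, so there is nothing to clear.

    import torch

    if not torch.cuda.is_available():
        return False  # No usable GPU on this machine.

    gc.collect()              # Drop unreachable Python objects holding tensors.
    torch.cuda.empty_cache()  # Release cached, unused blocks held by the allocator.
    return True


cleared = sketch_clear_cuda_cache()
```

Running the garbage collector first is a design choice in this sketch: memory still referenced by live tensors is not released by emptying the cache.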

get_cuda_memory_stats(device=None)

Get the memory usage of the specified GPU, in bytes.
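The shape of the value returned by get_cuda_memory_stats is not documented here. One plausible way to collect per-device byte counts, assuming a PyTorch backend, is sketched below. sketch_cuda_memory_stats and its dictionary keys are hypothetical; the torch.cuda calls it uses are real PyTorch functions that accept an optional device argument and return integers in bytes.

```python
import importlib.util


def sketch_cuda_memory_stats(device=None):
    """Hypothetical sketch: byte counts for one GPU, or {} without CUDA."""
    if importlib.util.find_spec("torch") is None:
        return {}

    import torch

    if not torch.cuda.is_available():
        return {}

    # device=None means "the current device" for all four calls.
    return {
        "allocated": torch.cuda.memory_allocated(device),
        "max_allocated": torch.cuda.max_memory_allocated(device),
        "reserved": torch.cuda.memory_reserved(device),
        "max_reserved": torch.cuda.max_memory_reserved(device),
    }


stats = sketch_cuda_memory_stats()
```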

report_memory(name='', rank=0, device=None)

Simple GPU memory report.
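The output format of report_memory is not specified in this reference. The sketch below shows one way a one-line report could be assembled from a name, a rank, and a dictionary of byte counts. sketch_format_memory_report, the stats argument, and the MiB formatting are all assumptions made for illustration.

```python
def sketch_format_memory_report(name="", rank=0, stats=None):
    """Hypothetical sketch: format byte counts as a single report line."""
    prefix = f"[rank {rank}] {name}".rstrip()
    if not stats:
        return f"{prefix} | no CUDA data"
    # Convert bytes to MiB (1 MiB = 2**20 bytes) for readability.
    parts = [f"{key}: {value / 2**20:.1f} MiB" for key, value in stats.items()]
    return f"{prefix} | " + ", ".join(parts)


line = sketch_format_memory_report(
    "after forward",
    rank=0,
    stats={"allocated": 3 * 2**20, "reserved": 4 * 2**20},
)
print(line)  # [rank 0] after forward | allocated: 3.0 MiB, reserved: 4.0 MiB
```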