perf
Utility functions for performance measurement.
Classes

AccumulatingTimer – A timer that accumulates time across multiple calls and works for both CUDA and non-CUDA operations.
Timer – A Timer that can be used as a decorator as well.

Functions

clear_cuda_cache – Clear the CUDA cache.
get_cuda_memory_stats – Get memory usage of the specified GPU, in bytes.
get_used_gpu_mem_fraction – Get used GPU memory as a fraction of total memory.
report_memory – Simple GPU memory report.
- class AccumulatingTimer
Bases: ContextDecorator

A timer that accumulates time across multiple calls and works for both CUDA and non-CUDA operations.
- __init__(name='')
Initialize AccumulatingTimer.
- Parameters:
name – Name of the timer for reporting.
use_cuda – Whether to synchronize CUDA before timing.
- classmethod get_call_count(name)
Get the number of calls for a timer.
- classmethod get_total_time(name)
Get the total accumulated time for a timer in milliseconds.
- classmethod report()
Report the accumulated times and call counts.
- classmethod reset()
Reset the accumulated times and call counts.
- start()
Start the timer.
- Return type:
None
- stop()
End the timer and return the elapsed time in milliseconds.
- Return type:
float
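The module's internals are not shown here, but the interface above could be realized along these lines. This is a minimal sketch, assuming `time.perf_counter` for wall-clock timing and class-level dictionaries for the accumulated totals; CUDA synchronization is omitted for brevity, and the real implementation may differ.

```python
import time
from contextlib import ContextDecorator


class AccumulatingTimer(ContextDecorator):
    """Sketch of a timer that accumulates elapsed time across calls."""

    _totals = {}  # name -> total elapsed time in milliseconds
    _counts = {}  # name -> number of start/stop cycles

    def __init__(self, name=''):
        self.name = name
        self._t0 = None

    def start(self):
        # Record the starting timestamp.
        self._t0 = time.perf_counter()

    def stop(self):
        # Accumulate elapsed time under this timer's name and return it (ms).
        elapsed_ms = (time.perf_counter() - self._t0) * 1000.0
        self._totals[self.name] = self._totals.get(self.name, 0.0) + elapsed_ms
        self._counts[self.name] = self._counts.get(self.name, 0) + 1
        return elapsed_ms

    # ContextDecorator wires __enter__/__exit__ so the timer also works
    # as a decorator.
    def __enter__(self):
        self.start()
        return self

    def __exit__(self, *exc):
        self.stop()
        return False

    @classmethod
    def get_total_time(cls, name):
        return cls._totals.get(name, 0.0)

    @classmethod
    def get_call_count(cls, name):
        return cls._counts.get(name, 0)

    @classmethod
    def reset(cls):
        cls._totals.clear()
        cls._counts.clear()


# Usage: two timed blocks under the same name accumulate together.
with AccumulatingTimer('demo'):
    sum(range(1000))
with AccumulatingTimer('demo'):
    sum(range(1000))
print(AccumulatingTimer.get_call_count('demo'))  # 2
```

Because the totals live on the class rather than the instance, every `AccumulatingTimer('demo')` created anywhere in the program contributes to the same bucket, which is what makes a single `report()` across the run possible.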
- class Timer
Bases: ContextDecorator

A Timer that can be used as a decorator as well.
- __init__(name='')
Initialize Timer.
- start()
Start the timer.
- stop()
End the timer.
- Return type:
float
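Since `Timer` derives from `ContextDecorator`, the same instance syntax works both as a context manager and as a function decorator. The sketch below illustrates that dual use with a hypothetical minimal implementation (the real class may track more state):

```python
import time
from contextlib import ContextDecorator


class Timer(ContextDecorator):
    """Minimal sketch: usable as `with Timer(...)` or as `@Timer(...)`."""

    def __init__(self, name=''):
        self.name = name

    def start(self):
        self._t0 = time.perf_counter()

    def stop(self):
        # Return the elapsed time in milliseconds.
        return (time.perf_counter() - self._t0) * 1000.0

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, *exc):
        self.elapsed_ms = self.stop()
        return False


# Decorator form: every call to `square` is timed transparently.
@Timer('square')
def square(x):
    return x * x


print(square(7))  # 49

# Context-manager form.
with Timer('block') as t:
    sum(range(1000))
```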
- clear_cuda_cache()
Clear the CUDA cache.
- get_cuda_memory_stats(device=None)
Get memory usage of the specified GPU, in bytes.
- get_used_gpu_mem_fraction(device='cuda:0')
Get used GPU memory as a fraction of total memory.
- Parameters:
device – Device identifier (default: "cuda:0").
- Returns:
Fraction of GPU memory currently used (0.0 to 1.0). Returns 0.0 if CUDA is not available.
- Return type:
float
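A function with this contract could be sketched as follows. This assumes PyTorch and its `torch.cuda.mem_get_info` call, which returns `(free, total)` in bytes; the guard clauses implement the documented 0.0 fallback when CUDA is unavailable.

```python
def get_used_gpu_mem_fraction(device='cuda:0'):
    """Sketch: fraction of the device's memory currently in use (0.0 to 1.0).

    Returns 0.0 if CUDA (or PyTorch itself) is not available.
    """
    try:
        import torch
    except ImportError:
        return 0.0
    if not torch.cuda.is_available():
        return 0.0
    # mem_get_info reports (free_bytes, total_bytes) for the device.
    free, total = torch.cuda.mem_get_info(device)
    return (total - free) / total


frac = get_used_gpu_mem_fraction()
print(0.0 <= frac <= 1.0)  # True
```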
- report_memory(name='', rank=0, device=None)
Simple GPU memory report.
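A "simple report" with this signature might look like the sketch below. It assumes PyTorch's `torch.cuda.memory_allocated` and `torch.cuda.memory_reserved` (both real calls) and prints one line per invocation; the exact format of the real report is not shown in this reference.

```python
def report_memory(name='', rank=0, device=None):
    """Sketch: print a one-line GPU memory report for `device`.

    Prints a fallback message if CUDA (or PyTorch) is unavailable.
    """
    try:
        import torch
    except ImportError:
        print(f'[{name}] rank {rank}: CUDA not available')
        return
    if not torch.cuda.is_available():
        print(f'[{name}] rank {rank}: CUDA not available')
        return
    # Convert bytes to MiB for readability.
    allocated = torch.cuda.memory_allocated(device) / 2**20
    reserved = torch.cuda.memory_reserved(device) / 2**20
    print(f'[{name}] rank {rank}: '
          f'allocated {allocated:.1f} MiB, reserved {reserved:.1f} MiB')


# Typical call site: tag the report with the training phase.
report_memory('after forward pass')
```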