bench_implicit_gemm
Latency benchmark: implicit GEMM (quant / non-quant) vs cuDNN conv3d.
- Usage:
python -m experimental.conv.bench_implicit_gemm
python -m experimental.conv.bench_implicit_gemm --shapes wan22
python -m experimental.conv.bench_implicit_gemm --shapes all --warmup 20 --iters 100
Functions
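bench_fn(fn, warmup, iters) | Benchmark a callable, return median time in ms.
get_shapes(name) | Return list of benchmark shapes by name or all shapes.
main() | Entry point for the benchmark CLI.
run_benchmark(shapes_name, warmup, iters, fp4_block_size) | Run latency benchmark for the given shapes.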
- bench_fn(fn, warmup, iters)
Benchmark a callable, return median time in ms.
- Parameters:
warmup (int)
iters (int)
- Return type:
float
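The warmup-then-median pattern described for bench_fn can be sketched as below. This is a minimal CPU-only illustration and not the module's actual implementation: the name median_latency_ms is hypothetical, and it times with time.perf_counter. A GPU benchmark would additionally need to synchronize the device (for example with torch.cuda.synchronize()) before reading the clock, since kernel launches are asynchronous.

```python
import statistics
import time


def median_latency_ms(fn, warmup, iters):
    """Hypothetical stand-in for bench_fn: median wall-clock time of fn() in ms."""
    # Warmup calls run but are not timed (caches, lazy initialization, autotuning).
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)  # seconds -> milliseconds

    # The median is less sensitive to occasional slow outliers than the mean.
    return statistics.median(samples)
```

In this sketch iters must be at least 1, because statistics.median raises StatisticsError on an empty list.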
- get_shapes(name)
Return list of benchmark shapes by name or all shapes.
- Parameters:
name (str)
- main()
Entry point for the benchmark CLI.
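The flags shown under Usage (--shapes, --warmup, --iters) suggest an argparse-style command line. The sketch below shows one way such a parser could look. It is an assumption-laden illustration, not the module's code: build_parser is a hypothetical helper, the default values are placeholders, and no flag is shown for fp4_block_size because the usage text does not mention one.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags listed in the usage examples."""
    parser = argparse.ArgumentParser(prog="bench_implicit_gemm")
    # Defaults below are placeholders; the real defaults are not documented here.
    parser.add_argument("--shapes", type=str, default="all",
                        help="name of a shape set, or 'all'")
    parser.add_argument("--warmup", type=int, default=20,
                        help="number of untimed warmup iterations")
    parser.add_argument("--iters", type=int, default=100,
                        help="number of timed iterations")
    return parser
```

A main() built on this would parse the arguments and pass them on to run_benchmark(shapes_name, warmup, iters, fp4_block_size).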
- run_benchmark(shapes_name, warmup, iters, fp4_block_size)
Run latency benchmark for the given shapes.
- Parameters:
shapes_name (str)
warmup (int)
iters (int)
fp4_block_size (int)