bench_implicit_gemm

Latency benchmark: implicit GEMM (quant / non-quant) vs cuDNN conv3d.

Usage:

    python -m experimental.conv.bench_implicit_gemm
    python -m experimental.conv.bench_implicit_gemm --shapes wan22
    python -m experimental.conv.bench_implicit_gemm --shapes all --warmup 20 --iters 100

Functions

bench_fn

Benchmark a callable and return the median time in milliseconds.

get_shapes

Return the benchmark shapes for a given name, or all shapes.

main

Entry point for the benchmark CLI.

run_benchmark

Run latency benchmark for the given shapes.

bench_fn(fn, warmup, iters)

Benchmark a callable and return the median time in milliseconds.

Parameters:
  • fn (callable)

  • warmup (int)

  • iters (int)

Return type:

float
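
As an illustration of the warmup-then-measure pattern described for bench_fn, here is a minimal CPU-only sketch. The name bench_fn_sketch and the use of time.perf_counter are assumptions made for this example; the module's actual timing mechanism is not documented here. When the callable launches asynchronous GPU work, a device synchronization would be needed before each clock read, which this sketch omits.

```python
import statistics
import time


def bench_fn_sketch(fn, warmup, iters):
    """Time a callable and return the median latency in milliseconds.

    Hypothetical stand-in for bench_fn; wall-clock timing is an assumption.
    """
    # Warmup calls run untimed so one-off costs (caches, lazy init) are excluded.
    for _ in range(warmup):
        fn()

    samples_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        # Convert elapsed seconds to milliseconds.
        samples_ms.append((time.perf_counter() - start) * 1e3)

    # The median is less sensitive to occasional slow iterations than the mean.
    return statistics.median(samples_ms)


if __name__ == "__main__":
    latency = bench_fn_sketch(lambda: sum(range(10_000)), warmup=20, iters=100)
    print(f"median latency: {latency:.4f} ms")
```

The sketch requires iters to be at least 1, since the median of an empty sample list is undefined.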

get_shapes(name)

Return the benchmark shapes for a given name, or all shapes.

Parameters:
  • name (str)
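
One way a lookup like this could work is a name-keyed registry with a special "all" selector, as suggested by the --shapes wan22 and --shapes all invocations in the usage. The sketch below is hypothetical throughout: the registry name, the "other" group, and every entry are placeholders, not the module's real shape list or shape format.

```python
# Hypothetical registry. The entries are placeholders and do NOT reflect the
# real benchmark shapes or how the module represents a shape.
SHAPE_REGISTRY_SKETCH = {
    "wan22": [{"label": "wan22_placeholder_0"}, {"label": "wan22_placeholder_1"}],
    "other": [{"label": "other_placeholder_0"}],
}


def get_shapes_sketch(name):
    """Return the shapes registered under `name`, or every shape for "all"."""
    if name == "all":
        # Flatten all groups into a single list, preserving registry order.
        return [shape for group in SHAPE_REGISTRY_SKETCH.values() for shape in group]
    if name not in SHAPE_REGISTRY_SKETCH:
        known = ", ".join(sorted(SHAPE_REGISTRY_SKETCH))
        raise ValueError(f"unknown shape set {name!r}; expected 'all' or one of: {known}")
    # Return a copy so callers cannot mutate the registry.
    return list(SHAPE_REGISTRY_SKETCH[name])
```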

main()

Entry point for the benchmark CLI.
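
The flags shown in the usage (--shapes, --warmup, --iters) could be parsed with argparse roughly as follows. This is a sketch, not the module's actual entry point: the function names and all default values are assumptions, and no flag is shown for fp4_block_size because the usage does not document one.

```python
import argparse


def build_parser_sketch():
    """Build a parser for the flags that appear in the documented usage."""
    parser = argparse.ArgumentParser(prog="bench_implicit_gemm")
    # Defaults below are assumed for illustration; the real defaults are not documented.
    parser.add_argument("--shapes", type=str, default="all", help="shape set name, or 'all'")
    parser.add_argument("--warmup", type=int, default=20, help="untimed warmup iterations")
    parser.add_argument("--iters", type=int, default=100, help="timed iterations")
    return parser


def main_sketch(argv=None):
    """Parse arguments and return them; a real entry point would run the benchmark."""
    args = build_parser_sketch().parse_args(argv)
    return args


if __name__ == "__main__":
    print(main_sketch())
```

Passing argv explicitly, for example main_sketch(["--shapes", "wan22"]), makes the parser easy to exercise without touching sys.argv.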

run_benchmark(shapes_name, warmup, iters, fp4_block_size)

Run latency benchmark for the given shapes.

Parameters:
  • shapes_name (str)

  • warmup (int)

  • iters (int)

  • fp4_block_size (int)