Performance Tuning Guide
========================

.. include:: introduction.md
   :parser: myst_parser.sphinx_

.. toctree::
   :maxdepth: 1

   benchmarking-default-performance
   useful-build-time-flags
   tuning-max-batch-size-and-max-num-tokens
   deciding-model-sharding-strategy
   fp8-quantization
   useful-runtime-flags