Model Optimizer
quant_batchnorm
Quantized batch normalization module.
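Conceptually, a quantized batch-normalization layer fake-quantizes its input (quantize to a low-bit integer grid, then dequantize back to float) before applying the usual normalization. The sketch below illustrates that idea in plain Python; it is a simplified conceptual model, not the modelopt implementation, and the function names (`fake_quant`, `batchnorm`, `quant_batchnorm`) are hypothetical.

```python
import math

def fake_quant(x, num_bits=8, amax=None):
    """Symmetric fake quantization: quantize to an integer grid, then dequantize.

    Hypothetical helper for illustration; not the modelopt API.
    """
    if amax is None:
        amax = max(abs(v) for v in x)  # per-tensor dynamic range
    qmax = 2 ** (num_bits - 1) - 1     # e.g. 127 for int8
    scale = amax / qmax if amax else 1.0
    # round to the nearest integer level, clamp, map back to float
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in x]

def batchnorm(x, eps=1e-5):
    """Normalize a 1-D batch to zero mean and unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def quant_batchnorm(x):
    """Quantized batch norm: fake-quantize the input, then normalize."""
    return batchnorm(fake_quant(x))
```

In practice a quantized module like this replaces the float module during calibration/quantization conversion, so the downstream graph sees the quantization error the deployed integer kernel will introduce.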