Installation
System requirements
Model Optimizer (nvidia-modelopt) currently has the following system requirements:
| OS | Linux |
| Architecture | x86_64 |
| Python | >=3.8,<3.13 |
| PyTorch | >=1.11 |
| CUDA | >=11.8 (Recommended) |
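As a quick sanity check, the Python version bounds above can be verified from the interpreter itself. The helper below is purely illustrative and not part of ModelOpt:

```python
import sys

def python_supported(version=None):
    """Return True if (major, minor) satisfies ModelOpt's >=3.8,<3.13 requirement."""
    if version is None:
        version = sys.version_info
    return (3, 8) <= (version[0], version[1]) < (3, 13)

print(python_supported())  # True on any supported interpreter
```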
Install Model Optimizer
ModelOpt, including its dependencies, can be installed via pip. Please review the
license terms of ModelOpt and any dependencies before use.
Setting up a virtual environment
We recommend setting up a virtual environment if you don’t have one already. Run the following
commands to set up and activate a conda virtual environment named modelopt with Python 3.12:
conda create -n modelopt python=3.12 pip
conda activate modelopt
(Optional) Install desired PyTorch version
By default, the latest PyTorch version (torch>=1.11) available on pip will be installed.
If you want to install a specific PyTorch build for a specific CUDA version, please first
follow the official PyTorch instructions to install your desired version.
For example, to install the latest torch>=1.11 with CUDA 11.8, run:
pip install torch --extra-index-url https://download.pytorch.org/whl/cu118
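To confirm which CUDA build of PyTorch actually got installed, you can inspect torch.version.cuda. The helper below is a hypothetical convenience, not a ModelOpt API; it returns None when torch itself is missing:

```python
import importlib.util

def torch_cuda_build():
    """Return (torch version, CUDA build) of the installed torch, or None if torch is absent.

    The CUDA entry is itself None for CPU-only torch builds.
    """
    if importlib.util.find_spec("torch") is None:
        return None
    import torch
    return (torch.__version__, torch.version.cuda)

print(torch_cuda_build())
```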
Identify correct partial dependencies
Note that when installing nvidia-modelopt without optional dependencies, only the barebones
requirements are installed, and none of the modules will work without the appropriate optional
dependencies or the [all] optional dependencies. Below is a list of the optional dependencies that
must be installed to correctly use the corresponding modules:
| Module | Optional dependencies |
|---|---|
Additionally, we support the following 3rd-party plugins:
| Third-party package | Optional dependencies |
|---|---|
Install Model Optimizer (nvidia-modelopt)
pip install "nvidia-modelopt[all]" --no-cache-dir --extra-index-url https://pypi.nvidia.com
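A minimal way to confirm the package landed on your import path is to check for the modelopt import package. This is a generic helper, not a ModelOpt API:

```python
import importlib.util

def package_available(name):
    """Return True if the given top-level package can be imported."""
    return importlib.util.find_spec(name) is not None

# Should print True after `pip install "nvidia-modelopt[all]"`.
print(package_available("modelopt"))
```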
Check installation
Tip
When you use ModelOpt’s PyTorch quantization APIs for the first time, it will compile the fast quantization kernels against your installed torch and, if available, CUDA. This may take a few minutes, but subsequent quantization calls will be much faster. To trigger the compilation now and check that it succeeds, run the following command:
python -c "import modelopt.torch.quantization.extensions as ext; print(ext.cuda_ext); print(ext.cuda_ext_fp8)"