Installing on Windows
Note
The Windows release of TensorRT-LLM is currently in beta. We recommend checking out the v0.16.0 tag for the most stable experience.
Note
TensorRT-LLM on Windows only supports single-GPU execution.
Prerequisites
Clone this repository using Git for Windows.
Install the dependencies in one of two ways:
Install all dependencies together.
Run the provided PowerShell script setup_env.ps1, located under the /windows/ folder, which installs Python and CUDA 12.6.3 automatically with default settings. Run PowerShell as Administrator to use the script.
./setup_env.ps1 [-skipCUDA] [-skipPython]
Close and re-open any existing PowerShell or Git Bash windows so they pick up the new Path modified by the setup_env.ps1 script above.
Install the dependencies one at a time.
Install Python 3.10.
Select Add python.exe to PATH at the start of the installation. The installation may only add the python command, but not the python3 command. Navigate to the installation path %USERPROFILE%\AppData\Local\Programs\Python\Python310 (AppData is a hidden folder) and copy python.exe to python3.exe.
Install CUDA 12.6.3 Toolkit. Use the Express Installation option. Installation may require a restart.
If using a conda environment, run the following command before installing TensorRT-LLM.
conda install -c conda-forge pyarrow
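Once the prerequisites are in place, a quick check from a freshly opened shell can confirm that the expected commands resolve on Path. The following is a minimal sketch, not part of the TensorRT-LLM tooling: the helper names find_tools and ensure_python3_alias are hypothetical, and it assumes the CUDA Toolkit installer exposes nvcc on Path. The second helper mirrors the manual python.exe to python3.exe copy described in the Python installation step.

```python
import shutil
from pathlib import Path

# Commands the prerequisites are expected to put on Path.
# "nvcc" is an assumption: it is the compiler shipped with the CUDA Toolkit.
REQUIRED_TOOLS = ("git", "python", "python3", "nvcc")


def find_tools(names=REQUIRED_TOOLS, path=None):
    """Map each command name to its resolved location, or None if not found."""
    return {name: shutil.which(name, path=path) for name in names}


def ensure_python3_alias(install_dir):
    """Copy python.exe to python3.exe inside install_dir when the copy is missing.

    Returns True if python3.exe exists after the call.
    """
    source = Path(install_dir) / "python.exe"
    target = Path(install_dir) / "python3.exe"
    if source.is_file() and not target.exists():
        shutil.copy2(source, target)
    return target.is_file()


if __name__ == "__main__":
    for name, location in find_tools().items():
        print(f"{name:8} {location or 'NOT FOUND on Path'}")
```

If python3 is reported as missing while python is found, ensure_python3_alias can be pointed at the Python310 installation folder named above; the other entries only need a reopened shell or a reinstall of the corresponding prerequisite.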
Steps
Install TensorRT-LLM.
If you have an existing TensorRT installation (from older versions of tensorrt_llm), please execute
pip uninstall -y tensorrt tensorrt_libs tensorrt_bindings
pip uninstall -y nvidia-cublas-cu12 nvidia-cuda-nvrtc-cu12 nvidia-cuda-runtime-cu12 nvidia-cudnn-cu12
before installing TensorRT-LLM with the following command.
pip install tensorrt_llm==0.16.0 --extra-index-url https://download.pytorch.org/whl/
Run the following command to verify that your TensorRT-LLM installation is working properly.
python -c "import tensorrt_llm; print(tensorrt_llm._utils.trt_version())"
Build the model.
Deploy the model.
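Before running the install step above, it can help to see which of the packages named in the pip uninstall commands are actually present in the active environment. The sketch below uses only the standard library; the helper name installed_versions is hypothetical, and the package list is copied from the uninstall commands. Note that installing TensorRT-LLM is expected to pull in fresh copies of some of these packages, so the check is only meaningful before the install.

```python
from importlib import metadata

# Distribution names taken from the pip uninstall commands above.
STALE_CANDIDATES = (
    "tensorrt",
    "tensorrt_libs",
    "tensorrt_bindings",
    "nvidia-cublas-cu12",
    "nvidia-cuda-nvrtc-cu12",
    "nvidia-cuda-runtime-cu12",
    "nvidia-cudnn-cu12",
)


def installed_versions(names):
    """Return {name: version} for the distributions that are installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this environment; nothing to uninstall.
            pass
    return found


if __name__ == "__main__":
    leftovers = installed_versions(STALE_CANDIDATES)
    for name, version in leftovers.items():
        print(f"present: {name} {version}")
    if not leftovers:
        print("none of the listed packages are installed")
```

The same helper can be called with ["tensorrt_llm"] after installation to confirm that the reported version matches the one pinned in the pip install command.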
Known Issues
OSError: exception: access violation reading 0x0000000000000000 during import tensorrt_llm or trtllm-build.
This may be caused by an outdated Microsoft Visual C++ Redistributable version. Please install the latest MSVC and retry. Check the system path to make sure the latest version installed in System32 is searched first. Check dependencies to make sure no other packages are using an outdated version (e.g. package pyarrow might contain an outdated MSVC DLL).
OSError: [WinError 126] The specified module could not be found. Error loading “…\Lib\site-packages\torch\lib\fbgemm.dll” or one of its dependencies.
Installing the latest [Build Tools for Visual Studio 2022](https://visualstudio.microsoft.com/downloads/#build-tools-for-visual-studio-2022) will resolve the issue.
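For the access violation issue above, the two checks (search order on the system path, and outdated copies bundled inside other packages) can be approximated with a short script. This is a rough sketch under stated assumptions: the helper names find_on_path and find_bundled are hypothetical, msvcp140.dll and vcruntime140.dll are used as representative MSVC runtime DLL names, and only Path and a package folder are inspected. The Windows loader also consults other locations, so treat the output as a hint rather than a definitive answer.

```python
import os
import sysconfig
from pathlib import Path

# Representative MSVC runtime DLL names (an assumption; adjust as needed).
MSVC_DLLS = ("msvcp140.dll", "vcruntime140.dll")


def find_on_path(dll_name, path=None):
    """List every copy of dll_name on the search path, in search order."""
    raw = os.environ.get("PATH", "") if path is None else path
    hits = []
    for entry in raw.split(os.pathsep):
        if not entry:
            continue  # skip empty entries produced by stray separators
        candidate = Path(entry) / dll_name
        if candidate.is_file() and candidate not in hits:
            hits.append(candidate)
    return hits


def find_bundled(dll_name, root):
    """List copies of dll_name anywhere under root, e.g. site-packages."""
    return sorted(Path(root).rglob(dll_name))


if __name__ == "__main__":
    site_packages = sysconfig.get_paths()["purelib"]
    for dll in MSVC_DLLS:
        print(f"{dll} on Path (first match is searched first):")
        for hit in find_on_path(dll):
            print(f"  {hit}")
        print(f"{dll} bundled under {site_packages}:")
        for hit in find_bundled(dll, site_packages):
            print(f"  {hit}")
```

If the first match on Path is not the System32 copy, or a package folder contains its own copy of the DLL, that location is a reasonable place to start investigating.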