TensorRT Edge-LLM Documentation
Welcome to the TensorRT Edge-LLM documentation. This library provides optimized inference for large language models (LLMs) and vision-language models (VLMs) on edge devices.
Getting Started
- Overview
- Supported Models
- Installation
- Quick Start Guide
- Examples and Complete Workflows
  - Overview
  - Complete Workflow Summary
  - Example 1: VLM (Vision-Language Model) Inference
  - Example 2: LLM EAGLE Speculative Decoding
  - Example 3: VLM EAGLE Speculative Decoding
  - Example 4: LoRA-Enabled Models
  - Example 5: Phi-4-Multimodal with LoRA Merge
  - Input File Format Reference
  - Common Build Parameters
  - Profiling and Performance Analysis
- Input JSON Format
- Chat Template Format
Software Design
Advanced Features
Customization
APIs
Need help? Visit our GitHub repository to file issues or join discussions.