TensorRT Edge-LLM

Audio Utils

namespace trt_edgellm

namespace rt

namespace audioUtils

Copyright © 2025, Nvidia.

Last updated on July 22, 2026.

This page is generated by TensorRT-Edge-LLM commit f98267f.