C++ API Reference
This section provides documentation for the TensorRT Edge-LLM C++ API.
- Action Module
- Builder Module
- Common Module
- Kernels Module
  - Nv Fp4 Mo E Utils
  - Apply Rope Write KV
  - Batch Evict Kernels
  - Build Layout
  - Causal Conv1d
  - Common
  - Context FMHA Runner
  - Conversion
  - Cute Dsl Ffpa Runner
  - Cute Dsl FMHA Runner
  - Cute Dsl Gdn Runner
  - Cute Dsl GEMM Runner
  - Cute Dsl Nvfp4 Moe Runner
  - Cute Dsl Nvfp4 Moe Sm110 Runner
  - Cute Dsl Ssd Runner
  - Decoder XQA Runner
  - Dequant
  - Dequantize
  - Dflash Accept Kernels
  - Dflash Runtime Kernels
  - EAGLE Accept Kernels
  - EAGLE Util Kernels
  - Embedding Kernels
  - FMHA Params V2
  - Fp4 Quantize
  - Gdn Kernel Utils
  - Gemma4 Audio Attention
  - Image Util Kernels
  - Initialize Cos Sin Cache
  - Int4 Groupwise GEMM
  - Kernel
  - Kernel Selector
  - KV Cache Utils Kernels
  - Marlin
  - Marlin Dtypes
  - Marlin Mma
  - Marlin Template
  - Moe Activation Kernels
  - Moe Align Sum Kernels
  - Moe Marlin
  - Moe Marlin Indices Kernels
  - Moe Sigmoid Group Topk Kernels
  - Moe Topk Softmax Kernels
  - Mtp State Scatter Kernels
  - Nvfp4 Moe Types
  - Nvfp4 Dequant
  - Nvfp4 Tensor
  - Selective State Update
  - Ssd Varlen Metadata
  - Talker Mlp Kernels
  - Util Kernels
  - Vectorized Types
- Multimodal Module
- Plugins Module
- Profiling Module
- Runtime Module
  - Audio Loader
  - Audio Utils
  - Decoder Registry
  - Decoding Inference Context
  - Decoding Strategy
  - Deepstack Binding
  - Deployment Config
  - Dflash Decoder
  - EAGLE Decoder
  - Embedding Preprocessor
  - Engine Executor
  - External Weight Manager
  - Gemma4 Embedding Preprocessor
  - Hybrid Cache Manager
  - Image Utils
  - Inference Dims
  - Inference Phase
  - KV Cache Manager
  - LLM Engine Config
  - LLM Inference Runtime
  - LLM Runtime Utils
  - Lora Manager
  - Mamba Cache Manager
  - Mel Spectrogram
  - Mtp Decoder
  - Pipeline Io
  - Qwen3 Omni Tts Runtime
  - Registry Builder
  - Rope Cache
  - Shared Resources
  - Spec Decode Utils
  - Step Preparer
  - Streaming
  - System Prompt KV Cache
  - Tensor Map
  - Tensor Registry
  - Vanilla Decoder
- Sampler Module
- Tokenizer Module