Runtime Module#
API documentation for the runtime module.
- Audio Utils
- Decoder Registry
- Decoding Inference Context
- Decoding Strategy
- Deepstack Binding
- Deployment Config
- EAGLE Decoder
- EAGLE Draft Engine Runner
- Embedding Preprocessor
- Engine Executor
- External Weight Manager
- Hybrid Cache Manager
- Image Utils
- Inference Dims
- Inference Phase
- KV Cache Manager
- LLM Engine Config
- LLM Engine Runner
- LLM Inference Runtime
- LLM Runtime Utils
- Lora Manager
- Mamba Cache Manager
- Mtp Decoder
- Pipeline Io
- Qwen3 Omni Tts Runtime
- Registry Builder
- Rope Cache
- Shared Resources
- Spec Decode Utils
- Step Preparer
- Streaming
- System Prompt KV Cache
- Tensor Map
- Tensor Registry
- Vanilla Decoder