Feature Combination Matrix#
Feature  | 
Overlap Scheduler  | 
CUDA Graph  | 
Attention Data Parallelism  | 
Disaggregated Serving  | 
Chunked Prefill  | 
MTP  | 
EAGLE-3(One Model Engine)  | 
EAGLE-3(Two Model Engine)  | 
Torch Sampler  | 
TLLM C++ Sampler  | 
KV Cache Reuse  | 
Slide Window Attention  | 
Logits Post Processor  | 
Guided Decoding  | 
LoRA  | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Overlap Scheduler  | 
—  | 
||||||||||||||
CUDA Graph  | 
Yes  | 
—  | 
|||||||||||||
Attention Data Parallelism  | 
Yes  | 
Yes  | 
—  | 
||||||||||||
Disaggregated Serving  | 
Yes  | 
Yes  | 
Yes  | 
—  | 
|||||||||||
Chunked Prefill  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
—  | 
||||||||||
MTP  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
—  | 
|||||||||
EAGLE-3(One Model Engine)  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
No  | 
—  | 
||||||||
EAGLE-3(Two Model Engine)  | 
No  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
No  | 
No  | 
—  | 
|||||||
Torch Sampler  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
—  | 
||||||
TLLM C++ Sampler  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
No  | 
No  | 
No  | 
No  | 
—  | 
|||||
KV Cache Reuse  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
—  | 
||||
Slide Window Attention  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
No  | 
Untested  | 
Untested  | 
Yes  | 
Yes  | 
WIP  | 
—  | 
|||
Logits Post Processor  | 
Yes  | 
Yes  | 
Yes  | 
No  | 
Yes  | 
No  | 
No  | 
No  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
—  | 
||
Guided Decoding  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
No  | 
No  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
—  | 
|
LoRA  | 
Yes  | 
No  | 
Untested  | 
Untested  | 
Untested  | 
Untested  | 
Untested  | 
Untested  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Yes  | 
Untested  | 
—  |