LLM API with TensorRT Engine

A simple inference example with TinyLlama using the LLM API:
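A minimal sketch of such an example, assuming the `tensorrt_llm` package exposes `LLM` and `SamplingParams` as its high-level entry points and that the `TinyLlama/TinyLlama-1.1B-Chat-v1.0` checkpoint is pulled from Hugging Face (running it requires a supported NVIDIA GPU):

```python
from tensorrt_llm import LLM, SamplingParams

# Prompts to complete in a single batched generate() call.
prompts = [
    "Hello, my name is",
    "The capital of France is",
]

# Sampling configuration; values here are illustrative defaults.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Constructing the LLM downloads the checkpoint (if needed) and
# builds a TensorRT engine for it on first use.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(f"Prompt: {output.prompt!r}")
    print(f"Generated: {output.outputs[0].text!r}")
```

Engine builds can take a few minutes on first run; subsequent runs reuse the cached engine.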

For more advanced usage, including distributed inference, multimodal models, and speculative decoding, please refer to this README.