Tripy: A Python Programming Model For TensorRT

Quick Start | Installation | Examples | Notebooks | Contributing | Documentation

Tripy is a debuggable, Pythonic frontend for TensorRT, a deep learning inference compiler.

What you can expect:

  • High performance by leveraging TensorRT’s optimization capabilities.

  • An intuitive API that follows conventions of the ecosystem.

  • Debuggability with features like eager mode to interactively debug mistakes.

  • Excellent error messages that are informative and actionable.

  • Friendly documentation that is comprehensive but concise, with code examples.

Quick Start

See the Introduction To Tripy guide for details:

  • Defining a model:

    ```py
    class Model(tp.Module):
        def __init__(self):
            self.conv = tp.Conv(in_channels=1, out_channels=1, kernel_dims=[3, 3])

        def __call__(self, x):
            x = self.conv(x)
            x = tp.relu(x)
            return x
    ```
    
  • Initializing it:

    ```py
    model = Model()
    model.load_state_dict(
        {
            "conv.weight": tp.ones((1, 1, 3, 3)),
            "conv.bias": tp.ones((1,)),
        }
    )

    dummy_input = tp.ones((1, 1, 4, 4))
    ```
    
  • Executing in eager mode:

    ```py
    eager_out = model(dummy_input)
    ```
    
  • Compiling and executing:

    ```py
    compiled_model = tp.compile(
        model,
        args=[tp.InputInfo(shape=(1, 1, 4, 4), dtype=tp.float32)],
    )

    compiled_out = compiled_model(dummy_input)
    ```
    

Installation

```shell
python3 -m pip install nvtripy -f https://nvidia.github.io/TensorRT-Incubator/packages.html
```