Tripy: A Python Programming Model For TensorRT
Quick Start | Installation | Examples | Notebooks | Contributing | Documentation
Tripy is a debuggable, Pythonic frontend for TensorRT, a deep learning inference compiler.
What you can expect:
- High performance by leveraging TensorRT's optimization capabilities.
- An intuitive API that follows conventions of the ecosystem.
- Debuggability with features like eager mode to interactively debug mistakes.
- Excellent error messages that are informative and actionable.
- Friendly documentation that is comprehensive but concise, with code examples.
Quick Start
See the Introduction To Tripy guide for details:
Defining a model:
```python
import nvtripy as tp

class Model(tp.Module):
    def __init__(self):
        self.conv = tp.Conv(in_channels=1, out_channels=1, kernel_dims=[3, 3])

    def __call__(self, x):
        x = self.conv(x)
        x = tp.relu(x)
        return x
```
Initializing it:
```python
model = Model()
model.load_state_dict(
    {
        "conv.weight": tp.ones((1, 1, 3, 3)),
        "conv.bias": tp.ones((1,)),
    }
)

dummy_input = tp.ones((1, 1, 4, 4))
```
Executing in eager mode:
```python
eager_out = model(dummy_input)
```
Compiling and executing:
```python
compiled_model = tp.compile(
    model,
    args=[tp.InputInfo(shape=(1, 1, 4, 4), dtype=tp.float32)],
)

compiled_out = compiled_model(dummy_input)
```
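Because the weights, bias, and input above are all ones, the expected numerics are easy to check by hand: assuming `tp.Conv` uses stride 1 and no padding, each element of the 3×3 convolution over the 4×4 input is 9 · 1 + 1 = 10, and ReLU leaves that unchanged. A plain-Python sketch (independent of Tripy) of the same computation:

```python
def conv2d_relu(x, weight, bias):
    """3x3 valid convolution (stride 1, no padding) followed by ReLU.

    x: 2D list (H x W); weight: 2D list (kH x kW); bias: scalar.
    """
    kh, kw = len(weight), len(weight[0])
    out = []
    for i in range(len(x) - kh + 1):
        row = []
        for j in range(len(x[0]) - kw + 1):
            acc = bias
            for di in range(kh):
                for dj in range(kw):
                    acc += x[i + di][j + dj] * weight[di][dj]
            row.append(max(acc, 0.0))  # ReLU
        out.append(row)
    return out

# All-ones 4x4 input, 3x3 weights, and bias, mirroring the snippets above.
x = [[1.0] * 4 for _ in range(4)]
w = [[1.0] * 3 for _ in range(3)]
out = conv2d_relu(x, w, 1.0)
print(out)  # each of the 2x2 outputs is 9 * 1 + 1 = 10
```

Both the eager and compiled paths should produce this same 2×2 result of tens.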
Installation
```shell
python3 -m pip install nvtripy -f https://nvidia.github.io/TensorRT-Incubator/packages.html
```