Adding New Operators¶

You may find it helpful to read the architecture documentation before you start reading this guide.

Adding new operators to Tripy typically involves making changes in the frontend as well as in the FlatIR. In some cases, the frontend operator can be expressed in terms of existing FlatIR operators, in which case you only need to make changes in the frontend.

Let’s take a look at an example of how you might add an Iota operator to Tripy. So that it doesn’t clash with Tripy’s actual Iota implementation, we’ll call it Theta instead.

Implementation¶

`FlatIR` Operator¶

The FlatIR operator is usually the most challenging part of implementing an operator in Tripy. The good news is that you might not need one at all if the low-level operators you need already exist in the FlatIR. And if you do need one, the remaining steps only get easier from here!

We’ll start by adding a new file under nvtripy/flat_ir/ops called theta.py; see the inline comments for explanations of what’s happening:

from dataclasses import dataclass

from mlir_tensorrt.compiler import ir
from mlir_tensorrt.compiler.dialects import stablehlo

from nvtripy.flat_ir.ops.base import BaseFlatIROp


# Every `FlatIR` operator is implemented as a `dataclass` so that the base
# class can automatically implement several methods by inspecting the child
# class fields at runtime. The `repr=False` is important because the default
# `__repr__` method generated by `dataclass` will be extremely verbose and
# makes interactive debugging more difficult.
@dataclass(repr=False)
class ThetaOp(BaseFlatIROp):
    dim: int

    # `to_mlir()` is the trickiest bit. As the name implies, the method is
    # meant to lower the `FlatIR` operator into MLIR. To figure out which
    # MLIR operators to use, refer to the 'MLIR Python API Guide'
    # (linked below).
    def to_mlir(self, operands):
        out_type = self.outputs[0].to_mlir()
        theta_dim = ir.IntegerAttr.get(
            type=ir.IntegerType.get_signless(64), value=self.dim
        )
        output = stablehlo.DynamicIotaOp(
            result=out_type, output_shape=operands[0], iota_dimension=theta_dim
        )
        return [output]

Links:

MLIR Python API Guide
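
To illustrate why operators are written as dataclasses, here is a minimal sketch of how a base class could build a compact `__repr__` by inspecting the child class fields at runtime. `SketchBaseOp` and `SketchThetaOp` are hypothetical names used only for this illustration; this is not Tripy's actual `BaseFlatIROp` implementation.

```python
from dataclasses import dataclass, fields


class SketchBaseOp:
    """Hypothetical stand-in for a base operator class (not Tripy's real one)."""

    def __repr__(self):
        # `fields()` lists the dataclass fields declared on the child class,
        # so the base class needs no per-operator code.
        args = ", ".join(f"{f.name}={getattr(self, f.name)!r}" for f in fields(self))
        return f"{type(self).__name__}({args})"


# `repr=False` keeps `dataclass` from replacing the inherited `__repr__`.
@dataclass(repr=False)
class SketchThetaOp(SketchBaseOp):
    dim: int


print(SketchThetaOp(dim=1))  # SketchThetaOp(dim=1)
```

The same field inspection could drive other generic methods, which is why the child class only has to declare its fields.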

Exposing The Operator¶

One of the principles we follow when writing submodules is that other submodules should not need to reach into the internals of a submodule to retrieve something they need.

For example, a class that needs to import ThetaOp should not need to know exactly where within the flat_ir.ops module ThetaOp lives; it should be able to import it directly from the submodule.

To make this possible, we need to import the ThetaOp into the flat_ir.ops submodule. We can do so by adding the following line into nvtripy/flat_ir/ops/__init__.py:

from nvtripy.flat_ir.ops.theta import ThetaOp

`Trace` Operator And The Public API¶

Now that we have a FlatIR operator, we can implement a Trace operator that uses it, along with a public API function. Let’s create a new file under nvtripy/frontend/trace/ops called theta.py.

`Trace` Operator¶

First, we’ll implement the Trace operator itself:

from dataclasses import dataclass

from nvtripy.common import datatype, device
from nvtripy.frontend.trace.ops.base import BaseTraceOp
import nvtripy.frontend.trace.ops.utils as op_utils


# Just like with `FlatIR` operators, all `Trace` operators are implemented
# as `dataclass`es. As before, we want `repr=False` here.
@dataclass(repr=False)
class Theta(BaseTraceOp):
    # Notice that we do *not* need to define a constructor and can rely on
    # the default implementation provided by `dataclass`.
    dim: int
    dtype: datatype.dtype

    # `infer_rank()` populates the rank of the output `TraceTensor`s.
    # Here we use one of the predefined policies to set the output rank
    # to the length of the shape operand (i.e. the shape of the shape input).
    infer_rank = op_utils.InferRankPolicies.same_as_shape_of_shape_input()

    # *Optional* `infer_dtypes()` populates the data types of the
    # output `TraceTensor`s. The default implementation copies the input
    # data types if they are all the same, so you may not need to implement
    # this.
    def infer_dtypes(self):
        self.outputs[0].dtype = self.dtype

    # *Optional* `infer_devices()` populates the devices of the
    # output `TraceTensor`s. The default implementation copies the input
    # devices if they are all the same, so you may not need to implement
    # this either.
    def infer_devices(self):
        self.outputs[0].device = device("gpu")

    # `to_flat_ir()` translates the `Trace` operator to a subgraph of
    # one or more `FlatIR` operators. In our case, it's just a 1:1
    # mapping to the `ThetaOp` we created earlier.
    def to_flat_ir(self, inputs, outputs):
        # Note that we import the `FlatIR` operator within the function
        # call - this is to avoid circular dependencies.
        from nvtripy.flat_ir.ops import ThetaOp

        # This code may look a bit confusing; for more details, look at the
        # 'FlatIR section in the architecture document' (linked below).
        ThetaOp.build(inputs, outputs, dim=self.dim)

Links:

FlatIR section in the architecture document
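
The `build()` call above forwards any extra arguments to the operator's constructor. As a rough sketch of just that forwarding, assuming a hypothetical base class `SketchOp` that stores `inputs` and `outputs` (Tripy's real `build()` does more, as described in the architecture document):

```python
from dataclasses import dataclass
from typing import List


@dataclass(repr=False)
class SketchOp:
    """Hypothetical base: every operator stores its inputs and outputs."""

    inputs: List[str]
    outputs: List[str]

    @classmethod
    def build(cls, inputs, outputs, *args, **kwargs):
        # Everything after `inputs` and `outputs` is forwarded to the
        # dataclass-generated constructor of the concrete operator.
        return cls(inputs, outputs, *args, **kwargs)


@dataclass(repr=False)
class SketchThetaOp(SketchOp):
    dim: int


op = SketchThetaOp.build(["shape"], ["out"], dim=1)
print(op.inputs, op.outputs, op.dim)  # ['shape'] ['out'] 1
```

Because the constructor comes from `dataclass`, adding a new field to the child class is enough to make it accepted by `build()`.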

Public API¶

Next, we can define the public interface. Since our public interface maps 1:1 with the Trace operator we just implemented and does not require weights, we’ll add it in the same file.

If our API required a composition of multiple Trace operators, then we would instead implement it under frontend/ops/.

If it required weights (i.e. inputs that are expected to always be constant), then we would implement it as a nvtripy.Module under frontend/module.

from nvtripy import export
from nvtripy.utils import wrappers
from nvtripy.types import ShapeLike


# We can use the `export.public_api()` decorator to automatically export this
# function into the top-level module. This means it will be accessible as
# `nvtripy.theta`.
#
# This decorator also controls how the API is exposed in the documentation -
# the `document_under` option determines where in the documentation hierarchy
# this API will show up.
#
# If we needed to provide any special autodoc options, we could use the
# `autodoc_options` parameter.
@export.public_api(document_under="tensor_operations")

# We can use the `wrappers.interface` decorator to specify constraints on
# inputs and perform transformations on them, like automatically converting
# compatible arguments (e.g., `TensorLike` or `ShapeLike`s) into tensors.
# We will aim to include most constraints and transformations in this decorator
# so as to avoid layering too many decorators.
@wrappers.interface(convert_to_tensors=True)
def theta(
    shape: ShapeLike, dim: int = 0, dtype: datatype.dtype = datatype.float32
) -> "nvtripy.Tensor":
    # For any public facing interfaces, we have documentation requirements which
    # you can read about in the 'Docs README' (linked below). The docstring
    # we've implemented here adheres to all of these requirements. Non-compliant
    # docstrings will, in most cases, cause test failures; however, you should
    # still manually ensure you're writing high-quality docstrings.
    #
    # The examples in docstrings are run as part of our tests, so you should
    # also add assertions to make sure things are functionally correct. In this
    # case, we check that the `output` we create in the code example is what we
    # expect.
    """
    Fills an output tensor with consecutive values starting from zero
    along the given dimension.

    Args:
        shape: The desired shape.
        dim: Dimension along which to perform the theta operation.
            This must be less than the rank of the specified shape.
        dtype: The desired data type.

    Returns:
        A tensor of shape ``shape`` and data type ``dtype``.

    .. code-block:: python
        :linenos:

        output = tp.theta([3])

        assert np.array_equal(
            cp.from_dlpack(output).get(), np.arange(0, 3, dtype=np.float32)
        )
    """

    # Next we build the trace operator. The `build()` function is also
    # responsible for constructing the output frontend Tensors. All of the
    # arguments that follow the inputs are forwarded directly to the
    # constructor of the `Trace` operator.
    return Theta.build([shape], dim, dtype)

Links:

Docs README
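
If the idea of an exporting decorator is unfamiliar, the following sketch shows the general pattern of registering a function in a top-level namespace under its own name. The names `toplevel`, `doc_registry`, and this `public_api` are hypothetical and exist only for illustration; Tripy's `export.public_api` is not implemented this way as far as this guide states.

```python
import types

# Hypothetical stand-in for a top-level package namespace.
toplevel = types.SimpleNamespace()
# Hypothetical record of where each API should appear in the documentation.
doc_registry = {}


def public_api(document_under):
    """Sketch of a registering decorator; not Tripy's `export.public_api`."""

    def decorator(func):
        # Export the function under its own name.
        setattr(toplevel, func.__name__, func)
        doc_registry[func.__name__] = document_under
        # The function itself is returned unchanged.
        return func

    return decorator


@public_api(document_under="tensor_operations")
def theta(shape, dim=0):
    return f"theta(shape={shape}, dim={dim})"


print(toplevel.theta([3]))  # theta(shape=[3], dim=0)
print(doc_registry)  # {'theta': 'tensor_operations'}
```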

Exposing The Operator¶

Similarly to the FlatIR operator, we need to import Theta into the frontend.trace.ops submodule. We can do so by adding the following line into nvtripy/frontend/trace/ops/__init__.py:

from nvtripy.frontend.trace.ops.theta import Theta, theta

Testing¶

Now that we’ve implemented our operator, let’s write tests for it. The structure of the tests/ directory mirrors that of the nvtripy/ directory (you can read more about that here). We need to test both the FlatIR and Trace operators.

Testing The Trace Operator And Public API¶

Since we implemented our Trace operator and public API in nvtripy/frontend/trace/ops, we’ll add the test under tests/frontend/trace/ops. Create a new file there called test_theta.py:

import nvtripy as tp
from tests import helper
from nvtripy.frontend.trace.ops import Theta


class TestTheta:
    # This ensures that the public API function creates a frontend `Tensor`
    # and populates it with the right `Trace` operator.
    def test_op_func(self):
        a = tp.theta([2, 3])
        assert isinstance(a, tp.Tensor)
        assert isinstance(a.trace_tensor.producer, Theta)

    # You should also include negative tests for anything that is expected to
    # fail. In our case, we just have `test_invalid_dim`,
    # which ensures that we emit an error if the `dim` parameter is outside
    # the allowed range.
    def test_invalid_dim(self):
        with helper.raises(
            tp.TripyException,
            match="iota dimension cannot go beyond the output rank",
        ):
            tp.theta([2, 3], dim=3).eval()
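
For readers new to negative tests, here is a self-contained sketch of the pattern used in `test_invalid_dim`. Both `raises_sketch` and `theta_sketch` are hypothetical stand-ins written for this example; we are assuming, not asserting, that `helper.raises` behaves similarly.

```python
import re
from contextlib import contextmanager


@contextmanager
def raises_sketch(exc_type, match):
    """Hypothetical helper: assert the block raises `exc_type` matching `match`."""
    try:
        yield
    except exc_type as exc:
        # Like `pytest.raises(match=...)`, treat `match` as a regular
        # expression searched within the error message.
        assert re.search(match, str(exc)), f"unexpected message: {exc}"
    else:
        raise AssertionError(f"{exc_type.__name__} was not raised")


def theta_sketch(shape, dim=0):
    # Hypothetical stand-in for the range check on `dim`.
    if not 0 <= dim < len(shape):
        raise ValueError("iota dimension cannot go beyond the output rank")
    return shape, dim


# Passes silently because the expected error is raised.
with raises_sketch(ValueError, match="cannot go beyond the output rank"):
    theta_sketch([2, 3], dim=3)
```

A negative test should fail when no error is raised, which is why the `else` branch raises an `AssertionError`.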

Integration Tests¶

The code examples in the docstring of the public API serve as basic sanity checks. However, you should still add separate integration tests to get better coverage.

Our docstring covers the 1D case, so let’s add an integration test to cover the multidimensional case. Create a new file called test_theta.py under tests/integration:

import nvtripy as tp


def test_multi_dimensional():
    output = tp.theta([2, 3], dim=1)
    expected = tp.Tensor([[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]], dtype=tp.float32)

    assert tp.equal(output, expected)
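
When writing tests for other shapes or dimensions, it can help to work out the expected values by hand. The following pure-Python reference is a sketch of the iota semantics (each element equals its index along `dim`); `iota_reference` is a hypothetical helper for this guide, not part of Tripy.

```python
def iota_reference(shape, dim):
    """Return nested lists where each element equals its index along `dim`."""
    if not 0 <= dim < len(shape):
        raise ValueError("dim must be in [0, rank)")

    def fill(prefix, depth):
        # `prefix` holds the indices chosen so far, one per dimension.
        if depth == len(shape):
            return float(prefix[dim])
        return [fill(prefix + (i,), depth + 1) for i in range(shape[depth])]

    return fill((), 0)


print(iota_reference([2, 3], dim=1))  # [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
print(iota_reference([2, 3], dim=0))  # [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
```

The `dim=1` result matches the `expected` tensor in the test above, and the `dim=0` result could seed a second test case.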

Done!¶

If you’ve reached this point, you have successfully added a new operation to Tripy. Congratulations!
