ConvTranspose

class nvtripy.ConvTranspose(in_channels: int, out_channels: int, kernel_dims: Sequence[int], stride: Sequence[int] | None = None, padding: Sequence[Tuple[int, int]] | None = None, dilation: Sequence[int] | None = None, groups: int | None = None, bias: bool = True, dtype: dtype = float32)

Applies a transposed convolution operation on the input tensor.

Transposed convolution, also known as fractionally-strided convolution or deconvolution, performs a “reverse” of a standard convolution with respect to shapes. It upsamples the input to a larger spatial resolution, so that applying a standard convolution followed by a transposed convolution with the same parameters restores the original spatial dimensions.

The transposed convolution operation can be thought of as a regular convolution applied to a dilated version of the input tensor, i.e. one with zeros inserted between the input values. The stride parameter controls the dilation factor, and the padding effectively indicates how much to crop from the output.

Note that transposed convolution is not a strict inverse of standard convolution.
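The zero-insertion view described above can be checked in plain Python for the 1D, single-channel, zero-padding case. This is a minimal sketch, not nvtripy code: both function names are hypothetical, and the kernel flip in the second function is the convention that makes the two views agree in this sketch, not a statement about how nvtripy stores its weights.

```python
def conv_transpose_1d_scatter(x, w, stride=1):
    """Direct view: every input value adds a scaled copy of the kernel to the output."""
    out = [0.0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk
    return out


def conv_transpose_1d_via_dilation(x, w, stride=1):
    """Equivalent view: insert zeros into the input, pad it, then slide the kernel."""
    # Insert (stride - 1) zeros between consecutive input values.
    dilated = []
    for i, xi in enumerate(x):
        dilated.append(xi)
        if i < len(x) - 1:
            dilated.extend([0.0] * (stride - 1))
    # Pad with (kernel_size - 1) zeros on each side (this corresponds to padding = 0).
    pad = len(w) - 1
    padded = [0.0] * pad + dilated + [0.0] * pad
    # Regular stride-1 sliding window, using the flipped kernel.
    flipped = w[::-1]
    return [
        sum(padded[j + k] * flipped[k] for k in range(len(w)))
        for j in range(len(padded) - len(w) + 1)
    ]


x, w = [1, 2], [1, 10, 100]
# Both views produce the same 5-element result for stride 2.
assert conv_transpose_1d_scatter(x, w, stride=2) == conv_transpose_1d_via_dilation(x, w, stride=2)
```

With a stride of 2, the two input values end up two positions apart in the output, which is why a 2-element input and a 3-element kernel give 5 output elements.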

Parameters:

in_channels (int) – The number of channels in the input tensor.
out_channels (int) – The number of channels produced by the convolution.
kernel_dims (Sequence[int]) – The spatial shape of the kernel.
padding (Sequence[Tuple[int, int]]) – A sequence of \(M\) pairs of integers indicating the implicit zero padding applied before and after each spatial dimension, respectively, where \(M\) is the number of spatial dimensions, i.e. \(M = \text{rank(input)} - 2\). In particular, \(\text{dilation}_i \times (\text{kernel_dims}_i - 1) - \text{padding}_i\) will be added to or cropped from the input. This is set so that when this module is initialized with the same parameters as nvtripy.Conv, they are inverses with respect to the input/output shapes. Defaults to all 0.
stride (Sequence[int]) – A sequence of length \(M\) indicating the stride of convolution across each spatial dimension, where \(M\) is the number of spatial dimensions, i.e. \(M = \text{rank(input)} - 2\). For transposed convolution, this effectively controls the dilation of the input; for each dimension with value \(x\), \(x-1\) zeros are inserted between input values. Defaults to all 1.
groups (int) – The number of groups in a grouped convolution, where the input and output channels are each divided into that many groups. Each output group is connected only to its corresponding input group through the convolution kernel weights, and the outputs for each group are concatenated to produce the final result. This is in contrast to a standard convolution, which has full connectivity between all input and output channels. Grouped convolutions reduce computational cost by a factor of groups and can benefit model parallelism and memory usage. Note that in_channels and out_channels must both be divisible by groups. Defaults to 1 (standard convolution).
dilation (Sequence[int]) – A sequence of length \(M\) indicating the spacing between kernel weights across each spatial dimension, where \(M\) is the number of spatial dimensions, i.e. \(M = \text{rank(input)} - 2\). This is known as the à trous algorithm and increases the receptive field of the kernel. For each dimension with value \(x\), \(x-1\) zeros are inserted between kernel weights.
bias (bool) – Whether to add a bias term to the output. The bias has a shape of \((\text{out_channels},)\).
dtype (dtype) – The data type to use for the convolution weights.
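The zero insertion described for the dilation parameter can be illustrated on a 1D kernel. This is a plain-Python sketch; the helper name dilate_kernel_1d is hypothetical and not part of nvtripy.

```python
def dilate_kernel_1d(kernel, dilation):
    """Insert (dilation - 1) zeros between consecutive kernel weights."""
    dilated = []
    for i, weight in enumerate(kernel):
        dilated.append(weight)
        if i < len(kernel) - 1:
            dilated.extend([0] * (dilation - 1))
    return dilated


print(dilate_kernel_1d([1, 2, 3], 2))  # [1, 0, 2, 0, 3]
```

The dilated kernel spans dilation × (kernel size − 1) + 1 positions, which is where the dilation × (kernel_dims − 1) term in the padding description comes from.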

Example

input = tp.reshape(tp.arange(4, dtype=tp.float32), (1, 1, 2, 2))
upsample = tp.ConvTranspose(
    1, 1, (3, 3), stride=(2, 2), bias=False, dtype=tp.float32
)

upsample.weight = tp.iota(upsample.weight.shape)

output = upsample(input)

Local Variables

>>> input
tensor(
    [[[[0, 1],
       [2, 3]]]], 
    dtype=float32, loc=gpu:0, shape=(1, 1, 2, 2))

>>> upsample
ConvTranspose(
    weight: Parameter = (shape=(1, 1, 3, 3), dtype=float32),
)
>>> upsample.state_dict()
{
    weight: tensor(
        [[[[0, 0, 0],
           [0, 0, 0],
           [0, 0, 0]]]], 
        dtype=float32, loc=gpu:0, shape=(1, 1, 3, 3)),
}

>>> output
tensor(
    [[[[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]]]], 
    dtype=float32, loc=gpu:0, shape=(1, 1, 5, 5))

Example: "Inverting" Convolution

# This process restores the input spatial dimensions, but not its values
input = tp.reshape(tp.arange(16, dtype=tp.float32), (1, 1, 4, 4))
downsample = tp.Conv(
    1,
    1,
    (2, 2),
    stride=(2, 2),
    padding=((1, 1), (1, 1)),
    bias=False,
    dtype=tp.float32,
)

downsample.weight = tp.iota(downsample.weight.shape)

upsample = tp.ConvTranspose(
    1,
    1,
    (2, 2),
    stride=(2, 2),
    padding=((1, 1), (1, 1)),
    bias=False,
    dtype=tp.float32,
)

upsample.weight = tp.iota(upsample.weight.shape)

output_down = downsample(input)
output_up = upsample(output_down)

Local Variables

>>> input
tensor(
    [[[[0, 1, 2, 3],
       [4, 5, 6, 7],
       [8, 9, 10, 11],
       [12, 13, 14, 15]]]], 
    dtype=float32, loc=gpu:0, shape=(1, 1, 4, 4))

>>> downsample
Conv(
    weight: Parameter = (shape=(1, 1, 2, 2), dtype=float32),
)
>>> downsample.state_dict()
{
    weight: tensor(
        [[[[0, 0],
           [0, 0]]]], 
        dtype=float32, loc=gpu:0, shape=(1, 1, 2, 2)),
}

>>> upsample
ConvTranspose(
    weight: Parameter = (shape=(1, 1, 2, 2), dtype=float32),
)
>>> upsample.state_dict()
{
    weight: tensor(
        [[[[0, 0],
           [0, 0]]]], 
        dtype=float32, loc=gpu:0, shape=(1, 1, 2, 2)),
}

>>> output_down
tensor(
    [[[[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]]]], 
    dtype=float32, loc=gpu:0, shape=(1, 1, 3, 3))

>>> output_up
tensor(
    [[[[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]]], 
    dtype=float32, loc=gpu:0, shape=(1, 1, 4, 4))
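The shape round trip in the example above can be reproduced with size arithmetic alone. The sketch below is plain Python with hypothetical helper names. The transposed formula is the one given for forward() on this page; the convolution formula is the common floor-based one and is an assumption here, since this page does not state it.

```python
def conv_output_size(d_in, kernel, stride=1, padding=(0, 0), dilation=1):
    # Assumed standard floor-based convolution size formula (not stated on this page).
    return (d_in + padding[0] + padding[1] - dilation * (kernel - 1) - 1) // stride + 1


def conv_transpose_output_size(d_in, kernel, stride=1, padding=(0, 0), dilation=1):
    # Output size formula documented for ConvTranspose.forward().
    return (d_in - 1) * stride - padding[0] - padding[1] + dilation * (kernel - 1) + 1


# Same parameters as the example: 2x2 kernel, stride 2, padding (1, 1) per dimension.
params = dict(kernel=2, stride=2, padding=(1, 1))
down = conv_output_size(4, **params)
up = conv_transpose_output_size(down, **params)
print(down, up)  # 3 4
```

Under the assumed convolution formula, the round trip is exact only when the strided division leaves no remainder. For instance, a size-5 input with these parameters would map to 3 and then back to 4.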

dtype: dtype – The data type to use for the convolution weights.

padding: Sequence[Tuple[int, int]] – A sequence of \(M\) pairs of integers indicating the implicit zero padding applied before and after each spatial dimension, respectively, where \(M\) is the number of spatial dimensions, i.e. \(M = \text{rank(input)} - 2\). In particular, \(\text{dilation}_i \times (\text{kernel_dims}_i - 1) - \text{padding}_i\) will be added to or cropped from the input. This is set so that when this module is initialized with the same parameters as nvtripy.Conv, they are inverses with respect to the input/output shapes.

stride: Sequence[int] – A sequence of length \(M\) indicating the stride of convolution across each spatial dimension, where \(M\) is the number of spatial dimensions, i.e. \(M = \text{rank(input)} - 2\). For transposed convolution, this effectively controls the dilation of the input; for each dimension with value \(x\), \(x-1\) zeros are inserted between input values.

__call__(*args: Any, **kwargs: Any) → Any

Calls the module with the specified arguments.

Parameters:

*args (Any) – Positional arguments to the module.
**kwargs (Any) – Keyword arguments to the module.

Returns:

The outputs computed by the module.

Return type:

Any

Example

class Module(tp.Module):
    def forward(self, x):
        return tp.relu(x)


module = Module()

input = tp.arange(-3, 3)
out = module(input)  # Note that we do not call `forward` directly.

Local Variables

>>> module
Module(
)
>>> module.state_dict()
{}

>>> input
tensor([-3, -2, -1, 0, 1, 2], dtype=float32, loc=gpu:0, shape=(6,))

>>> out
tensor([0, 0, 0, 0, 1, 2], dtype=float32, loc=gpu:0, shape=(6,))
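The pattern of calling the module instead of calling forward directly can be sketched without nvtripy. The classes below are hypothetical stand-ins operating on Python lists; they illustrate the dispatch idea only and make no claim about what else the real __call__ does.

```python
class SketchModule:
    """Hypothetical base class: calling the instance dispatches to forward()."""

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError("subclasses define forward()")


class Relu(SketchModule):
    def forward(self, x):
        # Element-wise max(0, v) on a plain Python list.
        return [max(0, v) for v in x]


module = Relu()
print(module([-3, -2, -1, 0, 1, 2]))  # [0, 0, 0, 0, 1, 2]
```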

initialize_dummy_parameters() → None

Initializes any uninitialized parameters in the module with dummy values. This is useful for debugging and testing purposes.

Example

linear = tp.Linear(2, 2)
print(linear.state_dict())

linear.initialize_dummy_parameters()
print(linear.state_dict())

Output

{'weight': <nvtripy.frontend.module.parameter.DefaultParameter object at 0x7ed8161a1580>, 'bias': <nvtripy.frontend.module.parameter.DefaultParameter object at 0x7ed8161a1460>}
{'weight': tensor(
    [[1, 1],
     [1, 1]], 
    dtype=float32, loc=gpu:0, shape=(2, 2)), 'bias': tensor([1, 1], dtype=float32, loc=gpu:0, shape=(2,))}

load_state_dict(state_dict: Dict[str, Tensor], strict: bool = True) → Tuple[Set[str], Set[str]]

Loads parameters from the provided state_dict into the current module. This will recurse over any nested child modules.

Parameters:

state_dict (Dict[str, Tensor]) – A dictionary mapping names to parameters.
strict (bool) – If True, keys in state_dict must exactly match those in this module. If not, an error will be raised.

Returns:

A tuple of two sets of strings representing:

missing_keys: keys that are expected by this module but not provided in state_dict.
unexpected_keys: keys that are not expected by this module but provided in state_dict.

Return type:

Tuple[Set[str], Set[str]]

Example

class MyModule(tp.Module):
    def __init__(self):
        super().__init__()
        self.param = tp.ones((2,), dtype=tp.float32)


module = MyModule()

print(f"Before: {module.param}")

module.load_state_dict({"param": tp.zeros((2,), dtype=tp.float32)})

print(f"After: {module.param}")

Output

Before: tensor([1, 1], dtype=float32, loc=gpu:0, shape=(2,))
After: tensor([0, 0], dtype=float32, loc=gpu:0, shape=(2,))
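The two returned sets can be understood as plain set differences between the keys the module expects and the keys that were provided. The sketch below uses a hypothetical helper, diff_state_dict_keys, and does not reproduce nvtripy's implementation.

```python
def diff_state_dict_keys(expected_keys, provided_keys):
    """Return (missing_keys, unexpected_keys) as two sets of strings."""
    expected, provided = set(expected_keys), set(provided_keys)
    missing_keys = expected - provided      # expected by the module, not provided
    unexpected_keys = provided - expected   # provided, not expected by the module
    return missing_keys, unexpected_keys


missing, unexpected = diff_state_dict_keys({"weight", "bias"}, {"weight", "scale"})
print(missing, unexpected)  # {'bias'} {'scale'}
```

Per the description of strict above, a non-empty result in either set would correspond to an error when strict is True.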

See also

state_dict()

named_children() → Iterator[Tuple[str, Module]]

Returns an iterator over immediate children of this module, yielding tuples containing the name of the child module and the child module itself.

Returns: An iterator over tuples containing the name of the child module and the child module itself.
Return type: Iterator[Tuple[str, Module]]

Example

class StackedLinear(tp.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = tp.Linear(2, 2)
        self.linear2 = tp.Linear(2, 2)


stacked_linear = StackedLinear()

for name, module in stacked_linear.named_children():
    print(f"{name}: {type(module).__name__}")

Output

linear1: Linear
linear2: Linear

named_parameters() → Iterator[Tuple[str, Tensor]]

Returns: An iterator over tuples containing the name of a parameter and the parameter itself.
Return type: Iterator[Tuple[str, Tensor]]

Example

class MyModule(tp.Module):
    def __init__(self):
        super().__init__()
        self.alpha = tp.Tensor(1)
        self.beta = tp.Tensor(2)


module = MyModule()

for name, parameter in module.named_parameters():
    print(f"{name}: {parameter}")

Output

alpha: tensor(1, dtype=int32, loc=cpu:0, shape=())
beta: tensor(2, dtype=int32, loc=cpu:0, shape=())

state_dict() → Dict[str, Tensor]

Returns a dictionary mapping names to parameters in the module. This will recurse over any nested child modules.

Returns: A dictionary mapping names to parameters.
Return type: Dict[str, Tensor]

Example

class MyModule(tp.Module):
    def __init__(self):
        super().__init__()
        self.param = tp.ones((2,), dtype=tp.float32)
        self.linear1 = tp.Linear(2, 2)
        self.linear2 = tp.Linear(2, 2)


module = MyModule()

state_dict = module.state_dict()

Local Variables

>>> state_dict
{
    param: tensor([1, 1], dtype=float32, loc=gpu:0, shape=(2,)),
    linear1.weight: <nvtripy.frontend.module.parameter.DefaultParameter object at 0x7ed8161693a0>,
    linear1.bias: <nvtripy.frontend.module.parameter.DefaultParameter object at 0x7ed8161936a0>,
    linear2.weight: <nvtripy.frontend.module.parameter.DefaultParameter object at 0x7ed8161fe160>,
    linear2.bias: <nvtripy.frontend.module.parameter.DefaultParameter object at 0x7ed8161fe5b0>,
}
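The dotted key names in the output above follow from recursing into child modules and prefixing each parameter with the child's name. The sketch below models modules as nested dictionaries, which is an assumption made for illustration; flatten_state is a hypothetical helper, not an nvtripy function.

```python
def flatten_state(module_tree, prefix=""):
    """Flatten a nested dict of parameters into a flat dict with dotted names."""
    flat = {}
    for name, value in module_tree.items():
        key = f"{prefix}{name}"
        if isinstance(value, dict):  # a nested child "module" in this toy model
            flat.update(flatten_state(value, prefix=f"{key}."))
        else:
            flat[key] = value
    return flat


tree = {
    "param": [1, 1],
    "linear1": {"weight": "w1", "bias": "b1"},
    "linear2": {"weight": "w2", "bias": "b2"},
}
print(list(flatten_state(tree)))
# ['param', 'linear1.weight', 'linear1.bias', 'linear2.weight', 'linear2.bias']
```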

groups: int – The number of groups in a grouped convolution, where the input and output channels are each divided into that many groups. Each output group is connected only to its corresponding input group through the convolution kernel weights, and the outputs for each group are concatenated to produce the final result. This is in contrast to a standard convolution, which has full connectivity between all input and output channels. Grouped convolutions reduce computational cost by a factor of groups and can benefit model parallelism and memory usage. Note that in_channels and out_channels must both be divisible by groups.

dilation: Sequence[int] – A sequence of length \(M\) indicating the spacing between kernel weights across each spatial dimension, where \(M\) is the number of spatial dimensions, i.e. \(M = \text{rank(input)} - 2\). This is known as the à trous algorithm and increases the receptive field of the kernel. For each dimension with value \(x\), \(x-1\) zeros are inserted between kernel weights.

bias: Tensor | None – The bias term to add to the output. The bias has a shape of \((\text{out_channels},)\).

weight: Tensor – The kernel of shape \((\text{in_channels}, \frac{\text{out_channels}}{\text{groups}}, *\text{kernel_dims})\).
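The weight shape and the divisibility requirement on groups can be made concrete with a small shape calculation. This is a plain-Python sketch with a hypothetical helper name; the channel counts are arbitrary illustration values.

```python
import math


def conv_transpose_weight_shape(in_channels, out_channels, kernel_dims, groups=1):
    """Kernel shape (in_channels, out_channels / groups, *kernel_dims)."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must both be divisible by groups")
    return (in_channels, out_channels // groups) + tuple(kernel_dims)


full = conv_transpose_weight_shape(8, 16, (3, 3))
grouped = conv_transpose_weight_shape(8, 16, (3, 3), groups=4)
print(full, math.prod(full))        # (8, 16, 3, 3) 1152
print(grouped, math.prod(grouped))  # (8, 4, 3, 3) 288
```

With 4 groups the kernel holds a quarter of the weights, which matches the stated reduction by a factor of groups.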

forward(input: Tensor) → Tensor

Parameters: input (Tensor) – The input tensor.
Returns: A tensor of the same data type as the input with a shape \((N, \text{out_channels}, D_{0_{\text{out}}},\ldots,D_{n_{\text{out}}})\) where \(D_{k_{\text{out}}} = (D_{k_{\text{in}}} - 1) \times \text{stride}_k - \text{padding}_{k_0} - \text{padding}_{k_1} + \text{dilation}_k \times (\text{kernel_dims}_k - 1) + 1\).
Return type: Tensor
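The output shape formula can be evaluated with a few lines of plain Python. The helper below is a hypothetical sketch, not part of nvtripy; it treats omitted stride and dilation as all 1 and omitted padding as all 0, where the dilation default is an assumption.

```python
from typing import Optional, Sequence, Tuple


def conv_transpose_output_shape(
    input_shape: Sequence[int],
    out_channels: int,
    kernel_dims: Sequence[int],
    stride: Optional[Sequence[int]] = None,
    padding: Optional[Sequence[Tuple[int, int]]] = None,
    dilation: Optional[Sequence[int]] = None,
) -> Tuple[int, ...]:
    num_spatial = len(input_shape) - 2  # M = rank(input) - 2
    stride = stride or [1] * num_spatial
    padding = padding or [(0, 0)] * num_spatial
    dilation = dilation or [1] * num_spatial  # assumed default
    spatial = tuple(
        (d - 1) * s - p[0] - p[1] + dil * (k - 1) + 1
        for d, s, p, dil, k in zip(input_shape[2:], stride, padding, dilation, kernel_dims)
    )
    return (input_shape[0], out_channels) + spatial


# Shapes from the two ConvTranspose examples on this page.
print(conv_transpose_output_shape((1, 1, 2, 2), 1, (3, 3), stride=(2, 2)))
# (1, 1, 5, 5)
print(conv_transpose_output_shape((1, 1, 3, 3), 1, (2, 2), stride=(2, 2), padding=((1, 1), (1, 1))))
# (1, 1, 4, 4)
```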