BatchNorm

class nvtripy.BatchNorm(num_features: int, dtype: dtype = float32, eps: float = 1e-05)[source]

Bases: Module

Applies batch normalization over an N-dimensional input tensor using precomputed statistics:

\(y = \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} * \gamma + \beta\)

where:
  • \(\mu\) is the precomputed running mean.

  • \(\sigma^2\) is the precomputed running variance.

  • \(\gamma\) and \(\beta\) are learnable parameter vectors (weight and bias).

This implementation supports 1D, 2D, and 3D inputs (e.g., time-series, images, and volumetric data). Normalization is applied per feature channel, where the feature (channel) dimension is the second dimension of the input.
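The formula above can be sketched in NumPy (an illustrative reference implementation, not the library's actual kernel): the per-channel statistics and affine parameters are reshaped to `(1, C, 1, ..., 1)` so they broadcast across the batch and spatial dimensions.

```python
import numpy as np

def batch_norm_ref(x, running_mean, running_var, weight, bias, eps=1e-5):
    # Reshape per-channel vectors to (1, C, 1, ..., 1) so they broadcast
    # across the batch and spatial dimensions of an (N, C, ...) input.
    shape = (1, -1) + (1,) * (x.ndim - 2)
    mu = running_mean.reshape(shape)
    var = running_var.reshape(shape)
    gamma = weight.reshape(shape)
    beta = bias.reshape(shape)
    return (x - mu) / np.sqrt(var + eps) * gamma + beta

x = np.arange(8, dtype=np.float32).reshape(1, 2, 2, 2)
y = batch_norm_ref(
    x,
    running_mean=np.array([0.0, 1.0], dtype=np.float32),
    running_var=np.array([1.0, 1.0], dtype=np.float32),
    weight=np.ones(2, dtype=np.float32),
    bias=np.zeros(2, dtype=np.float32),
)
# Output has the same shape as the input.
assert y.shape == x.shape
```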

Parameters:
  • num_features (int) – The number of feature channels in the input tensor (the size of the second dimension).

  • dtype (dtype) – The data type to use for the weight, bias, running_mean and running_var parameters.

  • eps (float) – \(\epsilon\) value added to the denominator to prevent division by zero during normalization.

Example
batch_norm = tp.BatchNorm(2)

input = tp.iota((1, 2, 1, 1))
output = batch_norm(input)
Local Variables
>>> batch_norm
BatchNorm(
    weight: Parameter = (shape=[2], dtype=float32),
    bias: Parameter = (shape=[2], dtype=float32),
    running_mean: Parameter = (shape=[2], dtype=float32),
    running_var: Parameter = (shape=[2], dtype=float32),
)
>>> batch_norm.state_dict()
{
    weight: tensor([0.0000, 1.0000], dtype=float32, loc=gpu:0, shape=(2,)),
    bias: tensor([0.0000, 1.0000], dtype=float32, loc=gpu:0, shape=(2,)),
    running_mean: tensor([0.0000, 1.0000], dtype=float32, loc=gpu:0, shape=(2,)),
    running_var: tensor([0.0000, 1.0000], dtype=float32, loc=gpu:0, shape=(2,)),
}

>>> input
tensor(
    [[[[0.0000]],

      [[0.0000]]]], 
    dtype=float32, loc=gpu:0, shape=(1, 2, 1, 1))

>>> output
tensor(
    [[[[0.0000]],

      [[0.0000]]]], 
    dtype=float32, loc=gpu:0, shape=(1, 2, 1, 1))
dtype: dtype

The data type used to perform the operation.

load_state_dict(state_dict: Dict[str, Tensor], strict: bool = True) Tuple[Set[str], Set[str]]

Loads parameters from the provided state_dict into the current module. This will recurse over any nested child modules.

Parameters:
  • state_dict (Dict[str, Tensor]) – A dictionary mapping names to parameters.

  • strict (bool) – If True, keys in state_dict must exactly match those in this module. If not, an error will be raised.

Returns:

  • missing_keys: keys that are expected by this module but not provided in state_dict.

  • unexpected_keys: keys that are not expected by this module but provided in state_dict.

Return type:

A tuple of two sets of strings: (missing_keys, unexpected_keys).

Example
# Using the `module` and `state_dict` from the `state_dict()` example:
print(f"Before: {module.param}")

state_dict["param"] = tp.zeros((2,), dtype=tp.float32)
module.load_state_dict(state_dict)

print(f"After: {module.param}")
Output
Before: tensor([1.0000, 1.0000], dtype=float32, loc=gpu:0, shape=(2,))
After: tensor([0.0000, 0.0000], dtype=float32, loc=gpu:0, shape=(2,))

See also

state_dict()
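The two returned sets can be understood as set differences between the module's parameter names and the keys of the provided state_dict. A minimal sketch (the names below are hypothetical, not tied to any particular module):

```python
def match_keys(expected, provided):
    """Mimic the (missing_keys, unexpected_keys) result of load_state_dict."""
    missing = set(expected) - set(provided)
    unexpected = set(provided) - set(expected)
    return missing, unexpected

# With strict=True, a non-empty result would raise an error instead.
missing, unexpected = match_keys(
    expected={"weight", "bias", "running_mean", "running_var"},
    provided={"weight", "bias", "running_mean", "extra_param"},
)
```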

named_children() Iterator[Tuple[str, Module]]

Returns an iterator over immediate children of this module, yielding tuples containing the name of the child module and the child module itself.

Returns:

An iterator over tuples containing the name of the child module and the child module itself.

Return type:

Iterator[Tuple[str, Module]]

Example
class StackedLinear(tp.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = tp.Linear(2, 2)
        self.linear2 = tp.Linear(2, 2)


stacked_linear = StackedLinear()

for name, module in stacked_linear.named_children():
    print(f"{name}: {type(module).__name__}")
Output
linear1: Linear
linear2: Linear
named_parameters() Iterator[Tuple[str, Tensor]]

Returns an iterator over the parameters of this module, yielding tuples containing the name of a parameter and the parameter itself.

Returns:

An iterator over tuples containing the name of a parameter and the parameter itself.

Return type:

Iterator[Tuple[str, Tensor]]

Example
class MyModule(tp.Module):
    def __init__(self):
        super().__init__()
        self.alpha = tp.Tensor(1)
        self.beta = tp.Tensor(2)


linear = MyModule()

for name, parameter in linear.named_parameters():
    print(f"{name}: {parameter}")
Output
alpha: tensor(1, dtype=int32, loc=gpu:0, shape=())
beta: tensor(2, dtype=int32, loc=gpu:0, shape=())
state_dict() Dict[str, Tensor]

Returns a dictionary mapping names to parameters in the module. This will recurse over any nested child modules.

Returns:

A dictionary mapping names to parameters.

Return type:

Dict[str, Tensor]

Example
class MyModule(tp.Module):
    def __init__(self):
        super().__init__()
        self.param = tp.ones((2,), dtype=tp.float32)
        self.linear1 = tp.Linear(2, 2)
        self.linear2 = tp.Linear(2, 2)


module = MyModule()

state_dict = module.state_dict()
Local Variables
>>> state_dict
{
    param: tensor([1.0000, 1.0000], dtype=float32, loc=gpu:0, shape=(2,)),
    linear1.weight: tensor(
        [[0.0000, 1.0000],
         [2.0000, 3.0000]], 
        dtype=float32, loc=gpu:0, shape=(2, 2)),
    linear1.bias: tensor([0.0000, 1.0000], dtype=float32, loc=gpu:0, shape=(2,)),
    linear2.weight: tensor(
        [[0.0000, 1.0000],
         [2.0000, 3.0000]], 
        dtype=float32, loc=gpu:0, shape=(2, 2)),
    linear2.bias: tensor([0.0000, 1.0000], dtype=float32, loc=gpu:0, shape=(2,)),
}
num_features: int

The number of feature channels in the input tensor (the size of the second dimension).

eps: float

\(\epsilon\) value added to the denominator to prevent division by zero during normalization.

weight: Tensor

The \(\gamma\) parameter of shape \([\text{num_features}]\).

bias: Tensor

The \(\beta\) parameter of shape \([\text{num_features}]\).

running_mean: Tensor

The running mean for the feature channels of shape \([\text{num_features}]\).

running_var: Tensor

The running variance for the feature channels of shape \([\text{num_features}]\).

__call__(x: Tensor) Tensor[source]
Parameters:

x (Tensor) – The input tensor with shape \((N, C, ...)\), where C is the feature dimension.

Returns:

A tensor of the same shape as the input.

Return type:

Tensor
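Because the per-channel statistics broadcast over every dimension after the channel axis, the same module handles (N, C, L), (N, C, H, W), and (N, C, D, H, W) inputs. A hedged NumPy sketch for the 1D (time-series) case; `batch_norm_1d_ref` is an illustrative helper, not part of the nvtripy API:

```python
import numpy as np

def batch_norm_1d_ref(x, mean, var, gamma, beta, eps=1e-5):
    # x has shape (N, C, L); per-channel vectors broadcast as (1, C, 1).
    mean, var, gamma, beta = (v.reshape(1, -1, 1) for v in (mean, var, gamma, beta))
    return (x - mean) / np.sqrt(var + eps) * gamma + beta

x = np.random.default_rng(0).standard_normal((4, 3, 5)).astype(np.float32)
y = batch_norm_1d_ref(
    x,
    mean=x.mean(axis=(0, 2)),
    var=x.var(axis=(0, 2)),
    gamma=np.ones(3, dtype=np.float32),
    beta=np.zeros(3, dtype=np.float32),
)
# With statistics computed from x itself, each channel of y is
# approximately zero-mean and unit-variance; the shape is preserved.
assert y.shape == x.shape
```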