apex.normalization.fused_layer_norm

class apex.normalization.FusedLayerNorm(normalized_shape, eps=1e-05, elementwise_affine=True)

Applies Layer Normalization over a mini-batch of inputs as described in the paper Layer Normalization.
Currently only runs on cuda() tensors.
\[y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]

The mean and standard deviation are calculated separately over the last few dimensions, which must have the shape specified by normalized_shape. \(\gamma\) and \(\beta\) are learnable affine transform parameters of shape normalized_shape if elementwise_affine is True.

Note
Unlike Batch Normalization and Instance Normalization, which apply a scalar scale and bias to each entire channel/plane with the affine option, Layer Normalization applies per-element scale and bias with elementwise_affine.

This layer uses statistics computed from input data in both training and evaluation modes.
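As a rough illustration of the formula above, the following pure-Python sketch normalizes a flat, row-major list of values in groups whose size is the product of normalized_shape. The function name layer_norm_reference and its arguments are hypothetical and not part of apex. The sketch assumes the biased (population) variance, as in torch.nn.LayerNorm, and says nothing about how the fused CUDA kernel is actually implemented.

```python
import math


def layer_norm_reference(values, normalized_shape, gamma=None, beta=None, eps=1e-5):
    """Hypothetical reference, not part of apex.

    values: flat list of floats in row-major order.
    normalized_shape: sizes of the trailing dimensions to normalize over.
    gamma, beta: optional flat lists with prod(normalized_shape) entries.
    """
    group = math.prod(normalized_shape)
    if group == 0 or len(values) % group != 0:
        raise ValueError("input size is not a multiple of prod(normalized_shape)")

    out = []
    for start in range(0, len(values), group):
        chunk = values[start:start + group]
        mean = sum(chunk) / group
        # Assumption: biased variance (divide by N), as in torch.nn.LayerNorm.
        var = sum((v - mean) ** 2 for v in chunk) / group
        inv_std = 1.0 / math.sqrt(var + eps)
        for i, v in enumerate(chunk):
            y = (v - mean) * inv_std
            if gamma is not None:
                y *= gamma[i]  # per-element scale
            if beta is not None:
                y += beta[i]  # per-element bias
            out.append(y)
    return out


# Two samples, each normalized over a trailing (2, 2) block.
x = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0]
y = layer_norm_reference(x, (2, 2))
```

Each group of four outputs has a mean close to 0 and a variance close to 1. Passing gamma and beta of length 4 applies the per-element affine transform on top of that.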
Parameters:
 normalized_shape (int or list or torch.Size) – input shape from an expected input of size
\[[* \times \text{normalized}\_\text{shape}[0] \times \text{normalized}\_\text{shape}[1] \times \ldots \times \text{normalized}\_\text{shape}[-1]]\]
If a single integer is used, it is treated as a singleton list, and this module will normalize over the last dimension, which is expected to be of that specific size.
 eps – a value added to the denominator for numerical stability. Default: 1e-5
 elementwise_affine – a boolean value that, when set to True, gives this module learnable per-element affine parameters initialized to ones (for weights) and zeros (for biases). Default: True.
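To make the normalized_shape rule concrete, here is a small sketch that splits an input shape into leading dimensions (left untouched) and trailing dimensions (normalized over). The helper split_shape is hypothetical and only mirrors the description above; the shape checks apex performs may differ.

```python
def split_shape(input_shape, normalized_shape):
    """Hypothetical helper, not part of apex.

    Returns (leading_dims, normalized_dims) for a given input shape.
    """
    # A single integer is treated as a singleton list.
    if isinstance(normalized_shape, int):
        normalized_shape = (normalized_shape,)
    normalized_shape = tuple(normalized_shape)

    k = len(normalized_shape)
    # The trailing k dimensions of the input must equal normalized_shape.
    if k == 0 or tuple(input_shape[-k:]) != normalized_shape:
        raise ValueError(
            f"trailing dims of {tuple(input_shape)} do not match {normalized_shape}"
        )
    return tuple(input_shape[:-k]), normalized_shape


shape = (20, 5, 10, 10)
last_dim = split_shape(shape, 10)              # ((20, 5, 10), (10,))
last_two = split_shape(shape, [10, 10])        # ((20, 5), (10, 10))
all_but_batch = split_shape(shape, shape[1:])  # ((20,), (5, 10, 10))
```

Statistics are computed once per index into the leading dimensions. With elementwise_affine set to True, the weight and bias have the shape of the second tuple.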
 Shape:
 Input: \((N, *)\)
 Output: \((N, *)\) (same shape as input)
Examples:
>>> import torch
>>> import apex.normalization
>>> # FusedLayerNorm currently only runs on CUDA tensors
>>> input = torch.randn(20, 5, 10, 10).cuda()
>>> # With Learnable Parameters
>>> m = apex.normalization.FusedLayerNorm(input.size()[1:])
>>> # Without Learnable Parameters
>>> m = apex.normalization.FusedLayerNorm(input.size()[1:], elementwise_affine=False)
>>> # Normalize over last two dimensions
>>> m = apex.normalization.FusedLayerNorm([10, 10])
>>> # Normalize over last dimension of size 10
>>> m = apex.normalization.FusedLayerNorm(10).cuda()
>>> # Activating the module
>>> output = m(input)