Gated Recurrent Unit Cell (GRUCell)

API

class warp_nn.modules.layers.GRUCell(input_size: int, hidden_size: int, *, bias: bool = True)

Bases: Module

Apply a Gated Recurrent Unit (GRU) cell.

\[\text{GRUCell}(x, h) = h'\]

where

\[\begin{split}\begin{array}{ll} r = \sigma(W_{ir} \, x + b_{ir} + W_{hr} \, h + b_{hr}) \\ z = \sigma(W_{iz} \, x + b_{iz} + W_{hz} \, h + b_{hz}) \\ n = \tanh(W_{in} \, x + b_{in} + r \odot (W_{hn} \, h + b_{hn})) \\ h' = (1 - z) \odot n + z \odot h \end{array}\end{split}\]

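Here \(\sigma\) is the sigmoid function and \(\odot\) is the element-wise product; \(r\), \(z\), and \(n\) are the reset gate, the update gate, and the candidate hidden state, respectively.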
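The equations above can be sketched directly in NumPy for a single example. This is an illustrative reference only, not the `warp_nn` implementation: the function `gru_cell_step` and the per-gate dictionary keys (`W_ir`, `b_ir`, and so on) are hypothetical names chosen to mirror the symbols in the formulas.

```python
import numpy as np


def sigmoid(a):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-a))


def gru_cell_step(x, h, p):
    """One GRU step for a single example.

    x: (input_size,), h: (hidden_size,).
    p: dict of per-gate weights and biases (hypothetical layout, keyed by
       the symbols used in the equations).
    """
    r = sigmoid(p["W_ir"] @ x + p["b_ir"] + p["W_hr"] @ h + p["b_hr"])  # reset gate
    z = sigmoid(p["W_iz"] @ x + p["b_iz"] + p["W_hz"] @ h + p["b_hz"])  # update gate
    # Candidate state: the reset gate scales the hidden-to-hidden term only.
    n = np.tanh(p["W_in"] @ x + p["b_in"] + r * (p["W_hn"] @ h + p["b_hn"]))
    # Interpolate between the candidate n and the old state h.
    return (1.0 - z) * n + z * h


input_size, hidden_size = 3, 2
params = {}
for gate in ("r", "z", "n"):
    params[f"W_i{gate}"] = np.zeros((hidden_size, input_size))
    params[f"W_h{gate}"] = np.zeros((hidden_size, hidden_size))
    params[f"b_i{gate}"] = np.zeros(hidden_size)
    params[f"b_h{gate}"] = np.zeros(hidden_size)

x = np.ones(input_size)
h = np.array([0.4, -0.8])
h_next = gru_cell_step(x, h, params)
```

With every parameter set to zero, both gates evaluate to 0.5 and the candidate state is 0, so the step simply halves the hidden state. This makes a convenient sanity check for the update rule \(h' = (1 - z) \odot n + z \odot h\).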

Learnable parameters:

Symbol	Name	Shape	Description
\(W_{ir}, W_{iz}, W_{in}\)	`weight_ih`	`(3 * hidden_size, input_size)`	Input-to-hidden weights
\(W_{hr}, W_{hz}, W_{hn}\)	`weight_hh`	`(3 * hidden_size, hidden_size)`	Hidden-to-hidden weights
\(b_{ir}, b_{iz}, b_{in}\)	`bias_ih`	`(3 * hidden_size, 1)`	Input-to-hidden bias. Present only if `bias` is `True`
\(b_{hr}, b_{hz}, b_{hn}\)	`bias_hh`	`(3 * hidden_size, 1)`	Hidden-to-hidden bias. Present only if `bias` is `True`

All parameters are initialized by sampling from the uniform distribution \(\mathcal{U}(-k, k)\), where \(k = \frac{1}{\sqrt{\text{hidden\_size}}}\).
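As a rough illustration of the parameter shapes and the initialization rule, the following NumPy sketch builds a dictionary with the documented names and shapes. The helper `init_gru_params` and its `seed` argument are assumptions made for this example; how `warp_nn` actually draws its random numbers is not specified here.

```python
import numpy as np


def init_gru_params(input_size, hidden_size, bias=True, seed=0):
    """Sample GRU cell parameters from U(-k, k) with k = 1 / sqrt(hidden_size).

    Hypothetical helper: names and shapes follow the parameter table,
    but this is not the library's own initializer.
    """
    rng = np.random.default_rng(seed)
    k = 1.0 / np.sqrt(hidden_size)
    params = {
        "weight_ih": rng.uniform(-k, k, size=(3 * hidden_size, input_size)),
        "weight_hh": rng.uniform(-k, k, size=(3 * hidden_size, hidden_size)),
    }
    if bias:
        # Bias terms exist only when bias=True.
        params["bias_ih"] = rng.uniform(-k, k, size=(3 * hidden_size, 1))
        params["bias_hh"] = rng.uniform(-k, k, size=(3 * hidden_size, 1))
    return params


params = init_gru_params(input_size=4, hidden_size=16)
shapes = {name: value.shape for name, value in params.items()}
no_bias = init_gru_params(input_size=4, hidden_size=16, bias=False)
```

For `hidden_size=16` the bound is \(k = 0.25\), so every sampled value lies in \([-0.25, 0.25)\). The leading dimension `3 * hidden_size` comes from stacking the three per-gate matrices into one array.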

Parameters:

input_size – The number of input features.
hidden_size – The number of hidden features.
bias – Whether to include a bias term.

__call__(input: array, hidden: array) → array

Forward pass of the module.

Parameters:

input – The input array, with shape (batch_size, input_size).
hidden – The hidden state array from the previous step, with shape (batch_size, hidden_size).

Returns:

The next hidden state array, with shape (batch_size, hidden_size).
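Because a cell computes a single time step, a sequence is typically processed by calling it in a loop and feeding each output back in as the next hidden state. The NumPy stand-in below sketches this with the batched shapes documented above. It rests on several assumptions: `gru_cell` is a hypothetical function rather than the `warp_nn` module, the stacked parameters are assumed to hold the gate blocks in the order \(r, z, n\), and the all-zeros starting state is just one common choice.

```python
import numpy as np


def sigmoid(a):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-a))


def gru_cell(x, h, weight_ih, weight_hh, bias_ih, bias_hh):
    """Batched GRU step. x: (batch_size, input_size), h: (batch_size, hidden_size)."""
    # Project with the stacked matrices; the (3H, 1) biases are transposed
    # to (1, 3H) so they broadcast over the batch dimension.
    gi = x @ weight_ih.T + bias_ih.T
    gh = h @ weight_hh.T + bias_hh.T
    # ASSUMPTION: the three blocks are stored in the order (r, z, n).
    i_r, i_z, i_n = np.split(gi, 3, axis=1)
    h_r, h_z, h_n = np.split(gh, 3, axis=1)
    r = sigmoid(i_r + h_r)
    z = sigmoid(i_z + h_z)
    n = np.tanh(i_n + r * h_n)
    return (1.0 - z) * n + z * h


batch_size, seq_len, input_size, hidden_size = 2, 5, 3, 4
rng = np.random.default_rng(0)
k = 1.0 / np.sqrt(hidden_size)
weight_ih = rng.uniform(-k, k, size=(3 * hidden_size, input_size))
weight_hh = rng.uniform(-k, k, size=(3 * hidden_size, hidden_size))
bias_ih = rng.uniform(-k, k, size=(3 * hidden_size, 1))
bias_hh = rng.uniform(-k, k, size=(3 * hidden_size, 1))

xs = rng.normal(size=(seq_len, batch_size, input_size))  # time-major inputs
h = np.zeros((batch_size, hidden_size))  # assumed starting state
for x_t in xs:
    h = gru_cell(x_t, h, weight_ih, weight_hh, bias_ih, bias_hh)
```

Each step returns an array with the same shape as `hidden`, which is what allows the output to be passed straight back in. Since \(h'\) is a convex combination of \(n \in (-1, 1)\) and the previous state, a state that starts at zero stays strictly inside \((-1, 1)\).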