Long Short-Term Memory Cell (LSTMCell)#

API#

class warp_nn.modules.layers.LSTMCell(input_size: int, hidden_size: int, *, bias: bool = True)[source]#

Bases: Module

Apply a Long Short-Term Memory (LSTM) cell.

\[\text{LSTMCell}(x, (h, c)) = (h', c')\]

where

\[\begin{split}\begin{array}{ll} i = \sigma(W_{ii} \, x + b_{ii} + W_{hi} \, h + b_{hi}) \\ f = \sigma(W_{if} \, x + b_{if} + W_{hf} \, h + b_{hf}) \\ g = \tanh(W_{ig} \, x + b_{ig} + W_{hg} \, h + b_{hg}) \\ o = \sigma(W_{io} \, x + b_{io} + W_{ho} \, h + b_{ho}) \\ c' = f \odot c + i \odot g \\ h' = o \odot \tanh(c') \\ \end{array}\end{split}\]

and \(\sigma\) is the sigmoid function and \(\odot\) is the element-wise product.
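The gate equations above can be sketched in plain NumPy (warp_nn is not used here; the function name `lstm_cell_step` and the flat bias shapes are illustrative assumptions, not part of the warp_nn API):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_step(x, h, c, W_ih, W_hh, b_ih, b_hh):
    """One LSTM cell step, following the equations above.

    x: (batch, input_size); h, c: (batch, hidden_size)
    W_ih: (4 * hidden_size, input_size); W_hh: (4 * hidden_size, hidden_size)
    b_ih, b_hh: (4 * hidden_size,) -- flattened here so they broadcast
    over the batch dimension.
    """
    # All four gate pre-activations in one matmul each, then split.
    gates = x @ W_ih.T + b_ih + h @ W_hh.T + b_hh   # (batch, 4 * hidden_size)
    i, f, g, o = np.split(gates, 4, axis=-1)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)    # input, forget, output gates
    g = np.tanh(g)                                  # candidate cell update
    c_next = f * c + i * g                          # c' = f ⊙ c + i ⊙ g
    h_next = o * np.tanh(c_next)                    # h' = o ⊙ tanh(c')
    return h_next, c_next
```

Because `o` lies in (0, 1) and `tanh` in (-1, 1), every entry of the new hidden state is strictly inside (-1, 1).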


Learnable parameters:

| Name | Symbols | Shape | Description |
|---|---|---|---|
| weight_ih | \(W_{ii}, W_{if}, W_{ig}, W_{io}\) | (4 * hidden_size, input_size) | Input-to-hidden weights |
| weight_hh | \(W_{hi}, W_{hf}, W_{hg}, W_{ho}\) | (4 * hidden_size, hidden_size) | Hidden-to-hidden weights |
| bias_ih | \(b_{ii}, b_{if}, b_{ig}, b_{io}\) | (4 * hidden_size, 1) | Input-to-hidden biases. Present only if bias is True |
| bias_hh | \(b_{hi}, b_{hf}, b_{hg}, b_{ho}\) | (4 * hidden_size, 1) | Hidden-to-hidden biases. Present only if bias is True |

The parameters are initialized from the uniform distribution \(\mathcal{U}(-k, k)\) where \(k = \frac{1}{\sqrt{\text{hidden\_size}}}\).
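This initialization can be sketched as follows (a NumPy stand-in, not warp_nn's actual code; the function name `init_lstm_params` is hypothetical, and shapes follow the table above):

```python
import numpy as np

def init_lstm_params(input_size, hidden_size, bias=True, seed=0):
    """Draw every parameter from Uniform(-k, k), k = 1/sqrt(hidden_size)."""
    rng = np.random.default_rng(seed)
    k = 1.0 / np.sqrt(hidden_size)
    u = lambda *shape: rng.uniform(-k, k, size=shape)
    params = {
        "weight_ih": u(4 * hidden_size, input_size),
        "weight_hh": u(4 * hidden_size, hidden_size),
    }
    if bias:
        # Biases stored as column vectors, per the shape table above.
        params["bias_ih"] = u(4 * hidden_size, 1)
        params["bias_hh"] = u(4 * hidden_size, 1)
    return params
```

Note that the bound depends only on hidden_size, not input_size, so all four parameter tensors share the same scale.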


Parameters:
  • input_size – The number of input features.

  • hidden_size – The number of hidden features.

  • bias – Whether to include a bias term.

__call__(input: array, hidden: tuple[array, array]) → tuple[array, array][source]#

Forward pass of the module.

Parameters:
  • input – The input array, with shape (batch_size, input_size).

  • hidden – A tuple of the initial hidden state and cell state arrays, each with shape (batch_size, hidden_size).

Returns:

A tuple of the next hidden state and cell state arrays, each with shape (batch_size, hidden_size).
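A typical use of this interface is to unroll the cell over a sequence, feeding each output state back in as the next input state. The sketch below mirrors the documented `__call__` signature with a minimal NumPy stand-in (class name `RefLSTMCell` is hypothetical; warp_nn itself is not imported, and the biases are flattened from the documented column-vector shape so they broadcast over the batch):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RefLSTMCell:
    """NumPy stand-in mirroring the documented LSTMCell interface."""

    def __init__(self, input_size, hidden_size, bias=True, seed=0):
        rng = np.random.default_rng(seed)
        k = 1.0 / np.sqrt(hidden_size)
        u = lambda *shape: rng.uniform(-k, k, size=shape)
        self.W_ih = u(4 * hidden_size, input_size)
        self.W_hh = u(4 * hidden_size, hidden_size)
        # Documented shape is (4 * hidden_size, 1); flattened here
        # so the biases broadcast against (batch, 4 * hidden_size).
        self.b_ih = u(4 * hidden_size) if bias else np.zeros(4 * hidden_size)
        self.b_hh = u(4 * hidden_size) if bias else np.zeros(4 * hidden_size)

    def __call__(self, input, hidden):
        h, c = hidden
        gates = input @ self.W_ih.T + self.b_ih + h @ self.W_hh.T + self.b_hh
        i, f, g, o = np.split(gates, 4, axis=-1)
        c_next = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h_next = sigmoid(o) * np.tanh(c_next)
        return h_next, c_next

# Unroll the cell over a sequence of length T, starting from zero states.
T, B, I, H = 5, 2, 3, 4
cell = RefLSTMCell(I, H)
xs = np.random.default_rng(1).standard_normal((T, B, I))
h = np.zeros((B, H))
c = np.zeros((B, H))
for x_t in xs:
    h, c = cell(x_t, (h, c))
```

After the loop, `h` holds the final hidden state for the whole sequence; collecting `h` at each step instead would give the full output sequence.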