Long Short-Term Memory Cell (LSTMCell)#

API#

class warp_nn.modules.layers.LSTMCell(input_size: int, hidden_size: int, *, bias: bool = True)[source]#

Bases: Module

Apply a Long Short-Term Memory (LSTM) cell.

\[\text{LSTMCell}(x, (h, c)) = (h', c')\]

where

\[\begin{split}\begin{array}{ll} i = \sigma(W_{ii} \, x + b_{ii} + W_{hi} \, h + b_{hi}) \\ f = \sigma(W_{if} \, x + b_{if} + W_{hf} \, h + b_{hf}) \\ g = \tanh(W_{ig} \, x + b_{ig} + W_{hg} \, h + b_{hg}) \\ o = \sigma(W_{io} \, x + b_{io} + W_{ho} \, h + b_{ho}) \\ c' = f \odot c + i \odot g \\ h' = o \odot \tanh(c') \\ \end{array}\end{split}\]

and \(\sigma\) is the sigmoid function and \(\odot\) is the element-wise product.
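The gate equations above can be sketched in plain NumPy (warp_nn is not used here; the function name `lstm_cell_step` and the flat bias shapes are illustrative assumptions, not part of the warp_nn API):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_step(x, h, c, W_ih, W_hh, b_ih, b_hh):
    """One LSTM cell step, following the equations above.

    x: (batch, input_size); h, c: (batch, hidden_size)
    W_ih: (4 * hidden_size, input_size); W_hh: (4 * hidden_size, hidden_size)
    b_ih, b_hh: (4 * hidden_size,) -- flattened here so they broadcast
    over the batch dimension.
    """
    # All four gate pre-activations in one matmul each, then split.
    gates = x @ W_ih.T + b_ih + h @ W_hh.T + b_hh   # (batch, 4 * hidden_size)
    i, f, g, o = np.split(gates, 4, axis=-1)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)    # input, forget, output gates
    g = np.tanh(g)                                  # candidate cell update
    c_next = f * c + i * g                          # c' = f ⊙ c + i ⊙ g
    h_next = o * np.tanh(c_next)                    # h' = o ⊙ tanh(c')
    return h_next, c_next
```

Because `o` lies in (0, 1) and `tanh` in (-1, 1), every entry of the new hidden state is strictly inside (-1, 1).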


Learnable parameters:

| Name | Symbols | Shape | Description |
|---|---|---|---|
| weight_ih | \(W_{ii}, W_{if}, W_{ig}, W_{io}\) | (4 * hidden_size, input_size) | Input-to-hidden weights |
| weight_hh | \(W_{hi}, W_{hf}, W_{hg}, W_{ho}\) | (4 * hidden_size, hidden_size) | Hidden-to-hidden weights |
| bias_ih | \(b_{ii}, b_{if}, b_{ig}, b_{io}\) | (4 * hidden_size, 1) | Input-to-hidden biases. Present only if bias is True |
| bias_hh | \(b_{hi}, b_{hf}, b_{hg}, b_{ho}\) | (4 * hidden_size, 1) | Hidden-to-hidden biases. Present only if bias is True |

The parameters are initialized from the uniform distribution \(\mathcal{U}(-k, k)\) where \(k = \frac{1}{\sqrt{\text{hidden\_size}}}\).
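This initialization can be sketched as follows (a NumPy stand-in, not warp_nn's actual code; the function name `init_lstm_params` is hypothetical, and shapes follow the table above):

```python
import numpy as np

def init_lstm_params(input_size, hidden_size, bias=True, seed=0):
    """Draw every parameter from Uniform(-k, k), k = 1/sqrt(hidden_size)."""
    rng = np.random.default_rng(seed)
    k = 1.0 / np.sqrt(hidden_size)
    u = lambda *shape: rng.uniform(-k, k, size=shape)
    params = {
        "weight_ih": u(4 * hidden_size, input_size),
        "weight_hh": u(4 * hidden_size, hidden_size),
    }
    if bias:
        # Biases stored as column vectors, per the shape table above.
        params["bias_ih"] = u(4 * hidden_size, 1)
        params["bias_hh"] = u(4 * hidden_size, 1)
    return params
```

Note that the bound depends only on hidden_size, not input_size, so all four parameter tensors share the same scale.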


Parameters:
  • input_size – The number of input features.

  • hidden_size – The number of hidden features.

  • bias – Whether to include a bias term.

__call__(input: array, hidden: tuple[array, array]) → tuple[array, array][source]#

Forward pass of the module.

Parameters:
  • input – The input array, with shape (batch_size, input_size).

  • hidden – A tuple of the initial hidden state and cell state arrays, each with shape (batch_size, hidden_size).

Returns:

A tuple of the next hidden state and cell state arrays, each with shape (batch_size, hidden_size).
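A typical use of this interface is to unroll the cell over a sequence, feeding each output state back in as the next input state. The sketch below mirrors the documented `__call__` signature with a minimal NumPy stand-in (class name `RefLSTMCell` is hypothetical; warp_nn itself is not imported, and the biases are flattened from the documented column-vector shape so they broadcast over the batch):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RefLSTMCell:
    """NumPy stand-in mirroring the documented LSTMCell interface."""

    def __init__(self, input_size, hidden_size, bias=True, seed=0):
        rng = np.random.default_rng(seed)
        k = 1.0 / np.sqrt(hidden_size)
        u = lambda *shape: rng.uniform(-k, k, size=shape)
        self.W_ih = u(4 * hidden_size, input_size)
        self.W_hh = u(4 * hidden_size, hidden_size)
        # Documented shape is (4 * hidden_size, 1); flattened here
        # so the biases broadcast against (batch, 4 * hidden_size).
        self.b_ih = u(4 * hidden_size) if bias else np.zeros(4 * hidden_size)
        self.b_hh = u(4 * hidden_size) if bias else np.zeros(4 * hidden_size)

    def __call__(self, input, hidden):
        h, c = hidden
        gates = input @ self.W_ih.T + self.b_ih + h @ self.W_hh.T + self.b_hh
        i, f, g, o = np.split(gates, 4, axis=-1)
        c_next = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h_next = sigmoid(o) * np.tanh(c_next)
        return h_next, c_next

# Unroll the cell over a sequence of length T, starting from zero states.
T, B, I, H = 5, 2, 3, 4
cell = RefLSTMCell(I, H)
xs = np.random.default_rng(1).standard_normal((T, B, I))
h = np.zeros((B, H))
c = np.zeros((B, H))
for x_t in xs:
    h, c = cell(x_t, (h, c))
```

After the loop, `h` holds the final hidden state for the whole sequence; collecting `h` at each step instead would give the full output sequence.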