diff --git a/previews/PR23/.documenter-siteinfo.json b/previews/PR23/.documenter-siteinfo.json index c72ba54..15959ad 100644 --- a/previews/PR23/.documenter-siteinfo.json +++ b/previews/PR23/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2024-12-13T16:01:16","documenter_version":"1.8.0"}} \ No newline at end of file +{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2024-12-13T16:07:54","documenter_version":"1.8.0"}} \ No newline at end of file diff --git a/previews/PR23/api/cells/index.html b/previews/PR23/api/cells/index.html index 3336f6c..9490a8a 100644 --- a/previews/PR23/api/cells/index.html +++ b/previews/PR23/api/cells/index.html @@ -17,31 +17,31 @@ #result with default initialization of internal states result = rancell(inp) #result with internal states provided -result_state = rancell(inp, (state, c_state))source
RecurrentLayers.IndRNNCellType
IndRNNCell((input_size => hidden_size)::Pair, σ=relu;
+result_state = rancell(inp, (state, c_state))
source
RecurrentLayers.IndRNNCellType
IndRNNCell((input_size => hidden_size)::Pair, σ=relu;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
-    bias = true)

Independently recurrent cell. See IndRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • σ: activation function. Default is relu
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.LightRUCellType
LightRUCell((input_size => hidden_size)::Pair;
+    bias = true)

Independently recurrent cell. See IndRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • σ: activation function. Default is relu
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

rnncell(inp, [state])
source
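The IndRNN equation above differs from a standard RNN in that the recurrent weight is a vector u applied element-wise, so each hidden unit only sees its own previous value. A minimal sketch of one update step, written in plain Python rather than Julia so it runs without Flux; the function and variable names here are illustrative assumptions, not part of the RecurrentLayers.jl API:

```python
def relu(v):
    return max(0.0, v)

def indrnn_step(W, u, b, x, h_prev, sigma=relu):
    """One update h_t = sigma(W x_t + u * h_{t-1} + b), with u applied element-wise."""
    h_new = []
    for i in range(len(u)):
        # Input contribution for unit i: row i of W times x.
        wx = sum(W[i][j] * x[j] for j in range(len(x)))
        # Recurrent contribution uses only unit i's own previous state.
        h_new.append(sigma(wx + u[i] * h_prev[i] + b[i]))
    return h_new

# Toy values chosen for illustration only.
W = [[1.0, 0.0], [0.0, -1.0]]
u = [0.5, 0.5]
b = [0.0, 0.0]
h = indrnn_step(W, u, b, x=[2.0, 3.0], h_prev=[1.0, 1.0])
```

With these toy values the first unit ends up at 2.5, while the second unit's pre-activation is negative and is clipped to 0.0 by relu.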
RecurrentLayers.LightRUCellType
LightRUCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Light recurrent unit. See LightRU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \tanh(W_h x_t), \\ f_t &= \delta(W_f x_t + U_f h_{t-1} + b_f), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t. -\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.LiGRUCellType
LiGRUCell((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

rnncell(inp, [state])
source
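In the LightRU equations above the candidate state depends on the input only, while the forget gate mixes input and previous hidden state. The following plain-Python sketch mirrors those three lines; it assumes that δ denotes the logistic sigmoid, and all names are hypothetical helpers rather than package functions:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(M, v):
    # Dense matrix-vector product on nested lists.
    return [sum(m * a for m, a in zip(row, v)) for row in M]

def lightru_step(W_h, W_f, U_f, b_f, x, h_prev):
    # Candidate state uses the input only: tanh(W_h x_t).
    h_tilde = [math.tanh(a) for a in matvec(W_h, x)]
    # Forget gate sees the input and the previous hidden state.
    # Assumption: delta in the equations is the logistic sigmoid.
    pre_f = [a + c + d
             for a, c, d in zip(matvec(W_f, x), matvec(U_f, h_prev), b_f)]
    f = [sigmoid(p) for p in pre_f]
    # Convex combination of the old state and the candidate.
    return [(1.0 - fi) * hp + fi * ht
            for fi, hp, ht in zip(f, h_prev, h_tilde)]

zeros = [[0.0, 0.0], [0.0, 0.0]]
h = lightru_step(zeros, zeros, zeros, [0.0, 0.0],
                 x=[1.0, 2.0], h_prev=[1.0, -2.0])
```

With all weights at zero the gate is 0.5 and the candidate is 0, so the step simply halves the previous state.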
RecurrentLayers.LiGRUCellType
LiGRUCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Light gated recurrent unit. The implementation does not include the batch normalization as described in the original paper. See LiGRU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1}), \\ \tilde{h}_t &= \text{ReLU}(W_h x_t + U_h h_{t-1}), \\ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t -\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.MGUCellType
MGUCell((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.MGUCellType
MGUCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Minimal gated unit. See MGU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t -\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.NASCellType
NASCell((input_size => hidden_size);
+\end{aligned}\]

Forward

rnncell(inp, [state])
source
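The MGU equations above reuse the single gate f_t twice: once to scale the previous state inside the candidate, and once to blend the candidate with the previous state. A minimal plain-Python sketch of that step, with hypothetical helper names that are not part of the package:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(M, v):
    # Dense matrix-vector product on nested lists.
    return [sum(m * a for m, a in zip(row, v)) for row in M]

def mgu_step(W_f, U_f, b_f, W_h, U_h, b_h, x, h_prev):
    # Single gate f_t, computed from the input and the previous state.
    f = [sigmoid(a + c + d)
         for a, c, d in zip(matvec(W_f, x), matvec(U_f, h_prev), b_f)]
    # The same gate scales h_{t-1} before it enters the candidate.
    gated = [fi * hp for fi, hp in zip(f, h_prev)]
    h_tilde = [math.tanh(a + c + d)
               for a, c, d in zip(matvec(W_h, x), matvec(U_h, gated), b_h)]
    # Blend previous state and candidate with f_t.
    return [(1.0 - fi) * hp + fi * ht
            for fi, hp, ht in zip(f, h_prev, h_tilde)]

zeros = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
h = mgu_step(zeros, zeros, [0.0, 0.0], zeros, identity, [0.0, 0.0],
             x=[0.0, 0.0], h_prev=[2.0, 0.0])
```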
RecurrentLayers.NASCellType
NASCell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Neural Architecture Search unit. See NAS for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -68,7 +68,7 @@ c_{\text{new}} &= l_1 \cdot l_2 \\ l_5 &= \tanh(l_3 + l_4) \\ h_{\text{new}} &= \tanh(c_{\text{new}} \cdot l_5) -\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.RHNCellType
RHNCell((input_size => hidden_size), depth=3;
+\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.RHNCellType
RHNCell((input_size => hidden_size), depth=3;
     couple_carry::Bool = true,
     cell_kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • depth: depth of the recurrence. Default is 3
  • couple_carry: couples the carry gate and the transform gate. Default true
  • init_kernel: initializer for the input to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} s_{\ell}^{[t]} &= h_{\ell}^{[t]} \odot t_{\ell}^{[t]} + s_{\ell-1}^{[t]} \odot c_{\ell}^{[t]}, \\ @@ -76,9 +76,9 @@ h_{\ell}^{[t]} &= \tanh(W_h x^{[t]}\mathbb{I}_{\ell = 1} + U_{h_{\ell}} s_{\ell-1}^{[t]} + b_{h_{\ell}}), \\ t_{\ell}^{[t]} &= \sigma(W_t x^{[t]}\mathbb{I}_{\ell = 1} + U_{t_{\ell}} s_{\ell-1}^{[t]} + b_{t_{\ell}}), \\ c_{\ell}^{[t]} &= \sigma(W_c x^{[t]}\mathbb{I}_{\ell = 1} + U_{c_{\ell}} s_{\ell-1}^{[t]} + b_{c_{\ell}}) -\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.RHNCellUnitType
RHNCellUnit((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.RHNCellUnitType
RHNCellUnit((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
-    bias = true)
source
RecurrentLayers.MUT1CellType
MUT1Cell((input_size => hidden_size);
+    bias = true)
source
RecurrentLayers.MUT1CellType
MUT1Cell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Mutated unit 1 cell. See MUT1 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -86,7 +86,7 @@ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + \tanh(W_h x_t) + b_h) \odot z \\ &\quad + h_t \odot (1 - z). -\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.MUT2CellType
MUT2Cell((input_size => hidden_size);
+\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.MUT2CellType
MUT2Cell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Mutated unit 2 cell. See MUT2 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -94,7 +94,7 @@ r &= \sigma(x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). -\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.MUT3CellType
MUT3Cell((input_size => hidden_size);
+\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.MUT3CellType
MUT3Cell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Mutated unit 3 cell. See MUT3 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -102,7 +102,7 @@ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). -\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.SCRNCellType
SCRNCell((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.SCRNCellType
SCRNCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true,
@@ -110,7 +110,7 @@
 s_t &= (1 - \alpha) W_s x_t + \alpha s_{t-1}, \\
 h_t &= \sigma(W_h s_t + U_h h_{t-1} + b_h), \\
 y_t &= f(U_y h_t + W_y s_t)
-\end{aligned}\]

Forward

rnncell(inp, [state, c_state])
source
RecurrentLayers.PeepholeLSTMCellType
PeepholeLSTMCell((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

rnncell(inp, [state, c_state])
source
RecurrentLayers.PeepholeLSTMCellType
PeepholeLSTMCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Peephole long short term memory cell. See PeepholeLSTM for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -119,17 +119,17 @@ o_t &= \sigma_g(W_o x_t + U_o c_{t-1} + b_o), \\ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma_c(W_c x_t + b_c), \\ h_t &= o_t \odot \sigma_h(c_t). -\end{aligned}\]

Forward

lstmcell(x, [h, c])

The forward pass takes the following arguments:

  • x: Input to the cell, which can be a vector of size in or a matrix of size in x batch_size.
  • h: The hidden state vector of the cell, sized out, or a matrix of size out x batch_size.
  • c: The candidate state, sized out, or a matrix of size out x batch_size.

If not provided, both h and c default to vectors of zeros.

Examples

source
RecurrentLayers.FastRNNCellType
FastRNNCell((input_size => hidden_size), [activation];
+\end{aligned}\]

Forward

lstmcell(x, [h, c])

The forward pass takes the following arguments:

  • x: Input to the cell, which can be a vector of size in or a matrix of size in x batch_size.
  • h: The hidden state vector of the cell, sized out, or a matrix of size out x batch_size.
  • c: The candidate state, sized out, or a matrix of size out x batch_size.

If not provided, both h and c default to vectors of zeros.

Examples

source
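As an illustration of the forward pass described above, the sketch below follows the peephole equations as they appear in full in the PeepholeLSTM docstring: the three gates read the previous cell state c_{t-1}, and the candidate depends on the input only. It is plain Python with hypothetical names, not the package's Julia implementation, and it assumes σ_g is the logistic sigmoid and σ_c, σ_h are tanh. A missing cell state defaults to zeros, matching the text above.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(M, v):
    # Dense matrix-vector product on nested lists.
    return [sum(m * a for m, a in zip(row, v)) for row in M]

def gate(W, U, b, x, c_prev):
    # sigma_g(W x_t + U c_{t-1} + b); the gate peeks at the cell state.
    return [sigmoid(a + p + q)
            for a, p, q in zip(matvec(W, x), matvec(U, c_prev), b)]

def peephole_lstm_step(p, x, c_prev=None):
    # h_{t-1} does not appear in the equations as written, so only
    # the cell state is carried into the step here.
    n = len(p["b_c"])
    if c_prev is None:
        c_prev = [0.0] * n  # default state: zeros
    f = gate(p["W_f"], p["U_f"], p["b_f"], x, c_prev)
    i = gate(p["W_i"], p["U_i"], p["b_i"], x, c_prev)
    o = gate(p["W_o"], p["U_o"], p["b_o"], x, c_prev)
    # Candidate uses the input only: sigma_c(W_c x_t + b_c).
    cand = [math.tanh(a + q) for a, q in zip(matvec(p["W_c"], x), p["b_c"])]
    c = [fi * cp + ii * g for fi, cp, ii, g in zip(f, c_prev, i, cand)]
    h = [oi * math.tanh(ci) for oi, ci in zip(o, c)]
    return h, c

# One hidden unit, toy parameters for illustration only.
params = {
    "W_f": [[0.0]], "U_f": [[0.0]], "b_f": [0.0],
    "W_i": [[0.0]], "U_i": [[0.0]], "b_i": [0.0],
    "W_o": [[0.0]], "U_o": [[0.0]], "b_o": [0.0],
    "W_c": [[1.0]], "b_c": [0.0],
}
h, c = peephole_lstm_step(params, x=[1.0])
```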
RecurrentLayers.FastRNNCellType
FastRNNCell((input_size => hidden_size), [activation];
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Fast recurrent neural network cell. See FastRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \sigma(W_h x_t + U_h h_{t-1} + b), \\ h_t &= \alpha \tilde{h}_t + \beta h_{t-1} -\end{aligned}\]

Forward

fastrnncell(inp, [state])
source
RecurrentLayers.FastGRNNCellType
FastGRNNCell((input_size => hidden_size), [activation];
+\end{aligned}\]

Forward

fastrnncell(inp, [state])
source
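The FastRNN update above is a standard RNN candidate followed by a weighted residual connection to the previous state. A minimal sketch in plain Python, under the assumption that α and β are plain scalars passed in by the caller (the docstring does not say how the package parameterizes them); all names are illustrative:

```python
import math

def matvec(M, v):
    # Dense matrix-vector product on nested lists.
    return [sum(m * a for m, a in zip(row, v)) for row in M]

def fastrnn_step(W_h, U_h, b, alpha, beta, x, h_prev, sigma=math.tanh):
    # Candidate: sigma(W_h x_t + U_h h_{t-1} + b).
    pre = [a + c + d
           for a, c, d in zip(matvec(W_h, x), matvec(U_h, h_prev), b)]
    h_tilde = [sigma(p) for p in pre]
    # Residual mix: alpha scales the candidate, beta the previous state.
    return [alpha * ht + beta * hp for ht, hp in zip(h_tilde, h_prev)]

zeros = [[0.0, 0.0], [0.0, 0.0]]
h = fastrnn_step(zeros, zeros, [0.0, 0.0], alpha=0.1, beta=0.9,
                 x=[1.0, 1.0], h_prev=[1.0, -1.0])
```

With zero weights the candidate vanishes and the step reduces to scaling the previous state by β.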
RecurrentLayers.FastGRNNCellType
FastGRNNCell((input_size => hidden_size), [activation];
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Fast gated recurrent neural network cell. See FastGRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h h_{t-1} + b_h), \\ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1} -\end{aligned}\]

Forward

fastgrnncell(inp, [state])
source
+\end{aligned}\]

Forward

fastgrnncell(inp, [state])
source diff --git a/previews/PR23/api/wrappers/index.html b/previews/PR23/api/wrappers/index.html index fd64429..8bd83b5 100644 --- a/previews/PR23/api/wrappers/index.html +++ b/previews/PR23/api/wrappers/index.html @@ -5,20 +5,20 @@ f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ c_t &= i_t \odot \tilde{c}_t + f_t \odot c_{t-1}, \\ h_t &= g(c_t) -\end{aligned}\]

source
RecurrentLayers.IndRNNType
IndRNN((input_size => hidden_size)::Pair, σ = tanh;
-    kwargs...)

Independently recurrent network. See IndRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • σ: activation function. Default is tanh
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

source
RecurrentLayers.LightRUType
LightRU((input_size => hidden_size)::Pair; kwargs...)

Light recurrent unit network. See LightRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.IndRNNType
IndRNN((input_size => hidden_size)::Pair, σ = tanh;
+    kwargs...)

Independently recurrent network. See IndRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • σ: activation function. Default is tanh
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

source
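The cell docstrings describe a single update, while layers such as IndRNN process whole sequences. One common way to relate the two is to apply the cell to each time step in turn and carry the hidden state forward. The sketch below shows that general pattern in plain Python; it is an assumption about the pattern, not a description of the package's internals, and `run_sequence` and `toy_step` are hypothetical names:

```python
def run_sequence(step, xs, h0):
    """Apply `step(x, h)` to every time step, threading the state through."""
    h = h0
    outputs = []
    for x in xs:
        h = step(x, h)
        outputs.append(h)
    return outputs

def toy_step(x, h):
    # Hypothetical cell: h_t = max(0, x_t + 0.5 * h_{t-1}), element-wise.
    return [max(0.0, xi + 0.5 * hi) for xi, hi in zip(x, h)]

# Three time steps of a one-dimensional input, starting from a zero state.
hs = run_sequence(toy_step, xs=[[1.0], [1.0], [-2.0]], h0=[0.0])
```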
RecurrentLayers.LightRUType
LightRU((input_size => hidden_size)::Pair; kwargs...)

Light recurrent unit network. See LightRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \tanh(W_h x_t), \\ f_t &= \delta(W_f x_t + U_f h_{t-1} + b_f), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t. -\end{aligned}\]

source
RecurrentLayers.LiGRUType
LiGRU((input_size => hidden_size)::Pair; kwargs...)

Light gated recurrent network. The implementation does not include the batch normalization as described in the original paper. See LiGRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.LiGRUType
LiGRU((input_size => hidden_size)::Pair; kwargs...)

Light gated recurrent network. The implementation does not include the batch normalization as described in the original paper. See LiGRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1}), \\ \tilde{h}_t &= \text{ReLU}(W_h x_t + U_h h_{t-1}), \\ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t -\end{aligned}\]

source
RecurrentLayers.MGUType
MGU((input_size => hidden_size)::Pair; kwargs...)

Minimal gated unit network. See MGUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.MGUType
MGU((input_size => hidden_size)::Pair; kwargs...)

Minimal gated unit network. See MGUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t -\end{aligned}\]

source
RecurrentLayers.NASType
NAS((input_size => hidden_size)::Pair; kwargs...)

Neural Architecture Search unit. See NASCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.NASType
NAS((input_size => hidden_size)::Pair; kwargs...)

Neural Architecture Search unit. See NASCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \text{First Layer Outputs:} & \\ o_1 &= \sigma(W_i^{(1)} x_t + W_h^{(1)} h_{t-1} + b^{(1)}), \\ o_2 &= \text{ReLU}(W_i^{(2)} x_t + W_h^{(2)} h_{t-1} + b^{(2)}), \\ @@ -42,28 +42,28 @@ c_{\text{new}} &= l_1 \cdot l_2 \\ l_5 &= \tanh(l_3 + l_4) \\ h_{\text{new}} &= \tanh(c_{\text{new}} \cdot l_5) -\end{aligned}\]

source
RecurrentLayers.RHNType
RHN((input_size => hidden_size)::Pair, depth=3; kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • depth: depth of the recurrence. Default is 3
  • couple_carry: couples the carry gate and the transform gate. Default true
  • init_kernel: initializer for the input to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.RHNType
RHN((input_size => hidden_size)::Pair, depth=3; kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • depth: depth of the recurrence. Default is 3
  • couple_carry: couples the carry gate and the transform gate. Default true
  • init_kernel: initializer for the input to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} s_{\ell}^{[t]} &= h_{\ell}^{[t]} \odot t_{\ell}^{[t]} + s_{\ell-1}^{[t]} \odot c_{\ell}^{[t]}, \\ \text{where} \\ h_{\ell}^{[t]} &= \tanh(W_h x^{[t]}\mathbb{I}_{\ell = 1} + U_{h_{\ell}} s_{\ell-1}^{[t]} + b_{h_{\ell}}), \\ t_{\ell}^{[t]} &= \sigma(W_t x^{[t]}\mathbb{I}_{\ell = 1} + U_{t_{\ell}} s_{\ell-1}^{[t]} + b_{t_{\ell}}), \\ c_{\ell}^{[t]} &= \sigma(W_c x^{[t]}\mathbb{I}_{\ell = 1} + U_{c_{\ell}} s_{\ell-1}^{[t]} + b_{c_{\ell}}) -\end{aligned}\]

source
RecurrentLayers.MUT1Type
MUT1((input_size => hidden_size); kwargs...)

Mutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.MUT1Type
MUT1((input_size => hidden_size); kwargs...)

Mutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + \tanh(W_h x_t) + b_h) \odot z \\ &\quad + h_t \odot (1 - z). -\end{aligned}\]

source
RecurrentLayers.MUT2Type
MUT2((input_size => hidden_size); kwargs...)

Mutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.MUT2Type
MUT2((input_size => hidden_size); kwargs...)

Mutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z h_t + b_z), \\ r &= \sigma(x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). -\end{aligned}\]

source
RecurrentLayers.MUT3Type
MUT3((input_size => hidden_size); kwargs...)

Mutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.MUT3Type
MUT3((input_size => hidden_size); kwargs...)

Mutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z \tanh(h_t) + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). -\end{aligned}\]

source
RecurrentLayers.SCRNType
SCRN((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true,
@@ -71,17 +71,17 @@
 s_t &= (1 - \alpha) W_s x_t + \alpha s_{t-1}, \\
 h_t &= \sigma(W_h s_t + U_h h_{t-1} + b_h), \\
 y_t &= f(U_y h_t + W_y s_t)
-\end{aligned}\]

source
RecurrentLayers.PeepholeLSTMType
PeepholeLSTM((input_size => hidden_size)::Pair; kwargs...)

Peephole long short term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{align} +\end{aligned}\]

source
RecurrentLayers.PeepholeLSTMType
PeepholeLSTM((input_size => hidden_size)::Pair; kwargs...)

Peephole long short term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{align} f_t &= \sigma_g(W_f x_t + U_f c_{t-1} + b_f), \\ i_t &= \sigma_g(W_i x_t + U_i c_{t-1} + b_i), \\ o_t &= \sigma_g(W_o x_t + U_o c_{t-1} + b_o), \\ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma_c(W_c x_t + b_c), \\ h_t &= o_t \odot \sigma_h(c_t). -\end{align}\]

source
RecurrentLayers.FastRNNType
FastRNN((input_size => hidden_size), [activation]; kwargs...)

Fast recurrent neural network. See FastRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{align}\]

source
RecurrentLayers.FastRNNType
FastRNN((input_size => hidden_size), [activation]; kwargs...)

Fast recurrent neural network. See FastRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \sigma(W_h x_t + U_h h_{t-1} + b), \\ h_t &= \alpha \tilde{h}_t + \beta h_{t-1} -\end{aligned}\]

Forward

fastrnn(inp, [state])
source
RecurrentLayers.FastGRNNType
FastGRNN((input_size => hidden_size), [activation]; kwargs...)

Fast gated recurrent neural network. See FastGRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

Forward

fastrnn(inp, [state])
source
RecurrentLayers.FastGRNNType
FastGRNN((input_size => hidden_size), [activation]; kwargs...)

Fast gated recurrent neural network. See FastGRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h h_{t-1} + b_h), \\ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1} -\end{aligned}\]

Forward

fastgrnn(inp, [state])
source
+\end{aligned}\]

Forward

fastgrnn(inp, [state])
source diff --git a/previews/PR23/index.html b/previews/PR23/index.html index 9cc2c91..b7fcab4 100644 --- a/previews/PR23/index.html +++ b/previews/PR23/index.html @@ -1,2 +1,2 @@ -Home · RecurrentLayers.jl

RecurrentLayers

RecurrentLayers.jl extends Flux.jl's recurrent layer offering with implementations of bleeding-edge recurrent layers that are not commonly available in base deep learning libraries. It is designed to integrate seamlessly with the larger Flux ecosystem, enabling researchers and practitioners to leverage the latest developments in recurrent neural networks.

Implemented layers

  • Minimal gated unit as MGUCell arxiv
  • Light gated recurrent unit as LiGRUCell arxiv
  • Independently recurrent neural networks as IndRNNCell arxiv
  • Recurrent additive networks as RANCell arxiv
  • Recurrent highway network as RHNCell arxiv
  • Light recurrent unit as LightRUCell pub
  • Neural architecture search unit NASCell arxiv
  • Evolving recurrent neural networks as MUT1Cell, MUT2Cell, MUT3Cell pub
  • Structurally constrained recurrent neural network as SCRNCell arxiv
  • Peephole long short term memory as PeepholeLSTMCell pub
  • FastRNNCell and FastGRNNCell arxiv

Contributing

Contributions are always welcome! We are specifically looking for:

  • Recurrent cells you would like to see implemented
  • Benchmarks
  • Any bugs and mistakes, of course!
  • Documentation, in any form: examples, how-tos, docstrings
+Home · RecurrentLayers.jl

RecurrentLayers

RecurrentLayers.jl extends Flux.jl's recurrent layer offering with implementations of bleeding-edge recurrent layers that are not commonly available in base deep learning libraries. It is designed to integrate seamlessly with the larger Flux ecosystem, enabling researchers and practitioners to leverage the latest developments in recurrent neural networks.

Implemented layers

  • Minimal gated unit as MGUCell arxiv
  • Light gated recurrent unit as LiGRUCell arxiv
  • Independently recurrent neural networks as IndRNNCell arxiv
  • Recurrent additive networks as RANCell arxiv
  • Recurrent highway network as RHNCell arxiv
  • Light recurrent unit as LightRUCell pub
  • Neural architecture search unit NASCell arxiv
  • Evolving recurrent neural networks as MUT1Cell, MUT2Cell, MUT3Cell pub
  • Structurally constrained recurrent neural network as SCRNCell arxiv
  • Peephole long short term memory as PeepholeLSTMCell pub
  • FastRNNCell and FastGRNNCell arxiv

Contributing

Contributions are always welcome! We are specifically looking for:

  • Recurrent cells you would like to see implemented
  • Benchmarks
  • Any bugs and mistakes, of course!
  • Documentation, in any form: examples, how-tos, docstrings
diff --git a/previews/PR23/objects.inv b/previews/PR23/objects.inv index 27975d8..158ee5b 100644 Binary files a/previews/PR23/objects.inv and b/previews/PR23/objects.inv differ diff --git a/previews/PR23/roadmap/index.html b/previews/PR23/roadmap/index.html index 80852e0..948dcd4 100644 --- a/previews/PR23/roadmap/index.html +++ b/previews/PR23/roadmap/index.html @@ -1,2 +1,2 @@ -Roadmap · RecurrentLayers.jl

Roadmap

This page documents some planned work for RecurrentLayers.jl. Future work for this library includes additional cells such as:

  • FastRNNs and FastGRUs (current focus) arxiv
  • Unitary recurrent neural networks arxiv
  • Modern recurrent neural networks such as LRU and minLSTM/minGRU
  • Quasi recurrent neural networks arxiv

Additionally, some cell-independent architectures are planned. These expand the capabilities of recurrent architectures and could, in principle, wrap any cell:

Ideally, an implementation of these would look like, for example, FastSlow(RNNCell, input_size => hidden_size). More details on this soon!

+Roadmap · RecurrentLayers.jl

Roadmap

This page documents some planned work for RecurrentLayers.jl. Future work for this library includes additional cells such as:

  • FastRNNs and FastGRUs (current focus) arxiv
  • Unitary recurrent neural networks arxiv
  • Modern recurrent neural networks such as LRU and minLSTM/minGRU
  • Quasi recurrent neural networks arxiv

Additionally, some cell-independent architectures are planned. These expand the capabilities of recurrent architectures and could, in principle, wrap any cell:

Ideally, an implementation of these would look like, for example, FastSlow(RNNCell, input_size => hidden_size). More details on this soon!