diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index 27497ef..c52200f 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2024-12-22T19:18:52","documenter_version":"1.8.0"}} \ No newline at end of file +{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2024-12-24T16:26:28","documenter_version":"1.8.0"}} \ No newline at end of file diff --git a/dev/api/cells/index.html b/dev/api/cells/index.html index c16cc84..2728a61 100644 --- a/dev/api/cells/index.html +++ b/dev/api/cells/index.html @@ -9,11 +9,11 @@ c_t &= i_t \odot \tilde{c}_t + f_t \odot c_{t-1}, \\ h_t &= g(c_t) \end{aligned}\]

Forward

rancell(inp, (state, cstate))
-rancell(inp)

Arguments

Returns

source
RecurrentLayers.IndRNNCellType
IndRNNCell((input_size => hidden_size)::Pair, σ=relu;
+rancell(inp)

Arguments

  • inp: The input to the rancell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the RANCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
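As a concrete illustration of the forward pass above, here is a minimal sketch of a single-step call to a RANCell; the dimensions (3 => 5), the batch of 8, and the explicit zero states are illustrative choices, not requirements.

using Flux, RecurrentLayers

rancell = RANCell(3 => 5)                      # input_size = 3, hidden_size = 5 (illustrative)
inp = rand(Float32, 3, 8)                      # one time step for a batch of 8
state = zeros(Float32, 5, 8)                   # hidden state
cstate = zeros(Float32, 5, 8)                  # cell state
output, (new_state, new_cstate) = rancell(inp, (state, cstate))
size(output)                                   # (5, 8)

Calling rancell(inp) without the state tuple falls back to zero-initialized states, as described above.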
RecurrentLayers.IndRNNCellType
IndRNNCell((input_size => hidden_size)::Pair, σ=relu;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Independently recurrent cell. See IndRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • σ: activation function. Default is relu
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

indrnncell(inp, state)
-indrnncell(inp)

Arguments

  • inp: The input to the indrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the IndRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.LightRUCellType
LightRUCell((input_size => hidden_size)::Pair;
+indrnncell(inp)

Arguments

  • inp: The input to the indrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the IndRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
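For example, a single unbatched step through an IndRNNCell might look like the sketch below; the sizes and the explicit zero state are illustrative.

using Flux, RecurrentLayers

indrnncell = IndRNNCell(3 => 5, relu)          # input_size = 3, hidden_size = 5 (illustrative), relu activation
inp = rand(Float32, 3)                         # a single input vector
state = zeros(Float32, 5)                      # hidden state
output, new_state = indrnncell(inp, state)
size(output)                                   # (5,)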
RecurrentLayers.LightRUCellType
LightRUCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Light recurrent unit. See LightRU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -21,7 +21,7 @@ f_t &= \delta(W_f x_t + U_f h_{t-1} + b_f), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t. \end{aligned}\]

Forward

lightrucell(inp, state)
-lightrucell(inp)

Arguments

  • inp: The input to the lightrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LightRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.LiGRUCellType
LiGRUCell((input_size => hidden_size)::Pair;
+lightrucell(inp)

Arguments

  • inp: The input to the lightrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LightRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.LiGRUCellType
LiGRUCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Light gated recurrent unit. The implementation does not include the batch normalization as described in the original paper. See LiGRU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -29,7 +29,7 @@ \tilde{h}_t &= \text{ReLU}(W_h x_t + U_h h_{t-1}), \\ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t \end{aligned}\]

Forward

ligrucell(inp, state)
-ligrucell(inp)

Arguments

  • inp: The input to the ligrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LiGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.MGUCellType
MGUCell((input_size => hidden_size)::Pair;
+ligrucell(inp)

Arguments

  • inp: The input to the ligrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LiGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.MGUCellType
MGUCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Minimal gated unit. See MGU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -37,7 +37,7 @@ \tilde{h}_t &= \tanh(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t \end{aligned}\]

Forward

mgucell(inp, state)
-mgucell(inp)

Arguments

  • inp: The input to the mgucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MGUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.NASCellType
NASCell((input_size => hidden_size);
+mgucell(inp)

Arguments

  • inp: The input to the mgucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MGUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
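A minimal batched single-step call to an MGUCell, following the forward signature above; all sizes are illustrative.

using Flux, RecurrentLayers

mgucell = MGUCell(3 => 5)                      # input_size = 3, hidden_size = 5 (illustrative)
inp = rand(Float32, 3, 4)                      # one time step, batch of 4
state = zeros(Float32, 5, 4)
output, new_state = mgucell(inp, state)        # both are 5 x 4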
RecurrentLayers.NASCellType
NASCell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Neural Architecture Search unit. See NAS for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -65,7 +65,7 @@ l_5 &= \tanh(l_3 + l_4) \\ h_{\text{new}} &= \tanh(c_{\text{new}} \cdot l_5) \end{aligned}\]

Forward

nascell(inp, (state, cstate))
-nascell(inp)

Arguments

  • inp: The input to the nascell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the NASCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.RHNCellType
RHNCell((input_size => hidden_size), depth=3;
+nascell(inp)

Arguments

  • inp: The input to the nascell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the NASCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
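Since NASCell carries both a hidden and a cell state, a sketch of one forward step looks like the following; the dimensions are illustrative.

using Flux, RecurrentLayers

nascell = NASCell(3 => 5)                      # illustrative sizes
inp = rand(Float32, 3, 2)                      # one time step, batch of 2
state = zeros(Float32, 5, 2)
cstate = zeros(Float32, 5, 2)
output, (new_state, new_cstate) = nascell(inp, (state, cstate))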
RecurrentLayers.RHNCellType
RHNCell((input_size => hidden_size), depth=3;
     couple_carry::Bool = true,
     cell_kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • depth: depth of the recurrence. Default is 3
  • couple_carry: couples the carry gate and the transform gate. Default true
  • init_kernel: initializer for the input to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} s_{\ell}^{[t]} &= h_{\ell}^{[t]} \odot t_{\ell}^{[t]} + s_{\ell-1}^{[t]} \odot c_{\ell}^{[t]}, \\ @@ -73,9 +73,9 @@ h_{\ell}^{[t]} &= \tanh(W_h x^{[t]}\mathbb{I}_{\ell = 1} + U_{h_{\ell}} s_{\ell-1}^{[t]} + b_{h_{\ell}}), \\ t_{\ell}^{[t]} &= \sigma(W_t x^{[t]}\mathbb{I}_{\ell = 1} + U_{t_{\ell}} s_{\ell-1}^{[t]} + b_{t_{\ell}}), \\ c_{\ell}^{[t]} &= \sigma(W_c x^{[t]}\mathbb{I}_{\ell = 1} + U_{c_{\ell}} s_{\ell-1}^{[t]} + b_{c_{\ell}}) -\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.RHNCellUnitType
RHNCellUnit((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.RHNCellUnitType
RHNCellUnit((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
-    bias = true)
source
RecurrentLayers.MUT1CellType
MUT1Cell((input_size => hidden_size);
+    bias = true)
source
RecurrentLayers.MUT1CellType
MUT1Cell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Mutated unit 1 cell. See MUT1 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -84,7 +84,7 @@ h_{t+1} &= \tanh(U_h (r \odot h_t) + \tanh(W_h x_t) + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mutcell(inp, state)
-mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.MUT2CellType
MUT2Cell((input_size => hidden_size);
+mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
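When no state is passed, the cell starts from a zero state; a minimal sketch with illustrative sizes:

using Flux, RecurrentLayers

mutcell = MUT1Cell(3 => 5)                     # illustrative sizes
inp = rand(Float32, 3)
output, new_state = mutcell(inp)               # zero initial state
output2, new_state2 = mutcell(inp, new_state)  # carry the state into the next step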
RecurrentLayers.MUT2CellType
MUT2Cell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Mutated unit 2 cell. See MUT2 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -93,7 +93,7 @@ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mutcell(inp, state)
-mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.MUT3CellType
MUT3Cell((input_size => hidden_size);
+mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.MUT3CellType
MUT3Cell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Mutated unit 3 cell. See MUT3 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -102,7 +102,7 @@ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mutcell(inp, state)
-mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.SCRNCellType
SCRNCell((input_size => hidden_size)::Pair;
+mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.SCRNCellType
SCRNCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true,
@@ -111,7 +111,7 @@
 h_t &= \sigma(W_h s_t + U_h h_{t-1} + b_h), \\
 y_t &= f(U_y h_t + W_y s_t)
 \end{aligned}\]

Forward

scrncell(inp, (state, cstate))
-scrncell(inp)

Arguments

  • inp: The input to the scrncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the SCRNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.PeepholeLSTMCellType
PeepholeLSTMCell((input_size => hidden_size)::Pair;
+scrncell(inp)

Arguments

  • inp: The input to the scrncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the SCRNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
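A sketch of a single step through an SCRNCell with explicit hidden and context states; the sizes and zero initialization are illustrative.

using Flux, RecurrentLayers

scrncell = SCRNCell(3 => 5)                    # illustrative sizes
inp = rand(Float32, 3, 6)                      # one time step, batch of 6
state = zeros(Float32, 5, 6)
cstate = zeros(Float32, 5, 6)
output, (new_state, new_cstate) = scrncell(inp, (state, cstate))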
RecurrentLayers.PeepholeLSTMCellType
PeepholeLSTMCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Peephole long short term memory cell. See PeepholeLSTM for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -121,14 +121,14 @@ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma_c(W_c x_t + b_c), \\ h_t &= o_t \odot \sigma_h(c_t). \end{aligned}\]

Forward

peepholelstmcell(inp, (state, cstate))
-peepholelstmcell(inp)

Arguments

  • inp: The input to the peepholelstmcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTMCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.FastRNNCellType
FastRNNCell((input_size => hidden_size), [activation];
+peepholelstmcell(inp)

Arguments

  • inp: The input to the peepholelstmcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTMCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
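As with a standard LSTM cell, the forward pass takes and returns a (state, cstate) tuple; a minimal sketch with illustrative sizes:

using Flux, RecurrentLayers

cell = PeepholeLSTMCell(3 => 5)                # illustrative sizes
inp = rand(Float32, 3, 4)                      # one time step, batch of 4
state, cstate = zeros(Float32, 5, 4), zeros(Float32, 5, 4)
output, (new_state, new_cstate) = cell(inp, (state, cstate))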
RecurrentLayers.FastRNNCellType
FastRNNCell((input_size => hidden_size), [activation];
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Fast recurrent neural network cell. See FastRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \sigma(W_h x_t + U_h h_{t-1} + b), \\ h_t &= \alpha \tilde{h}_t + \beta h_{t-1} \end{aligned}\]

Forward

fastrnncell(inp, state)
-fastrnncell(inp)

Arguments

  • inp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.FastGRNNCellType
FastGRNNCell((input_size => hidden_size), [activation];
+fastrnncell(inp)

Arguments

  • inp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
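Cells can be unrolled manually over a sequence by feeding the returned state back in; a two-step sketch with illustrative sizes and random data:

using Flux, RecurrentLayers

cell = FastRNNCell(3 => 5)                     # activation defaults to tanh_fast; sizes illustrative
x1, x2 = rand(Float32, 3, 4), rand(Float32, 3, 4)   # two time steps, batch of 4
out1, state1 = cell(x1)                        # zero initial state
out2, state2 = cell(x2, state1)                # carry the state forward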
RecurrentLayers.FastGRNNCellType
FastGRNNCell((input_size => hidden_size), [activation];
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Fast gated recurrent neural network cell. See FastGRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -136,4 +136,4 @@ \tilde{h}_t &= \tanh(W_h x_t + U_h h_{t-1} + b_h), \\ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1} \end{aligned}\]

Forward

fastgrnncell(inp, state)
-fastgrnncell(inp)

Arguments

  • inp: The input to the fastgrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
+fastgrnncell(inp)

Arguments

Returns

source diff --git a/dev/api/layers/index.html b/dev/api/layers/index.html index 6b2c717..f034dd9 100644 --- a/dev/api/layers/index.html +++ b/dev/api/layers/index.html @@ -6,24 +6,24 @@ c_t &= i_t \odot \tilde{c}_t + f_t \odot c_{t-1}, \\ h_t &= g(c_t) \end{aligned}\]

Forward

ran(inp, (state, cstate))
-ran(inp)

Arguments

Returns

source
RecurrentLayers.IndRNNType
IndRNN((input_size => hidden_size)::Pair, σ = relu;
+ran(inp)

Arguments

  • inp: The input to the ran. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the RAN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
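At the layer level the whole sequence is processed in one call; a minimal sketch, with the feature size, sequence length and batch size chosen purely for illustration:

using Flux, RecurrentLayers

ran = RAN(3 => 5)                              # illustrative sizes
inp = rand(Float32, 3, 10, 8)                  # input_size x len x batch_size
out = ran(inp)                                 # starts from zero states
size(out)                                      # (5, 10, 8)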
RecurrentLayers.IndRNNType
IndRNN((input_size => hidden_size)::Pair, σ = relu;
     kwargs...)

Independently recurrent network. See IndRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • σ: activation function. Default is relu
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

indrnn(inp, state)
-indrnn(inp)

Arguments

  • inp: The input to the indrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the IndRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.LightRUType
LightRU((input_size => hidden_size)::Pair; kwargs...)

Light recurrent unit network. See LightRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +indrnn(inp)

Arguments

  • inp: The input to the indrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the IndRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
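For a single (unbatched) sequence the input is simply input_size x len; a sketch with illustrative sizes:

using Flux, RecurrentLayers

indrnn = IndRNN(3 => 5)                        # illustrative sizes
inp = rand(Float32, 3, 10)                     # input_size x len
out = indrnn(inp)                              # hidden states for every time step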
RecurrentLayers.LightRUType
LightRU((input_size => hidden_size)::Pair; kwargs...)

Light recurrent unit network. See LightRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \tanh(W_h x_t), \\ f_t &= \delta(W_f x_t + U_f h_{t-1} + b_f), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t. \end{aligned}\]

Forward

lightru(inp, state)
-lightru(inp)

Arguments

  • inp: The input to the lightru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the LightRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.LiGRUType
LiGRU((input_size => hidden_size)::Pair; kwargs...)

Light gated recurrent network. The implementation does not include the batch normalization as described in the original paper. See LiGRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +lightru(inp)

Arguments

  • inp: The input to the lightru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the LightRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.LiGRUType
LiGRU((input_size => hidden_size)::Pair; kwargs...)

Light gated recurrent network. The implementation does not include the batch normalization as described in the original paper. See LiGRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1}), \\ \tilde{h}_t &= \text{ReLU}(W_h x_t + U_h h_{t-1}), \\ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t \end{aligned}\]

Forward

ligru(inp, state)
-ligru(inp)

Arguments

  • inp: The input to the ligru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the LiGRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.MGUType
MGU((input_size => hidden_size)::Pair; kwargs...)

Minimal gated unit network. See MGUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +ligru(inp)

Arguments

  • inp: The input to the ligru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the LiGRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.MGUType
MGU((input_size => hidden_size)::Pair; kwargs...)

Minimal gated unit network. See MGUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t \end{aligned}\]

Forward

mgu(inp, state)
-mgu(inp)

Arguments

  • inp: The input to the mgu. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MGU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.NASType
NAS((input_size => hidden_size)::Pair; kwargs...)

Neural Architecture Search unit. See NASCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +mgu(inp)

Arguments

  • inp: The input to the mgu. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MGU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
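Because the layer returns the hidden state for every time step, it composes naturally with downstream Flux layers. Below is a sketch of a small sequence classifier; keeping only the last time step, the two output classes, and all sizes are illustrative assumptions.

using Flux, RecurrentLayers

model = Chain(
    MGU(3 => 5),                  # illustrative sizes
    x -> x[:, end, :],            # keep only the last hidden state of each sequence
    Dense(5 => 2),                # map to 2 output classes
    softmax)
inp = rand(Float32, 3, 10, 8)     # input_size x len x batch_size
out = model(inp)                  # 2 x 8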
RecurrentLayers.NASType
NAS((input_size => hidden_size)::Pair; kwargs...)

Neural Architecture Search unit. See NASCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \text{First Layer Outputs:} & \\ o_1 &= \sigma(W_i^{(1)} x_t + W_h^{(1)} h_{t-1} + b^{(1)}), \\ o_2 &= \text{ReLU}(W_i^{(2)} x_t + W_h^{(2)} h_{t-1} + b^{(2)}), \\ @@ -48,31 +48,31 @@ l_5 &= \tanh(l_3 + l_4) \\ h_{\text{new}} &= \tanh(c_{\text{new}} \cdot l_5) \end{aligned}\]

Forward

nas(inp, (state, cstate))
-nas(inp)

Arguments

  • inp: The input to the nas. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the NAS. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.RHNType
RHN((input_size => hidden_size), depth=3; kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • depth: depth of the recurrence. Default is 3
  • couple_carry: couples the carry gate and the transform gate. Default true
  • init_kernel: initializer for the input to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +nas(inp)

Arguments

  • inp: The input to the nas. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the NAS. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.RHNType
RHN((input_size => hidden_size), depth=3; kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • depth: depth of the recurrence. Default is 3
  • couple_carry: couples the carry gate and the transform gate. Default true
  • init_kernel: initializer for the input to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} s_{\ell}^{[t]} &= h_{\ell}^{[t]} \odot t_{\ell}^{[t]} + s_{\ell-1}^{[t]} \odot c_{\ell}^{[t]}, \\ \text{where} \\ h_{\ell}^{[t]} &= \tanh(W_h x^{[t]}\mathbb{I}_{\ell = 1} + U_{h_{\ell}} s_{\ell-1}^{[t]} + b_{h_{\ell}}), \\ t_{\ell}^{[t]} &= \sigma(W_t x^{[t]}\mathbb{I}_{\ell = 1} + U_{t_{\ell}} s_{\ell-1}^{[t]} + b_{t_{\ell}}), \\ c_{\ell}^{[t]} &= \sigma(W_c x^{[t]}\mathbb{I}_{\ell = 1} + U_{c_{\ell}} s_{\ell-1}^{[t]} + b_{c_{\ell}}) -\end{aligned}\]

source
RecurrentLayers.MUT1Type
MUT1((input_size => hidden_size); kwargs...)

Mutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.MUT1Type
MUT1((input_size => hidden_size); kwargs...)

Mutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + \tanh(W_h x_t) + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mut(inp, state)
-mut(inp)

Arguments

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.MUT2Type
MUT2((input_size => hidden_size); kwargs...)

Mutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +mut(inp)

Arguments

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.MUT2Type
MUT2((input_size => hidden_size); kwargs...)

Mutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z h_t + b_z), \\ r &= \sigma(x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mut(inp, state)
-mut(inp)

Arguments

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.MUT3Type
MUT3((input_size => hidden_size); kwargs...)

Mutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +mut(inp)

Arguments

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.MUT3Type
MUT3((input_size => hidden_size); kwargs...)

Mutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z \tanh(h_t) + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mut(inp, state)
-mut(inp)

Arguments

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.SCRNType
SCRN((input_size => hidden_size)::Pair;
+mut(inp)

Arguments

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.SCRNType
SCRN((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true,
@@ -81,20 +81,20 @@
 h_t &= \sigma(W_h s_t + U_h h_{t-1} + b_h), \\
 y_t &= f(U_y h_t + W_y s_t)
 \end{aligned}\]

Forward

scrn(inp, (state, cstate))
-scrn(inp)

Arguments

  • inp: The input to the scrn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the SCRN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.PeepholeLSTMType
PeepholeLSTM((input_size => hidden_size)::Pair; kwargs...)

Peephole long short term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{align} +scrn(inp)

Arguments

  • inp: The input to the scrn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the SCRN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.PeepholeLSTMType
PeepholeLSTM((input_size => hidden_size)::Pair; kwargs...)

Peephole long short term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{align} f_t &= \sigma_g(W_f x_t + U_f c_{t-1} + b_f), \\ i_t &= \sigma_g(W_i x_t + U_i c_{t-1} + b_i), \\ o_t &= \sigma_g(W_o x_t + U_o c_{t-1} + b_o), \\ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma_c(W_c x_t + b_c), \\ h_t &= o_t \odot \sigma_h(c_t). \end{align}\]

Forward

peepholelstm(inp, (state, cstate))
-peepholelstm(inp)

Arguments

  • inp: The input to the peepholelstm. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTM. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.FastRNNType
FastRNN((input_size => hidden_size), [activation]; kwargs...)

Fast recurrent neural network. See FastRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +peepholelstm(inp)

Arguments

  • inp: The input to the peepholelstm. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTM. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
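A sketch of running a batch of sequences through PeepholeLSTM with explicitly provided initial states; all sizes are illustrative.

using Flux, RecurrentLayers

peepholelstm = PeepholeLSTM(3 => 5)            # illustrative sizes
inp = rand(Float32, 3, 10, 4)                  # input_size x len x batch_size
state = zeros(Float32, 5, 4)
cstate = zeros(Float32, 5, 4)
out = peepholelstm(inp, (state, cstate))       # hidden_size x len x batch_size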
RecurrentLayers.FastRNNType
FastRNN((input_size => hidden_size), [activation]; kwargs...)

Fast recurrent neural network. See FastRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \sigma(W_h x_t + U_h h_{t-1} + b), \\ h_t &= \alpha \tilde{h}_t + \beta h_{t-1} \end{aligned}\]

Forward

fastrnn(inp, state)
-fastrnn(inp)

Arguments

  • inp: The input to the fastrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the FastRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.FastGRNNType
FastGRNN((input_size => hidden_size), [activation]; kwargs...)

Fast gated recurrent neural network. See FastGRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +fastrnn(inp)

Arguments

  • inp: The input to the fastrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the FastRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
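Like any other Flux layer, the network is differentiable end to end. The following is a hedged sketch of a single gradient step: the random data, the mean-squared-error objective, and the Adam learning rate are illustrative stand-ins for a real training setup.

using Flux, RecurrentLayers

model = FastRNN(3 => 5)                        # illustrative sizes
inp = rand(Float32, 3, 10, 4)                  # input_size x len x batch_size
target = rand(Float32, 5, 10, 4)               # dummy targets with the output shape
opt_state = Flux.setup(Adam(1e-2), model)
loss, grads = Flux.withgradient(m -> Flux.mse(m(inp), target), model)
Flux.update!(opt_state, model, grads[1])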
RecurrentLayers.FastGRNNType
FastGRNN((input_size => hidden_size), [activation]; kwargs...)

Fast gated recurrent neural network. See FastGRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h h_{t-1} + b_h), \\ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1} \end{aligned}\]

Forward

fastgrnn(inp, state)
-fastgrnn(inp)

Arguments

  • inp: The input to the fastgrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the FastGRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
+fastgrnn(inp)

Arguments

Returns

source diff --git a/dev/api/wrappers/index.html b/dev/api/wrappers/index.html index 4956a4a..9e1e98c 100644 --- a/dev/api/wrappers/index.html +++ b/dev/api/wrappers/index.html @@ -1,3 +1,3 @@ Wrappers · RecurrentLayers.jl

Wrappers

RecurrentLayers.StackedRNNType
StackedRNN(rlayer, (input_size, hidden_size), args...;
-    num_layers = 1, kwargs...)

Constructs a stack of recurrent layers given the recurrent layer type.

Arguments:

  • rlayer: Any recurrent layer such as MGU, RHN, etc., or Flux.RNN, Flux.LSTM, etc.
  • input_size: Defines the input dimension for the first layer.
  • hidden_size: Defines the dimension of the hidden layer.
  • num_layers: The number of layers to stack. Default is 1.
  • args...: Additional positional arguments passed to the recurrent layer.
  • kwargs...: Additional keyword arguments passed to the recurrent layers.

Returns: A StackedRNN instance containing the specified number of RNN layers and their initial states.

source
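A minimal sketch of building and calling a two-layer stack, assuming the wrapper accepts the same sequence input as the layers it wraps; the choice of MGU, the sizes, and num_layers = 2 are illustrative.

using Flux, RecurrentLayers

stack = StackedRNN(MGU, (3, 5); num_layers = 2)   # two stacked MGU layers (illustrative)
inp = rand(Float32, 3, 10, 4)                     # input_size x len x batch_size
out = stack(inp)                                  # hidden states from the top layer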
+ num_layers = 1, kwargs...)

Constructs a stack of recurrent layers given the recurrent layer type.

Arguments:

Returns: A StackedRNN instance containing the specified number of RNN layers and their initial states.

source diff --git a/dev/index.html b/dev/index.html index 8587eee..84d7fe5 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,2 +1,2 @@ -Home · RecurrentLayers.jl

RecurrentLayers

RecurrentLayers.jl extends Flux.jl's recurrent layer offering by providing implementations of bleeding-edge recurrent layers not commonly available in base deep learning libraries. It is designed for seamless integration with the larger Flux ecosystem, enabling researchers and practitioners to leverage the latest developments in recurrent neural networks.

Implemented layers

  • Minimal gated unit as MGUCell arxiv
  • Light gated recurrent unit as LiGRUCell arxiv
  • Independently recurrent neural networks as IndRNNCell arxiv
  • Recurrent additive networks as RANCell arxiv
  • Recurrent highway network as RHNCell arxiv
  • Light recurrent unit as LightRUCell pub
  • Neural architecture search unit NASCell arxiv
  • Evolving recurrent neural networks as MUT1Cell, MUT2Cell, MUT3Cell pub
  • Structurally constrained recurrent neural network as SCRNCell arxiv
  • Peephole long short term memory as PeepholeLSTMCell pub
  • FastRNNCell and FastGRNNCell arxiv

Contributing

Contributions are always welcome! We specifically look for:

  • Recurrent cells you would like to see implemented
  • Benchmarks
  • Fixes for any bugs/errors
  • Documentation, in any form: examples, how tos, docstrings
+Home · RecurrentLayers.jl

RecurrentLayers

RecurrentLayers.jl extends Flux.jl's recurrent layer offering by providing implementations of bleeding-edge recurrent layers not commonly available in base deep learning libraries. It is designed for seamless integration with the larger Flux ecosystem, enabling researchers and practitioners to leverage the latest developments in recurrent neural networks.

Implemented layers

  • Minimal gated unit as MGUCell arxiv
  • Light gated recurrent unit as LiGRUCell arxiv
  • Independently recurrent neural networks as IndRNNCell arxiv
  • Recurrent additive networks as RANCell arxiv
  • Recurrent highway network as RHNCell arxiv
  • Light recurrent unit as LightRUCell pub
  • Neural architecture search unit NASCell arxiv
  • Evolving recurrent neural networks as MUT1Cell, MUT2Cell, MUT3Cell pub
  • Structurally constrained recurrent neural network as SCRNCell arxiv
  • Peephole long short term memory as PeepholeLSTMCell pub
  • FastRNNCell and FastGRNNCell arxiv

Contributing

Contributions are always welcome! We specifically look for:

  • Recurrent cells you would like to see implemented
  • Benchmarks
  • Fixes for any bugs/errors
  • Documentation, in any form: examples, how tos, docstrings
diff --git a/dev/roadmap/index.html b/dev/roadmap/index.html index 7e9ce02..2ce5cff 100644 --- a/dev/roadmap/index.html +++ b/dev/roadmap/index.html @@ -1,2 +1,2 @@ -Roadmap · RecurrentLayers.jl

Roadmap

This page documents some planned work for RecurrentLayers.jl. Future work for this library includes additional cells such as:

  • FastRNNs and FastGRNNs (current focus) arxiv
  • Unitary recurrent neural networks arxiv
  • Modern recurrent neural networks such as LRU and minLSTM/minGRU
  • Quasi recurrent neural networks arxiv

Additionally, some cell-independent architectures are planned that expand the capabilities of recurrent models and could theoretically take any cell:

An implementation of these would ideally look like, for example, FastSlow(RNNCell, input_size => hidden_size). More details on this soon!

+Roadmap · RecurrentLayers.jl

Roadmap

This page documents some planned work for RecurrentLayers.jl. Future work for this library includes additional cells such as:

  • FastRNNs and FastGRNNs (current focus) arxiv
  • Unitary recurrent neural networks arxiv
  • Modern recurrent neural networks such as LRU and minLSTM/minGRU
  • Quasi recurrent neural networks arxiv

Additionally, some cell-independent architectures are planned that expand the capabilities of recurrent models and could theoretically take any cell:

An implementation of these would ideally look like, for example, FastSlow(RNNCell, input_size => hidden_size). More details on this soon!