diff --git a/docs/src/index.md b/docs/src/index.md
index e2add232..ed7fa8a3 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -1,10 +1,9 @@
 # ReservoirComputing.jl
-ReservoirComputing.jl provides an efficient, modular, and easy to use implementation of Reservoir Computing models such as Echo State Networks (ESNs). Reservoir Computing (RC) is an umbrella term used to describe a family of models such as ESNs and Liquid State Machines (LSMs). The key concept is to expand the input data into a higher dimension and use regression to train the model; in some ways, Reservoir Computers can be considered similar to kernel methods.
-
+ReservoirComputing.jl is a versatile and user-friendly Julia package for building Reservoir Computing models such as Echo State Networks (ESNs). Central to Reservoir Computing is the expansion of the input data into a higher-dimensional space, with regression techniques used to train the model; in this sense Reservoir Computers bear some resemblance to kernel methods. The package follows a modular design, offering both ease of use for newcomers and flexibility for advanced users.
 !!! info "Introductory material"
-    This library assumes some basic knowledge of Reservoir Computing. For a good introduction, we suggest the following papers: the first two are the seminal papers about ESN and LSM, the others are in-depth review papers that should cover all the needed information. For the majority of the algorithms implemented in this library, we cited in the documentation the original work introducing them. If you ever have doubts about a method or a function, just type ```? function``` in the Julia REPL to read the relevant notes.
+    This library assumes some basic knowledge of Reservoir Computing. For a good introduction, we suggest the following papers: the first two are the seminal papers about ESN and LSM, the others are in-depth review papers that should cover all the needed information. For the majority of the algorithms implemented in this library, we cited in the documentation the original work introducing them. If you are ever in doubt about a method or a function, just type ```? function``` in the Julia REPL to read the relevant notes.

  * Jaeger, Herbert: The “echo state” approach to analyzing and training recurrent neural networks-with an erratum note.
  * Maass W, Natschläger T, Markram H: Real-time computing without stable states: a new framework for neural computation based on perturbations.
@@ -12,92 +11,36 @@ ReservoirComputing.jl provides an efficient, modular, and easy to use implementa
  * Lukoševičius, Mantas, and Herbert Jaeger: Reservoir computing approaches to recurrent neural network training.

 !!! info "Performance tip"
-    For faster computations on the CPU, it is suggested to add `using MKL` to the script. For clarity's sake, this library will not be indicated under every example in the documentation.
+    For faster computations on the CPU it is suggested to add `using MKL` to the script. For clarity's sake this library will not be indicated under every example in the documentation.

 ## Installation
+To install ReservoirComputing.jl, ensure you have Julia version 1.6 or higher. Follow these steps:
-To install ReservoirComputing.jl, use the Julia package manager:
+
+ 1. Open the Julia REPL.
+ 2. Enter the Pkg REPL mode by pressing `]`.
+ 3. Type `add ReservoirComputing` and press Enter.
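+
+Equivalently, from a script or the REPL the package can be added with the package manager API:
+
+```julia
+using Pkg
+Pkg.add("ReservoirComputing")
+```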
+For a more customized installation, or to contribute to the package, you can install it directly from the GitHub repository:

```julia
using Pkg
-Pkg.add("ReservoirComputing")
+Pkg.add(url = "https://github.com/SciML/ReservoirComputing.jl.git")
```

-The support for this library is for Julia v1.6 or greater.
+or use `dev ReservoirComputing` from the Pkg REPL to work on a local development copy of the package.

## Features Overview

-This library provides multiple ways of training the chosen RC model. More specifically, the available algorithms are:
-- ```StandardRidge```: a naive implementation of Ridge Regression. The default choice for training.
-- ```LinearModel```: a wrap around [MLJLinearModels](https://juliaai.github.io/MLJLinearModels.jl/stable/).
-- ```LIBSVM.AbstractSVR```: a direct call of [LIBSVM](https://github.com/JuliaML/LIBSVM.jl) regression methods.
-
-Also provided are two different ways of making predictions using RC:
-- ```Generative```: the algorithm uses the prediction of the model in the previous step to continue the prediction. It only needs the number of steps as input.
-- ```Predictive```: standard Machine Learning type of prediction. Given the features, the RC model will return the label/prediction.
-
-It is possible to modify the RC obtained states in the training and prediction steps using the following:
-- ```StandardStates```: default choice, no changes will be made to the states.
-- ```ExtendedStates```: the states are extended using a vertical concatenation, with the input data.
-- ```PaddedStates```: the states are padded using a vertical concatenation with the chosen padding value.
-- ```PaddedExtendedStates```: a combination of the first two. First, the states are extended and then padded.
-
-In addition, another modification is possible through the choice of non-linear algorithms:
-- ```NLADefault```: default choice, no changes will be made to the states.
-- ```NLAT1```
-- ```NLAT2```
-- ```NLAT3```
-
-### Echo State Networks
-For ESNs the following input layers are implemented :
-- ```WeightedLayer```: weighted layer matrix with weights sampled from a uniform distribution.
-- ```DenseLayer```: dense layer matrix with weights sampled from a uniform distribution.
-- ```SparseLayer```: sparse layer matrix with weights sampled from a uniform distribution.
-- ```MinimumLayer```: matrix with constant weights and weight sign decided following one of the two:
-  - ```BernoulliSample```
-  - ```IrrationalSample```
-- ```InformedLayer```: special kin of weighted layer matrix for Hybrid ESNs.
-
-The package also contains multiple implementations of Reservoirs:
-- ```RandSparseReservoir```: random sparse matrix with scaling of spectral radius
-- ```PseudoSVDReservoir```: Pseudo SVD construction of a random sparse matrix
-- ```DelayLineReservoir```: minimal matrix with chosen weights
-- ```DelayLineBackwardReservoir```: minimal matrix with chosen weights
-- ```SimpleCycleReservoir```: minimal matrix with chosen weights
-- ```CycleJumpsReservoir```: minimal matrix with chosen weights
-
-In addition, multiple ways of driving the reservoir states are also provided:
-- ```RNN```: standard Recurrent Neural Network driver.
-- ```MRNN```: Multiple RNN driver, it consists of a linear combination of RNNs -- ```GRU```: gated Recurrent Unit driver, with all the possible GRU variants available: - - ```FullyGated``` - - ```Minimal``` - -A hybrid version of the model is also available through ```Hybrid``` - -### Reservoir Computing with Cellular Automata -The package provides also an implementation of Reservoir Computing models based on one dimensional Cellular Automata through the ```RECA``` call. For the moment, the only input encoding available (an input encoding plays a similar role to the input matrix for ESNs) is a random mapping, called through ```RandomMapping```. - -All the training methods described above can be used, as can all the modifications to the states. Both prediction methods are also possible in theory, although in the literature only ```Predictive``` tasks have been explored. +- **Multiple Training Algorithms**: Supports Ridge Regression, Linear Models, and LIBSVM regression methods for Reservoir Computing models. +- **Diverse Prediction Methods**: Offers both generative and predictive methods for Reservoir Computing predictions. +- **Modifiable Training and Prediction**: Allows modifications in Reservoir Computing states, such as state extension, padding, and combination methods. +- **Non-linear Algorithm Options**: Includes options for non-linear modifications in algorithms. +- **Echo State Networks (ESNs)**: Features various input layers, reservoirs, and methods for driving ESN reservoir states. +- **Cellular Automata-Based Reservoir Computing**: Introduces models based on one-dimensional Cellular Automata for Reservoir Computing. ## Contributing - - - Please refer to the - [SciML ColPrac: Contributor's Guide on Collaborative Practices for Community Packages](https://github.com/SciML/ColPrac/blob/master/README.md) - for guidance on PRs, issues, and other matters relating to contributing to SciML. - - - See the [SciML Style Guide](https://github.com/SciML/SciMLStyle) for common coding practices and other style decisions. - - There are a few community forums: - - + The #diffeq-bridged and #sciml-bridged channels in the - [Julia Slack](https://julialang.org/slack/) - + The #diffeq-bridged and #sciml-bridged channels in the - [Julia Zulip](https://julialang.zulipchat.com/#narrow/stream/279055-sciml-bridged) - + On the [Julia Discourse forums](https://discourse.julialang.org) - + See also [SciML Community page](https://sciml.ai/community/) - +Contributions to ReservoirComputing.jl are highly encouraged and appreciated. Whether it's through implementing new RC model variations, enhancing documentation, adding examples, or any improvement, your contribution is valuable. We welcome posts of relevant papers or ideas in the issues section. For deeper insights into the library's functionality, the API section in the documentation is a great resource. For any queries not suited for issues, please reach out to the lead developers via Slack or email. ## Citing -If you use this library in your work, please cite: +If you use ReservoirComputing.jl in your work, we kindly ask you to cite it. Here is the BibTeX entry for your convenience: ```bibtex @article{JMLR:v23:22-0611, diff --git a/src/esn/echostatenetwork.jl b/src/esn/echostatenetwork.jl index 1230ee5f..2fd774db 100644 --- a/src/esn/echostatenetwork.jl +++ b/src/esn/echostatenetwork.jl @@ -16,7 +16,9 @@ end """ Default() -Sets the type of the ESN as the standard model. No parameters are needed. 
+The `Default` struct specifies the use of the standard model in Echo State Networks (ESNs). +It requires no parameters and is used when no specific variations or customizations of the ESN model are needed. +This struct is ideal for straightforward applications where the default ESN settings are sufficient. """ struct Default <: AbstractVariation end struct Hybrid{T, K, O, I, S, D} <: AbstractVariation @@ -31,11 +33,24 @@ end """ Hybrid(prior_model, u0, tspan, datasize) -Given the model parameters, returns an ```Hybrid``` variation of the ESN. This entails -a different training and prediction. Construction based on [1]. +Constructs a `Hybrid` variation of Echo State Networks (ESNs) integrating a knowledge-based model +(`prior_model`) with ESNs for advanced training and prediction in chaotic systems. -[1] Jaideep Pathak et al. "Hybrid Forecasting of Chaotic Processes: Using Machine -Learning in Conjunction with a Knowledge-Based Model" (2018) +# Parameters +- `prior_model`: A knowledge-based model function for integration with ESNs. +- `u0`: Initial conditions for the model. +- `tspan`: Time span as a tuple, indicating the duration for model operation. +- `datasize`: The size of the data to be processed. + +# Returns +- A `Hybrid` struct instance representing the combined ESN and knowledge-based model. + +This method is effective for chaotic processes as highlighted in [^Pathak]. + +Reference: +[^Pathak]: Jaideep Pathak et al. + "Hybrid Forecasting of Chaotic Processes: + Using Machine Learning in Conjunction with a Knowledge-Based Model" (2018). """ function Hybrid(prior_model, u0, tspan, datasize) trange = collect(range(tspan[1], tspan[2], length = datasize)) @@ -47,28 +62,33 @@ function Hybrid(prior_model, u0, tspan, datasize) end """ - ESN(train_data; - variation = Default(), - input_layer = DenseLayer(), - reservoir = RandSparseReservoir(), - bias = NullLayer(), - reservoir_driver = RNN(), - nla_type = NLADefault(), - states_type = StandardStates()) - (esn::ESN)(prediction::AbstractPrediction, - output_layer::AbstractOutputLayer; - initial_conditions=output_layer.last_value, - last_state=esn.states[:, end]) - -Constructor for the Echo State Network model. It requires the reservoir size as the input -and the data for the training. It returns a struct ready to be trained with the states -already harvested. - -After the training, this struct can be used for the prediction following the second -function call. This will take as input a prediction type and the output layer from the -training. The ```initial_conditions``` and ```last_state``` parameters can be left as -they are, unless there is a specific reason to change them. All the components are -detailed in the API documentation. More examples are given in the general documentation. + ESN(train_data; kwargs...) -> ESN + +Creates an Echo State Network (ESN) using specified parameters and training data, suitable for various machine learning tasks. + +# Parameters +- `train_data`: Matrix of training data (columns as time steps, rows as features). +- `variation`: Variation of ESN (default: `Default()`). +- `input_layer`: Input layer of ESN (default: `DenseLayer()`). +- `reservoir`: Reservoir of the ESN (default: `RandSparseReservoir(100)`). +- `bias`: Bias vector for each time step (default: `NullLayer()`). +- `reservoir_driver`: Mechanism for evolving reservoir states (default: `RNN()`). +- `nla_type`: Non-linear activation type (default: `NLADefault()`). +- `states_type`: Format for storing states (default: `StandardStates()`). 
+- `washout`: Initial time steps to discard (default: `0`).
+- `matrix_type`: Type of matrices used internally (default: type of `train_data`).
+
+# Returns
+- An initialized ESN instance with specified parameters.
+
+# Examples
+```julia
+using ReservoirComputing
+
+train_data = rand(10, 100)  # 10 features, 100 time steps
+
+esn = ESN(train_data, reservoir=RandSparseReservoir(200), washout=10)
+```
 """
 function ESN(train_data;
     variation = Default(),
@@ -187,11 +207,42 @@ end

 #training dispatch on esn
 """
-    train(esn::AbstractEchoStateNetwork, target_data, training_method=StandardRidge(0.0))
+    train(esn::AbstractEchoStateNetwork, target_data, training_method = StandardRidge(0.0))
+
+Trains an Echo State Network (ESN) using the provided target data and a specified training method.
+
+# Parameters
+- `esn::AbstractEchoStateNetwork`: The ESN instance to be trained.
+- `target_data`: Supervised training data for the ESN.
+- `training_method`: The method for training the ESN (default: `StandardRidge(0.0)`).
+
+# Returns
+- The trained ESN model. The exact type and structure of the return value depend on the
+  `training_method` and the specific ESN implementation.
+
+# Example
+```julia
+using ReservoirComputing
+
+# Initialize an ESN instance and target data
+train_data = rand(10, 100)  # 10 features, 100 time steps
+esn = ESN(train_data, reservoir=RandSparseReservoir(200), washout=10)
+target_data = rand(size(train_data, 2))
+
+# Train the ESN using the default training method
+trained_esn = train(esn, target_data)
+
+# Train the ESN using a custom training method
+trained_esn = train(esn, target_data, training_method=StandardRidge(1.0))
+```
-Training of the built ESN over the ```target_data```. The default training method is
-RidgeRegression. The output is an ```OutputLayer``` object to be fed to the esn call
-for the prediction.
+# Notes
+- When using a `Hybrid` variation, the function extends the state matrix with data from the
+  physical model included in the `variation`.
+- The training is handled by a lower-level `_train` function which takes the new state matrix
+  and performs the actual training using the specified `training_method`.
 """
 function train(esn::AbstractEchoStateNetwork,
     target_data,
diff --git a/src/esn/esn_input_layers.jl b/src/esn/esn_input_layers.jl
index 4794be43..8347a4bc 100644
--- a/src/esn/esn_input_layers.jl
+++ b/src/esn/esn_input_layers.jl
@@ -8,13 +8,21 @@ end
     WeightedInput(scaling)
     WeightedInput(;scaling=0.1)

-Returns a weighted layer initializer object, that will produce a weighted input matrix with
-random non-zero elements drawn from [-```scaling```, ```scaling```], as described
-in [1]. The ```scaling``` factor can be given as arg or kwarg.
+Creates a `WeightedInput` layer initializer for Echo State Networks.
+This initializer generates a weighted input matrix with random non-zero
+elements distributed uniformly within the range [-`scaling`, `scaling`],
+following the approach in [^Lu].

-[1] Lu, Zhixin, et al. "_Reservoir observers: Model-free inference of unmeasured variables
-in chaotic systems._"
-Chaos: An Interdisciplinary Journal of Nonlinear Science 27.4 (2017): 041102.
+# Parameters
+- `scaling`: The scaling factor for the weight distribution (default: 0.1).
+
+# Returns
+- A `WeightedInput` instance to be used for initializing the input layer of an ESN.
+
+Reference:
+[^Lu]: Lu, Zhixin, et al.
+    "Reservoir observers: Model-free inference of unmeasured variables in chaotic systems."
+ Chaos: An Interdisciplinary Journal of Nonlinear Science 27.4 (2017): 041102. """ function WeightedLayer(; scaling = 0.1) return WeightedLayer(scaling) @@ -45,10 +53,16 @@ end DenseLayer(scaling) DenseLayer(;scaling=0.1) -Returns a fully connected layer initializer object, that will produce a weighted input -matrix with random non-zero elements drawn from [-```scaling```, ```scaling```]. -The ```scaling``` factor can be given as arg or kwarg. This is the default choice in the -```ESN``` construction. +Creates a `DenseLayer` initializer for Echo State Networks, generating a fully connected input layer. +The layer is initialized with random weights uniformly distributed within [-`scaling`, `scaling`]. +This scaling factor can be provided either as an argument or a keyword argument. +The `DenseLayer` is the default input layer in `ESN` construction. + +# Parameters +- `scaling`: The scaling factor for weight distribution (default: 0.1). + +# Returns +- A `DenseLayer` instance for initializing the ESN's input layer. """ struct DenseLayer{T} <: AbstractLayer scaling::T @@ -61,8 +75,15 @@ end """ create_layer(input_layer::AbstractLayer, res_size, in_size) -Returns a ```res_size``` times ```in_size``` matrix layer, built according to the -```input_layer``` constructor. +Generates a matrix layer of size `res_size` x `in_size`, constructed according to the specifications of the `input_layer`. + +# Parameters +- `input_layer`: An instance of `AbstractLayer` determining the layer construction. +- `res_size`: The number of rows (reservoir size) for the layer. +- `in_size`: The number of columns (input size) for the layer. + +# Returns +- A matrix representing the constructed layer. """ function create_layer(input_layer::DenseLayer, res_size, @@ -78,9 +99,16 @@ end SparseLayer(scaling; sparsity=0.1) SparseLayer(;scaling=0.1, sparsity=0.1) -Returns a sparsely connected layer initializer object, that will produce a random sparse -input matrix with random non-zero elements drawn from [-```scaling```, ```scaling```] and -given sparsity. The ```scaling``` and ```sparsity``` factors can be given as args or kwargs. +Creates a `SparseLayer` initializer for Echo State Networks, generating a sparse input layer. +The layer is initialized with weights distributed within [-`scaling`, `scaling`] +and a specified `sparsity` level. Both `scaling` and `sparsity` can be set as arguments or keyword arguments. + +# Parameters +- `scaling`: Scaling factor for weight distribution (default: 0.1). +- `sparsity`: Sparsity level of the layer (default: 0.1). + +# Returns +- A `SparseLayer` instance for initializing ESN's input layer with sparse connections. """ struct SparseLayer{T} <: AbstractLayer scaling::T @@ -117,13 +145,21 @@ end BernoulliSample(p) BernoulliSample(;p=0.5) -Returns a Bernoulli sign constructor for the ```MinimumLayer``` call. The ```p``` factor -determines the probability of the result, as in the Distributions call. The value can be -passed as an arg or kwarg. This sign weight determination for input layers is introduced -in [1]. +Creates a `BernoulliSample` constructor for the `MinimumLayer`. +It uses a Bernoulli distribution to determine the sign of weights in the input layer. +The parameter `p` sets the probability of a weight being positive, as per the `Distributions` package. +This method of sign weight determination for input layers is based on the approach in [^Rodan]. + +# Parameters +- `p`: Probability of a positive weight (default: 0.5). -[1] Rodan, Ali, and Peter Tino. 
"_Minimum complexity echo state network._" -IEEE transactions on neural networks 22.1 (2010): 131-144. +# Returns +- A `BernoulliSample` instance for generating sign weights in `MinimumLayer`. + +Reference: +[^Rodan]: Rodan, Ali, and Peter Tino. + "Minimum complexity echo state network." + IEEE Transactions on Neural Networks 22.1 (2010): 131-144. """ function BernoulliSample(; p = 0.5) return BernoulliSample(p) @@ -138,13 +174,22 @@ end IrrationalSample(irrational, start) IrrationalSample(;irrational=pi, start=1) -Returns an irrational sign constructor for the ```MinimumLayer``` call. The values can be -passed as args or kwargs. The sign of the weight is decided from the decimal expansion of -the given ```irrational```. The first ```start``` decimal digits are thresholded at 4.5, -then the n-th input sign will be + and - respectively. +Creates an `IrrationalSample` constructor for the `MinimumLayer`. +It determines the sign of weights in the input layer based on the decimal expansion of an `irrational` number. +The `start` parameter sets the starting point in the decimal sequence. +The signs are assigned based on the thresholding of each decimal digit against 4.5, as described in [^Rodan]. + +# Parameters +- `irrational`: An irrational number for weight sign determination (default: π). +- `start`: Starting index in the decimal expansion (default: 1). -[1] Rodan, Ali, and Peter Tiňo. "_Simple deterministically constructed cycle reservoirs -with regular jumps._" Neural computation 24.7 (2012): 1822-1852. +# Returns +- An `IrrationalSample` instance for generating sign weights in `MinimumLayer`. + +Reference: +[^Rodan]: Rodan, Ali, and Peter Tiňo. + "Simple deterministically constructed cycle reservoirs with regular jumps." + Neural Computation 24.7 (2012): 1822-1852. """ function IrrationalSample(; irrational = pi, start = 1) return IrrationalSample(irrational, start) @@ -160,15 +205,25 @@ end MinimumLayer(weight; sampling=BernoulliSample(0.5)) MinimumLayer(;weight=0.1, sampling=BernoulliSample(0.5)) -Returns a fully connected layer initializer object. The matrix constructed with this -initializer presents the same absolute weight value, decided by the ```weight``` factor. -The sign of each entry is decided by the ```sampling``` struct. Construction detailed -in [1] and [2]. - -[1] Rodan, Ali, and Peter Tino. "_Minimum complexity echo state network._" -IEEE transactions on neural networks 22.1 (2010): 131-144. -[2] Rodan, Ali, and Peter Tiňo. "_Simple deterministically constructed cycle reservoirs -with regular jumps._" Neural computation 24.7 (2012): 1822-1852. +Creates a `MinimumLayer` initializer for Echo State Networks, generating a fully connected input layer. +This layer has a uniform absolute weight value (`weight`) with the sign of each +weight determined by the `sampling` method. This approach, as detailed in [^Rodan1] and [^Rodan2], +allows for controlled weight distribution in the layer. + +# Parameters +- `weight`: Absolute value of weights in the layer. +- `sampling`: Method for determining the sign of weights (default: `BernoulliSample(0.5)`). + +# Returns +- A `MinimumLayer` instance for initializing the ESN's input layer. + +References: +[^Rodan1]: Rodan, Ali, and Peter Tino. + "Minimum complexity echo state network." + IEEE Transactions on Neural Networks 22.1 (2010): 131-144. +[^Rodan2]: Rodan, Ali, and Peter Tiňo. + "Simple deterministically constructed cycle reservoirs with regular jumps." + Neural Computation 24.7 (2012): 1822-1852. 
""" function MinimumLayer(weight; sampling = BernoulliSample(0.5)) return MinimumLayer(weight, sampling) @@ -234,13 +289,26 @@ end """ InformedLayer(model_in_size; scaling=0.1, gamma=0.5) -Returns a weighted input layer matrix, with random non-zero elements drawn from -[-```scaling```, ```scaling```], where some γ of reservoir nodes are connected exclusively -to the raw inputs, and the rest to the outputs of the prior knowledge model, -as described in [1]. +Creates an `InformedLayer` initializer for Echo State Networks (ESNs) that generates +a weighted input layer matrix. The matrix contains random non-zero elements drawn from +the range [-```scaling```, ```scaling```]. This initializer ensures that a fraction (`gamma`) +of reservoir nodes are exclusively connected to the raw inputs, while the rest are +connected to the outputs of a prior knowledge model, as described in [^Pathak]. -[1] Jaideep Pathak et al. "_Hybrid Forecasting of Chaotic Processes: Using Machine Learning -in Conjunction with a Knowledge-Based Model_" (2018) +# Arguments +- `model_in_size`: The size of the prior knowledge model's output, + which determines the number of columns in the input layer matrix. + +# Keyword Arguments +- `scaling`: The absolute value of the weights (default: 0.1). +- `gamma`: The fraction of reservoir nodes connected exclusively to raw inputs (default: 0.5). + +# Returns +- An `InformedLayer` instance for initializing the ESN's input layer matrix. + +Reference: +[^Pathak]: Jaideep Pathak et al. + "Hybrid Forecasting of Chaotic Processes: Using Machine Learning in Conjunction with a Knowledge-Based Model" (2018). """ function InformedLayer(model_in_size; scaling = 0.1, gamma = 0.5) return InformedLayer(scaling, gamma, model_in_size) @@ -286,9 +354,12 @@ function create_layer(input_layer::InformedLayer, end """ - NullLayer(model_in_size; scaling=0.1, gamma=0.5) + NullLayer() + +Creates a `NullLayer` initializer for Echo State Networks (ESNs) that generates a vector of zeros. -Returns a vector of zeros. +# Returns +- A `NullLayer` instance for initializing the ESN's input layer matrix. """ struct NullLayer <: AbstractLayer end diff --git a/src/esn/esn_reservoir_drivers.jl b/src/esn/esn_reservoir_drivers.jl index 8590b279..cebbc915 100644 --- a/src/esn/esn_reservoir_drivers.jl +++ b/src/esn/esn_reservoir_drivers.jl @@ -2,10 +2,28 @@ abstract type AbstractReservoirDriver end """ create_states( - reservoir_driver::AbstractReservoirDriver, train_data, reservoir_matrix,input_matrix + reservoir_driver::AbstractReservoirDriver, + train_data, + washout, + reservoir_matrix, + input_matrix, + bias_vector ) -Return the trained ESN states according to the given driver. +Create and return the trained Echo State Network (ESN) states according to the specified reservoir driver. + +# Arguments +- `reservoir_driver::AbstractReservoirDriver`: The reservoir driver that determines how the ESN states evolve over time. +- `train_data`: The training data used to train the ESN. +- `washout::Int`: The number of initial time steps to discard during training to allow the reservoir dynamics to wash out the initial conditions. +- `reservoir_matrix`: The reservoir matrix representing the dynamic, recurrent part of the ESN. +- `input_matrix`: The input matrix that defines the connections between input features and reservoir nodes. +- `bias_vector`: The bias vector to be added at each time step during the reservoir update. 
+ +# Returns +- A matrix of trained ESN states, where each column represents the state at a specific time step. + +This function is responsible for creating and returning the states of the ESN during training based on the provided training data and parameters. """ function create_states(reservoir_driver::AbstractReservoirDriver, train_data, @@ -79,7 +97,19 @@ end RNN(activation_function, leaky_coefficient) RNN(;activation_function=tanh, leaky_coefficient=1.0) -Returns a Recurrent Neural Network initializer for the ESN. This is the default choice. +Returns a Recurrent Neural Network (RNN) initializer for the Echo State Network (ESN). + +# Arguments +- `activation_function`: The activation function used in the RNN. +- `leaky_coefficient`: The leaky coefficient used in the RNN. + +# Keyword Arguments +- `activation_function`: The activation function used in the RNN. Defaults to `tanh`. +- `leaky_coefficient`: The leaky coefficient used in the RNN. Defaults to 1.0. + +This function creates an RNN object with the specified activation function and leaky coefficient, +which can be used as a reservoir driver in the ESN. + """ function RNN(; activation_function = NNlib.fast_act(tanh), leaky_coefficient = 1.0) RNN(activation_function, leaky_coefficient) @@ -127,18 +157,27 @@ end """ MRNN(activation_function, leaky_coefficient, scaling_factor) - MRNN(;activation_function=[tanh, sigmoid], leaky_coefficient=1.0, + MRNN(;activation_function=[tanh, sigmoid], leaky_coefficient=1.0, scaling_factor=fill(leaky_coefficient, length(activation_function))) -Returns a Multiple RNN initializer, where multiple functions are combined in a linear -combination with chosen parameters ```scaling_factor```. The ```activation_function``` -and ```scaling_factor``` arguments must be vectors of the same size. Multiple combinations -are possible. The implementation is based upon the double activation function idea, -found in [1]. +Returns a Multiple RNN (MRNN) initializer for the Echo State Network (ESN), introduced in [^lun]. -[1] Lun, Shu-Xian, et al. "_A novel model of leaky integrator echo state network for -time-series prediction._" Neurocomputing 159 (2015): 58-66. +# Arguments +- `activation_function`: A vector of activation functions used in the MRNN. +- `leaky_coefficient`: The leaky coefficient used in the MRNN. +- `scaling_factor`: A vector of scaling factors for combining activation functions. +# Keyword Arguments +- `activation_function`: A vector of activation functions used in the MRNN. Defaults to `[tanh, sigmoid]`. +- `leaky_coefficient`: The leaky coefficient used in the MRNN. Defaults to 1.0. +- `scaling_factor`: A vector of scaling factors for combining activation functions. Defaults to an array of the same size as `activation_function` with all elements set to `leaky_coefficient`. + +This function creates an MRNN object with the specified activation functions, leaky coefficient, and scaling factors, which can be used as a reservoir driver in the ESN. + +# Reference: +[^lun]: Lun, Shu-Xian, et al. + "_A novel model of leaky integrator echo state network for + time-series prediction._" Neurocomputing 159 (2015): 58-66. """ function MRNN( ; @@ -192,20 +231,28 @@ end """ FullyGated() -Returns a standard Gated Recurrent Unit ESN initializer, as described in [1]. +Returns a Fully Gated Recurrent Unit (FullyGated) initializer for the Echo State Network (ESN). + +This function creates a FullyGated object, which can be used as a reservoir driver in the ESN. 
+The FullyGated variant is described in the literature reference [^cho]. + +# Returns +- `FullyGated`: A FullyGated reservoir driver. + +# Reference +[^cho]: Cho, Kyunghyun, et al. + "_Learning phrase representations using RNN encoder-decoder for statistical machine translation._" + arXiv preprint arXiv:1406.1078 (2014). -[1] Cho, Kyunghyun, et al. “_Learning phrase representations using RNN encoder-decoder -for statistical machine translation._” -arXiv preprint arXiv:1406.1078 (2014). """ struct FullyGated <: AbstractGRUVariant end """ Minimal() -Returns a minimal GRU ESN initializer as described in [1]. +Returns a minimal GRU ESN initializer as described in [^Zhou]. -[1] Zhou, Guo-Bing, et al. "_Minimal gated unit for recurrent neural networks._" +[^Zhou]: Zhou, Guo-Bing, et al. "_Minimal gated unit for recurrent neural networks._" International Journal of Automation and Computing 13.3 (2016): 226-234. """ struct Minimal <: AbstractGRUVariant end @@ -218,10 +265,22 @@ struct Minimal <: AbstractGRUVariant end bias = fill(DenseLayer(), 2), variant = FullyGated()) -Returns a Gated Recurrent Unit [1] reservoir driver. +Returns a Gated Recurrent Unit (GRU) reservoir driver for Echo State Networks (ESNs). This driver is based on the GRU architecture [^Cho], which is designed to capture temporal dependencies in data and is commonly used in various machine learning applications. + +# Arguments +- `activation_function`: An array of activation functions for the GRU layers. By default, it uses sigmoid activation functions for the update gate, reset gate, and tanh for the hidden state. +- `inner_layer`: An array of inner layers used in the GRU architecture. By default, it uses two dense layers. +- `reservoir`: An array of reservoir layers. By default, it uses two random sparse reservoirs. +- `bias`: An array of bias layers for the GRU. By default, it uses two dense layers. +- `variant`: The GRU variant to use. By default, it uses the "FullyGated" variant. + +# Returns +A GRUParams object containing the parameters needed for the GRU-based reservoir driver. -[1] Cho, Kyunghyun, et al. “_Learning phrase representations using RNN encoder-decoder for -statistical machine translation._” arXiv preprint arXiv:1406.1078 (2014). +# References +[^Cho]: Cho, Kyunghyun, et al. + "_Learning phrase representations using RNN encoder-decoder for statistical machine translation._" + arXiv preprint arXiv:1406.1078 (2014). """ function GRU( ; diff --git a/src/esn/esn_reservoirs.jl b/src/esn/esn_reservoirs.jl index 434a8cda..c4c3a86f 100644 --- a/src/esn/esn_reservoirs.jl +++ b/src/esn/esn_reservoirs.jl @@ -18,9 +18,19 @@ end RandSparseReservoir(res_size, radius, sparsity) RandSparseReservoir(res_size; radius=1.0, sparsity=0.1) -Returns a random sparse reservoir initializer, that will return a matrix with given -`sparsity` and scaled spectral radius according to `radius`. This is the default choice -in the ```ESN``` construction. + +Returns a random sparse reservoir initializer, which generates a matrix of size `res_size x res_size` with the specified `sparsity` and scaled spectral radius according to `radius`. This type of reservoir initializer is commonly used in Echo State Networks (ESNs) for capturing complex temporal dependencies. + +# Arguments +- `res_size`: The size of the reservoir matrix. +- `radius`: The desired spectral radius of the reservoir. By default, it is set to 1.0. +- `sparsity`: The sparsity level of the reservoir matrix, controlling the fraction of zero elements. By default, it is set to 0.1. 
+ +# Returns +A RandSparseReservoir object that can be used as a reservoir initializer in ESN construction. + +# References +This type of reservoir initialization is a common choice in ESN construction for its ability to capture temporal dependencies in data. However, there is no specific reference associated with this function. """ function RandSparseReservoir(res_size; radius = 1.0, sparsity = 0.1) return RandSparseReservoir(res_size, radius, sparsity) @@ -30,8 +40,18 @@ end create_reservoir(reservoir::AbstractReservoir, res_size) create_reservoir(reservoir, args...) -Given an ```AbstractReservoir` constructor and the reservoir size, it returns the -corresponding matrix. Alternatively, it accepts a given matrix. +Given an `AbstractReservoir` constructor and the size of the reservoir (`res_size`), this function returns the corresponding reservoir matrix. Alternatively, it accepts a pre-generated matrix. + +# Arguments +- `reservoir`: An `AbstractReservoir` object or constructor. +- `res_size`: The size of the reservoir matrix. +- `matrix_type`: The type of the resulting matrix. By default, it is set to `Matrix{Float64}`. + +# Returns +A matrix representing the reservoir, generated based on the properties of the specified `reservoir` object or constructor. + +# References +The choice of reservoir initialization is crucial in Echo State Networks (ESNs) for achieving effective temporal modeling. Specific references for reservoir initialization methods may vary based on the type of reservoir used, but the practice of initializing reservoirs for ESNs is widely documented in the ESN literature. """ function create_reservoir(reservoir::RandSparseReservoir, res_size; @@ -84,11 +104,22 @@ end PseudoSVDReservoir(max_value, sparsity, sorted, reverse_sort) PseudoSVDReservoir(max_value, sparsity; sorted=true, reverse_sort=false) -Returns an initializer to build a sparse reservoir matrix, with given ```sparsity``` -created using SVD as described in [1]. +Returns an initializer to build a sparse reservoir matrix with the given `sparsity` by using a pseudo-SVD approach as described in [^yang]. + +# Arguments +- `res_size`: The size of the reservoir matrix. +- `max_value`: The maximum absolute value of elements in the matrix. +- `sparsity`: The desired sparsity level of the reservoir matrix. +- `sorted`: A boolean indicating whether to sort the singular values before creating the diagonal matrix. By default, it is set to `true`. +- `reverse_sort`: A boolean indicating whether to reverse the sorted singular values. By default, it is set to `false`. + +# Returns +A PseudoSVDReservoir object that can be used as a reservoir initializer in ESN construction. -[1] Yang, Cuili, et al. "_Design of polynomial echo state networks for time -series prediction._" Neurocomputing 290 (2018): 148-160. +# References +This reservoir initialization method, based on a pseudo-SVD approach, is inspired by the work in [^yang], which focuses on designing polynomial echo state networks for time series prediction. + +[^yang]: Yang, Cuili, et al. "_Design of polynomial echo state networks for time series prediction._" Neurocomputing 290 (2018): 148-160. """ function PseudoSVDReservoir(res_size, max_value, sparsity; sorted = true, reverse_sort = false) @@ -165,10 +196,17 @@ end DelayLineReservoir(res_size; weight=0.1) Returns a Delay Line Reservoir matrix constructor to obtain a deterministic reservoir as -described in [1]. 
The ```weight``` can be passed as arg or kwarg, and it determines the -absolute value of all the connections in the reservoir. +described in [^Rodan2010]. -[1] Rodan, Ali, and Peter Tino. "_Minimum complexity echo state network._" +# Arguments +- `res_size::Int`: The size of the reservoir. +- `weight::T`: The weight determines the absolute value of all the connections in the reservoir. + +# Returns +A `DelayLineReservoir` object. + +# References +[^Rodan2010]: Rodan, Ali, and Peter Tino. "Minimum complexity echo state network." IEEE transactions on neural networks 22.1 (2010): 131-144. """ function DelayLineReservoir(res_size; weight = 0.1) @@ -199,11 +237,20 @@ end DelayLineBackwardReservoir(res_size, weight, fb_weight) DelayLineBackwardReservoir(res_size; weight=0.1, fb_weight=0.2) -Returns a Delay Line Reservoir constructor to create a matrix with Backward connections -as described in [1]. The ```weight``` and ```fb_weight``` can be passed as either args or -kwargs, and they determine the only absolute values of the connections in the reservoir. +Returns a Delay Line Reservoir constructor to create a matrix with backward connections +as described in [^Rodan2010]. The `weight` and `fb_weight` can be passed as either arguments or +keyword arguments, and they determine the absolute values of the connections in the reservoir. + +# Arguments +- `res_size::Int`: The size of the reservoir. +- `weight::T`: The weight determines the absolute value of forward connections in the reservoir. +- `fb_weight::T`: The `fb_weight` determines the absolute value of backward connections in the reservoir. -[1] Rodan, Ali, and Peter Tino. "_Minimum complexity echo state network._" +# Returns +A `DelayLineBackwardReservoir` object. + +# References +[^Rodan2010]: Rodan, Ali, and Peter Tino. "Minimum complexity echo state network." IEEE transactions on neural networks 22.1 (2010): 131-144. """ function DelayLineBackwardReservoir(res_size; weight = 0.1, fb_weight = 0.2) @@ -235,10 +282,18 @@ end SimpleCycleReservoir(res_size; weight=0.1) Returns a Simple Cycle Reservoir constructor to build a reservoir matrix as -described in [1]. The ```weight``` can be passed as arg or kwarg, and it determines the +described in [^Rodan2010]. The `weight` can be passed as an argument or a keyword argument, and it determines the absolute value of all the connections in the reservoir. -[1] Rodan, Ali, and Peter Tino. "Minimum complexity echo state network." +# Arguments +- `res_size::Int`: The size of the reservoir. +- `weight::T`: The weight determines the absolute value of connections in the reservoir. + +# Returns +A `SimpleCycleReservoir` object. + +# References +[^Rodan2010]: Rodan, Ali, and Peter Tino. "Minimum complexity echo state network." IEEE transactions on neural networks 22.1 (2010): 131-144. """ function SimpleCycleReservoir(res_size; weight = 0.1) @@ -272,13 +327,21 @@ end CycleJumpsReservoir(res_size, cycle_weight, jump_weight, jump_size) Return a Cycle Reservoir with Jumps constructor to create a reservoir matrix as described -in [1]. The ```weight``` and ```jump_weight``` can be passed as args or kwargs, and they -determine the absolute values of all the connections in the reservoir. The ```jump_size``` -can also be passed either as arg or kwarg, and it determines the jumps between -```jump_weight```s. +in [^Rodan2012]. The `cycle_weight`, `jump_weight`, and `jump_size` can be passed as arguments or keyword arguments, and they +determine the absolute values of connections in the reservoir. 
The `jump_size` determines the jumps between `jump_weight`s. -[1] Rodan, Ali, and Peter Tiňo. "_Simple deterministically constructed cycle reservoirs -with regular jumps._" Neural computation 24.7 (2012): 1822-1852. +# Arguments +- `res_size::Int`: The size of the reservoir. +- `cycle_weight::T`: The weight of cycle connections. +- `jump_weight::T`: The weight of jump connections. +- `jump_size::Int`: The number of steps between jump connections. + +# Returns +A `CycleJumpsReservoir` object. + +# References +[^Rodan2012]: Rodan, Ali, and Peter Tiňo. "Simple deterministically constructed cycle reservoirs +with regular jumps." Neural computation 24.7 (2012): 1822-1852. """ function CycleJumpsReservoir(res_size; cycle_weight = 0.1, jump_weight = 0.1, jump_size = 3) return CycleJumpsReservoir(res_size, cycle_weight, jump_weight, jump_size) @@ -310,7 +373,16 @@ end """ NullReservoir() -Return a constructor for a matrix `zeros(res_size, res_size)`. +Return a constructor for a matrix of zeros with dimensions `res_size x res_size`. + +# Arguments +- None + +# Returns +A `NullReservoir` object. + +# References +- None """ struct NullReservoir <: AbstractReservoir end diff --git a/src/states.jl b/src/states.jl index b5344601..c1ef6648 100644 --- a/src/states.jl +++ b/src/states.jl @@ -16,16 +16,21 @@ end """ StandardStates() -No modification of the states takes place, default option. +When this struct is employed, the states of the reservoir are not modified. It represents the default behavior +in scenarios where no specific state modification is required. This approach is ideal for applications +where the inherent dynamics of the reservoir are sufficient, and no external manipulation of the states +is necessary. It maintains the original state representation, ensuring that the reservoir's natural properties +are preserved and utilized in computations. """ struct StandardStates <: AbstractStates end """ ExtendedStates() -The states are extended with the input data, for the training section, and the prediction -data, during the prediction section. This is obtained with a vertical concatenation of the -data and the states. +The `ExtendedStates` struct is used to extend the reservoir states by +vertically concatenating the input data (during training) and the prediction data (during the prediction phase). +This method enriches the state representation by integrating external data, enhancing the model's capability +to capture and utilize complex patterns in both training and prediction stages. """ struct ExtendedStates <: AbstractStates end @@ -41,8 +46,12 @@ end PaddedStates(padding) PaddedStates(;padding=1.0) -The states are padded with a chosen value. Usually, this value is set to one. The padding is obtained through a -vertical concatenation of the padding value and the states. +Creates an instance of the `PaddedStates` struct with specified padding value. +This padding is typically set to 1.0 by default but can be customized. +The states of the reservoir are padded by vertically concatenating this padding value, +enhancing the dimensionality and potentially improving the performance of the reservoir computing model. +This function is particularly useful in scenarios where adding a constant baseline to the states is necessary +for the desired computational task. 
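+
+# Example
+A brief sketch of selecting padded states when building an ESN; `train_data` is assumed to be the
+input data matrix and the padding value is illustrative:
+
+```julia
+states_type = PaddedStates(padding = 1.0)
+esn = ESN(train_data; states_type = states_type)
+```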
""" function PaddedStates(; padding = 1.0) return PaddedStates(padding) @@ -52,9 +61,12 @@ end PaddedExtendedStates(padding) PaddedExtendedStates(;padding=1.0) -The states are extended with the training data or predicted data and subsequently padded with a chosen value. -Usually, the padding value is set to one. The padding and the extension are obtained through a vertical concatenation -of the padding value, the data, and the states. +Constructs a `PaddedExtendedStates` struct, which first extends the reservoir states with training or prediction data, +then pads them with a specified value (defaulting to 1.0). This process is achieved through vertical concatenation, +combining the padding value, data, and states. +This function is particularly useful for enhancing the reservoir's state representation in more complex scenarios, +where both extended contextual information and consistent baseline padding are crucial for the computational +effectiveness of the reservoir computing model. """ function PaddedExtendedStates(; padding = 1.0) return PaddedExtendedStates(padding) @@ -85,7 +97,12 @@ end """ NLADefault() -Returns the array untouched, default option. +`NLADefault` represents the default non-linear algorithm option. +When used, it leaves the input array unchanged. +This option is suitable in cases where no non-linear transformation of the data is required, +maintaining the original state of the input array for further processing. +It's the go-to choice for preserving the raw data integrity within the computational pipeline +of the reservoir computing model. """ struct NLADefault <: NonLinearAlgorithm end @@ -95,15 +112,21 @@ end """ NLAT1() -Applies the \$ \\text{T}_1 \$ transformation algorithm, as defined in [1] and [2]. -[1] Chattopadhyay, Ashesh, et al. "_Data-driven prediction of a multi-scale Lorenz 96 -chaotic system using a hierarchy of deep learning methods: Reservoir computing, -ANN, and RNN-LSTM._" (2019). - -[2] Pathak, Jaideep, et al. "_Model-free prediction of large spatiotemporally chaotic -systems from data: A reservoir computing approach._" -Physical review letters 120.2 (2018): 024102. +`NLAT1` implements the T₁ transformation algorithm introduced in [^Chattopadhyay] and [^Pathak]. +The T₁ algorithm selectively squares elements of the input array, +specifically targeting every second row. This non-linear transformation enhances certain data characteristics, +making it a valuable tool in analyzing chaotic systems and improving the performance of reservoir computing models. +The T₁ transformation's uniqueness lies in its selective approach, allowing for a more nuanced manipulation of the input data. + +References: +[^Chattopadhyay]: Chattopadhyay, Ashesh, et al. + "Data-driven prediction of a multi-scale Lorenz 96 chaotic system using a + hierarchy of deep learning methods: Reservoir computing, ANN, and RNN-LSTM." (2019). +[^Pathak]: Pathak, Jaideep, et al. + "Model-free prediction of large spatiotemporally chaotic systems from data: + A reservoir computing approach." + Physical review letters 120.2 (2018): 024102. """ struct NLAT1 <: NonLinearAlgorithm end @@ -120,11 +143,18 @@ end """ NLAT2() -Apply the \$ \\text{T}_2 \$ transformation algorithm, as defined in [1]. -[1] Chattopadhyay, Ashesh, et al. "_Data-driven prediction of a multi-scale Lorenz 96 -chaotic system using a hierarchy of deep learning methods: Reservoir computing, ANN, -and RNN-LSTM._" (2019). +`NLAT2` implements the T₂ transformation algorithm as defined in [^Chattopadhyay]. 
+This transformation algorithm modifies the reservoir states by multiplying each odd-indexed
+row, from the third row onward, with the product of its two preceding rows.
+This non-linear transformation is useful for capturing and enhancing complex patterns in the
+data, and is particularly beneficial in the analysis of chaotic systems and in improving the
+dynamics within reservoir computing models.
+
+Reference:
+[^Chattopadhyay]: Chattopadhyay, Ashesh, et al.
+    "Data-driven prediction of a multi-scale Lorenz 96 chaotic system using a
+    hierarchy of deep learning methods: Reservoir computing, ANN, and RNN-LSTM." (2019).
 """
 struct NLAT2 <: NonLinearAlgorithm end

@@ -141,11 +171,18 @@ end
 """
     NLAT3()

-Apply the \$ \\text{T}_3 \$ transformation algorithm, as defined in [1].
-[1] Chattopadhyay, Ashesh, et al. "_Data-driven prediction of a multi-scale Lorenz 96
-chaotic system using a hierarchy of deep learning methods: Reservoir computing, ANN,
-and RNN-LSTM._" (2019).
+The `NLAT3` struct implements the T₃ transformation algorithm as detailed in [^Chattopadhyay].
+This algorithm modifies the reservoir's states by multiplying each odd-indexed row,
+from the third row onward, with the product of the immediately preceding and the
+immediately following rows. This makes it particularly useful for enhancing complex data
+patterns, thereby improving the modeling and analysis capabilities within reservoir computing,
+especially for chaotic and dynamic systems.
+
+Reference:
+[^Chattopadhyay]: Chattopadhyay, Ashesh, et al.
+    "Data-driven prediction of a multi-scale Lorenz 96 chaotic system using a hierarchy of deep learning methods:
+    Reservoir computing, ANN, and RNN-LSTM." (2019).
 """
 struct NLAT3 <: NonLinearAlgorithm end