Add normalization to BlockRNNModel #1748

JanFidor · 2023-05-08T06:41:38Z

Summary

I've added normalization parameter to the BlockRNNModel, I've brainstormed how to do it for RNNModel and I couldn't come up with a way that wouldn't require some type of dynamic aggregation of the hidden states, so I decided to make the PR for BlockRNN for now.
I added two torch modules to simplify the rnn sequence, not sure if it's the cleanest way to implement it, but it's at least very readable.

Other Information

I also added layer norm, because it was a simple addition and it seems to be the recommended normalization for RNNs. I also considered adding group normalization, but it would either need constant num_groups parameter or additional constructor parameter for BlockRNNModel

JanFidor · 2023-05-15T06:40:27Z

Some of the tests were failing, I'll check if continues after merging develop. One of them was test_fit_predict_determinism() which after debugging turned out to fail for ARIMA model, It wasn't in a scope of this PR so I'm unsure what might have happened. Might be a problem with my local build, I'll wait and see what the github actions say

codecov-commenter · 2023-05-15T07:08:27Z

Codecov Report

Attention: 14 lines in your changes are missing coverage. Please review.

Comparison is base (8cb04f6) 93.88% compared to head (5de39ea) 93.78%.
Report is 2 commits behind head on master.

Files	Patch %	Lines
darts/models/forecasting/block_rnn_model.py	80.55%	7 Missing ⚠️
darts/utils/torch.py	66.66%	7 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1748      +/-   ##
==========================================
- Coverage   93.88%   93.78%   -0.10%     
==========================================
  Files         135      135              
  Lines       13425    13461      +36     
==========================================
+ Hits        12604    12625      +21     
- Misses        821      836      +15

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

JanFidor · 2023-05-15T10:45:18Z

I've been thinking whether adding batch norm makes sense in this case, as repeated rescaling would cause gradient explosion, the very thing LSTM / GRU were supposed to combat. I'm inclined to only allow layer normalization (maybe also group norm), so that users don't accidentally fall into that trap. Let me know if you think that would fit with darts design philosophy !

madtoinou

Hi @JanFidor, thank you again for contributing to darts.

Found this article, applying batch norm only on the output (not the hidden state). I would be curious to see a benchmark of the BlockRNNModel with and without normalization, check if we observe similar results.

madtoinou · 2023-08-22T08:26:03Z

darts/utils/torch.py

+        self.norm = nn.BatchNorm1d(feature_size)
+
+    def forward(self, input):
+        input = self._reshape_input(input)  # Reshape N L C -> N C L


This line is more about swapping axes that reshaping, I would instead use the corresponding torch function:

Suggested change

input = self._reshape_input(input) # Reshape N L C -> N C L

# Reshape N L C -> N C L

input = input.swapaxes(1,2)

madtoinou · 2023-08-22T08:26:25Z

darts/utils/torch.py

+    def forward(self, input):
+        input = self._reshape_input(input)  # Reshape N L C -> N C L
+        input = self.norm(input)
+        input = self._reshape_input(input)


Suggested change

input = self._reshape_input(input)

input = input.swapaxes(1,2)

JanFidor · 2023-08-28T10:01:58Z

Thanks for another review @madtoinou ! The article looks exciting (at least after skimming it and reading the abstract ;P ) I found some implementations online, but I'd rather understand the actual idea first before implementing it, so it might take me a little longer compared to the other 2 PRs

JanFidor · 2023-09-07T07:53:11Z

Hi @madtoinou , quick update! I've read the paper and have an idea how to implement it. It might need a little bit of magic to get the time_step_index into the model input, but I think it should be doable, I'll let you know when I'll get everything running or if I stumble into some problem

JanFidor · 2023-09-25T10:45:08Z

Quick update @madtoinou. I've been browsing through the codebase and wanted to get your thoughts on my planned approach. I think that the simplest approach would be to manually add a past encoder with static position, but that would require expanding IntegerIndexEncoder which only supports 'relative' for now. That said, I'm not sure at which point the Encoders are applied to the TS and this approach depends on it happening before TS are sliced for training. It's also possible to manually add a "static index" component, but I think this approach would be more elegant and static IntegerIndexEncoder might be useful in other implementations in the future

JanFidor · 2023-10-06T14:50:20Z

Hi again @madtoinou! I wanted to get your thoughts on my new idea for the implementation. I went back to the paper and found the mention of using the batch norms specifically when training. Wouldn't it just suffice to store input_chunk_length batch norms instead, which would be significantly easier, might actually be more inline with what the paper proposes and wouldn't require TS of same length for training? I'll go ahead with this idea and give another update once I make some basic benchmarks.

review-notebook-app · 2024-02-07T17:43:18Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

…lization # Conflicts: # darts/models/forecasting/block_rnn_model.py

JanFidor · 2024-02-07T18:04:35Z

It took some playing around but I think I managed to fix most of the git history (please ignore the git push --force hahaha)

VascoSch92 · 2024-02-08T10:10:58Z

darts/models/forecasting/block_rnn_model.py

+        target_size: int,
+        normalization: str = None,
+    ):
+        if not num_layers_out_fc:


I don't get this point here.

num_layers_out_fc is a list of integers correct?
Suppose num_layers_out_fc = [], then not num_layers_out_fc is True.
So why num_layers_out_fc = [] ?

VascoSch92 · 2024-02-08T10:16:02Z

darts/models/forecasting/block_rnn_model.py

+
+        last = input_size
+        feats = []
+        for feature in num_layers_out_fc + [


i will rather use the extend method for lists

VascoSch92 · 2024-02-08T10:18:40Z

darts/models/forecasting/block_rnn_model.py

+            last = feature
+        return nn.Sequential(*feats)
+
+    def _normalization_layer(self, normalization: str, hidden_size: int):


if normalization is different from batch and layer the method return None. is this intended?

JanFidor added 2 commits May 8, 2023 08:11

add normalization to block_rnn

8804f0b

remove todo

ca9fddb

JanFidor requested review from hrzn and dennisbader as code owners May 8, 2023 06:41

JanFidor added 4 commits May 11, 2023 18:00

fix indexing

e4c6089

clean up unused code

80c9e10

pass hidden state to fc layer

e89c22e

Merge branch 'master' into feature/rnn-normalization

544ab44

madtoinou reviewed Aug 22, 2023

View reviewed changes

JanFidor force-pushed the feature/rnn-normalization branch from 1a4627f to 544ab44 Compare February 7, 2024 17:48

JanFidor added 2 commits February 7, 2024 19:02

update block rnn

76a58fe

Merge remote-tracking branch 'upstream/master' into feature/rnn-norma…

f951dae

…lization # Conflicts: # darts/models/forecasting/block_rnn_model.py

refactor temporal batch norm

5de39ea

VascoSch92 reviewed Feb 8, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add normalization to BlockRNNModel #1748

Add normalization to BlockRNNModel #1748

JanFidor commented May 8, 2023

JanFidor commented May 15, 2023

codecov-commenter commented May 15, 2023 •

edited

Loading

JanFidor commented May 15, 2023

madtoinou left a comment

madtoinou Aug 22, 2023

madtoinou Aug 22, 2023

JanFidor commented Aug 28, 2023

JanFidor commented Sep 7, 2023

JanFidor commented Sep 25, 2023 •

edited

Loading

JanFidor commented Oct 6, 2023

review-notebook-app bot commented Feb 7, 2024

JanFidor commented Feb 7, 2024 •

edited

Loading

VascoSch92 Feb 8, 2024

VascoSch92 Feb 8, 2024

VascoSch92 Feb 8, 2024

	input = self._reshape_input(input) # Reshape N L C -> N C L
	# Reshape N L C -> N C L
	input = input.swapaxes(1,2)

	input = self._reshape_input(input)
	input = input.swapaxes(1,2)

Add normalization to BlockRNNModel #1748

Are you sure you want to change the base?

Add normalization to BlockRNNModel #1748

Conversation

JanFidor commented May 8, 2023

Summary

Other Information

JanFidor commented May 15, 2023

codecov-commenter commented May 15, 2023 • edited Loading

Codecov Report

JanFidor commented May 15, 2023

madtoinou left a comment

Choose a reason for hiding this comment

madtoinou Aug 22, 2023

Choose a reason for hiding this comment

madtoinou Aug 22, 2023

Choose a reason for hiding this comment

JanFidor commented Aug 28, 2023

JanFidor commented Sep 7, 2023

JanFidor commented Sep 25, 2023 • edited Loading

JanFidor commented Oct 6, 2023

review-notebook-app bot commented Feb 7, 2024

JanFidor commented Feb 7, 2024 • edited Loading

VascoSch92 Feb 8, 2024

Choose a reason for hiding this comment

VascoSch92 Feb 8, 2024

Choose a reason for hiding this comment

VascoSch92 Feb 8, 2024

Choose a reason for hiding this comment

codecov-commenter commented May 15, 2023 •

edited

Loading

JanFidor commented Sep 25, 2023 •

edited

Loading

JanFidor commented Feb 7, 2024 •

edited

Loading