fix!: variable scaling, pressure level scalings only applied in specific circumstances #52

sahahner · 2024-12-27T13:35:33Z

Solves the problem explained in issue #7 by refactoring the variable scalings into a general variable scaling and a pressure level scaling.
@mc4117, @pinnstorm and I came up with a new structure, which this PR implements.

This is a first draft. Feedback very welcome!

allow several variable level scaling (i.e. pressure level and model level)
implement/update tests
decide: do we want to allow scaling by variable_ref and variable_name, i.e. scale q_50 by q and q_50?
get variable level and name from dataset metadata

b8raoult · 2024-12-30T09:35:33Z

Please consider using the knowledge about variables that come from the dataset metadata. See https://github.com/ecmwf/anemoi-transform/blob/7cbf5f3d4baa37453022a5a97e17cc71a5b8ceeb/src/anemoi/transform/variables/__init__.py#L47

sahahner · 2024-12-30T09:51:50Z

> Please consider using the knowledge about variables that come from the dataset metadata. See https://github.com/ecmwf/anemoi-transform/blob/7cbf5f3d4baa37453022a5a97e17cc71a5b8ceeb/src/anemoi/transform/variables/__init__.py#L47

We have given this some thought. Although I initially wanted to use the information from the dataset, I have opted for allowing users to define their own groups here, so that different scalings can be applied to self-defined groups.
I was also told that it is possible to build datasets without information about the variable types, so we should not rely on that metadata.
If you have strong opinions on this, I am happy to discuss it again.

FussyDuck · 2025-01-02T11:48:17Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ sahahner
✅ mc4117
❌ pinnstorm
You have signed the CLA already but the status is still pending? Let us recheck it.

JPXKQX · 2025-01-08T11:43:58Z

Hi, I would like to know what you think about making all scalers explicit in the config file. Something similar to the additional_scalers: field, but including not only the scalers per variable, but also the node_loss_weight,... The positive aspect I see is that there would be more homogeneity in the scalers defined in the metrics/loss fields.

mc4117 · 2025-01-09T14:24:20Z

> Hi, I would like to know what you think about making all scalers explicit in the config file. Something similar to the additional_scalers: field, but including not only the scalers per variable, but also the node_loss_weight,... The positive aspect I see is that there would be more homogeneity in the scalers defined in the metrics/loss fields.

Seems like a good idea! Would you like to add this in this PR?

JPXKQX · 2025-01-09T14:44:22Z

> > Hi, I would like to know what you think about making all scalers explicit in the config file. Something similar to the additional_scalers: field, but including not only the scalers per variable, but also the node_loss_weight,... The positive aspect I see is that there would be more homogeneity in the scalers defined in the metrics/loss fields.
>
> Seems like a good idea! Would you like to add this in this PR?

I’m not sure what the best approach is. On the one hand, adding more work to this PR would increase its complexity, which might make it more logical to address this refactor in a future PR. On the other hand, this PR already introduces some changes to the configs, and the future PRs would also involve changes to the configs. From this, it might be better to have 1 PR and communicate all the changes to users at once. What do you think?

pinnstorm · 2025-01-10T11:16:32Z

> > > Hi, I would like to know what you think about making all scalers explicit in the config file. Something similar to the additional_scalers: field, but including not only the scalers per variable, but also the node_loss_weight,... The positive aspect I see is that there would be more homogeneity in the scalers defined in the metrics/loss fields.
> >
> > Seems like a good idea! Would you like to add this in this PR?
>
> I’m not sure what the best approach is. On the one hand, adding more work to this PR would increase its complexity, which might make it more logical to address this refactor in a future PR. On the other hand, this PR already introduces some changes to the configs, and the future PRs would also involve changes to the configs. From this, it might be better to have 1 PR and communicate all the changes to users at once. What do you think?

I'm happy for it to be included in this PR! Not sure if @sahahner or @mc4117 have other views?

mc4117 · 2025-01-21T08:44:24Z

@jakob-schloer @HCookie could you review this please?

HCookie · 2025-01-22T09:58:46Z

training/src/anemoi/training/train/forecaster.py

+        if isinstance(config_container, list):
+            scalar = [
+                (
+                    instantiate(
+                        scalar_config,
+                        scaling_config=config.training.variable_loss_scaling,
+                        data_indices=data_indices,
+                        statistics=statistics,
+                        statistics_tendencies=statistics_tendencies,
+                    )
+                    if scalar_config["name"] == "tendency"
+                    else instantiate(
+                        scalar_config,
+                        scaling_config=config.training.variable_loss_scaling,
+                        data_indices=data_indices,


Given that the tendency scalar is a subclass of the same generic Scalar base class, this overly specific "if" branch should be removed. My suggestion would be to add the statistics and the tendency statistics to the base class and leave their usage up to the implementation.
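
A minimal sketch of this suggestion, under stated assumptions: the class names BaseScaler, GeneralScaler and TendencyScaler, the helper build_scalers, the toy statistics layout and the toy tendency formula are all hypothetical, and plain constructor calls stand in for the instantiate(...) calls in the diff. The point is only that once the base class accepts every input, all scalers can be built with the same arguments.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class BaseScaler(ABC):
    """Hypothetical base class: stores every input, subclasses use what they need."""

    def __init__(self, variables, statistics=None, statistics_tendencies=None):
        self.variables = list(variables)
        self.statistics = statistics
        self.statistics_tendencies = statistics_tendencies

    @abstractmethod
    def get_variable_scaling(self) -> list[float]:
        """Return one loss weight per variable, in the order of self.variables."""


class GeneralScaler(BaseScaler):
    """Ignores the statistics entirely."""

    def get_variable_scaling(self) -> list[float]:
        return [1.0 for _ in self.variables]


class TendencyScaler(BaseScaler):
    """Uses both statistics dictionaries (toy formula, for illustration only)."""

    def get_variable_scaling(self) -> list[float]:
        return [
            self.statistics["stdev"][name] / self.statistics_tendencies["stdev"][name]
            for name in self.variables
        ]


def build_scalers(scaler_classes, **shared_kwargs):
    # Every class is built the same way, so no name-based branch is needed.
    return [cls(**shared_kwargs) for cls in scaler_classes]


scalers = build_scalers(
    [GeneralScaler, TendencyScaler],
    variables=["t_850", "2t"],
    statistics={"stdev": {"t_850": 4.0, "2t": 6.0}},
    statistics_tendencies={"stdev": {"t_850": 2.0, "2t": 3.0}},
)
weights = [scaler.get_variable_scaling() for scaler in scalers]
```

With this shape, adding another scaler that needs the statistics would not require touching the code that builds the list.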

HCookie · 2025-01-22T09:59:34Z

training/src/anemoi/training/train/scaling.py

+if TYPE_CHECKING:
+    from omegaconf import DictConfig
+    from anemoi.models.data_indices.collection import IndexCollection


This will raise an issue once the CI is back. This block should come after all other imports.
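
A rough sketch of the layout being asked for, assuming the module uses postponed annotations; ExampleScaler is a hypothetical stand-in and the logging import is only there to represent "any other imports". The two imports inside the guarded block are the ones from the diff above, and because they are only seen by static type checkers the snippet runs without those packages installed.

```python
from __future__ import annotations

import logging
from typing import TYPE_CHECKING

# The type-only block sits after every regular import.
if TYPE_CHECKING:
    # Never executed at runtime; only static type checkers follow these imports.
    from omegaconf import DictConfig
    from anemoi.models.data_indices.collection import IndexCollection

LOGGER = logging.getLogger(__name__)


class ExampleScaler:
    """Hypothetical class that only needs the imported names for annotations."""

    def __init__(self, scaling_config: DictConfig, data_indices: IndexCollection) -> None:
        self.scaling_config = scaling_config
        self.data_indices = data_indices
```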

HCookie · 2025-01-22T10:01:01Z

training/src/anemoi/training/train/scaling.py

+class BaseVariableLossScaler(ABC):
+    """Configurable method converting variable to loss scaling."""
+
+    def __init__(
+        self,
+        scaling_config: DictConfig,
+        data_indices: IndexCollection,
+        metadata_variables: dict | None = None,
+    ) -> None:
+        """Initialise Scaler.


Could it make sense to make this class more generic, for example BaseLossScalar?

HCookie · 2025-01-22T10:01:56Z

training/src/anemoi/training/config/training/default.yaml

+  variable_groups:
+    default: sfc
+    pl: [q, t, u, v, w, z]


A comment explaining how this is used would be helpful here.
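
One way to read this config block, sketched in Python under assumptions: the helper get_variable_group is hypothetical, and so is the naming rule that level variables look like "<ref>_<level>" (as in the q_50 example from the PR description). The PR's actual lookup may differ, for instance by using dataset metadata.

```python
# Mirrors the YAML fragment above as a plain dictionary.
variable_groups = {
    "default": "sfc",
    "pl": ["q", "t", "u", "v", "w", "z"],
}


def get_variable_group(variable_name: str, groups: dict) -> str:
    """Return the group a variable belongs to, falling back to the default group."""
    # Assumption: level variables are named "<ref>_<level>", e.g. "q_50".
    variable_ref = variable_name.split("_")[0]
    for group_name, members in groups.items():
        if group_name != "default" and variable_ref in members:
            return group_name
    return groups["default"]


resolved = {
    name: get_variable_group(name, variable_groups)
    for name in ["q_50", "t_850", "2t", "10u"]
}
```

Under these assumptions, any variable whose reference is not listed in a named group ends up in the default group.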

HCookie · 2025-01-22T10:03:01Z

training/src/anemoi/training/train/scaling.py

+        self.default_group = self.scaling_config.variable_groups.default
+        self.metadata_variables = metadata_variables
+
+        self.ExtractVariableGroupAndLevel = ExtractVariableGroupAndLevel(


This attribute should be snake_case rather than CamelCase (as written it is indistinguishable from the class name), e.g. extract_variable_group_and_level.

HCookie · 2025-01-22T10:03:50Z

training/src/anemoi/training/train/scaling.py

+        )
+
+    @abstractmethod
+    def get_variable_scaling(self) -> np.ndarray: ...


A short docstring here describing what this method is expected to return would be good.
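
A possible shape for such a docstring, as a sketch only: the described return value (a one-dimensional array with one factor per variable) is an assumption about the intended contract, not something confirmed in this PR, and the constructor is omitted for brevity.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import numpy as np


class BaseVariableLossScaler(ABC):
    """Configurable method converting variable to loss scaling."""

    @abstractmethod
    def get_variable_scaling(self) -> np.ndarray:
        """Compute the loss scaling factor for each variable.

        Returns
        -------
        np.ndarray
            Assumed contract: a one-dimensional array holding one scaling
            factor per model output variable, in model output order.
        """
```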

HCookie · 2025-01-22T10:05:51Z

training/src/anemoi/training/train/scaling.py

+class GeneralVariableLossScaler(BaseVariableLossScaler):
+    """General scaling of variables to loss scaling."""


This docstring is hard to read; maybe I'm just missing the point.

sahahner added 3 commits December 27, 2024 10:15

first version of refactor of variable scaling

511ed18

config training changes

7ddf6d6

avoid multiple scaling

3ddeccc

sahahner linked an issue Dec 27, 2024 that may be closed by this pull request

Loss scalings #5 (open, 2 tasks)

sahahner linked an issue Dec 30, 2024 that may be closed by this pull request

Pressure Level Scalings only applied in specific circumstances #7 (open)

mc4117 reviewed Dec 30, 2024

training/src/anemoi/training/train/forecaster.py (resolved)

mc4117 reviewed Dec 30, 2024

training/src/anemoi/training/train/scaling.py (outdated, resolved)

docstring and explain variable reference

be4602c

mc4117 reviewed Dec 31, 2024

training/src/anemoi/training/train/forecaster.py (outdated, resolved)

mc4117 added 4 commits December 31, 2024 10:47

fix to config for pressure level scaler

195af07

instantiating scalars as a list

2644c18

preparing for tendency losses

718fc57

Merge branch '7-pressure-level-scalings-only-applied-in-specific-circumstances' of https://github.com/ecmwf/anemoi-core into 7-pressure-level-scalings-only-applied-in-specific-circumstances

a34ac02

sahahner changed the title ~~pressure level scalings only applied in specific circumstances~~ refactor variable scaling, pressure level scalings only applied in specific circumstances Jan 2, 2025

log the variable level scaling information as before

b91af11

HCookie added the training label Jan 6, 2025

HCookie self-requested a review January 6, 2025 14:36

mc4117 reviewed Jan 7, 2025

training/src/anemoi/training/train/scaling.py (resolved)

pinnstorm added 3 commits January 8, 2025 15:01

adding tendency scaler to additional scalers

c22c50b

reformatting

1f4a532

updating description in configs

2843d98

anaprietonem assigned sahahner Jan 9, 2025

updating var-tendency-scaler spec

c978871

pinnstorm and others added 2 commits January 12, 2025 09:22

updating training/default config

f56f9b2

[pre-commit.ci] auto fixes from pre-commit.com hooks

be90000

for more information, see https://pre-commit.ci

mc4117 reviewed Jan 13, 2025

training/src/anemoi/training/config/training/default.yaml (outdated, resolved)

pinnstorm added 2 commits January 13, 2025 11:08

updating training/default.yaml

e474ae9

updating training/default.yaml

f005f84

floriankrb reviewed Jan 16, 2025

training/src/anemoi/training/train/scaling.py (outdated, resolved)

mc4117 and others added 12 commits January 17, 2025 09:51

first try at tests

7cdccc5

[pre-commit.ci] auto fixes from pre-commit.com hooks

61e7933

variable name and level from mars metadata

462bb34

Merge branch '7-pressure-level-scalings-only-applied-in-specific-circumstances' of github.com:ecmwf/anemoi-core into 7-pressure-level-scalings-only-applied-in-specific-circumstances

960a602

get variable group and level in utils file

af10173

empty line

395cd6f

convert test for new strucutre. pressure level and general variable scaling

1f53a82

more plausible check for availability of mars metadata

3747959

update to tendency tests (still not working)

68cd6e3

Merge branch '7-pressure-level-scalings-only-applied-in-specific-circumstances' of https://github.com/ecmwf/anemoi-core into 7-pressure-level-scalings-only-applied-in-specific-circumstances

d3a7c29

tendency scaler tests now working

d6e127a

[pre-commit.ci] auto fixes from pre-commit.com hooks

fd29cbc

mc4117 marked this pull request as ready for review January 20, 2025 14:44

sahahner added 4 commits January 22, 2025 09:27

change function into class, extracting variable group and name

8bff68b

Merge branch '7-pressure-level-scalings-only-applied-in-specific-circumstances' of github.com:ecmwf/anemoi-core into 7-pressure-level-scalings-only-applied-in-specific-circumstances

4c7cbc1

correct function call

7d8c76d

correct typo in test

d928b30

HCookie changed the title ~~refactor variable scaling, pressure level scalings only applied in specific circumstances~~ fix!: variable scaling, pressure level scalings only applied in specific circumstances Jan 22, 2025

HCookie requested changes Jan 22, 2025