
Support for LayerNorm #263

Open
Niccolo-Ajroldi opened this issue Sep 22, 2022 · 3 comments
Labels: good first issue (Good for newcomers)

Comments

@Niccolo-Ajroldi

I was trying to extend a Vision Transformer model using BackPACK. However, I encountered the following error:

UserWarning: Extension saving to grad_batch does not have an extension for Module <class 'torch.nn.modules.normalization.LayerNorm'> although the module has parameters

I know that torch.nn.BatchNormNd leads to ill-defined first-order quantities and hence it is not implemented here. Does the same hold for Layer Normalization?

Thank you in advance!

@f-dangel
Owner

Hi,

thanks for your question. The warning you are seeing is raised because BackPACK currently does not support LayerNorm.

In contrast to BatchNorm, however, this layer treats each sample in a mini-batch independently (mean and variance for the normalization of a sample are computed along its feature dimensions; for BatchNorm they are computed along the batch dimension). Hence, first-order quantities like individual gradients are well-defined.
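As a quick illustration (plain PyTorch, no BackPACK; shapes and names here are just for the example), the individual gradients can be computed sample by sample, which is exactly what a BatchGrad-style extension would provide in a single backward pass:

```python
import torch

# Plain-PyTorch illustration: per-sample LayerNorm gradients are
# well-defined because each sample is normalized on its own.
torch.manual_seed(0)
layer = torch.nn.LayerNorm(4)
x = torch.randn(8, 4)  # mini-batch of 8 samples

individual_grads = []
for n in range(x.shape[0]):
    layer.zero_grad()
    layer(x[n]).sum().backward()
    individual_grads.append(layer.weight.grad.clone())
individual_grads = torch.stack(individual_grads)  # shape [8, 4]
```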

To add support for LayerNorm, the custom-module example from the documentation is a good starting point. It describes how to write BackPACK extensions for new layers (the "Custom module extension" section is the most relevant).
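To give an idea of what this could look like, here is an untested sketch following that example. The class name `LayerNormBatchGrad` is made up, the formulas assume an input of shape `[N, *normalized_shape]`, and extra dimensions (e.g. the sequence dimension in a transformer) would need an additional sum:

```python
import torch
from backpack.extensions.firstorder.base import FirstOrderModuleExtension


class LayerNormBatchGrad(FirstOrderModuleExtension):
    """Sketch: individual gradients (BatchGrad) for torch.nn.LayerNorm."""

    def __init__(self):
        # declare which parameters of the module this extension handles
        super().__init__(params=["weight", "bias"])

    def weight(self, ext, module, g_inp, g_out, bpQuantities):
        # BackPACK stores the input of an extended module as module.input0;
        # recompute the normalized input the same way LayerNorm does.
        dims = tuple(range(-len(module.normalized_shape), 0))
        x = module.input0
        mean = x.mean(dim=dims, keepdim=True)
        var = x.var(dim=dims, unbiased=False, keepdim=True)
        x_hat = (x - mean) / (var + module.eps).sqrt()
        # per-sample weight gradient, assuming input shape [N, *normalized_shape]
        return g_out[0] * x_hat

    def bias(self, ext, module, g_inp, g_out, bpQuantities):
        # per-sample bias gradient is just the backpropagated output gradient
        return g_out[0]
```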

I'd be happy to help merge a PR.

Best,
Felix

@f-dangel added the good first issue label on Sep 24, 2022
@KOVVURISATYANARAYANAREDDY

Any update on this?

@f-dangel
Owner

No progress, and I don't have the capacity to work on this feature.

To break this down further, adding limited support for LayerNorm, e.g. only for the BatchGrad extension, would be a feasible starting point. This can be achieved by following the custom-module example in the docs, for instance along the lines of the sketch below.
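For reference, hooking such an extension into BackPACK could look roughly like this (again an untested sketch; `LayerNormBatchGrad` refers to the sketch in my comment above):

```python
import torch
from backpack import backpack, extend
from backpack.extensions import BatchGrad

model = extend(torch.nn.Sequential(torch.nn.LayerNorm(4), torch.nn.Linear(4, 1)))
loss_fn = extend(torch.nn.MSELoss())

x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = loss_fn(model(x), y)

extension = BatchGrad()
# register the custom mapping for LayerNorm before backpropagating
extension.set_module_extension(torch.nn.LayerNorm, LayerNormBatchGrad())

with backpack(extension):
    loss.backward()

for name, p in model.named_parameters():
    print(name, p.grad_batch.shape)  # leading dimension = batch size
```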
