
VarianceScaling Initializer Is Unseeded #218

Open

cn4750 opened this issue May 7, 2023 · 8 comments

Comments


cn4750 commented May 7, 2023

@Tilps ran into this issue after upgrading TensorFlow:

/usr/local/lib/python3.8/dist-packages/keras/initializers/initializers.py:120: UserWarning: The initializer VarianceScaling is unseeded and being called multiple times, which will return identical values each time (even if the initializer is unseeded). Please update your code to provide a seed to the initializer, or avoid using the same initalizer instance more than once.

From Tilps:

I think it's a bug in our code - I think you are supposed to construct a separate instance for every use case, so that 'call' is only invoked at most once.

I assume this is the code the above warning references.
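For reference, a minimal sketch that reproduces the warning (assuming TensorFlow 2.x; the shape is arbitrary):

```python
import tensorflow as tf

init = tf.keras.initializers.VarianceScaling(
    scale=1.0, mode='fan_avg', distribution='truncated_normal')

w1 = init(shape=(4, 4))
# Second call on the same unseeded instance emits the UserWarning
# above and returns the same values as the first call.
w2 = init(shape=(4, 4))
```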

@masterkni6, can you look at this when you get a chance?

cn4750 transferred this issue from LeelaChessZero/lc0 on May 7, 2023
Tilps (Contributor) commented May 7, 2023

The fix would be to change 'initializer' into a function that constructs a VarianceScaling with the appropriate parameters, so each FFN gets its own instance rather than sharing the same one.
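Something along these lines (a sketch only; make_initializer is a hypothetical helper name):

```python
def make_initializer(beta):
    # Return a fresh VarianceScaling instance on every call, so no two
    # layers share (and therefore re-call) the same initializer object.
    return tf.keras.initializers.VarianceScaling(
        scale=beta, mode='fan_avg', distribution='truncated_normal')
```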

masterkni6 (Contributor) commented May 7, 2023
Creating two instances of the initializer does not get rid of the warning. It seems a seed is required to get rid of it, even if the seed is hardcoded to 0.
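For example (a minimal sketch; any fixed integer works as the seed):

```python
import tensorflow as tf

# Even with the seed hardcoded to 0, the warning no longer fires:
init = tf.keras.initializers.VarianceScaling(
    scale=1.0, mode='fan_avg', distribution='truncated_normal', seed=0)
w1 = init(shape=(4, 4))
w2 = init(shape=(4, 4))  # no UserWarning; w1 == w2 by design
```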

Tilps (Contributor) commented May 7, 2023

hmm - maybe there is more than one broken place?

Tilps (Contributor) commented May 7, 2023

```diff
@@ -1347,6 +1357,7 @@ class TFProcess:
     def encoder_layer(self, inputs, emb_size: int, d_model: int,
                       num_heads: int, dff: int, name: str):
         initializer = None
+        initializer2 = None
         if self.encoder_layers > 0:
             # DeepNorm
             alpha = tf.cast(tf.math.pow(2. * self.encoder_layers, 0.25),
@@ -1355,10 +1366,14 @@ class TFProcess:
                            self.model_dtype)
             xavier_norm = tf.keras.initializers.VarianceScaling(
                 scale=beta, mode='fan_avg', distribution='truncated_normal')
+            xavier_norm2 = tf.keras.initializers.VarianceScaling(
+                scale=beta, mode='fan_avg', distribution='truncated_normal')
             initializer = xavier_norm
+            initializer2 = xavier_norm2
         else:
             alpha = 1
             initializer = "glorot_normal"
+            initializer2 = "glorot_normal"
         # multihead attention
         attn_output, attn_wts = self.mha(inputs,
                                          emb_size,
@@ -1377,7 +1392,7 @@ class TFProcess:
         ffn_output = self.ffn(out1,
                               emb_size,
                               dff,
-                              initializer,
+                              initializer2,
                               name=name + "/ffn")
         ffn_output = tf.keras.layers.Dropout(self.dropout_rate,
                                              name=name +
```

I'm giving this a test now...

masterkni6 (Contributor) commented May 7, 2023

```python
def _warn_reuse(self):
    if getattr(self, "_used", False):
        if getattr(self, "seed", None) is None:
            warnings.warn(
                f"The initializer {self.__class__.__name__} is unseeded "
                "and being called multiple times, which will return "
                "identical values each time (even if the initializer is "
                "unseeded). Please update your code to provide a seed to "
                "the initializer, or avoid using the same initalizer "
                "instance more than once."
            )
    else:
        self._used = True
```

Seems the source code is only checking whether seed is None.

Tilps (Contributor) commented May 7, 2023

Yes - because they assume that if you are setting the seed, you know what you are doing...

Tilps (Contributor) commented May 7, 2023

I also see no improvement with the fix...

Tilps (Contributor) commented May 7, 2023

Oh, ffn creates 2 Dense layers...
so we actually need 4 initializers...
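A sketch of what the signature change could look like (hypothetical parameter and layer names, loosely based on the TFProcess.ffn helper referenced in the diff above):

```python
def ffn(self, inputs, emb_size: int, dff: int,
        initializer1, initializer2, name: str):
    # Two Dense layers means two kernel initializers; passing a
    # separate instance to each keeps every initializer object
    # called exactly once.
    dense1 = tf.keras.layers.Dense(dff,
                                   kernel_initializer=initializer1,
                                   name=name + "/dense1")(inputs)
    return tf.keras.layers.Dense(emb_size,
                                 kernel_initializer=initializer2,
                                 name=name + "/dense2")(dense1)
```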
