Hello, I noticed that your adap_sche function normalizes the mask ratio schedule so that the mask ratios of all steps sum to one. I think I understand the intention: the total number of tokens retained across all steps then equals the final number of tokens (for example, 16x16=256).
However, this seems to differ from the official JAX implementation of MaskGIT (https://github.com/google-research/maskgit). There, the mask ratio decreases from 1 to 0 over the decoding steps, which means the last decoding step predicts all tokens at once and retains all the tokens obtained in that step. I'm not sure whether I have misunderstood it. Could you please clarify? Thanks a lot!
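To make the comparison concrete, here is a rough sketch of the two readings of the schedule. It is not a transcription of either repository: the cosine shape, the rounding rules, and the function names `per_step_counts`, `masked_after_step`, and `kept_per_step` are all assumptions made for illustration. The first function treats the schedule as a normalized per-step count of newly retained tokens; the second treats it as a cumulative mask ratio that falls from 1 to 0.

```python
import math


def per_step_counts(num_steps, num_tokens):
    """Hypothetical normalized schedule: tokens newly retained at each step.

    Weights follow an assumed cosine curve, are normalized to sum to one,
    then scaled to num_tokens. Requires num_steps >= 2.
    """
    r = [1 - i / (num_steps - 1) for i in range(num_steps)]
    weights = [math.cos(x * math.pi / 2) for x in r]
    total = sum(weights)
    # Keep at least one token per step, then fix rounding drift on the last step.
    counts = [max(1, round(w / total * num_tokens)) for w in weights]
    counts[-1] += num_tokens - sum(counts)
    return counts


def masked_after_step(num_steps, num_tokens):
    """Hypothetical cumulative schedule: tokens still masked after each step.

    The assumed mask ratio cos(pi/2 * t / T) goes from near 1 down to 0.
    """
    masked = []
    for t in range(1, num_steps + 1):
        ratio = max(0.0, math.cos(math.pi / 2 * t / num_steps))
        masked.append(math.floor(ratio * num_tokens))
    return masked


def kept_per_step(masked, num_tokens):
    """Tokens newly retained at each step, derived from the cumulative counts."""
    previous = [num_tokens] + masked[:-1]
    return [p - m for p, m in zip(previous, masked)]


if __name__ == "__main__":
    steps, tokens = 8, 256
    print("normalized per-step counts:", per_step_counts(steps, tokens))
    masked = masked_after_step(steps, tokens)
    print("still masked after each step:", masked)
    print("derived per-step counts:", kept_per_step(masked, tokens))
```

Under these assumptions, both views account for all 256 tokens by the last step: the normalized counts sum to 256 by construction, and the differences between consecutive cumulative mask counts also sum to 256 once the ratio reaches 0. Whether the two real implementations distribute the tokens across steps in the same way is exactly what I would like to confirm.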