Hi,
I've been trying to reproduce the results reported in the paper and noticed that Table 4 in Appendix A does not include the hyperparameters used for training MDEQ-XL on ImageNet. In particular, I'm curious about the following:
In general, is the stop mode "rel" or "abs"?
What epsilon is used as the threshold in the Broyden solver? Should I assume it was the default value of 1e-3?
What were the forward and backward quasi-Newton thresholds $T_f, T_b$?
Thanks so much!