pytorch-optimizer v3.0.1
Change Log
Feature
- Implement `FAdam` optimizer. (#241, #242) (usage sketch after this list)
- Tweak `AdaFactor` optimizer. (#236, #243) (behaviour sketch after this list)
  - support not-using-first-momentum when beta1 is not given
  - default dtype for first momentum to `bfloat16`
  - clip second momentum to 0.999
- Implement `GrokFast` optimizer. (#244, #245)
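A minimal usage sketch for the new optimizers. The exported name `FAdam` and its constructor arguments are assumptions based on the library's usual `pytorch_optimizer` import conventions, not confirmed by this release note; the same pattern should apply to the new `GrokFast` optimizer, whose exact exported class name may differ.

```python
import torch
from pytorch_optimizer import FAdam  # exported name assumed, not confirmed here

model = torch.nn.Linear(10, 2)
optimizer = FAdam(model.parameters(), lr=1e-3)  # lr value is illustrative

loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```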
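A condensed sketch of the `AdaFactor` tweaks listed above. This is illustrative pseudocode of the described behaviour, not the library's actual implementation, and both helper names are hypothetical.

```python
import torch

def first_momentum_update(state: dict, p: torch.Tensor, grad: torch.Tensor, beta1):
    # beta1 not given -> skip the first-momentum buffer entirely
    if beta1 is None:
        return grad
    if 'exp_avg' not in state:
        # first momentum now defaults to bfloat16, halving the buffer's memory
        state['exp_avg'] = torch.zeros_like(p, dtype=torch.bfloat16)
    exp_avg = state['exp_avg']
    exp_avg.mul_(beta1).add_(grad.to(torch.bfloat16), alpha=1.0 - beta1)
    return exp_avg

def second_momentum_coefficient(step: int, decay_rate: float = -0.8) -> float:
    # step-dependent beta2, now clipped so it never exceeds 0.999
    return min(1.0 - step ** decay_rate, 0.999)
```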
Bug
- Wrong typing of `reg_noise`. (#239, #240)
- `Lookahead`'s `param_groups` attribute is not loaded from checkpoint. (#237, #238) (round-trip sketch below)
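For the `Lookahead` fix, a checkpoint round-trip that exercises the restored behaviour; the `k`/`alpha` arguments shown are the wrapper's usual defaults but are assumptions here.

```python
import torch
from pytorch_optimizer import Lookahead

model = torch.nn.Linear(10, 2)
base = torch.optim.AdamW(model.parameters(), lr=1e-3)
optimizer = Lookahead(base, k=5, alpha=0.5)  # argument names assumed

# save and restore the optimizer state; with the fix, param_groups
# is loaded back from the checkpoint along with the rest of the state
torch.save(optimizer.state_dict(), 'ckpt.pt')
optimizer.load_state_dict(torch.load('ckpt.pt'))
```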
Contributions
thanks to @michaldyczko