pytorch-optimizer v3.0.1
Change Log
Feature
- Implement `FAdam` optimizer. (#241, #242) (usage sketch after this list)
- Tweak `AdaFactor` optimizer. (#236, #243) (behaviour sketch after this list)
  - support not-using-first-momentum when beta1 is not given
  - default dtype for first momentum to `bfloat16`
  - clip second momentum to 0.999
- Implement `GrokFast` optimizer. (#244, #245)
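A minimal usage sketch for the new optimizers. The exported name `FAdam` and its constructor arguments are assumptions based on the library's usual `pytorch_optimizer` import conventions, not confirmed by this release note; the same pattern should apply to the new `GrokFast` optimizer, whose exact exported class name may differ.

```python
import torch
from pytorch_optimizer import FAdam  # exported name assumed, not confirmed here

model = torch.nn.Linear(10, 2)
optimizer = FAdam(model.parameters(), lr=1e-3)  # lr value is illustrative

loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```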
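A condensed sketch of the `AdaFactor` tweaks listed above. This is illustrative pseudocode of the described behaviour, not the library's actual implementation, and both helper names are hypothetical.

```python
import torch

def first_momentum_update(state: dict, p: torch.Tensor, grad: torch.Tensor, beta1):
    # beta1 not given -> skip the first-momentum buffer entirely
    if beta1 is None:
        return grad
    if 'exp_avg' not in state:
        # first momentum now defaults to bfloat16, halving the buffer's memory
        state['exp_avg'] = torch.zeros_like(p, dtype=torch.bfloat16)
    exp_avg = state['exp_avg']
    exp_avg.mul_(beta1).add_(grad.to(torch.bfloat16), alpha=1.0 - beta1)
    return exp_avg

def second_momentum_coefficient(step: int, decay_rate: float = -0.8) -> float:
    # step-dependent beta2, now clipped so it never exceeds 0.999
    return min(1.0 - step ** decay_rate, 0.999)
```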
Bug
- Wrong typing of `reg_noise`. (#239, #240)
- `Lookahead`'s `param_groups` attribute is not loaded from checkpoint. (#237, #238) (round-trip sketch below)
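For the `Lookahead` fix, a checkpoint round-trip that exercises the restored behaviour; the `k`/`alpha` arguments shown are the wrapper's usual defaults but are assumptions here.

```python
import torch
from pytorch_optimizer import Lookahead

model = torch.nn.Linear(10, 2)
base = torch.optim.AdamW(model.parameters(), lr=1e-3)
optimizer = Lookahead(base, k=5, alpha=0.5)  # argument names assumed

# save and restore the optimizer state; with the fix, param_groups
# is loaded back from the checkpoint along with the rest of the state
torch.save(optimizer.state_dict(), 'ckpt.pt')
optimizer.load_state_dict(torch.load('ckpt.pt'))
```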
Contributions
thanks to @michaldyczko