pytorch-optimizer v3.3.0
Change Log
Feature
- Support `PaLM` variant for the `ScheduleFreeAdamW` optimizer. (#286, #288)
  - You can use this feature by setting `use_palm` to `True`.
- Implement `ADOPT` optimizer. (#289, #290)
- Implement `FTRL` optimizer. (#291)
- Implement `Cautious optimizer` feature. (#294)
  - "Improving Training with One Line of Code"
  - You can use it by setting `cautious=True` for the `Lion`, `AdaFactor`, and `AdEMAMix` optimizers.
- Improve the stability of the `ADOPT` optimizer. (#294)
- Support a new projection type, `random`, for `GaLoreProjector`. (#294)
- Implement `DeMo` optimizer. (#300, #301)
- Implement `Muon` optimizer. (#302)
- Implement `ScheduleFreeRAdam` optimizer. (#304)
- Implement `LaProp` optimizer. (#304)
- Support `Cautious` variant for the `LaProp`, `AdamP`, and `Adopt` optimizers. (#304)
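The idea behind the `Cautious` variant, per the referenced paper, is to mask out update components whose sign disagrees with the current gradient. A minimal plain-Python sketch of that masking idea (illustrative only; `cautious_update` is a hypothetical helper, not pytorch-optimizer's implementation, which applies the mask to optimizer update tensors internally when `cautious=True`):

```python
def cautious_update(update, grad):
    """Keep only update components that agree in sign with the gradient.

    Conceptual sketch of the 'cautious' mask (u * g > 0); not the
    library's actual tensor-based implementation.
    """
    return [u if u * g > 0 else 0.0 for u, g in zip(update, grad)]


# Components pointing the same way as the gradient pass through;
# disagreeing components are zeroed.
print(cautious_update([0.5, -0.2, 0.1], [1.0, 0.3, -0.4]))
# -> [0.5, 0.0, 0.0]
```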
Refactor
- Big refactoring, removing direct imports from `pytorch_optimizer.*`.
  - Some methods can no longer be imported directly from `pytorch_optimizer.*`, because they are probably not used frequently and are not optimizers themselves, but utilities used only by specific optimizers.
  - `pytorch_optimizer.[Shampoo stuff]` -> `pytorch_optimizer.optimizers.shampoo_utils.[Shampoo stuff]`
    - `shampoo_utils` contains `Graft`, `BlockPartitioner`, `PreConditioner`, etc. You can check the details here.
  - `pytorch_optimizer.GaLoreProjector` -> `pytorch_optimizer.optimizers.galore.GaLoreProjector`
  - `pytorch_optimizer.gradfilter_ema` -> `pytorch_optimizer.optimizers.grokfast.gradfilter_ema`
  - `pytorch_optimizer.gradfilter_ma` -> `pytorch_optimizer.optimizers.grokfast.gradfilter_ma`
  - `pytorch_optimizer.l2_projection` -> `pytorch_optimizer.optimizers.alig.l2_projection`
  - `pytorch_optimizer.flatten_grad` -> `pytorch_optimizer.optimizers.pcgrad.flatten_grad`
  - `pytorch_optimizer.un_flatten_grad` -> `pytorch_optimizer.optimizers.pcgrad.un_flatten_grad`
  - `pytorch_optimizer.reduce_max_except_dim` -> `pytorch_optimizer.optimizers.sm3.reduce_max_except_dim`
  - `pytorch_optimizer.neuron_norm` -> `pytorch_optimizer.optimizers.nero.neuron_norm`
  - `pytorch_optimizer.neuron_mean` -> `pytorch_optimizer.optimizers.nero.neuron_mean`
Docs
- Add more visualizations. (#297)
Bug
- Add `optimizer` parameter to the `PolyScheduler` constructor. (#295)
Contributions
Thanks to @tanganke.