Releases: kozistr/pytorch_optimizer

pytorch-optimizer v3.3.2

21 Dec 10:38
8f538d4

Change Log

Feature

Bug

  • Clone exp_avg before calling apply_cautious so that exp_avg itself is not masked. (#316)

pytorch-optimizer v3.3.1

21 Dec 07:20
d16a368

Change Log

Feature

Bug

  • Fix bias_correction in AdamG optimizer. (#305, #308)
  • Fix a potential bug when loading the state for the Lookahead optimizer (see the sketch below). (#306, #310)
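
As a hedged illustration of the area touched by the Lookahead fix, the sketch below saves and reloads the wrapper's state; it assumes the Lookahead(optimizer, k, alpha) wrapper and the AdamP optimizer exported by the package, with illustrative hyperparameters.

```python
from torch import nn

from pytorch_optimizer import AdamP, Lookahead

model = nn.Linear(4, 2)

# Lookahead wraps an inner optimizer; k/alpha values here are illustrative.
optimizer = Lookahead(AdamP(model.parameters(), lr=1e-3), k=5, alpha=0.5)

# save and restore the optimizer state, which is the code path fixed in #306/#310.
state = optimizer.state_dict()
optimizer.load_state_dict(state)
```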

Docs

Contributions

thanks to @Vectorrent

pytorch-optimizer v3.3.0

06 Dec 14:44
5def5d7

Change Log

Feature

Refactor

  • Big refactoring: direct imports of several helpers from pytorch_optimizer.* were removed.
    • Some helpers are no longer exported from the pytorch_optimizer.* top level because they are rarely used and are not optimizers themselves, but utilities for specific optimizers (see the import sketch after this list).
    • pytorch_optimizer.[Shampoo stuff] -> pytorch_optimizer.optimizers.shampoo_utils.[Shampoo stuff].
      • shampoo_utils such as Graft, BlockPartitioner, PreConditioner, etc.; see the documentation for details.
    • pytorch_optimizer.GaLoreProjector -> pytorch_optimizer.optimizers.galore.GaLoreProjector.
    • pytorch_optimizer.gradfilter_ema -> pytorch_optimizer.optimizers.grokfast.gradfilter_ema.
    • pytorch_optimizer.gradfilter_ma -> pytorch_optimizer.optimizers.grokfast.gradfilter_ma.
    • pytorch_optimizer.l2_projection -> pytorch_optimizer.optimizers.alig.l2_projection.
    • pytorch_optimizer.flatten_grad -> pytorch_optimizer.optimizers.pcgrad.flatten_grad.
    • pytorch_optimizer.un_flatten_grad -> pytorch_optimizer.optimizers.pcgrad.un_flatten_grad.
    • pytorch_optimizer.reduce_max_except_dim -> pytorch_optimizer.optimizers.sm3.reduce_max_except_dim.
    • pytorch_optimizer.neuron_norm -> pytorch_optimizer.optimizers.nero.neuron_norm.
    • pytorch_optimizer.neuron_mean -> pytorch_optimizer.optimizers.nero.neuron_mean.
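
To make the move concrete, here is a minimal before/after sketch for two of the relocated helpers; the new paths are taken directly from the list above.

```python
# before v3.3.0 (no longer available at the top level):
# from pytorch_optimizer import GaLoreProjector, gradfilter_ema

# since v3.3.0, import from the optimizer-specific modules instead:
from pytorch_optimizer.optimizers.galore import GaLoreProjector
from pytorch_optimizer.optimizers.grokfast import gradfilter_ema
```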

Docs

  • Add more visualizations. (#297)

Bug

  • Add optimizer parameter to PolyScheduler constructor. (#295)

Contributions

thanks to @tanganke

pytorch-optimizer v3.2.0

28 Oct 23:30
a59f2e1

Change Log

Feature

  • Implement SOAP optimizer. (#275)
  • Support AdEMAMix variants. (#276)
    • bnb_ademamix8bit, bnb_ademamix32bit, bnb_paged_ademamix8bit, bnb_paged_ademamix32bit
  • Support 8/4-bit and fp8 optimizers. (#208, #281)
    • torchao_adamw8bit, torchao_adamw4bit, torchao_adamwfp8.
  • Support module-name-level (e.g. LayerNorm) weight decay exclusion for get_optimizer_parameters (see the sketch after this list). (#282, #283)
  • Implement CPUOffloadOptimizer, which offloads the optimizer to the CPU for single-GPU training. (#284)
  • Support a regex-based filter for searching names of optimizers, lr schedulers, and loss functions.
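
A minimal sketch of the module-name-level weight-decay exclusion. It assumes get_optimizer_parameters and AdamP are exported at the package top level, and the wd_ban_list keyword and its entries are illustrative assumptions; check the library documentation for the exact interface.

```python
from torch import nn

from pytorch_optimizer import AdamP, get_optimizer_parameters

model = nn.Sequential(nn.Linear(8, 16), nn.LayerNorm(16), nn.Linear(16, 2))

# exclude parameters by module name (e.g. LayerNorm) or parameter name (e.g. bias)
# from weight decay; the ban-list entries here are illustrative only.
parameters = get_optimizer_parameters(model, weight_decay=1e-2, wd_ban_list=['bias', 'LayerNorm'])

optimizer = AdamP(parameters, lr=1e-3)
```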

Bug

  • Fix the should_grokfast condition at initialization. (#279, #280)

Contributions

thanks to @Vectorrent

pytorch-optimizer v3.1.2

10 Sep 10:58
9d5e181

Change Log

Feature

Bug

  • Add **kwargs to the constructor parameters as a dummy placeholder. (#270, #271)

pytorch-optimizer v3.1.1

14 Aug 09:47
a8eb19c

Change Log

Feature

Bug

  • Handle the optimizers that take the model instead of the parameters in create_optimizer() (see the sketch below). (#263)
  • Move the variable to the same device as the parameter. (#266, #267)
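
For context on the create_optimizer() fix, a hedged usage sketch; the keyword names below are assumptions and may differ from the actual signature.

```python
from torch import nn

from pytorch_optimizer import create_optimizer

model = nn.Linear(4, 2)

# create_optimizer builds an optimizer by name from the model; some optimizers
# (the case handled in #263) take the model itself rather than model.parameters().
optimizer = create_optimizer(model, 'adamp', lr=1e-3, weight_decay=1e-2)
```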

pytorch-optimizer v3.1.0

21 Jul 11:54
d00136f

Change Log

Feature

Refactor

  • Refactor AdamMini optimizer. (#258)
  • Deprecate optional dependency, bitsandbytes. (#258)
  • Move get_rms, approximate_sq_grad functions to BaseOptimizer for reusability. (#258)
  • Refactor shampoo_utils.py. (#259)
  • Add debias, debias_adam methods in BaseOptimizer. (#261)
  • Refactor optimizers to inherit from BaseOptimizer only, instead of multiple classes. (#261)

Bug

  • Fix several bugs in AdamMini optimizer. (#257)

Contributions

thanks to @sdbds

pytorch-optimizer v3.0.2

06 Jul 11:04
232f72e

Change Log

Feature

Refactor

  • Refactor the Chebyshev lr scheduler modules. (#248)
    • Rename get_chebyshev_lr to get_chebyshev_lr_lambda.
    • Rename get_chebyshev_schedule to get_chebyshev_perm_steps.
    • get_chebyshev_schedule now returns a LambdaLR scheduler object (see the sketch after this list).
  • Refactor with ScheduleType. (#248)
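
A hedged sketch of the renamed entry point. The num_epochs keyword below is an assumption introduced for illustration; check the actual signature of get_chebyshev_schedule before using it.

```python
from torch import nn, optim

from pytorch_optimizer import get_chebyshev_schedule

model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=1e-1)

# as of v3.0.2, get_chebyshev_schedule returns a LambdaLR scheduler object.
# the `num_epochs` keyword is an assumption, not the confirmed parameter name.
scheduler = get_chebyshev_schedule(optimizer, num_epochs=100)

for _ in range(100):
    optimizer.step()
    scheduler.step()
```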

pytorch-optimizer v3.0.1

22 Jun 06:26
7c40a79

Change Log

Feature

Bug

  • Fix the wrong typing of reg_noise. (#239, #240)
  • Lookahead's param_groups attribute is not loaded from the checkpoint. (#237, #238)

Contributions

thanks to @michaldyczko

pytorch-optimizer v3.0.0

21 May 09:02
eda736f

Change Log

The major version is updated! (v2.12.0 -> v3.0.0) (#164)

Many optimizers, learning rate schedulers, and objective functions are included in pytorch-optimizer.
Currently, pytorch-optimizer supports 67 optimizers (+ bitsandbytes), 11 lr schedulers, and 13 loss functions, and has reached about 4K ~ 50K downloads per month (peak: 75K downloads per month)!

The reason for bumping the major version from v2 to v3 is that I think it's a good time to ship the recent implementations (the last update was about 7 months ago), and I plan to pivot to new concepts like training utilities while maintaining the original features (e.g. optimizers).
Richer test cases, benchmarks, and examples are also on the list!

Finally, thanks for using the pytorch-optimizer, and feel free to make any requests :)

Feature

Fix

  • Fix SRMM to allow operation beyond memory_length. (#227)

Dependency

  • Drop Python 3.7 support officially. (#221)
  • Update bitsandbytes to 0.43.0. (#228)

Docs

  • Add missing parameters in Ranger21 optimizer document. (#214, #215)
  • Fix WSAM optimizer paper link. (#219)

Contributions

thanks to @sdbds, @i404788
