
pytorch-optimizer v3.2.0

Released by @kozistr on 28 Oct 23:30 · a59f2e1

Change Log

Feature

  • Implement the SOAP optimizer (all new optimizers can be loaded by name; see the sketch after this list). (#275)
  • Support AdEMAMix variants. (#276)
    • bnb_ademamix8bit, bnb_ademamix32bit, bnb_paged_ademamix8bit, bnb_paged_ademamix32bit
  • Support 8-bit, 4-bit, and fp8 optimizers. (#208, #281)
    • torchao_adamw8bit, torchao_adamw4bit, torchao_adamwfp8
  • Support a module-name-level (e.g. LayerNorm) weight-decay exclusion in get_optimizer_parameters (sketched below). (#282, #283)
  • Implement CPUOffloadOptimizer, which offloads optimizer states to the CPU to save GPU memory in single-GPU training (sketched below). (#284)
  • Support a regex-based filter for searching the names of optimizers, LR schedulers, and loss functions (sketched below).
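
A minimal sketch of loading the new optimizers, assuming the library's usual `load_optimizer` entry point and a toy model; the optimizer names are taken from the list above.

```python
from torch import nn

from pytorch_optimizer import load_optimizer

model = nn.Linear(16, 4)

# SOAP is loaded like any other built-in optimizer, by name.
optimizer = load_optimizer('soap')(model.parameters(), lr=3e-3)

# The bitsandbytes AdEMAMix and torchao low-bit variants are addressed the
# same way; they require `bitsandbytes` / `torchao` to be installed.
# optimizer = load_optimizer('bnb_ademamix8bit')(model.parameters(), lr=3e-3)
# optimizer = load_optimizer('torchao_adamw8bit')(model.parameters(), lr=3e-3)
```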
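
A minimal sketch of the module-name-level weight-decay exclusion, assuming the documented `get_optimizer_parameters(model, weight_decay, wd_ban_list)` signature; the exact ban-list entries here are illustrative.

```python
import torch
from torch import nn

from pytorch_optimizer import get_optimizer_parameters

model = nn.Sequential(nn.Linear(16, 16), nn.LayerNorm(16), nn.Linear(16, 4))

# 'LayerNorm' now matches at the module-name level, so every parameter of
# every LayerNorm module lands in the no-weight-decay group.
param_groups = get_optimizer_parameters(
    model,
    weight_decay=1e-2,
    wd_ban_list=['bias', 'LayerNorm'],
)

optimizer = torch.optim.AdamW(param_groups, lr=3e-3)
```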
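
A hedged sketch of CPUOffloadOptimizer. The `optimizer_class` keyword follows the torchao-style design this feature mirrors and is an assumption about the exact signature.

```python
import torch
from torch import nn

from pytorch_optimizer import CPUOffloadOptimizer

model = nn.Linear(16, 4).cuda()

# Optimizer states live on the CPU and updates run there, trading step time
# for GPU memory on a single GPU. `optimizer_class` is an assumed kwarg name.
optimizer = CPUOffloadOptimizer(
    model.parameters(),
    optimizer_class=torch.optim.AdamW,
    lr=3e-3,
)

loss = model(torch.randn(8, 16, device='cuda')).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```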
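
A minimal sketch of the regex-based search, assuming the existing `get_supported_*` helpers accept an optional filter argument as of this release.

```python
from pytorch_optimizer import (
    get_supported_loss_functions,
    get_supported_lr_schedulers,
    get_supported_optimizers,
)

# No filter: the full list, as before.
print(len(get_supported_optimizers()))

# A regex pattern narrows the result, e.g. the Ada* optimizer family or
# cosine-style schedulers.
print(get_supported_optimizers('ada*'))
print(get_supported_lr_schedulers('cosine*'))
print(get_supported_loss_functions('*focal*'))
```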

Bug

  • Fix the should_grokfast condition during initialization. (#279, #280)

Contributions

Thanks to @Vectorrent