Commit
set the default to use set_to_none for clearing gradients in BF16 optimizer. (microsoft#5434)

As discussed in microsoft#5175, set the default to use set_to_none for clearing gradients in BF16 optimizer. Additionally, for the case of zero clearing, use foreach_zero.

Verified correctness with mega-ds llama 7B training.

FYI @loadams

---------

Co-authored-by: Logan Adams <[email protected]>
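The two clearing strategies named in the message can be illustrated with a minimal sketch. This is not the code from the commit: `clear_grads` and its `params` argument are hypothetical names chosen for illustration, and the sketch assumes "foreach_zero" refers to PyTorch's `torch._foreach_zero_`, which zeroes a list of tensors in place.

```python
import torch


def clear_grads(params, set_to_none=True):
    """Hypothetical helper (not the DeepSpeed implementation).

    set_to_none=True  -> drop each gradient tensor so its memory can be freed.
    set_to_none=False -> keep the tensors and zero them in one batched call.
    """
    if set_to_none:
        for p in params:
            p.grad = None
        return

    # Collect only the gradients that exist, then zero them together.
    grads = [p.grad for p in params if p.grad is not None]
    if grads:
        torch._foreach_zero_(grads)


# Small demonstration with BF16 parameters on the default device.
params = [torch.nn.Parameter(torch.ones(4, dtype=torch.bfloat16)) for _ in range(3)]
sum((p.float() ** 2).sum() for p in params).backward()

clear_grads(params, set_to_none=False)  # gradients stay allocated, values become zero
clear_grads(params)                     # gradients are released (p.grad is None)
```

Setting gradients to `None` avoids a write over every gradient buffer and lets the allocator reuse the memory, while the zeroing path keeps each gradient at the same storage, which matters when something else holds a reference to those tensors.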