How do I use gradient accumulation with the Gemini plugin? #4610
Answered by Fridge003
bisunny asked this question in Community | Q&A
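While trying to run the llama2 example, I found that it contains no gradient accumulation code, and the example (gemini.sh) uses Gemini. How can I enable gradient accumulation for this example so that each LLaMA global batch contains 4M tokens?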
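To make the 4M-token target concrete, here is a rough sketch of the arithmetic for choosing the number of accumulation steps. The helper name `accumulation_steps` and the example values (micro-batch size 2, sequence length 4096, 8 GPUs) are hypothetical and not taken from the llama2 example; "4M" is read here as 4 × 2^20 tokens, which is also an assumption.

```python
import math


def accumulation_steps(target_tokens, micro_batch_size, seq_len, world_size):
    """Return how many micro-batches to accumulate per optimizer step.

    One micro-step processes micro_batch_size * seq_len tokens on each of
    the world_size data-parallel ranks.
    """
    tokens_per_micro_step = micro_batch_size * seq_len * world_size
    # Round up so the global batch is at least the target size.
    return math.ceil(target_tokens / tokens_per_micro_step)


# Hypothetical setup: 8 GPUs, 2 sequences of 4096 tokens per GPU per micro-step.
steps = accumulation_steps(
    target_tokens=4 * 2**20,
    micro_batch_size=2,
    seq_len=4096,
    world_size=8,
)
print(steps)  # 64
```

With these numbers each micro-step covers 65,536 tokens, so 64 accumulated micro-steps reach the target.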
Answered by Fridge003 on Oct 17, 2023
Replies: 2 comments 3 replies
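Gemini is a ZeRO-3 implementation built on chunk-based memory management and heterogeneous memory management; it does not support local gradient accumulation.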
Hello, support for gradient accumulation in Gemini is now complete.
3 replies
Answer selected by
Fridge003
For usage, refer to this document.
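For readers without the linked document at hand, the general shape of an accumulation loop can be sketched as below. This is not copied from the ColossalAI documentation. It assumes `booster` exposes `backward(loss, optimizer)` as ColossalAI's `Booster` does, and that `model`, `optimizer` and `criterion` are the objects returned by `booster.boost(...)`. The function name `train_one_epoch` is hypothetical. The plugin-side switch is, as far as I recall, an `enable_gradient_accumulation` argument on `GeminiPlugin`; treat that name as an assumption and confirm it against the document.

```python
# Assumed setup (not executed here; verify the argument name in the docs):
#   plugin = GeminiPlugin(enable_gradient_accumulation=True)
#   booster = Booster(plugin=plugin)
#   model, optimizer, criterion, dataloader, _ = booster.boost(
#       model, optimizer, criterion, dataloader)


def train_one_epoch(model, optimizer, criterion, booster, batches, accum_steps):
    """Step the optimizer once every `accum_steps` micro-batches.

    Returns the number of optimizer steps taken. Trailing micro-batches that
    do not fill a complete accumulation window are never stepped.
    """
    optimizer_steps = 0
    optimizer.zero_grad()
    for idx, (inputs, labels) in enumerate(batches, start=1):
        loss = criterion(model(inputs), labels)
        # Scale the loss so the accumulated gradient is the mean over the
        # whole global batch rather than the sum over micro-batches.
        booster.backward(loss / accum_steps, optimizer)
        if idx % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
            optimizer_steps += 1
    return optimizer_steps
```

Dividing the loss by `accum_steps` keeps the effective learning rate comparable to training with one large batch, which is the usual reason for scaling before `backward`.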