Support for Sequence / Context Parallelism #1972

dwzhu-pku · 2024-10-15T10:29:48Z

⚠️ Please check that this feature request hasn't been suggested before.

I searched previous Ideas in Discussions and didn't find any similar feature requests.
I searched previous Issues and didn't find any similar feature requests.

🔖 Feature description

Support sequence / context parallelism to enable SFT on sequences longer than 128k tokens on A100/H100 GPUs. With only 8 H100 GPUs, we can currently manage SFT on sequences of no more than 64k tokens.

✔️ Solution

Axolotl is built on Accelerate and can already integrate with many frameworks, such as DeepSpeed, to use their features. But there is still no straightforward way to use sequence / context parallelism with these integrations. I guess this repo may offer some clues: https://github.com/jzhang38/EasyContext . It seems that we only need to monkeypatch the model and make some changes to the data loading procedure.
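To make the data loading part a bit more concrete, here is a rough sketch of what I have in mind: each rank keeps only its own slice of a long sample, along with the global position ids so that positional embeddings still see absolute positions. The function name `shard_for_rank` and its parameters are hypothetical, not an existing Axolotl or EasyContext API, and a plain contiguous split is only an assumption (a real implementation may use a different split to balance causal attention work across ranks).

```python
from typing import List, Tuple


def shard_for_rank(
    input_ids: List[int],
    rank: int,
    world_size: int,
    pad_id: int = 0,  # hypothetical padding token id
) -> Tuple[List[int], List[int]]:
    """Return the (tokens, position_ids) slice that one rank would hold."""
    # Pad so the sequence splits evenly across all ranks.
    remainder = len(input_ids) % world_size
    if remainder:
        input_ids = input_ids + [pad_id] * (world_size - remainder)

    chunk = len(input_ids) // world_size
    start = rank * chunk

    # Keep global positions so each shard knows where it sits in the full sequence.
    tokens = input_ids[start:start + chunk]
    position_ids = list(range(start, start + chunk))
    return tokens, position_ids
```

With 8 ranks, a 128k-token sample would leave each GPU holding 16k tokens of activations, and the monkeypatched attention would be responsible for exchanging keys and values between ranks.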

❓ Alternatives

No response

📝 Additional Context

No response

Acknowledgements

My issue title is concise, descriptive, and in title casing.
I have searched the existing issues to make sure this feature has not been requested yet.
I have provided enough information for the maintainers to understand and evaluate this request.

chiragjn · 2024-10-28T17:59:38Z

+1
https://github.com/pytorch/torchtitan also has an FSDP2 implementation that includes 4D parallelism with Context Parallelism

dwzhu-pku added the enhancement New feature or request label Oct 15, 2024
