Issues: pytorch/torchtitan

Issues list

Checkpoint conversion
#758 opened Dec 20, 2024 by MaxiBoether
JobConfig does not support typing (label: enhancement)
#753 opened Dec 18, 2024 by greeneggsandyaml
Low bit Optimizers & FA-3
#742 opened Dec 16, 2024 by asahni04
Issue: Loss Discrepancy Between FSDP1 and FSDP2 with AdamW Optimizer (label: question)
#724 opened Dec 9, 2024 by Teng-xu
Context parallelism understanding (labels: context_parallel, question)
#723 opened Dec 9, 2024 by jinsong-mao
First Shard Group Save and Load Checkpoint for HSDP (label: question)
#709 opened Nov 29, 2024 by qsh-zh
[rfc] torchtitan release practices (label: release_blocking)
#688 opened Nov 22, 2024 by tianyu-l (milestone: torchtitan v1.0.0 release)
[Parallelism] Implement vocabulary parallelism (label: enhancement)
#680 opened Nov 15, 2024 by casper-hansen
Any suggestion for Llama-3.1-70b (128k seq len) deploy mesh with torchtitan? (labels: enhancement, question)
#678 opened Nov 15, 2024 by medivh-xp
Very low wps with H200 Gpus (label: question)
#676 opened Nov 13, 2024 by aniltrkkn
Questions about FSDP2 support and memory usage. (label: question)
#658 opened Oct 29, 2024 by tangjiasheng
torch.distributed.breakpoint(rank=1) hangs because of --local-ranks-filter 0 (label: documentation)
#652 opened Oct 25, 2024 by weifengpy
[Multimodal] Adding OBELICS DataLoader (label: enhancement)
#650 opened Oct 24, 2024 by TJ-Solergibert
[Config] Make FSDP reshard_after_forward: bool configurable (label: enhancement)
#644 opened Oct 22, 2024 by awgu
What is the expected inference steps after I apply torchao in training? (label: question)
#638 opened Oct 21, 2024 by goldhuang
add H100 in CI (labels: better_engineering, integration test)
#632 opened Oct 18, 2024 by tianyu-l
create a note on torchtitan official release (labels: documentation, release_blocking)
#631 opened Oct 18, 2024 by tianyu-l (milestone: torchtitan v1.0.0 release)
Non-DP runs default to float32 precision (label: enhancement)
#630 opened Oct 18, 2024 by carmocca