Pull requests: NVIDIA/Megatron-LM

Pull requests list

Enabling UCC backend for PP communication
#1157 opened Sep 24, 2024 by youngeunkwon0405
opt:opt ltor masks
#1155 opened Sep 24, 2024 by Baibaifan
Fix typo lobal_smoothing -> label_smoothing
#1137 opened Sep 13, 2024 by lifeiteng
Fix shape of qk_layernorm.
#1130 opened Sep 10, 2024 by ftgreat
Update fully_parallel.py
#1067 opened Sep 4, 2024 by weipingtao
use torch.exp_ not torch.exp(..., out=)
#1054 opened Aug 30, 2024 by crcrpar
Fix dataset helper compilation
#1048 opened Aug 28, 2024 by yzygitzh
fix packaging import bug when using setuptools v70
#1045 opened Aug 28, 2024 by 1195343015
Fixing doc link
#1037 opened Aug 26, 2024 by MekkCyber
Fix BlendableDataset for low sampling probs
#1007 opened Aug 14, 2024 by janEbert
[Bugfix] Fix typo in moe doc
#986 opened Aug 8, 2024 by billishyahao
[bugfix] Fix _warmup_jit_function
#973 opened Aug 7, 2024 by taowangcheng
[bugfix] Fix the incorrect with-statement
#972 opened Aug 7, 2024 by aaa123git
Update README.md
#961 opened Aug 1, 2024 by ArtificialZeng
fix typo in token_dispatcher.py
#960 opened Jul 31, 2024 by xinqiu
[DOC] Fix wrong llama2 pretrain url in README (stale: no activity in 60 days on issue or PR)
#941 opened Jul 22, 2024 by lausannel
Update README.md to add some mcore documentation and links (stale: no activity in 60 days on issue or PR)
#928 opened Jul 11, 2024 by shanmugamr1992
Fix weight name mismatch for checkpoint conversion in legacy Megatron using TransformerEngine (stale: no activity in 60 days on issue or PR)
#921 opened Jul 10, 2024 by singleheart
remove hardcoded gelu from BERT models. (stale: no activity in 60 days on issue or PR)
#917 opened Jul 9, 2024 by skothenhill-nv
[BUG] Suppress initialization for moe router if not necessary (stale: no activity in 60 days on issue or PR)
#914 opened Jul 9, 2024 by haolin-nju
Fix typos
#904 opened Jul 6, 2024 by omahs