Pull requests: NVIDIA/Megatron-LM
Fix duplicate init for self.module in DistributedDataParallel
#1065
opened Sep 4, 2024 by
Aurelius84
Fix: timers('interval-time') bug and abnormal termination
#1017
opened Aug 20, 2024 by
bingnandu
add hoper llama golden with mcore calling stack
#987
opened Aug 8, 2024 by
yiakwy-xpu-ml-framework-team
[DOC] Fix wrong llama2 pretrain url in README
stale
No activity in 60 days on issue or PR
#941
opened Jul 22, 2024 by
lausannel
Update README.md to add some mcore documentation and links
stale
No activity in 60 days on issue or PR
#928
opened Jul 11, 2024 by
shanmugamr1992
Fix weight name mismatch for checkpoint conversion in legacy Megatron using TransformerEngine
stale
No activity in 60 days on issue or PR
#921
opened Jul 10, 2024 by
singleheart
remove hardcoded gelu from BERT models.
stale
No activity in 60 days on issue or PR
#917
opened Jul 9, 2024 by
skothenhill-nv
[BUG] Suppress initialization for moe router if not necessary
stale
No activity in 60 days on issue or PR
#914
opened Jul 9, 2024 by
haolin-nju