Issues: huggingface/accelerate
#3239 · OOM error when training llama 7B model using Accelerate FSDP setting (opened Nov 14, 2024 by JlPang863)
#3237 · slurmstepd: error: execve(): accelerate: No such file or directory (opened Nov 13, 2024 by huiyang865)
#3233 · Code Logical Bug: Using Init Handler Kwargs for Grad Scaler In FP8 Training (accelerate/accelerator.py) (opened Nov 11, 2024 by immortalCO)
#3232 · fsdp checkpoint saving leads to NCCL WARN Cuda failure 2 'out of memory' (opened Nov 10, 2024 by edchengg)
#3230 · Error while fine-tuning with peft, lora, accelerate, SFTConfig and SFTTrainer (opened Nov 8, 2024 by Isdriai)
#3225 · torch.cuda.is_available() false when running multi-gpu inference with accelerate launch (opened Nov 6, 2024 by paulgekeler)
#3224 · "mat2 must be a matrix" error when finetuning Dreambooth flux with FSDP (opened Nov 5, 2024 by weixiong-ur)
#3218 · Incorrect type in output of utils.pad_across_processes when input is torch.bool (opened Nov 4, 2024 by mariusarvinte)
#3216 · PyPI published Accelerate==1.1.0 is missing Source Distributions (opened Nov 4, 2024 by helloworld1)
#3214 · ConnectionError: Tried to launch distributed communication on port 29401, but another process is utilizing it. Please specify a different port (such as using the --main_process_port flag or specifying a different main_process_port in your config file) and rerun your script. To automatically use the next open port (on a single node), you can set this to 0. (opened Nov 4, 2024 by qinchangchang)
#3210 · How could I convert ZeRO-0 deepspeed weights into an fp32 model checkpoint? (opened Nov 1, 2024 by liming-ai)
#3209 · The optimizer is not receiving the FSDP model parameters. (opened Nov 1, 2024 by eljandoubi)
#3203 · Command line arguments related to deepspeed for accelerate launch do not override those of default_config.yaml (opened Oct 29, 2024 by JdbermeoUZH)
#3184 · Unable to access model gradients with DeepSpeed and Accelerate (opened Oct 22, 2024 by shouyezhe)
#3182 · accelerator.prepare() gets OOM, but works on a single GPU (opened Oct 21, 2024 by lqf0624)
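One recurring theme in this list (#3214 in particular) is the rendezvous port that accelerate uses for distributed communication. The error text quoted in #3214 says that setting main_process_port to 0 makes accelerate take the next open port on a single node. The sketch below illustrates the underlying OS mechanism, binding TCP port 0 so the kernel assigns a free ephemeral port; find_free_port is a hypothetical helper for illustration, not accelerate's own code.

```python
import socket

def find_free_port() -> int:
    """Ask the OS for a currently free TCP port by binding port 0.

    This is the same mechanism a launcher can use when the configured
    main process port (e.g. 29401 in issue #3214) is already taken.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 => kernel picks a free port
        return s.getsockname()[1]

if __name__ == "__main__":
    print(find_free_port())  # an OS-assigned ephemeral port
```

In practice you would pass the chosen port (or 0 directly) to accelerate launch via the --main_process_port flag mentioned in the error message, or set main_process_port in your config file.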