Can we use multi GPU while exporting (diffusers) onnx model? #96

Open
wxsms opened this issue Oct 29, 2024 · 5 comments

wxsms commented Oct 29, 2024

I'm building an SDXL model in float16 on 2x 4090s, so the available GPU memory is ~48 GB.

However, the script in diffusers/quantization does not seem able to use both of them, and it raises an OOM error while exporting the ONNX model.

I tried to export the model on CPU, but it's too slow.
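
For reference, a minimal sketch of the CPU fallback mentioned above, assuming the `backbone` and the `modelopt_export_sd` helper from the diffusers quantization example (the call signature follows that script); tracing the SDXL UNet on CPU avoids the OOM but is very slow:

    import torch

    # Move the backbone off the GPU and trace/export it on CPU instead.
    backbone = backbone.to("cpu").eval()
    with torch.no_grad():
        modelopt_export_sd(backbone, f"{str(args.onnx_dir)}", args.model, args.format)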

jingyu-ml self-assigned this Oct 29, 2024

jingyu-ml (Collaborator) commented Oct 29, 2024

@wxsms could you try something like this?

    backbone.eval()
    with torch.no_grad():  # no gradient state is needed during the ONNX export
        modelopt_export_sd(backbone, f"{str(args.onnx_dir)}", args.model, args.format)

Also move the other parts, such as the VAE and the CLIP text encoders, to CPU. Please let me know if it works.
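
A minimal sketch of that suggestion, assuming the diffusers SDXL pipeline object is called `pipe` (a name introduced here for illustration) and that `backbone`, `args`, and `modelopt_export_sd` come from the quantization script above; the component attribute names are those of the standard diffusers SDXL pipeline:

    import gc
    import torch

    # Keep only the UNet backbone on the GPU; the other components can sit
    # on CPU while the ONNX graph is traced.
    pipe.vae.to("cpu")
    pipe.text_encoder.to("cpu")
    pipe.text_encoder_2.to("cpu")  # SDXL has a second CLIP text encoder
    gc.collect()
    torch.cuda.empty_cache()  # hand the freed blocks back to the driver

    backbone = pipe.unet.to("cuda").eval()
    with torch.no_grad():
        modelopt_export_sd(backbone, f"{str(args.onnx_dir)}", args.model, args.format)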

wxsms (Author) commented Nov 1, 2024

Sadly, it does not work. I managed to export the ONNX model on an A800 and compile it on a 4090.

jingyu-ml (Collaborator) commented

I'll take a look and get back to you; I have barely tested on a 4090. Just to confirm, can you export the FP16 SDXL on a 4090?

wxsms (Author) commented Nov 5, 2024

Thank you, I will try it later.

ZhenshengWu commented Nov 19, 2024

Has there been any progress on this issue? I encountered the same problem on an RTX 4090. Eventually, I performed the ONNX model conversion on an A800. Using nvidia-smi, I noticed that the ONNX conversion process requires around 30 GB of VRAM.

Model: SDXL-1.0
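
A small sketch of how one might confirm that figure from inside the export script itself, assuming the same `backbone`, `args`, and `modelopt_export_sd` names as above (nvidia-smi also counts the CUDA context and allocator overhead, so it will read higher than this number):

    import torch

    torch.cuda.reset_peak_memory_stats()

    backbone.eval()
    with torch.no_grad():
        modelopt_export_sd(backbone, f"{str(args.onnx_dir)}", args.model, args.format)

    # Peak memory actually held in PyTorch tensors during the export.
    print(f"peak allocated: {torch.cuda.max_memory_allocated() / 2**30:.1f} GiB")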
