make pipelines tests device-agnostic (part1) #9399

Open · wants to merge 58 commits into main
Conversation

@faaany (Contributor) commented Sep 9, 2024

What does this PR do?

Below is some evidence:

PASSED tests/pipelines/animatediff/test_animatediff.py::AnimateDiffPipelineFastTests::test_to_device
PASSED tests/pipelines/amused/test_amused.py::AmusedPipelineFastTests::test_model_cpu_offload_forward_pass
PASSED tests/pipelines/amused/test_amused.py::AmusedPipelineFastTests::test_cpu_offload_forward_pass_twice
PASSED tests/pipelines/amused/test_amused_inpaint.py::AmusedInpaintPipelineFastTests::test_to_device

@yiyixuxu

faaany marked this pull request as ready for review September 9, 2024 22:55
@sayakpaul (Member) commented:

Could you provide some details about the machine you used to run the tests changed in this PR?

@faaany (Contributor, Author) commented Sep 10, 2024:

> Could you provide some details about the machine you used to run the tests changed in this PR?

Yes, I am running on an Intel(R) Data Center GPU Max 1550.

@faaany (Contributor, Author) commented Sep 10, 2024:

> Could you provide some details about the machine you used to run the tests changed in this PR?

xpu is the common device name for Intel GPUs.
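
For context, a minimal sketch of how device selection looks with the XPU backend; it assumes a PyTorch build with XPU support (native in recent PyTorch, or via intel_extension_for_pytorch):

import torch

# Pick the Intel GPU ("xpu") when available, mirroring the usual CUDA pattern.
# torch.xpu.is_available() assumes a PyTorch build with XPU support.
device = "xpu" if torch.xpu.is_available() else "cpu"
x = torch.randn(2, 3, device=device)
print(x.device)  # xpu:0 on an Intel GPU, cpu otherwise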

@faaany (Contributor, Author) commented Sep 12, 2024:

Hi @sayakpaul, any concerns about this PR?

@faaany (Contributor, Author) commented Sep 13, 2024:

Could you please help retrigger the CI? Thanks a lot!

@faaany (Contributor, Author) commented Sep 18, 2024:

Hi @yiyixuxu, could you let me know your thoughts on this PR?

@faaany (Contributor, Author) commented Sep 20, 2024:

Hi folks, it has been two weeks already; any feedback? I am waiting to enable more cases on XPU. If this is not the right approach, please let me know as well. Thanks a lot!

@sayakpaul @yiyixuxu

@sayakpaul (Member) commented:

Please be a little more respectful towards the maintainers' time. It will be reviewed soon.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@faaany (Contributor, Author) commented Sep 20, 2024:

> Please be a little more respectful towards the maintainers' time. It will be reviewed soon.

Oh, if my expression sounded impolite, sorry for that, and thanks for letting me know!

No hurry; I am just a bit worried that I might not be on the right track. Thanks for your understanding.

@yao-matrix commented:

> Please be a little more respectful towards the maintainers' time. It will be reviewed soon.

@sayakpaul, could you please review this? This PR and the following PRs are part of the effort to integrate Intel GPUs into the Hugging Face ecosystem and to make the CIs run as expected. Thanks.

@sayakpaul (Member) left a comment:

Thanks!

@yiyixuxu (Collaborator) left a comment:

I left some comments!

@faaany (Contributor, Author) commented Oct 23, 2024:

@yiyixuxu thanks for the comments. I went through all the changes again and updated them where necessary. Please have a look. Thanks!

@faaany (Contributor, Author) commented Oct 31, 2024:

Hi @yiyixuxu, can we move forward with this PR?

@faaany (Contributor, Author) commented Nov 6, 2024:

Hi guys, any concerns about this PR?

yiyixuxu requested a review from DN6 November 8, 2024 01:38
@@ -58,7 +58,7 @@ def get_dummy_inputs(self, device, seed=0):
def test_save_load_optional_components(self):
self._test_save_load_optional_components()

@unittest.skipIf(torch_device != "cuda", reason="float16 requires CUDA")
@require_non_cpu
@yiyixuxu (Collaborator) commented Nov 8, 2024:

float16 does not work with all accelerators, no?

@faaany (Contributor, Author) commented Nov 8, 2024:

Okay, we have two options here:

  1. We allow these tests to run only on CUDA or XPU.
  2. We allow them to run on all accelerators; accelerators where they fail should either add float16 support or skip these tests.

I think option 2 is better, because float16 is a common data type and a failure will make people seriously consider supporting it rather than ignoring it.

But I am fine with both; just let me know which you would prefer. Thanks!

A Collaborator commented:

@faaany Let's go with Option 1 please. I feel like with 2, you just end up with noise in the CI when running on accelerators that don't support FP16.

Suggested change:
- @require_non_cpu
+ @unittest.skipIf(torch_device not in ["cuda", "xpu"], reason="float16 requires CUDA or XPU")
+ @require_accelerator
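
For illustration, under Option 1 a float16 test would end up guarded roughly like this, assuming the require_accelerator rename suggested below lands; the test class and method names here are hypothetical, not from the PR:

import unittest

from diffusers.utils.testing_utils import require_accelerator, torch_device

class SomePipelineFastTests(unittest.TestCase):  # hypothetical test class
    @unittest.skipIf(torch_device not in ["cuda", "xpu"], reason="float16 requires CUDA or XPU")
    @require_accelerator
    def test_float16_inference(self):  # hypothetical test name
        ...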

@@ -373,6 +373,14 @@ def require_note_seq(test_case):
return unittest.skipUnless(is_note_seq_available(), "test requires note_seq")(test_case)


def require_non_cpu(test_case):
A Collaborator commented:

Suggested change:
- def require_non_cpu(test_case):
+ def require_accelerator(test_case):
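
For reference, a minimal sketch of what the renamed decorator could look like, following the unittest.skipUnless pattern of require_note_seq above; the exact body in the PR may differ:

import unittest

def require_accelerator(test_case):
    # Skip unless torch_device names a hardware accelerator (anything non-CPU).
    # Assumes torch_device is defined in the surrounding module, as it is in
    # diffusers.utils.testing_utils.
    return unittest.skipUnless(torch_device != "cpu", "test requires a hardware accelerator")(test_case)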

@@ -272,7 +278,7 @@ def test_inference_batch_single_identical(
max_diff = np.abs(to_np(output_batch[0][0]) - to_np(output[0][0])).max()
assert max_diff < expected_max_diff

@unittest.skipIf(torch_device != "cuda", reason="CUDA and CPU are required to switch devices")
@require_non_cpu
A Collaborator commented:

Suggested change:
- @require_non_cpu
+ @require_accelerator
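
As context for why any accelerator (not specifically CUDA) suffices here, the guarded test exercises moving a pipeline between CPU and the accelerator, roughly like this (illustrative, not the exact test body; the checkpoint name is hypothetical):

from diffusers import DiffusionPipeline
from diffusers.utils.testing_utils import torch_device

# Illustrative only: load a small pipeline and move it between devices.
pipe = DiffusionPipeline.from_pretrained("hf-internal-testing/tiny-stable-diffusion-pipe")  # hypothetical checkpoint
pipe.to(torch_device)  # e.g. "cuda" or "xpu"
assert pipe.device.type == torch_device
pipe.to("cpu")  # switching back is what the test exercises
assert pipe.device.type == "cpu"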

@@ -21,7 +21,7 @@
from diffusers.models.attention import FreeNoiseTransformerBlock
from diffusers.utils import logging
from diffusers.utils.import_utils import is_xformers_available
- from diffusers.utils.testing_utils import torch_device
+ from diffusers.utils.testing_utils import require_non_cpu, torch_device
A Collaborator commented:

Suggested change:
- from diffusers.utils.testing_utils import require_non_cpu, torch_device
+ from diffusers.utils.testing_utils import require_accelerator, torch_device

@@ -281,7 +281,7 @@ def test_inference_batch_single_identical(
max_diff = np.abs(to_np(output_batch[0][0]) - to_np(output[0][0])).max()
assert max_diff < expected_max_diff

- @unittest.skipIf(torch_device != "cuda", reason="CUDA and CPU are required to switch devices")
+ @require_non_cpu
A Collaborator commented:

Suggested change:
- @require_non_cpu
+ @require_accelerator
