Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enabled DETR (Object Detection) model #1046

Merged
merged 6 commits into from
Jul 28, 2024

Conversation

cfgfung
Copy link
Contributor

@cfgfung cfgfung commented Jun 5, 2024

What does this PR do?

This PR contains the patch, example, and test codes for DETR models.

A100 CUDA BF16(Autocast) benchmarks:
n_iterations: 10
Total latency (ms): 241.29199981689453
Average latency (ms): 24.129199981689453

Gaudi2 BF16(Autocast and Graph mode) benchmarks:
n_iterations: 10
Total latency (ms): 65.3073787689209
Average latency (ms): 6.53073787689209

Before submitting

  • [N.A.] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [Yes] Did you make sure to update the documentation with your changes?
  • [Yes] Did you write any new necessary tests?

@cfgfung cfgfung requested a review from regisss as a code owner June 5, 2024 21:40
@libinta
Copy link
Collaborator

libinta commented Jun 25, 2024

@cfgfung can you rebase the PR?

@cfgfung
Copy link
Contributor Author

cfgfung commented Jun 25, 2024

@cfgfung can you rebase the PR?

Hi,
I have applied rebase and here is the test result

image

Copy link
Contributor

@imangohari1 imangohari1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @cfgfung
thank you for the work here.
Below are my suggestions for this PR. Let's work on these and do a final review.

  • I've made some changes regarding some minor clean ups and ci tests. Please review and implement them via git am < 0001* (file attached).
  • Few minor comments are give in comment.

0001-fea-Minor-CI-updates-and-clean-ups.patch

G2 results (after applying the patch)

---------------------------: System Configuration :---------------------------
Num CPU Cores : 160
CPU RAM       : 1056375224 KB
------------------------------------------------------------------------------
Detected cat with confidence 1.0 at location [344.0, 25.25, 640.0, 376.0]
Detected remote with confidence 0.996 at location [328.0, 75.5, 372.0, 188.0]
Detected remote with confidence 0.996 at location [39.0, 70.5, 176.0, 118.0]
Detected cat with confidence 1.0 at location [15.62, 52.5, 316.0, 472.0]
Detected couch with confidence 0.996 at location [-1.25, 0.94, 640.0, 472.0]

Stats:
------------------------------------------------------------
Total latency (ms): 58.38346481323242 (for n_iterations=10)
Average latency (ms): 5.838346481323242 (per iteration)
------------------------------------------------------------

Copy link
Contributor

@imangohari1 imangohari1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments.

Makefile Outdated
# Run unit and integration tests related to Image segmentation
fast_tests_object_detection:
python -m pip install .[tests]
python -m pip install timm
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if no_timm is used for the test, is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is still needed. The model has called some API/modules from timm library

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then let's move this to here: https://github.com/huggingface/optimum-habana/blob/main/setup.py#L41-L51
doesn't need to be a separate pip install.


adapt_transformers_to_gaudi()

# you can specify the revision tag if you don't want the timm dependency
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure how to approach this here, but would be nice to able to pass/toggle between no_timm and main revisions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,
The DETR is composed of different modules from timm, transformer and other popular libraries.
Calling this line will apply any applicable optimizations to this DETR model, especially the transformer submodules.

@imangohari1
Copy link
Contributor

@cfgfung
Can we move this pr along? Pls. apply the changes I suggested. Tnx.

@cfgfung
Copy link
Contributor Author

cfgfung commented Jul 15, 2024

Hi @cfgfung thank you for the work here. Below are my suggestions for this PR. Let's work on these and do a final review.

  • I've made some changes regarding some minor clean ups and ci tests. Please review and implement them via git am < 0001* (file attached).
  • Few minor comments are give in comment.

0001-fea-Minor-CI-updates-and-clean-ups.patch

G2 results (after applying the patch)

---------------------------: System Configuration :---------------------------
Num CPU Cores : 160
CPU RAM       : 1056375224 KB
------------------------------------------------------------------------------
Detected cat with confidence 1.0 at location [344.0, 25.25, 640.0, 376.0]
Detected remote with confidence 0.996 at location [328.0, 75.5, 372.0, 188.0]
Detected remote with confidence 0.996 at location [39.0, 70.5, 176.0, 118.0]
Detected cat with confidence 1.0 at location [15.62, 52.5, 316.0, 472.0]
Detected couch with confidence 0.996 at location [-1.25, 0.94, 640.0, 472.0]

Stats:
------------------------------------------------------------
Total latency (ms): 58.38346481323242 (for n_iterations=10)
Average latency (ms): 5.838346481323242 (per iteration)
------------------------------------------------------------

I have applied the patch.

@cfgfung
Copy link
Contributor Author

cfgfung commented Jul 15, 2024

@imangohari1 Hi, I have applied rebase and addressed the changes mentioned above.

Copy link
Contributor

@imangohari1 imangohari1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor change.
Also could we test this with multiple input images?

Please make sure to run make style and rebase/sync with the head of OH at main.

Makefile Outdated
# Run unit and integration tests related to Image segmentation
fast_tests_object_detection:
python -m pip install .[tests]
python -m pip install timm
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then let's move this to here: https://github.com/huggingface/optimum-habana/blob/main/setup.py#L41-L51
doesn't need to be a separate pip install.

@cfgfung
Copy link
Contributor Author

cfgfung commented Jul 15, 2024

minor change. Also could we test this with multiple input images?

Please make sure to run make style and rebase/sync with the head of OH at main.

Moved the timm installation command to setup.py. This is the result for the make style
image

A single image should be enough to safeguard the precision/numerical issues and throughput. The following merged PRs are also using a single image as a test.

#801
#814
#826

Copy link
Contributor

@imangohari1 imangohari1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@regisss
Could you take a final look here please?
Thanks

@libinta libinta added the run-test Run CI for PRs from external contributors label Jul 16, 2024
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

setup.py Show resolved Hide resolved
@libinta libinta added the synapse1.17 PR that should be available along with Synapse 1.17 but have no dependency on Synapse 1.17 content. label Jul 24, 2024
@yeonsily
Copy link
Collaborator

@cfgfung can you please fix code style?

@cfgfung
Copy link
Contributor Author

cfgfung commented Jul 25, 2024

@cfgfung can you please fix code style?

Hi @yeonsily,

On my side, I cannot see anything changed using make style

image

Anyhow, I have rebased with the latest main and pushed the updates.

Copy link
Collaborator

@regisss regisss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@regisss regisss merged commit a246660 into huggingface:main Jul 28, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-test Run CI for PRs from external contributors synapse1.17 PR that should be available along with Synapse 1.17 but have no dependency on Synapse 1.17 content.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants