[Tensorflow] Build with GPU enabled #7648
A new Pull Request was created by @smuzaffar (Malik Shahzad Muzaffar) for branch IB/CMSSW_12_3_X/master. @cmsbuild, @smuzaffar, @iarspider can you please review it and eventually sign? Thanks. |
test parameters:
|
please test |
-1
Failed Tests: RelVals-GPU
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here:
RelVals-GPU
Comparison Summary
Summary:
|
assign heterogeneous |
This works for me (I did not try the other two):
|
please test for slc7_ppc64le_gcc11 |
-1
Failed Tests: UnitTests RelVals RelVals-THREADING AddOn
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here:
Unit Tests
I found errors in the following unit tests:
---> test testTFGraphLoading had ERRORS
---> test testTFMetaGraphLoading had ERRORS
---> test testTFThreadPools had ERRORS
---> test DRNTest had ERRORS
and more ...
RelVals
RelVals-THREADING
AddOn Tests
|
please test for slc7_ppc64le_gcc11 |
Have the tests been stuck for 3 days now? |
@cmsbuild, please abort |
please test for slc7_ppc64le_gcc11 |
-1
Failed Tests: UnitTests RelVals AddOn
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here:
Unit Tests
I found errors in the following unit tests:
---> test TestDQMOnlineClient-dt_dqm_sourceclient had ERRORS
---> test TestDQMOnlineClient-visualization_secondInstance had ERRORS
---> test TestDQMOnlineClient-visualization had ERRORS
---> test TestDQMOnlineClient-dt4ml_dqm_sourceclient had ERRORS
and more ...
RelVals
AddOn Tests
|
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5fdcc1/30467/summary.html
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here:
Comparison Summary
Summary:
GPU Comparison Summary
Summary:
|
I'm trying to understand why I still see CUDA out of memory issues from Tensorflow in Instead the tests for |
Hmm, it looks like the ppc64le tests were not run using cms-sw/cmssw#40551. I have updated #7648 (comment) so that the PR tests always use the cmssw PR. |
please test with cms-sw/cmssw#40551 for el8_aarch64_gcc11 |
please test with cms-sw/cmssw#40551 for el8_ppc64le_gcc11 |
please test with cms-sw/cmssw#40551 |
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5fdcc1/30515/summary.html
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here:
Comparison Summary
Summary:
GPU Comparison Summary
Summary:
|
please test with cms-sw/cmssw#40551 for el8_ppc64le_gcc11 |
I don't get how the simple unittest can saturate the CUDA memory. Is there maybe something wrong with the machine?
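One possible explanation that does not depend on the machine: by default TensorFlow maps nearly all of the memory of every visible GPU as soon as a process initializes the device, so even a small unit test can hit CUDA out-of-memory errors when several test processes share one GPU. The following is a minimal sketch, assuming the tests are launched from a Python wrapper. The helper name `tf_test_env` is hypothetical and not part of CMSSW or this PR; `CUDA_VISIBLE_DEVICES` and `TF_FORCE_GPU_ALLOW_GROWTH` are the standard CUDA and TensorFlow environment variables.

```python
import os


def tf_test_env(backend="cpu", base_env=None):
    """Return a copy of the environment tuned for one TensorFlow test process.

    Hypothetical helper for illustration only (not CMSSW code).
    """
    # Work on a copy so the caller's environment is never modified.
    env = dict(os.environ if base_env is None else base_env)
    if backend == "cpu":
        # An empty CUDA_VISIBLE_DEVICES hides every GPU from the CUDA runtime,
        # so TensorFlow falls back to the CPU and allocates no device memory.
        env["CUDA_VISIBLE_DEVICES"] = ""
    elif backend == "gpu":
        # Ask TensorFlow to grow its GPU allocation on demand instead of
        # reserving almost all device memory at start-up.
        env["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    else:
        raise ValueError(f"unknown backend: {backend!r}")
    return env


if __name__ == "__main__":
    for name in ("cpu", "gpu"):
        env = tf_test_env(name, base_env={})
        print(name, env)
```

The returned dictionary can be passed as `env=` to `subprocess.run` when starting a test. Whether default memory reservation is actually the cause of the failures seen here is not established.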
|
please test with cms-sw/cmssw#40551 for el8_ppc64le_gcc11 |
I am planning to improve the organization of the TF options we expose to users, but I think that can be done in a separate PR. The changes in this PR allow all the tests to pass (with backend::cpu by default). We will provide a runTheMatrix test with a GPU-activated workflow. Do you think we can merge this one? Thanks! |
+1
|
merge |