HBHEGPU: protection when detector is out #34429

Merged
merged 2 commits into from
Jul 13, 2021

Conversation

mariadalfonso
Contributor

fix issue reported in #34197

@cmsbuild cmsbuild added this to the CMSSW_12_0_X milestone Jul 9, 2021
@mariadalfonso mariadalfonso changed the title protection when detector is out HBHEGPU: protection when detector is out Jul 9, 2021
@cmsbuild
Contributor

cmsbuild commented Jul 9, 2021

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34429/23838

  • This PR adds an extra 20KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

cmsbuild commented Jul 9, 2021

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34429/23839

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

cmsbuild commented Jul 9, 2021

A new Pull Request was created by @mariadalfonso for master.

It involves the following packages:

  • RecoLocalCalo/HcalRecProducers (reconstruction)

@perrotta, @jpata, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@apsallid, @abdoulline, @bsunanda this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy, @perrotta you are the release manager for this.

cms-bot commands are listed here

@mariadalfonso
Contributor Author

type bugfix

@perrotta
Contributor

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-49cdd2/16692/summary.html
COMMIT: ccc744f
CMSSW: CMSSW_12_0_X_2021-07-11-2300/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/34429/16692/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 38
  • DQMHistoTests: Total histograms compared: 2787742
  • DQMHistoTests: Total failures: 868
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2786874
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 37 files compared)
  • Checked 160 log files, 37 edm output root files, 38 DQM output files
  • TriggerResults: no differences found

@mariadalfonso
Contributor Author

enable gpu

@mariadalfonso
Contributor Author

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-49cdd2/16707/summary.html
COMMIT: ccc744f
CMSSW: CMSSW_12_0_X_2021-07-12-1100/slc7_amd64_gcc900
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/34429/16707/install.sh to create a dev area with all the needed externals and cmssw changes.

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 4
  • DQMHistoTests: Total histograms compared: 19855
  • DQMHistoTests: Total failures: 86
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 19769
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
  • Checked 12 log files, 9 edm output root files, 4 DQM output files
  • TriggerResults: no differences found

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 3 differences found in the comparisons
  • DQMHistoTests: Total files compared: 38
  • DQMHistoTests: Total histograms compared: 2787742
  • DQMHistoTests: Total failures: 871
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 2786870
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 37 files compared)
  • DQMHistoSizes: changed ( 312.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 160 log files, 37 edm output root files, 38 DQM output files
  • TriggerResults: no differences found

@perrotta
Contributor

@mariadalfonso the effect of the fix is not testable with the standard jenkins tests with "normal" inputs, of course.
Could you please confirm that you verified it actually fixes the cases that generated the issue?

@mariadalfonso
Contributor Author

> @mariadalfonso the effect of the fix is not testable with the standard jenkins tests with "normal" inputs, of course.
> Could you please confirm that you verified it actually fixes the cases that generated the issue?

I replicated the crash following the instructions from here
#34197 (comment)
and the fix provided solves the crash for that particular configuration.

@perrotta
Contributor

+reconstruction

  • Verified to fix the HCAL and ECAL local reconstruction on a GPU when any of those detectors is out
  • The fix has no effect on the "normal" workflows when all detectors are present
  • Since the issue showed up during MWGR, the already provided backport (HBHEGPU: protection when detector is out #34428) is welcome in 11_3
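
For readers without the diff at hand, the kind of protection discussed in this thread can be sketched as an early return when a detector contributes no input. This is only an illustration under stated assumptions, not the code merged in this PR: `Digi`, `RecHit`, `reconstructOnDevice` and `produceRecHits` are hypothetical names, and the "device" step is a plain CPU stand-in so the snippet compiles on its own.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; none of these names come from the PR itself.
struct Digi {
  float charge;
};
struct RecHit {
  float energy;
};

// Stand-in for the device-side reconstruction. Here it only copies the
// charge into the energy field, one output hit per input digi.
std::vector<RecHit> reconstructOnDevice(const std::vector<Digi>& digis) {
  std::vector<RecHit> hits;
  hits.reserve(digis.size());
  for (const auto& d : digis) {
    hits.push_back(RecHit{d.charge});
  }
  return hits;
}

// Guarded entry point: when the detector is out there are no digis, so an
// empty collection is returned instead of scheduling work on zero channels.
std::vector<RecHit> produceRecHits(const std::vector<Digi>& digis) {
  if (digis.empty()) {
    return {};
  }
  return reconstructOnDevice(digis);
}
```

With non-empty input the guard is never taken, which matches the observation above that "normal" workflows are unaffected.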

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy, @perrotta (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

+1
