Simplify cudacompat layer to use a 1-dimensional grid #586

fwyzard · 2020-11-30T10:56:18Z

Remove the possibility of changing the grid size used by the cms::cudacompat layer, and make it a constant equal to {1, 1, 1}.

This avoids a thread-related problem caused by TBB using worker threads where the grid size had not been initialised.
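To illustrate the failure mode described above, here is a minimal sketch, not the actual `cudaCompat.h` code. It assumes the old grid size was held in a per-thread variable that had to be set explicitly on each thread. The names `sketch`, `mutableGridDim`, `setGrid`, `gridXSeenByWorker` and `constantGridXSeenByWorker` are hypothetical, and a plain `std::thread` stands in for a TBB worker thread.

```cpp
#include <cassert>
#include <cstdint>
#include <thread>

namespace sketch {
  struct dim3 {
    uint32_t x, y, z;
  };

  // Assumed old style: a mutable, per-thread grid size. Each thread gets its
  // own copy, which stays at zero until setGrid() is called on that thread.
  thread_local dim3 mutableGridDim = {0, 0, 0};
  inline void setGrid(dim3 grid) { mutableGridDim = grid; }

  // Style described in this PR: a constant shared by every thread.
  constexpr dim3 gridDim = {1, 1, 1};
}  // namespace sketch

// Read the per-thread grid size from a freshly started worker thread.
inline uint32_t gridXSeenByWorker() {
  uint32_t seen = 0xffffffff;
  std::thread worker([&seen] { seen = sketch::mutableGridDim.x; });
  worker.join();
  return seen;
}

// Read the constant grid size from a freshly started worker thread.
inline uint32_t constantGridXSeenByWorker() {
  uint32_t seen = 0xffffffff;
  std::thread worker([&seen] { seen = sketch::gridDim.x; });
  worker.join();
  return seen;
}
```

In this sketch, calling `sketch::setGrid({1, 1, 1})` on the main thread does not affect the worker's copy, so `gridXSeenByWorker()` still reads 0, while `constantGridXSeenByWorker()` reads 1 on any thread.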

The kernels for pixel clustering need to be rewritten to support a one-dimensional grid before they can run on the CPU.
Currently they are used only on the GPU in the Patatrack workflows, but they are exercised on the CPU by the gpuClustering_t tests; those tests have been commented out until the kernels can be updated.
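As a rough illustration of why a kernel may need rewriting, the sketch below contrasts a loop that assumes one block per module with a grid-stride loop that works for any grid size, including the constant {1, 1, 1}. This is an assumption about the general pattern, not the actual clustering kernels; `compat_sketch`, `countOneBlockPerModule` and `countGridIndependent` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

namespace compat_sketch {
  struct dim3 {
    uint32_t x, y, z;
  };
  // Constant CPU-side stand-ins for the CUDA built-in variables.
  constexpr dim3 threadIdx = {0, 0, 0};
  constexpr dim3 blockDim = {1, 1, 1};
  constexpr dim3 blockIdx = {0, 0, 0};
  constexpr dim3 gridDim = {1, 1, 1};
}  // namespace compat_sketch

// Assumes the grid has one block per module: with a single block on the CPU,
// only module 0 is ever processed.
inline void countOneBlockPerModule(const std::vector<uint32_t>& moduleOfPixel,
                                   std::vector<uint32_t>& pixelsInModule) {
  using namespace compat_sketch;
  const uint32_t module = blockIdx.x;
  for (uint32_t i = threadIdx.x; i < moduleOfPixel.size(); i += blockDim.x)
    if (moduleOfPixel[i] == module)
      ++pixelsInModule[module];
}

// Grid-independent version: each block strides over the modules, so a grid of
// one block visits all of them in turn.
inline void countGridIndependent(const std::vector<uint32_t>& moduleOfPixel,
                                 std::vector<uint32_t>& pixelsInModule) {
  using namespace compat_sketch;
  for (uint32_t module = blockIdx.x; module < pixelsInModule.size(); module += gridDim.x)
    for (uint32_t i = threadIdx.x; i < moduleOfPixel.size(); i += blockDim.x)
      if (moduleOfPixel[i] == module)
        ++pixelsInModule[module];
}
```

The same grid-stride form is also valid on the GPU, where `gridDim.x` blocks share the modules between them.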


fwyzard · 2020-11-30T10:56:54Z

Validation summary

Reference release CMSSW_11_2_0_pre10 at 6c149b2
Development branch cms-patatrack/CMSSW_11_2_X_Patatrack at e454ee0
Testing branch cms-patatrack/CMSSW_11_2_X_Patatrack at e454ee0 with PRs:

Simplify cudacompat layer to use a 1-dimensional grid #586 at 9a6fe7d

Validation plots

/RelValTTbar_14TeV/CMSSW_11_2_0_pre7-PU_112X_mcRun3_2021_realistic_v8-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflow 11634.5
tracking validation plots and summary for workflow 11634.501
tracking validation plots and summary for workflow 11634.502
tracking validation plots and summary for workflow 11634.505
tracking validation plots and summary for workflow 11634.506

/RelValZMM_14/CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v2/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflow 11634.5
tracking validation plots and summary for workflow 11634.501
tracking validation plots and summary for workflow 11634.502
tracking validation plots and summary for workflow 11634.505
tracking validation plots and summary for workflow 11634.506

/RelValZEE_14/CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflow 11634.5
tracking validation plots and summary for workflow 11634.501
tracking validation plots and summary for workflow 11634.502
tracking validation plots and summary for workflow 11634.505
tracking validation plots and summary for workflow 11634.506

Validation plots (CPU vs GPU)

/RelValTTbar_14TeV/CMSSW_11_2_0_pre7-PU_112X_mcRun3_2021_realistic_v8-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflows 11634.502 and 11634.501
tracking validation plots and summary for workflows 11634.506 and 11634.505

/RelValZMM_14/CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v2/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflows 11634.502 and 11634.501
tracking validation plots and summary for workflows 11634.506 and 11634.505

/RelValZEE_14/CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflows 11634.502 and 11634.501
tracking validation plots and summary for workflows 11634.506 and 11634.505

Throughput plots

/EphemeralHLTPhysics1/Run2018D-v1/RAW run=323775 lumi=53

logs and `nvprof`/`nvvp` profiles

/RelValTTbar_14TeV/CMSSW_11_2_0_pre7-PU_112X_mcRun3_2021_realistic_v8-v1/GEN-SIM-DIGI-RAW

reference release, workflow 11634.5
- ✔️ step3.py: log
development release, workflow 11634.5
- ✔️ step3.py: log
development release, workflow 11634.501
- ❌ step3.py: log
development release, workflow 11634.502
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 11634.505
- ❌ step3.py: log
development release, workflow 11634.506
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 11634.511
- ✔️ step3.py: log
development release, workflow 11634.512
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ❌ cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
development release, workflow 11634.521
- ✔️ step3.py: log
development release, workflow 11634.522
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 136.885502
development release, workflow 136.885512
development release, workflow 136.885522
testing release, workflow 11634.5
- ✔️ step3.py: log
testing release, workflow 11634.501
- ✔️ step3.py: log
testing release, workflow 11634.502
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 11634.505
- ✔️ step3.py: log
testing release, workflow 11634.506
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 11634.511
- ✔️ step3.py: log
testing release, workflow 11634.512
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ❌ cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
testing release, workflow 11634.521
- ✔️ step3.py: log
testing release, workflow 11634.522
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 136.885502
testing release, workflow 136.885512
testing release, workflow 136.885522

/RelValZMM_14/CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v2/GEN-SIM-DIGI-RAW

reference release, workflow 11634.5
- ✔️ step3.py: log
development release, workflow 11634.5
- ✔️ step3.py: log
development release, workflow 11634.501
- ✔️ step3.py: log
development release, workflow 11634.502
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 11634.505
- ✔️ step3.py: log
development release, workflow 11634.506
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 11634.511
- ✔️ step3.py: log
development release, workflow 11634.512
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ❌ cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
development release, workflow 11634.521
- ✔️ step3.py: log
development release, workflow 11634.522
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 136.885502
development release, workflow 136.885512
development release, workflow 136.885522
testing release, workflow 11634.5
- ✔️ step3.py: log
testing release, workflow 11634.501
- ✔️ step3.py: log
testing release, workflow 11634.502
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 11634.505
- ✔️ step3.py: log
testing release, workflow 11634.506
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 11634.511
- ✔️ step3.py: log
testing release, workflow 11634.512
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ❌ cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
testing release, workflow 11634.521
- ✔️ step3.py: log
testing release, workflow 11634.522
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 136.885502
testing release, workflow 136.885512
testing release, workflow 136.885522

/RelValZEE_14/CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v1/GEN-SIM-DIGI-RAW

reference release, workflow 11634.5
- ✔️ step3.py: log
development release, workflow 11634.5
- ✔️ step3.py: log
development release, workflow 11634.501
- ✔️ step3.py: log
development release, workflow 11634.502
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 11634.505
- ✔️ step3.py: log
development release, workflow 11634.506
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 11634.511
- ✔️ step3.py: log
development release, workflow 11634.512
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ❌ cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
development release, workflow 11634.521
- ✔️ step3.py: log
development release, workflow 11634.522
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 136.885502
development release, workflow 136.885512
development release, workflow 136.885522
testing release, workflow 11634.5
- ✔️ step3.py: log
testing release, workflow 11634.501
- ✔️ step3.py: log
testing release, workflow 11634.502
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 11634.505
- ✔️ step3.py: log
testing release, workflow 11634.506
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 11634.511
- ✔️ step3.py: log
testing release, workflow 11634.512
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ❌ cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
testing release, workflow 11634.521
- ✔️ step3.py: log
testing release, workflow 11634.522
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 136.885502
testing release, workflow 136.885512
testing release, workflow 136.885522

Logs

The full log is available at https://patatrack.web.cern.ch/patatrack/validation/pulls/6fb888134e1c0c96596810e45bca76019e19f734/log .

fwyzard · 2020-11-30T11:03:05Z

Should fix #564

VinInn · 2020-11-30T11:13:14Z

If it compiles and runs, the code is OK with me.
CPU workflows may even go faster without TLS.

RecoPixelVertexing/PixelVertexFinding/plugins/gpuVertexFinderImpl.h

fwyzard

remove unnecessary asserts.

RecoPixelVertexing/PixelVertexFinding/plugins/gpuVertexFinderImpl.h

RecoPixelVertexing/PixelVertexFinding/test/VertexFinder_t.h

HeterogeneousCore/CUDAUtilities/interface/cudaCompat.h

Co-authored-by: Matti Kortelainen <[email protected]>

fwyzard · 2020-12-01T01:21:53Z

All crashes in the CPU workflows are indeed fixed.

No changes to the physics performance or throughput, as expected.


fwyzard requested a review from VinInn November 30, 2020 10:56

fwyzard mentioned this pull request Nov 30, 2020

Segmentation fault in the quadruplets workflow on CPU #564

Closed

fwyzard added the bug-fix and Pixels labels Nov 30, 2020

VinInn approved these changes Nov 30, 2020

RecoPixelVertexing/PixelVertexFinding/plugins/gpuVertexFinderImpl.h (outdated, resolved)

fwyzard commented Nov 30, 2020

RecoPixelVertexing/PixelVertexFinding/plugins/gpuVertexFinderImpl.h (outdated, resolved)

RecoPixelVertexing/PixelVertexFinding/test/VertexFinder_t.h (outdated, resolved)

Remove unnecessary asserts

e32c4ed

VinInn mentioned this pull request Nov 30, 2020

make clusterizer kernels independent of grid size #588

Merged

makortel reviewed Nov 30, 2020

HeterogeneousCore/CUDAUtilities/interface/cudaCompat.h (outdated, resolved)

Fix typo in the comment

40220bd

Co-authored-by: Matti Kortelainen <[email protected]>

fwyzard merged commit 9de7905 into cms-patatrack:CMSSW_11_2_X_Patatrack Dec 1, 2020

fwyzard deleted the cudacompat_one_dimensional_grid branch December 1, 2020 01:22

fwyzard mentioned this pull request Dec 3, 2020

Patatrack integration - Pixel local reconstruction (9/N) cms-sw/cmssw#31721

Merged

fwyzard mentioned this pull request Dec 26, 2020

Simplify cudacompat layer to use a 1-dimensional grid cms-sw/cmssw#32586

Merged

fwyzard mentioned this pull request Dec 28, 2020

Patatrack integration - Pixel vertex reconstruction (11/N) cms-sw/cmssw#31723

Merged

makortel added a commit to makortel/pixeltrack-standalone that referenced this pull request Dec 29, 2020

[cudacompat] Simplify cudacompat layer to use a 1-dimensional grid (cms-patatrack/cmssw#586)

e71bfab

makortel added a commit to makortel/pixeltrack-standalone that referenced this pull request Dec 29, 2020

[cudacompat] Simplify cudacompat layer to use a 1-dimensional grid (cms-patatrack/cmssw#586)

1613135

makortel mentioned this pull request Dec 30, 2020

[cudacompat] Add a CPU implementation through cudacompat cms-patatrack/pixeltrack-standalone#151

Merged
