Reimplementation of distributed band-to-tridiagonal #946

Merged
merged 8 commits into from
Aug 30, 2023

Collaborator

@rasolca rasolca commented Jul 21, 2023

Similar improvement as for the local implementation.

Scaling remains the same.

@msimberg can you have a quick look?
I would like to use this PR for the benchmarks.

@rasolca rasolca added this to the release v0.2.0 milestone Jul 21, 2023
@rasolca rasolca self-assigned this Jul 21, 2023
Collaborator

@msimberg msimberg left a comment

Looks reasonable to me at a superficial glance. Nice work!

@albestro
Collaborator

FYI

[3]
36.1356
14.3168
46.0902
93.3744
36.7428
[3] 228.214s dL (30097, 30097) (512, 512) 128 (2, 2) 18 MC
[4]
35.683
Rank 0 [Mon Jul 24 18:55:14 2023] [c0-0c0s5n2] Fatal error in PMPI_Test: Message truncated, error stack:
PMPI_Test(178)...................: MPI_Test(request=0x1553dadf0d70, flag=0x1553dadf0dc0, status=0x1) failed
MPIR_Test_impl(67)...............: 
MPID_nem_gni_lmt_start_recv(1667): Message from rank 1 and tag 234 truncated; 147456 bytes received but buffer size is 2048
srun: error: nid00022: task 0: Exited with exit code 255
srun: launch/slurm: _step_signal: Terminating StepId=47929378.2
slurmstepd: error: *** STEP 47929378.2 ON nid00022 CANCELLED AT 2023-07-24T18:55:16 ***

@rasolca
Collaborator Author

rasolca commented Aug 28, 2023

cscs-ci run

@rasolca rasolca marked this pull request as ready for review August 28, 2023 17:00
@codecov-commenter

codecov-commenter commented Aug 28, 2023

Codecov Report

Merging #946 (1d75410) into master (6084329) will increase coverage by 1.46%.
Report is 1 commits behind head on master.
The diff coverage is 97.39%.

❗ Current head 1d75410 differs from pull request most recent head b454c75. Consider uploading reports for the commit b454c75 to get more accurate results

@@            Coverage Diff             @@
##           master     #946      +/-   ##
==========================================
+ Coverage   93.35%   94.81%   +1.46%     
==========================================
  Files         143      129      -14     
  Lines        8605     7809     -796     
  Branches     1103     1055      -48     
==========================================
- Hits         8033     7404     -629     
+ Misses        388      238     -150     
+ Partials      184      167      -17     
Files Changed Coverage Δ
include/dlaf/eigensolver/band_to_tridiag/mc.h 97.82% <97.39%> (-0.02%) ⬇️

... and 35 files with indirect coverage changes

Collaborator

@albestro albestro left a comment

Not much to say; I'm not able to fully understand all the changes happening here.

include/dlaf/eigensolver/band_to_tridiag/mc.h (two review threads, both outdated and resolved)
Co-authored-by: Alberto Invernizzi <[email protected]>
@rasolca
Collaborator Author

rasolca commented Aug 30, 2023

cscs-ci run

@rasolca
Collaborator Author

rasolca commented Aug 30, 2023

cscs-ci run

@rasolca rasolca merged commit dd22d71 into master Aug 30, 2023
3 checks passed
@rasolca rasolca deleted the rasolca/band_to_trid_sp_dist branch August 30, 2023 11:42