Store ctest output from all ranks in CI as GitLab artifacts #1208

Open
wants to merge 3 commits into
base: master

Conversation

msimberg
Collaborator

@msimberg msimberg commented Nov 6, 2024

No description provided.

@msimberg msimberg self-assigned this Nov 6, 2024
@msimberg
Collaborator Author

msimberg commented Nov 6, 2024

cscs-ci run

4 similar comments

@msimberg
Collaborator Author

msimberg commented Nov 6, 2024

This seems to be working nicely. All ranks now produce verbose test output, but only rank 0 actually prints to the terminal. Every rank also writes its output to a file that's stored as an artifact.

For example, https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/4700071344751697/7514005670787789/-/jobs/8288337613 had test_multiplication_hermitian fail again with a segfault. The main output says srun: error: nid01590: task 2: Exited with exit code 1. If one then goes to the artifacts and looks at the log for rank 2 (ctest.2.txt: https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/4700071344751697/7514005670787789/-/jobs/8288337613/artifacts/external_file/output/ctest.2.txt), the backtrace says:
```
60: Backtrace:
60: /opt/spack/opt/spack/linux-ubuntu24.04-x86_64/clang-15.0.7/mimalloc-2.1.7-rn6y2rsalkj2ojyubzejabtelannsbga/lib/libmimalloc.so.2(__libc_free+0x10)[0x7ffff69456c0]
60: /DLA-Future-build/lib/libgtest.so.1.13.0(_ZN7testing8UnitTest13PopGTestTraceEv+0xba)[0x7ffff6c67fea]
60: /DLA-Future-build/lib/libgtest.so.1.13.0(_ZN7testing11ScopedTraceD1Ev+0x18)[0x7ffff6c6aef8]
60: /DLA-Future-build/test/unit/multiplication/test_multiplication_hermitian(+0x7162a)[0x5555555c562a]
60: /DLA-Future-build/test/unit/multiplication/test_multiplication_hermitian(+0x7126e)[0x5555555c526e]
```
with PopGTestTrace already being quite a good hint as to what's going on. Not all failures actually print a backtrace, but at least there's a higher chance of seeing one now.
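
For readers who want to reproduce the idea outside this CI setup, here is a minimal Python sketch of the per-rank logging pattern described above. It is not the script used in this PR. The helper names `run_and_log` and `log_for_srun_error` are hypothetical, and it assumes the rank is passed in by the caller (under Slurm it could be read from the `SLURM_PROCID` environment variable). The `ctest.<rank>.txt` naming is taken from the artifact path linked above.

```python
import re
import subprocess
import sys
from pathlib import Path


def run_and_log(cmd, rank, output_dir):
    """Run cmd and save its combined stdout/stderr to ctest.<rank>.txt.

    Only rank 0 echoes the output to the terminal; every rank writes a log.
    Hypothetical helper, not the wrapper used in the PR.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    log_path = output_dir / f"ctest.{rank}.txt"
    with log_path.open("w") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the same stream
            text=True,
        )
        for line in proc.stdout:
            log.write(line)
            if rank == 0:
                sys.stdout.write(line)
        returncode = proc.wait()
    return returncode, log_path


def log_for_srun_error(line):
    """Map an srun error line to the log file of the failing task.

    Assumes the task index reported by srun equals the rank in the file name,
    as in the example above. Returns None if the line does not match.
    """
    match = re.search(r"task (\d+): Exited with exit code \d+", line)
    if match is None:
        return None
    return f"ctest.{match.group(1)}.txt"
```

With these helpers, the srun message quoted above maps to `ctest.2.txt`, which is the file to open in the job artifacts. Propagating the command's exit code (for example via `sys.exit`) is left to the caller so that a failing rank still fails the job.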

@msimberg
Collaborator Author

msimberg commented Nov 6, 2024

cscs-ci run

@msimberg msimberg marked this pull request as ready for review November 6, 2024 19:19
Labels
None yet
Projects
Status: In Progress
1 participant