Precice test fails on perlmutter #50
Comments
Are there any specifics of the spec used to build preCICE? Also, as a note, the tilde in the variant doesn't render well in Markdown. I suggest wrapping it in a code block.
@fsimonis Fixed that variant. Here is the full dependency tree. Is there anything else that would help pin this down?
It looks like this was caused by a poisoned runtime environment. This error doesn't appear on a fresh run node.
@fsimonis I resolved this too quickly. I get a hang/timeout in a clean environment. The run output:
Both solvers fail in Given that both of them fail with the same error, I expect that this is some kind of problem in the environment. We don't do any fancy things in preCICE, so this should be reproducible with any dummy MPI code.
I'm seeing the same issue on Crusher. Error and variants/dependencies for the Crusher install are below. This is in a clean environment (basically all I've done is
We test MPICH in our CI using Fedora, which is still at version 34 (MPICH 3.4.1). Have you tried launching multiple other MPI programs simultaneously to see if the system can handle this? We experienced problems on SuperMUC(-NG) with multiple MPI programs running simultaneously on the same slots while spanning multiple nodes. This could be another symptom of the same problem. (Of course, this is more of a guess, as you don't actually run the solverdummies with mpirun.)
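One way to try the "launch several programs at once" check is a small driver script. The following is only a sketch, not something from the preCICE or E4S test suites: the helper name `run_concurrently` is made up for illustration, and the placeholder command would need to be replaced with whatever launch line the system actually uses for a dummy MPI program. It starts all commands together and reports, for each one, either its exit code or that it hit the shared timeout, which should make a hang like the one described above easier to tell apart from a crash.

```python
import subprocess
import sys
import time


def run_concurrently(commands, timeout=60.0):
    """Start every command at once; report an exit code or a hang for each.

    Returns a list of (command, status) pairs, where status is the integer
    exit code, or the string "timeout" if the process had to be killed.
    """
    # Launch everything first so the programs really overlap in time.
    # Output is discarded here; drop the DEVNULL arguments to see it.
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for cmd in commands
    ]
    deadline = time.monotonic() + timeout  # one deadline shared by all
    results = []
    for cmd, proc in zip(commands, procs):
        remaining = max(0.0, deadline - time.monotonic())
        try:
            status = proc.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            proc.kill()   # treat as a hang: kill and reap the process
            proc.wait()
            status = "timeout"
        results.append((cmd, status))
    return results


if __name__ == "__main__":
    # Placeholder command (an assumption): swap in the real launch line
    # for a dummy MPI executable on the target system.
    dummy = [sys.executable, "-c", "print('hello')"]
    for cmd, status in run_concurrently([dummy, dummy], timeout=30.0):
        print(status, " ".join(cmd))
```

If every command reports `0` when run alone but some report `timeout` when started together, that would point toward the environment rather than preCICE.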
Your test runs fine locally with:
Using the spec:
@MakisH @fsimonis
The precice test defined here: https://github.com/E4S-Project/testsuite/tree/master/validation_tests/precice
Fails on perlmutter for this variant installed with e4s 22.11:
With the following console output: