Deterministic behavior with scotch 7 #87
Comments
Hi there, thanks for the bug report! Can you please point us to the flags that you are talking about, and check whether they still exist in scotch 7? If they do, we would be happy to add them to the builds here.
Thanks for the quick reply. I will try to point you to the flags that I believe are the ones causing the problems, but I am really no expert in compiling things - I usually let conda do the work for me (sorry about that).
So investigating the recipe directory in the 6.0.9 PR https://github.com/regro-cf-autotick-bot/scotch-feedstock/tree/4532f8f5ec7e4094d7df9f6e317c24eb01f1eaf7/recipe it seems to me that the build.sh file used to build scotch is using the compile flags defined in Makefile.inc. There, the flag If I see this correctly, this option is not set in the current build script https://github.com/conda-forge/scotch-feedstock/blob/main/recipe/build-scotch.sh
Moreover, it also seems that the flags I guess that these are the compiler flags which are responsible for the (non-)determinism of parallel runs.
In the cmake build that we use, these flags always seem to be set: https://github.com/live-clones/scotch/blob/82ec87f558f4acb7ccb69a079f531be380504c92/src/CMakeLists.txt#L49
So I’m not sure what’s causing the issue. Maybe someone else knows - I’m not overly familiar with scotch.
Okay, thanks a lot. Also the
Comment:
Hello everyone,
I have a question / issue with scotch 7. Recently, I was able to update the dependencies of my code to use scotch 7. However, since then I have experienced some issues when running my code in parallel - so I guess this is related to ptscotch / libptscotch, but I am not entirely sure.
I am using FEniCS, which in turn uses scotch for mesh partitioning and graph reordering. Since switching to scotch 7, some tests at https://github.com/sblauth/cashocs/ fail irreproducibly / non-deterministically when run in parallel. I know that this problem is related to scotch because switching the mesh partitioner to ParMETIS (which FEniCS also supports) makes the failures disappear. Due to licensing issues, I would prefer to stick with scotch as the mesh partitioning tool.
I have investigated the recipe for the conda-forge build a bit and it seems to me that the previous version (6.0.9), which works fine for me, sets some deterministic build flags, whereas version 7.0.4, with which I have the issues, does not?
As far as I have seen, this could now be addressed dynamically in scotch 7 (using contexts) - however, as I am using FEniCS from Python, I have no idea how to do so - it seems that this cannot be done with environment variables.
Are my observations regarding determinism in the conda-forge build correct? Is there any way for me, as someone who uses scotch via FEniCS, to restore the parallel determinism? Or would it be conceivable to provide a deterministic conda-forge scotch build?
I am also happy to provide further information if this is required.
Thanks a lot in advance,
Sebastian