NCEPLIBS-bufr ctest will fail with jedi/intel-impi/2020.2 module #231

Closed
nicholasesposito opened this issue Aug 30, 2022 · 8 comments · Fixed by #524
Labels
bug Something isn't working

Comments

@nicholasesposito

nicholasesposito commented Aug 30, 2022

When building NCEPLIBS-bufr with the jedi/intel-impi/2020.2 module, the build itself completes without errors. However, running ctest afterwards causes the first 7 tests to fail.

          1 - test_pyncepbufr_checkpoint (Failed)
          2 - test_pyncepbufr_gps (Failed)
          3 - test_pyncepbufr_prepbufr (Failed)
          4 - test_pyncepbufr_rad (Failed)
          5 - test_pyncepbufr_satwnd (Failed)
          6 - test_pyncepbufr_write (Failed)
          7 - test_pyncepbufr_test (Failed)

When we build with the GNU module (jedi/gnu-openmpi/9.2.0) instead, everything builds and all of the ctests pass.

The full module load process is below:

module purge
export JEDI_OPT=/scratch1/NCEPDEV/jcsda/jedipara/opt/modules
module use $JEDI_OPT/modulefiles/core
#module load jedi/gnu-openmpi/9.2.0   # Use this OR intel-impi below
module load jedi/intel-impi/2020.2

module load pybind11
module load bufr
module unload bufr/noaa-emc-11.5.0
module unload json-schema-validator/2.1.0
module unload json/3.9.1
module use -a /scratch2/NCEPDEV/marineda/Jong.Kim/save/modulefiles/
module load anaconda/3.15.1
@jbathegit
Collaborator

Just to clarify, what platform are you seeing this on, Nick? And are you seeing any warnings during the Python build step on this particular platform? If so, then perhaps it's related to #200, which is still awaiting resolution.

I also wonder if this is somehow related to #135, where we've seen similar behavior on other platforms including the WCOSS, and irrespective of whether or not code coverage flags are being used.

Either way, this will probably require assistance from someone more experienced with Python, so I've added some such folks to the thread with the hopes that they can step in to help.

@climbfuji
Contributor

climbfuji commented Aug 30, 2022 via email

@nicholasesposito
Author

This was performed on Hera.
There were no warnings throughout the ecbuild or make steps.

I ran the ctest with the options above. For test 7, the output was this:


7: Test command: /apps/oneapi/intelpython/latest/envs/2022.1.0/bin/python3.9 "/scratch1/NCEPDEV/da/Nicholas.Esposito/nceplibs-intel/NCEPLIBS-bufr/python/test/test.py"
7: Environment variables:
7:  PYTHONPATH=/scratch1/NCEPDEV/da/Nicholas.Esposito/nceplibs-intel/NCEPLIBS-bufr/build/bufr/lib64/python3.9/site-packages:
7: Test timeout computed to be: 10000000
7: Traceback (most recent call last):
7:   File "/scratch1/NCEPDEV/da/Nicholas.Esposito/nceplibs-intel/NCEPLIBS-bufr/python/test/test.py", line 2, in <module>
7:     import ncepbufr
7: ModuleNotFoundError: No module named 'ncepbufr'
7/7 Test #7: test_pyncepbufr_test .............***Failed    0.04 sec
Traceback (most recent call last):
  File "/scratch1/NCEPDEV/da/Nicholas.Esposito/nceplibs-intel/NCEPLIBS-bufr/python/test/test.py", line 2, in <module>
    import ncepbufr
ModuleNotFoundError: No module named 'ncepbufr'


@climbfuji
Contributor

It seems like the location of the ncepbufr Python module/library is not in your PYTHONPATH.
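As a first check, it may help to confirm that every directory listed in PYTHONPATH actually exists on disk. The following is only a minimal sketch: `check_pythonpath_entries` is a hypothetical helper, not part of NCEPLIBS-bufr, and it simply skips empty entries such as the one produced by the trailing colon in the log above.

```python
import os

def check_pythonpath_entries(pythonpath):
    """Return (entry, exists) pairs for each non-empty PYTHONPATH entry.

    Hypothetical helper for illustration only. `exists` is True when
    the entry is an existing directory.
    """
    # Split on the platform separator (":" on Linux) and drop empty entries.
    entries = [p for p in pythonpath.split(os.pathsep) if p]
    return [(p, os.path.isdir(p)) for p in entries]

if __name__ == "__main__":
    # Run this with the same PYTHONPATH that ctest reports for the failing test.
    for entry, exists in check_pythonpath_entries(os.environ.get("PYTHONPATH", "")):
        print(("ok      " if exists else "MISSING ") + entry)
```

If the build-tree site-packages directory is reported as missing, the Python package was presumably never placed there by the build.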

@jbathegit
Collaborator

Just touching base.

@nicholasesposito has this issue been resolved for you, or are you still waiting on help for this?

@nicholasesposito
Author

I just tried again, and I'm still getting the same ctest failures as before. Building with the GNU compiler still works all the way through. @emilyhcliu, how much do we want to pursue this, given that we have the other way of compiling?

@jbathegit
Collaborator

@nicholasesposito hopefully this is fixed now, because the Python extensions to the library are now generated during the build step rather than the install step. Thanks to @AlexanderRichert-NOAA for figuring this out!
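One way to confirm this on a fresh build, before running any install step, is to check that `ncepbufr` resolves from the build-tree site-packages directory that ctest puts on PYTHONPATH. This is a rough sketch under that assumption: `module_in_dir` is a hypothetical helper, and the directory to pass in is whatever PYTHONPATH value ctest prints for the failing test.

```python
from importlib.machinery import PathFinder

def module_in_dir(name, directory):
    """Return the file that module `name` resolves to inside `directory`.

    Hypothetical helper for illustration only. Returns None when the
    module cannot be found there. A bare directory without __init__.py
    (a namespace package) has no origin file and may also yield None.
    """
    # Restrict the search to the single directory instead of all of sys.path.
    spec = PathFinder.find_spec(name, [directory])
    return None if spec is None else spec.origin

if __name__ == "__main__":
    import sys
    # Usage: python check_module.py <site-packages directory>
    print(module_in_dir("ncepbufr", sys.argv[1]))
```

A result of `None` right after the build step would suggest the Python extensions are still not being generated into the build tree.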

@jbathegit
Collaborator

See #524 for more details
