Module wrongly found in HMNS with user-Modulepath #4593

Open
Flamefire opened this issue Aug 2, 2024 · 8 comments
@Flamefire
Contributor

We have a site-wide software installation and allow custom modules, i.e. --envvars-user-modules

To be able to build on top of that, the user needs to modify their module path.
For a base path like /software/modules/all/Core the user needs to do ml use /software/modules/all as otherwise EB won't be able to find modules in "Core". I mentioned this usability issue in #3703 (comment)

However that now leads to a different problem I haven't seen before:

  • I reinstall intel-compilers/2022.1.0 into the user env
  • EasyBuild then claims that e.g. impi/2021.6.0 (impi-2021.6.0-intel-compilers-2022.1.0.eb (module: Compiler/intel/2022.1.0 | impi/2021.6.0)) is already installed

However it is only installed in the global tree but not in my user env. Hence I cannot load the module:
$ ml intel-compilers/2022.1.0 impi/2021.6.0 fails

More curiously: It does find the impi installed in the global env but only via the full path:

$ ml intel-compilers/2022.1.0
Module intel-compilers/2022.1.0 and 3 dependencies loaded.
$ ml av impi

---------------- /software/modules/all -------------------
   Compiler/intel/2022.1.0/impi/2021.6.0

So using the trace output I see that EasyBuild checks that full path: Compiler/intel/2022.1.0/impi/2021.6.0 is already installed (module found), skipping.

However, with the currently loaded modules it isn't available under the module name "impi/2021.6.0", which is the name EB uses (e.g. in the sanity check) to load the module and the name a user expects.
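The mismatch can be reproduced without EasyBuild or Lmod. The following is a minimal sketch, not EasyBuild's actual lookup code: it assumes a module counts as "available" when a matching `.lua` file exists under some `MODULEPATH` entry, and it uses a hypothetical user tree under `user/modules/all`. The helper names (`touch`, `is_available`, `demo`) are made up for illustration.

```python
import os
import tempfile


def touch(path):
    """Create an empty file, including any missing parent directories."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w"):
        pass


def is_available(modulepath, module_name):
    """Return True if module_name resolves to a .lua file under any MODULEPATH entry."""
    parts = module_name.split("/")
    parts[-1] += ".lua"
    return any(os.path.isfile(os.path.join(entry, *parts)) for entry in modulepath)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        system_root = os.path.join(tmp, "software", "modules", "all")
        user_root = os.path.join(tmp, "user", "modules", "all")  # hypothetical user tree

        # System tree: compiler in Core, impi in the directory the compiler adds.
        touch(os.path.join(system_root, "Core", "intel-compilers", "2022.1.0.lua"))
        touch(os.path.join(system_root, "Compiler", "intel", "2022.1.0", "impi", "2021.6.0.lua"))
        # User tree: only the reinstalled compiler, which shadows the system one.
        touch(os.path.join(user_root, "Core", "intel-compilers", "2022.1.0.lua"))

        # MODULEPATH after `ml use <system_root>` and loading the user's compiler:
        # only the user's Compiler/intel/2022.1.0 directory gets added.
        modulepath = [
            os.path.join(user_root, "Compiler", "intel", "2022.1.0"),
            os.path.join(user_root, "Core"),
            system_root,
            os.path.join(system_root, "Core"),
        ]
        full_found = is_available(modulepath, "Compiler/intel/2022.1.0/impi/2021.6.0")
        short_found = is_available(modulepath, "impi/2021.6.0")
        return full_found, short_found


if __name__ == "__main__":
    print(demo())
```

Under these assumptions the full name is found through the `/software/modules/all` entry, while the short name is not found anywhere, which matches the "already installed" report combined with the failing `ml` call.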

@boegel boegel added this to the 4.x milestone Aug 13, 2024
@boegel
Member

boegel commented Aug 13, 2024

I wonder if @ocaisa can help out here a bit, I'm not sure I fully grasp the problem here...

Are you basically saying that there's bad side effects of EasyBuild checking for existence of modules with the "full" module name?

@Flamefire
Contributor Author

Are you basically saying that there's bad side effects of EasyBuild checking for existence of modules with the "full" module name?

Yes, that's basically the issue here.

@ocaisa
Member

ocaisa commented Aug 14, 2024

At JSC, they arrange the setup for users to install modules via a module, and set an alias for eb itself: https://github.com/easybuilders/JSC/blob/2024/dev_modules/UserInstallations/easybuild.lua#L89

The difference here seems to be that you are allowing users to define entire toolchains in their custom tree. This is something we created hooks to disallow at JSC. You'd need to create a custom setup to allow the user to install modules that extend the module path which also extend back into the system path. What you've done is shadow the system version which extends in 2 directions with something that only extends the module path in one direction.

@Flamefire
Contributor Author

I actually just wanted to extend in one direction: if the user installs their own module that is relevant for HMNS, the system modules are gone. This is acceptable, especially because current EB doesn't allow an --envvars-user-modules that does NOT take precedence. Otherwise a user could use that

The bug here is that EB finds a module that isn't available given some modules are loaded. Not sure how to fix that without actually loading modules which might not even work when modules "in the middle" went missing. In this case all existing, but now hidden modules would be (re)installed by eb --robot

@ocaisa
Member

ocaisa commented Aug 15, 2024

This seems complicated to me: you need the original module path (/software/modules/all) so eb finds the original compilers etc., then you want to switch to another module tree for the user that doesn't just complement the original tree.

I think eb is not set up to support this use case. I believe it loops over possibilities in the module path using the full path, so it finds the modules you don't want in the system path. HMNS is the complication, as EasyBuild is designed with generic support; this works fine with a single HMNS and an overlay for user installations on top. What you are trying to do is mix two different hierarchies, and that isn't really possible with the MNS-agnostic implementation. If I was to suggest something, I'd say just rebuild a new module tree leveraging the existing installations...but that is not something you can really say to users.

@Flamefire
Contributor Author

this works fine with a single HMNS and an overlay for user installations on top.

What is that exactly? There is a single HMNS and the user should be able to install additional modules. So the limitation is that users shouldn't be able to install modules expanding the module path (e.g. compiler)?
I see a related complication: the user installs a compiler (via --robot) that doesn't exist yet, and all works. Now the global modules get that compiler and things stop working.

This is something we created hooks to disallow at JSC.

That is to prevent exactly that, isn't it? Can you share that?

@ocaisa
Member

ocaisa commented Aug 15, 2024

Yes, that's it: users installing things that can extend the module path is not allowed. That basically means that they can only install things using known and supported toolchains...but they can install whatever they want with that restriction.

The hook to do this has evolved a lot since I was there, but you can find it at https://github.com/easybuilders/JSC/blob/2024/Custom_Hooks/eb_hooks.py

@Flamefire
Contributor Author

I see. Currently it prevents users from installing GCCcore or anything not in a list of allowed toolchain names. I guess things like "GCC" or "icc" that also extend the module path were contained there before too.

Thanks!
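For readers who want to try the same restriction, here is a rough sketch of what such a hook could look like. It is not the JSC hook linked above. `parse_hook` is one of the hook names EasyBuild looks for in a hooks file, but the dict-style access to `ec`, the two name sets, and the `UserInstallationError` class are assumptions made so the example runs on its own; a real hooks file would more likely raise EasyBuild's own error type.

```python
class UserInstallationError(Exception):
    """Raised when a user installation would extend the module path."""


# Hypothetical lists; a site would fill these with its own supported names.
ALLOWED_TOOLCHAINS = frozenset({"GCCcore", "gompi", "foss"})
BLOCKED_SOFTWARE = frozenset({"GCCcore", "GCC", "intel-compilers"})


def check_user_install(name, toolchain_name):
    """Reject software that extends MODULEPATH or uses an unsupported toolchain."""
    if name in BLOCKED_SOFTWARE:
        raise UserInstallationError(
            "%s extends the module path and may not be installed by users" % name
        )
    if toolchain_name not in ALLOWED_TOOLCHAINS:
        raise UserInstallationError(
            "toolchain %s is not in the list of allowed toolchains" % toolchain_name
        )


def parse_hook(ec, *args, **kwargs):
    """Hook called for each parsed easyconfig; ec is assumed to allow dict-style access."""
    check_user_install(ec["name"], ec["toolchain"]["name"])
```

With these hypothetical lists, an easyconfig for zlib with the GCCcore toolchain passes, while one for GCCcore itself (built with the system toolchain) is rejected before any build starts.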
