Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Nextflow compatible with NixOS (or other systems without /bin/bash) #5432

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

rollf
Copy link

@rollf rollf commented Oct 25, 2024

UPDATE: After having created this PR I realized that I could do better (see comments below) and have also invested more time to understand the code base. Now, I (hopefully) have a version that fits well and is satisfying for everybody. Please refer to all the comments below before looking into the diff.


ORIGINAL PR description:
(This comment was made when 29df9b7 was HEAD.)

This PR allows to use nextfow on NixOS where /bin/bash does not exist. This is achieved by invoking /usr/bin/env -S bash where necessary. This approach (or similar) has been discussed previously:

However, compared the previous discussions/implementations, I specifically do not propose to change /bin/bash unless really necessary. By "necessary" I mean that bash is invoked on a local host system where it may not be available. Namely, when <task>/.command.run shall be invoked (always on the host) and when <task>/.command.sh is invoked in a non-container context (i.e. also on the host). I have not touched any other occurrences of /bin/bash, esp. not within the container.

My initial implementation used which bash instead if /usr/bin/env -S bash but I esteemed the latter to be more correct. Please refer to the git commit history/messages for some further insights.

I understand that it is possible for downstream consumers to work around /bin/bash (symlink was proposed in one of the linked issues above). However, I believe that my proposal/approach (=assume /bin/bash to be available everywhere except on the local host where nextflow is invoked) here is not too intrusive.

I'm happy to change the the code as requested. I have not added any comments since I expect (at least) change requests. Please review this with an unbiased mind set. Multiple people have expressed their intention to use nextflow on systems without /bin/bash.

I have tested solely on NixOS.


Further context:

Here is the corresponding nixpkgs issue that motivates this PR.

Here is the current package definition for Nextflow in NixOS which causes trouble (see nixpkgs issue above).

Here is a new package definition assuming Nextflow would be compatible with NixOS out of the box.

On systems without /bin/bash (e.g. NixOS), invoking tasks may fail because
of some hard-coded references to /bin/bash. Those references are either fine
(because the task runs in a container where /bin/bash *is* available),
don't matter (because the shebang is ignored) or cause failure (because
/bin/bash cannot be invoked). This commit adapts Nextflow for the last
scenario. Instead of invoking `/bin/bash`, the actually available bash
executable is called.

Signed-off-by: Rolf Schröder <[email protected]>
Copy link

netlify bot commented Oct 25, 2024

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit 1d9839e
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/671baf7ec2f4320008c771bd

This is more portable. This commit also makes sure that `-ue[x]` is not
used during trace.

Signed-off-by: Rolf Schröder <[email protected]>
@rollf
Copy link
Author

rollf commented Oct 25, 2024

On another thought, iff /usr/bin/env -S bash is more acceptable than I think, one could be more intrusive, i.e. use it "everywhere" (where BASH is used) but make sure it is not used within container contexts. This seems to work for me and here would be another commit based on the changes in this PR (i.e. partially revert/invert them).

Either way, I feel it's always a bit ugly to use /usr/bin/env -S bash on the host and /bin/bash in the container because the way Nextflow is designed currently assumes the same interpreter for the host and the container so one has to "manually" make sure /bin/bash is used in the container if /usr/bin/env -S bash got used by the host (but not for tracing where the initial invocation happens on the host, even for container contexts!). I'm not saying the design is wrong here, I just don't like the invocation of .replace() (see changes).

@rollf
Copy link
Author

rollf commented Oct 25, 2024

(This comment was made before 1d9839e became HEAD.)

Ah, I realize that I have used (for testing) quay.io/nextflow/bash as a docker image which actually has /usr/bin/env so the test was flawed. Playing around with this image yields problems when tracing (because /usr/bin/env is not available). Good that I kept this PR as a draft then ^^. I'll update when I fixed the issue.

@@ -65,6 +65,8 @@ class BashWrapperBuilder {

static final public List<String> BASH
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to rename BASH to BIN_BASH to make the difference clear.

@@ -571,7 +574,8 @@ class BashWrapperBuilder {
final traceWrapper = isTraceRequired()
if( traceWrapper ) {
// executes the stub which in turn executes the target command
launcher = "/bin/bash ${fileStr(wrapperFile)} nxf_trace"
String traceInterpreter = runWithContainer ? "/bin/bash" : "/usr/bin/env bash"
launcher = "${traceInterpreter} ${fileStr(wrapperFile)} nxf_trace"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No -S here becuase bash does not get any further options (think -ue)

@@ -131,7 +131,7 @@ class LocalTaskHandler extends TaskHandler implements FusionAwareTask {
}

protected ProcessBuilder localProcessBuilder() {
final cmd = new ArrayList<String>(BashWrapperBuilder.BASH) << wrapperFile.getName()
final cmd = new ArrayList<String>(BashWrapperBuilder.ENV_BASH) << wrapperFile.getName()
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sure that .command.run gets executed correctly on a host without /bin/bash.

@@ -133,7 +133,7 @@ class TaskBean implements Serializable, Cloneable {
this.condaEnv = task.getCondaEnv()
this.spackEnv = task.getSpackEnv()
this.moduleNames = task.config.getModule()
this.shell = task.config.getShell() ?: BashWrapperBuilder.BASH
this.shell = task.config.getShell(task.isContainerEnabled()) ?: BashWrapperBuilder.BASH
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, task will always (?) have a shell since it comes with the process defaults. I have not toched the "else"-block here since I think it is actually not necessary (and obviously did not cause issues for me).

@@ -375,10 +375,10 @@ class TaskConfig extends LazyMap implements Cloneable {
return (List<String>) get('secret')
}

List<String> getShell() {
List<String> getShell(boolean isContainerEnabled = true) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to adapt the unit tests but I don't know how 🤷 .

@rollf
Copy link
Author

rollf commented Oct 28, 2024

(This comment was made when 1d9839e was HEAD.)

After having created this PR I realized that I could do better (see commit history) and have also invested more time to understand the code base. Now, I (hopefully) have a version that fits well and is satisfying for everybody. In the end, my implementation is more intrusive that I initially thought but it feels right now. The final approach now is this: Right from the process/task (object) creation, consider whether the process/task will run inside a container. If not, use /usr/bin/env -S bash, otherwise keep /bin/bash. This, plus 2 minor tweaks for the execution of .command.run and for tracing allows to run nextflow on systems without /bin/bash.

I keep the original comments (above) /git history for documentation. I hope it is not too confusing.

Again, I'm happy to change the code as requested, manipulate the git history, etc. I have also added some review comments which include suggestions for improvements.

I have improved the nextflow-tests for NixOS. You may find some more details there, too. I made sure to test with a docker image that only has /bin/bash available. I did not find any quay.io image (I have roughly ~20 locally) that does not provide /usr/bin/env. I am not advocating for using /usr/bin/env inside the container. I mentioned this here so that it is clear why I specifically build a new image for the NixOS tests.

IMAGES=$(docker image ls | grep quay.io | awk -v OFS='' '{ print $1,":",$2 }')
for IMAGE in $IMAGES ; do echo $IMAGE ; docker run $IMAGE /usr/bin/env bash; done

@rollf rollf marked this pull request as ready for review October 28, 2024 13:21
@rollf
Copy link
Author

rollf commented Oct 28, 2024

@pditommaso you might be the right person to look into this.

Please have a look into this with an unprejudiced mind. I spent a lot of time getting this right. This merged PR proves that using /usr/bin/env bash on the host is acceptable.

I'm sorry that this PR is a little bit of a mess. If you find that overwhelming, I'm happy to clean up everything (i.e. delete all comments, squash git history). I kept it so that you may follow my thought process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant