Issue 272 #282

Merged
merged 6 commits into from
Oct 31, 2024

Conversation

nschcolnicov
Contributor

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/demultiplex branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

github-actions bot commented Oct 29, 2024

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 1df1a08

+| ✅ 203 tests passed       |+
#| ❔   3 tests were ignored |#
!| ❗   7 tests had warnings |!

❗ Test warnings:

  • nextflow_config - Config manifest.version should end in dev: 1.5.2
  • pipeline_todos - TODO string in README.md: Describe the minimum required steps to execute the pipeline, e.g. how to prepare samplesheets.
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 3.0.2
  • Run at 2024-10-31 09:06:15

@nschcolnicov
Contributor Author

Addresses #272

@nschcolnicov
Contributor Author

nschcolnicov commented Oct 29, 2024

@grst I checked the paths in the samplesheets after running bases2fastq, bcl2fastq, bclconvert, fqtk, sgdemux and mkfastq. I tested them with both a local and a remote outdir, and they now display the right paths.
One thing that I noted is that when a local path is specified, only the relative path is displayed in the samplesheet. For example, if the user sets "--outdir results/", the paths of the fastq files in the samplesheet will start with "./results/" instead of the absolute path.
Is this something we would like to address? It may be tricky to get it to display the absolute path, but if it's worth it, I can look into it.

Note: We discussed adding the samplesheets to the nf-tests, but due to the way they are implemented, I don't see how this could be done. When we generate the snapshots, we do it locally, so the fastq paths contain whichever directory the tests were run from. When the tests are run by GitHub, the paths are different, so the samplesheets can't match.
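
To illustrate the relative-versus-absolute point, here is a minimal plain-Groovy sketch. It is not the pipeline's code: the file name `sample1_R1.fastq.gz` and the way the path is joined are assumptions made for illustration. It shows that the relative form is stable across machines, while the absolute form depends on the directory the run was launched from, which is why a snapshot containing absolute paths would differ between a local run and CI.

```groovy
import java.nio.file.Paths

// Hypothetical values, for illustration only
def outdir = "results/"
def fastqName = "sample1_R1.fastq.gz"

// Relative form: identical no matter where the pipeline is launched
def relative = Paths.get(".", outdir, fastqName).toString()

// Absolute form: resolved against the current working directory,
// so it changes from one machine (or CI runner) to the next
def absolute = Paths.get(outdir).toAbsolutePath().normalize().resolve(fastqName).toString()

println "relative: ${relative}"
println "absolute: ${absolute}"
```

A user who needs absolute paths in the samplesheet could get the second form by passing an absolute path as --outdir.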

@grst
Member

grst commented Oct 30, 2024

One thing that I noted is that when a local path is specified, only the relative path is displayed in the samplesheet

I think that's fair. I could even think of some cases where you'd want to use relative paths. If someone requires an absolute path they shall specify an absolute path as --outdir.

We discussed adding the samplesheets to the nf-tests, but due to the way they are implemented, I don't see how this could be done

I was thinking of actually running the pipelines on the samplesheets that are generated. In that case the paths should match, but maybe it's too much overhead.

What we actually want is to test if the generated path exists. Would something along the lines of

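then {
    // Assumes the samplesheet channel emits the file path on its own;
    // adjust the index if it emits a [meta, path] tuple instead.
    def csvFile = path(process.out.samplesheet.get(0)).toFile()
    csvFile.eachLine { line, index ->
        if (index == 1) return  // skip the assumed header row (eachLine counts from 1)
        def columns = line.split(",")
        assert path(columns[1]).exists()
        assert path(columns[2]).exists()
    }
}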

work? This is untested, it's more like pseudo-code to convey my idea.

@nschcolnicov
Contributor Author

Ah yes, that would be a good way of testing them, I'll take care of it

@nschcolnicov
Contributor Author

nschcolnicov commented Oct 30, 2024

@grst I was able to add a function to the nf-tests that validates the downstream samplesheet. In doing so, I found that the samplesheet generator script did not add the right number of commas when some rows had empty values, which caused the samplesheet validator to fail. So I updated the samplesheet builder module as well.
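
A minimal sketch of the comma issue, in plain Groovy. This is not the actual samplesheet builder module: the column names and rows are hypothetical. The idea is to emit one field per header column and use an empty string for missing values, so every line has the same number of commas.

```groovy
// Hypothetical column set and rows, for illustration only
def header = ["id", "samplename", "library", "lane", "fastq_1", "fastq_2"]
def rows = [
    [id: "s1", samplename: "s1", library: "lib1", lane: "1",
     fastq_1: "./results/s1_R1.fastq.gz", fastq_2: "./results/s1_R2.fastq.gz"],
    // No library and no fastq_2 for this row
    [id: "s2", samplename: "s2", lane: "1", fastq_1: "./results/s2_R1.fastq.gz"],
]

// One field per header column; missing values become empty strings,
// so the number of commas is the same on every line
def toCsvLine = { Map row -> header.collect { key -> row[key] ?: "" }.join(",") }

def lines = [header.join(",")] + rows.collect(toCsvLine)
lines.each { println it }
```

When reading such lines back, note that `line.split(",")` drops trailing empty fields, whereas `line.split(",", -1)` keeps them.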

@nschcolnicov nschcolnicov marked this pull request as ready for review October 30, 2024 17:03
@nschcolnicov nschcolnicov requested a review from a team as a code owner October 30, 2024 17:03
Member

@grst grst left a comment

Very neat implementation! Just two minor things...

Comment on lines +75 to +77
// Clone the first item in meta for output
meta_clone = meta.first().clone()
meta_clone.remove('publish_dir') // Removing the publish_dir just in case, although output channel is not used by other process
Member

This doesn't seem to be used anywhere?

Contributor Author

This process is the last one that uses the samplesheet. We output the samplesheet and the meta regardless, in case we use them for anything in the future. I added the publish_dir key to the meta map for easier handling of the paths, so I'm removing it to restore the meta to its original state.
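
For readers following along, a small plain-Groovy sketch of why the clone matters. The meta contents here are hypothetical; only the `publish_dir` key comes from the snippet under review. Removing the key from a copy leaves the original map untouched.

```groovy
// Hypothetical meta list; only the 'publish_dir' key comes from the discussion above
def meta = [[id: "run1", lane: 1, publish_dir: "results/run1"]]

// clone() makes a shallow copy of the map, so removing a key from the copy
// leaves the original entry as it was
def metaClone = meta.first().clone()
metaClone.remove('publish_dir')

println "clone:    ${metaClone}"
println "original: ${meta.first()}"
```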

workflows/demultiplex.nf (outdated, resolved)
@nschcolnicov nschcolnicov merged commit 74b1869 into dev Oct 31, 2024
16 checks passed