Add camera new tools #277

Open
wants to merge 45 commits into
base: master

Contributor

@DamienCode404 DamienCode404 commented Oct 11, 2024

Addition of New CAMERA Tools for the Metabolomics Suite

Description:

This PR introduces several new tools to the CAMERA tool suite, used for metabolomics analysis with LC-MS data. These tools complement existing functionalities and enhance peak detection and annotation.

New Tools Added:

  1. camera_groupFWHM:

    • Groups peaks within a defined retention time window using Full Width at Half Maximum (FWHM).
  2. camera_groupCorr:

    • Groups peaks based on retention time and intensity correlations across samples.
  3. camera_findIsotopes:

    • Detects isotope patterns in LC-MS peak lists based on mass differences and expected isotope ratios.
    • Provides isotope annotation for downstream metabolomics analysis.
  4. camera_findAdducts:

    • Identifies potential adducts in mass spectrometry data by detecting characteristic mass shifts between peaks.
    • Facilitates the identification of molecular ions and their adducts for more accurate metabolite annotation.

Why These Changes?

These new tools provide greater flexibility and modularity for analyzing LC-MS data by breaking down the all-in-one annotateDiffreport tool into four distinct tools. This separation allows users to run specific tasks such as isotope detection, adduct identification, peak grouping by FWHM, or correlation independently. By decoupling these functionalities, users can better customize their workflows based on their needs, making the analysis more efficient and tailored to specific research objectives.

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

Contributor

@bgruening bgruening left a comment

Could you reformat your tools to use 4-space indentation?

Please also consider using https://github.com/galaxyproject/galaxy-language-server; it can reformat all your tools automatically and will do the linting for you as well :)

@hechth
Contributor

hechth commented Oct 21, 2024

@DamienCode404 @bgruening @yguitton the way the tool currently reads input data is broken. Through the image argument, hard-coded paths from the R session get loaded into the Galaxy tool, and this doesn't work.

Which data are you using? The output from XCMS fillpeaks or CAMERA annotate? Maybe we can arrange a call to talk about these things and figure them out, then I can help with the implementation.

@hechth
Contributor

hechth commented Oct 21, 2024

Also, more generally, what is the purpose of these new tools? I can see that the existing CAMERA tool also seems to include all of these steps. Do you want to split them, or what's the plan here?

@DamienCode404
Contributor Author

> @DamienCode404 @bgruening @yguitton the way the tool currently reads input data is broken. Through the image argument, hard-coded paths from the R session get loaded into the Galaxy tool, and this doesn't work.
>
> Which data are you using? The output from XCMS fillpeaks or CAMERA annotate? Maybe we can arrange a call to talk about these things and figure them out, then I can help with the implementation.

Hi @hechth, I'm curious as to why this isn't working. I'm still a beginner with Galaxy tools.

For the groupFWHM CAMERA tool, we use XCMS fillpeaks files as input. For the rest of the tools, we only use CAMERA RData output.

@DamienCode404
Contributor Author

> Also, more generally, what is the purpose of these new tools? I can see that the existing CAMERA tool also seems to include all of these steps. Do you want to split them, or what's the plan here?

The main aim of this PR is, as you said, to split the annotateDiffreport tool into 4 sub-tools. This gives users more options when launching these tools and lets them skip certain steps if necessary. Also, this method seems to give more consistent results. We expect execution time to decrease as well.

@hechth
Contributor

hechth commented Oct 22, 2024

You should use RDS files, which store only a single object, if you are already using built-in R data types. Loading RData files is a serious security vulnerability: when you load an RData file, it might overwrite anything in the session. Someone could store an environment in which print downloads some malware and runs it. If you load an RDS file, you at least avoid corrupting other variables.
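
A minimal R sketch of the concern above, under stated assumptions: the object names (`peaks`, the replaced `print`) and the temporary files are purely illustrative and are not taken from the CAMERA or XCMS tools. It only uses base R (`save`, `load`, `saveRDS`, `readRDS`) to show that `load()` restores every stored name, while `readRDS()` returns a single object that the caller names.

```r
# Sketch: load() restores every name stored in an RData file,
# whereas readRDS() returns exactly one object.
# All object names here are hypothetical, chosen for the demo.

rdata_path <- tempfile(fileext = ".RData")
rds_path <- tempfile(fileext = ".rds")

# Build the files from a throwaway environment so that writing them
# does not pollute the current workspace.
producer <- new.env()
producer$peaks <- data.frame(mz = c(100.1, 101.1), rt = c(30.5, 30.6))
# A function named like a base function, standing in for a hostile payload.
producer$print <- function(x, ...) cat("print() was replaced\n")
save(list = c("peaks", "print"), file = rdata_path, envir = producer)
saveRDS(producer$peaks, rds_path)

# load() puts every stored name into the target environment and
# silently masks anything that already has the same name.
target <- new.env()
loaded_names <- load(rdata_path, envir = target)

# readRDS() hands back one object; nothing else in the session changes.
peaks <- readRDS(rds_path)
```

Loading into a dedicated environment, as done with `target` here, limits what gets masked. Note that `readRDS()` only avoids the name-clobbering problem; it does not by itself make a file from an untrusted source safe to deserialize.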

@jsaintvanne
Member

Hi Helge!

This choice was made for the whole XCMS workflow in W4M from the beginning, I think! Changing it would mean reworking the entire workflow, which currently relies on RData files containing multiple variables...

Maybe we can ask @lecorguille about it? Here, we just continued the XCMS workflow into CAMERA and didn't touch the variables saved in the RData files.

@hechth
Contributor

hechth commented Oct 22, 2024

@jsaintvanne Yeah I just saw - this is probably something that would make sense to address - maybe also with the update to XCMS 4?

@hechth
Contributor

hechth commented Oct 22, 2024

Currently planemo test fails with the following error message.

Error in retrieveRawfileInTheWorkingDir(singlefile, zipfile) : 
  Cannot access the sample: ko15.CDF located: /home/laberca/galaxy/database/objects/a/0/c/dataset_a0c47c86-be0f-4097-bd3e-68b6f5e9f04b.dat . Please, contact your administrator ... if you have one!
Execution halted
.

@jsaintvanne
Member

> @jsaintvanne Yeah I just saw - this is probably something that would make sense to address - maybe also with the update to XCMS 4?

Yeah, we were wondering how to move to XCMS 4; maybe that's a way!

> Currently planemo test fails with the following error message.
>
> Error in retrieveRawfileInTheWorkingDir(singlefile, zipfile) : 
>   Cannot access the sample: ko15.CDF located: /home/laberca/galaxy/database/objects/a/0/c/dataset_a0c47c86-be0f-4097-bd3e-68b6f5e9f04b.dat . Please, contact your administrator ... if you have one!
> Execution halted

I think this comes from the test data: the singlefile variable keeps the link between the CDF filenames and their Galaxy names, and the test file was generated locally, which is why this path is hard-coded. @DamienCode404 is working on it!

Maybe we should discuss RData and RDS files and their security, because we can't really see the problem here, sorry!
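
One way to check whether a test RData file carries hard-coded paths like the one in the error above is to load it into an isolated environment and look for strings that resemble absolute paths. The sketch below is only illustrative: the layout of `singlefile` (a named list mapping a sample file name to a path), the `sample_names` vector, and the path itself are assumptions made for the demo, not the actual contents of the W4M test data.

```r
# Sketch: inspect an RData file without touching the workspace and
# flag objects that contain absolute-looking paths.
# The demo file and its contents are hypothetical.

demo_path <- tempfile(fileext = ".RData")
producer <- new.env()
producer$singlefile <- list("ko15.CDF" = "/hypothetical/galaxy/dataset_1.dat")
producer$sample_names <- c("ko15")
save(list = ls(producer), file = demo_path, envir = producer)

# Load into a dedicated environment so nothing in the session is masked.
inspect_env <- new.env()
stored <- load(demo_path, envir = inspect_env)

# TRUE for each stored object holding at least one string starting with "/".
has_abs_path <- vapply(stored, function(name) {
  values <- unlist(get(name, envir = inspect_env), use.names = FALSE)
  is.character(values) && any(startsWith(values, "/"))
}, logical(1))

# Names of the objects that would need regenerating or remapping.
flagged <- names(has_abs_path)[has_abs_path]
```

With the demo data, only `singlefile` is flagged; run against a real test file, a non-empty result would suggest the file was produced on a local machine and needs regenerating.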

…d the correct number of sample columns in tsv output.
Failed to expand inclusions [{'source': 'camera_groupfwhm.xml'}, {'source': 'camera_groupfwhm.r'}]
WARNING: Failed to expand inclusions [{'source': 'camera_groupfwhm.xml'}, {'source': 'camera_groupfwhm.r'}]
Failed Tests
RData : Binary data detected, not displaying diff
Contributor

@bgruening bgruening left a comment

Can you please include the R script that you use via "required_files": https://docs.galaxyproject.org/en/latest/dev/schema.html#tool-required-files


<expand macro="requirements"/>

<command detect_errors="exit_code"><![CDATA[
Contributor

The indentation seems to be off here, which makes it hard to read.

I recommend using https://github.com/galaxyproject/galaxy-language-server; it has an auto-format feature.

Contributor Author

To me, the indentation looks correct, and the lint steps pass. I've already indented and formatted the code. Is it just the 4 extra spaces that bother you, @bgruening? All the other tools I've seen use this style of indentation between the <tool></tool> tags. Maybe I've misunderstood.

@DamienCode404
Contributor Author

Hello everyone,
We're making progress with the development of the CAMERA tool suite, but I'm stuck on a display error.
In the “help” section of each tool, we'd like to add a global workflow diagram. Example:

------------------------------------------
General schema of the metabolomic workflow
------------------------------------------

.. image:: groupFWHM.png

This code should display my image in Galaxy, but I get this result:

image

I think the problem may be that I'm developing in a local environment, or that my PNG files are too big (~55 kB).
Let me know if you have already solved this problem. Thanks!

4 participants