MzIdentML Converter Modifications #77

sureshhewabi · 2024-09-16T13:53:40Z

We need this library to provide command-line options for each piece of functionality, including:

1. Validation of crosslinking mzIdentML (mzID) files. #78
2. Command-line support #80
3. Data generation for PDBDev reports. #79

colin-combe · 2024-09-17T10:32:15Z

Hi @sureshhewabi, @aozalevsky, @ypriverol

We already have command-line support in https://github.com/PRIDE-Archive/xi-mzidentml-converter/blob/python3/parser/process_dataset.py using the standard Python library argparse.

Perhaps click is better; its documentation has a section about why it is not based on argparse.

Anyway, there are multiple solutions for making the command line interface. We can use click if it seems the best.
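
For comparison, here is a minimal sketch of an argparse interface with one subcommand per task. The program name, the subcommand names (`validate`, `report`) and the `--output` flag are hypothetical, chosen only for illustration; they are not the options currently defined in process_dataset.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per task (all names are hypothetical)."""
    parser = argparse.ArgumentParser(prog="mzid-converter")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Hypothetical subcommand: parse a file and report whether it is valid.
    validate = subparsers.add_parser(
        "validate", help="validate a crosslinking mzIdentML file")
    validate.add_argument("mzid_file", help="path to the mzIdentML file")

    # Hypothetical subcommand: emit data for PDBDev reports.
    report = subparsers.add_parser(
        "report", help="generate data for PDBDev reports")
    report.add_argument("mzid_file", help="path to the mzIdentML file")
    report.add_argument("-o", "--output", default="-",
                        help="output path, '-' for stdout")

    return parser
```

Calling `build_parser().parse_args(["validate", "example.mzid"])` yields a namespace whose `command` is `"validate"`. A click version would have the same shape, with decorated functions instead of subparsers.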

I think validation will consist mainly of running the parser and seeing if it works or not. But it will need to be modified so it doesn't try to write stuff anywhere. Also, we can improve its error messages so we know why it failed.
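
A rough sketch of that idea: wrap the parse step so that any exception becomes a readable message instead of a crash. The `parse` argument is a stand-in for whatever function runs the parser; it is hypothetical, not the converter's actual API, and it is assumed to raise on bad input and to write nothing anywhere.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ValidationResult:
    """Outcome of a validation run: a pass/fail flag plus readable messages."""
    valid: bool
    errors: List[str] = field(default_factory=list)


def validate(path: str, parse: Callable[[str], object]) -> ValidationResult:
    """Run `parse` on `path` and report the failure reason instead of raising.

    `parse` is a placeholder for the real parsing entry point.
    """
    try:
        parse(path)
    except Exception as exc:  # deliberately broad: any failure means "not valid"
        return ValidationResult(False, [f"{type(exc).__name__}: {exc}"])
    return ValidationResult(True)
```

Returning a result object rather than raising means the same function could serve both a command-line wrapper and a caller that imports it as a library.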

Let's think more about 'Data Generation for PDBDev reports':

What information do you want to get back, and in what format?
How do you want to call it? Is this a case of using it like a library, i.e. as a dependency of the code that generates the reports, rather than calling it on the command line? (Is the code that generates the reports in Python?)

cheers,
C

colin-combe · 2024-09-17T10:44:53Z

Also, IMP may have a need to extract crosslinking data from mzIdentML files? This might be related?

FYI, our converter is based on the pyteomics library. It adds a way of getting crosslink info from what's returned by pyteomics; it's not a 'from scratch' implementation of mzIdentML parsing.

aozalevsky · 2024-09-18T14:17:32Z

@colin-combe Ideally, I'd like to get an output similar to the current API output. Basically, we need sequences (some ID + sequence) + residue pairs. Keeping the JSON-formatted output would be nice, too.

Calling (import + call) as a library would be ideal, but making a subprocess CLI call is also acceptable.
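
To make the request concrete, here is a sketch of the kind of output described: sequences keyed by an ID, plus crosslinked residue pairs, serialised as JSON. The `to_report_json` helper and every key name are made up for illustration and do not reflect the format of the current API output.

```python
import json
from typing import Dict, List, Tuple

# (protein id, residue position) for one end of a crosslink
ResidueRef = Tuple[str, int]


def to_report_json(sequences: Dict[str, str],
                   pairs: List[Tuple[ResidueRef, ResidueRef]]) -> str:
    """Serialise sequences and crosslinked residue pairs to a JSON string.

    All key names below are hypothetical.
    """
    payload = {
        "sequences": [
            {"id": seq_id, "sequence": seq} for seq_id, seq in sequences.items()
        ],
        "residue_pairs": [
            {"prot1": a[0], "pos1": a[1], "prot2": b[0], "pos2": b[1]}
            for a, b in pairs
        ],
    }
    return json.dumps(payload, indent=2)
```

Used as a library, the caller would import the function and consume the result directly; a subprocess call would read the same JSON from stdout.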

colin-combe · 2024-09-19T06:55:25Z

> Calling (import + call) as a library would be ideal

Yes, I think that's better. And you were totally right with what you said in the meeting about there being several benefits to it being like this (not just a way of addressing the private submission questions). It was never deliberately not a library.

Anyway, I'll take a look at this next week,
cheers,
C

ypriverol · 2024-09-19T07:57:22Z

Validation is the priority, followed by the data structure and the JSON for PDBDev reports. We have to test the validation on the command line and create some documentation for users who want to start testing their dataset files. @sureshhewabi, it would probably be good to have a separate issue for this and link it to this one.

sureshhewabi · 2024-09-19T08:06:15Z

Thanks, everyone. As we discussed in the meeting yesterday, let's create separate issues for the separate tasks and then divide the tasks among us. We can also keep this as the main issue that links to the other tasks so we can track progress.

aozalevsky · 2024-09-24T17:11:12Z

> Also, IMP may have a need to extract crosslinking data from mzIdentML files? This might be related?

I had a chat with Ben, the main IMP developer in our lab. He agreed it would be a neat addition to the current functionality (dealing with csv/xls lists).

colin-combe · 2024-10-04T11:40:28Z

I updated #79 and #78 to reflect the status of the version in PR #84.

Any comments on how to better organise/structure the main process_dataset.py file are welcome (or just general Python style stuff).

sureshhewabi added the CrossLinkingValidationLib (Changes related with Crosslinking validations) label Sep 16, 2024

sureshhewabi changed the title ~~Command-line support for the feature~~ MzIdentML Converter Modifications Sep 19, 2024
