Make expression files match submission format #1059

Open
ValWood opened this issue Jan 31, 2023 · 7 comments

@ValWood
Member

ValWood commented Jan 31, 2023

We have some expression files

https://www.pombase.org/data/releases/latest/misc/gene_expression_table.tsv

but the export format seems to include an internal ID (PBO), which should instead be exported as the correct relation
to match the format described here
https://www.pombase.org/documentation/quantitative-gene-expression-data-bulk-upload-format
and here
https://www.pombase.org/documentation/qualitative-gene-expression-data-bulk-upload-format

(Note: I think only the quantitative data are currently exported.)

@afg1

@kimrutherford
Member

The PBO terms in the file are just "RNA level" and "protein level" so we can probably just remove the "term_id" column.

@ValWood
Member Author

ValWood commented Nov 30, 2023

I wonder why we added it, since we have the "type" in column 3?

@kimrutherford
Member

> I wonder why we added it, since we have the "type" in column 3?

Are we looking at the same file?
I was looking at https://www.pombase.org/data/releases/latest/misc/gene_expression_table.tsv which doesn't have a "type" column.

@ValWood
Member Author

ValWood commented Nov 30, 2023

Sorry, I was confusing it with the input file. I agree, we don't need the ID column since the type is explicit in "term name". Since this column would no longer be a "mini ontology" term (only required for Chado), it might be better to also relabel the column term_name to "type" so that it matches the column with the same data in the submission files?
https://www.pombase.org/documentation/qualitative-gene-expression-data-bulk-upload-format
Although in the input file we call it "protein", not "protein level".

Since it's expression data, we can assume that the data will be "x level". So maybe we only need to export
type=protein, not term_name="protein level".
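As a rough illustration of the proposal above (not the actual PomBase export code), the change could be sketched as a small TSV transform: drop "term_id", relabel "term_name" as "type", and strip the trailing " level". Only the `term_id` and `term_name` column names come from this thread; the function name and any other columns in the file are assumptions.

```python
import csv
import io


def convert_expression_table(tsv_text: str) -> str:
    """Sketch: drop 'term_id', rename 'term_name' to 'type', strip ' level'.

    Hypothetical helper; assumes a tab-separated file with a header row.
    Requires Python 3.9+ for str.removesuffix.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    in_fields = reader.fieldnames or []

    # Output header: original columns minus term_id, with term_name relabelled.
    out_fields = [
        "type" if name == "term_name" else name
        for name in in_fields
        if name != "term_id"
    ]

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=out_fields, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()

    for row in reader:
        row.pop("term_id", None)
        if "term_name" in row:
            # "protein level" -> "protein", "RNA level" -> "RNA"
            row["type"] = row.pop("term_name").removesuffix(" level")
        writer.writerow(row)

    return out.getvalue()
```

All other columns pass through unchanged, so the sketch does not depend on the exact layout of the exported file.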

Also, we call it "gene expression data"; that should be, more generically, "expression data".

It's probably easier to chat about this...

kimrutherford added a commit to pombase/pombase-chado-json that referenced this issue Dec 1, 2023
@kimrutherford
Member

I've removed the unhelpful "term_id" column from the output file.

@ValWood
Member Author

ValWood commented Feb 9, 2024

@kimrutherford
Member

Note to self: change code that draws the violin plots and change configuration for columns in advanced search and table download
