Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: MappingSetDataFrame: .to_csv() and read_csv() #517

Open
joeflack4 opened this issue Apr 7, 2024 · 5 comments
Open

Suggestion: MappingSetDataFrame: .to_csv() and read_csv() #517

joeflack4 opened this issue Apr 7, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@joeflack4
Copy link
Collaborator

joeflack4 commented Apr 7, 2024

Overview

I was initially suggesting that we could have an alternative constructor for MappingSetDataFrame that could accept paths rather than objects, however, based on some more recent discussions (1, 2), it might be better to do design very similar to pandas:

Reading

msdf: MappingSetDataFrame = read_csv()

Or read_tsv(), or just read().

It could accept various different combinations of params:
a. Path to a SSSOM TSV which includes a metadata comment.
b. Path to a SSSOM TSV with no metadata comment, and path to a metadata yaml.
c. Path to a SSSOM TSV with no metadata comment, and path to a metadata object.
d. a dataframe, and path to a metadata yaml.
e. a dataframe, and path to a metadata object.

Writing

msdf.to_csv()

Or to_tsv().

This would be an alternative or replacement to parse_sssom_table().

@matentzn
Copy link
Collaborator

matentzn commented Apr 7, 2024

I see the point, but its a bit of a lower priority. I would probably recommend a utility method

def load_mapping_set_from_df(str: mapping_set_path, str: metadata_path):
        ...

Rather than adding further IO logic into the MappingSetDataFrame class.

@joeflack4 joeflack4 added the enhancement New feature or request label Apr 7, 2024
@joeflack4
Copy link
Collaborator Author

That would be fine, but I'm not sure if I like it either. Another thing that a user needs to know to import.

The logic for MappingSetDataFrame if it were a normal class would be a few lines, but I just realized that I don't know how to handle this for data class setup(s). And I guess with LinkML, it has to be a data class.

@matentzn
Copy link
Collaborator

matentzn commented Apr 8, 2024

I would say we go with the pandas design, as it is familiar to other devs, i.e. pd.load_csv(..)

@joeflack4
Copy link
Collaborator Author

Yes, I like that solution!

@joeflack4 joeflack4 changed the title Suggestion: MappingSetDataFrame metadata: path alternative Suggestion: MappingSetDataFrame: .to_csv() and `read_csv() May 20, 2024
@joeflack4 joeflack4 changed the title Suggestion: MappingSetDataFrame: .to_csv() and `read_csv() Suggestion: MappingSetDataFrame: .to_csv() and read_csv() May 20, 2024
@joeflack4
Copy link
Collaborator Author

I just changed the title and OP of this issue from being about allowing MappingSetDataFrame to accept paths, to instead being about pandas-like functions/methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants