Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge or remove trajan_from_dictionary to from_dataframe #56

Open
gauteh opened this issue Mar 22, 2023 · 9 comments
Open

merge or remove trajan_from_dictionary to from_dataframe #56

gauteh opened this issue Mar 22, 2023 · 9 comments

Comments

@gauteh
Copy link
Member

gauteh commented Mar 22, 2023

I think most of the functionality in https://github.com/OpenDrift/trajan/blob/main/trajan/__init__.py#L56 could be done in a simpler way in https://github.com/OpenDrift/trajan/blob/main/trajan/__init__.py#L21. The latter is also missing some attributes to make it completely CF-compatible.

Maybe it is easier to only use from_dataframe, as far as I know only sfydata uses the from_dictionary method: https://github.com/gauteh/sfy/blob/main/sfy-processing/sfy/cli/collection.py#L185 and I think that this method could be re-written to use from_dataframe and it would be end up being shorter.

This would be a pretty generic method to go from arbitrary tabular data to a CF-compliant trajactory dataset.

@knutfrode
Copy link
Contributor

Yes, I agree.
I have also written locally a method for the very simplest case, when you have arrays of lon, lat, time for a single drifter. This presently creates a dict to forward to trajectory_dataset_from_dict, which is quite awkward.

@gauteh
Copy link
Member Author

gauteh commented Mar 22, 2023

That could probably be done with something like:

import trajan as ta
import pandas as pd

lon = [..]
lat = [...]
time = [...]
ds = pd.DataFrame({'lon': lon, 'lat': lat, 'time': time})
ds = ta.from_dataframe(ds)

@knutfrode
Copy link
Contributor

Oh yes, then there should be no use for another method.

Though some people are lazy enough to prefer
ds = ta.from_lonlat(lon, lat, time)
over
ds = ta.from_dataframe(pd.DataFrame({'lon': lon, 'lat': lat, 'time': time}))

@knutfrode
Copy link
Contributor

and it should also be possible to provide drifter name(s) and variable- and global attributes

@gauteh
Copy link
Member Author

gauteh commented Mar 22, 2023

name is already possible to provide. variable and global attributes are better to set using standard xarray methods:

ds.attrs = ...

Constructing a dict or passing in attrs in some other way is equally much code or complexity, and we would diverge from the way xarray datasets are used in other contexts.

@gauteh
Copy link
Member Author

gauteh commented Mar 22, 2023

I'm not totally opposed to from_lonlat, but I think we could start with adding some simple copy-pastable method documentation to from_dataframe on how to do that.

@knutfrode
Copy link
Contributor

Ok, so you suggest to first create the dataset, and then add attributes with separate commands?
Yes, that might be better.
I guess users can make their specific convenience methods/wrappers whenever they want.

@gauteh
Copy link
Member Author

gauteh commented Mar 22, 2023

Yes, I don't see how it will simplify things if we have a way to specify attrs anyway. The attributes need to be set up beforehand anyway, now the user just needs to do it after having created the dataset, so the user would have to do the same steps anyway: then it is better to do it directly through xarray and not special arguments.

Regarding from_lonlat I think the biggest problem is for users to discover which method to use for this. That's the reason I added read_csv, so that it would be easy to google - but it is in the same category as from_lonlat.

@gauteh
Copy link
Member Author

gauteh commented Mar 22, 2023

Here's an example/suggestion, also with adding attributes: https://opendrift.github.io/trajan/autoapi/trajan/index.html#constructing-a-dataset-from-arrays-of-postions-and-time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants