Make the data import process more explicit #373

Closed
dalonsoa opened this issue Oct 3, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@dalonsoa
Collaborator

dalonsoa commented Oct 3, 2024

At the moment, when importing data the user selects a Station and a Format. From the user's perspective, this only selects some file-related settings (like extension and delimiter) and how to deal with the date and time columns. Implicitly, however, the user is also picking all the variables related to the chosen format via Classification objects, but there's no way for them to know what these variables are or which columns they use except by walking through all of the Classifications.

It would make much more sense to relate the Classification and the Format via a ManyToMany field in the Format object, rather than a ForeignKey to Format in the Classification object. This way, when a user opens a format, they will see exactly what variables will be pulled from the data file and from where.
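To make the idea concrete, here is a minimal plain-Python sketch of the proposed relationship. It is not the project's Django code: the dataclasses only stand in for the models, and the field names (`variable`, `value_column`, `delimiter`) and the `describe` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Classification:
    """Maps one variable to the column it is read from (hypothetical fields)."""

    variable: str
    value_column: int


@dataclass
class Format:
    """File settings plus the classifications it uses (stand-in for a ManyToMany)."""

    name: str
    delimiter: str
    classifications: list[Classification] = field(default_factory=list)

    def describe(self) -> list[str]:
        """List what would be imported and from which column, ordered by column."""
        ordered = sorted(self.classifications, key=lambda c: c.value_column)
        return [f"{c.variable}: column {c.value_column}" for c in ordered]


# The same Classification can be shared by several formats.
rain = Classification("precipitation", 2)
temp = Classification("air_temperature", 3)
logger_csv = Format("logger_csv", ",", [temp, rain])
rain_only = Format("rain_only", ";", [rain])

print(logger_csv.describe())
```

With the relationship held on the Format side, opening a format is enough to list its variables and columns, with no need to search through every Classification.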

This requires some changes to the models, obviously, and to the code that parses the data file, but the main complication comes with the views and templates used to display the Format objects, which will need to be more complicated as they will need to show a list of all the Classifications related to a Format. Nothing that we have not done in the past - see for example the DeviceSpecification model in Liionsden - but nevertheless, not a straightforward process. I think it will be worth the effort from the point of view of the user experience.

@dalonsoa
Collaborator Author

dalonsoa commented Oct 4, 2024

@ICHydro, @tsmbland, I've opened this issue to discuss/tackle the data ingestion process, which is not really such a good user experience. Any thoughts are most welcome.

@ICHydro
Collaborator

ICHydro commented Oct 13, 2024

Yes, it took me a bit of time to get familiar with the role of the format object in the import process. From a user perspective, it would probably be most straightforward if the user can set options such as the delimiter symbol, and the meaning of the columns (variable, dimensions etc) directly during the import process, rather than creating a format object first, and then using this object during import.

The value of the latter approach is probably that it is quicker if the user frequently imports the same data format, and perhaps that is also the reason it was implemented that way in the original FONAG system.

It would seem ideal if we could combine both: for example, the interface lets a user set all the settings manually when importing a data file, but also lets them tick a box like "save these import settings for future use". If ticked, a format object would be created and stored for future use. I guess that this would require the same under-the-hood changes to the models that @dalonsoa suggests above?
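As a rough sketch of that combined flow, assuming import settings can be reduced to a delimiter plus a variable-to-column mapping: the `import_data` function, the `settings` dict layout and the `saved_formats` registry below are all hypothetical and only illustrate the "set manually, optionally save" idea, not the project's actual import code.

```python
import csv
import io


def import_data(text, settings, saved_formats, save_as=None):
    """Parse `text` using ad-hoc `settings`; optionally keep them for reuse.

    `settings` is a hypothetical dict:
    {"delimiter": str, "columns": {variable_name: column_index}}.
    """
    reader = csv.reader(io.StringIO(text), delimiter=settings["delimiter"])
    rows = [
        {var: row[idx] for var, idx in settings["columns"].items()}
        for row in reader
        if row  # skip blank lines
    ]
    if save_as is not None:
        # Equivalent of ticking "save these import settings for future use".
        saved_formats[save_as] = settings
    return rows


saved_formats = {}
settings = {"delimiter": ";", "columns": {"precipitation": 1, "air_temperature": 2}}

# First import: settings entered by hand and saved under a name.
first = import_data("2024-10-03;0.4;12.1\n", settings, saved_formats, save_as="logger_v1")
# Later import: the saved settings are reused by name.
second = import_data("2024-10-04;1.2;11.8\n", saved_formats["logger_v1"], saved_formats)
```

Either way, the settings passed in show the user exactly which variables are read and from which columns.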

In any case, I agree that it is useful for the user to see exactly what variables will be pulled from the data file and from where.

@dalonsoa
Collaborator Author

There are a lot of parameters to set when importing data, especially to declare the columns to import (see the Classification), so presenting all of these to the user in the import form - even if it is just once and then they can save it for future use - can be daunting. Most users, I think, will always be importing the same type of data files, so having to define the right format upfront makes sense.

Let me check how we did things in Liionsden, the other project where we had to face this problem. The result there was very neat for the end user (if not under the hood, I don't recall).

@dalonsoa
Collaborator Author

I think we can move to a more lightweight approach like the one described in #407. Not only is it much simpler to implement, but it is also enough for what it is meant to achieve - making it clear to the user what they are really importing.


2 participants