Reading .xlsx files #562

genisplaja · 2022-10-27T09:22:53Z

Hello! As you might have seen, PR #560 is blocked because of an optional dependency that is missing:

E               ImportError: Missing optional dependency 'openpyxl'.  Use pip or conda to install openpyxl.

That is because I am trying to load a dataset metadata .xlsx file using pandas.read_excel(). I tried other loading strategies, but none of them worked; this seems to be the standard way to load .xlsx files. Would openpyxl be a problematic dependency to have in mirdata, considering that a future dataset might also include .xlsx files (I know it is not common, but who knows...)? Otherwise, we could include openpyxl as an optional dependency for the dataloader in #560. What do you think?


magdalenafuentes · 2022-10-27T14:27:33Z

Hey @genisplaja, we were discussing the issue of optional dependencies with @harshpalan the other day. We thought that a potential solution could be to add a check in loaders that need those optional dependencies, so when you initialize the dataset it will warn you that the dependency is missing and you should install it for that dataset to work. Would that fix your issue?
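
A minimal sketch of what such a check could look like, assuming a standard-library lookup with `importlib.util.find_spec`. The helper `warn_if_missing` and the class `ExampleDataset` are hypothetical names used only for illustration; they are not part of mirdata.

```python
import importlib.util
import warnings


def warn_if_missing(dependency, dataset_name):
    """Return True if `dependency` is importable; otherwise warn and return False.

    Hypothetical helper for illustration, not an existing mirdata function.
    """
    # find_spec returns None when a top-level module cannot be found.
    if importlib.util.find_spec(dependency) is not None:
        return True
    warnings.warn(
        f"Dataset '{dataset_name}' needs the optional dependency "
        f"'{dependency}'. Install it with pip or conda for this dataset to work.",
        UserWarning,
    )
    return False


class ExampleDataset:
    """Hypothetical loader whose metadata is stored in an .xlsx file."""

    def __init__(self):
        # Check once at initialization, so the user is told about the
        # missing dependency before any metadata is actually read.
        self.can_read_xlsx = warn_if_missing("openpyxl", "example_dataset")
```

Warning instead of raising at initialization keeps the rest of the loader usable (for example, downloading or validating) even when the optional dependency is absent.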

genisplaja · 2022-10-27T17:06:07Z

Yes! We were just talking with @nkundiushuti, and maybe we could also switch from .xlsx to .csv, since we have access to the Zenodo entry. However, IMHO creating a new version just for that is a bit overkill.

I think what you propose would be nice. We could even use pipdeptree to check whether the user has the particular optional dependencies installed and, if not, throw the warning.
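
As a rough illustration of the "check what is installed" idea, here is a sketch that swaps pipdeptree for the standard library's `importlib.metadata`, which looks up installed distributions by name without adding another dependency. The function name `missing_distributions` is hypothetical.

```python
from importlib import metadata


def missing_distributions(names):
    """Return the distribution names from `names` that are not installed.

    Hypothetical helper; it queries installed package metadata only and
    does not import the packages themselves.
    """
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

A loader could call `missing_distributions(["openpyxl"])` and issue its warning whenever the returned list is not empty.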

genisplaja added the help wanted Extra attention is needed label Oct 27, 2022

genisplaja mentioned this issue Dec 5, 2022

Tests are failing when openpyxl is not installed. #568

Open

magdalenafuentes mentioned this issue Jan 27, 2023

[WIP] Changes for Gtzan dataset #569

Draft
