Investigate alternatives to xarray to handle ProcessedVariable computations #3913
Comments
What is Pyodide being used for, if it is an issue? I have used pyarrow and pandas in a lot of web-based apps without issue. Both pandas and pyarrow are pretty common in data science, so I know these get used in web/notebook applications on a regular basis.
It's not being used by us currently, but as a part of my work assignment I am extending support for it across a lot of PyData projects and across the Scientific Python ecosystem (please see Quansight-Labs/czi-scientific-python-mgmt#18 and Quansight-Labs/czi-scientific-python-mgmt#19). PyBaMM isn't quite there yet, because we have CasADi as a dependency—it is tricky to compile it to WASM—if it becomes optional, we could move things forward on that (see #3826). The best and most stable example of where you can see Pyodide currently is on any of the usage examples in the
There's no issue as such if you do so locally for any data science workflows because the
Yeah, if we are going to drop xarray then using scipy or numpy native features would be good. However, it looks like we use pandas directly in a bunch of files, so it is not just due to xarray. I think if you want to make pandas optional, then you would need to remove pandas from a bunch of places (notebooks, tests, etc.) and not just remove xarray. Pandas can be useful for analysis and plotting, so we should probably think about whether it is useful on the whole to include it, and check whether this is actually a concern for our users. Realistically, optional dependencies just make things more complicated. Unless we have fully optional modules, we should try to just remove problematic libraries altogether.
We did have pandas as an optional dependency before #3892, didn't we? I imagine it should not be a lot of work to make it fully optional again with the A lot of the plotting features (for example
Recently, #3892 highlighted that pandas was being installed as an implicit required dependency for PyBaMM, because it is a required dependency of one of our required dependencies (xarray). pandas was otherwise listed as an optional dependency with the [pandas] extra and is currently used only for handling CSV files.

This dependence on xarray is particularly concerning, because if pandas decides to act upon PDEP-10 with v3 (making pyarrow a required dependency), it would drastically increase the download size for PyBaMM (pyarrow wheels across platforms are 120+ megabytes in size at a minimum).

Prior to the use of xarray (see #2366) as a backend for the ProcessedVariable and the ProcessedVariableComputed classes, the scipy.interpolate module was being used, which could be an option to return to.

There is time until pandas decides on this and also until we release v24.5, so we can take into account some of the developments around this area as they arise (as discussed in the technical roadmap meeting on 18/03/2024).
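
As a rough illustration of the scipy.interpolate option mentioned above, here is a minimal sketch of interpolating a variable stored on a (time, space) grid without xarray or pandas. It is not PyBaMM's actual ProcessedVariable implementation: the grid, the sample data, and the `evaluate` helper are all hypothetical and exist only to show the shape of the approach.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Hypothetical solution data: a variable sampled on a (time, space) grid.
# In practice these arrays would come from the solver output.
t = np.linspace(0.0, 1.0, 11)           # time points
x = np.linspace(0.0, 2.0, 21)           # spatial points
values = t[:, None] + 2.0 * x[None, :]  # shape (11, 21)

# Build a linear interpolant over the grid; only numpy and scipy are needed.
interp = RegularGridInterpolator((t, x), values, method="linear")


def evaluate(t_eval, x_eval):
    """Hypothetical helper: evaluate the variable at paired (t, x) points."""
    points = np.column_stack([np.atleast_1d(t_eval), np.atleast_1d(x_eval)])
    return interp(points)


result = evaluate([0.25, 0.5], [1.0, 0.33])
```

Because the sample data is linear in both t and x, linear interpolation reproduces it exactly here; real solution data would of course carry interpolation error between grid points.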