pcat.search(...).to_dataset_dict() sometimes slower than it should #253
Comments
For very big catalogs, I could thus see a substantial difference in speed compared to simply opening the files. That being said, we could see if there are speedups to be accomplished.
@coxipi Is your catalog supposed to have aggregation, or is it indeed just a list of independent datasets? The aggregation can often be sped up with passing there to
assuming all the elements to be aggregated are well behaved (no overlap between files, all variables of the same name have the same dimensions and the exact same coordinates on the non-appended dims, etc).
Not sure what you mean by "independent datasets". Each key in the dataset dict represents a different simulation (each with its own single path to a zarr) as created in previous steps of the xscen workflow.
I meant that they are not meant to be unified into a single dataset in the same way a In that case, I'm not sure why |
@RondeauG, in Note: if the |
Setup Information
Context
I store my files on "jarre", which is considered a slow disk AFAIK.
Sometimes, pcat.search(...).to_dataset_dict() will take forever to access my files (bad behaviour), while this homemade function has a speed similar to the expected good behaviour of pcat.search(...).to_dataset_dict().
I can't tell what conditions on the server could be related to this problem. The problem sometimes appears, stays for a bit, and then stops.
Is this issue known?
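Since the slowdown is intermittent, one way to narrow it down might be to time both loading paths repeatedly and keep every measurement, not just an average. The sketch below is only an illustration using the standard library: `time_call` and `compare_loaders` are hypothetical helper names, not part of xscen or intake-esm, and the loaders are passed in as plain callables.

```python
import time
from typing import Callable, Dict, List


def time_call(func: Callable[[], object], repeats: int = 3) -> List[float]:
    """Run func `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def compare_loaders(
    loaders: Dict[str, Callable[[], object]], repeats: int = 3
) -> Dict[str, List[float]]:
    """Time each named loader and return all of its durations.

    Keeping every duration (hypothetical design choice) makes an
    occasional slow run visible instead of hiding it in a mean.
    """
    return {name: time_call(func, repeats) for name, func in loaders.items()}
```

One loader could wrap the `pcat.search(...).to_dataset_dict()` call and the other the homemade function, each inside a zero-argument `lambda`. Repeated runs may benefit from disk or OS caching, so the first duration of each list is probably the most representative of a cold read.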