Timeout when using xarray to write data to zarr.storage.DBMStore multiple times which makes the whole zarr store unreadable #4832
Unanswered
QINQINKONG asked this question in Q&A
Replies: 0 comments
Hi. I'm trying to use Xarray to write data to a zarr.storage.DBMStore. Since the dataset is quite large, I compute and write the results to the same zarr.storage.DBMStore in multiple passes (with append_dim='time'). However, when the program failed because of a timeout (I request 4 hours from the queue each time), I seem to be unable to access the data that had already been written to the store in the earlier passes. How should I deal with issues like this?
More generally, what should we do if we encounter an error while using xarray to write to a zarr store multiple times along a dimension?
Below is my code for writing data to and reading data from the zarr store, along with the error message from a timed-out attempt:
Code for writing data to the zarr store:
After writing to this zarr store several times, the program fails because of a timeout (the cluster can become very slow for reasons I don't understand). I then found that I could no longer read the data that had already been written to the store.
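One way to check what survived an interrupted run, assuming the store's backing file is a standard Python dbm database (which is what `DBMStore` uses by default), is to open it with the stdlib `dbm` module and list its keys; `toy_store` and the keys below are hypothetical stand-ins for the real file and zarr metadata entries:

```python
# Stdlib-only sketch: inspect which keys survive in a dbm file.
import dbm

# Create a toy dbm file playing the role of the DBMStore's backing file.
with dbm.open("toy_store", "c") as db:
    db[".zgroup"] = b'{"zarr_format": 2}'
    db["tas/.zarray"] = b"{...}"

# Reopen read-only and list what is still there: if the group and array
# metadata keys survive, the previously written chunks may be recoverable.
with dbm.open("toy_store", "r") as db:
    keys = sorted(k.decode() for k in db.keys())
print(keys)  # → ['.zgroup', 'tas/.zarray']
```

If the metadata keys are missing but chunk keys remain, the zarr hierarchy itself is what got corrupted, not necessarily the data.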
Problem description
It seems that the previously written data would still be accessible if I used the default directory storage. However, my cluster storage quota limits the total number of files, and the default directory storage creates many small files, which easily exceeds that limit.
Appreciate any help.
Thanks!