adding datastore size to iohub info #248

edyoshikun · 2024-09-26T00:52:47Z

This addresses issue #247 by adding the store size and array size in GB. This is useful and simple metadata.

I wanted to know how much memory to request for caching datasets.

ziw-liu · 2024-09-26T02:37:48Z

Is this meant to represent the size on disk (compressed) or size in RAM (decompressed)?

edyoshikun · 2024-09-26T18:33:54Z

I find it more useful when it's decompressed rather than compressed. We can report both if needed. I think zarr.Array has nbytes_stored for the compressed size. What do you guys think?
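To make the compressed vs. decompressed distinction concrete, here is a minimal standard-library sketch. It uses zlib as a stand-in compressor, which is an assumption for illustration only and not what iohub or Zarr use; the variable names are hypothetical.

```python
import zlib

# 4 MB of zeros stands in for an array that has been loaded into RAM.
raw = bytes(4_000_000)

# zlib is only an illustrative compressor here, not the codec used by Zarr.
stored = zlib.compress(raw)

size_in_ram = len(raw)       # decompressed size, what a cache would need to hold
size_on_disk = len(stored)   # compressed size, what the store occupies on disk
ratio = size_in_ram / size_on_disk

print(f"in RAM: {size_in_ram} bytes, stored: {size_on_disk} bytes, ratio: {ratio:.1f}")
```

For sizing a cache, the first number is the one that matters; the second depends heavily on the data and the compressor.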

talonchandler

I think the uncompressed size is the most valuable.

The reported size is the expected size, not the true size (e.g. the array hasn't been filled yet, or there was an error). Naming is tricky; maybe "Expected uncompressed size (GB)", "Est. size in RAM (GB)", or "Est. size (GB)"?
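A rough sketch of why the number is an expected size: it can be computed from metadata alone (shape and element size), without reading any chunks. The function names below are hypothetical, and the store-level estimate assumes every position holds one array with the same shape and dtype.

```python
from math import prod


def expected_uncompressed_bytes(shape, itemsize):
    """Size implied by metadata alone: number of elements times bytes per element."""
    return prod(shape) * itemsize


def expected_store_gb(n_positions, shape, itemsize):
    """Estimate for a whole store, assuming all positions share shape and dtype."""
    return n_positions * expected_uncompressed_bytes(shape, itemsize) / 1e9


# Hypothetical example: 8 positions of float32 (4-byte) data shaped (T, C, Z, Y, X).
shape = (1, 2, 10, 2048, 2048)
print(expected_uncompressed_bytes(shape, 4))
print(expected_store_gb(8, shape, 4))
```

Because nothing is read from the chunks, the result is the same whether the array is fully written, partially written, or empty.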

iohub/reader.py

edyoshikun · 2024-09-28T01:29:59Z

ended up adding uncompressed size [GB]

ziw-liu · 2024-09-29T18:11:52Z

iohub/reader.py

@@ -262,11 +262,21 @@ def print_info(path: StrOrBytesPath, verbose=False):
                print("Zarr hierarchy:")
                reader.print_tree()
                positions = list(reader.positions())
+                total_GB_uncompressed = (
+                    len(positions) * (positions[0][1][0].nbytes) / 1e9


This would be confusing if it showed 0.00 GB for <10 MB. Maybe try to mimic the behavior of du -h?

e.g. https://web.archive.org/web/20111010015624/http://blogmag.net/blog/read/38/Print_human_readable_file_size

Zarr also does this: https://zarr.readthedocs.io/en/v2.18.3/_autoapi/zarr.core.Array.html#zarr.core.Array.info

    import zarr
    z = zarr.zeros(1000000, chunks=100000, dtype='i4')
    z.info

    Type               : zarr.core.Array
    Data type          : int32
    Shape              : (1000000,)
    Chunk shape        : (100000,)
    Order              : C
    Read-only          : False
    Compressor         : Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0)
    Store type         : zarr.storage.KVStore
    No. bytes          : 4000000 (3.8M)
    No. bytes stored   : 320
    Storage ratio      : 12500.0
    Chunks initialized : 0/10
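One possible shape for the human-readable formatting suggested above, as a sketch rather than the code that was merged. The helper name is hypothetical; it uses 1024-based units in the spirit of du -h, and it is not Zarr's own implementation.

```python
def human_readable_size(nbytes: int) -> str:
    """Format a byte count with binary (1024-based) unit suffixes."""
    if nbytes < 1024:
        return f"{nbytes}B"
    size = float(nbytes)
    for unit in ("K", "M", "G", "T", "P"):
        size /= 1024
        if size < 1024:
            return f"{size:.1f}{unit}"
    # Anything larger than the listed units is reported in exbibytes.
    return f"{size / 1024:.1f}E"


nbytes = 4_000_000  # same array size as in the z.info output above
print(f"{nbytes / 1e9:.2f} GB")     # fixed GB formatting hides small arrays
print(human_readable_size(nbytes))  # scaled unit keeps the size visible
```

With fixed GB formatting the 4,000,000-byte array prints as 0.00 GB, while the scaled form prints 3.8M, matching the "No. bytes" line in the Zarr output.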

adding datastore size to info

f374116

edyoshikun requested review from ziw-liu, talonchandler and ieivanov September 26, 2024 00:52

ziw-liu added enhancement New feature or request NGFF OME-NGFF (OME-Zarr format) labels Sep 26, 2024

talonchandler reviewed Sep 26, 2024

adding uncompressed string

e70d168

ziw-liu reviewed Sep 29, 2024
