Question about per-frame metadata #30
Hi Talley
The metadata in addImage() and appendImage() is optional. It might be nice
to have an option to add tags that contain real-time, unpredictable data.
Where and how this data might be stored is implementation-dependent.
Acquire-zarr can't overwrite or append custom metadata, so, at least for
now, adding metadata would generate a run-time error. BigTiff writer, on
the other hand, is fine with inserting arbitrary metadata into multi-image
TIF blocks. There is a processing penalty for accessing this metadata
without getting image pixels, but that's the tradeoff for having a simple
implementation.
One could imagine that some future version of the zarr writer might be able to
add per-image metadata to the summary metadata specified at the start. For
example, it might cache the image metadata in a temporary file and append it to
the custom/summary metadata at the end of the acquisition. Or something like
that.
That was the general idea.
Nenad
…On Sat, Dec 14, 2024 at 11:40 AM Talley Lambert ***@***.***> wrote:
I've been digging a bit deeper into the codebase, as I work through
@go2scope <https://github.com/go2scope>'s storage-device proposal for
micro-manager. Very excited about the potential.
I have a question about how we should be handling "per frame" metadata. I
see ZarrStreamSettings_s.custom_metadata, and it works fine to add
one-time additional metadata at the creation of the StreamSettings (which I
presume is almost always going to be at the beginning of an acquisition).
But I'm curious where/how any additional metadata accumulated during the
sequence can be added (this will inevitably be needed for data that can't be
known a priori, like time-stamps, etc...)
Looks like the primary mechanism for writing additional data is
ZarrStream_append, which doesn't take external metadata, and I also don't
see a mechanism for "rewriting" external metadata after stream creation.
Any thoughts on how that might look?
|
yeah i understand that it's currently optional as far as the MMCore spec is concerned ... but I need it 😂 and when I went to look inside the acquire-zarr source to see how it might be implemented, i didn't find anything. (i saw that big tiff works, that's fine, but it's acquire-zarr that I'm excited about). In any case, this is less of a question about the MMCore implementation, and more a question of how acquire-zarr itself could support adding anything beyond deterministic pre-acquisition metadata. If the general answer is "we don't support that" then sure, an MMCore storage device would need to devise its own workaround like storing metadata and then overwriting what ZarrStream.custom_meta originally wrote |
I am sure Nathan and Alan will be able to provide a solution. For example,
at the very minimum, they should make it possible to re-write or append to
custom metadata created at the beginning. Since this metadata is optional
and its name is "custom," I don't see why it must be determined initially
and stay immutable afterward. If they allow modifications, the mm adapter
can cache image metadata and append it to the custom metadata at the end,
or at any appropriate time. Since acquire-zarr requires JSON encoding,
adding more fields would not be a problem unless the Zarr standard
explicitly prescribes that we should not do that.
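
The "cache image metadata and append it at the end" idea can be sketched roughly as below. This is only an illustration under stated assumptions: `FrameMetaCache`, `cache_init`, `cache_add`, and `cache_finish` are hypothetical names, the `frames`/`frame`/`timestamp_ns` JSON layout is invented for the example, and none of it is part of acquire-zarr or MMCore. The adapter would collect one small JSON entry per image and build a single document once the acquisition ends.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory cache of per-frame metadata entries.
 * `json` holds comma-separated JSON objects (no enclosing brackets). */
typedef struct {
    char* json;
    size_t len;   /* bytes used, excluding the terminating NUL */
    size_t cap;   /* bytes allocated */
    size_t count; /* number of entries stored */
} FrameMetaCache;

static int cache_init(FrameMetaCache* c)
{
    assert(c != NULL);
    c->cap = 256;
    c->json = malloc(c->cap);
    if (c->json == NULL) return -1;
    c->json[0] = '\0';
    c->len = 0;
    c->count = 0;
    return 0;
}

/* Append one entry; the field names are assumptions for illustration. */
static int cache_add(FrameMetaCache* c, uint64_t frame, uint64_t timestamp_ns)
{
    char entry[96];
    int n = snprintf(entry, sizeof entry,
                     "%s{\"frame\":%" PRIu64 ",\"timestamp_ns\":%" PRIu64 "}",
                     c->count ? "," : "", frame, timestamp_ns);
    if (n < 0 || (size_t)n >= sizeof entry) return -1;

    size_t needed = c->len + (size_t)n + 1;
    if (needed > c->cap) {
        size_t new_cap = c->cap;
        while (new_cap < needed) new_cap *= 2; /* grow geometrically */
        char* p = realloc(c->json, new_cap);
        if (p == NULL) return -1;
        c->json = p;
        c->cap = new_cap;
    }
    memcpy(c->json + c->len, entry, (size_t)n + 1); /* copies the NUL too */
    c->len += (size_t)n;
    c->count++;
    return 0;
}

/* Build the final JSON document; the caller frees the returned string. */
static char* cache_finish(const FrameMetaCache* c)
{
    size_t total = strlen("{\"frames\":[") + c->len + strlen("]}") + 1;
    char* out = malloc(total);
    if (out == NULL) return NULL;
    snprintf(out, total, "{\"frames\":[%s]}", c->json);
    return out;
}
```

The string returned by `cache_finish` is what would be handed to the library if custom metadata became writable after stream creation. Memory grows linearly with the number of frames, which is the size concern raised later in the thread.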
A more difficult problem is the max-efficiency streaming setup, where
we attach storage directly to the circular buffer, and the camera controls
the acquisition. In that case, the application doesn't know when a
particular image is inserted, and metadata can't be added. This might be a
non-issue because, in the asynchronous scenario, you need to know the exact
conditions when a particular image is added.
The current API allows for a compromise solution. The camera controls when
images go into the buffer, but the application controls when they are
written to disk (saveNextImage()) and has the opportunity to insert the
metadata.
Nenad
|
yep, that's why I opened this to hear how they are thinking about it :) to see which solutions they expect they will prefer, and which they see as out of scope, etc...
that's MMCore specific considerations, right? again, I really wasn't trying to get into mmcore stuff here. I mentioned it in my post to provide context, but it's really a more abstract question about how acquire-zarr intends (or doesn't intend) to handle metadata that isn't defined at stream-definition time. more generically, outside of MMCore, etc... |
Nenad @go2scope and I spoke about this yesterday, and a first step could be allowing the user to read in or overwrite the custom metadata in full at any point during streaming. Depending on what you're saving, per-frame metadata can get quite large for a long acquisition, so a JSON encoding may not be the best way to go.
Another approach might be to modify the
    struct FrameMetadata {
        const char* array_name;
        ZarrDataType type;
        void* data;
        size_t nbytes_data;
    };
and save per-frame metadata to 1D arrays within the group. So your Zarr group structure might look like
with metadata to match. Both of these solutions together would work pretty well, I think. @tlambert03 |
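
To make the struct-based idea a bit more concrete, here is a minimal sketch of how a caller might describe one frame's timestamp using the proposed `FrameMetadata` struct. The struct fields are taken from the comment above; everything else is an assumption for illustration: the `ZarrDataType` enum below is a stand-in (the real enum in acquire-zarr may differ), and `make_timestamp_metadata` and the array name `"timestamps"` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the library's data-type enum (assumption, not the real one). */
typedef enum {
    ZarrDataType_uint64,
    ZarrDataType_float64
} ZarrDataType;

/* Struct as proposed in the comment above. */
struct FrameMetadata {
    const char* array_name; /* name of the 1D array within the group */
    ZarrDataType type;      /* element type of the values in `data` */
    void* data;             /* pointer to the value(s) for this frame */
    size_t nbytes_data;     /* size of `data` in bytes */
};

/* Describe one frame's timestamp as an entry for a 1D "timestamps" array.
 * The caller owns `timestamp_ns` and must keep it alive for as long as the
 * returned struct is in use. */
static struct FrameMetadata
make_timestamp_metadata(uint64_t* timestamp_ns)
{
    assert(timestamp_ns != NULL);
    struct FrameMetadata meta = {
        .array_name = "timestamps",
        .type = ZarrDataType_uint64,
        .data = timestamp_ns,
        .nbytes_data = sizeof(*timestamp_ns),
    };
    return meta;
}
```

One such struct per quantity (timestamp, stage position, and so on) would accompany each appended frame, so each quantity ends up as its own 1D array indexed by frame number.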
I'm not aware of a standard built on top of zarr that provides for per-frame metadata. It's a gap in the ome-zarr standard. We could create a de facto one, and we can certainly think about api design. But since it's not part of zarr we never added any support. To me, the most important thing is timestamps and other telemetry (scalars). Next might be keeping track of state changes on the instrument (event-driven structured data). If I needed to solve this with zarr today, I'd save those files on the side (possibly in the zarr root directory); I'd use different formats for the "telemetry" and "event-driven" data, but both would have to have some way of correlating measurements with indices in the n-dimensional array (that seems straightforward). So it's possible for people to start playing with solving this problem without doing anything with acquire-zarr. |
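
A rough sketch of the "save those files on the side" approach for scalar telemetry, assuming a plain CSV sidecar. The file layout, the column names, and the helper names `telemetry_write_header` and `telemetry_append` are all hypothetical; the only idea carried over from the comment above is that each record is keyed by the frame's index in the array so it can be correlated later.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Write the CSV header once, right after opening the sidecar file.
 * Returns 0 on success, -1 on failure. */
static int telemetry_write_header(FILE* fp)
{
    assert(fp != NULL);
    return fputs("frame_index,timestamp_ns\n", fp) == EOF ? -1 : 0;
}

/* Append one record that ties a timestamp to a frame index.
 * Returns 0 on success, -1 on failure. */
static int telemetry_append(FILE* fp, uint64_t frame_index, uint64_t timestamp_ns)
{
    assert(fp != NULL);
    int n = fprintf(fp, "%" PRIu64 ",%" PRIu64 "\n", frame_index, timestamp_ns);
    return n < 0 ? -1 : 0;
}
```

The caller would open the file next to (or inside) the Zarr root directory, call `telemetry_write_header` once, and call `telemetry_append` each time a frame is handed to the writer. Event-driven data would likely want a different, more structured format, as the comment suggests.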
can certainly do it outside of acquire-zarr, was mostly curious if you see this as in scope or not 👍. I can definitely understand not wanting to get into it if the ome-zarr itself doesn't comment on it |
It's important. I wouldn't say it's out of scope, but without more information it looks like an open-ended problem. I'm looking for examples of use cases that might inform how we can bound the solution. Micromanager has a more defined use case around logging its metadata, for example. That could be solved at the level of the zarr storage device there (using Nenad's pr). |
yep, definitely recognize that this can be solved over there. Part of me imagines that nenad's storage device PR won't be the only way that I will ever want to interact with acquire-zarr, which is why I keep bringing this back to a slightly higher level discussion. I very much recognize that we can do whatever we want over there to solve this :)
for me, I think the most natural thing that acquire-zarr could do (without making too many assumptions) would probably be to follow the general I also agree with @aliddell that json is probably not the best format for performance (i have currently been using msgpack instead) and to the extent that that is incompatible with zarr (is it?), then I think it would be reasonable for you to punt on that and instruct users to roll their own metadata on the side |
Nathan
If you just added an API call to write (or overwrite) custom metadata at
any time, the problem is solved - at least for the micro-manager adapter.
We can use custom metadata to store whatever additional information we want.
I don't see why it must be immutable during acquisition and must be created
*before* acquisition starts. These constraints seem arbitrary to me. It is
"custom". No code inside the Zarr library depends on its contents. Also, I
think it is fine if it grows large.
We must add the per-frame metadata to make the Zarr writer fully compatible
with the micro-manager. I was thinking of creating an additional file that
the MM driver wrapping the zarr library would have to insert somewhere.
That's much less desirable than using the facility that you would provide
at the library level.
Nenad
|
Yeah making that mutable definitely opens up a lot of flexibility, without adding a new spec |
ooh that's a good idea. What's the proposal here? Something like update_external_metadata? |
I suggest deleting the call ZarrStreamSettings_set_custom_metadata and
adding a new one:
ZarrStream_set_custom_metadata(bool overwrite) that one can call at any
time after the stream is open and before it is closed.
This call can overwrite whatever was there before if overwrite is true. If
overwrite is false, overwriting will generate a run-time error. Or
something in that spirit. In principle, we don't even have to be able to
overwrite, if that sounds uncomfortable.
Nenad
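
The overwrite semantics described above could look something like the following sketch. It uses a mock stream, not the acquire-zarr `ZarrStream`; `MockStream`, `MetaStatus`, and `mock_set_custom_metadata` are hypothetical names, and the extra stream and JSON-string parameters are assumptions, since the proposal only names the `overwrite` flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an open stream. */
typedef struct {
    char* custom_metadata; /* NULL until metadata has been set */
} MockStream;

typedef enum {
    Meta_Ok = 0,
    Meta_WillNotOverwrite, /* metadata exists and overwrite was false */
    Meta_OutOfMemory
} MetaStatus;

/* Set the custom metadata on an open stream. If metadata is already present,
 * it is replaced only when `overwrite` is true; otherwise the call fails and
 * the existing metadata is left untouched. */
static MetaStatus
mock_set_custom_metadata(MockStream* s, const char* json, bool overwrite)
{
    assert(s != NULL && json != NULL);
    if (s->custom_metadata != NULL && !overwrite) {
        return Meta_WillNotOverwrite;
    }
    size_t n = strlen(json) + 1;
    char* copy = malloc(n);
    if (copy == NULL) {
        return Meta_OutOfMemory;
    }
    memcpy(copy, json, n);
    free(s->custom_metadata); /* free(NULL) is a no-op */
    s->custom_metadata = copy;
    return Meta_Ok;
}
```

Returning a status code instead of raising a run-time error is a design choice made for the sketch; a real implementation would presumably also validate the JSON and write it to the store.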
…On Tue, Dec 17, 2024 at 5:26 PM Nathan Clack ***@***.***> wrote:
If you just added an API call to write (or overwrite) custom metadata at
any time, the problem is solved
ooh that's a good idea. What's the proposal here? Something like
update_external_metadata?
|
@aliddell what do you think? |
@nclack That'll work. |