Illumination correction task #62
Hey @tcompa, thanks for the clear step description! This is currently what we do at the Liberali lab indeed. Concerning the more important point of the output strategy: since ideally the illumination correction should be part of a larger preprocessing workflow for the lab (at least this is what happens now at the Liberali lab), rewriting the .zarr structure should be fine for now, so the above strategy is sound! The flexibility of writing the output onto a new .zarr file is still important though, as I can imagine that it can: 1) aid us in debugging once we have more tasks added to the task list, and 2) help the user learn about optimal processing parameters when using a novel task / learning things from scratch. @tcompa, how difficult do you think it would be to create a bundled task that replicates the .zarr structure and outputs the processed data into it? Also, do you think it would make sense to have a pointer file referring to these different .zarr structures (i.e. in different processing states), so that a user could easily tell which files to further process / delete? Would the naming of the folder be enough? Any other thoughts? |
Thanks for these details. I agree: storing the corrected data in a new zarr file has relevant use cases, but it also opens a broader discussion that we are having in parallel.
Note that this also applies to the MIP task (first a global task to replicate the zarr, then one for the actual MIP), but in that case it seems less likely that one wants to add the 2D data to the same zarr array as the 3D ones. Something different, but also very important, is the issue of the pointer files. We can imagine having a few zarr files for each plate (e.g. the raw one, possibly the corrected one, one for the MIP, .. what else?), and tasks should "know" which one to use. At the moment this is all encoded in the folder names. Some remarks:
|
A couple of quick questions on the output (showing my limited experience with image analysis..):
Well, upon writing this I realize that here I am normalizing the correction matrix range from its original one (say [138,255]) to [0,1], meaning that 138 is mapped to 0. But this is probably not what @gusqgm meant, and I guess I should rather divide by the maximum value (without subtracting 138). This seems much more reasonable.. Gustavo, could you please confirm? |
To be more explicit, I was doing:
but I guess the correct thing to do is
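The two snippets referenced above are not reproduced here, but the contrast can be sketched with made-up values (the 138/255 range quoted above):

```python
import numpy as np

# Toy correction matrix; 138 and 255 mirror the range quoted above.
corr = np.array([138.0, 180.0, 255.0])

# First attempt: min-max normalization, which maps 138 to 0.
norm_minmax = (corr - corr.min()) / (corr.max() - corr.min())

# Proposed fix: divide by the maximum only, so 138 maps to 138/255.
norm_bymax = corr / corr.max()
```

With the min-max version the lowest matrix value becomes 0, so dividing an image by it blows up; with the divide-by-max version all values stay in (0, 1] and the correction remains a gentle rescaling.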
|
Hey @tcompa, you are right, dividing by the maximum yields a better result, especially considering that at this stage you have already subtracted the camera background values, so forcing more values to 0 while normalizing would most likely lead to loss of information in some cases. Regarding some of the other points above:
|
Thanks. At the moment, my understanding is the following:
I have doubts on how to rescale the corrected images back to 16bit.
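One conservative option for the 16-bit rescaling (a sketch, not necessarily what was settled on) is to round, clip to the representable range, and cast, so that out-of-range values saturate instead of silently wrapping around:

```python
import numpy as np

def to_uint16(img):
    # Round, clip to [0, 65535], then cast; a direct astype(np.uint16)
    # would silently wrap values outside the representable range.
    return np.clip(np.rint(img), 0, 65535).astype(np.uint16)
```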
|
Concerning the implementation of the core illumination correction: I quickly tested this locally and it produces the output we want (it's not perfect for my data, because it was acquired on a different microscope, but you can see how it compensates at the edges). Code for correction:
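The actual snippet is not reproduced here; a hypothetical sketch of the scheme discussed in this thread (background subtraction, division by the max-normalized profile, clipping back to uint16 — function and argument names are made up) would be:

```python
import numpy as np

def correct_image(img, illum_profile, background=120):
    # Subtract the camera background, divide by the illumination profile
    # normalized to its maximum, then clip back into the uint16 range.
    out = img.astype(np.float64) - background
    out = out / (illum_profile / illum_profile.max())
    return np.clip(np.rint(out), 0, 65535).astype(np.uint16)
```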
The major benefits from this solution:
|
=> Let's go with this illumination correction & background subtraction approach for now, it seems to work quite well for general use-cases. We can implement more complicated versions later, e.g. calculating statistics on the images themselves (see here: https://pelkmanslab.org/wp-content/uploads/2019/02/StoegerBattichYakimovich2015.pdf). |
Quick update: the MWE for the illumination-correction task is essentially complete. The issue appears when overwriting a zarr file.
Debugging this tricky issue is taking us longer than expected, but it is a core feature that we definitely need to sort out ASAP, as it will likely come up in several other tasks. |
Some more details. What we are trying to do is essentially this:
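In pseudo-code (paths and the processing step are placeholders; the actual snippet lives in the repo), the pattern is:

```
x = da.from_zarr("plate.zarr/some/component")            # lazy view of on-disk data
y = process(x)                                           # lazy task, e.g. correction
y.to_zarr("plate.zarr/some/component", overwrite=True)   # write back in place
```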
This simple example does work, but the same thing fails when the new array comes from the more complex procedure we are using (involving several stacks). More on this soon, hopefully. |
Here is a much simpler minimal not-working example:
After this script is run, we expect main.zarr and raw.zarr to have the same size (~10M), but we observe that main.zarr (the overwritten one) is much smaller (<1M). |
Very relevant: dask/dask#5942 |
The dask issue I mentioned (dask/dask#5942) is still open, after two years. After a lot of time spent debugging, @mfranzon and I opted for a quick&dirty custom solution for overwriting zarr files.
It's clearly not the ideal way to go, but possibly sufficient for our needs. We are now testing it on the FMI dataset, to verify that at least the output looks reasonable. |
I'd say that we are at the level where additional tests would be useful. Important caveat: at the moment we are only loading FMI correction matrices for 60x and for channels 1,2,3,4 (each channel being corrected with the corresponding matrix). The next thing to do is to add an argument that somehow answers the question "which correction matrices should I use?". There is also an example on git for the UZH dataset, but it's still using the FMI matrices, so there is no reason why it should produce nice output. |
That's great! We will probably need this ability to overwrite a zarr in different places, so it would be useful to abstract our workaround into a wrapper. Then we can use the temporary solution inside the wrapper while the upstream issue persists and, once it is fixed, just replace the internals with the direct implementation :) Also looks good regarding memory usage, and the images look nice. I will have a look at whether we can get similarly structured illumination correction data for UZH images as well. We typically used a different approach, but maybe for the moment I can manually create some in the same format that fit the UZH data :) |
Quick answers:
|
|
Quick question (@gusqgm and @jluethi) about correction-matrix filenames. Say that we have the files in the folders
and filenames are like
We can then use these names to build the channel-to-matrix mapping. QUESTION: how standardized is this file naming? |
(the last comment partly overlaps with the simultaneous comment by Joel) |
Small overlap. Maybe the initial implementation is that there is just 1 illumination correction file per channel, but we will definitely need to support the use-case where more than 1 correction file exists. There may be a default way of handling it, but the user would need the ability to choose the file. Regarding how standardized file naming is: As an overview of where this profile may come from:
Implementing 3 is not a priority for the moment and 2 is WIP at the FMI at the moment. So if we come up with a good way of defining how the files need to be named for 1 (e.g. similar to what we have there), we can then use that for approaches 2 & 3. |
As of our call today: |
Current version (536897e) does that. At the moment there's only a quick&dirty way of passing inputs, which we can later improve on.
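Purely for illustration (the real keys are defined in the repo, not here), the inputs could be something like a small json mapping channels to matrix files:

```json
{
  "background": 120,
  "correction_matrices": {
    "1": "illum_corr_ch1.tif",
    "2": "illum_corr_ch2.tif"
  }
}
```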
and the actual task call is
|
Concerning the illumination-correction task, it would be useful to have another example of matrices, to see whether the current version (simple and very explicit in what the user should provide) works reasonably. @jluethi, perhaps we could have some illumination matrices related to UZH data? After this test, I think we could close this issue for now, and we can reopen it later if we want to improve something. Comments? |
I'd like to test the FMI data with the correct illumination matrices. @gusqgm, do you confirm that this is the correct mapping between channels and files? Thanks
|
@tcompa I created some mock illumination correction profiles for UZH (mock: not our real process to create them, but they fit the data very well). I updated all the UZH example scripts and created json files containing this info. I also used this opportunity to restructure the naming of the additional json files a bit: channels & illum_corr are not specific to a subset, but to the whole cardiac experiment, so all cardiac test data now use those same json files. As an illustration to show how well this works:
Also, very nice flexibility of input handling and the overall ability to overwrite the Zarr files, which were the major goals for this issue! I'd say we have achieved both nice illumination correction, as well as building the necessary infrastructure here! 👏🏻🚀
@tcompa Do you want to run the FMI dataset as a test now? If that looks good, I'd say we close this issue :)
|
Great, thanks!
Of course. I ran it this afternoon (before this commit), but I'll do it again and then close the issue. |
@gusqgm already explained the basics of the illumination correction task in #26 (comment). Let's see if we are all on the same page.
In our understanding, the steps are as follows: starting from the zarr array produced by the yokogawa_to_zarr task, subtract the background (=120, as a default value) from all the array elements.
Concerning the output, our first proposal is that we just replace the original zarr array (the one obtained via yokogawa_to_zarr) with the new corrected one. Later on we can decide how to proceed: does it have to be an on/off flag provided by the user (see also https://github.com/fractal-analytics-platform/mwe_fractal/issues/29)? Note that this will require some care, since (by now) creating a new zarr file requires a global task that first generates its structure (as we currently do with replicate_zarr_structure_mip for the MIP task).