
Tiling description #227

Open
esgomezm opened this issue Aug 6, 2021 · 6 comments
@esgomezm
Contributor

esgomezm commented Aug 6, 2021

As discussed today, each consumer will apply some tiling when analyzing a "whole image" with a model. However, the tiling strategy might differ among software and there is no single correct way of doing it.

This will definitely affect the deployment of bioimage.io models, so one of the concerns was whether to implement tiling in the CI, specify it somewhere so the user knows about it, or constrain it somehow according to the model architecture.
There is probably much more documentation out there, but what I mentioned to you is this:

ELMI presentation: https://youtu.be/Cj_p6ZzCN6g?t=418

Paper: https://arxiv.org/pdf/2101.05846.pdf

@FynnBe @constantinpape @k-dominik @oeway @akreshuk @fjug @arrmunoz

@oeway
Contributor

oeway commented Aug 6, 2021

Hi @esgomezm, thanks for the info. I had a look at the presentation; nothing surprising so far. The talk is about how tiling can go wrong, and the take-home message is that the overlap should be chosen wisely according to the architecture of the network.

Correct me if I am wrong, but here is what I am thinking about tiling. There are basically two parameters: 1) the minimal patch size and 2) the minimal overlap size. There is no doubt about the minimal patch size, since we already have the min field in the input shape for that. The only thing that can be tricky is the minimal overlap size (but we already have the halo keyword for that).

For sure it won't work if we try to use a single overlap for all cases, but the point is that for a given network with a defined architecture we can infer exactly the minimal overlap size that will produce a correct result.

This is exactly what the talk is about, though it focuses mostly on U-Net. I will repeat here Dagmar's suggestions from the last slide for choosing the tiling of a U-Net:

  • Use valid padding
  • Train your U-Net with an output window size > pooling_factor^levels
  • Predict with the output window size cropped to n*pooling_factor^levels, with n > 0

As you can see, you can actually compute it and make it correct. It might not be easy, but given a CNN that takes an image in and produces another image, we can always compute the minimal patch size and overlap size.
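As a rough illustration (a minimal sketch, not part of the spec; `pooling_factor`, `levels`, and `requested` are made-up parameter names), the window-size rule from the slide can be computed like this:

```python
# Minimal sketch of the slide's rule "predict with output window size
# cropped to n * pooling_factor ** levels, with n > 0" for a valid-padding
# U-Net. The parameter names and defaults are illustrative, not spec fields.

def valid_output_window(requested: int, pooling_factor: int = 2, levels: int = 4) -> int:
    """Round the requested output window size down to the nearest
    positive multiple of pooling_factor ** levels."""
    divisor = pooling_factor ** levels
    n = max(1, requested // divisor)
    return n * divisor

# With 4 levels of 2x2 pooling the output window must be a multiple of 16,
# so a requested 250-pixel window is cropped to 240.
assert valid_output_window(250) == 240
```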

However, the tiling strategy might differ among software and there is no single correct way of doing it.

Well, not exactly, because one model (at least a static CNN) can have only one unique minimal patch size and minimal overlap size. This means any software can use a tiling larger than that, but not smaller. In that sense, there is only one correct way for a given model: we just need to set the correct min and halo and respect them when tiling.

In short, we might need to clarify the spec, but I think with the current spec we can already define the tiling behavior.
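For concreteness, here is a rough sketch of how those constraints could map onto the existing fields for a hypothetical valid-padding U-Net with 4 levels of 2x2 pooling. The numbers are invented for illustration; only the field names (min and step for the input shape, halo for the output) come from the spec.

```python
# Illustrative values only; axes are (batch, channel, y, x).
input_shape = {
    "min": [1, 1, 256, 256],  # smallest tile the network accepts
    "step": [0, 0, 16, 16],   # valid tile sizes grow in multiples of 2**4 = 16
}
output_halo = [0, 0, 32, 32]  # per-side pixels a consumer should overlap
                              # (and discard) when stitching tile predictions
```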

@constantinpape
Collaborator

ELMI presentation: https://youtu.be/Cj_p6ZzCN6g?t=418

Paper: https://arxiv.org/pdf/2101.05846.pdf

Note that this paper refers to tiling strategies for embedding based instance segmentation. It does NOT treat tiling for general image-to-image models.

In general, there are two types of tiling strategies:

  • non-overlapping tiles
  • overlapping tiles and aggregation of the predictions in the overlaps (e.g. for EM boundary prediction but there are many other examples)

None of these methods is inherently right or wrong; which one is suited depends on the model, the data, and the computation budget.
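For illustration, here is a minimal 2D sketch of both strategies, assuming a `predict` function that maps a tile to an output of the same spatial size; the tile and halo values are made up and not tied to any particular model.

```python
import numpy as np

def predict_non_overlapping(image, predict, tile=256):
    # Strategy 1: independent, non-overlapping tiles.
    out = np.zeros(image.shape, dtype=np.float32)
    for y in range(0, image.shape[0], tile):
        for x in range(0, image.shape[1], tile):
            block = image[y:y + tile, x:x + tile]
            out[y:y + tile, x:x + tile] = predict(block)
    return out

def predict_overlapping(image, predict, tile=256, halo=32):
    # Strategy 2: overlapping tiles, predictions averaged in the overlaps.
    out = np.zeros(image.shape, dtype=np.float32)
    weight = np.zeros(image.shape, dtype=np.float32)
    step = tile - 2 * halo
    for y in range(0, image.shape[0], step):
        for x in range(0, image.shape[1], step):
            block = image[y:y + tile, x:x + tile]
            out[y:y + block.shape[0], x:x + block.shape[1]] += predict(block)
            weight[y:y + block.shape[0], x:x + block.shape[1]] += 1
    return out / np.maximum(weight, 1)
```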

@esgomezm
Contributor Author

esgomezm commented Aug 6, 2021

Note that this paper refers to tiling strategies for embedding based instance segmentation. It does NOT treat tiling for general image-to-image models.

Sure, this was yet another example (the most recent one I heard about).

Well, not exactly, because one model (at least a static CNN) can have only one unique minimal patch size and minimal overlap size. This means any software can use a tiling larger than that, but not smaller. In that sense, there is only one correct way for a given model: we just need to set the correct min and halo and respect them when tiling.

I think I'd insist on this. As @constantinpape said, there are different strategies to approach the problem of tiling, and a software may have just one of them implemented, which is not necessarily wrong. For example, in ZeroCostDL4Mic the halo is not considered; in deepImageJ (see the supplementary material in the preprint) we use overlapping tiles but then crop them sufficiently small so as to avoid artifacts at the borders (same as the original U-Net paper); others take an average prediction in the overlapping area, and so on.

My opinion is that one should guide the user on this and provide some accessible documentation, rather than forcing all software to do the tiling with the same method.

@oeway
Contributor

oeway commented Aug 6, 2021

None of these methods is inherently right or wrong; which one is suited depends on the model, the data, and the computation budget.

I disagree here; the non-overlapping case just means halo=0. Given a model architecture, we can tell exactly whether it requires overlapping pixels. Take the standard U-Net, for example: if you tile with an overlap smaller than its minimal required value (or with no overlap at all), the image will contain tiling artifacts (as shown in the presentation), which means the result is wrong. That is also what Dagmar's title says: "what can go wrong with tiling and stitching".

I think I'd insist on this. As @constantinpape said, there are different strategies to approach the problem of tiling, and a software may have just one of them implemented, which is not necessarily wrong.

Well, for a given model, if the model has halo>0 but a consumer doesn't respect it, that's a wrong implementation. Could you be a bit clearer about what you are insisting on? We have already defined min and halo; are you suggesting we should remove halo, or tell the user they are free to use it or not?

I am not saying that the CI should reject all models that do not set the halo correctly. Instead, I would add a recommendation to the description of halo, something like this: It is highly recommended to set the halo value to the minimal overlap of the network; all consumers should do tiling with respect to the halo value to avoid tiling artifacts.

The other issue is that not respecting the halo value is bad for reproducibility: it is generally bad when running the same model on the same input in different consumers produces different outputs.

For example, in ZeroCostDL4Mic the halo is not considered; in deepImageJ (see the supplementary material in the preprint) we use overlapping tiles but then crop them sufficiently small so as to avoid artifacts at the borders (same as the original U-Net paper); others take an average prediction in the overlapping area, and so on.

Any implementation that leads to tiling artifacts is incorrect, or at least sub-optimal. If they know what they are doing, consumers can choose to ignore our recommendation, but as a spec I think we should at least guide developers and users in the right direction. For the cases you mentioned, you first need to know the halo value, otherwise you won't know how to set it. Asking the user for that value doesn't seem like a good strategy to me.

My opinion is that one should guide the user on this and provide some accessible documentation, rather than forcing all software to do the tiling with the same method.

To clarify, recommending the minimal tiling value doesn't mean there is only one tiling method.
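As an illustration of that point: once the halo is known, a consumer that prefers the "crop the overlap" strategy (as deepImageJ describes) can derive its own tiling parameters from it. This is a minimal sketch; the function and names are hypothetical, not a bioimage.io API.

```python
def tiling_params(tile_shape, halo):
    """Return the stride between tile origins and the slice that keeps
    only the interior of each prediction, per spatial axis."""
    stride = tuple(t - 2 * h for t, h in zip(tile_shape, halo))
    keep = tuple(slice(h, t - h) if h > 0 else slice(None)
                 for t, h in zip(tile_shape, halo))
    return stride, keep

stride, keep = tiling_params((256, 256), (32, 32))
# stride == (192, 192): each 256x256 prediction contributes only its central
# 192x192 region to the stitched output, so no tile seams appear.
```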

@ghost

ghost commented Aug 6, 2021

(Reply to the first comment of Wei) I do agree with Wei.

@constantinpape
Collaborator

Just fyi, I have implemented prediction with tiling in bioimageio.core: https://github.com/bioimage-io/python-bioimage-io/blob/main/bioimageio/core/prediction.py#L247.
