**Is your feature request related to a problem? Please describe.**
The input image transforms in some of the models may not be configured to offer optimal input to the model during training, validation, and inference.
For instance, the `endoscopic_inbody_classification` example:

- uses a `Resized` transform that shrinks images to 256×256 pixels, but does not enable `anti_aliasing`. Especially when downscaling large video frames with sharp details, patterns, or lines, this can produce aliasing artifacts that may cause the model to learn the wrong things or to recognize structures that are not really in the original image.
- uses `NormalizeIntensityd` with `nonzero` set to `true`. If there are zero-valued pixels in the image, they will not be scaled and offset together with the rest, which may cause discontinuities and again make it seem as if there are structures that are not really there.
- uses `NormalizeIntensityd` without specifying a fixed `subtrahend` and `divisor`. This means the intensities in each image are normalized using the mean and standard deviation of that specific image. If an image only contains a narrow range of intensities, for instance a dark frame with sensor noise, that noise gets blown up into a big noisy mess in which the model might recognize random things. The usual approach for ImageNet and similar datasets is to compute the mean and standard deviation across the entire training set and use those fixed values everywhere (a sketch of how such dataset-wide statistics could be computed follows this list).
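To make that last point concrete, here is a minimal sketch, assuming MONAI's dictionary transforms and a hypothetical file list (the paths and loader setup are not from the bundle), of computing per-channel mean and standard deviation once over the training set so they can be reused as fixed `subtrahend`/`divisor` values:

```python
import torch
from monai.data import DataLoader, Dataset
from monai.transforms import Compose, EnsureChannelFirstd, LoadImaged

# Hypothetical file list; in the bundle this would come from the training datalist.
train_files = [{"image": p} for p in ["frames/0001.png", "frames/0002.png"]]

loader = DataLoader(
    Dataset(
        data=train_files,
        transform=Compose([LoadImaged(keys="image"), EnsureChannelFirstd(keys="image")]),
    ),
    batch_size=1,
)

# Accumulate per-channel sums so the statistics describe the whole training set,
# not each individual image.
pixel_count = 0
channel_sum = None
channel_sq_sum = None
for batch in loader:
    img = batch["image"].float()        # shape: (1, C, H, W)
    flat = img.flatten(start_dim=2)     # shape: (1, C, H*W)
    sums = flat.sum(dim=(0, 2))
    sq_sums = (flat ** 2).sum(dim=(0, 2))
    channel_sum = sums if channel_sum is None else channel_sum + sums
    channel_sq_sum = sq_sums if channel_sq_sum is None else channel_sq_sum + sq_sums
    pixel_count += flat.shape[0] * flat.shape[2]

mean = channel_sum / pixel_count
std = torch.sqrt(channel_sq_sum / pixel_count - mean ** 2)
print("subtrahend:", mean.tolist())
print("divisor:", std.tolist())
```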
**Describe the solution you'd like**
Set "anti_aliasing": true in the Resized transforms.
Consider leaving nonzero at false in the NormalizeIntensityd transform unless there really is a good reason for it.
Consider setting a fixed subtrahend and divisor in the NormalizeIntensityd transform.
Re-train models with whatever parameters were changed.
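A minimal sketch of what the adjusted preprocessing chain could look like; the 256×256 size matches the bundle, but the `subtrahend`/`divisor` numbers below are placeholder values that would have to be replaced with the actual training-set statistics:

```python
from monai.transforms import Compose, EnsureChannelFirstd, LoadImaged, NormalizeIntensityd, Resized

preprocess = Compose([
    LoadImaged(keys="image"),
    EnsureChannelFirstd(keys="image"),
    # Low-pass filter before downscaling to 256x256 so sharp details do not alias.
    Resized(keys="image", spatial_size=(256, 256), mode="bilinear", anti_aliasing=True),
    # Fixed, dataset-wide statistics applied to every pixel; nonzero stays at its
    # default of False. The numbers below are placeholders, not real statistics.
    NormalizeIntensityd(
        keys="image",
        subtrahend=[123.7, 116.3, 103.5],
        divisor=[58.4, 57.1, 57.4],
        channel_wise=True,
        nonzero=False,
    ),
])
```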
**Additional context**

A paper that discusses the impact of aliasing in convolutional networks

I have tested the impact on processing time of enabling `anti_aliasing`. On a GPU (RTX A2000), the impact is tiny: transforming a 720p video frame and adding it to a batch takes 11 ms instead of 9 ms. On CPU the impact is much larger (55 ms instead of 9 ms).
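The measurement above was a simple wall-clock comparison; something along these lines (synthetic frame, hypothetical loop count, not the exact benchmark I ran) reproduces the kind of per-frame numbers quoted:

```python
import time
import torch
from monai.transforms import Resized

device = "cuda" if torch.cuda.is_available() else "cpu"
# Synthetic 720p RGB frame, channel-first, as a stand-in for a real video frame.
frame = {"image": torch.rand(3, 720, 1280, device=device)}

for aa in (False, True):
    resize = Resized(keys="image", spatial_size=(256, 256), mode="bilinear", anti_aliasing=aa)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(100):
        resize(dict(frame))
    if device == "cuda":
        torch.cuda.synchronize()
    per_frame_ms = (time.perf_counter() - start) / 100 * 1000
    print(f"anti_aliasing={aa}: {per_frame_ms:.1f} ms per frame")
```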