Implementation of the paper titled - U-Net: Convolutional Networks for Biomedical Image Segmentation @ https://arxiv.org/abs/1505.04597

sauravmishra1710/U-Net---Biomedical-Image-Segmentation

U-Net---Biomedical-Image-Segmentation

Implementation of the paper titled - U-Net: Convolutional Networks for Biomedical Image Segmentation

Original Paper

The original paper can be accessed @ https://arxiv.org/abs/1505.04597

Why UNet?

UNet, a convolutional neural network designed for biomedical image segmentation, was introduced in 2015. A typical convolutional neural network is used for image classification, where the output for an image is a single class label. Biomedical imaging tasks, however, require not only detecting whether a medical condition is present but also localizing the affected area, i.e., a class label must be assigned to each pixel.

UNet Architecture

The complete architecture of the UNet network is shown in the image below. Image reference: https://arxiv.org/pdf/1505.04597.pdf

The UNet network model has 3 parts:

  • The Contracting/Downsampling Path.
  • Bottleneck Block.
  • The Expansive/Upsampling Path.

Contracting Path:

Each step of the contracting path consists of two 3x3 unpadded convolutions, each followed by a rectified linear unit (ReLU), and a 2x2 max pooling operation with stride 2 for downsampling. At each downsampling step, the number of feature channels is doubled. The following images show the input block and the contracting path -

Input Block -

Contracting Path -
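
The size arithmetic of the contracting path can be traced with a short sketch. It assumes the 572x572 input tile and 64 initial feature channels shown in Figure 1 of the paper (neither value is stated in this README). The function name `contracting_path` is hypothetical; it only tracks shapes and does not build a network.

```python
def contracting_path(size=572, channels=64, steps=4):
    """Trace feature-map shapes through the contracting path.

    Assumed values: 572x572 input tile and 64 initial channels,
    taken from Figure 1 of the U-Net paper.
    Returns one (channels, size_before_pool, size_after_pool) tuple per step.
    """
    trace = []
    for _ in range(steps):
        size -= 4                  # two 3x3 unpadded convs trim 2 pixels each
        before_pool = size
        size //= 2                 # 2x2 max pooling with stride 2 halves the size
        trace.append((channels, before_pool, size))
        channels *= 2              # channel count doubles for the next step
    return trace


for channels, before_pool, after_pool in contracting_path():
    print(f"{channels:4d} channels: {before_pool}x{before_pool} -> pool -> {after_pool}x{after_pool}")
```

Under these assumptions the last step hands a 32x32 feature map to the bottleneck block.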

Bottleneck Block:

The bottleneck block connects the contracting and the expansive paths. This block performs two 3x3 unpadded convolutions, each with 1024 filters, and prepares the feature maps for the expansive path. The image below shows the bottleneck block -
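
To give a feel for the size of the bottleneck block, the sketch below counts its learnable parameters. It assumes 3x3 kernels with one bias per filter and 512 channels arriving from the contracting path (values read from the paper's Figure 1, not from this README). `conv_params` is a hypothetical helper, not part of any library.

```python
def conv_params(in_channels, out_channels, kernel=3, bias=True):
    """Number of learnable parameters in one 2D convolution layer."""
    weights = kernel * kernel * in_channels * out_channels
    return weights + (out_channels if bias else 0)


# Assumption: 512 channels enter the bottleneck; both convs use 1024 filters.
first = conv_params(512, 1024)
second = conv_params(1024, 1024)
print(first, second, first + second)
```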

Expansive Path:

Every step in the expansive path consists of an upsampling of the feature map followed by a 2x2 convolution (“up-convolution”) that halves the number of feature channels, a concatenation with the correspondingly cropped feature map from the contracting path, and two 3x3 convolutions, each followed by a ReLU. The cropping is necessary because every unpadded convolution loses border pixels. The up-convolution is performed using transposed convolutions, an upsampling technique that expands the spatial size of feature maps. The image below shows the expansive path of the network -
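
The shapes in the expansive path can be traced the same way as a rough sketch. It assumes a 28x28, 1024-channel bottleneck output (which follows from the paper's 572x572 input tile) and an up-convolution with stride 2 that doubles the spatial size. The function name `expansive_path` is hypothetical and only tracks shapes.

```python
def expansive_path(size=28, channels=1024, steps=4):
    """Trace feature-map shapes through the expansive path.

    Assumed start: 28x28 map with 1024 channels from the bottleneck,
    as in Figure 1 of the U-Net paper.
    Returns one (channels_after_concat, channels_out, size_out) tuple per step.
    """
    trace = []
    for _ in range(steps):
        size *= 2                    # up-convolution (assumed stride 2) doubles the size
        channels //= 2               # and halves the number of feature channels
        after_concat = channels * 2  # cropped skip features add the same number again
        size -= 4                    # two 3x3 unpadded convs trim 2 pixels each
        trace.append((after_concat, channels, size))
    return trace


for after_concat, channels, size in expansive_path():
    print(f"concat {after_concat:4d} -> {channels:3d} channels at {size}x{size}")
```

With these assumptions the last step produces a 388x388 map with 64 channels, which is smaller than the input tile because no padding is used.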

Skip Connections:

The skip connections from the contracting path are concatenated with the corresponding feature maps in the expansive path. These skip connections provide higher resolution features to better localize and learn representations from the input image. They also help in recovering any spatial information that could have been lost during downsampling. The image below shows one skip connection between the contracting path and the expansive path -
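
A minimal sketch of one skip connection, using nested Python lists in place of tensors. Because the convolutions are unpadded, the feature map from the contracting path is larger than the one in the expansive path, so it is center-cropped before concatenation along the channel axis. The helper names `center_crop` and `concat_skip` and the toy sizes are assumptions for illustration; the channel order after concatenation is also a choice made here.

```python
def center_crop(channel, target):
    """Crop a square 2D list to target x target around its centre."""
    offset = (len(channel) - target) // 2
    return [row[offset:offset + target] for row in channel[offset:offset + target]]


def concat_skip(encoder_maps, decoder_maps):
    """Crop encoder channels to the decoder size, then stack along the channel axis."""
    target = len(decoder_maps[0])
    cropped = [center_crop(channel, target) for channel in encoder_maps]
    return cropped + decoder_maps


# Toy data: one 6x6 encoder channel and one 2x2 decoder channel.
encoder = [[[row * 6 + col for col in range(6)] for row in range(6)]]
decoder = [[[0, 0], [0, 0]]]
merged = concat_skip(encoder, decoder)
print(len(merged), merged[0])
```

The merged result has two channels: the central 2x2 patch of the encoder map followed by the decoder map.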

Final Layer:

At the final layer, a 1x1 convolution is used to map each 64-component feature vector to the desired number of classes.

The entire network has a total of 23 convolutional layers.
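
A 1x1 convolution is a per-pixel linear map across channels, which the following sketch illustrates with plain lists. The function `conv1x1`, the 3-channel toy input (instead of 64 channels), and the weights are all hypothetical values chosen for readability.

```python
def conv1x1(feature_maps, weights, biases):
    """Apply a 1x1 convolution.

    feature_maps: [C_in][H][W], weights: [C_out][C_in], biases: [C_out].
    Returns class scores with shape [C_out][H][W].
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    scores = []
    for weight_row, bias in zip(weights, biases):
        scores.append([
            [bias + sum(w * feature_maps[c][y][x] for c, w in enumerate(weight_row))
             for x in range(width)]
            for y in range(height)
        ])
    return scores


# Toy example: 3 input channels, 2x2 pixels, 2 output classes.
features = [[[1, 2], [3, 4]], [[0, 1], [0, 1]], [[2, 2], [2, 2]]]
weights = [[1, 0, -1], [0, 2, 1]]
biases = [0, 1]
scores = conv1x1(features, weights, biases)
print(scores)
```

Taking the index of the largest score at each pixel then gives the predicted class for that pixel.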
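
The layer count can be checked with simple arithmetic. The sketch assumes four resolution steps on each side of the bottleneck, as drawn in the paper's Figure 1, and counts each up-convolution as a convolutional layer.

```python
steps = 4  # assumed number of steps on each side of the bottleneck

contracting = steps * 2      # two 3x3 convs per contracting step
bottleneck = 2               # two 3x3 convs
expansive = steps * (1 + 2)  # one 2x2 up-convolution plus two 3x3 convs per step
final = 1                    # the 1x1 convolution
total = contracting + bottleneck + expansive + final
print(total)
```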

Original Implementation

The original UNet model's implementation as described in the paper can be found @ UNet - Biomedical_Segmentation

Application of UNet

An application of the UNet model is implemented @ UNet In Action. The objective of the task is to segment and identify cell nuclei. The dataset is taken from a Kaggle challenge - 2018 Data Science Bowl - Find the nuclei in divergent images to advance medical discovery.

Results and Conclusion

Comparing the microscopic image, the original mask and the predicted mask, the model appears to segment the cell nuclei correctly and generate matching masks. The masks predicted by the trained UNet model on the cell nuclei images are shown in the images below -

Prediction 1

Prediction 2

Prediction 3

Though UNet was originally designed for biomedical images, the model can be applied to any computer vision segmentation task.

Further Reading

  1. UNet++: A Nested U-Net Architecture for Medical Image Segmentation. The original paper is available at https://arxiv.org/abs/1807.10165.
  2. UNet++: Redesigning Skip Connections to Exploit Multiscale Features in Image Segmentation. The original paper is available at https://arxiv.org/abs/1912.05074v2.
  3. Attention U-Net: Learning Where to Look for the Pancreas. The original paper is available at https://arxiv.org/abs/1804.03999.
  4. TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation. The original paper is available at https://arxiv.org/abs/1801.05746.
  5. U-Net and its variants for medical image segmentation: theory and applications. The original paper is available at https://arxiv.org/abs/2011.01118.

References

  1. Ronneberger, O., Fischer, P. and Brox, T. (2015) ‘U-Net: Convolutional Networks for Biomedical Image Segmentation’, CoRR, abs/1505.0. Available at: http://arxiv.org/abs/1505.04597.
  2. Implementing original U-Net from scratch using PyTorch by Abhishek Thakur. Available at https://www.youtube.com/watch?v=u1loyDCoGbE