A semantic segmentation model for pixel-wise document image binarization.
- fine-tune Segformer on 1024 $\times$ 1024 images;
- set `reduce_labels=True` in Segformer processor to ignore the background;
- compare valid DIBCO metrics with SauvolaNet's paper.
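
To make the effect of label reduction concrete, here is a minimal pure-Python sketch. It assumes the common convention in which label 0 (background) is mapped to an ignore index of 255 and every other label is shifted down by one. The function name `reduce_labels`, the constant `IGNORE_INDEX`, and the toy mask are illustrative assumptions, not code from this repository.

```python
# Assumption: 255 is the ignore index that the loss skips during training.
IGNORE_INDEX = 255


def reduce_labels(mask):
    """Map background (0) to IGNORE_INDEX and shift all other labels down by one.

    `mask` is a list of rows, each row a list of integer class labels.
    Pixels already equal to IGNORE_INDEX stay ignored.
    """
    reduced = []
    for row in mask:
        reduced.append(
            [IGNORE_INDEX if v in (0, IGNORE_INDEX) else v - 1 for v in row]
        )
    return reduced


# Toy 2 x 3 mask, assuming 0 = background and 1 = text.
mask = [[0, 1, 1],
        [0, 0, 1]]
print(reduce_labels(mask))
```

After reduction, the text class becomes label 0 and background pixels no longer contribute to the loss. In practice the same mapping would be applied to whole arrays rather than nested lists.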
Segformer is an efficient semantic segmentation model introduced by Xie et al. in 2021.
This repository provides a Segformer model fine-tuned for pixel-wise document image binarization.
The training data combines 14 datasets, replicating the setup used for SauvolaNet by Li et al. in 2021.
Figure 1. An example pair from the Bickley diary dataset
For more information on the dataset, see SauvolaNet's official repository.
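
For the metric comparison mentioned in the list above, the following sketch computes F-measure and PSNR, two of the measures commonly reported in DIBCO evaluations. Which metrics this repository actually compares is not stated here, so the choice is an assumption, as are the function names and the convention that text pixels are 1 and background pixels are 0. Other DIBCO measures, such as pseudo-F-measure and DRD, are omitted.

```python
import math


def f_measure(pred, truth):
    """F-measure in percent, treating text pixels (value 1) as the positive class."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)


def psnr(pred, truth):
    """PSNR in dB for binary images with values in {0, 1} (peak value 1)."""
    n_pixels = sum(len(row) for row in truth)
    squared_error = sum(
        (p - t) ** 2
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
    )
    mse = squared_error / n_pixels
    return math.inf if mse == 0 else 10 * math.log10(1.0 / mse)


# Toy 2 x 4 example: one true positive, one false positive, one false negative.
truth = [[1, 1, 0, 0],
         [0, 0, 0, 0]]
pred = [[1, 0, 0, 0],
        [1, 0, 0, 0]]
print(f_measure(pred, truth), psnr(pred, truth))
```

Both functions expect the prediction and the ground truth to have the same shape; a model output would first need to be thresholded or arg-maxed into a binary mask.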