Paper by Federico Baldassarre, Diego González Morín and Lucas Rodés-Guirao: Deep Koalarization, arXiv:1712.03400 [cs.CV]

Encoder:

Layer | Filters | Kernel Size | Strides | Padding | Activation |
---|---|---|---|---|---|
Conv2D_E1 | 64 | (3 × 3) | (2 × 2) | same | ReLU |
Conv2D_E2 | 128 | (3 × 3) | (1 × 1) | same | ReLU |
Conv2D_E3 | 128 | (3 × 3) | (2 × 2) | same | ReLU |
Conv2D_E4 | 256 | (3 × 3) | (1 × 1) | same | ReLU |
Conv2D_E5 | 256 | (3 × 3) | (2 × 2) | same | ReLU |
Conv2D_E6 | 512 | (3 × 3) | (1 × 1) | same | ReLU |
Conv2D_E7 | 512 | (3 × 3) | (1 × 1) | same | ReLU |
Conv2D_E8 | 256 | (3 × 3) | (1 × 1) | same | ReLU |
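
To make the downsampling in the encoder table easier to follow, here is a minimal sketch that traces the output shape of each layer. It assumes a 224 × 224 single-channel input (not stated in the table; chosen because three stride-2 layers then give a 28 × 28 map) and the usual "same"-padding rule that each spatial dimension becomes ceil(dim / stride). The helper names `same_conv_shape` and `trace` are illustrative only.

```python
import math

# (name, filters, stride) copied from the encoder table.
# Every layer uses a 3x3 kernel with "same" padding.
ENCODER = [
    ("Conv2D_E1", 64, 2),
    ("Conv2D_E2", 128, 1),
    ("Conv2D_E3", 128, 2),
    ("Conv2D_E4", 256, 1),
    ("Conv2D_E5", 256, 2),
    ("Conv2D_E6", 512, 1),
    ("Conv2D_E7", 512, 1),
    ("Conv2D_E8", 256, 1),
]


def same_conv_shape(h, w, filters, stride):
    """Output shape of a 'same'-padded convolution: ceil(dim / stride) per axis."""
    return math.ceil(h / stride), math.ceil(w / stride), filters


def trace(layers, h, w, c):
    """Return [(layer name, output shape), ...] for a stack of conv layers."""
    shapes = []
    for name, filters, stride in layers:
        h, w, c = same_conv_shape(h, w, filters, stride)
        shapes.append((name, (h, w, c)))
    return shapes


# Assumed input: 224 x 224 grayscale (one channel).
encoder_shapes = trace(ENCODER, 224, 224, 1)
for name, shape in encoder_shapes:
    print(name, shape)
```

Under these assumptions the three stride-2 layers each halve the resolution, so the encoder ends with a feature map one eighth of the input size in each spatial dimension.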

Fusion:

Layer | Filters | Kernel Size | Strides | Padding | Activation |
---|---|---|---|---|---|
Conv2D_F1 | 256 | (1 × 1) | (1 × 1) | same | ReLU |

Decoder:

Layer | Filters | Kernel Size | Strides | Padding | Activation |
---|---|---|---|---|---|
Conv2D_D1 | 128 | (3 × 3) | (1 × 1) | same | ReLU |
UpSamp2D_D1 | - | - | - | - | - |
Conv2D_D2 | 64 | (3 × 3) | (1 × 1) | same | ReLU |
Conv2D_D3 | 64 | (3 × 3) | (1 × 1) | same | ReLU |
UpSamp2D_D2 | - | - | - | - | - |
Conv2D_D4 | 32 | (3 × 3) | (1 × 1) | same | ReLU |
Conv2D_D5 | 2 | (3 × 3) | (1 × 1) | same | tanh |
UpSamp2D_D3 | - | - | - | - | - |
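
The decoder table can be traced the same way. The sketch below is illustrative only and rests on two assumptions that the table does not state: each upsampling layer doubles height and width (factor 2), and the decoder input is a 28 × 28 × 256 volume. Convolutions with stride (1 × 1) and "same" padding change only the channel count.

```python
# Decoder layers in table order: ("conv", filters) or ("up", None).
DECODER = [
    ("conv", 128),
    ("up", None),
    ("conv", 64),
    ("conv", 64),
    ("up", None),
    ("conv", 32),
    ("conv", 2),
    ("up", None),
]

UPSAMPLE_FACTOR = 2  # assumption: the table lists no factor for the UpSamp2D rows


def trace_decoder(h, w, c):
    """Return the output shape after each decoder layer."""
    shapes = []
    for kind, filters in DECODER:
        if kind == "conv":
            # Stride 1 with "same" padding keeps h and w; only channels change.
            c = filters
        else:
            # Upsampling scales both spatial dimensions and keeps channels.
            h, w = h * UPSAMPLE_FACTOR, w * UPSAMPLE_FACTOR
        shapes.append((h, w, c))
    return shapes


# Assumed decoder input: 28 x 28 x 256.
decoder_shapes = trace_decoder(28, 28, 256)
for shape in decoder_shapes:
    print(shape)
```

With a factor of 2, the three upsampling layers together undo an eightfold reduction in resolution, and the final two channels come from the tanh-activated Conv2D_D5.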

The Inception-ResNet-v2 model extracts high-level features from the input grayscale image. Its last layer before the softmax activation outputs a feature vector of size 1000, i.e. of dimension (1000 × 1 × 1). This vector is repeated 28 × 28 times and reshaped into a volume of size (28 × 28 × 1000), which is then concatenated depth-wise with the (28 × 28 × 256) output of the Conv2D_E8 layer. The resulting block of size (28 × 28 × 1256) is passed through Conv2D_F1.
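
The fusion step described above can be sketched with NumPy. This is a shape-level illustration rather than the authors' implementation: the arrays `encoder_out` and `embedding` are placeholder stand-ins for the real activations, the weights are random, and the 1 × 1 convolution of Conv2D_F1 is written as a per-pixel matrix product, which is mathematically equivalent.

```python
import numpy as np

H, W = 28, 28

# Placeholder activations (assumptions for illustration, not real network outputs).
encoder_out = np.zeros((H, W, 256), dtype=np.float32)  # stand-in for Conv2D_E8 output
embedding = np.arange(1000, dtype=np.float32)          # stand-in for the feature vector

# Repeat the feature vector at every spatial location: (1000,) -> (28, 28, 1000).
tiled = np.tile(embedding, (H, W, 1))

# Depth-wise concatenation along the channel axis: (28, 28, 256 + 1000).
fused = np.concatenate([encoder_out, tiled], axis=-1)

# Conv2D_F1: a 1x1 convolution with 256 filters acts like a dense layer
# applied independently to each pixel's 1256-dimensional vector.
rng = np.random.default_rng(0)
kernel = (0.01 * rng.standard_normal((1256, 256))).astype(np.float32)
bias = np.zeros(256, dtype=np.float32)
fusion_out = np.maximum(fused @ kernel + bias, 0.0)  # ReLU activation

print(tiled.shape, fused.shape, fusion_out.shape)
```

Because the same vector is attached to every pixel, the fusion layer can mix global image semantics into each local feature without depending on where in the image that feature sits.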