Model.fit() error. Someone please help me fix this error. I am not able to figure it out #20444

Open
Israh-Abdul opened this issue Nov 4, 2024 · 1 comment

@Israh-Abdul

I'm building a capsule network in TensorFlow for binary classification using a custom CapsuleLayer. My model and associated components are as follows:

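# Imports assumed by this snippet (not shown in the original post)
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras import layers, models
from tensorflow.keras.layers import Input, Conv2D, Reshape, Lambda

class CapsuleLayer(layers.Layer):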
    def __init__(self, num_capsule, dim_capsule, routings=3, **kwargs):
        super(CapsuleLayer, self).__init__(**kwargs)
        self.num_capsule = num_capsule
        self.dim_capsule = dim_capsule
        self.routings = routings

    def build(self, input_shape):
        self.kernel = self.add_weight(name='capsule_kernel',
                                      shape=(input_shape[-1], self.num_capsule * self.dim_capsule),
                                      initializer='glorot_uniform',
                                      trainable=True)

    def call(self, inputs):
        inputs_hat = K.dot(inputs, self.kernel)
        inputs_hat = K.reshape(inputs_hat, (-1, self.num_capsule, self.dim_capsule))
        b = K.zeros_like(inputs_hat[:, :, 0])

        for i in range(self.routings):
            c = tf.nn.softmax(b, axis=1)
            o = squash(tf.reduce_sum(c[..., None] * inputs_hat, 1))
            if i < self.routings - 1:
                b += tf.reduce_sum(inputs_hat * o[:, None, :], -1)
        return o

def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors

# Network architecture and margin loss
def CapsNet(input_shape):
    inputs = Input(shape=input_shape)
    x = Conv2D(64, (9, 9), strides=1, activation='relu', padding='valid')(inputs)
    x = Conv2D(128, (9, 9), strides=2, activation='relu', padding='valid')(x)
    x = Reshape((-1, 8))(x)
    primary_caps = CapsuleLayer(num_capsule=10, dim_capsule=8, routings=3)(x)
    digit_caps = CapsuleLayer(num_capsule=2, dim_capsule=16, routings=3)(primary_caps)
    out_caps = Lambda(lambda z: K.sqrt(K.sum(K.square(z), -1)))(digit_caps)
    return models.Model(inputs, out_caps)

def margin_loss(y_true, y_pred):
    m_plus, m_minus, lambda_val = 0.9, 0.1, 0.5
    left = tf.square(tf.maximum(0., m_plus - y_pred))
    right = tf.square(tf.maximum(0., y_pred - m_minus))
    return tf.reduce_mean(tf.reduce_sum(y_true * left + lambda_val * (1 - y_true) * right, axis=-1))

When training, I receive this error:
ValueError: Cannot squeeze axis=-1, because the dimension is not 1.

I've set class_mode='categorical' in the ImageDataGenerator flow:
train_generator = train_datagen.flow_from_directory(
    train_dir, target_size=(224, 224), color_mode='grayscale',
    batch_size=64, class_mode='categorical')
I am using this model to classify an image dataset into 2 classes. Please help!

@VadisettyRahul

Hi @Israh-Abdul @mehtamansi29

Some possibilities are:

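This can happen when the network output does not have the shape the labels require. With two classes and class_mode='categorical', the generator yields one-hot labels of shape (batch_size, 2), so the model output should have that shape as well.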
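One way to check this is to trace the shapes by hand. The following is a plain-Python sketch (no TensorFlow needed) that assumes the layers in the question behave exactly as written; `conv_out` and `trace_capsnet_shapes` are hypothetical helper names used only for illustration, not Keras APIs.

```python
def conv_out(size, kernel, stride):
    """Output length of a 'valid' convolution along one spatial axis."""
    return (size - kernel) // stride + 1


def trace_capsnet_shapes(batch, height=224, width=224):
    """Return (stage, shape) pairs for the CapsNet from the question."""
    steps = []
    h, w = conv_out(height, 9, 1), conv_out(width, 9, 1)
    steps.append(("conv1", (batch, h, w, 64)))
    h, w = conv_out(h, 9, 2), conv_out(w, 9, 2)
    steps.append(("conv2", (batch, h, w, 128)))
    n_vectors = h * w * 128 // 8
    steps.append(("reshape", (batch, n_vectors, 8)))
    # CapsuleLayer.call reshapes to (-1, num_capsule, dim_capsule), so the -1
    # absorbs batch * n_vectors; the routing sum over axis 1 then removes the
    # num_capsule axis and leaves one dim_capsule vector per row.
    rows = batch * n_vectors
    steps.append(("primary_caps", (rows, 8)))
    steps.append(("digit_caps", (rows, 16)))
    # The Lambda layer takes the vector length over the last axis.
    steps.append(("out_caps", (rows,)))
    return steps


for name, shape in trace_capsnet_shapes(batch=64):
    print(name, shape)
```

Under these assumptions the model output is rank 1 with batch * 173056 entries, while the labels have shape (batch, 2). The batch axis gets folded away inside CapsuleLayer.call, so that reshape is probably the first place to look.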

The model output needs two final units, one per class. One option is to add a Dense layer with softmax activation at the end of the network so that the output is compatible with class_mode='categorical' in the ImageDataGenerator.
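A minimal sketch of such a head, assuming tf.keras is available. Note that it swaps the capsule layers for global average pooling so the head can be checked in isolation; it does not repair CapsuleLayer itself, and `build_two_class_model` is a hypothetical name.

```python
from tensorflow.keras import layers, models


def build_two_class_model(input_shape=(224, 224, 1), num_classes=2):
    """Conv features -> pooling -> Dense softmax head (capsule layers omitted)."""
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(64, (9, 9), strides=1, activation="relu", padding="valid")(inputs)
    x = layers.Conv2D(128, (9, 9), strides=2, activation="relu", padding="valid")(x)
    # Assumption: pooling stands in for the capsule layers here, only to
    # show an output whose shape matches one-hot labels.
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(inputs, outputs)


model = build_two_class_model()
print(model.output_shape)
```

A softmax output like this is usually paired with loss='categorical_crossentropy'. To keep margin_loss instead, the network would need to output one capsule length per class, again with shape (batch_size, 2).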

The error can also arise from the margin_loss function. It should compute the loss over both classes, and y_true and y_pred must have the same shape.
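To test the loss separately from the model, here is a NumPy mirror of the margin_loss formula from the question with an explicit shape check added. It is a sketch for debugging, not a replacement for the TensorFlow version; `margin_loss_np` is a hypothetical name and the shape check is my addition.

```python
import numpy as np


def margin_loss_np(y_true, y_pred, m_plus=0.9, m_minus=0.1, lambda_val=0.5):
    """NumPy mirror of margin_loss that refuses mismatched shapes."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError(
            f"shape mismatch: y_true {y_true.shape} vs y_pred {y_pred.shape}"
        )
    left = np.square(np.maximum(0.0, m_plus - y_pred))
    right = np.square(np.maximum(0.0, y_pred - m_minus))
    per_sample = np.sum(y_true * left + lambda_val * (1.0 - y_true) * right, axis=-1)
    return float(np.mean(per_sample))


# One-hot labels for two samples, as class_mode='categorical' would give.
y_true = np.array([[1, 0], [0, 1]])

# Predictions with matching shape (2, 2): the loss evaluates normally.
print(margin_loss_np(y_true, np.array([[0.9, 0.1], [0.1, 0.9]])))

# A rank-1 prediction, like the model in the question produces, is rejected.
try:
    margin_loss_np(y_true, np.zeros(8))
except ValueError as err:
    print(err)
```

Printing model.output_shape next to the label batch shape from the generator before calling fit() gives the same check on the real model.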
