
ValueError: Expected input batch_size (864) to match target batch_size (32). #232

abhibha1807 opened this issue Sep 30, 2020 · 0 comments


Hello,
I am a newbie to TensorFlow and Hugging Face. I have some BERT text-classification code and I am trying to figure out whether the same code can be used for ALBERT. I am facing this error while training: ValueError: Expected input batch_size (864) to match target batch_size (32). My data is in the form of a CSV file with two columns, 'text' and 'label'. The label column has two labels, [0.0, 1.0]. The dataset contains 13752 instances; approximately 13569 are in training and 1883 in validation.
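For completeness, the tokenizer and raw data are prepared along these lines (this setup is not in my excerpt below, so the file name and loading details here are assumptions; the column names match the description above):

import pandas as pd
import torch
from torch.utils.data import TensorDataset, DataLoader, RandomSampler
from transformers import AutoTokenizer, AutoModelWithLMHead

# Load the CSV described above (file name assumed; columns are 'text' and 'label').
df = pd.read_csv("train.csv")
sentences_train = df["text"].tolist()
labels_train = df["label"].tolist()

tokenizer = AutoTokenizer.from_pretrained("albert-base-v1")
device = torch.device("cuda")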
This is the code:

input_ids_train = []
attention_masks_train = []
for sent in sentences_train:
    encoded_dict_train = tokenizer.encode_plus(
        sent,
        add_special_tokens=True,      # add [CLS] and [SEP]
        max_length=27,
        padding='max_length',         # pad every sentence to max_length so the tensors can be concatenated
        return_attention_mask=True,
        return_tensors='pt',
        truncation=True,
    )
    input_ids_train.append(encoded_dict_train['input_ids'])
    attention_masks_train.append(encoded_dict_train['attention_mask'])

input_ids_train = torch.cat(input_ids_train, dim=0)
attention_masks_train = torch.cat(attention_masks_train, dim=0)
labels_train = torch.tensor(labels_train)
print(input_ids_train.shape)  # OUTPUT: torch.Size([13569, 27])
print(labels_train.shape)     # OUTPUT: torch.Size([13569])
train_dataset = TensorDataset(input_ids_train, attention_masks_train, labels_train)
train_dataloader = DataLoader(train_dataset, sampler=RandomSampler(train_dataset), batch_size=32)
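
# Quick sanity check on one batch (I added this check; the shapes follow
# from the dataset shapes printed above):
batch = next(iter(train_dataloader))
print(batch[0].shape)  # torch.Size([32, 27]) -- input_ids: batch_size x max_length
print(batch[2].shape)  # torch.Size([32])     -- labels: one per sentence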

#defining the model 
model = AutoModelWithLMHead.from_pretrained(
    "albert-base-v1",
    num_labels=2,
    output_attentions=False,
    output_hidden_states=False,
)
model.cuda()
epochs = 2
for epoch_i in range(0, epochs):
    total_train_loss = 0
    model.train()
    for step, batch in enumerate(train_dataloader):
        b_input_ids = batch[0].to(device)
        b_input_mask = batch[1].to(device)
        b_labels = batch[2].to(device)
        loss, logits = model(b_input_ids,
                             token_type_ids=None,
                             attention_mask=b_input_mask,
                             labels=b_labels.long())  # error is raised on this line
        total_train_loss += loss.item()
        loss.backward()
        # (optimizer step / zero_grad omitted from this excerpt)

    # Average the loss over all of the batches in the epoch.
    avg_train_loss = total_train_loss / len(train_dataloader)
    

The code worked well when I was using BERT, but this error appeared when I switched to ALBERT, hence I am posting the question here. Links to answers on other forums would also be highly appreciated. Please let me know if I am missing anything. Thank you.
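One thing I notice is that 864 = 32 × 27, i.e. batch_size × max_length, which makes me suspect the model head: AutoModelWithLMHead attaches a language-modelling head that predicts one token per position (32 × 27 = 864 predictions), while my labels are one per sentence (32). If that is the cause, would swapping in a sequence-classification head be the right fix? A sketch of what I mean (untested):

from transformers import AutoModelForSequenceClassification

# Hypothesis, not a verified fix: a classification head outputs one
# prediction per sentence, matching the per-sentence labels.
model = AutoModelForSequenceClassification.from_pretrained(
    "albert-base-v1",
    num_labels=2,
    output_attentions=False,
    output_hidden_states=False,
)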
