are attention_masks zero tensors? #37

Open
GinnyXiao opened this issue Nov 6, 2024 · 2 comments

Comments

@GinnyXiao

Dear authors,

I was wondering: during training, which attention masks were you using? In your inference code I saw that you generate these masks as zero tensors, text_padding_position=torch.zeros_like(input_ids). Was it the same for training? Thank you so much!

    def forward(...):
        ...
        output = self.mm_extractor.beit3(
            visual_tokens=images_evf,
            textual_tokens=input_ids,
            # padding positions are the inverse of the attention mask
            text_padding_position=~attention_masks,
        )
        ...
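(For reference: in BEiT-3, text_padding_position is 1/True at padding positions, so a zero tensor marks every token as real. That is exactly what unpadded single-prompt inference needs. A minimal sketch of that convention; the token ids below are made up for illustration:)

    import torch

    # Hypothetical unpadded inference input: one tokenized prompt.
    input_ids = torch.tensor([[101, 2023, 2003, 1037, 7953, 102]])  # illustrative ids

    # A zero tensor flags every position as a real token (0 = not padding),
    # so the model attends to the whole sequence.
    text_padding_position = torch.zeros_like(input_ids)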
@CoderZhangYx (Collaborator)

We place no attention on padding tokens. Our dataset code already produces the attention_mask; simply using that is fine.
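(In other words: if the dataloader returns a standard attention_mask with 1 for real tokens and 0 for padding, the padding-position mask that the snippet above passes to beit3 is just its inverse, which is what ~attention_masks computes. A minimal sketch, assuming a HuggingFace-style attention_mask; the tensors are illustrative, not from the repo:)

    import torch

    # Hypothetical padded batch: attention_mask is 1 for real tokens, 0 for padding.
    attention_masks = torch.tensor([
        [1, 1, 1, 1, 0, 0],  # sequence padded with two tokens
        [1, 1, 1, 1, 1, 1],  # full-length sequence, no padding
    ])

    # BEiT-3 expects 1/True at *padding* positions, i.e. the inverted mask.
    text_padding_position = ~attention_masks.bool()
    # equivalent integer form: text_padding_position = 1 - attention_masks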

@GinnyXiao (Author)

Got it, thank you soooo much!
