Hi,
I downloaded the MSCOCO captions_train2014.json and captions_val2014.json files. As described in the paper, there are 82,783 train samples and 40,504 val samples, and every sample contains 5 captions. If I omit one caption and combine the other four into two paraphrase pairs, I get about 2*(82,783 + 40,504) = 246,574 pairs. How can I get the 320k paraphrase pairs?
The author replied to me, explaining how to create the dataset:
Each image has multiple captions. Say a, b, and c are paraphrases of each other; then you can turn them into pairs as follows:
a -> b
b -> a
a -> c
c -> a
b -> c
c -> b.
This yields many more data points than the total number of image-caption pairs. However, make sure that all the phrases that belong to a single image stay together, either all in train or all in val.
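The pairing described in the reply can be sketched in Python as below. This is only a minimal illustration, not the author's actual script: the function names `group_captions_by_image` and `make_paraphrase_pairs` and the toy data are made up for this example, and it assumes the standard COCO captions layout in which each entry of the `annotations` list has an `image_id` and a `caption` field.

```python
from collections import defaultdict
from itertools import permutations


def group_captions_by_image(annotations):
    """Collect all captions that describe the same image (hypothetical helper)."""
    groups = defaultdict(list)
    for ann in annotations:
        groups[ann["image_id"]].append(ann["caption"].strip())
    return groups


def make_paraphrase_pairs(annotations):
    """Build every ordered (source, target) pair among captions of one image."""
    pairs = []
    for captions in group_captions_by_image(annotations).values():
        # permutations(..., 2) yields a->b and b->a as separate pairs
        pairs.extend(permutations(captions, 2))
    return pairs


# Toy stand-in for the "annotations" list of a COCO captions file (assumed layout).
toy_annotations = [
    {"image_id": 1, "caption": "a"},
    {"image_id": 1, "caption": "b"},
    {"image_id": 1, "caption": "c"},
    {"image_id": 2, "caption": "x"},
    {"image_id": 2, "caption": "y"},
]

pairs = make_paraphrase_pairs(toy_annotations)
for source, target in pairs:
    print(source, "->", target)
```

With n captions per image this produces n*(n-1) ordered pairs, so 5 captions give 20 pairs per image. Running the function separately on the annotations loaded from captions_train2014.json and from captions_val2014.json (for example with `json.load(f)["annotations"]`) keeps all pairs from one image on the same side of the split, as the reply asks.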