- download questions:
python cli/prep/ download_vqa2_questions --out_dir data/datasets/vqa2
- download annotations (answers):
python cli/prep/ download_vqa2_annotations --out_dir data/datasets/vqa2
- download train/val/test images:
python cli/prep/ download_vqa2_images --out_dir data/datasets/vqa2
- or only download test images:
python cli/prep/ download_vqa2_images --nodownload_train --nodownload_val --download_test --out_dir data/datasets/vqa2
convert to simplified jsonl format:
python cli/prep/ process_vqa --dataset_dir data/datasets/vqa2 --version 2
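After the conversion it can help to look at a few records before building on them. The sketch below reads a JSONL file line by line and returns the first few records. It makes no assumption about which keys process_vqa writes; the helper names read_jsonl and peek are hypothetical and not part of this repo.

```python
import json
from itertools import islice
from pathlib import Path
from typing import Iterator


def read_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)


def peek(path: str, n: int = 3) -> list:
    """Return at most the first n records so their keys can be inspected."""
    return list(islice(read_jsonl(path), n))
```

Calling peek on one of the generated .jsonl files under data/datasets/vqa2 shows which fields each record carries.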
train set (82783 images, ~ 10 minutes):
python cli/prep/ write_db --image_dir data/datasets/vqa2/train2014
val set (40504 images, ~ 5 minutes):
python cli/prep/ write_db --image_dir data/datasets/vqa2/val2014
test set (81434 images, ~ 5 minutes):
python cli/prep/ write_db --image_dir data/datasets/vqa2/test2015
download nlvr2 images:
python cli/prep/ download_nlvr2_images --base_url https://xxx/NLVR2/ --out_dir data/datasets/nlvr2
Get the image URL first, then replace the xxx in --base_url with that base URL.
download nlvr2 dataset:
python cli/prep/ download_nlvr2_data --out_dir data/datasets/nlvr2 --download_balance true --download_unbalance true
Convert to lmdb format:
- train set (103170 images, ~ 4 minutes):
python cli/prep/ write_db --image_dir data/datasets/nlvr2/images/train --output_file data/datasets/nlvr2/train.lmdb --path_pattern "*/*"
- dev set (8102 images, ~ 15 seconds):
python cli/prep/ write_db --image_dir data/datasets/nlvr2/dev
- test set (8082 images, ~ 15 seconds):
python cli/prep/ write_db --image_dir data/datasets/nlvr2/test1
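Before writing a database it may be worth checking that the image count on disk matches the numbers quoted above. The sketch below counts image files under a directory using a glob pattern. It assumes that --path_pattern is a glob relative to --image_dir, which the commands above suggest but do not confirm; count_images and the suffix set are hypothetical choices, not part of write_db.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to what the dataset actually contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(image_dir: str, path_pattern: str = "*") -> int:
    """Count image files under image_dir that match a glob pattern."""
    root = Path(image_dir)
    return sum(
        1
        for path in root.glob(path_pattern)
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )
```

For example, count_images("data/datasets/nlvr2/images/train", "*/*") should come out at 103170 if the train download is complete and the assumptions above hold.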
use the following snippet to download the flickr30 images dataset (you can follow instructions here to get your kaggle api key):
# pip install opendatasets
import opendatasets as od
od.download("", "./download_dir")  # first argument: the Kaggle dataset URL
and move flickr30k_images to data/datasets/flickr30k
python cli/prep/ write_flickr30k_db data/datasets/flickr30k/flickr30k_images
Then follow the instructions from the snli-ve repo to get the train, dev, and test splits, and put them in data/datasets/snli-ve
download karpathy split
- flickr30:
├── flickr30k_images
│   ├── 1000092795.jpg
│   └── ...
└── dataset_flickr30k.json
python cli/prep/ convert_dataset data/datasets/flickr30k/dataset_flickr30k.json data/datasets/flickr30k/flickr30k.jsonl
- mscoco:
for images, just use vqa coco-trainval2014.lmdb
python cli/prep/ convert_dataset data/datasets/mscoco/dataset_coco.json data/datasets/mscoco/mscoco.jsonl
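To illustrate what a JSON-to-JSONL conversion of a Karpathy split can look like, here is a minimal sketch. It assumes the input JSON has a top-level "images" list whose entries carry "filename", "split", and a "sentences" list with a "raw" caption each. The output keys (filename, split, captions) are hypothetical and are not necessarily what convert_dataset writes.

```python
import json


def karpathy_to_records(dataset: dict) -> list:
    """Flatten a Karpathy-style split into one record per image."""
    records = []
    for image in dataset["images"]:
        records.append(
            {
                "filename": image["filename"],
                "split": image["split"],
                # Keep only the raw caption text of each sentence.
                "captions": [sentence["raw"] for sentence in image["sentences"]],
            }
        )
    return records


def write_jsonl(records: list, path: str) -> None:
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
```

A typical use would be to load dataset_flickr30k.json with json.load, pass the result to karpathy_to_records, and write the records with write_jsonl.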