This repository contains the code for converting the ImageNet Attributes dataset into a Visual Referring Expression (VRE) dataset.
Original Dataset
The ImageNet Attributes dataset is an addition to the well-known ImageNet dataset. Its authors used Amazon Mechanical Turk to tag the following attributes in the images:
- Color: black, blue, brown, gray, green, orange, pink, red, violet, white, yellow
- Pattern: spotted, striped
- Shape: long, round, rectangular, square
- Texture: furry, smooth, rough, shiny, metallic, vegetation, wooden, wet
I have used a simple yet effective algorithm to convert this into a VRE dataset:
Algorithm
- Create a 3x3 collage of images from the original dataset by scaling each image to a fixed size (say 300 by 200 pixels).
- For each sub-image:
  - Save its bounding box (its boundaries within the collage).
  - Create the referring expression with the following rule:
    the <comma separated attributes> <synset name>
  - Examples are shown below.
This creates 9 referring expressions per collage.
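The two per-image steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the code in create_dataset.py: the function names, the (left, top, right, bottom) box convention, and the row-by-row cell order are assumptions, and the 300 by 200 cell size is simply the example value mentioned in the algorithm.

```python
# Assumed cell size, taken from the "say 300 by 200 pixels" example.
CELL_W, CELL_H = 300, 200
GRID = 3  # 3x3 collage


def cell_boxes(cell_w=CELL_W, cell_h=CELL_H, grid=GRID):
    """Return one (left, top, right, bottom) box per cell, row by row."""
    boxes = []
    for row in range(grid):
        for col in range(grid):
            left, top = col * cell_w, row * cell_h
            boxes.append((left, top, left + cell_w, top + cell_h))
    return boxes


def referring_expression(attributes, synset_name):
    """Apply the rule: the <comma separated attributes> <synset name>."""
    parts = ["the"]
    if attributes:  # an image may carry no positive attributes
        parts.append(", ".join(attributes))
    parts.append(synset_name)
    return " ".join(parts)


def describe_collage(sub_images):
    """Pair each cell box with its expression.

    sub_images is a hypothetical list of (attributes, synset_name) pairs,
    one per cell, in the same row-by-row order as cell_boxes().
    """
    boxes = cell_boxes()
    if len(sub_images) != len(boxes):
        raise ValueError(f"expected {len(boxes)} sub-images, got {len(sub_images)}")
    return [
        {"bbox": box, "expression": referring_expression(attrs, name)}
        for box, (attrs, name) in zip(boxes, sub_images)
    ]
```

For example, `referring_expression(["brown", "furry"], "dog")` yields `the brown, furry dog`, and the centre cell of the collage gets the box `(300, 200, 600, 400)`.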
- Download dataset from here.
- Clone this repository.
- Download dataset from here.
- Run dataset_browser.py to view the images and bounding boxes.
- Clone this repository.
- Download the original dataset from here. [optional]
- Unzip the original dataset in the top-level project directory:
  unzip data.zip
- Run create_dataset.py (make sure the input directory points to ./data).
- This will create the output folder containing vre_dataset.json.
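Once the script has finished, a quick sanity check of the generated file can be useful. The sketch below deliberately assumes nothing about the schema of vre_dataset.json and only reports the top-level type and entry count; the helper name `summarize_json` and the `output/` path are hypothetical, since the exact output location is not spelled out here.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and report its top-level type and size, schema-agnostic."""
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    size = len(data) if isinstance(data, (list, dict)) else None
    return {"type": type(data).__name__, "entries": size}


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever create_dataset.py writes its output.
    target = Path("output") / "vre_dataset.json"
    if target.exists():
        print(summarize_json(target))
    else:
        print(f"{target} not found; run create_dataset.py first or adjust the path")
```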
├── README.md -> you are here
├── __init__.py -> tells Python it's a module
├── create_dataset.py -> used to create the dataset
├── dataset_browser.py -> used to view the dataset
├── imagenet_attributes -> this folder contains the original ImageNet Attributes dataset
│ └── attrann.mat -> original dataset mat file
├── json -> contains all the jsons required to create this dataset
│ ├── attribMap.json -> maps an image to its attributes
│ ├── syntoid.json -> maps a synset to the images contained
│ └── syntoword.json -> maps a synset to the associated words
├── notebooks -> Jupyter Notebooks
│ ├── Dataset Browser.ipynb -> parent notebook of the dataset browser
│ ├── DatasetCreation.ipynb -> parent notebook of create_dataset.py
│ ├── DownloadImages.ipynb -> useful for downloading the images using attrann.mat
│ ├── GetSynnetWords.ipynb -> creates syntoword.json
│ ├── ImageAttrib.ipynb -> creates attribMap.json
│ └── SynToID.ipynb -> creates syntoid.json
└── vre_globals.py -> stores the global dictionaries
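To illustrate how the three JSON maps listed above could fit together, here is a small sketch that resolves an image ID to its attributes and synset word. The dictionary layouts, the helper names, and the placeholder IDs are all assumptions made for illustration; the actual files in json/ may be structured differently.

```python
def invert_synset_index(syn_to_ids):
    """Turn a synset -> [image ids] map into an image id -> synset map."""
    image_to_synset = {}
    for synset, image_ids in syn_to_ids.items():
        for image_id in image_ids:
            image_to_synset[image_id] = synset
    return image_to_synset


def lookup(image_id, attrib_map, image_to_synset, syn_to_word):
    """Return (attributes, synset word) for one image id."""
    synset = image_to_synset[image_id]
    return attrib_map.get(image_id, []), syn_to_word[synset]


# Toy stand-ins for attribMap.json, syntoid.json and syntoword.json.
# The IDs are placeholders, not real ImageNet identifiers.
attrib_map = {"n00000001_1": ["brown", "furry"]}
syn_to_ids = {"n00000001": ["n00000001_1", "n00000001_2"]}
syn_to_word = {"n00000001": "dog"}

image_to_synset = invert_synset_index(syn_to_ids)
attributes, word = lookup("n00000001_1", attrib_map, image_to_synset, syn_to_word)
```

Inverting the synset index once keeps each per-image lookup to a constant-time dictionary access, which helps when many collages are generated.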