This project creates a VRE dataset from the ImageNet attributes dataset.

sahilbadyal/imagenet_to_vre

ImageNet to VRE

Basic Introduction

This repository contains the code for converting the ImageNet attributes dataset to a Visual Referring Expression (VRE) dataset.

Original Dataset

Imagenet Attributes

The ImageNet attributes dataset is a newer addition to the already well-known ImageNet dataset. Its authors used Amazon Mechanical Turk to tag the following attributes in the images:

  • Color: black, blue, brown, gray, green, orange, pink, red, violet, white, yellow
  • Pattern: spotted, striped
  • Shape: long, round, rectangular, square
  • Texture: furry, smooth, rough, shiny, metallic, vegetation, wooden, wet

I have used a simple yet effective algorithm to convert this into a VRE dataset:

Algorithm

  1. Create a 3x3 collage of images from the original dataset by scaling each image to a fixed size (say 300 by 200 pixels).
  2. For each sub-image:
    1. Save its bounding box (or boundaries in the collage).

    2. Create the referring expression using the following rule:

      the <comma separated attributes> <synset name>

Examples are shown below.

This creates 9 referring expressions per collage.
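
The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the code in create_dataset.py: the names `cell_bbox`, `referring_expression`, and `annotate_collage` are hypothetical, the 300 by 200 cell size is the example value from step 1, the row-major cell order is an assumption, and the attributes are assumed to be joined with a comma followed by a space.

```python
from typing import Dict, List, Tuple

CELL_W, CELL_H = 300, 200  # example sub-image size from step 1 (assumed)
GRID = 3                   # 3x3 collage -> 9 sub-images


def cell_bbox(index: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a cell in the collage.

    Cells are numbered 0..8 in row-major order (an assumption).
    """
    if not 0 <= index < GRID * GRID:
        raise ValueError(f"index must be in 0..{GRID * GRID - 1}, got {index}")
    row, col = divmod(index, GRID)
    left, top = col * CELL_W, row * CELL_H
    return (left, top, left + CELL_W, top + CELL_H)


def referring_expression(attributes: List[str], synset_name: str) -> str:
    """Build 'the <comma separated attributes> <synset name>'."""
    parts = ["the"]
    if attributes:  # skip the attribute slot when nothing was tagged
        parts.append(", ".join(attributes))
    parts.append(synset_name)
    return " ".join(parts)


def annotate_collage(items: List[Tuple[List[str], str]]) -> List[Dict]:
    """Pair each of the 9 sub-images with its bounding box and expression.

    `items` holds one (attributes, synset_name) tuple per sub-image.
    """
    if len(items) != GRID * GRID:
        raise ValueError(f"expected {GRID * GRID} items, got {len(items)}")
    return [
        {"bbox": cell_bbox(i), "expression": referring_expression(attrs, name)}
        for i, (attrs, name) in enumerate(items)
    ]
```

For example, `referring_expression(["brown", "furry"], "dog")` gives `the brown, furry dog`, and the centre cell of the collage has the bounding box `(300, 200, 600, 400)`.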

How to use, view, and replicate this work

Use the dataset (a Dropbox account is required; create one if you do not have one, it's free)

  1. Download dataset from here.

View

  1. Clone this repository

  2. Download dataset from here.

  3. Run dataset_browser.py to view the images and bounding boxes.

Replicate

  1. Clone this repository

  2. Download the original dataset from here. [optional]

  3. Unzip the original dataset in the top level project directory.

    unzip data.zip

  4. Run create_dataset.py (make sure the input directory points to ./data).

  5. This creates the output folder containing vre_dataset.json.

Directory structure and files

├── README.md -> you are here
├── __init__.py -> tells Python this directory is a package
├── create_dataset.py -> used to create the dataset
├── dataset_browser.py -> used to view the dataset
├── imagenet_attributes -> this folder contains the original ImageNet attributes dataset
│   └── attrann.mat -> original dataset mat file
├── json -> contains all the jsons required to create this dataset
│   ├── attribMap.json -> maps an image to its attributes
│   ├── syntoid.json -> maps a synset to the images contained
│   └── syntoword.json -> maps a synset to the associated words
├── notebooks -> Jupyter Notebooks
│   ├── Dataset Browser.ipynb -> parent notebook of the dataset browser
│   ├── DatasetCreation.ipynb -> parent notebook of create_dataset.py
│   ├── DownloadImages.ipynb -> useful for downloading the images using attrann.mat
│   ├── GetSynnetWords.ipynb -> creates syntoword.json
│   ├── ImageAttrib.ipynb -> creates attribMap.json
│   └── SynToID.ipynb -> creates syntoid.json
└── vre_globals.py -> stores the global dictionaries
