This repository has been archived by the owner on May 7, 2018. It is now read-only.

zhiyong1997/Semantic-Alignment-for-Hierarchical-Image-Captioning

Abstract

Inspired by recent progress in hierarchical reinforcement learning and adversarial text generation, we introduce a hierarchical, adversarial, attention-based model that generates natural-language descriptions of images. During caption generation, the model automatically learns to align its attention over the image with subgoal vectors. We describe how to train, use, and interpret the model, and report its performance on Flickr8k. We also visualize the subgoal vectors and the attention over images during generation.

Authors

Sidi Lu, Zhiyong Fang, Peiyao Sheng

Demo

[Demo image]

Code

We provide source code on [GitHub](https://github.com/zhiyong1997/Semantic-Alignment-for-Hierarchical-Image-Captioning), including:
1. Training and testing code.
2. A visualization tool for the attention mechanism.

Sample Usage

Our model can handle the COCO, Flickr8k, and Flickr30k datasets. For simplicity, we only present Flickr8k here.

1. Create the folder ./code/dataset

2. Download the processed Flickr8k image captioning dataset from here with key: sh4u

3. Unzip the downloaded file into ./code/dataset/

4. Download the ResNet-50 model file into ./code/saved_model/ from here with key: h712

5. Run ./code/main.py with python3
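
The local parts of the steps above (creating the folders, unzipping the dataset, and launching the script) can be sketched in Python as follows. This is only an illustrative helper and is not part of the repository: the function names `prepare_workspace` and `run_main`, the `archive` argument, and the choice to launch from the repository root are all assumptions. The two downloads still have to be done by hand with the keys given above.

```python
import subprocess
import sys
import zipfile
from pathlib import Path


def prepare_workspace(root, archive=None):
    """Create ./code/dataset and ./code/saved_model under `root`.

    `archive` is a hypothetical path to the manually downloaded Flickr8k
    zip file; if given, it is extracted into ./code/dataset/.
    """
    dataset_dir = Path(root) / "code" / "dataset"
    model_dir = Path(root) / "code" / "saved_model"
    dataset_dir.mkdir(parents=True, exist_ok=True)
    model_dir.mkdir(parents=True, exist_ok=True)
    if archive is not None:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dataset_dir)
    return dataset_dir, model_dir


def run_main(root):
    """Run ./code/main.py with the current Python 3 interpreter.

    Assumption: the script is launched from the repository root.
    """
    return subprocess.run(
        [sys.executable, str(Path("code") / "main.py")],
        cwd=root,
        check=True,
    )
```

For example, calling `prepare_workspace(".", "flickr8k.zip")` from the repository root (with `flickr8k.zip` standing in for whatever name the downloaded file has), placing the ResNet-50 model file in ./code/saved_model/, and then calling `run_main(".")` would mirror the five steps.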

Paper

Our paper is available here

Bibtex

@misc{Lu2018SemanticAlignment,
  title={Semantic Alignment for Hierarchical Image Captioning},
  author={Lu, Sidi and Fang, Zhiyong and Sheng, Peiyao},
  year={2018},
  howpublished={\url{https://github.com/zhiyong1997/Semantic-Alignment-for-Hierarchical-Image-Captioning}}
}

Example Result
