AAAI 2024 Accepted Paper Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training

Artanic30/MacCap

MacCap

Code for the AAAI 2024 paper "Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training".

Overview

Setup

First, download and set up the repo:

git clone https://github.com/Artanic30/MacCap
cd MacCap
conda env create -f environment.yml
conda activate MacCap

Data preparation

Download coco_train and cc3m_train, and place both in the data directory.
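
Before starting a training run, it can help to confirm that both downloads landed in the right place. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and because this README does not state the file extensions, it matches entries in `data` by name prefix only.

```python
from pathlib import Path


def find_missing(data_dir, names=("coco_train", "cc3m_train")):
    """Return the expected names that have no matching entry in data_dir.

    An entry matches when its file name starts with the expected name.
    Prefix matching is an assumption, since the exact file names and
    extensions are not given in this README.
    """
    root = Path(data_dir)
    # A missing directory is treated the same as an empty one.
    present = [p.name for p in root.iterdir()] if root.is_dir() else []
    return [n for n in names if not any(e.startswith(n) for e in present)]


if __name__ == "__main__":
    missing = find_missing("data")
    if missing:
        print("Missing from data/:", ", ".join(missing))
    else:
        print("All expected training files found.")
```

Run it from the repository root so that the relative path `data` resolves correctly.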

Training

./train_coco.sh

or

./train_cc3m.sh

Evaluation

Follow the instructions here to evaluate the generated captions.

Citation

@article{qiu2024mining,
  title={Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training},
  author={Qiu, Longtian and Ning, Shan and He, Xuming},
  journal={arXiv preprint arXiv:2401.02347},
  year={2024}
}

Acknowledgments

This repository is heavily based on ClipCap and DeCap. For training, we used data from the COCO dataset and Conceptual Captions.

Release Schedule

  • Initial Code release
  • Detail Document
  • Data Preparation
  • Training and Evaluation Scripts
  • Checkpoints
