This repository provides a PyTorch implementation of Text2Colors. Text2Colors produces plausible colors (a color palette) from text input of variable length and colorizes a grayscale image based on those colors.
Text2Colors: Guiding Image Colorization through Text-Driven Palette Generation
Wonwoong Cho*1, Hyojin Bahng*1, David K. Park*1, Seungjoo Yoo*1, Ziming Wu2, Xiaojuan Ma2, and Jaegul Choo1
*These authors contributed equally and are presented in random order.
1Korea University 2Hong Kong University of Science and Technology
Overview of our Text2Colors architecture. During training, generator G0 learns to produce a color palette (y hat) given a set of conditional variables (c hat) processed from input text x = {x1, ···, xT}. Generator G1 learns to predict a colorized output of a grayscale image (L) given a palette (p) extracted from the ground truth image. At test time, the trained generators G0 and G1 are used to produce a color palette from given text, and then colorize a grayscale image reflecting the generated palette.
The model architecture of generator G0, which produces the t-th color in the palette given an input text x = {x1, ···, xT}. Note that randomness is added to each hidden state vector h in the sequence before it is passed to the generator.
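The noise step in the caption above can be illustrated with a minimal sketch. It assumes zero-mean Gaussian noise with a fixed standard deviation; the caption only says that randomness is added, so the noise form, the `add_noise` helper, and the `std` parameter are hypothetical and not taken from this repository's code.

```python
import random


def add_noise(hidden_states, std=0.1, rng=None):
    """Return a noisy copy of each hidden state vector.

    Assumption: zero-mean Gaussian noise with a fixed standard deviation.
    The caption only states that randomness is added to each hidden state.
    """
    if rng is None:
        rng = random.Random()
    return [[value + rng.gauss(0.0, std) for value in h] for h in hidden_states]


# Hypothetical hidden states for a three-word input, each of dimension 4.
hidden = [[0.0, 0.5, -0.5, 1.0] for _ in range(3)]
noisy = add_noise(hidden, std=0.1, rng=random.Random(0))
```

Because the noise is resampled on every call, the same text can yield different conditioning vectors, which is one plausible way for a single description to map to more than one palette.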
We release our manually curated dataset, Palette-and-Text (PAT). PAT contains 10,183 pairs of text and five-color palettes, where the five colors in each palette are associated with the corresponding text description, as shown in Figs. 2(b)-(d). The text descriptions are made up of 4,312 unique words. The words vary in how they relate to colors: some are direct color words (e.g. pink, blue), while others evoke a particular set of colors (e.g. autumn or vibrant).
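As an illustration of the pair structure described above, here is a minimal sketch of one PAT entry as a Python record. The `PaletteTextPair` class, the 8-bit RGB encoding, and the sample color values are assumptions made for this example; the actual file format of the released dataset may differ.

```python
from dataclasses import dataclass
from typing import Tuple

RGB = Tuple[int, int, int]


@dataclass(frozen=True)
class PaletteTextPair:
    """One text description paired with a five-color palette (hypothetical layout)."""
    text: str
    palette: Tuple[RGB, ...]

    def __post_init__(self):
        # PAT palettes have exactly five colors.
        if len(self.palette) != 5:
            raise ValueError("a PAT palette has exactly five colors")
        # Assumption: colors are stored as 8-bit RGB triples.
        for color in self.palette:
            if len(color) != 3 or not all(0 <= c <= 255 for c in color):
                raise ValueError(f"invalid RGB color: {color}")


def vocabulary(pairs):
    """Collect the set of unique lowercase words across all text descriptions."""
    return {word for pair in pairs for word in pair.text.lower().split()}


# Made-up entries for illustration only; these are not taken from PAT.
samples = [
    PaletteTextPair(
        "pink blue",
        ((255, 192, 203), (255, 105, 180), (135, 206, 235), (70, 130, 180), (25, 25, 112)),
    ),
    PaletteTextPair(
        "autumn",
        ((139, 69, 19), (205, 133, 63), (210, 105, 30), (184, 134, 11), (128, 0, 0)),
    ),
]
unique_words = vocabulary(samples)
```

Applied to the full dataset, a function like `vocabulary` is one way to check the unique-word count reported above.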
Statistics and samples of the PAT dataset: (a) the number of data items with respect to their text lengths. On the right are examples that show diverse text-palette pairs in PAT. The text descriptions matching their palettes include (b) direct color names, (c) texts with a relatively low level of semantic relation to colors, and (d) texts with a high-level semantic context.
If you use the PAT dataset in your research, please cite our paper.
(bibtex)
$ git clone https://github.com/awesome-davian/Text2Colors.git
$ cd Text2Colors/
$ bash install_pre.sh
$ python train_text2pal.py
$ python train_pal2color.py
*Wah, Catherine, et al. "The caltech-ucsd birds-200-2011 dataset." (2011).
If this work is useful for your research, please cite our paper.
(bibtex)