AesNet

[TPAMI] Multi-modality Multi-attribute Contrastive Pre-training for Image Aesthetics Computing

Give basic visual models stronger aesthetic computing capabilities. Try it now!

If you like this work, please give us a star ⭐ on GitHub.

Introduction


In the Image Aesthetics Computing (IAC) field, most prior methods leverage off-the-shelf backbones pre-trained on the large-scale ImageNet database. While these pre-trained backbones have achieved notable success, they tend to overemphasize object-level semantics and fail to capture the high-level concepts of image aesthetics, which can lead to suboptimal performance. To tackle this long-neglected problem, we propose a multi-modality multi-attribute contrastive pre-training framework, aiming to construct an alternative to ImageNet-based pre-training for IAC. The proposed framework consists of two main components. (1) We build a multi-attribute image description database with human feedback, leveraging the strong image-understanding capability of multi-modality large language models to generate rich aesthetic descriptions. (2) To better adapt models to aesthetic computing tasks, we integrate the image-based visual features with the attribute-based text features and map the integrated features into different embedding spaces, on which multi-attribute contrastive learning is performed to obtain a more comprehensive aesthetic representation. To alleviate the distribution shift encountered when transitioning from the general visual domain to the aesthetic domain, we further propose a semantic affinity loss that constrains the content information and enhances model generalization. Extensive experiments demonstrate that the proposed framework sets new state-of-the-art results on IAC tasks.
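
The official code and dataset are not released yet (see Dataset & Code below), so the following is only a minimal PyTorch sketch of the multi-attribute contrastive idea, not the authors' implementation. It assumes a visual backbone and a text encoder already produce per-sample features, and it gives each aesthetic attribute its own embedding space with an InfoNCE-style loss between image and attribute-description embeddings. The class name, dimensions, and single-linear projection heads (e.g. MultiAttributeContrastiveLoss, num_attributes) are illustrative assumptions, and the feature-fusion step and semantic affinity loss described above are omitted for brevity.

# Conceptual sketch only -- NOT the official AesNet implementation (code not yet released).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiAttributeContrastiveLoss(nn.Module):
    """Per-attribute embedding spaces + InfoNCE between image and attribute-description features."""
    def __init__(self, img_dim=768, txt_dim=512, embed_dim=256, num_attributes=4, temperature=0.07):
        super().__init__()
        # one projection head per aesthetic attribute, giving each attribute its own embedding space
        self.img_proj = nn.ModuleList([nn.Linear(img_dim, embed_dim) for _ in range(num_attributes)])
        self.txt_proj = nn.ModuleList([nn.Linear(txt_dim, embed_dim) for _ in range(num_attributes)])
        self.temperature = temperature

    def forward(self, img_feat, txt_feats):
        """img_feat: (B, img_dim) backbone features; txt_feats: (K, B, txt_dim), one description per attribute."""
        losses = []
        for k, (p_img, p_txt) in enumerate(zip(self.img_proj, self.txt_proj)):
            z_v = F.normalize(p_img(img_feat), dim=-1)        # image embedding in attribute-k space
            z_t = F.normalize(p_txt(txt_feats[k]), dim=-1)    # description embedding in attribute-k space
            logits = z_v @ z_t.t() / self.temperature         # (B, B) cosine similarities
            targets = torch.arange(logits.size(0), device=logits.device)
            # symmetric InfoNCE: matching image/description pairs lie on the diagonal
            losses.append(0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)))
        return torch.stack(losses).mean()

# Example: 4 aesthetic attributes, batch of 8 images with one description per attribute
# loss = MultiAttributeContrastiveLoss()(torch.randn(8, 768), torch.randn(4, 8, 512))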

🎯 Dataset & Code

Coming soon.

Citation

If you find our work interesting, please feel free to cite our paper:

@article{AesNet,
  author={Huang, Yipo and Li, Leida and Chen, Pengfei and Wu, Haoning and Lin, Weisi and Shi, Guangming},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={Multi-Modality Multi-Attribute Contrastive Pre-Training for Image Aesthetics Computing}, 
  year={2024},
  pages={1-14},
  doi={10.1109/TPAMI.2024.3492259}}
