AesNet

[TPAMI] Multi-modality Multi-attribute Contrastive Pre-training for Image Aesthetics Computing

Give basic visual models stronger aesthetic computing capabilities. Try it now!

If you like this work, please give us a star ⭐ on GitHub.

Introduction


In the Image Aesthetics Computing (IAC) field, most prior methods leverage off-the-shelf backbones pre-trained on the large-scale ImageNet database. While these pre-trained backbones have achieved notable success, they tend to overemphasize object-level semantics and fail to capture the high-level concepts of image aesthetics, which can lead to suboptimal performance. To tackle this long-neglected problem, we propose a multi-modality multi-attribute contrastive pre-training framework, aiming to construct an alternative to ImageNet-based pre-training for IAC. The proposed framework consists of two main components. (1) We build a multi-attribute image description database with human feedback, leveraging the strong image-understanding capability of multi-modality large language models to generate rich aesthetic descriptions. (2) To better adapt models to aesthetic computing tasks, we integrate the image-based visual features with the attribute-based text features and map the integrated features into different embedding spaces, on which multi-attribute contrastive learning is performed to obtain a more comprehensive aesthetic representation. To alleviate the distribution shift encountered when transitioning from the general visual domain to the aesthetic domain, we further propose a semantic affinity loss that constrains the content information and enhances model generalization. Extensive experiments demonstrate that the proposed framework sets new state-of-the-art results on IAC tasks.
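
The official code and dataset are not released yet (see Dataset & Code below), so the following is only a minimal PyTorch sketch of the multi-attribute contrastive idea, not the authors' implementation. It assumes a visual backbone and a text encoder already produce per-sample features, and it gives each aesthetic attribute its own embedding space with an InfoNCE-style loss between image and attribute-description embeddings. The class name, dimensions, and single-linear projection heads (e.g. MultiAttributeContrastiveLoss, num_attributes) are illustrative assumptions, and the feature-fusion step and semantic affinity loss described above are omitted for brevity.

# Conceptual sketch only -- NOT the official AesNet implementation (code not yet released).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiAttributeContrastiveLoss(nn.Module):
    """Per-attribute embedding spaces + InfoNCE between image and attribute-description features."""
    def __init__(self, img_dim=768, txt_dim=512, embed_dim=256, num_attributes=4, temperature=0.07):
        super().__init__()
        # one projection head per aesthetic attribute, giving each attribute its own embedding space
        self.img_proj = nn.ModuleList([nn.Linear(img_dim, embed_dim) for _ in range(num_attributes)])
        self.txt_proj = nn.ModuleList([nn.Linear(txt_dim, embed_dim) for _ in range(num_attributes)])
        self.temperature = temperature

    def forward(self, img_feat, txt_feats):
        """img_feat: (B, img_dim) backbone features; txt_feats: (K, B, txt_dim), one description per attribute."""
        losses = []
        for k, (p_img, p_txt) in enumerate(zip(self.img_proj, self.txt_proj)):
            z_v = F.normalize(p_img(img_feat), dim=-1)        # image embedding in attribute-k space
            z_t = F.normalize(p_txt(txt_feats[k]), dim=-1)    # description embedding in attribute-k space
            logits = z_v @ z_t.t() / self.temperature         # (B, B) cosine similarities
            targets = torch.arange(logits.size(0), device=logits.device)
            # symmetric InfoNCE: matching image/description pairs lie on the diagonal
            losses.append(0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)))
        return torch.stack(losses).mean()

# Example: 4 aesthetic attributes, batch of 8 images with one description per attribute
# loss = MultiAttributeContrastiveLoss()(torch.randn(8, 768), torch.randn(4, 8, 512))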

🎯 Dataset & Code

Coming soon.

Citation

If you find our work interesting, please feel free to cite our paper:

@article{AesNet,
  author={Huang, Yipo and Li, Leida and Chen, Pengfei and Wu, Haoning and Lin, Weisi and Shi, Guangming},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={Multi-Modality Multi-Attribute Contrastive Pre-Training for Image Aesthetics Computing}, 
  year={2024},
  pages={1-14},
  doi={10.1109/TPAMI.2024.3492259}}
