
kartikaykaushik14/AttackHatefulMemes


As part of this project, we attack a multimodal classification model on the Hateful Memes dataset. You can download the dataset by running download_dataset.py. The objective is to minimize classification accuracy by adversarially modifying the image and/or text of memes, maximizing misclassification in both directions: hateful memes classified as non-hateful, and vice versa.
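For illustration, here is a minimal sketch of an untargeted, gradient-based image perturbation (FGSM-style), assuming a PyTorch multimodal classifier. The fgsm_perturb helper, the eps value, and the model's (image, text_ids) interface are hypothetical stand-ins for the actual logic in the repo's attack.py scripts.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, text_ids, label, eps=8 / 255):
    """One signed-gradient step on the image that increases the classification loss."""
    image = image.clone().detach().requires_grad_(True)
    logits = model(image, text_ids)          # multimodal forward pass (hypothetical signature)
    loss = F.cross_entropy(logits, label)    # loss w.r.t. the true label
    loss.backward()
    adv = image + eps * image.grad.sign()    # step along the gradient sign to raise the loss
    return adv.clamp(0.0, 1.0).detach()      # keep pixels in a valid range
```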

To run the code in a conda environment, create an environment with the packages listed in requirements.txt:

  1. Create a conda environment: conda env create --file requirements.txt
  2. To run image perturbations, run the code in the img_perturbation folder.
  3. To run text perturbations, run the code in the text_perturbation folder (a minimal text-perturbation sketch follows this list).
  4. To run combined text and image perturbations, also use the code in the text_perturbation folder.
  5. Adversarial retraining can be done through the code in the adv_retrain folder.
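As referenced in item 3, here is a minimal sketch of one possible text perturbation: character-level homoglyph substitution. The perturb_text helper and the HOMOGLYPHS table are hypothetical and do not necessarily match the strategies implemented in the text_perturbation folder.

```python
import random

# Latin -> Cyrillic look-alikes: visually identical, but different characters/token ids
HOMOGLYPHS = {"a": "а", "e": "е", "o": "о", "c": "с"}

def perturb_text(caption: str, rate: float = 0.2, seed: int = 0) -> str:
    """Swap a fraction of swappable characters for look-alike glyphs."""
    rng = random.Random(seed)
    return "".join(
        HOMOGLYPHS[ch] if ch in HOMOGLYPHS and rng.random() < rate else ch
        for ch in caption
    )

print(perturb_text("this meme is totally harmless"))
```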

If you want to load the adversarially retrained or baseline models, you can download the checkpoint files from here. To run an attack in any of the folders above, use the command python attack.py
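A minimal sketch of loading a downloaded checkpoint, assuming the files are standard PyTorch checkpoints; the load_checkpoint helper and the "state_dict" key handling are hypothetical.

```python
import torch

def load_checkpoint(model: torch.nn.Module, path: str) -> torch.nn.Module:
    """Load weights from either a plain state dict or a {'state_dict': ...} checkpoint."""
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state.get("state_dict", state))
    return model.eval()
```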

Make sure you download the perturbed images from here and put them in a folder called annotations inside the original dataset, or generate that folder yourself by running perturb_image.py in the img_perturbation folder.

Make sure you set the correct paths on lines 63 and 67 in each of the attack.py files.
