You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Image: We here use minigpt4 model as example to show, for other two models, please replace the name.
1: Use diffution generator to generate the images and save into directory.
2: Use minigpt_adversarial_generation.py to create adversarial images for any images in the directory.
3. Use minigpt_real_our.py to generate the responses for each prompt and images. Here we refers to use https://allenai.org/data/real-toxicity-prompts as our prompts datasets,hence save the results to jsonl file
4. Use get_metric.py with perspective api to generate evaluation
5. Using cal_score_acc.py to get the retention score and ASR.
6: For API evaluation, use gemini_evaluation.py and gpt4v_evaluation.py
4: use minigpt_noise_attack.py to generate adv responses.
5. Use adv_bench_evaluation_llama_70B.py to evaluate the score for each responses.
6. Use adv_bench_keyword_evaluation.py to get ASR
7. use extract_score to calculate the Retention Score.
8. For pure LLM, please refers to test_transformer.py and replace the model and tokennizer to vicuna-v0, v1.1, llama-13B, and use similar step for the ASR and score.