This is the code repository for our paper: "An Engorgio Prompt Makes Large Language Model Babble on".
This repository provides tools and code for:
- Crafting and testing Engorgio prompts
- Experimenting with different optimization/design configurations
- Evaluating average length and rate metrics for specific prompts
Citation: If you find our project useful, please consider citing: 🤗🤗🤗
```bibtex
@misc{dong2024engorgio,
  title={An Engorgio Prompt Makes Large Language Model Babble on},
  author={Jianshuo Dong and Ziyuan Zhang and Qingjie Zhang and Han Qiu and Tianwei Zhang and Hao Wang and Hewu Li and Qi Li and Chao Zhang and Ke Xu},
  year={2024},
  url={https://arxiv.org/abs/2412.19394},
}
```
Auto-regressive large language models (LLMs) have yielded impressive performance in many real-world tasks. However, the new paradigm of these LLMs also exposes novel threats. In this paper, we explore their vulnerability to inference cost attacks, where a malicious user crafts Engorgio prompts to intentionally increase the computation cost and latency of the inference process. We design Engorgio, a novel methodology, to efficiently generate adversarial Engorgio prompts that affect the target LLM's service availability. Engorgio makes two technical contributions. (1) We employ a parameterized distribution to track the LLM's prediction trajectory. (2) Targeting the auto-regressive nature of LLMs' inference process, we propose novel loss functions to stably suppress the appearance of the <EOS> token, whose occurrence will interrupt the LLM's generation process. We conduct extensive experiments on 13 open-source LLMs with parameters ranging from 125M to 30B. The results show that Engorgio prompts can successfully induce LLMs to generate abnormally long outputs (i.e., roughly 2-13X longer, reaching 90%+ of the output length limit) in a white-box scenario, and our real-world experiment demonstrates Engorgio's threat to LLM services with limited computing resources.
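To make the second contribution concrete, here is a minimal PyTorch sketch of the <EOS>-suppression idea, not the repository's actual implementation: `eos_escape_loss` and `theta` are illustrative names, and the shapes (and GPT-2's vocabulary size) are assumptions based on the description above.

```python
import torch
import torch.nn.functional as F

def eos_escape_loss(logits: torch.Tensor, eos_token_id: int) -> torch.Tensor:
    """Illustrative loss: penalize the probability mass the model assigns
    to <EOS> at every position, so generation is less likely to stop.

    logits: (seq_len, vocab_size) next-token logits from the target LLM.
    """
    probs = F.softmax(logits, dim=-1)     # per-position token distribution
    return probs[:, eos_token_id].mean()  # average <EOS> probability

# The trigger is tracked as a parameterized distribution: a learnable matrix
# theta of shape (trigger_len, vocab_size) whose row-wise softmax yields a
# soft token distribution that gradient descent can optimize directly.
trigger_len, vocab_size = 32, 50257  # 32 matches --trigger_token_length below
theta = torch.zeros(trigger_len, vocab_size, requires_grad=True)
soft_trigger = F.softmax(theta, dim=-1)  # (trigger_len, vocab_size)
```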
The project requires anaconda/miniconda. Set up your environment using:
```bash
conda env create -f environment.yml
```
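After creation, activate the environment before running anything; the name below is an assumption (check the `name:` field of `environment.yml`):

```bash
conda activate engorgio  # replace "engorgio" with the name: field of environment.yml
```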
You can obtain model checkpoints in two ways:
- Automatic Download: The Python script supports automatic downloading of selected models from the Hugging Face Hub.
- Manual Download: Download checkpoints manually and configure the paths in `ica_utils/model.py`:

  ```python
  AutoTokenizer.from_pretrained('/PATH/TO/THE/FILES')
  AutoModel.from_pretrained('/PATH/TO/THE/FILES')
  ```
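For the automatic route, passing a Hub model id to the same calls works; a minimal sketch using `gpt2` (the model from the example command below):

```python
from transformers import AutoModel, AutoTokenizer

# Passing a Hub model id instead of a local path triggers an automatic
# download (and caching) of the checkpoint from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
```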
Run the script on your local server:
```bash
CUDA_VISIBLE_DEVICES=0 python ./ica_code.py \
    --model gpt2 \
    --log_interval 200 \
    --num_iters 3000 \
    --opt_alpha 1 \
    --loss_opt 1 2 \
    --esc_loss_version 0 \
    --trigger_token_length 32 \
    --max_length 2048 \
    --initial_coeff 0 \
    --bs 8 \
    --sample_time 80 \
    --load_in_8bit \
    --seed 123
```
| Parameter | Description |
|---|---|
| `model` | Target model in the white-box scenario |
| `lr` | Learning rate for the optimization |
| `num_iters` | Number of optimization iterations (without the checker) |
| `opt_alpha` | λ coefficient in the optimization objective |
| `loss_opt` | Loss term selection (1: self-mentor, 2: EOS escape, 1 2: both) |
| `esc_loss_version` | EOS escape loss configuration (0: with trigger, 1: without) |
| `trigger_token_length` | Length of the trigger sequence |
| `max_length` | Maximum length of the generation process |
| `initial_coeff` | Coefficient for the random initialization of the distribution matrix |
| `bs` | Batch size for inference (recommended: 8 for 7B models on a single A100) |
| `sample_time` | Number of generation runs used for metrics calculation |
| `load_in_8bit` | Enable int8 quantization for large models |
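The README does not spell out how the average length and rate metrics are computed; the helper below is a hypothetical sketch (the name `length_metrics` and the 90% threshold are assumptions based on the abstract, not the repository's code):

```python
import torch

def length_metrics(output_lengths, max_length=2048, threshold=0.9):
    """Hypothetical helper: average output length, plus the fraction of
    samples whose outputs reach threshold * max_length (cf. the paper's
    "90%+ of the output length limit")."""
    lengths = torch.tensor(output_lengths, dtype=torch.float)
    avg_len = lengths.mean().item()
    rate = (lengths >= threshold * max_length).float().mean().item()
    return avg_len, rate

# e.g. token counts measured over `sample_time` generations:
avg_len, rate = length_metrics([2048, 2048, 1730, 512], max_length=2048)
print(f"avg length = {avg_len:.1f}, rate = {rate:.2f}")
```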