Source code of the paper "When Neural Code Completion Models Size up the Situation: Attaining Cheaper and Faster Completion through Dynamic Model Inference"
Requirements: transformers == 4.22.2
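A quick way to confirm the environment matches the pin above (a minimal sketch; it only assumes the transformers version stated in this README):

```python
# check_env.py -- sanity check for the pinned dependency.
import transformers

# The experiments expect transformers 4.22.2; other versions may change
# tokenizer or generation behaviour, so warn if the pin is not met.
expected = "4.22.2"
if transformers.__version__ != expected:
    print(f"Warning: transformers {transformers.__version__} is installed, "
          f"but {expected} is expected.")
else:
    print("transformers version OK.")
```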
Download CSN and COFIC into the datasets folder. The datasets are available at CSN (Python) and COFIC.
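Both datasets are distributed as JSONL files. The sketch below shows one way to inspect a downloaded sample; the field name "code" is an assumption for illustration and may differ between CSN and COFIC, so print the keys first to see what each record actually contains:

```python
# inspect_dataset.py -- peek at one record from a downloaded JSONL file.
# The "code" field name is an assumption; the path matches the Java command below.
import json

path = "datasets/java/COFIC.jsonl"  # or a CSN file under datasets/python/final/jsonl

with open(path, "r", encoding="utf-8") as f:
    first = json.loads(next(f))     # read only the first record

print(sorted(first.keys()))         # show which fields the record really has
print(first.get("code", "")[:200])  # print a snippet if a "code" field exists
```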
Download the pre-trained GPT-2 model from here and put it into the cached folder.
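Once the checkpoint is in place, it can be loaded directly from the cache directory. A minimal sketch, assuming the ./cached/gpt2 path used by the finetuning commands below:

```python
# load_gpt2.py -- sketch: load the pre-trained GPT-2 checkpoint from the cache folder.
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

cache_path = "./cached/gpt2"  # matches --cache_path in the finetuning commands
tokenizer = GPT2TokenizerFast.from_pretrained(cache_path)
model = GPT2LMHeadModel.from_pretrained(cache_path)
print(f"Loaded GPT-2 with {model.num_parameters():,} parameters.")
```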
The GPT-2 model needs to be finetuned on the dataset before it can be used for code completion. The command for finetuning GPT-2 on CSN (Python) is as follows:
python finetune.py --batch_size 8 --run_name python_gpt2 --epoch 10 --text_length 256 --cache_path ./cached/gpt2 --data_path datasets/python/final/jsonl --language python --output_path ./cached
The command for finetuning GPT-2 on COFIC (Java) is as follows:
python train_gpt2.py --batch_size 8 --run_name java_gpt2 --epoch 10 --text_length 256 --cache_path ./cached/gpt2 --data_path ./datasets/java/COFIC.jsonl --language java --output_path ./cached
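After finetuning finishes, the saved checkpoint (written under --output_path) can be used for left-to-right code completion. The sketch below assumes the checkpoint directory is named after --run_name (e.g. ./cached/python_gpt2); adjust the path to whatever the finetuning script actually writes:

```python
# complete.py -- sketch: greedy code completion with a finetuned checkpoint.
# The checkpoint path is an assumption derived from --output_path and --run_name.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

ckpt = "./cached/python_gpt2"
tokenizer = GPT2TokenizerFast.from_pretrained(ckpt)
model = GPT2LMHeadModel.from_pretrained(ckpt).eval()

prompt = "def read_json(path):\n    "
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=32,
        do_sample=False,                      # greedy decoding for a deterministic demo
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```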
Download the pre-trained CodeGen model from here and put it into the cached folder.
The CodeGen model does not need further finetuning.
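Since CodeGen is used as-is, it only needs to be loaded from the cached folder. A minimal sketch, assuming the checkpoint is unpacked under ./cached/codegen (the exact directory name depends on how you store the download):

```python
# load_codegen.py -- sketch: load the pre-trained CodeGen checkpoint without finetuning.
# "./cached/codegen" is an assumed directory name; point it at the unpacked download.
from transformers import AutoModelForCausalLM, AutoTokenizer

codegen_path = "./cached/codegen"
tokenizer = AutoTokenizer.from_pretrained(codegen_path)
model = AutoModelForCausalLM.from_pretrained(codegen_path).eval()
print(f"Loaded CodeGen with {model.num_parameters():,} parameters.")
```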
We have prepared scripts for running the experiments in the paper; they are in the cmds folder.
Please follow the steps and the instructions in the scripts to run the experiments.