Source code of the paper "When Neural Code Completion Models Size up the Situation: Attaining Cheaper and Faster Completion through Dynamic Model Inference"

Requirements

transformers == 4.22.2
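The dependency can be installed with pip, for example:

pip install transformers==4.22.2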

Datasets

Download CSN and COFIC into the datasets folder. The datasets are available at CSN (Python) and COFIC.
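As an optional sanity check after downloading, the following minimal Python sketch prints the fields of the first COFIC record; only the datasets/java/COFIC.jsonl path (taken from the fine-tuning command below) is assumed, not the field names.

# Minimal sketch: inspect the first record of the downloaded COFIC file.
import json

with open("datasets/java/COFIC.jsonl", "r", encoding="utf-8") as f:
    first_record = json.loads(f.readline())

print(sorted(first_record.keys()))  # list the fields each sample carries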

LCMs

GPT-2

Download the pre-trained GPT-2 model from here and put it into the cached folder. The model needs to be fine-tuned on the dataset before it can be used for code completion. The command for fine-tuning GPT-2 on CSN (Python) is as follows:

python finetune.py --batch_size 8 --run_name python_gpt2 --epoch 10 --text_length 256 --cache_path ./cached/gpt2 --data_path datasets/python/final/jsonl --language python --output_path ./cached

The command for fine-tuning GPT-2 on COFIC (Java) is as follows:

python train_gpt2.py --batch_size 8 --run_name java_gpt2 --epoch 10 --text_length 256 --cache_path ./cached/gpt2 --data_path ./datasets/java/COFIC.jsonl --language java --output_path ./cached
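For reference, loading the cached GPT-2 checkpoint with transformers 4.22 typically looks like the sketch below. This is an illustrative sketch under the paths used above (--cache_path ./cached/gpt2, --output_path ./cached), not the repository's finetune.py or train_gpt2.py.

# Illustrative sketch: load the cached GPT-2 checkpoint as a causal LM for fine-tuning.
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("./cached/gpt2")
model = GPT2LMHeadModel.from_pretrained("./cached/gpt2")
model.train()  # the fine-tuned weights are then saved under --output_path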

CodeGen

Download the pre-trained CodeGen model from here and put it into the cached folder. The model does not need further fine-tuning.
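Once the checkpoint is in place, a completion can be sampled directly with transformers. The sketch below is illustrative; the ./cached/codegen directory name is an assumption and should be replaced with wherever the downloaded model was placed.

# Illustrative sketch: run the pre-trained CodeGen checkpoint for code completion.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./cached/codegen")
model = AutoModelForCausalLM.from_pretrained("./cached/codegen")

prompt = "def add(a, b):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))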

Run

We provide scripts for running the experiments in the paper in the cmds folder. Please follow the steps and instructions in the scripts to run the experiments.