This repository contains replication code for the paper "Enriching Source Code with Contextual Data for Code Completion Models: An Empirical Study".
Ensure you have Node.js and Python 3 installed. Then install the dependencies:

```sh
pip install -r requirements.txt
npm install
```
The dataset can be retrieved from Zenodo and should be extracted to the `./data` folder of this repository.
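For example, assuming the archive was downloaded from Zenodo as `dataset.zip` (a placeholder name; the actual file name may differ):

```sh
# Extract the Zenodo archive into ./data ("dataset.zip" is a placeholder name).
mkdir -p ./data
unzip dataset.zip -d ./data
```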
The two main files to run for replication are `create-datasets.sh` and `evaluate.sh`. Both should be run from the root directory of the project (i.e. directly in the `Replication-Code` folder).
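For example, assuming a POSIX shell:

```sh
cd Replication-Code      # the project root
bash create-datasets.sh  # build all dataset variants
bash evaluate.sh         # post-process and evaluate predictions
```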
`create-datasets.sh` does the following:
- Copy the dataset and add marker comments (`/*marker:number*/`; see the illustration after this list)
- Copy the marked dataset, install third-party dependencies, and add type annotations
- Determine which projects had all dependencies installed successfully
- Copy the marked dataset and remove all type annotations
- Analyze the dataset (#LOC, #Files, Type Explicitness)
- Create train/test/validation files for consumption by UniXcoder, CodeGPT, and InCoder. Note that UniXcoder and CodeGPT use the same input files: in practice, only files for UniXcoder are shown, but these are intended to be used for both UniXcoder and CodeGPT.
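As a rough, hypothetical illustration of the marker format and of the explicit/untyped variants (this snippet is not taken from the dataset, and actual marker placement is determined by the scripts):

```ts
// Hypothetical TypeScript example: the same line in its explicitly typed
// and untyped forms, each tagged with a /*marker:number*/ comment.
const sum = (a: number, b: number): number => a + b; /*marker:1*/
const sumUntyped = (a, b) => a + b; /*marker:2*/
```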
`evaluate.sh` does the following:
- Post-process the predictions
- Evaluate the post-processed predictions and compute all metrics (both for complete lines and single tokens)
- Perform the statistical analysis
This is done for every model.
Note that this script expects a `predictions` folder to be present inside the `data` folder. The `predictions` folder should have subfolders of the format `./data/predictions/<unixcoder|codegpt|incoder>/<normal|untyped|explicit>-<all|none|docblock|single_line|multi_line>/`, each containing the respective `test.json` file for the model & dataset, and a `predictions.txt` file generated based on this `test.json` file.
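For instance, the files for the UniXcoder model on the `normal-all` variant would be laid out as:

```
data/
└── predictions/
    └── unixcoder/
        └── normal-all/
            ├── test.json
            └── predictions.txt
```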
Some parameters can be configured through `config.json`.