Code for our paper "Hybrid Ranking Network for Text-to-SQL".
Python 3.8
PyTorch 1.7.1 or higher
pip install -r requirements.txt
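A quick way to confirm the interpreter and PyTorch versions meet the minimums listed above is sketched below. The helper names (`parse_version`, `meets_minimum`) are illustrative only and are not part of this repository; the version parsing is deliberately simple and only looks at the leading numeric components.

```python
import re
import sys

MIN_PYTHON = (3, 8)
MIN_TORCH = (1, 7, 1)


def parse_version(text):
    """Turn a string such as '1.7.1+cu110' into (1, 7, 1)."""
    core = text.split("+")[0]  # drop local build suffixes like '+cu110'
    numbers = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        numbers.append(int(match.group()) if match else 0)
    return tuple(numbers)


def meets_minimum(found, minimum):
    """Return True if the version string `found` is at least `minimum`."""
    return parse_version(found) >= minimum


if __name__ == "__main__":
    python_ok = sys.version_info[:2] >= MIN_PYTHON
    print("Python", sys.version.split()[0], "ok" if python_ok else "too old")
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        torch_ok = meets_minimum(torch.__version__, MIN_TORCH)
        print("PyTorch", torch.__version__, "ok" if torch_ok else "too old")
```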
Experiments can also be run with a Docker image:
docker build -t hydranet -f Dockerfile .
The built image above contains processed data and is ready for training and evaluation.
- Create data folder and output folder first:
mkdir data && mkdir output
- Clone WikiSQL repo:
git clone https://github.com/salesforce/WikiSQL && tar xvjf WikiSQL/data.tar.bz2 -C WikiSQL
- Preprocess data:
python wikisql_gendata.py
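If preprocessing fails, it may be worth checking that the WikiSQL archive was extracted completely. The sketch below lists the files the WikiSQL archive normally unpacks into `WikiSQL/data`; which of them `wikisql_gendata.py` actually reads is an assumption, and `missing_wikisql_files` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path

SPLITS = ("train", "dev", "test")
SUFFIXES = (".jsonl", ".tables.jsonl", ".db")


def missing_wikisql_files(data_dir="WikiSQL/data"):
    """Return the expected WikiSQL data files that are not present."""
    root = Path(data_dir)
    expected = [root / f"{split}{suffix}" for split in SPLITS for suffix in SUFFIXES]
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    missing = missing_wikisql_files()
    if missing:
        print("Missing files:")
        for name in missing:
            print("  " + name)
    else:
        print("All expected WikiSQL data files are present.")
```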
- Run:
python main.py train --conf conf/wikisql.conf --gpu 0,1,2,3 --note "some note"
- The model will be saved to the output folder, named by the training start datetime.
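Because each run is saved under a name derived from its start time, locating the most recent run can be automated. The sketch below assumes each run is a subfolder of `output` and picks the newest by modification time, since the exact datetime format of the folder names is not stated here; `latest_run_dir` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def latest_run_dir(output_dir="output"):
    """Return the most recently modified subfolder of output_dir, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    runs = [path for path in root.iterdir() if path.is_dir()]
    # Compare by modification time rather than by parsing the folder name.
    return max(runs, key=lambda path: path.stat().st_mtime, default=None)


if __name__ == "__main__":
    print("Latest run folder:", latest_run_dir())
```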
- Modify the model, input and output settings in wikisql_prediction.py and run it.
- Run the WikiSQL evaluation script to get official numbers:
cd WikiSQL && python evaluate.py data/test.jsonl data/test.db ../output/test_out.jsonl
Note: the WikiSQL evaluation script raises an error when run on Windows, so a fixed version is included for Windows users (run it from the root folder):
python wikisql_evaluate.py WikiSQL/data/test.jsonl WikiSQL/data/test.db output/test_out.jsonl
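The choice between the two evaluation commands can be made automatically from the operating system. The sketch below only assembles the two commands given in this README and the working directory each one expects; `evaluation_command` is a hypothetical helper, and it assumes the prediction file path is given relative to the repository root.

```python
import platform
import subprocess
import sys


def evaluation_command(pred_file="output/test_out.jsonl", system=None):
    """Return (command, working_directory) for the current platform."""
    system = system or platform.system()
    if system == "Windows":
        # Fixed evaluation script shipped with this repository, run from the root folder.
        command = [sys.executable, "wikisql_evaluate.py",
                   "WikiSQL/data/test.jsonl", "WikiSQL/data/test.db", pred_file]
        return command, "."
    # Official script, run from inside the WikiSQL folder.
    command = [sys.executable, "evaluate.py",
               "data/test.jsonl", "data/test.db", "../" + pred_file]
    return command, "WikiSQL"


if __name__ == "__main__":
    command, cwd = evaluation_command()
    subprocess.run(command, cwd=cwd, check=True)
```

Passing `system="Windows"` or `system="Linux"` explicitly makes it possible to inspect either command without running it.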
A trained model that reproduces the numbers reported on the WikiSQL leaderboard is attached to the releases (see "Releases" in the right column). Model prediction outputs are also attached.