Since the original paper does not provide code for the model, I have replicated the process (without pretraining on WebMath), with changes for training and evaluation on a single consumer GPU (NVIDIA RTX 2080 Ti), and made the finetuned model publicly available.
The setup is similar to that of the publicly available code.
Make sure to install elan: https://github.com/kha/elan
git clone [email protected]:sontungtran/pact-lean-low-resource.git $PACT_DIR
cd $PACT_DIR
leanpkg configure
leanproject get-mathlib-cache
bash ./_target/deps/mathlib/scripts/mk_all.sh # make a file that imports all of mathlib
leanpkg build
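The mk_all.sh step above generates a single Lean file that imports every module in mathlib. A minimal Python sketch of the same idea (the function name and directory layout here are assumptions for illustration, not the script's actual internals):

```python
from pathlib import Path

def make_all_imports(src_root: str) -> str:
    """Build the body of a Lean file that imports every .lean module under src_root."""
    root = Path(src_root)
    imports = []
    for f in sorted(root.rglob("*.lean")):
        # Turn a path like <src_root>/algebra/group.lean into `import algebra.group`
        module = ".".join(f.relative_to(root).with_suffix("").parts)
        imports.append(f"import {module}")
    return "\n".join(imports)
```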
Download the dataset from https://drive.google.com/file/d/1VJ4kl1VaC3DfjqdP3KLR4zIoPRfv-K1B/view?usp=sharing
Move the file to the directory $PACT_DIR/notebooks/data
Download the finetuned model from https://drive.google.com/file/d/1OVvHv-WPY7Oz_uum7ofV1yeto0sY-DrY/view?usp=sharing
Move the file to the directory $PACT_DIR/notebooks/models
Simply run the GPT2_ProofStep_only.ipynb notebook; the code will try to find the finetuned model and train it further.
The Lean scripts will dump JSON messages to a specified file, with a new message on each line. Each message encodes the history of a single proof search.
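Because each line of the log is a standalone JSON message, results can also be inspected programmatically instead of with grep. A minimal sketch, assuming each message carries a top-level "success" field as suggested by the grep filters below:

```python
import json

def summarize_log(lines):
    """Count proof searches and successes in a JSON-lines proof-search log."""
    total, successes = 0, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)  # one proof-search history per line
        total += 1
        if msg.get("success"):
            successes += 1
    return total, successes

# Usage:
# with open("./test/dummy_out_bfs_baseline.log") as f:
#     total, successes = summarize_log(f)
```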
WARNING: Make sure the arguments are passed in the correct order.
# bfs strategy only for now
lean --run ./src/evaluation/bfs/baseline.lean ./test/dummy_decls2.names ./test/dummy_out_bfs_baseline.log 50 25 50 5 300 > ./test/dummy_out_bfs_baseline.out 2>&1

# this should be 31
cat ./test/dummy_out_bfs_baseline.log | wc -l

# this should be 0 (TODO(jesse): allow optional environment-setting)
cat ./test/dummy_out_bfs_baseline.log | grep '"success":true' | wc -l
# bfs strategy only for now
# args: decl_names logfile max_tokens temp top_p n best_of fuel max_width max_depth engine_id api_key
lean --run ./src/evaluation/bfs/gptf.lean ./test/dummy_decls2.names ./test/dummy_out_bfs_gptf.log 256 0.7 1.0 10 10 50 10 25 formal-lean-wm-to-tt-m1-m2-v4-c4 $OPENAI_API_KEY 5 300 > ./test/dummy_out_bfs_gptf.out 2>&1

# should be 31
cat ./test/dummy_out_bfs_gptf.log | wc -l

# should be ~24
cat ./test/dummy_out_bfs_gptf.log | grep '"success":true' | wc -l
The Python script wraps the Lean scripts and runs them in parallel, sharding a list of theorems.
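The sharding step can be pictured as splitting the declaration list into fixed-size chunks, each of which becomes one Lean invocation handled by a worker. A rough sketch (the function and argument names are illustrative, not the script's actual internals):

```python
def shard_decls(decls, decls_per_shard):
    """Split a list of declaration names into shards of at most decls_per_shard each.

    Each shard corresponds to one Lean process; a pool of n_workers
    processes would consume these shards in parallel.
    """
    return [decls[i:i + decls_per_shard]
            for i in range(0, len(decls), decls_per_shard)]
```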
usage: parallel_eval.py [-h] [--max_tokens MAX_TOKENS]
[--temperature TEMPERATURE] [--top_p TOP_P] [--n N]
[--best_of BEST_OF] [--fuel FUEL]
[--engine_id ENGINE_ID] [--api_key API_KEY]
[--nbest NBEST] [--beam BEAM]
[--model_path MODEL_PATH] [--data_path DATA_PATH]
[--entry_pt ENTRY_PT] [--max_width MAX_WIDTH]
[--max_depth MAX_DEPTH] [--lean_threads LEAN_THREADS]
[--lean_timeout LEAN_TIMEOUT] [--api API]
[--search_strategy SEARCH_STRATEGY]
[--tac_timeout TAC_TIMEOUT]
[--global_timeout GLOBAL_TIMEOUT]
n_workers decls_per_shard decls_file dest_dir
positional arguments:
n_workers
decls_per_shard
decls_file
dest_dir
optional arguments:
-h, --help show this help message and exit
--max_tokens MAX_TOKENS
--temperature TEMPERATURE
--top_p TOP_P
--n N
--best_of BEST_OF
--fuel FUEL
--engine_id ENGINE_ID
--api_key API_KEY
--nbest NBEST
--beam BEAM
--model_path MODEL_PATH
--data_path DATA_PATH
--entry_pt ENTRY_PT
--max_width MAX_WIDTH
maximum size of search queue for BFS
--max_depth MAX_DEPTH
maximum distance of search node from root before the
search queue rejects it
--lean_threads LEAN_THREADS
number of threads per Lean process
--lean_timeout LEAN_TIMEOUT
deterministic timeout for Lean process in millions of
allocations. Interactive default is one. Default is
unbounded (none).
--api API gptf|baseline|fairseq
--search_strategy SEARCH_STRATEGY
greedy|bfs
--tac_timeout TAC_TIMEOUT
tactic execution timeout (s)
--global_timeout GLOBAL_TIMEOUT
proof search timeout (s)
Check that the baseline version works. Inspect some of the output files.
python ./scripts/parallel_eval.py 4 8 ./test/dummy_decls2.names ./test_parallel/baseline/ --fuel 50 --api baseline --search_strategy bfs --tac_timeout 5 --global_timeout 300
Check that the BFS GPT-f version works. Inspect some of the output files.
python ./scripts/parallel_eval_resumeable_json.py 1 1 ./test/dummy_decls_single.names ./test_parallel/gptf_neo_dummy/ --max_tokens 256 --temperature 0.7 --top_p 1.0 --n 40 --best_of 40 --fuel 200 --max_width 32 --max_depth 128 --engine_id formal-lean-wm-to-tt-m1-m2-v4-c4 --api_key asdfasd --api gptf_neo_8epoch_modified --search_strategy bfs --tac_timeout 5 --global_timeout 300000
lean --run ./src/tools/filter_defs.lean $ORIGINAL_NAMES_FILE $NEW_NAMES_FILE
python ./scripts/shuffle_lines.py $NAMES_FILE $SHUFFLED_NAMES_FILE # optional seed -- seed 12387
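The shuffle step presumably amounts to a seeded shuffle of the file's lines, so the same seed reproduces the same order. A minimal sketch (the real script's behavior may differ):

```python
import random

def shuffle_lines(lines, seed=None):
    """Return a shuffled copy of lines; a fixed seed makes the order reproducible."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    return shuffled
```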
This code is based on the Lean Theorem Proving Environment repo and the LeanStep dataset repo, and adapts a GPT-2 implementation written from scratch.
@article{han2021proof,
  title={Proof artifact co-training for theorem proving with language models},
  author={Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas},
  journal={arXiv preprint arXiv:2102.06203},
  year={2021}
}