This repo contains tools to collect datasets from F* builds, used in the ICSE 2025 paper "Towards Neural Synthesis for SMT-assisted Proof-Oriented Programming".
To run:
-
Run
./projects/build.sh
. This will build 14 F* projects with the right flags for the dataset collection.Alternatively, you can build F* with the
--record_options
flag. This tells F* to record the options it used to check each definition in the checked file. This allowsfstar_insights
to preserve this in the data set. Typically, if you're building F* itself, this would beOTHERFLAGS='--record_options' make -jN
-
Make sure you have done
eval $(opam env)
and set theFSTAR_HOME
environment variable. -
make -C fstar_insights
-
./ingest.py ./projects
(Or./ingest.py .../path/to/FStar
if you want to extract a dataset from some other code.)
After all that, you can run make harness-checked.json
as a sanity check to see if the harness can verify extracted proofs.
This repo also provides fstar_harness.py
, which is a harness to run F*
against sample proofs collected from the dataset.