ChemAlgebra is a benchmark for evaluating the reasoning abilities of deep learning models. This repository contains the following folders:
- datasets: the ChemAlgebra dataset variants.
- notebooks: the notebooks used to compute the results reported in the ChemAlgebra paper.
- code: the code used to train and evaluate the Transformer model that produces the baseline results.
For additional details, please refer to the official ChemAlgebra paper: https://arxiv.org/abs/2210.02095