With the rapid development of artificial intelligence, deep generative models are increasingly used for molecule generation. However, most new molecules produced by these models face great challenges in terms of synthetic accessibility.
DeepSA is proposed to predict the synthetic accessibility of compounds, and it achieves a much higher early enrichment rate in identifying molecules that are difficult to synthesize. This helps users select less expensive molecules for synthesis, reducing the time needed for drug discovery and development. DeepSA is also available as a web server at https://bailab.siais.shanghaitech.edu.cn/deepsa
Dependencies can be installed using the following commands:
conda create -n DeepSA python=3.12
conda activate DeepSA
# for gpu version
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip3 install autogluon
pip3 install rdkit
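After installation, it can be useful to confirm that the three packages are visible inside the activated environment. The following is a minimal sketch, not part of the DeepSA repository: the helper name `check_dependencies` is hypothetical, and it only checks that each package can be located, not that its version is compatible or that a GPU is available.

```python
import importlib.util

# Top-level import names of the packages installed above.
REQUIRED = ("torch", "autogluon", "rdkit")


def check_dependencies(names=REQUIRED):
    """Return {import_name: bool} telling whether each package can be found.

    Hypothetical helper for illustration; find_spec locates a top-level
    package without importing it.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Any package reported as missing can be reinstalled with the corresponding `pip3 install` command above.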
- 2024-12: AutoGluon stopped supporting Python 3.8 in October 2024, so we have updated DeepSA to use Python 3.12 and adapted the training and inference scripts to the latest version of AutoGluon. Thanks for your interest in DeepSA!
- 2023-7: DeepSA_v1.0 has been released. Feedback is welcome via the issue tracker!
The expanded training and test datasets can be downloaded at https://drive.google.com/drive/folders/1iup6T3Bqyy-uvpdFyP0Of_WQqn-9l62h?usp=sharing
To train your own model, run the following from the command line:
python DeepSA_training.py <dataset.csv/training.csv:test.csv> DeepSA ./data/test_set.list
To use our pre-trained model, run:
python DeepSA.py <input_data.csv> DeepSA
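If your molecules are currently held in a Python list rather than a file, a CSV can be prepared as sketched below. This is an illustration under stated assumptions, not code from the repository: the helper name `write_smiles_csv` is hypothetical, and the header name `smiles` is an assumption, so check the example data shipped with DeepSA for the column name the script actually expects.

```python
import csv


def write_smiles_csv(smiles_list, path, column="smiles"):
    """Write unique, non-empty SMILES strings to a one-column CSV file.

    `column` is an ASSUMED header name; adjust it to match the input
    format used by the DeepSA example data. Returns the number of rows written.
    """
    seen = set()
    rows = []
    for smiles in smiles_list:
        smiles = smiles.strip()
        # Skip blank entries and exact duplicates, keeping first-seen order.
        if smiles and smiles not in seen:
            seen.add(smiles)
            rows.append(smiles)

    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([column])
        writer.writerows([s] for s in rows)
    return len(rows)
```

For example, `write_smiles_csv(["CCO", "c1ccccc1"], "input_data.csv")` writes a header line followed by two molecules. The sketch only removes blanks and exact string duplicates; it does not check that each SMILES string is chemically valid.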
We have deployed a pre-trained model on a dedicated server, publicly available at https://bailab.siais.shanghaitech.edu.cn/deepsa, to make it easy for biomedical researchers to use DeepSA in their work.
Users can upload SMILES strings or CSV files to the server and quickly obtain the predicted results.
If you find this repository useful in your research, please consider citing our paper:
Wang, S., Wang, L., Li, F. et al. DeepSA: a deep-learning driven predictor of compound synthesis accessibility. J Cheminform 15, 103 (2023). https://doi.org/10.1186/s13321-023-00771-3
If you have any questions, please feel free to contact Shihang Wang (Email: [email protected]) or Lin Wang (Email: [email protected]).
Pull requests are very welcome!
We are grateful for the support from the HPC Platform of ShanghaiTech University.
Thank you all for your attention to this work.