Welcome to the repo for classifying crystal structures & space groups from 1D X-ray diffraction (XRD) patterns.
Can machine learning identify crystals in light diffraction patterns?
See our paper for details, and please cite us if you use this work.
Paper:
@article{Crystals,
title = {XRDs with deep learning (pending actual name)},
author = {Jerardo Salgado and Sam Lerman and Zhaotong Du and Chenliang Xu and Niaz Abdolrahim},
journal = {pre-print:Nature Communications},
year = {2023}
}
Use git to download the XRDs repo:
git clone git@github.com:slerman12/XRDs.git
Change directory into the XRDs repo:
cd XRDs
This project is built with the UnifiedML deep learning library/framework.
Download UnifiedML
git clone git@github.com:agi-init/UnifiedML.git
Install Dependencies
All dependencies can be installed via Conda:
conda env create --name ML --file=UnifiedML/Conda.yml
Activate Conda Environment
conda activate ML
ⓘ If your GPU doesn't support the latest CUDA version, you may need to reinstall PyTorch with an older CUDA version from pytorch.org/get-started after activating your Conda environment. For example, for CUDA 11.6:
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
ⓘ CUDA is needed to run the deep learning code on GPUs rather than CPUs. UnifiedML will automatically select GPUs when a working CUDA is available.
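To confirm that PyTorch can see a working CUDA install after setup, a quick check along these lines may help. This is a minimal sketch that relies only on standard PyTorch attributes (`torch.cuda.is_available` and `torch.version.cuda`). The helper name `cuda_status` is ours for illustration and is not part of UnifiedML or this repo.

```python
def cuda_status():
    """Return a short, human-readable summary of PyTorch's CUDA support."""
    try:
        import torch  # imported lazily so the check also works before installation
    except ImportError:
        return "PyTorch is not installed in this environment"

    if torch.cuda.is_available():
        # torch.version.cuda is the CUDA version PyTorch was built against
        return f"CUDA available (PyTorch built with CUDA {torch.version.cuda})"
    return "CUDA not available; PyTorch would fall back to CPU"


if __name__ == "__main__":
    print(cuda_status())
```

If the check reports that CUDA is not available on a machine with a GPU, reinstalling PyTorch against an older CUDA version as described above is the first thing to try.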
Three model variants are available for predicting 7-way crystal types:
Model 1: No-pool CNN
python XRD.py task=NPCNN
Model 2: Standard CNN
python XRD.py task=SCNN
Model 3: MLP
python XRD.py task=MLP
💡 To predict 230-way space groups instead, add the num_classes=230 flag.
python XRD.py task=NPCNN num_classes=230
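To run every variant on both prediction problems, the commands above can be generated in a loop. The following is a minimal sketch that only assembles the `task=` and `num_classes=230` flags shown above. The helper `build_command` and the `space_groups` parameter are illustrative names, not part of the repo, and the sketch assumes that omitting `num_classes` gives the 7-way setting, as in the commands above.

```python
import sys

TASKS = ["NPCNN", "SCNN", "MLP"]  # task names from the commands above


def build_command(task, space_groups=False):
    """Build the argument list for one training run.

    space_groups=False leaves num_classes unset (7-way crystal types);
    space_groups=True adds num_classes=230.
    """
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    command = [sys.executable, "XRD.py", f"task={task}"]
    if space_groups:
        command.append("num_classes=230")
    return command


if __name__ == "__main__":
    for task in TASKS:
        for space_groups in (False, True):
            # Printed rather than executed; pass the list to subprocess.run to launch.
            print(" ".join(build_command(task, space_groups)))
```

Printing the commands first makes it easy to review the sweep before launching six training runs.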
Plots are saved automatically to ./Benchmarking/<experiment>/.
The above scripts train on the "souped" data (synthetic data plus a random 50% of the RRUFF experimental data) and evaluate on the remaining 50% of the RRUFF data. The trained model is saved in a ./Checkpoints directory and can be loaded with the load=true flag.
All model and dataset code can be found in XRD.py.
Custom datasets can be evaluated with a saved model by using the Dataset= flag together with train_steps=0 load=true.
Synthetic data
This repo automatically downloads the public CIF database rather than ICSD, which was used in the paper. If you'd rather use ICSD and have access, download it to the Data/Generated/CIFs_ICSD/ directory, and the code will automatically use it instead, as in the paper. To use both, add the open_access=true flag.
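The source-selection rules above can be summarized in a few lines of Python. This is only a sketch of the described behavior, not the repo's actual code: the function name `choose_cif_sources` and the labels `"public"` and `"icsd"` are illustrative, and the sketch assumes that ICSD counts as present when the directory exists and is non-empty.

```python
from pathlib import Path

ICSD_DIR = Path("Data/Generated/CIFs_ICSD")  # directory named in the text above


def choose_cif_sources(icsd_dir=ICSD_DIR, open_access=False):
    """Return which CIF sources would be used under the rules described above."""
    icsd_dir = Path(icsd_dir)
    # Assumption: ICSD is "present" if the directory exists and holds at least one entry.
    has_icsd = icsd_dir.is_dir() and any(icsd_dir.iterdir())
    if not has_icsd:
        return ["public"]          # default: public CIF database only
    if open_access:
        return ["icsd", "public"]  # open_access=true: use both
    return ["icsd"]                # ICSD found: use it instead of the public data


if __name__ == "__main__":
    print(choose_cif_sources())
    print(choose_cif_sources(open_access=True))
```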
Souping and evaluation data
This repository provides the RRUFF experimental (real-world) data. It will be detected and used for souping as described in the paper: a random 50% subset of the real-world data is reserved for training, and the remaining 50% is used for evaluation. To disable souping, use the soup=false flag. To train only on a 0.9/0.1 split of the synthetic data, use rruff=false.
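The splitting scheme described above can be illustrated with plain Python lists. This is a minimal sketch under stated assumptions, not the repo's implementation: the function names, the fixed `seed`, and the `use_rruff` parameter (standing in for the rruff flag) are ours. The behavior shown for soup=false, training on synthetic data only and evaluating on all of RRUFF, is an assumption, since the text above does not specify it.

```python
import random


def split(items, first_fraction, seed=0):
    """Shuffle a copy of `items` and split it into two lists at `first_fraction`."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * first_fraction)
    return shuffled[:cut], shuffled[cut:]


def make_splits(synthetic, rruff, soup=True, use_rruff=True, seed=0):
    """Return (train, evaluation) lists following the options described above."""
    if not use_rruff:
        # rruff=false: 0.9/0.1 split of the synthetic data only
        return split(synthetic, 0.9, seed)
    if soup:
        # Souping: a random half of RRUFF joins the synthetic training data,
        # and the other half is held out for evaluation.
        rruff_train, rruff_eval = split(rruff, 0.5, seed)
        return list(synthetic) + rruff_train, rruff_eval
    # Assumption: soup=false trains on synthetic only and evaluates on all of RRUFF.
    return list(synthetic), list(rruff)


if __name__ == "__main__":
    synthetic = [("synthetic", i) for i in range(100)]
    rruff = [("rruff", i) for i in range(10)]
    train, evaluation = make_splits(synthetic, rruff)
    print(len(train), len(evaluation))
```

The key property of souping is that the held-out RRUFF half never appears in training, so evaluation remains on unseen experimental patterns.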
All UnifiedML features and syntax are supported.