Code for the paper "Configurational Polymer Fingerprints for Machine Learning"

Ishan-Kumar2/configurational-polymer-fingerprint

Configurational Polymer Fingerprints for Machine Learning

Code for the paper "Configurational Polymer Fingerprints for Machine Learning" by Ishan Kumar and Prateek K. Jha.

Highlights of the work

  • A machine learning (ML) model is developed for a coarse-grained, bead-spring model of polymers.
  • The ML model is trained on data from Monte Carlo (MC) simulations of the bead-spring model.
  • Using both calculated (geometric) and learnt descriptors adds value to the ML model.
  • The probability of occurrence of configurations at equilibrium is predicted well by the ML model.

The code has two parts. The first is C++ code that runs the Monte Carlo simulation to create the dataset of fingerprints and descriptors. The second is Python ML code that uses this dataset to train the Autoencoder and the prediction model.
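To give a feel for what a Monte Carlo simulation of a bead-spring chain involves, here is a minimal Metropolis sketch in Python. It is not the repository's C++ implementation. The harmonic spring potential, the single-bead displacement move, and all names and parameters (`chain_energy`, `metropolis_step`, `run_chain`, `k`, `r0`, `max_disp`, `kT`) are assumptions made for illustration only.

```python
import math
import random


def chain_energy(beads, k=1.0, r0=1.0):
    """Harmonic spring energy summed over consecutive bead pairs (assumed potential)."""
    return sum(
        0.5 * k * (math.dist(a, b) - r0) ** 2
        for a, b in zip(beads, beads[1:])
    )


def metropolis_step(beads, rng, max_disp=0.2, kT=1.0):
    """Displace one random bead and accept or reject with the Metropolis criterion."""
    i = rng.randrange(len(beads))
    old_position = beads[i]
    energy_before = chain_energy(beads)

    # Trial move: shift each coordinate by a uniform random amount.
    beads[i] = tuple(x + rng.uniform(-max_disp, max_disp) for x in old_position)
    delta = chain_energy(beads) - energy_before

    # Always accept downhill moves; accept uphill moves with probability exp(-dE/kT).
    if delta <= 0 or rng.random() < math.exp(-delta / kT):
        return True

    beads[i] = old_position  # Rejected: restore the previous configuration.
    return False


def run_chain(n_beads=10, n_steps=5000, seed=0):
    """Run a short simulation starting from a straight chain along the x axis."""
    rng = random.Random(seed)
    beads = [(float(i), 0.0, 0.0) for i in range(n_beads)]
    accepted = sum(metropolis_step(beads, rng) for _ in range(n_steps))
    return beads, accepted / n_steps
```

Calling `run_chain()` returns the final bead coordinates and the fraction of accepted moves. In a dataset-generation setting, configurations sampled along such a run would be the raw material for fingerprints and descriptors.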

Usage

Monte Carlo Simulation

The first step is to compile the C++ code. This requires the Spectra and Eigen libraries, which can be obtained using

git clone https://gitlab.com/libeigen/eigen.git
git clone https://github.com/yixuan/spectra

The Boost and GSL libraries must also be installed. Then compile the sources into object files:

g++ -I /path/to/Eigen -I /path/to/spectra/include -c montecarlo.cpp objutils.cpp utils.cpp vars.h main.cpp

The compiled object files are also provided, so you can skip the compilation step and link them directly into the run executable:

g++ -o run main.o montecarlo.o objutils.o utils.o -lgsl -lboost_program_options

The resulting executable is then used by the Dataset.sh script to create the dataset:

./Dataset.sh > output.txt

Dataset.sh creates a folder containing a separate subfolder for each run. To merge these into a single folder (so that the ML Dataloader can read them more easily), change the path variables in the ML/single_folder.py script and run it as follows.

python ./ML/single_folder.py
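The merging step can be pictured with the following minimal sketch. It assumes one subfolder per run under a dataset root and prefixes each copied file with its run name to avoid collisions. The function name `merge_run_folders`, the folder layout, and the naming scheme are hypothetical and may differ from what ML/single_folder.py actually does.

```python
import shutil
from pathlib import Path


def merge_run_folders(dataset_root, merged_dir):
    """Copy every file from each run subfolder into one flat folder.

    Hypothetical helper: the one-subfolder-per-run layout and the
    "<run>_<file>" naming scheme are assumptions for illustration.
    """
    src = Path(dataset_root)
    dst = Path(merged_dir)
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for run_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        # Skip the destination itself if it lives inside the dataset root.
        if run_dir.resolve() == dst.resolve():
            continue
        for f in sorted(p for p in run_dir.iterdir() if p.is_file()):
            target = dst / f"{run_dir.name}_{f.name}"
            shutil.copy2(f, target)  # copy2 also preserves file timestamps
            copied.append(target.name)
    return copied
```

For example, `merge_run_folders("/path/to/dataset", "/path/to/merged")` would return the list of file names written to the merged folder; both paths here are placeholders.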

Machine Learning Model

To train the ML model (Autoencoder) that generates the learnt descriptors, change the path variables and other hyperparameters in the ML/data_processing_training.py script, then run:

python ML/data_processing_training.py

To train the Prediction model, which predicts the Probability of Occurrence at Equilibrium, use the ML/property_predicition.py script. It requires the trained Encoder weights, so set the encoder path in the script to the best-performing encoder weights produced by the previous script.

python ./ML/property_predicition.py

To compute metrics such as RMSE and residuals on the whole dataset, use the ML/value_check.py script. It requires the trained Encoder and Prediction model weights, so set the paths in the script to the best-performing Prediction model and its corresponding encoder from the previous script.

python ./ML/value_check.py
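For reference, the two metrics named above can be computed as in the sketch below. This is a generic illustration, not the code in ML/value_check.py. The function names are hypothetical, and the residual sign convention (true value minus prediction) is an assumption.

```python
import math


def residuals(y_true, y_pred):
    """Per-sample residuals, defined here as true value minus prediction."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return [t - p for t, p in zip(y_true, y_pred)]


def rmse(y_true, y_pred):
    """Root-mean-square error over all samples."""
    res = residuals(y_true, y_pred)
    if not res:
        raise ValueError("at least one sample is required")
    return math.sqrt(sum(r * r for r in res) / len(res))
```

With `y_true = [1.0, 2.0, 3.0]` and `y_pred = [1.0, 2.0, 5.0]`, the residuals are `[0.0, 0.0, -2.0]` and the RMSE is the square root of 4/3.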

If you have any suggestions or questions, feel free to raise an issue or PR on the repo, or reach out to Ishan Kumar ([email protected]).
