Support for Custom Dataset #20

Open
ditto7284 opened this issue Dec 2, 2024 · 1 comment

Comments

@ditto7284

I have a dataset obtained from the Materials Project, which includes atomic positions, lattice parameters, and energies. I would like to train Equiformer on this dataset and use it to make predictions. Could you please guide me on how to train on a custom dataset? I would greatly appreciate detailed instructions.

Thank you in advance for your help.

@yilunliao
Member

Hi @ditto7284

Thanks for your interest.

Here are some of my suggestions:

  1. If your dataset is stored in a format similar to the MD17 data in this repo (e.g., NumPy arrays), you can check this file and update how entries in the dataset are indexed.
  2. After 1., you can check the QM9 example (here) to see how we use Equiformer to predict scalars.
  3. For modeling lattice parameters, I think you can first expand them to vectors of degree L_{max}, apply an SO(3) linear layer, and finally add them to the node embeddings at the beginning. You can check here to see how we encode node-wise forces. (One slight difference is that you need to expand the lattice parameters to node-level features.)
  4. Depending on how your dataset is stored, it might be helpful to check this repo. They first convert any dataset to LMDB and provide dataset and dataloader classes that handle LMDB.
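
To make suggestions 1 and 3 a bit more concrete, here is a minimal, library-free Python sketch (not code from this repo). It assumes each entry stores the six cell parameters (a, b, c as lengths and alpha, beta, gamma in degrees), converts them to three Cartesian lattice vectors using the common convention of a along x and b in the xy-plane, and copies those vectors to every atom so they become node-level features. The names `lattice_vectors`, `Sample`, `CustomCrystalDataset`, and the dictionary keys are hypothetical, and the numbers are toy values rather than Materials Project data. Treating the lattice vectors as degree-1 inputs is an assumption; the expansion to degree L_{max} and the SO(3) linear layer would still have to happen inside the model.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def lattice_vectors(a: float, b: float, c: float,
                    alpha: float, beta: float, gamma: float) -> List[Vec3]:
    """Convert cell lengths and angles (degrees) to three Cartesian vectors.

    Convention (an assumption): a lies along x, b lies in the xy-plane.
    """
    cos_a, cos_b, cos_g = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sin_g = math.sin(math.radians(gamma))
    cy = (cos_a - cos_b * cos_g) / sin_g
    # Clamp at zero to guard against tiny negative values from rounding.
    cz = math.sqrt(max(0.0, 1.0 - cos_b * cos_b - cy * cy))
    return [
        (a, 0.0, 0.0),
        (b * cos_g, b * sin_g, 0.0),
        (c * cos_b, c * cy, c * cz),
    ]


@dataclass
class Sample:
    atomic_numbers: List[int]
    positions: List[Vec3]
    node_lattice: List[List[Vec3]]  # lattice vectors repeated for each atom
    energy: float


class CustomCrystalDataset:
    """Hypothetical dataset wrapper; the entry keys below are placeholders."""

    def __init__(self, entries: Sequence[Dict]):
        self.entries = list(entries)

    def __len__(self) -> int:
        return len(self.entries)

    def __getitem__(self, idx: int) -> Sample:
        entry = self.entries[idx]
        cell = lattice_vectors(*entry["lattice_params"])
        num_atoms = len(entry["atomic_numbers"])
        return Sample(
            atomic_numbers=list(entry["atomic_numbers"]),
            positions=[tuple(p) for p in entry["positions"]],
            # Broadcast the structure-level cell to node level (one copy per atom).
            node_lattice=[list(cell) for _ in range(num_atoms)],
            energy=float(entry["energy"]),
        )


# Toy entry with made-up values: two atoms in a cubic cell of edge 4.0.
toy_entries = [{
    "atomic_numbers": [1, 1],
    "positions": [(0.0, 0.0, 0.0), (2.0, 2.0, 2.0)],
    "lattice_params": (4.0, 4.0, 4.0, 90.0, 90.0, 90.0),
    "energy": -1.0,
}]
dataset = CustomCrystalDataset(toy_entries)
sample = dataset[0]
print(len(dataset), len(sample.node_lattice), sample.node_lattice[0][0])
```

In practice the same `__len__`/`__getitem__` pattern carries over to a PyTorch-style dataset class, with the lists replaced by tensors and the per-atom lattice features passed to the model alongside positions and atomic numbers.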
