Reaching out to other databases #124

Open
rartino opened this issue Jun 14, 2019 · 32 comments

@rartino
Contributor

rartino commented Jun 14, 2019

from @ml-evs: Please add a comment with any new suggestions, but also feel free to edit this table

| Database | Contacted | Positive response? |
|---|---|---|
| The Electronic Structure Project | | |
| NREL MatDb | ✔️ | |
| SUNCAT Catalysis Hub | | |
| Computational Materials Repository (CMR) | ✔️ | ✔️ |
| High-throughput Experimental Materials Database (HTEM) | ✔️ | ✔️ |
| MatNavi | | |
| Clean Energy Project (CEPDB) (now offline) | | |
| Topological Quantum Chemistry Database | ✔️ | |
| Organic Materials Database (OMDB) | ✔️ | |
| JARVIS | ✔️ | |
| Hybrid3 | ✔️ | ✔️ |
| 2DMatpedia | ✔️ | |
| QMOF | ✔️ | ✔️ |
| Open Catalyst Project | ✔️ | |
| Carolina MatDB & MaterialsAtlas.org | ✔️ | |
| AMCSD | ✔️ | |
| CMU Alloy Database | ✔️ | |
| f-electron structure database | | |
| Matgen | ✔️ | |
| MIP-3d | | |
| OCELOT | ✔️ | |
| ACCDB | | |
| Hypothetical Zeolite Database | ✔️ | |
| Matterverse | ✔️ | ✔️ |
| Google Brain's top secret DFT database DeepMind's GNoME (currently hosted at https://optimade-gnome.odbx.science) | ✔️ | |
| ditto Microsoft Research | ✔️ | |
| Toyota's CAMD dataset (hosted at optimade-misc.odbx.science) | ✔️ | |
| Alexandria from Miguel Marques' group (hosted at alexandria.odbx.science) | ✔️ | |
| GW-BSE database | ✔️ | |
| FFMDFPA | | |

From discussions with @ctoher, we thought that it would be good to collect somewhere a list of other databases to reach out to, to check whether they are interested in implementing OPTiMaDe.

Here is a list (some of these were pointed out to me by Lauri Himanen at Aalto University).

  • The Electronic Structure Project (http://gurka.physics.uu.se/esp/; 2002),
  • High Performance Computing Center Materials Database - NREL MatDb
    (materials.nrel.gov; 2015)
  • SUNCAT: suncat.stanford.edu, data at catalysis-hub.org, 2012
  • Computational Materials Repository: cmr.fysik.dtu.dk, 2008

More data-set oriented

  • Materials Data Facility (MDF): materialsdatafacility.org, 2016

More experimentally oriented

  • High-throughput Experimental Materials Database (HTEM), 2017

Unknown content or status

  • MatNavi: mits.nims.go.jp, 2003
  • Clean Energy Project CEPDB: cepdb.molecularspace.org (currently broken link), 2011
@dwinston
Contributor

Topological Materials Database, 2017-2019.

@blokhin
Member

blokhin commented Jun 15, 2019

@ml-evs
Member

ml-evs commented Jul 8, 2019

@ml-evs
Member

ml-evs commented Jul 29, 2019

JARVIS out of NIST

@ml-evs
Member

ml-evs commented May 29, 2020

I've just come across Hybrid3 on Twitter, a database for organic/inorganic perovskites out of Duke; I will mention OPTIMADE to them.

@ml-evs
Member

ml-evs commented Feb 4, 2021

Carolina Materials Database, a new (2020?) database of ternary & quaternary crystal structures predicted with a generative neural net.

Zhao et al. "High-throughput discovery of novel cubic crystal materials using deep generative neural networks" arXiv:2102.01880

blokhin added a commit to tilde-lab/awesome-materials-informatics that referenced this issue Feb 4, 2021
@ml-evs
Member

ml-evs commented Feb 5, 2021

Open Catalyst Project from CMU & Facebook AI. Big datasets of molecule+surface guided relaxations & MD. Could be a useful case study when considering adding trajectories...

Chanussot et al., "The Open Catalyst 2020 (OC20) Dataset and Community Challenges" arXiv:2010.09990

@ml-evs
Member

ml-evs commented Apr 6, 2021

Quantum MOF database containing ~15,000 electronic structure calculations on MOFs, currently provided as an archive on figshare. This might be an example of the kind of dataset we discussed in the last meeting, where hosting a public API is prohibited by technical or resource constraints. Dare we go down the OPTIMADE-as-a-service route? 😁

Rosen et al., "Machine learning the quantum-chemical properties of metal–organic frameworks for accelerated materials discovery", Matter (2021) 10.1016/j.matt.2021.02.015

@blokhin
Member

blokhin commented Apr 6, 2021

DFT+U for transition metal rutile dioxides: a solid amount of raw calculations, simply hosted on GitHub.

(I'm putting it here to raise a discussion, what are the "other databases", also in connection to the Optimade-as-a-service mentioned by @ml-evs)

@ml-evs
Member

ml-evs commented Apr 6, 2021

> (I'm putting it here to raise a discussion, what are the "other databases", also in connection to the Optimade-as-a-service mentioned by @ml-evs)

An interesting point that @shyamd made in the past was "what happens if all these databases suddenly get the same load as the Materials Project"? This could be a useful way of framing the discussion at the next workshop. I think we really have to consider the science case we imagine for querying all OPTIMADE instances at once.
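To make the "query everything at once" scenario concrete, here is a minimal sketch of the fan-out: one client filter becomes one request per provider. It only builds the URLs and sends nothing. The `/v1/structures` endpoint and the `filter` and `page_limit` parameters come from the OPTIMADE specification; the `https://` scheme on the second and third base URLs, and whether each host is currently up, are assumptions, and `build_structure_queries` is a hypothetical helper name.

```python
from urllib.parse import quote

# Base URLs taken from the table at the top of this issue. The https://
# scheme on the last two, and their current availability, are assumptions.
BASE_URLS = [
    "https://optimade-gnome.odbx.science",
    "https://optimade-misc.odbx.science",
    "https://alexandria.odbx.science",
]


def build_structure_queries(base_urls, optimade_filter, page_limit=10):
    """Return one /v1/structures query URL per provider (no requests are sent)."""
    # Percent-encode the whole filter so spaces, quotes and "=" survive in the URL.
    encoded = quote(optimade_filter, safe="")
    return [
        f"{base.rstrip('/')}/v1/structures?filter={encoded}&page_limit={page_limit}"
        for base in base_urls
    ]


queries = build_structure_queries(BASE_URLS, 'elements HAS "Si" AND nelements=2')
for url in queries:
    print(url)
```

Every extra provider in the list adds one more request per user query, which is the load concern raised above.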

@ml-evs
Member

ml-evs commented Apr 6, 2021

As I mentioned in the last meeting, I like the idea of a datasette-like tool for OPTIMADE that allows for local exploration and filtering of materials datasets without needing the data owner to provide an API (they would of course need to provide the data in a supported format). Would be fun to hack on something like this at the workshop. You can then bring your favourite local OPTIMADE client to the show for rich filtering. This could then naturally become the tool for data providers to do one-click OPTIMADE provider registration (or even deployment?), though not sure that would be worthwhile yet...
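As a rough illustration of the local-filtering idea (not a design for the actual tool), the sketch below loads a small in-memory dataset and filters it without any server. The keys `elements`, `nelements` and `chemical_formula_reduced` follow OPTIMADE structure property names; the records themselves are made up, and `filter_structures` is a hypothetical helper standing in for a real filter parser.

```python
import json

# Hypothetical records: the keys follow OPTIMADE structure property names,
# the entries are invented purely for illustration.
RECORDS_JSON = """
[
  {"id": "a1", "chemical_formula_reduced": "O2Si", "elements": ["O", "Si"], "nelements": 2},
  {"id": "a2", "chemical_formula_reduced": "O2Ti", "elements": ["O", "Ti"], "nelements": 2},
  {"id": "a3", "chemical_formula_reduced": "CaO3Ti", "elements": ["Ca", "O", "Ti"], "nelements": 3}
]
"""


def filter_structures(records, has_all=(), nelements=None):
    """Keep records containing every element in has_all and, optionally, an exact element count."""
    required = set(has_all)
    return [
        rec
        for rec in records
        if required <= set(rec["elements"])
        and (nelements is None or rec["nelements"] == nelements)
    ]


records = json.loads(RECORDS_JSON)
# Roughly the local equivalent of: elements HAS ALL "O","Ti" AND nelements=2
hits = filter_structures(records, has_all=["O", "Ti"], nelements=2)
print([rec["id"] for rec in hits])  # prints ['a2']
```

A real tool would parse the OPTIMADE filter grammar instead of taking keyword arguments; the point here is only that the data owner needs to supply a file, not an API.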

@ml-evs
Member

ml-evs commented Jun 3, 2021

I just came across Matgen, a set of databases from a large Chinese academic consortium, via this paper:

He, B., Chi, S., Ye, A. et al. High-throughput screening platform for solid electrolytes combining hierarchical ion-transport prediction algorithms. Sci Data 7, 151 (2020). 10.1038/s41597-020-0474-y

@ml-evs
Member

ml-evs commented Jun 6, 2021

CMU Alloy Database, out of the group of Prof. Michael Widom. The last update to the website/data was in 2011, but it is still alive. This could be a great candidate for quickly spinning up an OPTIMADE API.

@ml-evs
Member

ml-evs commented Aug 18, 2021

American Mineralogist Crystal Structure Database (down at the moment, but available via Wayback Machine). Crystal structures of all minerals published across various mineralogy journals, grouped by mineral name.

@ml-evs
Member

ml-evs commented Aug 18, 2021

f-electron structure database (FESD). Contains LAPW DFT calcs on known lanthanide/actinide-containing crystal structures, possibly also novel/predicted structures (preprint from 2017 says available soon), and possibly also DFT+DMFT calculations.

@blokhin
Member

blokhin commented Aug 19, 2021

> American Mineralogist Crystal Structure Database

Presumably related to the RRUFF project in some way.

@ml-evs
Member

ml-evs commented Sep 20, 2021

MIP-3d: database focused on thermoelectric properties of known structures

Yao, M., Wang, Y., Li, X. et al. Materials informatics platform with three dimensional structures, workflow and thermoelectric applications. Sci Data 8, 236 (2021). https://doi.org/10.1038/s41597-021-01022-6

@merkys
Member

merkys commented Sep 21, 2021

> American Mineralogist Crystal Structure Database (down at the moment, but available via Wayback Machine). Crystal structures of all minerals published across various mineralogy journals, grouped by mineral name.

COD ingests AMCSD from time to time.

@ml-evs
Member

ml-evs commented Oct 5, 2021

@ml-evs
Member

ml-evs commented Oct 20, 2021

MaterialsAtlas.org (website currently down but there is a preprint: "MaterialsAtlas.org: A Materials Informatics Web App Platform for Materials Discovery and Survey of State-of-the-Art" arXiv 2109.04007) (overlapping devs with Carolina MatDB above)

@ml-evs
Member

ml-evs commented Oct 20, 2021

As this list grows, I think it makes sense to collect a table of who we have actually contacted in the top comment... here is a draft below, feel free to suggest changes if you know that a database knows about OPTIMADE or is interested. I have ticked the ones that have attended workshops or that I have contacted personally.

(moved to top comment)

@ml-evs
Member

ml-evs commented Nov 30, 2021

ACCDB - think this has been mentioned in the past but I couldn't find it. A large number of static databases of crystal/molecular geometries used for benchmarking new methods. Could be a nice test bed for both necroptimade and the new properties format.

@ml-evs
Member

ml-evs commented Aug 31, 2022

Hypothetical Zeolite Database: http://www.hypotheticalzeolites.net 4,450,542 zeolite structures with energies and topologies, going back 30+ years - not sure how much longer it will last so we should try to help out!

@ml-evs
Member

ml-evs commented Oct 3, 2022

Have we been in touch with the Open ForceField consortium as part of the trajectories work? Looking in particular at project 2, which could be a good collaboration (though maybe only the COD data is relevant to them!)

@ml-evs
Member

ml-evs commented Feb 22, 2023

@merkys
Member

merkys commented Jun 8, 2023

PubChem, database of chemical molecules.

@ml-evs
Member

ml-evs commented Jun 12, 2023

I've just updated the table with a few more db's I've contacted this year, if anyone has contacts at any of the remaining ❓ marks, please feel free to reach out to them...

@JPBergsma
Contributor

Perhaps this is still an interesting database: http://quantum-machine.org/

@ml-evs
Member

ml-evs commented Aug 11, 2023

FFMDFPA: A FAIRification Framework for Materials Data with No-Code Flexible Semi-Structured Parser and Application Programming Interfaces: 10.1021/acs.jcim.3c00836 -- has developed a similar grammar and overlapping functionality with optimade-python-tools, but focused on VASP/Gromacs outputs specifically (I think) -- discusses future integration with OPTIMADE so we should make sure they get invited!

@JPBergsma
Contributor

Perhaps the Database of Zeolite Structures could also be added to this list.
This database provides structural information on all the Zeolite Framework Types that have been approved by the Structure Commission of the International Zeolite Association (IZA-SC).

It is searchable and includes:

  • descriptions and drawings of each framework type
  • crystallographic data and simulated powder diffraction patterns for representative materials
  • relevant references
  • detailed instructions for building models
  • measured powder patterns from "Verified Syntheses" (2nd and 3rd edition)
  • 29Si MAS NMR spectra for pure silica and aluminosilicate zeolites
  • 31P MAS NMR spectra for pure aluminophosphate zeolites
  • framework chemical composition for all materials in the database

http://www.iza-structure.org/databases/

@ml-evs
Member

ml-evs commented Mar 7, 2024

https://materials.phasecraft.io/

Small database of materials properties computed (at DFT level?) with quantum computing algorithms (I guess the most interesting "properties" here are the QC architectures rather than materials properties)

ml-evs pinned this issue Mar 25, 2024
@ml-evs
Member

ml-evs commented Oct 25, 2024

http://www.mathub3d.net/
