Reaching out to other databases #124
Topological Materials Database, 2017-2019. |
JARVIS out of NIST |
I've just come across HybriD3 on Twitter, a database for organic/inorganic perovskites out of Duke; will mention OPTIMADE to them. |
Carolina Materials Database, a new (2020?) database of ternary & quaternary crystal structures predicted with a generative neural net. Zhao et al. "High-throughput discovery of novel cubic crystal materials using deep generative neural networks" arXiv:2102.01880 |
Originally posted by @ml-evs at Materials-Consortia/OPTIMADE#124
Open Catalyst Project from CMU & Facebook AI. Big datasets of molecule+surface guided relaxations & MD. Could be a useful case study when considering adding trajectories... Chanussot et al., "The Open Catalyst 2020 (OC20) Dataset and Community Challenges" arXiv:2010.09990 |
Quantum MOF database containing ~15,000 electronic structure calculations on MOFs, currently provided as an archive on figshare. This might be an example of the kind of dataset we discussed in the last meeting, where hosting a public API is prohibited by technical or resource constraints. Dare we go down the OPTIMADE-as-a-service route? 😁 Rosen et al., "Machine learning the quantum-chemical properties of metal–organic frameworks for accelerated materials discovery", Matter (2021) 10.1016/j.matt.2021.02.015 |
DFT+U for transition metal rutile dioxides: a substantial share of the raw calculations is simply hosted on GitHub. (I'm putting it here to raise a discussion about what counts as "other databases", also in connection with the OPTIMADE-as-a-service idea mentioned by @ml-evs.) |
An interesting point that @shyamd made in the past was "what happens if all these databases suddenly get the same load as the Materials Project"? This could be a useful way of framing the discussion at the next workshop. I think we really have to consider the science case we imagine for querying all OPTIMADE instances at once. |
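To make the "query everything at once" scenario concrete, here is a minimal sketch of what such a fan-out looks like from the client side. It only builds the request URLs and sends nothing. The base URLs are the odbx.science hosts mentioned in this thread; the `https://` scheme, the bare `/v1/structures` path on each host, the example filter, and the helper name `build_structure_queries` are all assumptions for illustration.

```python
from urllib.parse import urlencode

# Hosts taken from this thread; scheme and path layout are assumed.
BASE_URLS = [
    "https://optimade-gnome.odbx.science",
    "https://optimade-misc.odbx.science",
    "https://alexandria.odbx.science",
]


def build_structure_queries(base_urls, optimade_filter, page_limit=10):
    """Return one /v1/structures query URL per provider (hypothetical helper)."""
    query = urlencode({"filter": optimade_filter, "page_limit": page_limit})
    return [f"{base.rstrip('/')}/v1/structures?{query}" for base in base_urls]


# One user-level query becomes one HTTP request per registered provider.
urls = build_structure_queries(BASE_URLS, 'elements HAS ALL "Si","O" AND nelements=2')
for url in urls:
    print(url)
```

Every client that does this multiplies its request count by the number of providers, which is why the load question matters even for small databases.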
As I mentioned in the last meeting, I like the idea of a datasette-like tool for OPTIMADE that allows for local exploration and filtering of materials datasets without needing the data owner to provide an API (they would of course need to provide the data in a supported format). Would be fun to hack on something like this at the workshop. You can then bring your favourite local OPTIMADE client to the show for rich filtering. This could then naturally become the tool for data providers to do one-click OPTIMADE provider registration (or even deployment?), though not sure that would be worthwhile yet... |
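As a rough idea of the local filtering such a tool would need, here is a toy sketch over in-memory records. It is not an existing tool or the optimade-python-tools API: the records are made up, the helper `filter_records` is hypothetical, and only the field names (`elements`, `nelements`, `chemical_formula_reduced`) follow the OPTIMADE structures entry type.

```python
# Made-up example records using OPTIMADE-style structure fields.
records = [
    {"id": "a", "chemical_formula_reduced": "O2Si", "elements": ["O", "Si"], "nelements": 2},
    {"id": "b", "chemical_formula_reduced": "O3SrTi", "elements": ["O", "Sr", "Ti"], "nelements": 3},
    {"id": "c", "chemical_formula_reduced": "Si", "elements": ["Si"], "nelements": 1},
]


def filter_records(records, has_all=(), nelements=None):
    """Keep records containing every element in has_all and, if given, matching nelements."""
    wanted = set(has_all)
    return [
        r
        for r in records
        if wanted <= set(r["elements"])
        and (nelements is None or r["nelements"] == nelements)
    ]


# Rough local equivalent of the filter: elements HAS ALL "Si","O" AND nelements=2
matches = filter_records(records, has_all=["Si", "O"], nelements=2)
print([r["id"] for r in matches])  # prints ['a']
```

A real tool would parse the OPTIMADE filter grammar rather than take keyword arguments; this only shows that the data owner needs to supply nothing beyond the records themselves.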
I just came across Matgen, a set of databases from a large Chinese academic consortium, via this paper: He, B., Chi, S., Ye, A. et al. High-throughput screening platform for solid electrolytes combining hierarchical ion-transport prediction algorithms. Sci Data 7, 151 (2020). 10.1038/s41597-020-0474-y |
CMU Alloy Database out of the group of Prof Michael Widom, last update to the website/data was 2011, but it is still alive. This could be a great candidate for quickly spinning up an OPTIMADE API. |
American Mineralogist Crystal Structure Database (down at the moment, but available via Wayback Machine). Crystal structures of all minerals published across various mineralogy journals, grouped by mineral name. |
f-electron structure database (FESD). Contains LAPW DFT calcs on known lanthanide/actinide-containing crystal structures, possibly also novel/predicted structures (preprint from 2017 says available soon), and possibly also DFT+DMFT calculations. |
Somehow related to RRUFF project presumably |
MIP-3d: database focused on thermoelectric properties of known structures. Yao, M., Wang, Y., Li, X. et al. Materials informatics platform with three dimensional structures, workflow and thermoelectric applications. Sci Data 8, 236 (2021). https://doi.org/10.1038/s41597-021-01022-6 |
COD ingests AMCSD from time to time. |
MaterialsAtlas.org (website currently down but there is a preprint: "MaterialsAtlas.org: A Materials Informatics Web App Platform for Materials Discovery and Survey of State-of-the-Art" arXiv 2109.04007) (overlapping devs with Carolina MatDB above) |
As this list grows, I think it makes sense to collect a table of who we have actually contacted in the top comment... here is a draft below, feel free to suggest changes if you know that a database knows about OPTIMADE or is interested. I have ticked the ones that have attended workshops or that I have contacted personally. (moved to top comment) |
ACCDB - think this has been mentioned in the past but I couldn't find it. A large number of static databases of crystal/molecular geometries used for benchmarking new methods. Could be a nice test bed for both necroptimade and the new properties format. |
Hypothetical Zeolite Database: http://www.hypotheticalzeolites.net 4,450,542 zeolite structures with energies and topologies, going back 30+ years - not sure how much longer it will last so we should try to help out! |
Have we been in touch with the Open ForceField consortium as part of the trajectories work? Looking in particular at project 2, which could be a good collaboration (though maybe only the COD data is relevant to them!) |
PubChem, database of chemical molecules. |
I've just updated the table with a few more db's I've contacted this year, if anyone has contacts at any of the remaining ❓ marks, please feel free to reach out to them... |
Perhaps this is still an interesting database http://quantum-machine.org/ |
FFMDFPA: A FAIRification Framework for Materials Data with No-Code Flexible Semi-Structured Parser and Application Programming Interfaces: 10.1021/acs.jcim.3c00836 -- has developed a similar grammar and overlapping functionality with optimade-python-tools, but focused on VASP/Gromacs outputs specifically (I think) -- discusses future integration with OPTIMADE so we should make sure they get invited! |
Perhaps the Database of Zeolite Structures could also be added to this list. It is searchable and includes: |
https://materials.phasecraft.io/ Small database of materials properties computed (at DFT level?) with quantum computing algorithms (I guess the most interesting "properties" here are the QC architectures rather than the materials properties) |
from @ml-evs: Please add a comment with any new suggestions, but also feel free to edit this table
Google Brain's top secret DFT database; DeepMind's GNoME (currently hosted at https://optimade-gnome.odbx.science)
(hosted at optimade-misc.odbx.science)
(hosted at alexandria.odbx.science)
From discussions with @ctoher, we thought that it would be good to collect somewhere a list of other databases to reach out to, to check whether they are interested in implementing OPTiMaDe.
Here is a list (some of these were pointed out to me by Lauri Himanen at Aalto University).
(materials.nrel.gov; 2015)
More data-set oriented
More experimentally oriented
Unknown content or status