From 6d56dffed55bbb0abafebd648a32f2451bd948a6 Mon Sep 17 00:00:00 2001
From: Apoorva Srinivasan <43023448+apoorvasrinivasan26@users.noreply.github.com>
Date: Fri, 28 Apr 2023 21:02:44 -0700
Subject: [PATCH] Update README.md (#55)

Co-authored-by: Kevin M Jablonka <32935233+kjappelbaum@users.noreply.github.com>
---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 90a579c..d1e390a 100644
--- a/README.md
+++ b/README.md
@@ -18,7 +18,9 @@ Contributions are very welcome - please follow the [guidelines](CONTRIBUTING.md)
 - [PubMed central](https://www.ncbi.nlm.nih.gov/pmc/): free full-text archive
 - [PubMed](https://pubmed.ncbi.nlm.nih.gov/): abstracts and outlinks
 - [S2ORC](https://github.com/allenai/s2orc): The Semantic Scholar Open Research Corpus. 81.1M English-language academic papers spanning many academic disciplines largest publicly-available collection of machine-readable academic text). Released under CC BY-NC 4.0.
+- [BioCreative V](https://biocreative.bioinformatics.udel.edu/tasks/biocreative-v/track-3-cdr/): BC5CDR corpus consists of 1500 PubMed articles with 4409 annotated chemicals, 5818 diseases and 3116 chemical-disease interactions.
 - [Elsevier Corpus](https://elsevier.digitalcommonsdata.com/datasets/zm33cdndxs/3): This is a corpus of 40k (40,001) open access (OA) CC-BY articles from across Elsevier’s journals represent the first cross-discipline research of data at this scale to support NLP and ML research.
+
 ## structures
 - [Crystallography Open Database](http://www.crystallography.net/cod/): open-access collection of crystal structures of organic, inorganic, metal-organic compounds and minerals, excluding biopolymers. [They also derived SMILES for some compounds.](https://doi.org/10.1186/s13321-018-0279-6)