Commit

Update README.md (#55)
Co-authored-by: Kevin M Jablonka <[email protected]>
apoorvasrinivasan26 and kjappelbaum authored Apr 29, 2023
1 parent 1b37167 commit 6d56dff
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -18,7 +18,9 @@ Contributions are very welcome - please follow the [guidelines](CONTRIBUTING.md)
- [PubMed central](https://www.ncbi.nlm.nih.gov/pmc/): free full-text archive
- [PubMed](https://pubmed.ncbi.nlm.nih.gov/): abstracts and outlinks
- [S2ORC](https://github.com/allenai/s2orc): The Semantic Scholar Open Research Corpus. 81.1M English-language academic papers spanning many academic disciplines (the largest publicly available collection of machine-readable academic text). Released under CC BY-NC 4.0.
- [BioCreative V](https://biocreative.bioinformatics.udel.edu/tasks/biocreative-v/track-3-cdr/): The BC5CDR corpus consists of 1500 PubMed articles with 4409 annotated chemicals, 5818 diseases, and 3116 chemical-disease interactions.
- [Elsevier Corpus](https://elsevier.digitalcommonsdata.com/datasets/zm33cdndxs/3): A corpus of 40k (40,001) open access (OA) CC-BY articles from across Elsevier’s journals, described as the first cross-discipline research data at this scale to support NLP and ML research.

## structures

- [Crystallography Open Database](http://www.crystallography.net/cod/): open-access collection of crystal structures of organic, inorganic, metal-organic compounds and minerals, excluding biopolymers. [They also derived SMILES for some compounds.](https://doi.org/10.1186/s13321-018-0279-6)