Commit

Updated datasets 2024-08-06 UTC
actions-user committed Aug 6, 2024
1 parent c7e03a5 commit cbef3f8
Showing 5 changed files with 8 additions and 8 deletions.
2 changes: 1 addition & 1 deletion aws_open_datasets.json
@@ -28214,7 +28214,7 @@
"ARN": "arn:aws:s3:::tiger-training",
"Region": "us-west-2",
"Type": "S3 Bucket",
- "Documentation": "https://tiger.grand-challenge.org/data/",
+ "Documentation": "https://tiger.grand-challenge.org/Data/",
"Contact": "https://tiger.grand-challenge.org/contact/",
"ManagedBy": "Radboud University Medical Center",
"UpdateFrequency": "As required",
2 changes: 1 addition & 1 deletion aws_open_datasets.tsv
@@ -1033,7 +1033,7 @@ Sup3rCC Sup3rCC Generative Models arn:aws:s3:::nrel-pds-sup3rcc/models/ us-west-
Swiss Public Transport Stops data files ESRI FGDB, CSV , MapInfo, Interlis arn:aws:s3:::data.geo.admin.ch/ch.bav.haltestellen-oev/data.zip eu-west-1 S3 Bucket https://www.bav.admin.ch/bav/de/home/allgemeine-themen/fachthemen/geoinformation [email protected] Swiss Geoportal annually You may use this dataset for non-commercial purposes. You may use this dataset f aws-pds, cities, geospatial, infrastructure, mapping, traffic, transportation ['[Browse Bucket](https://data.geo.admin.ch/index.html)']
Synthea Coherent Data Set Synthetic data set that includes FHIR resources, DICOM images, genomic data, phy arn:aws:s3:::synthea-open-data/coherent/ us-east-1 S3 Bucket https://doi.org/10.3390/electronics11081199 [email protected] [The MITRE Corporation](https://www.mitre.org) Rarely [Creative Commons Attribution 4.0 International License](https://creativecommons aws-pds, health, bioinformatics, life sciences, medicine, csv, dicom, genomic, imaging
Synthea synthetic patient generator data in OMOP Common Data Model Project data files arn:aws:s3:::synthea-omop us-east-1 S3 Bucket https://github.com/synthetichealth/synthea/wiki Post any questions to [re:Post](https://repost.aws/tags/questions/TApd0Wl5P8S9O6 [Amazon Web Sevices](https://aws.amazon.com/) Not updated https://github.com/synthetichealth/synthea/blob/master/LICENSE aws-pds, bioinformatics, health, life sciences, natural language processing, us
- TIGER Training Whole slide images with corresponding annotations including tumor, stroma and tu arn:aws:s3:::tiger-training us-west-2 S3 Bucket https://tiger.grand-challenge.org/data/ https://tiger.grand-challenge.org/contact/ Radboud University Medical Center As required CC BY-NC 4.0 aws-pds, life sciences, cancer, computational pathology, grand-challenge.org, histopathology, deep learning, computer vision
+ TIGER Training Whole slide images with corresponding annotations including tumor, stroma and tu arn:aws:s3:::tiger-training us-west-2 S3 Bucket https://tiger.grand-challenge.org/Data/ https://tiger.grand-challenge.org/contact/ Radboud University Medical Center As required CC BY-NC 4.0 aws-pds, life sciences, cancer, computational pathology, grand-challenge.org, histopathology, deep learning, computer vision
TSBench TSBench Evaluation Metrics and Probabilistic Forecasts arn:aws:s3:::odp-tsbench us-east-1 S3 Bucket https://github.com/awslabs/gluon-ts/tree/master/src/gluonts/nursery/tsbench/READ [email protected], [email protected] [Amazon Web Services](https://aws.amazon.com) Not expected to be updated [CC BY](https://creativecommons.org/licenses/by/4.0/) machine learning, deep learning, meta learning, benchmark, time series forecasting
Tabula Muris https://githubcom/czbiohub/tabula-muris arn:aws:s3:::czb-tabula-muris us-east-1 S3 Bucket https://github.com/czbiohub/tabula-muris/blob/master/tabula-muris-on-aws.md If you have questions about the data, you can create an Issue at https://github. [Chan Zuckerberg Biohub](https://www.czbiohub.org/) This is the final version of the dataset, it will not be updated. https://github.com/czbiohub/tabula-muris/blob/master/LICENSE aws-pds, biology, encyclopedic, genomic, health, life sciences, medicine
Tabula Muris Senis https://githubcom/czbiohub/tabula-muris-senis arn:aws:s3:::czb-tabula-muris-senis us-west-2 S3 Bucket https://github.com/czbiohub/tabula-muris-senis/blob/master/tabula-muris-senis-on If you have questions about the data, you can create an Issue at https://github. [Chan Zuckerberg Biohub](https://www.czbiohub.org/) This is the first version of the dataset and it will be updated after the manusc https://github.com/czbiohub/tabula-muris-senis/blob/master/LICENSE aws-pds, biology, encyclopedic, genomic, health, life sciences, medicine, single-cell transcriptomics
6 changes: 3 additions & 3 deletions datasets/broad-gnomad.yaml
@@ -1,7 +1,7 @@
Name: Genome Aggregation Database (gnomAD)
Description: |
The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators that aggregates and harmonizes both exome and genome data from a wide range of large-scale human sequencing projects. The summary data provided here are released for the benefit of the wider scientific community without restriction on use.
- The v4 data set (GRCh38) spans 730,947 exome sequences and 76,215 whole-genome sequences from unrelated individuals, of [diverse ancestries](https://gnomad.broadinstitute.org/stats#diversity), sequenced sequenced as part of various disease-specific and population genetic studies.
+ The v4.1 data set (GRCh38) spans 730,947 exome sequences and 76,215 whole-genome sequences from unrelated individuals, of [diverse ancestries](https://gnomad.broadinstitute.org/stats#diversity), sequenced sequenced as part of various disease-specific and population genetic studies.
The gnomAD Principal Investigators and team can be found [here](https://gnomad.broadinstitute.org/team), and the groups that have contributed data to the current release are listed [here](https://gnomad.broadinstitute.org/about).
Sign up for the gnomAD mailing list [here](http://broad.io/gnomad_list).
Documentation: https://gnomad.broadinstitute.org/about
@@ -39,8 +39,8 @@ DataAtWork:
AuthorName: Hail Team
AuthorURL: https://hail.is/
Publications:
- - Title: A genome-wide mutational constraint map quantified from variation in 76,156 human genomes. bioRxiv 2022.03.20.485034 (2022)
- URL: https://doi.org/10.1101/2022.03.20.485034
+ - Title: A genomic mutational constraint map using variation in 76,156 human genomes. Nature 625, 92–100 (2024)
+ URL: https://doi.org/10.1038/s41586-023-06045-0
AuthorName: Chen, S., Francioli, L. C., Goodrich, J. K., Collins, R. L., Wang, Q., Alföldi, J., Watts, N. A., Vittal, C., Gauthier, L. D., Poterba, T., Wilson, M. W., Tarasova, Y., Phu, W., Yohannes, M. T., Koenig, Z., Farjoun, Y., Banks, E., Donnelly, S., Gabriel, S., Gupta, N., Ferriera, S., Tolonen, C., Novod, S., Bergelson, L., Roazen, D., Ruano-Rubio, V., Covarrubias, M., Llanwarne, C., Petrillo, N., Wade, G., Jeandet, T., Munshi, R., Tibbetts, K., gnomAD Project Consortium, O’Donnell-Luria, A., Solomonson, M., Seed, C., Martin, A. R., Talkowski, M. E., Rehm, H. L., Daly, M. J., Tiao, G., Neale, B. M., MacArthur, D. G. & Karczewski, K. J.
- Title: The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020)
URL: https://doi.org/10.1038/s41586-020-2308-7
4 changes: 2 additions & 2 deletions datasets/nrel-pds-dsgrid.yaml
@@ -47,10 +47,10 @@ DataAtWork:
- Title: dsgrid Documentation
URL: https://dsgrid.github.io/dsgrid/
AuthorName: Elaine Hale
- - Title: Python API for Accessing dsgrid Data for the [Electrification Futures Study (EFS)](https://data.openei.org/submissions/4130)
+ - Title: Python API for Accessing dsgrid Data for the Electrification Futures Study (EFS)
URL: https://github.com/dsgrid/dsgrid-legacy-efs-api
AuthorName: Elaine Hale
- - Title: dsgrid Project Standard Scenarios for the [TEMPO Project](https://data.openei.org/submissions/5958)
+ - Title: dsgrid Project Standard Scenarios for the TEMPO Project
URL: https://github.com/dsgrid/dsgrid-project-StandardScenarios/tree/main/tempo_project
AuthorName: Elaine Hale
Tools & Applications:
2 changes: 1 addition & 1 deletion datasets/tiger.yaml
@@ -1,7 +1,7 @@
Name: TIGER Training
Description: |
"This dataset contains the training data for the [Tumor InfiltratinG lymphocytes in breast cancER or TIGER](https://tiger.grand-challenge.org) challenge. TIGER is the first challenge on fully automated assessment of tumor-infiltrating lymphocytes (TILs) in breast cancer histopathology slides. TILs are proving to be an important biomarker in cancer patients as they can play a part in killing tumor cells, particularly in some types of breast cancer. Identifying and measuring TILs can help to better target treatments, particularly immunotherapy, and may result in lower levels of other more aggressive treatments, including chemotherapy."
- Documentation: https://tiger.grand-challenge.org/data/
+ Documentation: https://tiger.grand-challenge.org/Data/
Contact: https://tiger.grand-challenge.org/contact/
ManagedBy: Radboud University Medical Center
UpdateFrequency: As required