Datasets not found #39

Open
icyjayd opened this issue May 23, 2024 · 0 comments

icyjayd commented May 23, 2024

I have installed this package, but I can't load the datasets.

My code is as follows:

from genomic_benchmarks.data_check import list_datasets
from genomic_benchmarks.dataset_getters.pytorch_datasets import get_dataset
from genomic_benchmarks.data_check import info
from genomic_benchmarks.loc2seq import download_dataset

When I try to download, for example, 'demo_coding_vs_intergenomic_seqs', I get FileNotFoundError: Dataset demo_coding_vs_intergenomic_seqs not found.

For completeness, I wrote code that attempts to download each of the datasets.

for dset in list_datasets():
    try:
        get_dataset(dset, split='train')
        print("success!")
    except Exception:
        print(dset, "not found")

The output is as follows:

demo_coding_vs_intergenomic_seqs not found
human_enhancers_cohn not found
human_ocr_ensembl not found
demo_human_or_worm not found
human_ensembl_regulatory not found
drosophila_enhancers_stark not found
dummy_mouse_enhancers_ensembl not found
human_enhancers_ensembl not found
human_nontata_promoters not found

The same occurs with the info and download_dataset functions as well. Any help on what I'm doing wrong would be appreciated.
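
In case it helps with diagnosis, here is a variant of the loop that keeps the exception type and message for each dataset instead of printing only "not found". It is a minimal sketch: `try_each` and `report` are hypothetical helpers written for this issue, not part of genomic_benchmarks, and the commented-out usage assumes the same imports shown earlier.

```python
def try_each(names, loader):
    """Call loader(name) for every name and record the outcome.

    Returns a dict mapping each name to None on success, or to an
    "ExceptionType: message" string if the loader raised.
    """
    results = {}
    for name in names:
        try:
            loader(name)
            results[name] = None
        except Exception as exc:
            # Keep the details so the failure reason is visible.
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


def report(results):
    """Print one line per name: 'ok' or the recorded error."""
    for name, error in results.items():
        print(name, "ok" if error is None else error)


# Intended usage with the imports from this issue (not run here):
# from genomic_benchmarks.data_check import list_datasets
# from genomic_benchmarks.dataset_getters.pytorch_datasets import get_dataset
# report(try_each(list_datasets(), lambda d: get_dataset(d, split="train")))
```

Passing the loader as an argument means the same helper can also be pointed at info or download_dataset to compare the messages they produce.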
