Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An error (related to rule concat_deepSea) when running https://deeprvat.readthedocs.io/en/latest/annotations.html#running-the-pipeline #117

Closed
pichuan opened this issue Aug 3, 2024 · 16 comments

Comments

@pichuan
Copy link

pichuan commented Aug 3, 2024

Hi,

When I'm running this step in https://deeprvat.readthedocs.io/en/latest/annotations.html#running-the-pipeline:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ time snakemake -j $(nproc) -s ../../pipelines/annotations.snakefile --configfile ../config/deeprvat_annotation_config.yaml --use-conda

I got this error:

[Sat Aug  3 21:38:28 2024]
Error in rule concat_deepSea:
    jobid: 15
    output: output_dir/annotations/all_variants.deepSea.parquet
    shell:
        deeprvat_annotations concatenate-deepsea  output_dir/annotations/all_variants.deepSea.parquet 8
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

It seems like the usage here is missing one flag. Specifically, when I ran that command, I got:

Usage: deeprvat_annotations concatenate-deepsea [OPTIONS] DEEPSEA_FILES
                                                OUT_FILE NJOBS

I searched a bit, and found that in the ../../pipelines/annotations/deepSEA.snakefile file, there is this rule:

rule concat_deepSea:
    input:
        deepSEAscoreFiles=expand(rules.deepSea.output, file_stem=file_stems),
    params:
        joined=lambda w, input: ",".join(input.deepSEAscoreFiles),
    threads: 8
    resources:
        mem_mb=lambda wildcards, attempt: 50_000 * (attempt + 1),
    output:
        anno_dir / "all_variants.deepSea.parquet",
    shell:
        " ".join(
            [
                "deeprvat_annotations",
                "concatenate-deepsea",
                "{params.joined}",
                "{output}",
                "{threads}",
            ]
        )

For whatever reason, it doesn't seem like I have the deepSEAscoreFiles , which I assume should be generated by a previous step(?).

So far I'm still only running through the documentation steps, so I'm not using my own VCF input files yet. (And I did already go through the preprocessing pipeline.)

If you know what I might be missing here, please let me know. Thank you!

@Marcel-Mueck
Copy link
Collaborator

hi @pichuan, it looks like your file_stems variable is an empty list for some reason. Are you running it on example data or your own? how does your example/config/deeprvat_annotation_config.yaml look like, especially the source_variant_file_pattern? Does it match with the names of the input data you're using?
Regards, Marcel

@pichuan
Copy link
Author

pichuan commented Aug 5, 2024

Hi @Marcel-Mueck ,

(1) I'm still running through the example input data. At this point, I've finished running https://deeprvat.readthedocs.io/en/latest/preprocessing.html. And according to the documentation there, I believe these were the output from the preprocessing pipeline:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ ls ../../example/preprocess/workdir/preprocesed/
genotypes.h5  genotypes_chr21.h5  genotypes_chr22.h5

(2) Here is what's in my deeprvat_annotation_config.yaml:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ grep source_variant_file_pattern ../config/deeprvat_annotation_config.yaml
source_variant_file_pattern : test_vcf_data_c{chr}_b{block}

Actually, at this point, I haven't modified deeprvat_annotation_config.yaml, so it's exactly the same as the original version. I know I'll need to comment out lines like spliceAI and primateAI etc because I won't be using them. I'll do that soon!

(3) I don't know exactly what the source_variant_file_pattern should be point to, so I searched for every files with that pattern under my directory now:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ find ../../ -type f -name *test_vcf_data_c*
../../tests/preprocessing/test_data/process_individual_missingness/one_missing/input/indmiss/sites/test_vcf_data_c22_b1.tsv
../../tests/preprocessing/test_data/process_individual_missingness/one_missing/input/indmiss/sites/test_vcf_data_c21_b1.tsv
../../tests/preprocessing/test_data/process_individual_missingness/one_missing/input/indmiss/samples/test_vcf_data_c22_b1.tsv
../../tests/preprocessing/test_data/process_individual_missingness/one_missing/input/indmiss/samples/test_vcf_data_c21_b1.tsv
../../tests/preprocessing/test_data/process_individual_missingness/one_missing/input/indmiss/stats/test_vcf_data_c22_b1.stats
../../tests/preprocessing/test_data/process_individual_missingness/one_missing/input/indmiss/stats/test_vcf_data_c21_b1.stats
../../tests/preprocessing/test_data/process_individual_missingness/two_missing/input/indmiss/sites/test_vcf_data_c22_b1.tsv
../../tests/preprocessing/test_data/process_individual_missingness/two_missing/input/indmiss/sites/test_vcf_data_c21_b1.tsv
../../tests/preprocessing/test_data/process_individual_missingness/two_missing/input/indmiss/samples/test_vcf_data_c22_b1.tsv
../../tests/preprocessing/test_data/process_individual_missingness/two_missing/input/indmiss/samples/test_vcf_data_c21_b1.tsv
../../tests/preprocessing/test_data/process_individual_missingness/two_missing/input/indmiss/stats/test_vcf_data_c22_b1.stats
../../tests/preprocessing/test_data/process_individual_missingness/two_missing/input/indmiss/stats/test_vcf_data_c21_b1.stats
../../tests/preprocessing/test_data/process_individual_missingness/no_missing/input/indmiss/sites/test_vcf_data_c22_b1.tsv
../../tests/preprocessing/test_data/process_individual_missingness/no_missing/input/indmiss/sites/test_vcf_data_c21_b1.tsv
../../tests/preprocessing/test_data/process_individual_missingness/no_missing/input/indmiss/samples/test_vcf_data_c22_b1.tsv
../../tests/preprocessing/test_data/process_individual_missingness/no_missing/input/indmiss/samples/test_vcf_data_c21_b1.tsv
../../tests/preprocessing/test_data/process_individual_missingness/no_missing/input/indmiss/stats/test_vcf_data_c22_b1.stats
../../tests/preprocessing/test_data/process_individual_missingness/no_missing/input/indmiss/stats/test_vcf_data_c21_b1.stats
../../example/annotations/input_dir/vcf/test_vcf_data_c21_b1.vcf.gz
../../example/annotations/input_dir/vcf/test_vcf_data_c22_b1.vcf.gz
../../example/preprocess/workdir/norm/variants/test_vcf_data_c22_b1.tsv.gz
../../example/preprocess/workdir/norm/variants/test_vcf_data_c21_b1.tsv.gz
../../example/preprocess/workdir/norm/bcf/test_vcf_data_c21_b1.bcf
../../example/preprocess/workdir/norm/bcf/test_vcf_data_c22_b1.bcf
../../example/preprocess/workdir/norm/sparse/test_vcf_data_c22_b1.tsv.gz
../../example/preprocess/workdir/norm/sparse/test_vcf_data_c21_b1.tsv.gz
../../example/preprocess/workdir/qc/read_depth/test_vcf_data_c22_b1.tsv.gz
../../example/preprocess/workdir/qc/read_depth/test_vcf_data_c21_b1.tsv.gz
../../example/preprocess/workdir/qc/indmiss/sites/test_vcf_data_c22_b1.tsv
../../example/preprocess/workdir/qc/indmiss/sites/test_vcf_data_c21_b1.tsv
../../example/preprocess/workdir/qc/indmiss/samples/test_vcf_data_c22_b1.tsv
../../example/preprocess/workdir/qc/indmiss/samples/test_vcf_data_c21_b1.tsv
../../example/preprocess/workdir/qc/indmiss/stats/test_vcf_data_c22_b1.stats
../../example/preprocess/workdir/qc/indmiss/stats/test_vcf_data_c21_b1.stats
../../example/preprocess/workdir/qc/allelic_imbalance/test_vcf_data_c22_b1.tsv.gz
../../example/preprocess/workdir/qc/allelic_imbalance/test_vcf_data_c21_b1.tsv.gz
../../example/preprocess/workdir/qc/hwe/test_vcf_data_c22_b1.tsv.gz
../../example/preprocess/workdir/qc/hwe/test_vcf_data_c21_b1.tsv.gz
../../example/preprocess/workdir/qc/varmiss/test_vcf_data_c22_b1.tsv.gz
../../example/preprocess/workdir/qc/varmiss/test_vcf_data_c21_b1.tsv.gz
../../example/preprocess/data/vcf/test_vcf_data_c21_b1.vcf.gz
../../example/preprocess/data/vcf/test_vcf_data_c22_b1.vcf.gz

Looking a bit closely, I wonder if this line in the deeprvat_annotation_config.yaml file doesn't match the file structure above?

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ grep source_variant_dir ../config/deeprvat_annotation_config.yaml
source_variant_dir : preprocessing_workdir/norm/bcf

Did I do something wrong in my steps, or is this something that you should update in the example documentation as well? Thanks!

@Marcel-Mueck
Copy link
Collaborator

linking #118

@Marcel-Mueck
Copy link
Collaborator

Hey @pichuan I updated the docs and performed some small changes to the config in #118 . We will soon merge this with the master branch

@Marcel-Mueck
Copy link
Collaborator

Hello @pichuan, the PR #118 is merged now. I corrected some path of the annotation config file:
image

I also updated the docs, so when you want to run the example data, I can recommand checking this part:
https://deeprvat.readthedocs.io/en/latest/annotations.html#running-the-annotation-pipeline-on-example-data
and when you want to run it on your own data, this part is relevent:
https://deeprvat.readthedocs.io/en/latest/annotations.html#running-the-pipeline-on-your-own-data
Specifically, when you want to run it without the spliceAI and primateAI data, you can modify the additional_vep_plugin_cmds inside the config and add include_absplice: False to the config.
Regards, Marcel Mück

@pichuan
Copy link
Author

pichuan commented Aug 7, 2024

Thank you!

On the same machine, I get the latest version of the code. I reran the preprocessing step first, and then:

I have this at the end of my yaml file:

include_absplice: False
additional_vep_plugin_cmds:
  cadd : CADD,annotation_data/cadd/whole_genome_SNVs.tsv.gz,annotation_data/cadd/gnomad.genomes.r3.0.indel.tsv.gz
  #spliceAI : SpliceAI,snv=annotation_data/spliceAI/spliceai_scores.raw.snv.hg38.vcf.gz,indel=annotation_data/spliceAI/spliceai_scores.raw.indel.hg38.vcf.gz
  #primateAI : PrimateAI,annotation_data/primateAI/PrimateAI_scores_v0.2_GRCh38_sorted.tsv.bgz
  condel: Condel,repo_dir/ensembl-vep/Plugin/config/Condel/config,s,2
  alphamissense : AlphaMissense,file=annotation_data/AlphaMissense/AlphaMissense_hg38.tsv.gz

Then I ran this command, and got this error:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ time snakemake -j $(nproc) -s ../../pipelines/annotations.snakefile --configfile ../config/deeprvat_annotation_config.yaml --use-conda
Building DAG of jobs...
MissingInputException in rule vep  in line 246 of /home/pichuan/deeprvat/pipelines/annotations.snakefile:
Missing input files for rule vep:
    output: output_dir/annotations/test_vcf_data_c21_b1_vep_anno.tsv
    wildcards: file_stem=test_vcf_data_c21_b1
    affected files:
        reference/GRCh38.primary_assembly.genome.fa

real	0m0.670s
user	0m0.561s
sys	0m0.109s

Which seems to say that my vep step is anticipating like output_dir/annotations/test_vcf_data_c21_b1_vep_anno.tsv.

I did a search in my directory, and didn't find a filename like that:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ find ../../ -name test_vcf_data_c21_b1_vep_anno.tsv

I can try to get a new machine and completely from scratch. Do you think that would solve the problem?

Thanks!

@Marcel-Mueck
Copy link
Collaborator

Hi @pichuan, looks like the GRCh38.primary_assembly.genome.fa file is missing in your example/annotations/reference directory. You can download it here(don't forget that you have to unzip it (gzip -d)). Alternatively, you can change the fasta_dir : reference line in the config to fasta_dir : ../preprocess/workdir/reference(to use the same fasta dir as in the preprocessing pipeline). Btw, don't forget that you need a GTF file as well in the same directory as the fasta file, which you can download here.

@pichuan
Copy link
Author

pichuan commented Aug 8, 2024

Thanks @Marcel-Mueck ! I downloaded that in the preprocess step, but didn't quite realize it wasn't pointing to the same one. Thanks for pointing that out. I found it here:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ find ../../ -name GRCh38.primary_assembly.genome.fa
../../example/preprocess/workdir/reference/GRCh38.primary_assembly.genome.fa

It seems like I already have the GTF file:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ ls ../../example/annotations/reference
gencode.v44.annotation.gtf.gz  hg38.fa

For convenience, I just did:

ln -sf $HOME/deeprvat/example/preprocess/workdir/reference/GRCh38.primary_assembly.genome.fa $HOME/deeprvat/example/annotations/reference/GRCh38.primary_assembly.genome.fa 

Then I ran this again:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ time snakemake -j $(nproc) -s ../../pipelines/annotations.snakefile --configfile ../config/deeprvat_annotation_config.yaml --use-conda

This time it made a bit more progress. This one failed:

[Thu Aug  8 17:16:07 2024]
Error in rule vep:
    jobid: 13
    output: output_dir/annotations/test_vcf_data_c21_b1_vep_anno.tsv
    shell:
            vep --input_file output_dir/annotations/tmp/test_vcf_data_c21_b1_stripped.vcf.gz --output_file output_dir/annotations/test_vcf_data_c21_b1_vep_anno.tsv --species homo_sapiens --assembly GRCh38 --format vcf  --offline --cache --dir_cache repo_dir/ensembl-vep/cache --dir_plugins repo_dir/ensembl-vep/Plugins --fork 5 --fasta reference/GRCh38.primary_assembly.genome.fa --tab --total_length --no_escape --polyphen s --sift s --canonical --protein --biotype --force_overwrite --no_stats --per_gene --pick_order biotype,mane_select,mane_plus_clinical,canonical,appris,tsl,ccds,rank,length,ensembl,refseq --plugin CADD,annotation_data/cadd/whole_genome_SNVs.tsv.gz,annotation_data/cadd/gnomad.genomes.r3.0.indel.tsv.gz --plugin Condel,repo_dir/ensembl-vep/Plugin/config/Condel/config,s,2 --plugin AlphaMissense,file=annotation_data/AlphaMissense/AlphaMissense_hg38.tsv.gz
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

From the command above, I wonder if it requires repo_dir to be under the same directory as my current directory, so I tried:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ ln -sf $HOME/repo_dir .
(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ time snakemake -j $(nproc) -s ../../pipelines/annotations.snakefile --configfile ../config/deeprvat_annotation_config.yaml --use-conda

This time it failed with:

[Thu Aug  8 17:19:44 2024]
Error in rule deepSea:
    jobid: 31
    output: output_dir/annotations/test_vcf_data_c22_b1.CLI.deepseapredict.diff.tsv
    conda-env: kipoi-veff2
    shell:
        kipoi_veff2_predict output_dir/annotations/tmp/test_vcf_data_c22_b1_variants_header.vcf.gz reference/GRCh38.primary_assembly.genome.fa output_dir/annotations/test_vcf_data_c22_b1.CLI.deepseapredict.diff.tsv -l 1000 -m 'DeepSEA/predict' -s 'diff'
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Then I tried to directly run that command:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ kipoi_veff2_predict output_dir/annotations/tmp/test_vcf_data_c22_b1_variants_header.vcf.gz reference/GRCh38.primary_assembly.genome.fa output_dir/annotations/test_vcf_data_c22_b1.CLI.deepseapredict.diff.tsv -l 1000 -m 'DeepSEA/predict' -s 'diff'

But it told me: kipoi_veff2_predict: command not found

I'm pretty sure previously I installed that on this machine. I'll go back and check. I'll let you know how it goes. But if it's clear what I'm missing here, please let me know.

@pichuan
Copy link
Author

pichuan commented Aug 8, 2024

I went back and checked:

  1. If I activate the other one: mamba activate kipoi-veff2, then I have that binary in the environment:
$ kipoi_veff2_predict
Usage: kipoi_veff2_predict [OPTIONS] INPUT_VCF INPUT_FASTA OUTPUT_TSV
Try 'kipoi_veff2_predict --help' for help.

Error: Missing argument 'INPUT_VCF'.
  1. I checked the full log. It seems to have Activating conda environment: kipoi-veff2, so I'd expect it to work. However, I'm still getting:
[Thu Aug  8 17:32:48 2024]
Error in rule deepSea:
    jobid: 29
    output: output_dir/annotations/test_vcf_data_c21_b1.CLI.deepseapredict.diff.tsv
    conda-env: kipoi-veff2
    shell:
        kipoi_veff2_predict output_dir/annotations/tmp/test_vcf_data_c21_b1_variants_header.vcf.gz reference/GRCh38.primary_assembly.genome.fa output_dir/annotations/test_vcf_data_c21_b1.CLI.deepseapredict.diff.tsv -l 1000 -m 'DeepSEA/predict' -s 'diff'
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
  1. Then, I decided to try switching the environment myself and see it would run:
(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ mamba activate kipoi-veff2
(kipoi-veff2) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ kipoi_veff2_predict output_dir/annotations/tmp/test_vcf_data_c21_b1_variants_header.vcf.gz reference/GRCh38.primary_assembly.genome.fa output_dir/annotations/test_vcf_data_c21_b1.CLI.deepseapredict.diff.tsv -l 1000 -m 'DeepSEA/predict' -s 'diff'

Here is the full log from my attempt in step 3:

Adding diff from kipoi_veff2.scores
Already up to date.
Using downloaded and verified file: /home/pichuan/.kipoi/models/DeepSEA/predict/downloaded/model_files/weights/89e640bf6bdbe1ff165f484d9796efc7
Traceback (most recent call last):
  File "/home/pichuan/miniforge3/envs/kipoi-veff2/bin/kipoi_veff2_predict", line 33, in <module>
    sys.exit(load_entry_point('kipoi-veff2', 'console_scripts', 'kipoi_veff2_predict')())
  File "/home/pichuan/miniforge3/envs/kipoi-veff2/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/pichuan/miniforge3/envs/kipoi-veff2/lib/python3.7/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/pichuan/miniforge3/envs/kipoi-veff2/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/pichuan/miniforge3/envs/kipoi-veff2/lib/python3.7/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/pichuan/repo_dir/kipoi-veff2/kipoi_veff2/cli.py", line 154, in score_variants
    model_config, input_vcf, input_fasta, output_tsv, scoring_function
  File "/home/pichuan/repo_dir/kipoi-veff2/kipoi_veff2/variant_centered.py", line 236, in score_variants
    kipoi_model = kipoi.get_model(model_config.model)
  File "/home/pichuan/miniforge3/envs/kipoi-veff2/lib/python3.7/site-packages/kipoi/model.py", line 184, in get_model
    mod = infer_pyt_class(md.args)(**md.args)
  File "/home/pichuan/miniforge3/envs/kipoi-veff2/lib/python3.7/site-packages/kipoi/model.py", line 904, in __init__
    import torch
  File "/home/pichuan/miniforge3/envs/kipoi-veff2/lib/python3.7/site-packages/torch/__init__.py", line 218, in <module>
    from torch._C import *  # noqa: F403
ImportError: /home/pichuan/miniforge3/envs/kipoi-veff2/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so: undefined symbol: iJIT_NotifyEvent

Any advice for me to run that step correctly? Thank you!

@Marcel-Mueck
Copy link
Collaborator

Looks like it could have something to do with this pytorch bug. Could you run mamba list|grep mkl inside thekipoi-veff2 env to check which version of mkl is installed in this env? The bug is in version mkl==2024.1.0 you may have to change the version of mklin the kipoi-veff2 env

@pichuan
Copy link
Author

pichuan commented Aug 9, 2024

Here is what I have:

(kipoi-veff2) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ mamba list|grep mkl
blas                      1.0                         mkl    conda-forge
libblas                   3.9.0            23_linux64_mkl    conda-forge
libcblas                  3.9.0            23_linux64_mkl    conda-forge
liblapack                 3.9.0            23_linux64_mkl    conda-forge
mkl                       2024.1.0           ha957f24_693    conda-forge

Any suggestions on how I change the mld version in that environment? I searched for the files under the directory, but it seems like it was not explicitly set in any config files there:

(kipoi-veff2) pichuan@pichuan-gpu:~/repo_dir/kipoi-veff2$ find . -type f -exec grep -H mkl {} \;
grep: ./.git/objects/pack/pack-a6cbadf6ef55db9b5ff5c1622e55771b320ad96e.pack: binary file matches

@pichuan
Copy link
Author

pichuan commented Aug 9, 2024

I tried this manually:

(kipoi-veff2) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ mamba install mkl==2024.0

which now seems to work:

(kipoi-veff2) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ kipoi_veff2_predict output_dir/annotations/tmp/test_vcf_data_c21_b1_variants_header.vcf.gz reference/GRCh38.primary_assembly.genome.fa output_dir/annotations/test_vcf_data_c21_b1.CLI.deepseapredict.diff.tsv -l 1000 -m 'DeepSEA/predict' -s 'diff'
Adding diff from kipoi_veff2.scores
Already up to date.
Using downloaded and verified file: /home/pichuan/.kipoi/models/DeepSEA/predict/downloaded/model_files/weights/89e640bf6bdbe1ff165f484d9796efc7
(kipoi-veff2) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ ls -lh output_dir/annotations/test_vcf_data_c21_b1.CLI.deepseapredict.diff.tsv
-rw-rw-r-- 1 pichuan pichuan 124K Aug  9 23:01 output_dir/annotations/test_vcf_data_c21_b1.CLI.deepseapredict.diff.tsv

I'll keep going from here.

It'll be nice to figure out how to have this fix in a config file, or as a step that I write into a script. I guess - worst case I could just write down what I did in a script.

(For context: My hope here is that I can write all the steps in a script, so that eventually I can use this to automatically set up a machine, and eventually so that I can scale this to run on more files automatically (bring up instances, point to an input, set up the environment, etc all through just one script))

@pichuan
Copy link
Author

pichuan commented Aug 9, 2024

Worklog:

OK, now I'm got around the pytorch error, I ran this command again:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ time snakemake -j $(nproc) -s ../../pipelines/annotations.snakefile --configfile ../config/deeprvat_annotation_config.yaml --use-conda

Here is a next blocker:

[Fri Aug  9 23:05:20 2024]
Error in rule vep:
    jobid: 13
    output: output_dir/annotations/test_vcf_data_c21_b1_vep_anno.tsv
    shell:
            vep --input_file output_dir/annotations/tmp/test_vcf_data_c21_b1_stripped.vcf.gz --output_file output_dir/annotations/test_vcf_data_c21_b1_vep_anno.tsv --species homo_sapiens --assembly GRCh38 --format vcf  --offline --cache --dir_cache repo_dir/ensembl-vep/cache --dir_plugins repo_dir/ensembl-vep/Plugins --fork 5 --fasta reference/GRCh38.primary_assembly.genome.fa --tab --total_length --no_escape --polyphen s --sift s --canonical --protein --biotype --force_overwrite --no_stats --per_gene --pick_order biotype,mane_select,mane_plus_clinical,canonical,appris,tsl,ccds,rank,length,ensembl,refseq --plugin CADD,annotation_data/cadd/whole_genome_SNVs.tsv.gz,annotation_data/cadd/gnomad.genomes.r3.0.indel.tsv.gz --plugin Condel,repo_dir/ensembl-vep/Plugin/config/Condel/config,s,2 --plugin AlphaMissense,file=annotation_data/AlphaMissense/AlphaMissense_hg38.tsv.gz
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

When I ran that command directly, I got this error:

MSG: ERROR: Cache directory repo_dir/ensembl-vep/cache/homo_sapiens not found

I searched for it:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ find -L $HOME/deeprvat  -name cache -type d
/home/pichuan/deeprvat/example/annotations/repo_dir/ensembl-vep/t/testdata/cache

Not exactly sure why, but the path is different. So I did:

sed -i -e 's|vep_cache_dir : repo_dir/ensembl-vep/cache/|vep_cache_dir: repo_dir/ensembl-vep/t/testdata/cache|g' ../config/deeprvat_annotation_config.yaml

and try this again:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ time snakemake -j $(nproc) -s ../../pipelines/annotations.snakefile --configfile ../config/deeprvat_annotation_config.yaml --use-conda

Still seems to have a few errors:

[Fri Aug  9 23:14:14 2024]
Error in rule vep:
    jobid: 13
    output: output_dir/annotations/test_vcf_data_c21_b1_vep_anno.tsv
    shell:
            vep --input_file output_dir/annotations/tmp/test_vcf_data_c21_b1_stripped.vcf.gz --output_file output_dir/annotations/test_vcf_data_c21_b1_vep_anno.tsv --species homo_sapiens --assembly GRCh38 --format vcf  --offline --cache --dir_cache repo_dir/ensembl-vep/t/testdata/cache --dir_plugins repo_dir/ensembl-vep/Plugins --fork 5 --fasta reference/GRCh38.primary_assembly.genome.fa --tab --total_length --no_escape --polyphen s --sift s --canonical --protein --biotype --force_overwrite --no_stats --per_gene --pick_order biotype,mane_select,mane_plus_clinical,canonical,appris,tsl,ccds,rank,length,ensembl,refseq --plugin CADD,annotation_data/cadd/whole_genome_SNVs.tsv.gz,annotation_data/cadd/gnomad.genomes.r3.0.indel.tsv.gz --plugin Condel,repo_dir/ensembl-vep/Plugin/config/Condel/config,s,2 --plugin AlphaMissense,file=annotation_data/AlphaMissense/AlphaMissense_hg38.tsv.gz
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

[Fri Aug  9 23:14:14 2024]
Error in rule vep:
    jobid: 20
    output: output_dir/annotations/test_vcf_data_c22_b1_vep_anno.tsv
    shell:
            vep --input_file output_dir/annotations/tmp/test_vcf_data_c22_b1_stripped.vcf.gz --output_file output_dir/annotations/test_vcf_data_c22_b1_vep_anno.tsv --species homo_sapiens --assembly GRCh38 --format vcf  --offline --cache --dir_cache repo_dir/ensembl-vep/t/testdata/cache --dir_plugins repo_dir/ensembl-vep/Plugins --fork 5 --fasta reference/GRCh38.primary_assembly.genome.fa --tab --total_length --no_escape --polyphen s --sift s --canonical --protein --biotype --force_overwrite --no_stats --per_gene --pick_order biotype,mane_select,mane_plus_clinical,canonical,appris,tsl,ccds,rank,length,ensembl,refseq --plugin CADD,annotation_data/cadd/whole_genome_SNVs.tsv.gz,annotation_data/cadd/gnomad.genomes.r3.0.indel.tsv.gz --plugin Condel,repo_dir/ensembl-vep/Plugin/config/Condel/config,s,2 --plugin AlphaMissense,file=annotation_data/AlphaMissense/AlphaMissense_hg38.tsv.gz
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

[Fri Aug  9 23:14:18 2024]
Error in rule deepSea_PCA:
    jobid: 27
    output: output_dir/annotations/tmp/deepSEA_PCA/deepsea_pca.parquet
    shell:
        mkdir -p output_dir/annotations/tmp/deepSEA_PCA && deeprvat_annotations deepsea-pca output_dir/annotations/all_variants.deepSea.parquet output_dir/annotations/tmp/deepSEA_PCA/pca.npy output_dir/annotations/tmp/deepSEA_PCA/deepSEA_means_SDs.parquet output_dir/annotations/tmp/deepSEA_PCA --n-components 100
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

I ran this directly, and got a few more messages that's useful to follow up on :

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ vep --input_file output_dir/annotations/tmp/test_vcf_data_c21_b1_stripped.vcf.gz --output_file output_dir/annotations/test_vcf_data_c21_b1_vep_anno.tsv --species homo_sapiens --assembly GRCh38 --format vcf  --offline --cache --dir_cache repo_dir/ensembl-vep/t/testdata/cache --dir_plugins repo_dir/ensembl-vep/Plugins --fork 5 --fasta reference/GRCh38.primary_assembly.genome.fa --tab --total_length --no_escape --polyphen s --sift s --canonical --protein --biotype --force_overwrite --no_stats --per_gene --pick_order biotype,mane_select,mane_plus_clinical,canonical,appris,tsl,ccds,rank,length,ensembl,refseq --plugin CADD,annotation_data/cadd/whole_genome_SNVs.tsv.gz,annotation_data/cadd/gnomad.genomes.r3.0.indel.tsv.gz --plugin Condel,repo_dir/ensembl-vep/Plugin/config/Condel/config,s,2 --plugin AlphaMissense,file=annotation_data/AlphaMissense/AlphaMissense_hg38.tsv.gz
Smartmatch is experimental at /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
WARNING: Failed to instantiate plugin CADD: ERROR: Data file annotation_data/cadd/whole_genome_SNVs.tsv.gz not found

WARNING: Failed to compile plugin AlphaMissense: Can't locate AlphaMissense.pm in @INC (you may need to install the AlphaMissense module) (@INC contains: repo_dir/ensembl-vep/Plugins /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0 /home/pichuan/miniforge3/envs/deeprvat_annotations/lib/perl5/5.32/site_perl /home/pichuan/miniforge3/envs/deeprvat_annotations/lib/perl5/site_perl /home/pichuan/miniforge3/envs/deeprvat_annotations/lib/perl5/5.32/vendor_perl /home/pichuan/miniforge3/envs/deeprvat_annotations/lib/perl5/vendor_perl /home/pichuan/miniforge3/envs/deeprvat_annotations/lib/perl5/5.32/core_perl /home/pichuan/miniforge3/envs/deeprvat_annotations/lib/perl5/core_perl .) at (eval 46) line 2.
BEGIN failed--compilation aborted at (eval 46) line 2.


-------------------- EXCEPTION --------------------
MSG: ERROR: No cache found for homo_sapiens, version 110

STACK Bio::EnsEMBL::VEP::CacheDir::dir /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/CacheDir.pm:322
STACK Bio::EnsEMBL::VEP::CacheDir::init /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/CacheDir.pm:219
STACK Bio::EnsEMBL::VEP::CacheDir::new /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/CacheDir.pm:111
STACK Bio::EnsEMBL::VEP::AnnotationSourceAdaptor::get_all_from_cache /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/AnnotationSourceAdaptor.pm:116
STACK Bio::EnsEMBL::VEP::AnnotationSourceAdaptor::get_all /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/AnnotationSourceAdaptor.pm:92
STACK Bio::EnsEMBL::VEP::BaseRunner::get_all_AnnotationSources /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/BaseRunner.pm:170
STACK Bio::EnsEMBL::VEP::Runner::init /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/Runner.pm:128
STACK Bio::EnsEMBL::VEP::Runner::run /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/Runner.pm:200
STACK toplevel /home/pichuan/miniforge3/envs/deeprvat_annotations/bin/vep:46
Date (localtime)    = Fri Aug  9 23:15:30 2024
Ensembl API version = 110
---------------------------------------------------

It seems like I don't have whole_genome_SNVs.tsv.gz, so I looked for it. It's because I didn't have annotation_data in the right place, so I just linked it and reran:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ find $HOME -type f -name whole_genome_SNVs.tsv.gz*
/home/pichuan/annotation_data/cadd/whole_genome_SNVs.tsv.gz
/home/pichuan/annotation_data/cadd/whole_genome_SNVs.tsv.gz.tbi
(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ ln -sf $HOME/annotation_data .

Try again:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ vep --input_file output_dir/annotations/tmp/test_vcf_data_c21_b1_stripped.vcf.gz --output_file output_dir/annotations/test_vcf_data_c21_b1_vep_anno.tsv --species homo_sapiens --assembly GRCh38 --format vcf  --offline --cache --dir_cache repo_dir/ensembl-vep/t/testdata/cache --dir_plugins repo_dir/ensembl-vep/Plugins --fork 5 --fasta reference/GRCh38.primary_assembly.genome.fa --tab --total_length --no_escape --polyphen s --sift s --canonical --protein --biotype --force_overwrite --no_stats --per_gene --pick_order biotype,mane_select,mane_plus_clinical,canonical,appris,tsl,ccds,rank,length,ensembl,refseq --plugin CADD,annotation_data/cadd/whole_genome_SNVs.tsv.gz,annotation_data/cadd/gnomad.genomes.r3.0.indel.tsv.gz --plugin Condel,repo_dir/ensembl-vep/Plugin/config/Condel/config,s,2 --plugin AlphaMissense,file=annotation_data/AlphaMissense/AlphaMissense_hg38.tsv.gz
Smartmatch is experimental at /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
WARNING: Failed to compile plugin AlphaMissense: Can't locate AlphaMissense.pm in @INC (you may need to install the AlphaMissense module) (@INC contains: repo_dir/ensembl-vep/Plugins /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0 /home/pichuan/miniforge3/envs/deeprvat_annotations/lib/perl5/5.32/site_perl /home/pichuan/miniforge3/envs/deeprvat_annotations/lib/perl5/site_perl /home/pichuan/miniforge3/envs/deeprvat_annotations/lib/perl5/5.32/vendor_perl /home/pichuan/miniforge3/envs/deeprvat_annotations/lib/perl5/vendor_perl /home/pichuan/miniforge3/envs/deeprvat_annotations/lib/perl5/5.32/core_perl /home/pichuan/miniforge3/envs/deeprvat_annotations/lib/perl5/core_perl .) at (eval 46) line 2.
BEGIN failed--compilation aborted at (eval 46) line 2.

So, I think next I'll need to install AlphaMissense. I'll try that and see if I can proceed.

@pichuan
Copy link
Author

pichuan commented Aug 9, 2024

Worklog : install AlphaMissense

I went back and did this:

(deeprvat_annotations) pichuan@pichuan-gpu:~/repo_dir/ensembl-vep$ perl INSTALL.pl

I went through the steps, and made sure to install this Plugins this time:

  34: AlphaMissense            - Annotates missense variants with the pre-computed AlphaMissense pathogenicity scores. AlphaMissense is a deep learning model developed by Google DeepMind that predicts the pathogenicity of single nucleotide missense variants.

Log:

 - installing "AlphaMissense"
AlphaMissense already installed; do you want to overwrite (probably OK if updating) (y/n)? y
 - This plugin requires data
 - See /home/pichuan/.vep/Plugins/AlphaMissense.pm for details
 - OK

NB: One or more plugins that you have installed will not work without installation or downloading data; see logs above

All done

Trying again:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ time snakemake -j $(nproc) -s ../../pipelines/annotations.snakefile --configfile ../config/deeprvat_annotation_config.yaml --use-conda

This is when I realize the problem is where the Plugins is located. Instead of vep_plugin_dir : repo_dir/ensembl-vep/Plugins, it seems like I should have vep_plugin_dir : /home/pichuan/.vep/Plugins. Let me try that:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ sed -i -e 's|vep_plugin_dir : repo_dir/ensembl-vep/Plugins|vep_plugin_dir : /home/pichuan/.vep/Plugins|g' ../config/deeprvat_annotation_config.yaml

while doing that, I think this path also looks wrong:

  condel: Condel,repo_dir/ensembl-vep/Plugin/config/Condel/config,s,2

So I changed that too:

sed -i -e 's|condel: Condel,repo_dir/ensembl-vep/Plugin/config/Condel/config,s,2|condel: Condel,/home/pichuan/miniforge3/pkgs/ensembl-vep-110.1-pl5321h2a3209d_0/share/ensembl-vep-110.1-0/config/Condel/config,s,2|g' ../config/deeprvat_annotation_config.yaml

Try the vep command again, it seems like I'm missing this file:

WARNING: Failed to instantiate plugin AlphaMissense: ERROR: Data file annotation_data/AlphaMissense/AlphaMissense_hg38.tsv.gz not found

Hmm, when I install AlphaMissense earlier, it did say that One or more plugins that you have installed will not work without installation or downloading data. So I read /home/pichuan/.vep/Plugins/AlphaMissense.pm -> it mentioned AlphaMissense_hg38.tsv.gz but didn't really mention where to download it... so I just searched, and found https://www.biostars.org/p/9580275/

so I did:

wget https://storage.googleapis.com/dm_alphamissense/AlphaMissense_hg38.tsv.gz
tabix -s 1 -b 2 -e 2 -f -S 1 AlphaMissense_hg38.tsv.gz
mkdir -p annotation_data/AlphaMissense # I ln -sf annotation_data here earlier.
mv AlphaMissense_hg38.tsv.gz* annotation_data/AlphaMissense/

OK, now I tried again:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ time snakemake -j $(nproc) -s ../../pipelines/annotations.snakefile --configfile ../config/deeprvat_annotation_config.yaml --use-conda

Still having some error at the vep step, so I tried directly running that step:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ vep --input_file output_dir/annotations/tmp/test_vcf_data_c21_b1_stripped.vcf.gz --output_file output_dir/annotations/test_vcf_data_c21_b1_vep_anno.tsv --species homo_sapiens --assembly GRCh38 --format vcf  --offline --cache --dir_cache repo_dir/ensembl-vep/t/testdata/cache --dir_plugins /home/pichuan/.vep/Plugins --fork 5 --fasta reference/GRCh38.primary_assembly.genome.fa --tab --total_length --no_escape --polyphen s --sift s --canonical --protein --biotype --force_overwrite --no_stats --per_gene --pick_order biotype,mane_select,mane_plus_clinical,canonical,appris,tsl,ccds,rank,length,ensembl,refseq --plugin CADD,annotation_data/cadd/whole_genome_SNVs.tsv.gz,annotation_data/cadd/gnomad.genomes.r3.0.indel.tsv.gz --plugin Condel,/home/pichuan/miniforge3/pkgs/ensembl-vep-110.1-pl5321h2a3209d_0/share/ensembl-vep-110.1-0/config/Condel/config,s,2 --plugin AlphaMissense,file=annotation_data/AlphaMissense/AlphaMissense_hg38.tsv.gz

Here is the error:

Smartmatch is experimental at /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
readline() on closed filehandle SIFT at /home/pichuan/.vep/Plugins/Condel.pm line 160.
readline() on closed filehandle POLYPHEN at /home/pichuan/.vep/Plugins/Condel.pm line 171.

-------------------- EXCEPTION --------------------
MSG: ERROR: No cache found for homo_sapiens, version 110

STACK Bio::EnsEMBL::VEP::CacheDir::dir /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/CacheDir.pm:322
STACK Bio::EnsEMBL::VEP::CacheDir::init /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/CacheDir.pm:219
STACK Bio::EnsEMBL::VEP::CacheDir::new /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/CacheDir.pm:111
STACK Bio::EnsEMBL::VEP::AnnotationSourceAdaptor::get_all_from_cache /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/AnnotationSourceAdaptor.pm:116
STACK Bio::EnsEMBL::VEP::AnnotationSourceAdaptor::get_all /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/AnnotationSourceAdaptor.pm:92
STACK Bio::EnsEMBL::VEP::BaseRunner::get_all_AnnotationSources /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/BaseRunner.pm:170
STACK Bio::EnsEMBL::VEP::Runner::init /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/Runner.pm:128
STACK Bio::EnsEMBL::VEP::Runner::run /home/pichuan/miniforge3/envs/deeprvat_annotations/share/ensembl-vep-110.1-0/modules/Bio/EnsEMBL/VEP/Runner.pm:200
STACK toplevel /home/pichuan/miniforge3/envs/deeprvat_annotations/bin/vep:46
Date (localtime)    = Fri Aug  9 23:46:14 2024
Ensembl API version = 110
---------------------------------------------------

@Marcel-Mueck : question for you: Is it clear to you what this latest error indicates that I'm missing?

MSG: ERROR: No cache found for homo_sapiens, version 110 seems suspicious. I do have a homo_sapiens directory under cache, but maybe the directory was wrong? Here is what I have:

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ ls repo_dir/ensembl-vep/t/testdata/cache/homo_sapiens
84_GRCh38

Thank you!

One follow up on this: After reading Ensembl/ensembl-vep#1716 , I tried adding --cache_version 84 to my vep command. That seems to have worked. However, I'm confused why my cache has version 84 while the default cache was a different version.

@Marcel-Mueck : I'd love to know your thoughts on whether using --cache_version 84 is the right way to go, or if I should somehow update the cache version (not sure how). Thanks!

@Marcel-Mueck
Copy link
Collaborator

Marcel-Mueck commented Aug 12, 2024

Dear @pichuan regarding the alpha missense plugin data, the link you got the data from looks correct, it looks like the same link as in the annotation docs: https://deeprvat.readthedocs.io/en/latest/annotations.html#requirements (under AlphaMissense).
Regarding the VEP cache, VEP strongly suggests using the same cache version as the VEP software installed (as mentioned here):
image
So if you use version 110 of vep (as was used for the paper version of deeprvat) you would have to download (and unpack ) this cache inside repo_dir/ensembl-vep/cache: https://ftp.ensembl.org/pub/release-110/variation/indexed_vep_cache/homo_sapiens_vep_110_GRCh38.tar.gz

@pichuan
Copy link
Author

pichuan commented Aug 14, 2024

Thanks. In my case, somehow my cache directory is repo_dir/ensembl-vep/t/testdata/cache, so I did:

(deeprvat_annotations) pichuan@pichuan-gpu:~/repo_dir/ensembl-vep/t/testdata/cache$ aria2c -c -x10 -s10 -d . https://ftp.ensembl.org/pub/release-110/variation/indexed_vep_cache/homo_sapiens_vep_110_GRCh38.tar.gz
(deeprvat_annotations) pichuan@pichuan-gpu:~/repo_dir/ensembl-vep/t/testdata/cache$ tar xvfz homo_sapiens_vep_110_GRCh38.tar.gz

Now I have:

(deeprvat_annotations) pichuan@pichuan-gpu:~/repo_dir/ensembl-vep/t/testdata/cache$ ls homo_sapiens
110_GRCh38  84_GRCh38

After that, I was able to run the following command without --cache_version 84!

(deeprvat_annotations) pichuan@pichuan-gpu:~/deeprvat/example/annotations$ vep --input_file output_dir/annotations/tmp/test_vcf_data_c21_b1_stripped.vcf.gz --output_file output_dir/annotations/test_vcf_data_c21_b1_vep_anno.tsv --species homo_sapiens --assembly GRCh38 --format vcf  --offline --cache --dir_cache repo_dir/ensembl-vep/t/testdata/cache --dir_plugins /home/pichuan/.vep/Plugins --fork 5 --fasta reference/GRCh38.primary_assembly.genome.fa --tab --total_length --no_escape --polyphen s --sift s --canonical --protein --biotype --force_overwrite --no_stats --per_gene --pick_order biotype,mane_select,mane_plus_clinical,canonical,appris,tsl,ccds,rank,length,ensembl,refseq --plugin CADD,annotation_data/cadd/whole_genome_SNVs.tsv.gz,annotation_data/cadd/gnomad.genomes.r3.0.indel.tsv.gz --plugin Condel,/home/pichuan/miniforge3/pkgs/ensembl-vep-110.1-pl5321h2a3209d_0/share/ensembl-vep-110.1-0/config/Condel/config,s,2 --plugin AlphaMissense,file=annotation_data/AlphaMissense/AlphaMissense_hg38.tsv.gz

Thanks. Now I'm having a different errors in the deeprvat_annotations step. I'll continue that error report in #121, and I'll close this one for now.

@pichuan pichuan closed this as completed Aug 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants