Failed to open reference... Protocol not supported #128

Open
Deleetdk opened this issue Aug 18, 2022 · 3 comments
Comments

@Deleetdk

I am working with my own genome in CRAM format and installed CNVpytor with pip as a local user. Installation and the resource-file download proceeded without issues, but the read-depth step (`-rd`) produced errors at run time:

user@computer:/disk/genomes/nebula/emil$ pip install cnvpytor
Defaulting to user installation because normal site-packages is not writeable
Collecting cnvpytor
  Downloading CNVpytor-1.2.1.tar.gz (1.1 MB)
     |████████████████████████████████| 1.1 MB 2.8 MB/s            
  Preparing metadata (setup.py) ... done
Collecting gnureadline
  Downloading gnureadline-8.1.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (628 kB)
     |████████████████████████████████| 628 kB 53.5 MB/s            
Requirement already satisfied: requests>=2.0 in /home/emil/.local/lib/python3.6/site-packages (from cnvpytor) (2.21.0)
Requirement already satisfied: pysam>=0.15 in /home/emil/.local/lib/python3.6/site-packages (from cnvpytor) (0.16.0.1)
Requirement already satisfied: numpy>=1.16 in /home/emil/.local/lib/python3.6/site-packages (from cnvpytor) (1.19.5)
Requirement already satisfied: scipy>=1.1 in /home/emil/.local/lib/python3.6/site-packages (from cnvpytor) (1.4.1)
Requirement already satisfied: matplotlib>=2.2 in /home/emil/.local/lib/python3.6/site-packages (from cnvpytor) (3.2.1)
Requirement already satisfied: h5py>=2.9 in /home/emil/.local/lib/python3.6/site-packages (from cnvpytor) (3.1.0)
Collecting xlsxwriter>=1.3
  Downloading XlsxWriter-3.0.3-py3-none-any.whl (149 kB)
     |████████████████████████████████| 149 kB 43.8 MB/s            
Requirement already satisfied: cached-property in /home/emil/.local/lib/python3.6/site-packages (from h5py>=2.9->cnvpytor) (1.5.2)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/emil/.local/lib/python3.6/site-packages (from matplotlib>=2.2->cnvpytor) (2.4.7)
Requirement already satisfied: cycler>=0.10 in /home/emil/.local/lib/python3.6/site-packages (from matplotlib>=2.2->cnvpytor) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /home/emil/.local/lib/python3.6/site-packages (from matplotlib>=2.2->cnvpytor) (1.2.0)
Requirement already satisfied: python-dateutil>=2.1 in /home/emil/.local/lib/python3.6/site-packages (from matplotlib>=2.2->cnvpytor) (2.8.1)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /home/emil/.local/lib/python3.6/site-packages (from requests>=2.0->cnvpytor) (3.0.4)
Requirement already satisfied: urllib3<1.25,>=1.21.1 in /home/emil/.local/lib/python3.6/site-packages (from requests>=2.0->cnvpytor) (1.24.3)
Requirement already satisfied: certifi>=2017.4.17 in /home/emil/.local/lib/python3.6/site-packages (from requests>=2.0->cnvpytor) (2019.9.11)
Requirement already satisfied: idna<2.9,>=2.5 in /home/emil/.local/lib/python3.6/site-packages (from requests>=2.0->cnvpytor) (2.8)
Requirement already satisfied: six in /home/emil/.local/lib/python3.6/site-packages (from cycler>=0.10->matplotlib>=2.2->cnvpytor) (1.15.0)
Building wheels for collected packages: cnvpytor
  Building wheel for cnvpytor (setup.py) ... done
  Created wheel for cnvpytor: filename=CNVpytor-1.2.1-py3-none-any.whl size=1117058 sha256=92e8ff5a0594e8d927b280e1c0b43c09c7572ba6dad62d3c326b1c6936b16147
  Stored in directory: /home/emil/.cache/pip/wheels/80/1b/90/1ecbda76f9a2d71cb22216c141cb4160b2e0c464a0a6871c0c
Successfully built cnvpytor
Installing collected packages: xlsxwriter, gnureadline, cnvpytor
Successfully installed cnvpytor-1.2.1 gnureadline-8.1.2 xlsxwriter-3.0.3
user@computer:/disk/genomes/nebula/emil$ cnvpytor -download
2022-08-18 07:11:35,662 - cnvpytor.genome - INFO - Updating reference genome resource files...
2022-08-18 07:11:35,662 - cnvpytor.genome - INFO - Detecting missing GC resource file for reference genome 'hg19'
2022-08-18 07:11:37,021 - cnvpytor.genome - INFO - Downloading GC resource file: gc_hg19.pytor
2022-08-18 07:11:37,885 - cnvpytor.genome - INFO - File downlaoded.
2022-08-18 07:11:37,885 - cnvpytor.genome - INFO - Detecting missing MASK resource file for reference genome 'hg19'
2022-08-18 07:11:39,059 - cnvpytor.genome - INFO - Downloading MASK resource file: mask_hg19.pytor
2022-08-18 07:11:39,704 - cnvpytor.genome - INFO - File downlaoded.
2022-08-18 07:11:39,704 - cnvpytor.genome - INFO - Detecting missing GC resource file for reference genome 'hg38'
2022-08-18 07:11:41,053 - cnvpytor.genome - INFO - Downloading GC resource file: gc_hg38.pytor
2022-08-18 07:11:41,984 - cnvpytor.genome - INFO - File downlaoded.
2022-08-18 07:11:41,985 - cnvpytor.genome - INFO - Detecting missing MASK resource file for reference genome 'hg38'
2022-08-18 07:11:43,466 - cnvpytor.genome - INFO - Downloading MASK resource file: mask_hg38.pytor
2022-08-18 07:11:44,135 - cnvpytor.genome - INFO - File downlaoded.
2022-08-18 07:11:44,135 - cnvpytor.genome - INFO - Detecting missing GC resource file for reference genome 'chm13'
2022-08-18 07:11:45,523 - cnvpytor.genome - INFO - Downloading GC resource file: gc_chm13.pytor
2022-08-18 07:11:46,478 - cnvpytor.genome - INFO - File downlaoded.
2022-08-18 07:11:46,478 - cnvpytor.genome - INFO - Done.
user@computer:/disk/genomes/nebula/emil$ cnvpytor -root file.pytor -rd 
genome_Emil_Kirkegaard_Full_20140303114842.txt  NG1IL0F60J.vcf.gz.tbi
NG1IL0F60J.cram                                 notebook.Rmd
NG1IL0F60J.cram.crai                            rsids
NG1IL0F60J.vcf.gz                               wgs_23andme_overlap.vcf.gz
user@computer:/disk/genomes/nebula/emil$ cnvpytor -root file.pytor -rd NG1IL0F60J.cram
2022-08-18 07:13:25,481 - cnvpytor.bam - INFO - File: NG1IL0F60J.cram successfully open
2022-08-18 07:13:25,481 - cnvpytor.bam - INFO - Detected reference genome: hg38
2022-08-18 07:13:25,483 - cnvpytor.pool - INFO - Parallel processing using 8 cores
2022-08-18 07:13:25,494 - cnvpytor.root - INFO - Reading data for chromosome chr1 with length 248956422
2022-08-18 07:13:25,494 - cnvpytor.root - INFO - Reading data for chromosome chr2 with length 242193529
2022-08-18 07:13:25,494 - cnvpytor.root - INFO - Reading data for chromosome chr3 with length 198295559
2022-08-18 07:13:25,494 - cnvpytor.root - INFO - Reading data for chromosome chr4 with length 190214555
2022-08-18 07:13:25,494 - cnvpytor.root - INFO - Reading data for chromosome chr5 with length 181538259
2022-08-18 07:13:25,494 - cnvpytor.root - INFO - Reading data for chromosome chr6 with length 170805979
2022-08-18 07:13:25,495 - cnvpytor.root - INFO - Reading data for chromosome chr7 with length 159345973
2022-08-18 07:13:25,495 - cnvpytor.root - INFO - Reading data for chromosome chr8 with length 145138636
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd": Protocol not supported
[E::cram_get_ref] Failed to populate reference for id 0
[E::cram_decode_slice] Unable to fetch reference #0 10001..39269

[E::cram_next_slice] Failure to decode slice
2022-08-18 07:13:25,527 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/76635a41ea913a405ded820447d067b0": Protocol not supported
[E::cram_get_ref] Failed to populate reference for id 2
[E::cram_decode_slice] Unable to fetch reference #2 10179..49575

[E::cram_next_slice] Failure to decode slice
2022-08-18 07:13:25,529 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/f98db672eb0993dcfdabafe2a882905c": Protocol not supported
[E::cram_get_ref] Failed to populate reference for id 1
[E::cram_decode_slice] Unable to fetch reference #1 10417..44091

[E::cram_next_slice] Failure to decode slice
2022-08-18 07:13:25,540 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/c67955b5f7815a9a1edfaa15893d3616": Protocol not supported
[E::cram_get_ref] Failed to populate reference for id 7
[E::cram_decode_slice] Unable to fetch reference #7 60001..136325

[E::cram_next_slice] Failure to decode slice
2022-08-18 07:13:25,543 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/cc044cc2256a1141212660fb07b6171e": Protocol not supported
[E::cram_get_ref] Failed to populate reference for id 6
[E::cram_decode_slice] Unable to fetch reference #6 10248..49629

[E::cram_next_slice] Failure to decode slice
2022-08-18 07:13:25,545 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/3210fecf1eb92d5489da4346b3fddc6e": Protocol not supported
[E::cram_get_ref] Failed to populate reference for id 3
[E::cram_decode_slice] Unable to fetch reference #3 10002..24650

[E::cram_next_slice] Failure to decode slice
2022-08-18 07:13:25,546 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/a811b3dc9fe66af729dc0dddf7fa4f13": Protocol not supported
[E::cram_get_ref] Failed to populate reference for id 4
[E::cram_decode_slice] Unable to fetch reference #4 10005..45128

[E::cram_next_slice] Failure to decode slice
2022-08-18 07:13:25,547 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/5691468a67c7e7a7b5f2a3a683792c29": Protocol not supported
[E::cram_get_ref] Failed to populate reference for id 5
[E::cram_decode_slice] Unable to fetch reference #5 60029..138449

[E::cram_next_slice] Failure to decode slice
2022-08-18 07:13:25,548 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
^CTraceback (most recent call last):
  File "/home/emil/.local/bin/cnvpytor", line 8, in <module>
    sys.exit(main())
  File "/home/emil/.local/lib/python3.6/site-packages/cnvpytor/__main__.py", line 270, in main
    app.rd(args.rd, chroms=args.chrom, reference_filename=args.reference_filename)
  File "/home/emil/.local/lib/python3.6/site-packages/cnvpytor/root.py", line 308, in rd
    self._read_bam(bf, chroms, reference_filename=reference_filename, overwrite=overwrite)
  File "/home/emil/.local/lib/python3.6/site-packages/cnvpytor/root.py", line 90, in _read_bam
    res = parmap(read_chromosome, chr_len, cores=self.max_cores)
  File "/home/emil/.local/lib/python3.6/site-packages/cnvpytor/pool.py", line 50, in parmap
    sent = [q_in.put((i, x)) for i, x in enumerate(x_arg)]
  File "/home/emil/.local/lib/python3.6/site-packages/cnvpytor/pool.py", line 50, in <listcomp>
    sent = [q_in.put((i, x)) for i, x in enumerate(x_arg)]
  File "/usr/lib/python3.6/multiprocessing/queues.py", line 82, in put
    if not self._sem.acquire(block, timeout):
KeyboardInterrupt

I could not find this reported here or via Google. The URLs themselves work fine: for example, https://www.ebi.ac.uk/ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd opens OK in the browser, though of course the content is not readable by humans (binary).

Any ideas? I will try the GitHub version.

My CRAM file is here: https://filedn.eu/lCyoUMpONNB7afAi4dJTUyX/data/genomics/personal%20genomes/emil/
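
A note on the `Protocol not supported` warnings in the log above: htslib typically prints this when the build in use cannot open `https://` URLs (for example, a build without libcurl support), which would fit with the same URL opening fine in a browser. That is an assumption about this particular installation, not something confirmed in this thread. One possible workaround, sketched below, is to pre-download the sequences named in the warnings into a local directory that follows htslib's `%2s/%2s/%s` cache layout and point the `REF_CACHE` environment variable at it. The helper names (`md5s_from_log`, `cache_path`, `prefetch`) and the cache directory are hypothetical.

```python
import os
import re
import urllib.request

# URL pattern taken from the htslib warnings in the log above.
MD5_URL = re.compile(r"https://www\.ebi\.ac\.uk/ena/cram/md5/([0-9a-f]{32})")


def md5s_from_log(log_text):
    """Return the unique reference MD5s named in the log, in first-seen order."""
    seen = []
    for match in MD5_URL.finditer(log_text):
        if match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def cache_path(cache_root, md5):
    """Expand the '%2s/%2s/%s' layout: first two chars, next two, then the rest."""
    return os.path.join(cache_root, md5[:2], md5[2:4], md5[4:])


def prefetch(md5, cache_root):
    """Download one reference sequence into the cache (needs network access)."""
    dest = cache_path(cache_root, md5)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    urllib.request.urlretrieve("https://www.ebi.ac.uk/ena/cram/md5/" + md5, dest)
    return dest
```

With the files in place, setting `REF_CACHE` to something like `/path/to/cache/%2s/%2s/%s` before running `cnvpytor -rd` should let htslib find the sequences locally. Supplying the reference FASTA with `-T` avoids the lookup altogether.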

@arpanda
Member

arpanda commented Aug 19, 2022

Hi,
Since you are using a CRAM file, could you please try again with the reference genome supplied and let us know whether the problem persists?
cnvpytor -root file.pytor -rd NG1IL0F60J.cram -T <path for reference fasta file>

Thanks
Arijit

@Deleetdk
Author

I've let it run for 6+ hours so far, and it seems to be stuck:

cnvpytor -root file.pytor -rd NG1IL0F60J.cram -T /data/genomics/reference_files/hg38.fa
2022-08-29 01:34:53,231 - cnvpytor.bam - INFO - File: NG1IL0F60J.cram successfully open
2022-08-29 01:34:53,232 - cnvpytor.bam - INFO - Detected reference genome: hg38
2022-08-29 01:34:53,236 - cnvpytor.pool - INFO - Parallel processing using 8 cores
2022-08-29 01:34:53,248 - cnvpytor.root - INFO - Reading data for chromosome chr2 with length 242193529
2022-08-29 01:34:53,248 - cnvpytor.root - INFO - Reading data for chromosome chr1 with length 248956422
2022-08-29 01:34:53,248 - cnvpytor.root - INFO - Reading data for chromosome chr3 with length 198295559
2022-08-29 01:34:53,248 - cnvpytor.root - INFO - Reading data for chromosome chr4 with length 190214555
2022-08-29 01:34:53,248 - cnvpytor.root - INFO - Reading data for chromosome chr5 with length 181538259
2022-08-29 01:34:53,248 - cnvpytor.root - INFO - Reading data for chromosome chr6 with length 170805979
2022-08-29 01:34:53,249 - cnvpytor.root - INFO - Reading data for chromosome chr7 with length 159345973
2022-08-29 01:34:53,249 - cnvpytor.root - INFO - Reading data for chromosome chr8 with length 145138636
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 2 pos 16776232..16814735
[E::cram_decode_slice] CRAM: 94982bbafd95a1a748fa20098fa90785
[E::cram_decode_slice] Ref : f16024284d657779afbaff7aeafdee31
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:35:13,239 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 1 pos 20944307..20980664
[E::cram_decode_slice] CRAM: d6684283cb67e862e1f3c1a612609f28
[E::cram_decode_slice] Ref : 14de7a8fa2a63952b296c2f2457ac77c
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:35:18,101 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 4 pos 47308302..49599821
[E::cram_decode_slice] CRAM: c4a1a9cc77b233653ed493122ac7d8f6
[E::cram_decode_slice] Ref : a0079071eceaa2b978aec2dbe12e9744
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:35:47,822 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 5 pos 61323012..61378416
[E::cram_decode_slice] CRAM: f6e90e6d1513b16c3f6b27b12470e858
[E::cram_decode_slice] Ref : d31242f5e8ad326d470866fa93641452
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:36:06,459 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
2022-08-29 01:37:41,411 - cnvpytor.root - INFO - Reading data for chromosome chr9 with length 138394717
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 6 pos 154558071..154597614
[E::cram_decode_slice] CRAM: 828f186956ba7548c8cb2a5cf3f58d95
[E::cram_decode_slice] Ref : 4e7e0d006cfebccf4945eba9a783830e
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:37:51,781 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
2022-08-29 01:38:35,198 - cnvpytor.root - INFO - Reading data for chromosome chr10 with length 133797422
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 8 pos 89873013..89913176
[E::cram_decode_slice] CRAM: 64c0251ef044d4a1f07ddb3c6091ff65
[E::cram_decode_slice] Ref : de90d4be921385c7764f1156bca9e3fb
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:39:05,648 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 9 pos 39224987..39261161
[E::cram_decode_slice] CRAM: 1db319bc13d26df6c359d41e6304a51a
[E::cram_decode_slice] Ref : be226a28220ba1ae4633caf36a971ffb
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:39:19,863 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 0 pos 248745489..248787439
[E::cram_decode_slice] CRAM: a2844d84c77fd8b4f48ae633c2d0f962
[E::cram_decode_slice] Ref : 20c6c56d70d5fd59b58aa111d0dcccd1
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:39:40,503 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'

You were right, the protocol errors disappeared.
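
To see at a glance which references are failing, the htslib error lines above can be summarized with a few lines of Python. This is only a log-parsing sketch: `failed_slices` is a hypothetical helper, and reading `ref N` as a 0-based index into the CRAM header's `@SQ` order is an assumption.

```python
import re

# Matches lines such as:
# [E::cram_decode_slice] MD5 checksum reference mismatch for ref 2 pos 16776232..16814735
MISMATCH = re.compile(
    r"MD5 checksum reference mismatch for ref (\d+) pos (\d+)\.\.(\d+)"
)


def failed_slices(log_text, names=None):
    """Return (reference, start, end) for every mismatch line in the log.

    If `names` (sequence names in header order) is given, the numeric id is
    replaced by the corresponding name; otherwise the id is kept as a string.
    """
    out = []
    for match in MISMATCH.finditer(log_text):
        ref_id, start, end = (int(g) for g in match.groups())
        label = names[ref_id] if names and ref_id < len(names) else str(ref_id)
        out.append((label, start, end))
    return out
```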

@suvakov
Member

suvakov commented Sep 19, 2022

It seems that the reference used to create the CRAM file is not the same as the hg38.fa reference you provided on the command line. Please check the assembly version in the CRAM header.
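
One way to run that check is to compare the `M5` tags on the `@SQ` lines of the CRAM header with MD5 checksums computed from the FASTA. The sketch below uses only the Python standard library and assumes the header text has already been obtained, for example with `samtools view -H NG1IL0F60J.cram`. The function names are hypothetical, and the checksum follows the SAM convention of hashing the uppercased sequence with whitespace removed.

```python
import hashlib


def parse_sq_lines(header_text):
    """Map sequence name -> dict of @SQ tags (SN, LN, M5, UR, AS, ...)."""
    result = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        fields = line.split("\t")[1:]
        tags = dict(f.split(":", 1) for f in fields if ":" in f)
        if "SN" in tags:
            result[tags["SN"]] = tags
    return result


def fasta_md5s(lines):
    """Yield (name, md5) per FASTA record from an iterable of text lines."""
    name, digest = None, None
    for line in lines:
        if line.startswith(">"):
            if name is not None:
                yield name, digest.hexdigest()
            parts = line[1:].split()
            name, digest = (parts[0] if parts else ""), hashlib.md5()
        elif name is not None:
            # Drop all whitespace and uppercase before hashing.
            digest.update("".join(line.split()).upper().encode("ascii"))
    if name is not None:
        yield name, digest.hexdigest()


def mismatches(header_text, fasta_lines):
    """Return (name, header_md5, fasta_md5) where the two checksums differ."""
    sq = parse_sq_lines(header_text)
    bad = []
    for name, md5 in fasta_md5s(fasta_lines):
        expected = sq.get(name, {}).get("M5")
        if expected is not None and expected != md5:
            bad.append((name, expected, md5))
    return bad
```

Passing an open file handle for `hg38.fa` as `fasta_lines` streams the reference line by line, so memory use stays low, though hashing a full human genome still takes a while. An empty result would suggest the sequences match; any entries point to chromosomes where the supplied FASTA differs from the one used to write the CRAM. The `AS` and `UR` tags returned by `parse_sq_lines`, when present, may also name the assembly directly.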
