freec can't extract reads from bam file #137

Open
ZYongQi opened this issue Dec 13, 2023 · 1 comment

ZYongQi commented Dec 13, 2023

Hi, this is ZY.
I want to call CNVs from BAM files. My expected output is a "_CNVs" file, but something is going wrong.

sambamba 1.0.0
by Artem Tarasov and Pjotr Prins (C) 2012-2022
LDC 1.28.1 / DMD v2.098.1 / LLVM12.0.0 / bootstrap LDC - the LLVM D compiler (1.28.1)

..finished reading /big/zyq/out/test_freec/wugang_chr.bam
486663374 lines read..
0 reads used to compute copy number profile

Error: FREEC was not able to extract reads from /big/zyq/out/test_freec/wugang_chr.bam

Check your parameters: inputFormat and mateOrientation
Use "matesOrientation=0" if you have single end reads

First, I generated the "GC_profile.cnp" file with "gccount -conf config.txt".
This is my config.txt:

[general]

chrFiles = /home/zyq/ref/
chrLenFile = /big/zyq/out/test_freec/panda_chr.len
ploidy = 2
breakPointThreshold = 0.8
maxThreads = 16
window = 5000
outputDir = /big/zyq/out/test_freec
sambamba = /home/zyq/miniconda3/envs/freec/bin/sambamba
SambambaThreads = 16
#intercept = 1
#coefficientOfVariation = 0.062
#degree = 3
#GCcontentProfile = /big/zyq/out/test_freec/GC_profile.cnp

[sample]

mateFile = /big/zyq/out/test_freec/wugang_chr.bam
inputFormat = BAM
mateOrientation = 0

BTW, my BAM file is sorted and contains "chr" prefixes for chromosomes:
@SQ SN:chrNC_048218.1 LN:212770937
@SQ SN:chrNC_048219.1 LN:199809881
@SQ SN:chrNC_048220.1 LN:147627920
@SQ SN:chrNC_048221.1 LN:144794249

My chrLenFile:
1 chr1 212770937
2 chr2 199809881
3 chr3 147627920

My chrFiles:
chr1.fa chr2.fa chr3.fa....... (I used samtools to split the reference into per-chromosome files.)

NC_048218.1
tccagctcaactcaggggaccggctgaaaaaggggtcccctactcctccgccatcttaac
ctctcgaCCTccattaaacttttaattttaattctagcatagtcaacatatagtgttata

My GC_profile.cnp:
1 90000 0.4214 1
1 95000 0.5054 1
1 100000 0.5264 1
1 105000 0.4242 1
1 110000 0.527 1

Thank you for your suggestions.


ZYongQi commented Dec 14, 2023

Hi, this is ZY. I have since solved the problem. When running freec, the name of each chromosome must be consistent across chrFiles, chrLenFile, and the .BAM file. I didn't use a .VCF file, and the chromosomes in my .BAM file start with "NC_.....", so the chromosome names in chrFiles and chrLenFile must use the same naming for freec to identify and extract reads from the .BAM file.
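For anyone hitting the same error, here is a minimal sketch of a name-consistency check. It assumes the BAM header has been dumped to text (for example with `samtools view -H`) and that chrLenFile uses the three-column layout shown above (index, name, length). The function names are hypothetical, and the comparison is an exact string match; it does not model any prefix handling freec may do internally.

```python
def sam_header_names(header_text):
    """Collect sequence names (SN: tags) from the @SQ lines of a SAM header."""
    names = []
    for line in header_text.splitlines():
        fields = line.split()
        if not fields or fields[0] != "@SQ":
            continue
        for field in fields[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
                break
    return names


def chr_len_names(chr_len_text):
    """Collect chromosome names from a chrLenFile-style listing.

    Assumption: three whitespace-separated columns (index, name, length),
    as in the chrLenFile excerpt in this issue.
    """
    names = []
    for line in chr_len_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            names.append(fields[1])
    return names


def compare_names(bam_names, len_names):
    """Return (names only in the BAM header, names only in chrLenFile)."""
    bam_set, len_set = set(bam_names), set(len_names)
    return sorted(bam_set - len_set), sorted(len_set - bam_set)


if __name__ == "__main__":
    # Excerpts taken from the original post.
    header = (
        "@SQ\tSN:chrNC_048218.1\tLN:212770937\n"
        "@SQ\tSN:chrNC_048219.1\tLN:199809881\n"
    )
    chr_len = "1\tchr1\t212770937\n2\tchr2\t199809881\n"

    only_bam, only_len = compare_names(
        sam_header_names(header), chr_len_names(chr_len)
    )
    print("only in BAM header:", only_bam)
    print("only in chrLenFile:", only_len)
```

If either list is non-empty, the names do not line up, which matches the situation in my first post.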
I still have two questions to discuss:

  1. In your earlier replies, you said that if a .VCF file is used, the .BAM file with reads should contain "chr" prefixes for chromosomes. My goal is to call CNVs from .BAM files. Is it necessary to use a .VCF file when calling CNVs? If so, what is the role of the .VCF file?

  2. As I mentioned before, I have many .BAM files to call CNVs from. When I run freec with the parameter "coefficientOfVariation = 0.062", the window size differs between .BAM files, generally ranging from 800 to 1800. In this case, should I set a fixed window size, such as 1000 or 2000, or let freec choose the window automatically even though it will differ between .BAM files? Will this affect my downstream analysis?

Best wishes, and I look forward to your suggestions.
