Somatic cnv issue #134

Open
awast opened this issue Aug 9, 2023 · 0 comments

Hi,
I want to identify somatic CNVs from tumor-normal paired WES data. The config file I used is:

#First way
[general]
BedGraphOutput=TRUE
chrLenFile = path/data/hg38.fa.fai
window = 0
ploidy = 2
outputDir =path to outputfile/freec-results
bedtools = /usr/bin/bedtools
samtools = /usr/bin/samtools
forceGCcontentNormalization = 0
contaminationAdjustment=TRUE

#sex=XY
breakPointType=4
chrFiles = path/genome-file-chr
maxThreads=10
gemMappabilityFile = path/FREEC-11.6/out100m2_hg38.gem

noisyData=TRUE
printNA=FALSE
sambamba = path/bin/./sambamba
readCountThreshold=50

[sample]
mateFile =path/sample_1T_dedup_reads.bam
miniPileup = path/pileup-files/sample_1T.pileup.gz
inputFormat = BAM
mateOrientation = FR

[control]

mateFile = path/sample_1N_dedup_reads.bam
miniPileup =path/sample_1N.pileup.gz
inputFormat = BAM
mateOrientation = FR

[BAF]
makePileup = path/dbSNP151.hg38-commonSNP_minFreq5Perc_with_CHR.bed
fastaFile = path/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
SNPfile =path/dbSNP151.hg38-commonSNP_minFreq5Perc_with_CHR.vcf.gz
minimalCoveragePerPosition = 5
shiftInQuality = 33

[target]

#captureRegions = /EXOME-SEQ/TruSeq_exome_targeted_regions.hg19.bed
captureRegions = path/final-overlapping-removed-sorted.bed

After running with this config, I got a CNV output file like this:
1 18921041 19198941 3 gain AAB 81.846
1 71007971 75160955 3 gain AAB 6.00045
1 78641024 84202811 3 gain AAB 10.0454
1 93278631 99740911 3 gain AAB 32.4437
1 100292566 108136556 3 gain AAB 56.9157
1 185296350 187641778 3 gain AAB 3.25507
1 190097632 198859138 3 gain - -1
1 220025096 220134605 1 loss A -1
1 227016028 227081001 3 gain AAB 68.6832
2 38783 45827 3 gain - -1
2 31963042 32433806 1 loss A 100
2 49017438 51029206 3 gain AAB 28.0357
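
For reference, a minimal Python sketch of how rows like these could be loaded for further filtering. The field names (chrom, start, end, copy_number, alteration, genotype, uncertainty) are my own labels for the 7 columns as they appear above, not names taken from the tool, and reading `-` and `-1` as "no value" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CnvCall:
    # Field names are illustrative labels for the 7 columns shown in the post.
    chrom: str
    start: int
    end: int
    copy_number: int
    alteration: str               # "gain", "loss" or "neutral" in the rows above
    genotype: Optional[str]       # None when the file shows "-"
    uncertainty: Optional[float]  # None when the file shows -1 (assumed "no value")


def parse_cnv_line(line: str) -> CnvCall:
    """Parse one whitespace-separated row with at least 7 columns."""
    fields = line.split()
    if len(fields) < 7:
        raise ValueError(f"expected at least 7 columns, got {len(fields)}")
    chrom, start, end, copy_number, alteration, genotype, uncertainty = fields[:7]
    value = float(uncertainty)
    return CnvCall(
        chrom=chrom,
        start=int(start),
        end=int(end),
        copy_number=int(copy_number),
        alteration=alteration,
        genotype=None if genotype == "-" else genotype,
        uncertainty=None if value < 0 else value,
    )


if __name__ == "__main__":
    call = parse_cnv_line("1 18921041 19198941 3 gain AAB 81.846")
    print(call.chrom, call.end - call.start, call.alteration)
```

Any columns beyond the seventh are ignored by this sketch, so it would also accept a file that carries extra fields.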

#Second way
#Using a config file that includes only the pileup files:
[general]
BedGraphOutput=TRUE
chrLenFile = /path/hg38.fa.fai
window = 0
ploidy = 2
outputDir = /path/pileup-basedresults
bedtools = /usr/bin/bedtools
samtools = /usr/bin/samtools
forceGCcontentNormalization = 1
contaminationAdjustment=TRUE

#sex=XY
breakPointType=4
chrFiles =path/genome-file-chr
maxThreads=10
gemMappabilityFile = path/out100m2_hg38.gem

noisyData=TRUE
printNA=FALSE
sambamba = path/./sambamba
readCountThreshold=50

[sample]

mateFile = /path/CaGB_10_T.pileup.gz
inputFormat = pileup
mateOrientation = FR

[control]

mateFile = path/CaGB_10_N.pileup.gz
inputFormat = pileup
mateOrientation = FR

[BAF]
SNPfile =path/dbSNP151.hg38-commonSNP_minFreq5Perc_with_CHR.vcf.gz
minimalCoveragePerPosition = 5
shiftInQuality = 33

[target]

#captureRegions = /EXOME-SEQ/TruSeq_exome_targeted_regions.hg19.bed
captureRegions = path/final-overlapping-removed-sorted.bed

Output file (again only 7 columns):
1 27621445 29279653 1 loss A -1
1 77631847 77926905 1 loss A 47.3996
1 91642882 93079408 1 loss A 100
1 96777544 99709210 3 gain AAB 27.2263
1 173500937 174241659 1 loss A 11.0234
1 219915358 220147732 1 loss A 17.95
2 31531343 32388914 1 loss A -1
2 36378249 36555753 3 gain AAB 53.6888
2 38689775 38882527 1 loss A -1
2 46816758 48733146 1 loss A -1
2 54143223 54850294 3 gain AAB 24.374
2 60756240 62502550 1 loss A -1
2 62502552 69962188 2 neutral AA -1
2 69996591 70297189 1 loss A -1

Only 7 columns are displayed; I am not getting the status column that says whether a CNV is germline or somatic. I want to obtain the somatic alterations and exclude the germline ones. Additionally, I overlapped these files with germline variants from the DGV database in order to remove the germline CNVs, but almost every region has a good match, so I am not sure whether these CNVs are somatic or germline. How can I be sure that the CNVs in this file are somatic? Please suggest how I can obtain that column.
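
One possible reason almost every region matches is that a simple "any overlap" test will hit a multi-megabase segment almost anywhere in the genome. Below is a rough Python sketch of a stricter reciprocal-overlap check. The 0.5 threshold, the function names, and the `(chrom, start, end)` tuple layout are illustrative assumptions, and passing this filter does not by itself prove that a CNV is somatic.

```python
from typing import Iterable, Tuple

Region = Tuple[str, int, int]  # (chrom, start, end); layout is an assumption


def reciprocal_overlap(a: Region, b: Region) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    len_a, len_b = a[2] - a[1], b[2] - b[1]
    if overlap <= 0 or len_a <= 0 or len_b <= 0:
        return 0.0
    return min(overlap / len_a, overlap / len_b)


def matches_germline(cnv: Region, germline: Iterable[Region],
                     min_fraction: float = 0.5) -> bool:
    """True if any germline region reciprocally overlaps the CNV by min_fraction."""
    return any(reciprocal_overlap(cnv, g) >= min_fraction for g in germline)


if __name__ == "__main__":
    # Made-up coordinates, only to show the difference from an any-overlap test.
    known = [("1", 100, 200)]
    print(matches_germline(("1", 0, 1000), known))   # small entry inside a big call
    print(matches_germline(("1", 120, 210), known))  # similar size and position
```

With this kind of check, a small database entry sitting inside a large call no longer counts as a match, while a call of similar size and position still does.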
