questions about ratio.txt #145

Open
ZYongQi opened this issue Jul 16, 2024 · 3 comments

ZYongQi commented Jul 16, 2024

Hi, this is ZY. We summarized the number and distribution of CNVs and CNV regions, and I took your advice to visualize the ratio.txt file, but I still have some doubts.

I used the R script FREEC_ratio2Absolute.R. One of its outputs looks like this:

| Chromosome | Start | End | Num_Probes | Segment_Mean |
| --- | --- | --- | --- | --- |
| NC_048218.1 | 1 | 1264440 | 1285 | -0.0513244 |
| NC_048218.1 | 1264441 | 1302816 | 39 | -3.715107 |
| NC_048218.1 | 1302817 | 3479424 | 2212 | -0.05671026 |
| NC_048218.1 | 3479425 | 3504024 | 25 | -4.576851 |
| NC_048218.1 | 3504025 | 3536496 | 33 | 0.01631089 |

What criteria should we use to filter the results: the number of probes or a specific Segment_Mean threshold?
By the way, why are some Segment_Mean values equal to -Inf, and how should we deal with them?
Looking forward to your reply!

valeu (Contributor) commented Jul 20, 2024

Hi,
-Inf should be log(0). These must be segments where FREEC predicts a copy number of zero.
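To make the log(0) point concrete, here is a minimal Python sketch. It assumes Segment_Mean is a log2 of a copy-number ratio; the base actually used by FREEC_ratio2Absolute.R is not stated in this thread, so log2 is an assumption. The helper name `log_ratio` and the segment values are hypothetical.

```python
import math

def log_ratio(ratio):
    """Return log2(ratio); a ratio of zero maps to -inf, i.e. log(0)."""
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return -math.inf if ratio == 0 else math.log2(ratio)

# Hypothetical segments: (start, end, ratio). Values are illustrative only.
segments = [
    (1, 1000, 1.0),     # unchanged copy number: log2(1) = 0
    (1001, 2000, 0.5),  # half the expected copies: log2(0.5) = -1
    (2001, 3000, 0.0),  # zero copies: log2(0) -> -inf
]

means = [log_ratio(r) for _, _, r in segments]

# Keep -inf out of averages and plots, but report those segments separately.
finite = [m for m in means if math.isfinite(m)]
zero_copy = [seg for seg, m in zip(segments, means) if m == -math.inf]

print(means)      # [0.0, -1.0, -inf]
print(zero_copy)  # [(2001, 3000, 0.0)]
```

Treating -Inf as a flag for "predicted copy number of zero" rather than as a numeric value avoids breaking any mean or plot computed over the segments.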

By visualization, I meant plotting the results as a .png so that you can visually evaluate the amount of noise after normalization and the quality of FREEC's CNA calls.
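Alongside the plot, a crude numeric check of the noise is possible. The sketch below assumes a tab-separated `_ratio.txt` layout with the columns Chromosome, Start, Ratio, MedianRatio and CopyNumber, and assumes that `-1` marks windows without a value; check both assumptions against your own file. The sample rows and the function name `noise_per_chromosome` are made up for illustration.

```python
import csv
import io
import statistics

# Made-up rows in the assumed _ratio.txt layout (tab-separated).
SAMPLE = (
    "Chromosome\tStart\tRatio\tMedianRatio\tCopyNumber\n"
    "chrA\t1\t1.02\t1.00\t2\n"
    "chrA\t1001\t0.95\t1.00\t2\n"
    "chrA\t2001\t-1\t-1\t2\n"
    "chrA\t3001\t0.40\t0.50\t1\n"
    "chrA\t4001\t0.58\t0.50\t1\n"
)

def noise_per_chromosome(handle):
    """Median absolute difference between Ratio and MedianRatio, per chromosome."""
    deviations = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        ratio = float(row["Ratio"])
        median_ratio = float(row["MedianRatio"])
        if ratio < 0 or median_ratio < 0:
            continue  # assumed "no data" marker, skipped
        deviations.setdefault(row["Chromosome"], []).append(abs(ratio - median_ratio))
    return {chrom: statistics.median(d) for chrom, d in deviations.items()}

noise = noise_per_chromosome(io.StringIO(SAMPLE))
print(noise)
```

A larger value means the per-window ratios scatter more widely around their segment medians. This is only a rough summary and does not replace looking at the plot.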

ZYongQi (Author) commented Jul 22, 2024

Hi,
I have a rough idea of what you advised: I should visualize ratio.txt to remove noise and outliers. I'll take care of that later.

Our research is drawing to a close. We have reviewed and collated all the steps, and I am currently working on the first draft of my article. While reviewing FREEC, I came up with a few questions.

  1. In the _CNVs file, 0 means that zero copies of DNA are predicted in the region, and in my output 0 always corresponds to a loss. How should I interpret this? Does it mean the fragment is completely lost from the region? Or do too many 0 values indicate noise that I should filter out, as you advised before?

Part of my _CNVs file:
NC_048218.1 68935104 69748872 1 loss
NC_048218.1 72301368 72327936 0 loss
NC_048218.1 89715216 89736864 0 loss
NC_048218.1 94680480 94689336 7 gain
NC_048218.1 98317344 98352768 1 loss
NC_048218.1 98685360 98714880 0 loss
NC_048218.1 100452624 100478208 0 loss
NC_048218.1 109823256 109882296 0 loss
NC_048218.1 115849272 116669928 3 gain
NC_048218.1 130496112 130513824 0 loss

  2. The CNVs predicted by FREEC are of two types: loss and gain. I wonder whether FREEC also writes normal regions, where the copy number does not change compared to the reference genome, to the output. If 0 means no change, why does "loss" appear?

Looking forward to your reply!

ZYongQi (Author) commented Jul 24, 2024

Hi,
I'm sorry to trouble you. I realized that I had overlooked a basic fact: there are 2 copies of each chromosome.

I had also misunderstood the meaning of CN. I took it to be the number of copy number regions, that is, how many losses or gains there are. In reality, the two are not the same.

So the answer is that CN=0 and CN=1 both represent a loss, and CN=2 represents normal fragments, which FREEC writes to ratio.txt.
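This reading can be sketched in Python. The rows are taken from the _CNVs excerpt earlier in the thread, and the ploidy of 2 follows from the comment above. The function name `classify` is hypothetical and not part of FREEC.

```python
PLOIDY = 2  # assumed: two copies of each chromosome, as noted above

def classify(copy_number, ploidy=PLOIDY):
    """Label a predicted copy number relative to the expected ploidy."""
    if copy_number < ploidy:
        return "loss"
    if copy_number > ploidy:
        return "gain"
    return "normal"

# (chromosome, start, end, copy number, label) from the _CNVs excerpt
rows = [
    ("NC_048218.1", 68935104, 69748872, 1, "loss"),
    ("NC_048218.1", 72301368, 72327936, 0, "loss"),
    ("NC_048218.1", 94680480, 94689336, 7, "gain"),
    ("NC_048218.1", 115849272, 116669928, 3, "gain"),
]

for chrom, start, end, cn, label in rows:
    # CN=0 and CN=1 are both below the ploidy, so both are losses
    assert classify(cn) == label

print(classify(2))  # normal
```

Under this reading, the copy number column is the predicted number of copies of the region, and the loss or gain label is just its comparison with the ploidy.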

Thanks to this review, I now have to make some adjustments to my research. Best wishes!
