Different result from champ.filter() function with the same dataset #46

Open
thscandolara opened this issue Jan 31, 2024 · 0 comments

Greetings everyone,

Has the champ.filter() function changed? Last year I ran it on a TCGA dataset and got these results:

#champ.filter function: In this step, these probes can be filtered out: NoCG, SNPs start, MultiHit start, XY start..
met <- champ.filter(beta = beta_matrix,
                    pd = clinical,
                    filterXY = T,
                    filterNoCG = T,
                    filterSNPs = T,
                    filterMultiHit = T,
                    population = NULL,
                    filterDetP = TRUE)

#[ Section 2: Filtering Start >>
#Filtering NoCG Start
#Only Keep CpGs, removing 1375 probes from the analysis.
#
#Filtering SNPs Start
#Using general 450K SNP list for filtering.
#Filtering probes with SNPs as identified in Zhou's Nucleic Acids Research Paper 2016.
#    Removing 881 probes from the analysis.
#
#  Filtering MultiHit Start
#    Filtering probes that align to multiple locations as identified in Nordlund et al
#    Removing 10 probes from the analysis.
#
#  Filtering XY Start
#    Filtering probes located on X,Y chromosome, removing 6831 probes from the analysis.
#
#  Updating PD file
#    filterDetP parameter is FALSE, so no Sample Would be removed.
#
#  Fixing Outliers Start
#    Replacing all value smaller/equal to 0 with smallest positive value.
#    Replacing all value greater/equal to 1 with largest value below 1..
#[ Section 2: Filtering Done ]
#
# All filterings are Done, now you have 345347 probes and 130 samples.

Due to a problem with our saved files, we had to re-run all analyses. But now I have a different output:

#champ.filter function: In this step, these probes can be filtered out: NoCG, SNPs start, MultiHit start, XY start..
met <- champ.filter(beta = beta_matrix,
                    pd = clinical,
                    filterXY = T,
                    filterNoCG = T,
                    filterSNPs = T,
                    filterMultiHit = T,
                    population = NULL,
                    filterDetP = TRUE)
##[ Section 2: Filtering Start >>
#Filtering NoCG Start
#Only Keep CpGs, removing 1373 probes from the analysis.
#
#[ Section 2: Filtering Start >>
#Filtering NoCG Start
#Only Keep CpGs, removing 1414  probes from the analysis.
#
#Filtering SNPs Start
#Using general 450K SNP list for filtering.
#Filtering probes with SNPs as identified in Zhou's Nucleic Acids Research Paper 2016.
#Removing 904  probes from the analysis.
#
# Filtering MultiHit Start
#    Filtering probes that align to multiple locations as identified in Nordlund et al
#    Removing 10 probes from the analysis.
#
#  Filtering XY Start
#    Filtering probes located on X,Y chromosome, removing 7080  probes from the analysis.
#
#  Updating PD file
#    filterDetP parameter is FALSE, so no Sample Would be removed.
#
#  Fixing Outliers Start
#    Replacing all value smaller/equal to 0 with smallest positive value.
#    Replacing all value greater/equal to 1 with largest value below 1..
#[ Section 2: Filtering Done ]
#
# All filterings are Done, now you have 349956 probes and 130 samples.

The input is the same, but I cannot tell what changed the output because I no longer have the previous files to compare against. Has anything changed in this filtering step?
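
As a rough check, the counts printed in the two logs can be compared directly. The sketch below only does arithmetic on the numbers quoted above. It assumes that the four logged steps are the only ones that drop probes, and for the re-run it uses the NoCG count from the complete log block (1414). The variable names are illustrative.

```r
# Probe counts copied from the two logs above (first run vs. re-run).
# For the re-run, the NoCG count is taken from the complete log block (1414).
removed <- data.frame(
  step    = c("NoCG", "SNPs", "MultiHit", "XY"),
  old_run = c(1375, 881, 10, 6831),
  new_run = c(1414, 904, 10, 7080)
)
remaining <- c(old_run = 345347, new_run = 349956)

# Per-step change in the number of removed probes.
removed$diff <- removed$new_run - removed$old_run
print(removed)

# Implied number of probes entering the filter: remaining + everything removed.
# Assumes the four logged steps are the only ones that drop probes.
starting <- remaining + colSums(removed[, c("old_run", "new_run")])
print(starting)
print(unname(starting["new_run"] - starting["old_run"]))
```

On these numbers the re-run removes more probes at three of the four steps (39, 23, 0 and 249 more) yet ends with more probes, and the implied starting counts are 354444 and 359364, a difference of 4920. If the assumption holds, the two runs may not have started from the same number of probes.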

This matters because the difference affected all downstream analyses, including some of the sample clustering.
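
To make a future comparison possible, the package versions could be saved alongside the results of each run. This is a minimal sketch, not something taken from the original runs. It assumes that the probe lists used for filtering come from the ChAMPdata package, and the output file name is hypothetical.

```r
# Record the versions of the packages involved in filtering.
# ChAMPdata is included on the assumption that it supplies the probe lists.
pkgs <- c("ChAMP", "ChAMPdata")
versions <- vapply(pkgs, function(p) {
  # system.file() returns "" when the package is not installed.
  if (nzchar(system.file(package = p))) {
    as.character(packageVersion(p))
  } else {
    NA_character_
  }
}, character(1))
print(versions)

# Save the full session description next to the results (hypothetical file name).
writeLines(capture.output(sessionInfo()), "champ_filter_sessionInfo.txt")
```

Keeping this file with each set of results would show whether two runs used the same R, ChAMP and annotation versions.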

Many thanks in advance!
