1000 genome filter #90

tkcaccia · 2024-12-10T07:44:55Z

This is an incredible tool.

I am using it to analyze our data from African populations.
Unfortunately, the 1000 Genomes panel does not capture all the genetic variation in African populations. In particular, the Khoisan population is not included.

How could I integrate the 1000 genome data with the data from Simons Genome Diversity Project (https://www.simonsfoundation.org/simons-genome-diversity-project/) for the SNV filter of Monopogen?

ZiyiWang7 · 2024-12-17T16:04:35Z

Hi @tkcaccia,

Thank you for your interest in our package! To integrate the two datasets, you can try expanding the reference file by merging the 1KG3 panel with the Simons Genome Diversity Project panel.
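
One possible way to do the merge, as a rough sketch only: build a `bcftools merge` command per chromosome and index the result. This assumes both panels are bgzip-compressed, indexed, per-chromosome VCFs on the same reference build with the same chromosome naming, and that both are phased. The helper name and the file names are hypothetical, and the function only builds the commands so they can be inspected before running.

```python
import shlex


def bcftools_merge_commands(kg_vcf, sgdp_vcf, merged_vcf):
    """Build (but do not run) the bcftools commands that merge two
    per-chromosome panels into one bgzipped VCF and tabix-index it.

    Hypothetical helper: the inputs are assumed to be bgzipped, indexed,
    and on the same reference build.
    """
    # Sample-wise merge of the two panels, written as compressed VCF.
    merge = ["bcftools", "merge", "-O", "z", "-o", merged_vcf, kg_vcf, sgdp_vcf]
    # Tabix index of the merged file.
    index = ["bcftools", "index", "-t", merged_vcf]
    return [merge, index]


if __name__ == "__main__":
    # Placeholder file names for a single chromosome.
    for cmd in bcftools_merge_commands(
        "1kg.chr20.vcf.gz", "sgdp.chr20.vcf.gz", "merged.chr20.vcf.gz"
    ):
        print(shlex.join(cmd))
```

Sites present in only one panel will have missing genotypes for the other panel's samples after a plain merge, so restricting to shared sites or otherwise handling missingness may be needed before using the result as a reference.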

Then, update the naming convention of the imputation panel in Monopogen.py (line 96). The current naming format is:
imputation_vcf = args.imputation_panel + "CCDG_14151_B01_GRM_WGS_2020-08-05_" + record[0] + ".filtered.shapeit2-duohmm-phased.vcf.gz"
You can modify this line to match the naming convention of your merged reference file.
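
As an illustration of that change, here is a minimal sketch that pulls the hard-coded prefix and suffix into parameters. The function name `panel_vcf_path` and the merged-panel naming in the usage lines are hypothetical, not part of Monopogen. The defaults reproduce the current file name, and the directory is concatenated as-is, mirroring the existing line, so it needs a trailing slash.

```python
# Defaults copied from the existing line in Monopogen.py.
DEFAULT_PREFIX = "CCDG_14151_B01_GRM_WGS_2020-08-05_"
DEFAULT_SUFFIX = ".filtered.shapeit2-duohmm-phased.vcf.gz"


def panel_vcf_path(panel_dir, chrom, prefix=DEFAULT_PREFIX, suffix=DEFAULT_SUFFIX):
    """Return the per-chromosome panel file path (hypothetical helper).

    Uses plain string concatenation, like the original line, so
    panel_dir must end with a path separator.
    """
    return panel_dir + prefix + chrom + suffix


if __name__ == "__main__":
    # Current 1KG3 naming.
    print(panel_vcf_path("panel/", "chr20"))
    # Hypothetical naming for a merged 1KG3 + SGDP panel.
    print(panel_vcf_path("panel/", "chr20", prefix="merged_1KG3_SGDP_", suffix=".vcf.gz"))
```

In Monopogen.py the call would take `args.imputation_panel` and `record[0]` in place of the literal arguments shown here.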

We have not done this kind of integration before, so please double-check the results.
