Ancestry Analysis using Dimensionality Reduction Techniques
This is the Ancestry pipeline for the Personal Genome Project UK (PGP UK).
EasyAncestry is a simplified, earlier version that gives the user an introduction to the dimensionality reduction techniques Principal Component Analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP). To use it, open the Jupyter Notebook file, enter the vcf.gz file of your choice and follow the prompts to generate a 3-D ancestry graph with the predicted ancestral background. Credit to arkevi/tgviz for his help.
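Before PCA, t-SNE or UMAP can be applied, the genotypes in the vcf.gz file have to be turned into numbers. The sketch below shows one common way to do that, counting the non-reference alleles per genotype (0, 1 or 2 for a diploid call). It is only an illustration: the function names `genotype_to_dosage` and `read_dosages` are hypothetical, and the notebook itself may encode genotypes differently.

```python
import gzip


def genotype_to_dosage(gt):
    """Count non-reference alleles in a VCF GT string such as '0/1' or '1|1'.

    Returns None when any allele is missing ('.').
    """
    alleles = gt.replace("|", "/").split("/")
    if any(allele == "." for allele in alleles):
        return None
    return sum(1 for allele in alleles if allele != "0")


def read_dosages(path):
    """Yield (chrom, pos, [dosage per sample]) for each record in a vcf.gz file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and the column header
            fields = line.rstrip("\n").split("\t")
            # Column 9 (index 8) is FORMAT; find where GT sits in it.
            gt_index = fields[8].split(":").index("GT")
            dosages = [
                genotype_to_dosage(sample.split(":")[gt_index])
                for sample in fields[9:]
            ]
            yield fields[0], int(fields[1]), dosages
```

Stacking the per-variant dosage lists gives a variants-by-samples matrix, which is the usual input shape (after transposing) for a dimensionality reduction step.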
Ancestry is the full program, which lets the user adjust variables such as the number of neighbours. It uses the dimensionality reduction techniques PCA, Neighbourhood Components Analysis (NCA) and UMAP. It can be run either from the home directory or from a Jupyter Notebook. Credit to arkevi/ezancestry for his help.
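To give a feel for what the number of neighbours controls, here is a minimal sketch. It assumes the setting is the k of a k-nearest-neighbours vote over the reduced coordinates. That is an assumption, not a description of the program's code: the setting could equally refer to UMAP's neighbourhood size. The function name `predict_population`, the parameter `n_neighbours` and the labels "A" and "B" are all made up for illustration.

```python
from collections import Counter
import math


def predict_population(query, reference, n_neighbours=5):
    """Predict a label for `query` by majority vote among its nearest points.

    query: tuple of reduced coordinates for one sample.
    reference: list of (coords, label) pairs with known labels.
    n_neighbours: how many of the closest reference points get a vote.
    """
    nearest = sorted(
        reference, key=lambda item: math.dist(query, item[0])
    )[:n_neighbours]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy reference panel in 3-D reduced space (hypothetical values).
reference = [
    ((0.0, 0.0, 0.0), "A"),
    ((0.1, 0.0, 0.1), "A"),
    ((0.2, 0.1, 0.0), "A"),
    ((5.0, 5.0, 5.0), "B"),
    ((5.1, 4.9, 5.0), "B"),
    ((4.8, 5.2, 5.1), "B"),
]

label = predict_population((0.1, 0.1, 0.0), reference, n_neighbours=3)
```

A small value makes the prediction follow the closest few reference samples, while a larger value smooths the vote over a wider neighbourhood.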
Enjoy!