Freya Hubert edited this page Mar 4, 2022 · 1 revision

How do I interpret the 3D plot?

Each dot in the 3D plot represents one gene. Its color shows the taxonomic assignment that taXaminer made for the gene based on its protein sequence. As stated in the introduction, the position of a dot reflects a combination of coverage, sequence composition and spatial information, i.e. the indicators of whether a gene is possibly contamination or the result of horizontal gene transfer (HGT). The PCA that was applied to these indicators places genes with similar values for certain variables closer together. Each indicator is represented to a different degree in each dimension (or principal component, PC). This representation can be viewed in 'PCA_and_clustering/PCA_results/contribution_of_variables.png|.pdf' and is listed in detail in 'PCA_and_clustering/PCA_results/pca_loadings.csv'. The representation (or loading/contribution) tells you how strongly a variable determines how genes are spread along the respective dimension. Put differently: if variable A contributes strongly to PC 1, then genes with very different values for variable A lie far apart along PC 1, while genes with similar values lie close together.
As a result, genes whose values for the indicators of contamination and HGT deviate from the mean of the gene set can be identified by their outlying or separated position in the plot.
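
The relationship between loadings and dot positions can be illustrated with a small, self-contained sketch. It does not use taXaminer's code or output files: the three indicator variables, the toy data and the choice to standardize before the PCA are all assumptions made for illustration. The contribution is computed here as the squared loading in percent, which is one common definition and may differ from what taXaminer reports.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 "genes" described by 3 made-up indicator variables.
n_genes = 200
var_a = rng.normal(0.0, 1.0, n_genes)
var_b = 0.9 * var_a + rng.normal(0.0, 0.3, n_genes)  # strongly correlated with A
var_c = rng.normal(0.0, 1.0, n_genes)                # independent of A and B
X = np.column_stack([var_a, var_b, var_c])

# Let the last 5 genes deviate strongly from the rest in variables A and B,
# mimicking genes that stand out from the mean of the gene set.
X[-5:, 0] += 6.0
X[-5:, 1] += 6.0

# Standardize each variable (an assumption), then run a PCA via SVD.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)

loadings = Vt.T          # rows: variables A, B, C; columns: PC 1, PC 2, PC 3
scores = Z @ loadings    # coordinates of each gene in PC space (the dot positions)

# Contribution of each variable to each PC in percent (squared loadings).
contrib = 100.0 * loadings**2

print("Contribution (%) of A, B, C to PC 1:", np.round(contrib[:, 0], 1))
print("Mean |PC 1| of deviating genes:", np.abs(scores[-5:, 0]).mean())
print("Mean |PC 1| of all other genes:", np.abs(scores[:-5, 0]).mean())
```

In this toy setup, variables A and B carry most of the contribution to PC 1, so the five genes that deviate in A and B end up far from the bulk of the genes along PC 1. This is the same effect that makes deviating genes appear as outlying dots in the 3D plot.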