Skip to content

phylogenetic graphs

Mark T. Holder edited this page Feb 14, 2015 · 12 revisions

Phylogenetic graph for the finding a supertree based on ranks

See supertrees using tree ranking page for some broader discussion of the role of a phylogenetic network. This page discusses how to construct a "phylogenetic graph" or "phylogenetic network" used in step 5 "build a phylogenetic graph for each subproblem" of a pipeline for producing a supertree is here..

input

The input is a ranked set of trees with a full tree from the taxonomy as the lowest ranked input.

Comments on mapping of tips

There is a proposal to remap tips to the deepest uncontested taxon that they examplify. So tips of the tree may be associated with higher taxa.

This page assumes that a this proposal or something similar is adpoted to prune any tip that was mapped by a curator to a higher taxon which is contested.

Thus we assume that, at this point in the pipeline, the inputs that are higher taxa are uncontested higher taxa.

preprocessing step: replace higher taxa with polytomies of terminal taxa.

Because we know that each of the tips mapped to a higher taxon is a case of a taxon that is going to be in the final answer, we can unproblematically create a set of inputs with no complexity of mapping to taxonomic groups by replacing such labels with polytomies include the terminal taxa. In fact, (because we know that the last input will bring in all of the leaves), it is safe to remap each leaf of the higher ranked trees as follows.

  1. create a list of all of the tips in the phylo inputs that are non-terminal taxa.
  2. partially sort the list such that, if any of these higher taxa are nested (based on the taxonomy) the shallower taxa come first.
  3. walk through this list remapping by: * A. find the union of terminal taxa used in the phylo (non taxonomy) trees. * B. if the set from A is empty, then choose one terminal taxon (arbitrarily) to stand in for the higher taxon * C. if the set from A is not empty, replace the tip with a polytomy of the set of terminals, but make an annotation on the branch that will indicate that this tree is not providing support for this grouping. Later when support for different branches are tabulated these expansions should not be included because thes are cases in which we use the taxonomy to assume that the group is monophyletic, even though the phylo input just has one tip.

the goal

Imagine that the score for a supertree, s, is a tuple:

( |c(s, N)| , |c(s, N-1)|, |c(s, N-2)|, ... |c(s, 0)| )  

where c(s,N) is the set of groups in the highest ranked input tree N, that are displayed on the supertree. |c(s,x)| indicates the size of the sets.

We would like to take these inputs and find the set of trees that has the highest score attainable if scores are computed lexicographically. So it is better to display "4 groupings from tree 4, and none from trees 0-3" than it is to display "3 groupings from tree 4, and all of the groupings from the lower ranked inputs".

The set of trees may be very large, so we would like to have a compact structure that represents the entire set.

Clone this wiki locally