Phylogenetic Inference

The crucial issue in systematics is that the organisms we wish to classify have a history, but we don't know that history. We must infer the sequence of branching events and evolutionary transformations that have taken place. There is a true phylogeny, which we may never know; our task is to collect and analyze data to provide the best estimate of it.

We will work some examples that illustrate the difficulty of this task.

Phenetics: classification based on overall similarity. See fig. 14.4, pg. 378. Start from a matrix of shared character states. Those taxa sharing the greatest number of character states are deemed most similar.
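The counting step behind a phenetic similarity matrix can be sketched in a few lines. This is only an illustration: the taxa A to D, their 0/1 character codings, and the function name `shared_states` are all hypothetical and do not come from the textbook figure.

```python
from itertools import combinations

# Hypothetical character matrix: one row per taxon, one 0/1 state per character.
characters = {
    "A": [1, 1, 0, 0, 1],
    "B": [1, 1, 0, 1, 1],
    "C": [0, 1, 1, 1, 0],
    "D": [0, 0, 1, 0, 0],
}

def shared_states(matrix):
    """For each pair of taxa, count the characters whose states are identical."""
    counts = {}
    for a, b in combinations(sorted(matrix), 2):
        counts[(a, b)] = sum(x == y for x, y in zip(matrix[a], matrix[b]))
    return counts

similarity = shared_states(characters)
# The pair sharing the most character states is the most similar pair.
most_similar = max(similarity, key=similarity.get)
print(similarity)
print(most_similar)
```

Note that the count treats shared 0s and shared 1s alike, which is exactly the "overall similarity" idea: no distinction is made between ancestral and derived states.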

A distance (or similarity) matrix can be derived from morphological measurements, genetic distance measures, etc. Each cell in the matrix is a value indicating the degree of difference (or similarity) between two taxa. The taxa can be clustered by UPGMA (unweighted pair group method with arithmetic mean). The two most similar (least distant) taxa are joined to form a group (e.g., taxa 1 & 2); the length of each branch is half the distance between the two taxa. The next most similar taxon (3) is then joined to the tree, and its distance to the group is calculated as the average of the distance from taxon 1 to taxon 3 and from taxon 2 to taxon 3. At each such step in building the tree, the number of entries in the matrix is reduced by one, and new distance values are calculated as the average distance from each member of the group just formed to each taxon outside that group. This process of joining the most similar remaining pair is continued until all taxa are joined.
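The clustering procedure described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function name `upgma`, the `frozenset`-keyed distance dictionary, and the four-taxon distances are assumptions made for the example.

```python
def upgma(dist):
    """Cluster taxa by UPGMA.

    dist maps frozenset({a, b}) to the distance between taxa a and b.
    Returns a list of (cluster, branch_height) joins in the order they were made;
    clusters are nested tuples of taxon names.
    """
    d = dict(dist)
    size = {taxon: 1 for pair in d for taxon in pair}  # number of taxa per cluster
    joins = []
    while len(size) > 1:
        pair = min(d, key=d.get)            # least distant pair of clusters
        a, b = sorted(pair, key=str)        # fixed order so results are repeatable
        height = d.pop(pair) / 2            # each branch is half the distance
        n_a, n_b = size.pop(a), size.pop(b)
        new = (a, b)
        for other in size:
            d_a = d.pop(frozenset((a, other)))
            d_b = d.pop(frozenset((b, other)))
            # Size-weighted mean equals the average over all members of the new group.
            d[frozenset((new, other))] = (n_a * d_a + n_b * d_b) / (n_a + n_b)
        size[new] = n_a + n_b
        joins.append((new, height))
    return joins

# Hypothetical pairwise distances among four taxa.
distances = {
    frozenset(("A", "B")): 2,
    frozenset(("A", "C")): 6,
    frozenset(("B", "C")): 8,
    frozenset(("A", "D")): 10,
    frozenset(("B", "D")): 12,
    frozenset(("C", "D")): 14,
}
joins = upgma(distances)
for cluster, height in joins:
    print(cluster, height)
```

With these distances, A and B join first at height 1.0, C joins at 3.5 (half of the average of 6 and 8), and D joins last at 6.0 (half of the average of 10, 12 and 14).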

The tree produced is a Phenogram and is one way to infer relationships. Why might this tree not reflect phylogeny (true ancestor-descendant relationships)? 1) Variable evolutionary rates: a faster-evolving taxon will be more different from all others and will appear as an "outgroup". 2) Homoplasy (convergence) will tend to make character states similar between unrelated taxa, and the UPGMA approach will join them.

Cladistics: classification reflects the sequence of branching events, not the degree of difference/similarity. See figures 17.6 and 17.7, pages 471-472. Classification is based on shared derived characters (synapomorphies). Note that relationships are never based on the absence of characters. For example, "Invertebrates" makes sense to us, but refrigerators and pizzas are "invertebrates" too because they don't have backbones, and they clearly are not related to animals. For that matter, plants are invertebrates! The tree produced is a Cladogram and is a hypothesis of relationship. A taxon can evolve at a different rate, but it will tend to accumulate autapomorphies, which are not shared with any other taxa and thus affect the branching pattern less (although variable rates can still lead to incorrect cladograms). How about Homoplasies? They will affect the hypothesis, since characters showing convergences (or parallelisms) will contradict the data from other characters.

This brings us to the topic of Parsimony: in constructing cladograms we seek the branching pattern that requires the fewest evolutionary steps. Example: marine mammals (chosen since we know that it is an example of convergence). It is more parsimonious to evolve fins twice and all the characters that hold mammals together once than it is to evolve fins once and all the characters that ally whales with other mammals twice. We tolerate fins as Homoplasies (=analogies) since this is much more parsimonious than calling all the mammalian characters homoplasies. See fig. 17.13, pg. 485 and work through it.
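The step counting in the marine mammal example can be sketched with a small-parsimony count on two candidate trees. The counting rule used here is the Fitch algorithm for unordered characters, which is an assumption of this sketch and not necessarily the method in the textbook figure; the four taxa, the four 0/1 characters, and the function names are likewise hypothetical.

```python
def fitch_steps(tree, states):
    """Return (possible ancestral states, steps) for one character on a rooted binary tree.

    tree is a taxon name (leaf) or a pair (left, right) of subtrees.
    states maps each taxon name to its state for this character.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    l_set, l_steps = fitch_steps(left, states)
    r_set, r_steps = fitch_steps(right, states)
    common = l_set & r_set
    if common:                                   # subtrees agree: no extra step
        return common, l_steps + r_steps
    return l_set | r_set, l_steps + r_steps + 1  # disagreement costs one step

def tree_length(tree, matrix):
    """Total number of steps over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_chars):
        column = {taxon: row[i] for taxon, row in matrix.items()}
        total += fitch_steps(tree, column)[1]
    return total

# Hypothetical coding; columns: fins, hair, milk, three middle-ear bones.
matrix = {
    "shark":  [1, 0, 0, 0],
    "lizard": [0, 0, 0, 0],
    "cow":    [0, 1, 1, 1],
    "whale":  [1, 1, 1, 1],
}
mammal_tree = ("shark", ("lizard", ("cow", "whale")))  # whale grouped with cow
fin_tree = (("shark", "whale"), ("lizard", "cow"))     # whale grouped with shark

print(tree_length(mammal_tree, matrix))
print(tree_length(fin_tree, matrix))
```

Grouping the whale with the cow costs 5 steps (fins change twice, each mammalian character once), while grouping it with the shark costs 7 (fins change once, each mammalian character twice), so the first tree is preferred.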

Parsimony is central to the cladistic method and can be used both for studying the Polarity (direction of evolution in a transformation series) of characters and for assessing the confidence of hypotheses of relationships. Example: Drosophila chromosome banding patterns (e.g., chromosomal inversions, figs. 17.16 and 17.17, pg. 491, 494). Each species has a distinct pattern of bands in its salivary gland chromosomes. The sequence of bands appears to have been inverted for certain sections of the chromosome during evolution. One can determine a network of likely evolutionary steps from one species to another. Big problem: one can start anywhere in the network. We need to establish where the network begins, i.e., where to Root the tree.

Choose an Outgroup: a taxon (or taxa) known to lie outside the group in question and thus believed to retain the character states ancestral to the ingroup. Requires independent information. Once the outgroup is properly selected, the determination of polarity falls out logically based on parsimony. The identification of an outgroup can also help identify Character reversals = reversal in a trend of character change. An example is winglessness in insects: insects evolved from a wingless myriapod ancestor, but there are groups of derived (i.e., more recently evolved) insects that have no wings (fleas). Wings have been lost in fleas, and this loss represents a character reversal. The use of an outgroup is extremely important in phylogenetic inference, as it allows you to determine the "polarity" or direction of evolution, as illustrated with insect wings. Once a reliable phylogenetic tree has been produced from a data set of characters and properly rooted with an outgroup, one can use the polarity provided by the outgroup to analyze the patterns of character evolution in general (how many times does a character originate during evolution?). See fig. 17.9, pg. 476.
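Outgroup comparison can be sketched as a simple lookup: any ingroup state that differs from the outgroup's state is treated as derived. The taxa, the two characters, their 0/1 coding, and the function name `derived_states` are hypothetical choices for illustration only.

```python
def derived_states(matrix, outgroup, names):
    """For each character, list the ingroup taxa whose state differs from the outgroup's.

    The outgroup's state is taken as ancestral, so the listed taxa carry the derived state.
    """
    ancestral = matrix[outgroup]
    report = {}
    for i, name in enumerate(names):
        report[name] = [
            taxon for taxon, row in matrix.items()
            if taxon != outgroup and row[i] != ancestral[i]
        ]
    return report

# Hypothetical coding; columns: wings, complete metamorphosis (1 = present).
names = ["wings", "complete metamorphosis"]
matrix = {
    "outgroup":   [0, 0],
    "silverfish": [0, 0],
    "dragonfly":  [1, 0],
    "fly":        [1, 1],
    "flea":       [0, 1],
}
report = derived_states(matrix, "outgroup", names)
for name, taxa in report.items():
    kind = "shared derived" if len(taxa) > 1 else "unique or absent"
    print(name, taxa, kind)
```

A comparison this simple cannot by itself tell a retained ancestral state from a reversal: the flea is coded wingless like the outgroup, and only its position on the tree among winged insects shows that its wings were lost.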

Another means of determining the direction of evolution in a transformation series is to study the development of the related taxa. This is not as easy as "Ontogeny recapitulates phylogeny" once claimed, because developmental stages can be lost early and/or late in development, making the inference difficult in some cases. In general, however, development can provide resolving power in studies of transformation series (fig. 17.11, pg. 478).

Compatibility methods: go with the tree that is supported by the largest number of characters. Said another way: the most likely tree is that which is supported by the greatest number of independent characters (the largest "clique" of characters) in which there are no homoplasies.
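The search for the largest clique can be sketched for binary characters. The sketch assumes the pairwise test in which two binary characters conflict only when all four state combinations (00, 01, 10, 11) occur among the taxa, and it finds the clique by brute force, which is practical only for small data sets. The character codings and function names are hypothetical.

```python
from itertools import combinations

def compatible(c1, c2):
    """Two binary characters are compatible unless all four state pairs occur."""
    return len(set(zip(c1, c2))) < 4

def largest_clique(chars):
    """Return the largest set of mutually compatible characters (brute force)."""
    names = list(chars)
    for k in range(len(names), 0, -1):          # try the biggest subsets first
        for subset in combinations(names, k):
            if all(compatible(chars[a], chars[b])
                   for a, b in combinations(subset, 2)):
                return subset
    return ()

# Hypothetical coding; each tuple lists states for taxa in the order
# (shark, lizard, cow, whale, bat).
chars = {
    "fins":      (1, 0, 0, 1, 0),
    "hair":      (0, 0, 1, 1, 1),
    "milk":      (0, 0, 1, 1, 1),
    "ear_bones": (0, 0, 1, 1, 1),
}
clique = largest_clique(chars)
print(clique)
```

Here fins conflict with each mammalian character, so the largest clique is the three mammalian characters, and the tree they support is the one chosen.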