MOLECULAR SYSTEMATICS
Molecular biology has revolutionized the field of systematics. DNA
evolves as mutations arise and become fixed in populations.
This leads to divergence of DNA sequences in different species.
Although they have diverged, two DNA sequences can still be homologous
(just as morphological traits such as forelimbs can be). This nicely
demonstrates descent with modification as a definition of evolution.
For this reason, DNA should be an excellent tool for inferring phylogenies:
it provides a large number of homologous characters that should be (??) less subject
to convergent evolution than other characters that might lead to a confusion
of grade and clade.
To estimate phylogenies we must first estimate how much sequence
divergence there has been among the various taxa we want to study.
Several methods are available. Direct sequencing: elegant molecular
methods are available; it is a lot of work but provides lots of data.
Each nucleotide position is a character, and the nucleotide present
at that site is the character state.
A character is phylogenetically informative when a nucleotide
change is shared by two or more taxa. A character is phylogenetically
uninformative when all nucleotides are the same among taxa, or when
only a single taxon has a different nucleotide.
I = informative site, U = uninformative variable site
  |   | I |   |   | U |   |   |   | I |   |   |
A | C | T | C | G | A | C | T | A | G | A | T |
A | C | T | C | G | T | C | T | A | G | A | T |
A | C | A | C | G | T | C | T | A | G | A | T |
A | C | A | C | G | T | C | T | A | C | A | T |
A | C | A | C | G | T | C | T | A | C | A | T |
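The classification of sites in the table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_sites` is hypothetical, and a site is counted as informative only when at least two different nucleotides each occur in at least two taxa. Invariant sites, which are also uninformative, are marked "-" to set them apart from sites where a single taxon differs.

```python
from collections import Counter

# The five aligned sequences from the table above, one string per taxon.
ALIGNMENT = [
    "ACTCGACTAGAT",
    "ACTCGTCTAGAT",
    "ACACGTCTAGAT",
    "ACACGTCTACAT",
    "ACACGTCTACAT",
]

def classify_sites(alignment):
    """Label each column: 'I' informative, 'U' variable but uninformative, '-' invariant."""
    labels = []
    for column in zip(*alignment):
        counts = Counter(column)
        # Number of character states that are present in two or more taxa.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if len(counts) == 1:
            labels.append("-")   # every taxon has the same nucleotide
        elif shared_states >= 2:
            labels.append("I")   # the change is shared by two or more taxa
        else:
            labels.append("U")   # only single taxa differ
    return "".join(labels)

print(classify_sites(ALIGNMENT))  # --I--U---I--
```

Sites 3 and 10 come out informative and site 6 uninformative, matching the I and U marks in the table.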
A less direct method of determining sequence divergence is restriction
enzyme mapping. Restriction enzymes recognize specific sequences
in the DNA and cut the DNA strands there. Depending on the location of the
recognition sites in the DNA, the restriction enzyme digest generates
DNA fragments of various lengths, resulting in a restriction
fragment pattern. One can also determine where various restriction
enzymes cut a given piece of DNA and draw up a restriction map of
that stretch of DNA. The extent to which two restriction maps (or restriction
fragment patterns) are similar serves as an estimator of sequence similarity
(or difference). Restriction enzymes recognize only a fraction of
the entire DNA sequence, so one will not detect all the differences between
two stretches of DNA. The data serve as an estimate, and because the method usually
involves less work than sequencing it can be applied to many individuals.
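How a single nucleotide change alters a restriction fragment pattern can be sketched as follows. The two sequences are made up for illustration, and the function names are hypothetical. The enzyme shown is EcoRI, which recognizes GAATTC and cuts after the G. The shared-site fraction at the end (twice the number of shared sites over the total number of sites in both maps) is one simple similarity measure, used here as an assumption rather than as the method of these notes.

```python
def cut_positions(seq, site, offset):
    """Positions where a linear sequence is cut; `offset` is the cut point within the site."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions

def fragment_lengths(seq, site, offset):
    """Fragment lengths from a complete digest of a linear molecule, longest first."""
    cuts = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return sorted((b - a for a, b in zip(cuts, cuts[1:])), reverse=True)

def shared_site_fraction(sites_x, sites_y):
    """2 * (shared sites) / (sites in x + sites in y), for two maps on the same coordinates."""
    x, y = set(sites_x), set(sites_y)
    if not x and not y:
        raise ValueError("neither map contains any sites")
    return 2 * len(x & y) / (len(x) + len(y))

# Hypothetical sequences: SEQ_B has lost the second site (GAATTC -> GAATTT).
SEQ_A = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
SEQ_B = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTT" + "TTTT"

print(fragment_lengths(SEQ_A, "GAATTC", 1))  # [14, 9, 5]
print(fragment_lengths(SEQ_B, "GAATTC", 1))  # [23, 5]
print(shared_site_fraction(cut_positions(SEQ_A, "GAATTC", 1),
                           cut_positions(SEQ_B, "GAATTC", 1)))
```

One substitution removes a site and merges two fragments into one, but the pattern does not reveal which of the six positions in the recognition sequence changed.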
DNA-DNA hybridization is another indirect way of obtaining estimates
of DNA sequence divergence between two taxa. DNA strands are melted apart
at high temperature and allowed to "reanneal" in the presence
of the DNA of another species. One species' DNA has been labeled with a
radioactive nucleotide. This forms heteroduplex DNA. The heteroduplex
DNA is gradually heated, and the amount of single-stranded DNA that has
"melted" apart at each temperature step is determined from the amount of
radioactive label counted in the fraction collected at that
step. Very similar DNA will melt at a high temperature and heteroduplexes
between diverged DNAs will melt at a lower temperature. Sequence divergence
is proportional to the reduction in melting temperature relative to
same-species duplex DNA. See fig. 17.19, pg. 497.
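A melting curve can be reduced to a single number by finding the temperature at which half of the labeled DNA has melted, then comparing the heteroduplex with a same-species control. The sketch below assumes that each count is the label released at that temperature step and interpolates linearly between steps. The temperatures, counts, and function name are invented for illustration. Converting the temperature difference into percent divergence needs a calibration that these notes do not give.

```python
def median_melting_temperature(temps, counts):
    """Temperature at which 50% of the labeled DNA has melted (linear interpolation)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("no radioactive label counted")
    cumulative = 0.0
    prev_temp, prev_frac = None, 0.0
    for temp, count in zip(temps, counts):
        cumulative += count
        frac = cumulative / total
        if frac >= 0.5:
            if prev_temp is None:
                return float(temp)  # half melted already at the first step
            # Interpolate between the previous step and this one.
            return prev_temp + (temp - prev_temp) * (0.5 - prev_frac) / (frac - prev_frac)
        prev_temp, prev_frac = temp, frac
    raise ValueError("cumulative counts never reached 50%")

# Hypothetical data: label counted at each temperature step (degrees C).
TEMPS = [60, 65, 70, 75, 80, 85, 90]
HOMODUPLEX = [0, 0, 50, 100, 350, 400, 100]    # same-species control
HETERODUPLEX = [0, 100, 400, 400, 100, 0, 0]   # two different species

t_homo = median_melting_temperature(TEMPS, HOMODUPLEX)
t_hetero = median_melting_temperature(TEMPS, HETERODUPLEX)
print(t_homo - t_hetero)  # 10.0
```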
Phenetic approaches: DNA hybridization, sequence divergence from
sequencing, restriction patterns or restriction maps. In each case the
data are in the form of a single number indicating the similarity
or difference between the DNAs of each pair of species in the study.
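For sequence data, the "single number" for each pair of species can be as simple as the proportion of sites at which the two sequences differ. A minimal sketch, using the five aligned sequences from the table earlier in these notes; the taxon names and function names are hypothetical.

```python
from itertools import combinations

# The aligned sequences from the table earlier in these notes; names are made up.
SEQUENCES = {
    "taxon1": "ACTCGACTAGAT",
    "taxon2": "ACTCGTCTAGAT",
    "taxon3": "ACACGTCTAGAT",
    "taxon4": "ACACGTCTACAT",
    "taxon5": "ACACGTCTACAT",
}

def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def distance_table(seqs):
    """One number per pair of taxa, keyed by the pair of names."""
    return {(m, n): p_distance(seqs[m], seqs[n]) for m, n in combinations(seqs, 2)}

for (m, n), d in distance_table(SEQUENCES).items():
    print(f"{m} {n} {d:.3f}")
```

Five taxa give ten pairwise distances. Note that the table of distances no longer records which sites differ, which is what separates this approach from a cladistic one.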
Cladistic approaches: direct sequencing, restriction maps and
restriction fragments (fragments less desirable). In each case one
looks for shared derived character states (nucleotide, restriction recognition
site) among taxa. With restriction sites, shared loss is unreliable
as a uniting character because the nucleotide change that destroys a site
could have occurred anywhere in the recognition sequence, so two taxa can
lose the same site independently. Like the refrigerator/pizza/invertebrate
example.
Molecular approaches to systematics force us to think about the rates
of molecular evolution. If DNA or proteins evolved at a constant rate
in all species, then one could use estimates of sequence divergence to
build very reliable phylogenies. If there were a single molecular clock
we could determine the "true phylogeny". The fact is, there is no
one molecular clock.
Different proteins and DNA sequences evolve at different rates. Why?
Different functional constraints. Different proteins do different
things, and some can do their structural or functional job with any of several
different amino acids at many of the positions (fibrinopeptides).
Other proteins will not function properly with almost "any" amino acid
change (histones: two amino acid differences between peas and cows!).
Intron sequences are less constrained than coding exon sequences, and
hence introns tend to diverge faster than exons. Synonymous sites
evolve faster than non-synonymous sites, again due to different functional
constraints (i.e., some form of selection against "incorrect"
sequences). Nuclear DNA tends to evolve more slowly than mitochondrial DNA (in
vertebrates). Unit evolutionary period: the time required to observe
a given unit of divergence. A 1% divergence of vertebrate mitochondrial
DNA takes about 250,000 to 500,000 years.
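The unit evolutionary period turns an observed divergence into a rough time estimate by simple multiplication. A minimal sketch using the vertebrate mtDNA figure above; the function name is hypothetical, and the calculation assumes divergence accumulates linearly, which holds only for small divergences.

```python
def divergence_time_range(percent_divergence, years_per_percent=(250_000, 500_000)):
    """Rough (low, high) time in years needed to accumulate the observed divergence.

    Assumes a linear accumulation at the rate quoted for vertebrate mtDNA.
    """
    low, high = years_per_percent
    return percent_divergence * low, percent_divergence * high

# Hypothetical example: two populations whose mtDNA differs by 4%.
print(divergence_time_range(4))  # (1000000, 2000000)
```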
What gene do I use??: Depends on the taxa you are studying and
the amount of divergence among them. Histones are good for "macrosystematics";
fibrinopeptides and mtDNA are good for "microsystematics" or population
level phylogenies. See fig. 17.15, pg. 489.
Note problems with tree building from data: unequal rates and
convergence.
As if the choice of gene/protein were not problem enough: what if different
lineages evolve at different rates? Test for this with the relative
rate test. Compare the paths from two different taxa to a third taxon (an outgroup).
If the path lengths are the same, the two taxa are evolving at the same rate; if not,
at different rates. Extreme rate fluctuations are a problem; slight ones are
not, as they would not lead to regrouping of taxa (depending on how "slight"
is defined).
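The relative rate test comes down to arithmetic on three pairwise distances. The sketch below assumes that the third taxon C is an outgroup to both A and B, so the paths from A and from B to C share everything beyond their common ancestor. The distances are hypothetical counts of differences per 100 sites, and the function name is made up.

```python
def relative_rate(d_ab, d_ac, d_bc):
    """Branch lengths from the A/B common ancestor to A and to B, and their difference.

    Assumes C is an outgroup to both A and B.
    """
    branch_a = (d_ab + d_ac - d_bc) / 2
    branch_b = (d_ab + d_bc - d_ac) / 2
    # The difference between the two branches equals d_ac - d_bc.
    return branch_a, branch_b, d_ac - d_bc

# Equal paths to the outgroup: same rate in both lineages.
print(relative_rate(10, 20, 20))  # (5.0, 5.0, 0)

# Unequal paths: lineage A has accumulated more change than lineage B.
print(relative_rate(10, 24, 20))  # (7.0, 3.0, 4)
```

Whether a given difference is large enough to matter needs a statistical test, which this sketch leaves out.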
Convergence over long stretches of DNA is unlikely, although it has
been reported for lysozyme. Another kind of "convergence" can
occur due to the limited number of character states in DNA. Back mutations:
e.g. A changes to T, T changes to C and C changes back to A. Could occur
in one step or many. Maximal random divergence: with only four nucleotides,
even unrelated sequences are expected to show about 25% similarity by chance.
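Because back mutations and repeated changes at a site hide earlier changes, observed divergence underestimates the true amount of change and levels off as similarity approaches 25%. One common correction, not part of these notes and named here as a swapped-in technique, is the Jukes-Cantor formula d = -(3/4) ln(1 - 4p/3), which assumes that all four nucleotides are equally frequent and that all changes are equally likely. A minimal sketch:

```python
import math

def jukes_cantor_distance(p):
    """Estimated substitutions per site from the observed proportion of differing sites p.

    Assumes equal nucleotide frequencies and equal rates for all changes.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75); at 0.75 the sequences look random")
    return -0.75 * math.log(1 - 4 * p / 3)

# Observed divergence versus the corrected estimate.
for p in (0.05, 0.30, 0.60, 0.70):
    print(f"{p:.2f} {jukes_cantor_distance(p):.3f}")
```

The corrected estimate stays close to the observed value at low divergence and grows without bound as the observed difference approaches 75%, that is, 25% similarity.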
Nonetheless, molecular tools have allowed major leaps in our understanding
of biological diversity. Bacterial evolution: three kingdoms, not five.
Endosymbiont hypothesis: essentially proven. AIDS virus: rapid evolution
is good for the virus, bad for us.