BACKGROUND

The main goal of this problem set is to give you some first-hand experience in phylogenetic analysis. An additional goal is to make some of the concepts we discuss in lecture more concrete by forcing you to familiarize yourself with a set of organisms and their relationships to one another. Before you begin, get it straight in your mind that there may not be one best answer to the questions we ask, although there are wrong answers. We are primarily interested in having you learn the process of how to do systematics. In short, you are presented with a set of organisms that look different, and are faced with solving the puzzle of who is related to whom. By using different types of information to solve this puzzle we hope that you will learn how evolution shapes different traits in different ways.

This problem set involves two steps: 1) a problem involving vertebrate evolution and 2) a problem with whale phylogeny, in which you collect your own data set from a package of material on reserve in the Science Library (Bio 48 Whale Data) and then use this data set in two pieces of Macintosh software. You will compare the results from your own data set to those from a molecular data set of mitochondrial DNA (mtDNA) sequences. You will do the vertebrate evolution problem and the first part of the whale phylogeny problem at one sitting. You will then go do research on whales and return for further analysis.

The two pieces of Macintosh software work together (i.e., they can read each other's files). One program (PAUP, for Phylogenetic Analysis Using Parsimony) finds the shortest evolutionary tree based on the data set provided. The other program (MacClade) allows you to move branches of the tree around with the mouse and trace the evolution of characters on different trees. This allows you to test different hypotheses about the evolution of characters and play "what if..." (e.g., "what happens to the statistics of the evolutionary tree if we force these species to be more closely related?").


Using the Macintosh Cluster Servers. To use the Cluster programs you need to have a NetID and Password. Follow the directions on the Getting Connected Using a Macintosh in the Clusters handout. You will need to have the CLUSTER.CLASSES and the CLUSTER.MAC "file cabinets" (servers) active on the desktop of the Mac you are using. The MacClade and Paup software are in the CLUSTER.MAC server, listed alphabetically as MacClade 3.01 and Paup 3.0s. In the MacClade folder is a Vertebrate Examples file that you will use in Part 1. The Whale.mtDNA.Bio48 data set that you will use in Part 2 is in the CLUSTER.CLASSES server under the Bio 48 Course folder (open BI0048, open COURSE). Alternatively, you can get the Whale.mtDNA.Bio48 data set from "David Rand's Computer" in the Walter Hall Zone of the Mac campus network (under the Apple menu, select Chooser, click on AppleShare, select the brn_BioMed_Walter_Hall zone, then click on "David Rand's Computer" and log in as Guest).



NOTE: THERE IS A LIMIT TO THE NUMBER OF PEOPLE (37) THAT CAN USE MacClade SIMULTANEOUSLY. PLAN AHEAD AND/OR WORK DURING NON-PEAK HOURS.



THERE IS NO LIMIT TO THE NUMBER OF PEOPLE USING PAUP.


Part 1. Vertebrate Evolution

Open the MacClade 3.01 folder, then open the MacClade Examples folder, then double click on the Vertebrates file. A Data matrix comes up with taxa in the left column, characters along the top of each column and character states in each cell. Inspect the entire data set using the scroll bar at the bottom. If the terms don't make any sense, consult your local medical dictionary or morphology text.

Pull down the Display menu and select Go To Tree Window; a tree will be displayed. Pull down the S menu and select Tree Changes; repeat with Consistency Index. These will be displayed in a box at the bottom of the screen; find them. (Tree changes are the number of character state changes that are imposed by the current structure of the tree. Consistency Index is a measure of how "good" the tree is at evolving these characters: it is the ratio of the minimum number of character state changes the data set could require (one per character when every character has two states) to the length of the current tree displayed. A tree with no reversals or convergence would have a C.I. of 1.0; a tree with twice as many steps as needed would have a C.I. of 0.5.)
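The consistency index arithmetic can be sketched in a few lines of Python. This is only an illustration, not how MacClade computes its statistics: the function names are hypothetical, and the ten-character data set is made up. It assumes the minimum number of changes for a character is its number of states minus one.

```python
def minimum_steps(states_per_character):
    """Fewest changes any tree could need: (number of states - 1) per character."""
    return sum(n - 1 for n in states_per_character)


def consistency_index(min_steps, tree_length):
    """C.I. = minimum possible number of changes / changes on the current tree."""
    if tree_length < min_steps:
        raise ValueError("a tree cannot be shorter than the minimum number of steps")
    return min_steps / tree_length


# Hypothetical data set: 10 characters with two states each (assumption).
states_per_character = [2] * 10
needed = minimum_steps(states_per_character)

print(consistency_index(needed, 10))  # no extra changes: 1.0
print(consistency_index(needed, 20))  # twice as many steps as needed: 0.5
```

As you rearrange branches in the tree window, the treelength goes up or down while the minimum stays fixed, so a shorter tree always has a higher C.I.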

Pull down the Trace menu and select Trace Character. This will highlight in color each branch on which the alternative character states have evolved, and put a character box in the lower right which describes the character being displayed. This box also has a scroll bar so you can examine each character in turn. You will see on the current tree that the amnion has evolved three separate times. Is this parsimonious? Use the scroll bar in the character box to examine the pattern of character evolution for each character on the current tree (if you want to look at the data matrix again, select Data Editor under the Display menu). Make note of which taxa appear to evolve character states independently on this tree; this will help in the next step. Move the scroll bar back so amnion is showing in the character box.

Now you will make the evolution of the characters more parsimonious by rearranging the branches on the tree. Use the mouse pointer to click on branches and drag them to new locations on the tree (point to a branch, click the mouse button, hold it down while you move the pointer to a new branch position and release the button). Make the amnion evolve once on the tree. Make a note of the new C.I. and Treelength. Is the new tree parsimonious with respect to other characters? (use scroll bar again).
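To see why moving branches changes the number of times a character evolves, here is a minimal Python sketch that counts changes for one two-state character on two different trees. It uses Fitch's counting method for unordered characters, which is an assumption about how to count and not a description of MacClade's internals. The six taxa, the two tree shapes, and the name `fitch_steps` are all hypothetical; this is not the Vertebrates file.

```python
def fitch_steps(tree, states):
    """Count the fewest changes one character needs on a rooted binary tree.

    `tree` is a nested pair of tuples with taxon names at the tips;
    `states` maps each taxon name to its character state.
    """
    steps = 0

    def state_set(node):
        nonlocal steps
        if isinstance(node, str):           # a tip: its observed state
            return {states[node]}
        left, right = (state_set(child) for child in node)
        common = left & right
        if common:                          # children agree: no change needed
            return common
        steps += 1                          # children disagree: one change
        return left | right

    state_set(tree)
    return steps


# Hypothetical character: 1 = amnion present, 0 = amnion absent.
amnion = {"rayfinned": 0, "lungfish": 0, "frog": 0,
          "lizard": 1, "bird": 1, "mammal": 1}

# Amniotes scattered across the tree versus gathered into one group.
scattered = (("rayfinned", "lizard"), (("lungfish", "bird"), ("frog", "mammal")))
grouped = ("rayfinned", ("lungfish", ("frog", ("lizard", ("bird", "mammal")))))

print(fitch_steps(scattered, amnion))  # 3 changes
print(fitch_steps(grouped, amnion))    # 1 change
```

Summing this count over every character gives the treelength, which is why a rearrangement that helps one character can still make the whole tree longer.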

Problem 1: Find the shortest tree (Treelength = 19; yes, 19! There is more than one tree of length 19, can you find them??). Use the patterns of character evolution as a guide to finding the shortest tree.

Question 1: i) Are reptiles a natural group (monophyletic)? ii) What term do we use to describe this kind of "group"? iii) The two fish taxa (ray-finned fishes and lungfishes) are not sister taxa, hence are not a natural group. Does the treelength change when fish are put as a natural group (monophyletic) versus a paraphyletic group? iv) Look at the data set (under the Display menu, select 'Go to Data Editor'), and provide a clear justification for your answer in part iii. v) Which character is the least parsimonious in vertebrate evolution?
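If the idea of a natural group is still fuzzy, the test can be written out explicitly: a group is monophyletic on a rooted tree when some branch carries exactly those taxa and no others. The Python sketch below uses a made-up five-taxon tree with placeholder names A to E; the function names are hypothetical and the tree is not an answer to any question above.

```python
def tips(node):
    """Return the set of taxon names at or below this node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips(child) for child in node))


def is_monophyletic(tree, group):
    """True if some clade in the rooted tree holds exactly the taxa in `group`."""
    group = set(group)
    if tips(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Hypothetical rooted tree with placeholder taxa.
tree = ("A", ("B", ("C", ("D", "E"))))

print(is_monophyletic(tree, {"C", "D", "E"}))  # True: one branch holds exactly C, D, E
print(is_monophyletic(tree, {"B", "C", "D"}))  # False: the smallest clade with B, C, D also holds E
print(is_monophyletic(tree, {"A", "B"}))       # False: A and B are not sister taxa
```

The second case is the pattern to watch for: the group shares a common ancestor but leaves out one of that ancestor's descendants.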

Part 2. Whale Phylogeny

Open the Paup folder in the CLUSTER.MAC folder, and double click on the PAUP 3.0s application. After it launches, open the Whale.mtDNA.Bio48 file (pull down the File menu and select Open; you will need to go to the CLUSTER.CLASSES folder to access the Whale.mtDNA.Bio48 file). DO NOT change any of the text or the program will not run. (If you do by mistake, just quit without saving anything and re-open the file.)

Pull down the File menu and select Execute "Whale.mtDNA.Bio48". Click YES if prompted with questions. The screen should say Processing of file "whale.mtDNA.Bio48" completed.

Pull down the Data menu and select Define Outgroup. A dialogue box will come up; click on Camel then click on >To Outgroup> and the Camel will be moved over to the outgroup box. This tells PAUP that the camel is known to lie outside the group of taxa in the rest of the data set, and helps to root the tree. Click OK.

Pull down the Search menu and select Heuristic, then click on Search. The program will read the data set that appeared in the first window (sequences of the cytochrome b gene in whale mitochondrial DNAs) and will search for the shortest tree. When the computer has finished searching (beep or flash), note the number of trees saved and the number of rearrangements tried, then click Close.

Pull down the Trees menu and select Describe Trees. A dialogue box comes up; click on the Rooting button. A new dialogue box comes up; select three buttons: Outgroup rooting, then Make Ingroup Monophyletic and then Make Outgroup Monophyletic Sister Group to Ingroup. What do these mean? Click OK. It goes back to the Tree Description Option box. In the boxes shown, click on Cladogram and Phylogram. These will lead to the display of your phylogeny as a simple cladogram, and as a phylogram where the branch lengths are proportional to the number of changes along each branch, akin to an evolutionary distance. If there is more than one tree listed in the box in the upper left (only one now, but might be >1 with your own data set), select the ones you want. Click OK. Trees are displayed with simple statistics.

To save your tree, pull down the Trees menu and select Save Trees at the bottom. Save the tree to your floppy. Now you know how to use PAUP. Play around with the various menus and options.

Extra Credit: One interesting variation is the Bootstrap (under the Search menu). This builds a new data set of the same size by drawing characters at random, with replacement, from your data set, so that some characters are left out and others are counted more than once, and then builds a tree from the resampled data set. The original data set is then restored and a different random sample of characters is drawn. This process is repeated as many times as you specify in the "Number of Replications" box that comes up when you run the Bootstrap. When the bootstrap is finished, click on the Close button, and a tree will scroll onto the window. The numbers on the branches represent the percent of bootstrap replications in which all members of the group included in that branch were clustered together. This is a sort of significance test for the reliability of the cluster. For example, a bootstrap value of 65 means that 35% of the bootstrapped data sets failed to cluster all members of the group together. High numbers (70-90) give you confidence that the group in question is "supported" by the data.
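A minimal Python sketch of the resampling step, assuming the standard bootstrap in which characters are drawn at random with replacement, so each replicate leaves some characters out and counts others more than once. It does not build trees and is not PAUP's implementation; the tiny matrix, the function names, and the replicate results are all invented for illustration.

```python
import random


def bootstrap_columns(n_chars, rng):
    """Draw n_chars character indices at random, with replacement."""
    return [rng.randrange(n_chars) for _ in range(n_chars)]


def resample(matrix, columns):
    """Build a replicate data set that keeps every taxon but uses the drawn characters."""
    return {taxon: [row[c] for c in columns] for taxon, row in matrix.items()}


def support(replicate_clades, group):
    """Percent of replicates whose tree contains `group` as a cluster."""
    group = frozenset(group)
    hits = sum(group in clades for clades in replicate_clades)
    return 100 * hits / len(replicate_clades)


# Made-up matrix: 3 taxa scored for 6 two-state characters.
matrix = {"A": [0, 0, 1, 1, 0, 1],
          "B": [0, 1, 1, 1, 0, 0],
          "C": [1, 1, 0, 0, 1, 0]}

rng = random.Random(48)                     # fixed seed so the draw is repeatable
columns = bootstrap_columns(6, rng)
replicate = resample(matrix, columns)
print(columns, replicate)

# Invented outcome: the cluster {A, B} appears in 13 of 20 replicate trees.
with_ab = {frozenset({"A", "B"})}
without_ab = {frozenset({"B", "C"})}
replicates = [with_ab] * 13 + [without_ab] * 7
print(support(replicates, {"A", "B"}))      # 65.0
```

The bootstrap value on a branch is just this percentage, which is why more replications give a steadier number.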

Check it out. For starters, try only 25 replications; BE FOREWARNED: it can take a long time to run when many people are on the network, but if the runs go quickly, try 100 replications. Draw the bootstrap tree in your notes, with the bootstrap values; the simple Save command only saves the tree shape.

Question 2: Manipulate the phylogenetic position of the three non-cetacean outgroup taxa and answer: can you prove which group (camel, cow or hippo) is the sister taxon to the Cetacea? Briefly describe a comparison between specific tree statistics that supports your answer.

You are done with the first session.

Collect Cetacean Data. On the Bio 48 Web page is a preliminary data set and some links to various sources of information about whales, cetaceans in general, and their relationship to other mammals. There are also some informative books on reserve at the Science Library. Use these materials (and other references if you know some good ones) to find characters and character states for each of the species in question. You can use morphology, color, ecology, geographic range, etc. Characters must have two or more character states and they should be phylogenetically informative (see lectures on Systematics and Phylogenetic Inference). Avoid using more than three character states per character, if possible. Some characters are listed as ranges of values (e.g., Body Length: x - y...). For data of this type, break the range up into intervals that can be coded as discrete character states. Try to use a good combination of characters that you think are biologically meaningful (morphology, ecology, geography). Find a minimum of 10 additional characters beyond what can be derived from the preliminary data set (the more the better; you always want more characters than taxa. WHY?). What you are trying to do is compare the tree produced from the mitochondrial DNA data to a tree produced from your own data so that you can study the evolution of characters.
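Coding a continuous measurement into discrete states can be sketched as follows. This is one possible approach, not a required method: the cutoffs are arbitrary, the body lengths are invented placeholders and not real measurements, and the function names are hypothetical. The second function applies the usual parsimony rule of thumb (an assumption here) that a character is informative only if at least two of its states each occur in at least two taxa.

```python
from collections import Counter


def code_length(length_m, cutoffs=(5.0, 15.0)):
    """Return state 0, 1, or 2 for a body length in meters (cutoffs are arbitrary)."""
    return sum(length_m >= c for c in cutoffs)


def is_informative(states):
    """True if at least two states each occur in at least two taxa ('?' = missing)."""
    counts = Counter(s for s in states if s != "?")
    return sum(n >= 2 for n in counts.values()) >= 2


# Invented placeholder lengths for six unnamed taxa.
lengths = {"taxon A": 2.0, "taxon B": 4.5, "taxon C": 9.0,
           "taxon D": 12.0, "taxon E": 25.0, "taxon F": 30.0}

coded = {taxon: code_length(value) for taxon, value in lengths.items()}
print(coded)                                  # states 0, 0, 1, 1, 2, 2
print(is_informative(list(coded.values())))   # True
print(is_informative([0, 0, 0, 0, 0, 1]))     # False: state 1 occurs in only one taxon
```

A character whose derived state shows up in a single species costs one step on every tree, so it cannot help you choose among trees.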

Enter your data set into MacClade. Get into MacClade (see above), pull down the File menu, and close any open file (e.g., Vertebrates). Open a New file in a dialogue box (or under the File menu). You will see a data editor window as before, but with no taxa or characters. Use the mouse to expand the number of rows (to 10 taxa) and columns (to the number of characters you have found in your research) by clicking and dragging the small boxes in the data window. Now pull down the Display menu and select State names... You will be presented with a box for naming your characters at the top and the character states in the boxes listed vertically. This tells the program what to recognize when you type your data into the data editor (if confused, go back to the Vertebrate data set from Part 1 and look at the data window). The scroll bar at the top of the box allows you to move to the next character when you have named all the character states for a character. It will scroll only as far as the number of columns you defined with the mouse in the data editor; if you need more, go back to the data editor and move the right edge out.

Type your characters and character states into the data editor (remember to code continuous values as discrete states!). The little ruler in the upper left corner of the Data Editor allows you to widen or narrow the columns to fit words you use in naming characters. As you enter the data, Save the file: pull down the File menu and select Save. Note: you cannot save files on the Cluster server, so you must save the file on the Desktop or Hard Disk (or a floppy). Click on the rectangular box just above the list of folders in the dialogue box (box with scroll bar on right edge) and choose Desktop (or Hard Disk). When you can save the file, the Save button at the lower right will be active. Name your file in the box provided (don't name it "Whales"; try Lastname.data or Ilovewhales, or WhyDoWeHaveToDoThis.data, or something original). When it is successfully saved, close the file (under the File menu), but leave MacClade running.

Open your file under PAUP. Launch PAUP from the CLUSTER.MAC folder (also an icon on the right side of your screen; it may be under your current window). After PAUP opens, pull down the File menu and select Open. A dialogue box will appear; click on the rectangular bar above the scroll box and locate your data file from MacClade. Your new whale data file should appear in the window; double click on it to open it. Now go through the steps described above for running this data set under PAUP and finding a tree (set Camel to the outgroup). Because your data set may have a lot of parallel or convergent (= 'homoplasious') changes, PAUP may find several trees that are the same length. If the program produces many trees, your data set is not very informative and you may need better character data. Examine the tree (or first few trees) using the Describe Trees... Rooting procedure described above. Study the tree(s) and statistics.

Analyze character evolution. Now go back to MacClade (click on the icon in the upper right corner) and activate your data file. Pull down the Display menu and select (Go To) Tree Window... Using the mouse, build the tree that PAUP generated from your data set. Select Treelength and C.I. under the S menu and Trace Character under the Trace menu (see above, Part 1). Scroll through the characters as you did with the Vertebrates problem and examine the patterns of character evolution. Now, use the mouse to rearrange the branches of the morphology-based tree so that it matches the tree that PAUP generated from the mitochondrial DNA data set. You may need to open the mtDNA tree or re-run the data set under PAUP to get this tree. Compare patterns of character evolution (i.e., Trace characters): which characters are most and least parsimonious in character state changes in the two different tree topologies?

WRITE-UP: You should hand in the following: 1) short but coherent and meaningful answers to the questions below, 2) a printout of your whale data set, 3) a printout of the shortest tree(s) that PAUP found from your data set, including C.I. and Treelength, 4) a printout of one informative tree from MacClade that shows how different characters you found evolve on the tree based on the mitochondrial DNA, and 5) a written analysis of important observations about the analysis of character evolution. Your written answers together should not exceed 1 1/2 pages! Do not print out anything until you have your final versions.

Don't forget Questions 1 and 2 above.

Question 3: i) How many different trees did PAUP find from your data set, and how do your trees compare to those of the mtDNA data set? ii) Which character(s) show the most disagreement between the morphological and mtDNA trees (i.e., are parsimonious in one topology but not in the other)? Support for this should be provided in the trees printed from PAUP and MacClade. iii) How does the comparison between the morphological and molecular trees provide evidence for adaptive evolution?

Question 4: In the whales, is the suborder Odontoceti (toothed whales) a natural group? How do the traditional groups of Odontoceti and Mysticeti relate to the concepts of grade and clade?

Question 5: Most cetaceans have lost their hind limbs, but vestigial pelvic girdles are found in some of the baleen whales. What does this fact say about the selection pressures on pelvic girdles in the three lineages: porpoises + dolphins, baleen whales, and sperm whales?

Question 6: What is the difference between: "whales are descended from hippos", and "whales are sister taxon to hippos"? Can DNA studies from living organisms distinguish between these statements? Explain the role fossils could play in addressing these questions.