I haven't wandered into FHF much recently, but since this is a topic of interest to me...
mtDNA does typically do a pretty good job, in part because it evolves relatively rapidly, so you can get a well resolved tree, but also because it usually isn't experiencing the types of selection that can really screw up phylogenetic analyses. For example, say two closely related species (but NOT sister species, other species are more closely related to both) both experience directional selection for increased body size. If you use a gene involved in that increased body size to try to estimate the phylogeny, they may well have similar mutations due to similar selection for larger body size. Thus, it's possible that you recover an inaccurate phylogeny with those two larger species as sister taxa because they experienced similar selective pressures for larger body size. There are also some genes, such as Major Histocompatibility Complex (MHC), where there is strong 'balancing selection' - selection for variability (MHC is involved in immune response, so individuals with lots of different MHC alleles are better able to fight off infection than individuals with a bunch of copies of the same MHC alleles). Balancing selection can also really screw up analyses, and result in a big comb phylogeny with little support for relationships. The patterns caused by balancing or directional selection really screw with analyses, and it can be really difficult to detect them and account for them, hence why genes that are either selectively neutral or under weak purifying selection (i.e. mutations that would change the protein being coded for are selected against) tend to be a lot better for estimating phylogeny - we can model how neutral sequence evolves really well, and other kinds of selection can really mess things up. Anyhow, getting back to mtDNA: it doesn't typically experience the types of selection that can screw up phylogenies.
That being said, it's still a single locus, a single estimate of the relationships, so it's got the issues of incomplete lineage sorting that Squam explained.
I disagree. Most of the commonly used mitochondrial markers are coding loci and are unquestionably under selection... and it isn't just directional or diversifying selection that can cause problems, purifying selection can also screw up phylogenetic analyses. AFAIK, few people--if any--make an effort in phylogenetic studies to quantify how strong that selection is, so I don't think saying that it is typically weak and therefore unproblematic is warranted. Purifying selection's main effect is to limit the available character space--most of the possible sequences are not available because they would not yield a functioning protein. As available character space goes down, the likelihood of homoplasy--and thus poor phylogenetic inference--increases. The inverse is true for mutation rates; as mutation rates go up, the likelihood of homoplasy increases. Commonly-used mtDNA markers have both small available character space and high mutation rates. Among sufficiently closely related terminals, this is probably not a significant problem; for deeper nodes, it's a big problem.
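A toy simulation can illustrate the two effects described above. This is only a sketch under simplifying assumptions, not a model of any real mtDNA marker: `free_sites` stands in for the number of positions that purifying selection leaves free to vary, `n_states` is the number of states allowed at each such site, and `n_subs` is the number of substitutions along each of two independent lineages. All names and parameter values are hypothetical. It counts sites where both lineages independently end up in the same derived state, i.e. chance homoplasy.

```python
import random


def parallel_changes(free_sites, n_states, n_subs, rng):
    """Count sites where two independent lineages share the same derived state."""
    ancestor = [0] * free_sites  # state 0 is the ancestral state at every site

    def evolve():
        seq = ancestor[:]
        for _ in range(n_subs):
            i = rng.randrange(free_sites)
            # Substitute to any allowed state other than the current one.
            seq[i] = rng.choice([s for s in range(n_states) if s != seq[i]])
        return seq

    a, b = evolve(), evolve()
    return sum(1 for x, y in zip(a, b) if x == y and x != 0)


def mean_homoplasy(free_sites, n_states, n_subs, reps=2000, seed=1):
    """Average number of chance-convergent sites over many replicates."""
    rng = random.Random(seed)
    total = sum(parallel_changes(free_sites, n_states, n_subs, rng)
                for _ in range(reps))
    return total / reps


if __name__ == "__main__":
    # Shrinking the character space at a fixed number of substitutions.
    print("500 free sites, 20 subs:", mean_homoplasy(500, 4, 20))
    print(" 50 free sites, 20 subs:", mean_homoplasy(50, 4, 20))
    # Raising the number of substitutions at a fixed character space.
    print("500 free sites, 60 subs:", mean_homoplasy(500, 4, 60))
```

In this toy setup, shrinking `free_sites` or raising `n_subs` both increase the average number of convergent sites, which is the qualitative pattern the argument relies on. It ignores rate variation among sites, unequal base frequencies, and tree shape.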
There are also cases, not super common, in which mitochondrial genomes introgress from one species to another following a hybridization event. Again, this movement of mitochondrial genomes from one species to another is relatively uncommon, so it's definitely not something that should be latched onto without strong additional data, such as from multiple nuclear loci, but it is something that happens. So, we're moving more and more away from mtDNA-only studies, and relying more and more on multiple loci, and using methods that account for the possible differences between gene trees to estimate the species-level phylogeny (i.e. the 'species tree' methods Squam mentioned).
I don't think this is nearly as rare as people think it is. It isn't reported too frequently--but then, datasets involving strong nuclear and mitochondrial topologies that would allow us to see this kind of anomalous pattern in mtDNA are still uncommon (and downright rare in the herp world), so the safest statement would probably be that we know it occurs but do not have enough data to allow us to confidently infer that it is either rare or common.
Anyway, when you do have discordance between, say, the mtDNA and morphology (such as between mtDNA and coloration in the eastern Pantherophis), I think it depends heavily on what the other data is. If it's a character that likely evolves fairly rapidly and is likely to change a lot, like coloration, I'd rely more heavily on the mtDNA, but if it's something that is unlikely to change much, such as the structure of a bone or the place where some muscle attached, discordance with mtDNA would probably raise my eyebrows and get me thinking in more detail about what's going on. So, for example, the eastern Pantherophis, their color pattern probably evolves fairly rapidly.
OTOH, mtDNA also evolves rapidly; that isn't really a difference here. AFAICT, the main unique problem of rapidly evolving morphological characters is: generally smaller available character space (purifying selection dramatically reduces character space compared to randomly evolving loci, but even if variation is largely limited to third codon positions we're still talking about a much greater range of possible sequences than, e.g., the 4-5 basic kinds of coloration in Pantherophis obsoletus). When directional selection is likely (which does seem to be the case in Pantherophis obsoletus) this is an additional problem, as directional selection (at least when observed in the phenotype; it's less clear when we're talking about genotypes, as there may be a whole slew of different possible mutations that can lead independently to a single observed change in phenotype, so directional selection on a phenotype may be much less likely to result in homoplasy in nucleotide sequences) is more likely to actively create homoplasy rather than merely increasing the likelihood of chance homoplasy.
Sampling gaps are always a pain, and it's difficult to tell if you've got a gradient from one mtDNA lineage to another, or if you'll end up with a nice clear boundary between the two with better sampling. It's certainly a sign that more sampling is necessary to tease apart what exactly is happening, but when you have two well supported clades that are as deeply divergent as in the case of the getula complex (I didn't see a table with the divergence in the getula paper, but they estimate, using a fossil calibration, that they diverged about 4 Mya, so they're pretty divergent, probably something on the order of 6-10%), it's more likely that, with better sampling, you'll see a relatively clean boundary between the two, possibly a hybrid zone, or even a nice clean boundary with no evidence of hybridization. When they're that divergent, I find it unlikely that it would be a gradient from one mtDNA lineage to the other, and I think most evidence from squamates supports this.
There are two cases (references at the end of this post) I know of in which people have looked closely at contact zones after putative species were defined based on mtDNA topologies. In both cases there were not nice clean divisions between the putative species, but instead mtDNA haplotypes from each species were extensively mixed within populations. I haven't been reading the recent herp literature much, so maybe there are now cases in which those mtDNA boundaries do hold up on closer examination... but the only studies I've seen indicate the opposite.
It's also not entirely clear that we should expect the degree of introgression in contact zones to scale with mtDNA divergence. MtDNA sequence variation at the commonly-used markers would not, in itself, be expected to have any influence on reproductive isolation. We have to assume that it is correlated with other changes in the genome that do influence reproductive isolation. That seems like a reasonable guess, but I'm not sure it's much more than that or that we can justify a particular percent-divergence cutoff.
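For scale, the figures quoted above (a split about 4 Mya and a guessed 6-10% divergence) can be turned into the substitution rate they imply. This is a back-of-the-envelope sketch only: it assumes divergence accumulates linearly with time, with no correction for saturation or multiple hits, and the function names are made up for illustration.

```python
def implied_pairwise_rate(divergence_pct, split_mya):
    """Percent pairwise divergence per million years, assuming linear accumulation."""
    return divergence_pct / split_mya


def per_lineage_rate(divergence_pct, split_mya):
    """Percent change per million years along one lineage.

    Pairwise divergence accumulates along both branches since the split,
    so the per-lineage rate is half the pairwise rate.
    """
    return divergence_pct / (2 * split_mya)


if __name__ == "__main__":
    split_mya = 4.0  # fossil-calibrated estimate quoted in the post
    for divergence_pct in (6.0, 10.0):  # the guessed range quoted in the post
        print(divergence_pct,
              implied_pairwise_rate(divergence_pct, split_mya),
              per_lineage_rate(divergence_pct, split_mya))
```

Under those assumptions the quoted range corresponds to 1.5-2.5% pairwise divergence per million years. That is a statement about the mtDNA clock only; it says nothing by itself about how much reproductive isolation has accumulated over the same interval.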
Directly addressing a couple of Cole's questions:
Cole Grover wrote:
1. Couldn't/shouldn't a discordance of mtDNA data (such as that from Cytochrome b) with other available data (geographical, morphological, ecological, and even other biochemical data) just as easily signal why it should not be relied upon so heavily as signal the "wrong-ness" of all other data? (Yes, I know "not all data are created equal") The example given in question 4 would apply here, too, though I could dig up other examples if needed.
Discordance doesn't necessarily mean any particular source of information is "wrong"... but when dealing with very closely-related individuals, it does suggest that these individuals all belong to the same species. Independent sorting of different characters is what we should expect within a species. It should not (at least, not without various mitigating circumstances--e.g., incomplete lineage sorting) be occurring in a set of distinct species, and all else equal should be taken as evidence against the hypothesis that we are in fact looking at distinct species.
4. How about when members of the same population don't cluster together in an mtDNA phylogeny? Couldn't/Shouldn't that signal that the "preferred tree" ought not to be preferred? (an example here is http://www.naherpetology.org/pdf_files/711.pdf). I guess my question boils down to what is the null in these situations? Shouldn't the default be to reject the discordant or incomplete data set?
My answer here is about the same as above--if members of the same population contain divergent mtDNA haplotypes, that is not evidence that there is something wrong with the mtDNA tree. All else equal, our assumption should be that the mtDNA topology is correctly telling us to reject the hypothesis that we're dealing with distinct species rather than variation within a single species.
The authors of that paper basically get it right:
Bryson et al. wrote:
Our results suggest that the current recognition of L. mexicana and L. triangulum may be incongruent with the evolutionary history of these two groups.
Probably Lampropeltis alterna and Lampropeltis ruthveni should be included in that statement (which may have been their intent; it's not entirely clear in a quick reading if they mean just Lampropeltis mexicana in the narrow sense, or the L. mexicana group, including L. alterna and L. ruthveni). However, the authors are understandably (although, for mtDNA studies, unusually) reluctant to make taxonomic changes from a single locus topology and give the standard "more research is needed":
Further research using nuclear markers to assess gene flow among these lineages will be necessary to determine if the currently recognized taxa do represent species and if the mtDNA data are indeed in error.
Gibbs, H. L., S. J. Corey, G. Blouin-Demers, K. A. Prior, and P. J. Weatherhead, 2006. Hybridization between mtDNA-defined phylogeographic lineages of black ratsnakes (Pantherophis spp.). Molecular Ecology 15:3755-3767
Leache, A. D. and C. J. Cole, 2007. Hybridization between multiple fence lizard lineages in an ecotone: locally discordant variation in mitochondrial DNA, chromosomes, and morphology. Molecular Ecology 16:1035-1054