Methods for studying polyploid diversification and the dead end hypothesis: a reply to Soltis et al. (2014)

Letter, peer-reviewed

2014; Wiley; Volume: 206; Issue: 1; Language: English

10.1111/nph.13192

ISSN

1469-8137

Authors

Itay Mayrose, Shing H. Zhan, Carl J. Rothfels, Nils Arrigo, Michael S. Barker, Loren H. Rieseberg, Sarah P. Otto

Topic(s)

Genomics and Phylogenetic Studies

Abstract

The fate of polyploid lineages has been of long-standing interest to evolutionary biologists. In our previous work (Mayrose et al., 2011a; reviewed in Arrigo & Barker, 2012), we used likelihood methods to estimate the effects of recent polyploidization events on diversification rates (speciation minus extinction). Our goal was to ask whether, across groups, there is a preponderance of evidence for or against the hypothesis that polyploid species diversify at different rates than diploid species. Our results demonstrated that polyploid lineages, when compared with their diploid relatives, tend to form new species more slowly and go extinct more quickly, for a combined effect of lower diversification rates. Soltis et al. (2014) recently called into question our conclusions and raised concerns about our analyses. Some of these concerns reflect open questions and limitations in the available data, but others are based on misinterpretations of the methods used and their implications. In the spirit of furthering the fascinating debate about the macroevolutionary consequences of polyploidization, we offer this response with the hope that it will clarify what has been shown, demonstrate the utility of the methods applied, and inspire future work. Soltis et al. (2014) frame their concerns as a series of arguments, and we follow their framework in responding to the points raised. Before embarking, it is worth reemphasizing what we defined as a ‘polyploid lineage’. Indeed, all angiosperms have one or more polyploidization events in their history if one looks back far enough (Jiao et al., 2011), and thus polyploidy must be defined with respect to a reference time point. Because our focus was on the impacts on diversification of recent polyploidization events, we explicitly defined polyploids ‘as those lineages that underwent a polyploidization event since divergence from their generic ancestor’ (Mayrose et al., 2011a). 
This definition enabled us to perform a large-scale comparative study using chromosome number changes to infer polyploid transitions in 63 clades of plants. Thus, when asking whether polyploidization impacts subsequent diversification, it must be borne in mind that our analyses were conducted on rather short evolutionary timescales, comparing those taxa that had undergone fairly recent polyploidization (‘neopolyploids’) with those that had not. As we concluded in Mayrose et al. (2011a), it remains an interesting open question whether earlier polyploidy events had different impacts on speciation and extinction, as has been suggested, for example, in the context of mass extinction events (Fawcett et al., 2009; Vanneste et al., 2014). The first criticism leveled is that we overestimated the diversification rates of diploids because ‘increased diversification rates are more likely to arise in large clades than in small ones’ (Soltis et al., 2014). Specifically, Soltis et al. (2014) noted that polyploid lineages, as we defined them, would have to be nested within a larger diploid group, hence the polyploid subclades must be more recent and would thus tend to be smaller. We agree that the polyploid subclades will tend to be younger and, assuming equal diversification rates, polyploid species are expected to be fewer in number, but this should not bias the estimation of diversification rates. The likelihood method that we used (‘binary-state speciation and extinction’, BiSSE; Maddison et al., 2007) estimates instantaneous ‘birth’ (speciation) and ‘death’ (extinction) rates per unit time, not the total number of speciation and extinction events. It thus simultaneously accounts for the amount of time spent in each state and for trait transitions. By analogy, the rate at which a jogger runs may be measured at 10 km h−1, whether they have been running for a long time or a short time. 
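The distinction between a per-unit-time rate and a raw count of events can be illustrated with a toy pure-birth (speciation-only) simulation. This is a minimal sketch and not BiSSE: it has no extinction and no trait, and the function names, the true rate of 1.0 and the clade ages are illustrative assumptions. It only shows that the estimator "events divided by total lineage-time" targets the same rate whether the simulated clades are young or old.

```python
import random

def simulate_yule(rate, age, rng):
    """Simulate a pure-birth clade for `age` time units, starting from one lineage.

    Returns (number of speciation events, total lineage-time accumulated).
    """
    n, t, events, lineage_time = 1, 0.0, 0, 0.0
    while True:
        wait = rng.expovariate(n * rate)   # waiting time to the next speciation
        if t + wait >= age:
            lineage_time += n * (age - t)  # all lineages persist to the present
            return events, lineage_time
        t += wait
        lineage_time += n * wait
        n += 1
        events += 1

def pooled_rate_estimate(rate, age, replicates=2000, seed=1):
    """Estimate the speciation rate as total events / total lineage-time."""
    rng = random.Random(seed)
    total_events, total_time = 0, 0.0
    for _ in range(replicates):
        events, lineage_time = simulate_yule(rate, age, rng)
        total_events += events
        total_time += lineage_time
    return total_events / total_time

if __name__ == "__main__":
    true_rate = 1.0  # hypothetical value, arbitrary time units
    young = pooled_rate_estimate(true_rate, age=1.0)
    old = pooled_rate_estimate(true_rate, age=3.0)
    print(f"young clades: {young:.3f}, old clades: {old:.3f}")
```

Young clades contain far fewer species, so each one carries less information, but the pooled rate estimate is not pulled towards lower values.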
The fact that the polyploid subclades have existed for a shorter amount of time does not bias their inferred diversification rates towards lower values. At an intuitive level, the main phylogenetic information that allows BiSSE to infer trait effects on diversification comes from the internode distances (see also Ree, 2005). If a trait increases diversification, it should be associated with shorter distances between branching events, compared with the alternative character state, and this association should be seen regardless of the depth of the tree (power issues notwithstanding – see the Statistical and analytical arguments section, later). Also at an intuitive level, the main information used to infer extinction rates is the rise near the present in the number of species (Nee et al., 1994). If, compared with diploids, polyploids are found more often on the tips of the tree than expected, then this signal suggests that polyploids have a higher extinction rate: they arise but fail to persist. Indeed, the simulation study by Goldberg & Igic (2008) is highly relevant here: they simulated trees where the ancestor was in one state (diploid in the current case) and only forward transitions were allowed (say, to polyploidy). Although their focus was on the estimation of the transition rate, maximum-likelihood results kindly provided by E. E. Goldberg demonstrate that the extinction rates, speciation rates and diversification rates inferred by BiSSE had point estimates near their true values (Fig. 1). Thus, even though polyploids had less time to diversify, the rate at which they did so is well estimated from simulated data (the age of the clade did not bias the rate estimates). It is worth clarifying a related misconception: Soltis et al. (2014) claim that ‘older genera have more time to spawn polyploids than very young lineages; hence including ‘young’ genera while excluding ‘older’ genera will bias the outcome’. 
Because BiSSE accounts for the transition time between diploidy and polyploidy, younger clades will have fewer transitions and less time to accumulate species, but this is reflected in broader confidence intervals, not in lower estimated rates of diversification. As suggested by Soltis et al. (2014), an alternative to the probabilistic framework is to restrict comparisons to clades of equal age (sister-clade comparisons). Doing so, however, drops a great deal of information from phylogenetic trees (i.e. dropping all branches not part of a contrast), reducing power. This is particularly true for traits, like polyploidy, which tend to appear on terminal branches of a phylogeny (Mayrose et al., 2011a; Escudero et al., 2014). Moreover, it is often not straightforward to identify sister clades with confidence and without subsequent transitions (Maddison, 2006). In addition, such sister-clade comparisons have been shown recently to be inherently biased in cases where one character state is more often the derived one. In these cases, there must be a transition from the ancestral to the derived state on the branch subtending the derived-state clade. The ancestral-state clade, however, gets a head start by already being in that state. So the time available for diversification of the derived state is less, causing it to artificially appear in clades with lower species richness (Käfer & Mousset, 2014). The second philosophical criticism leveled is that reticulate evolution is common among polyploid lineages – indeed, it defines allopolyploids – and that this reticulation raises questions about the legitimacy of diversification analyses based on bifurcating trees. Notably, this criticism extends beyond our analyses; it applies to most applications of comparative methods using phylogenies that contain taxa that have evolved via reticulation at some point in their histories. 
The effects of reticulation on comparative analyses in general, and on BiSSE-like analyses in particular, have not been well established. These biases may be particularly pronounced when polyploidy is involved as the effect on the topology and inferred branch lengths is still unexplored. To get some sense of the potential impacts of reticulation on our results, we subdivided our data into those genera whose phylogenies were based solely on chloroplast loci and those that were not. As the chloroplast is uniparentally inherited, the true phylogeny is expected to be bifurcating in the former groups, even in the face of hybridization. The results are shown in Fig. 2: the inferred diversification differences between diploids and polyploids are not significantly different between the cpDNA-based trees and the remaining trees (P > 0.05, two-tailed t-test). While further research is needed to explore the effect of reticulation on cpDNA-based phylogenies, this comparison suggests that hybridization did not strongly bias our results. Regardless, we agree with Soltis et al. (2014) that care should be taken when interpreting results based on models that ignore reticulation. We encourage future method development that accounts for hybridization and allows for its influence on diversification rate estimates to be considered. Soltis et al. (2014) caution that extinction rates are difficult to estimate and claim that our conclusion that ‘polyploids diversify at a lower rate is based on higher inferred extinction rates in polyploids’. We agree that extinction rate estimates should be interpreted cautiously. While simulation studies demonstrate that BiSSE-like methods can estimate extinction rates accurately when the model assumptions hold (Maddison et al., 2007), extinction rates are particularly sensitive to sampling biases and departures from the assumed model (Rabosky, 2010). 
That said, the primary focus of our work was not to estimate extinction rates, but rather to estimate the net diversification rate (r = speciation rate minus extinction rate) of diploids vs polyploids (rD vs rP). Typically, there are ridges on the likelihood surface, where the same diversification rate can be obtained by different combinations of speciation and extinction rates, all with similar likelihoods of explaining the data. Thus, comparative analyses have more power to measure the net diversification rate than to tease apart whether differences in diversification are due to speciation or extinction (Nee, 2006). Our main result that diploids diversify at a higher rate than polyploids (Fig. 1a of Mayrose et al., 2011a) does not rely on a precise estimate of the extinction rate and accounts for correlations between speciation and extinction. These results led us to conclude that polyploidy is ‘most often an evolutionary dead end’. We realize, however, that some readers may interpret the term ‘evolutionary dead end’ to imply that polyploid lineages will always go extinct before speciating. This was not our intention. Rather, we use the term to indicate that neopolyploids, on average, have lower – frequently negative – rates of diversification relative to congeneric diploids. Consequently, many polyploid lineages are ephemeral and contribute little to the diversity of plants, relative to their diploid relatives. That does not preclude the possibility that occasionally a polyploid lineage will radiate into a successful lineage, as stated in Mayrose et al. (2011a). We believe this nuanced view to be in keeping with that of G. L. Stebbins Jr (e.g. ‘The long-continued evolution needed to differentiate genera, families, orders, and phyla appears to have taken place chiefly on the diploid level…Nevertheless, there is some evidence that many genera and even subfamilies or families of seed plants have had a polyploid origin’. Stebbins, 1950, p. 359). 
Another statistical argument leveled in Soltis et al. (2014) revolves around the issue of power. In particular, they noted that the clades we examined were smaller than they should be to have sufficient power to infer diversification rate differences. Our approach was not, however, based on characterizing with confidence the diversification rates in individual clades. Instead, we performed a meta-analysis to assess the preponderance of evidence across dozens of clades. If polyploids and diploids diversified at equal rates, then we would expect to see higher polyploid diversification in roughly half of the clades. To assess this, we counted the fraction of Markov chain Monte Carlo (MCMC) steps in which rD > rP, treating this fraction for a clade as a single datum. As shown in Fig. 1(a) of Mayrose et al. (2011a), the majority of clades – 55 out of 63 – exhibited higher diversification for diploids, not polyploids (P = 10−9, exact binomial test of obtaining an outcome as extreme as 55 or more out of 63 trials; P = 10−12, one-sample t-test following a probit transformation testing whether the average fraction of MCMC steps differed from 0.5). By analogy, if one flips a coin once and gets a head, it would indeed be inappropriate to claim support for the coin being biased towards heads. However, observing 55 heads out of 63 coin flips is strong evidence that the coin is biased. Furthermore, there was no significant trend in the percent of MCMC steps showing rD > rP and tree height (from Supporting Information Table S3 of Mayrose et al., 2011a). We argue that approaches like ours, which consider the preponderance of evidence across multiple clades, are the best way to assess whether a trait, like polyploidy, affects diversification in a repeatable way. 
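The two ingredients of this meta-analytic argument can be sketched in a few lines: the per-clade statistic (the share of posterior samples with rD > rP) and the exact binomial tail for 55 of 63 clades. This is a minimal sketch, not the R script used for the published analysis; the function names and the toy posterior samples are assumptions made for illustration. Under a fair-coin null, the upper tail for 55 or more out of 63 comes out on the order of 10−9 once doubled for a two-tailed test.

```python
from math import comb

def fraction_diploid_higher(r_diploid, r_polyploid):
    """Share of paired posterior samples in which r_D exceeds r_P."""
    pairs = list(zip(r_diploid, r_polyploid))
    return sum(rd > rp for rd, rp in pairs) / len(pairs)

def binomial_upper_tail(successes, trials, p=0.5):
    """Exact P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

if __name__ == "__main__":
    # Made-up posterior samples for one hypothetical clade (not real data).
    r_d = [0.9, 1.1, 0.8, 1.2]
    r_p = [0.5, 1.3, 0.1, 0.7]
    print(fraction_diploid_higher(r_d, r_p))  # 3 of the 4 pairs have r_D > r_P

    # 55 of 63 clades favouring diploids, as reported in the text.
    one_tailed = binomial_upper_tail(55, 63)
    # With p = 0.5 the distribution is symmetric, so doubling gives the two-tailed value.
    print(one_tailed, 2 * one_tailed)
```

Each clade contributes a single datum to the test, so a small clade with a noisy posterior counts no more and no less than a large one.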
This multiclade approach is particularly important given recent concerns raised by Maddison & FitzJohn (2015), who argue that analyzing single clades is susceptible to artifacts if there have been few transitions in the trait of interest, and by D. L. Rabosky & E. E. Goldberg (unpublished), who note that BiSSE can have a high false positive rate when other, unmodeled, processes affect diversification patterns. Thus, significant results from a single clade must be interpreted with caution. However, across many trait transitions, each false positive result, driven by chance associations with other processes, is equally likely to support increased or decreased diversification of the focal trait (here, polyploidy). Because we analyzed multiple clades in Mayrose et al. (2011a), we ensured that chance associations in any one clade did not drive our conclusions. Although our recommended approach is meta-analytic as in Mayrose et al. (2011a), our result should be echoed in many of the analyses of individual clade datasets (albeit with reduced power), which Soltis et al. (2014) concluded was not the case. Unfortunately, the clade-specific statistics were incorrectly interpreted in Soltis et al. (2014) because they misinterpreted an error in Table S2 of Mayrose et al. (2011a). In that table we accidentally misordered the last three column headings, which should read %(λD > λP), %(μD > μP), and %(rD > rP), respectively, consistent with Fig. 1 of Mayrose et al. (2011a). Soltis et al. (2014) realized that there was a mistake, but assumed that it was only in the last column, and then interpreted this column as %(μD < μP) instead of %(μD > μP). (We were not asked about the error and have now posted a correction on the Science website.) Below, we summarize the results from the individual clades, using the corrected column headings. The within-clade results are fully consistent with our meta-analysis. 
In short, 21 clades exhibited significantly higher rates of diversification among diploids than among polyploids (i.e. > 97.5% of the MCMC steps exhibited rD > rP; using a two-tailed test corresponding to an α-value of 0.05); none of the 63 clades showed a significantly higher rate of diversification for polyploids. While we caution against drawing conclusions about the impact of polyploidization in any one particular clade (as discussed above), the fact that the vast majority of MCMC steps (> 97.5%) supported rD > rP in one third of the clades demonstrates that the signal for higher diploid diversification is both sufficiently strong and recurrent to be detected repeatedly. This trend was echoed in the extinction rate estimates, with nine clades exhibiting significantly higher extinction rates in polyploids, compared with none that exhibited significantly higher diploid extinction rates. Individual-clade speciation rates differed less by ploidy, with two clades showing significantly higher speciation rates in polyploids and two showing significantly higher speciation rates in diploids. Similar results were obtained using the less conservative cut-offs suggested by Soltis et al. (2014) of 5 and 95% (corresponding to an α-value of 0.1), with diploids diversifying significantly more in 31 clades (vs two for polyploids), speciating more in six clades (vs four for polyploids), and never exhibiting a significantly higher extinction rate (whereas polyploids went extinct significantly more often in 12 clades). The next category of concerns raised in Soltis et al. (2014) relates to the datasets included in our analyses and our handling of them. The first concern is that recent polyploidization events may be underrepresented because many taxonomists use a ‘conservative species concept’ that ignores chromosomal variants within species. Defining species is, indeed, a thorny topic, particularly so for groups with ongoing polyploidization. In our Mayrose et al. 
(2011a) analyses, we ignored within-species variation in ploidy level, took the lowest reported chromosome number as the value for the species, and did not attempt to identify cryptic species. Doing otherwise, however, would have driven up the speciation rate of diploids (from which these new polyploids arise) and driven up the extinction rate of polyploids (because many more polyploid tips on the tree would have arisen very recently and would appear to be short-lived). As pointed out in Soltis et al. (2014), ‘most new polyploids go extinct early, probably at the population level before they are even detected’. By ignoring these highly ephemeral polyploids, our analysis was conservative; including them is expected to generate an even stronger signal that diploids diversify at higher rates than polyploids – contrary to what was implied in Soltis et al. (2014). Indeed, recent analyses found exactly this pattern (N. Arrigo et al., unpublished; see details in Supporting Information Notes S1). For each diploid species with a polyploid cytotype (an unsequenced and unnamed variant), the polyploid was elevated to species status by randomly ‘injecting’ a subbranch onto the terminal branch leading to that diploid for the genera Draba, Gossypium, Ranunculus and Viola. As expected, addition of these polyploid terminal taxa decreased polyploid diversification rates and increased diploid diversification rates (Table 1). Regardless of how branch lengths were assigned to the newly added terminal taxa (as exemplified in Fig. S1), net diversification rates for polyploids were much lower than diploids in each of these genera and were lowered still further by injecting the cytotypes into the phylogenies. This study confirms that excluding polyploid cytotypes is conservative with respect to the conclusions of Mayrose et al. (2011a). The second concern raised was that we only analyzed a minuscule fraction of the angiosperm genera (estimated to be 14559; The Plant List, 2013). 
We agree; our analyses are only a start. As more phylogenies are constructed and coupled with chromosomal data, a fuller picture will emerge. That said, subsequent analyses with different groups have reached similar conclusions: polyploid lineages tend to diversify more slowly. In particular, Escudero et al. (2014) found that polyploid transitions generally occur towards the tips of the phylogeny and additionally that polyploid changes persisted over less time than changes in single chromosome numbers (dysploidy). A related criticism raised by Soltis et al. (2014) is that Mayrose et al. (2011a) did not include many of the most well-characterized genera of flowering plants. Recent analyses (N. Arrigo et al., unpublished) have examined many of these classically studied genera and recover results consistent with Mayrose et al. (2011a). For example, analyses of Draba, Gossypium, Ranunculus and Viola – well-studied genera noted by Soltis et al. (2014) as absent from our previous analyses – again reveal lower net diversification rates for polyploids than diploids (Table 1). The inference of lower polyploid diversification rates is consistent with two studies using entirely different methods (not phylogenetic). The first was a set of simulations conducted in Mayrose et al. (2011a) to ask whether the number of paleopolyploid events inferred from recent genomic analyses can be explained even if polyploidization does not increase diversification. These simulations were initiated with a single diploid ancestor and ran until the number of lineages reached 300 000, approximating the total number of angiosperm species. At each iteration of the simulation, a random species was selected to speciate or go extinct, with extinction happening with probability μ/(λ + μ) and speciation otherwise. We assumed that extinction happened 60% as often as speciation (μ/λ = 0.6, as estimated by Bokma, 2003; across a range of angiosperm taxa). 
At each speciation event, the daughter species was polyploidized with probability fhet (i.e. polyploidization evolved by ‘cladogenesis’ with shifts only at speciation), with fhet set to a range of values from 0.15 (Wood et al., 2009) to 0.3 (Mayrose et al., 2011a). According to the simulation results (Fig. S2 in Mayrose et al., 2011a), if polyploids and diploids diversify at equal rates, the average number of paleopolyploidy events in the evolutionary history of a single angiosperm lineage is expected to be 4.6–8.9 for fhet ranging from 0.15 to 0.30. These estimates are higher than the observed number of 1–4 polyploidization events in the history of most angiosperms (Jiao et al., 2011). Thus, not only is there no need to invoke higher diversification rates for polyploid taxa in order to account for the number of ancient polyploid events, but such higher rates would be incompatible with our current understanding of the frequency of paleopolyploidy. C++ source code that generated these simulations is available at http://www.tau.ac.il/~itaymay/cp/pp_diversification/. The second study was based on the ratchet model of Meyers & Levin (2006). These authors pointed out that the high abundance of polyploidy can be explained even when polyploidy is disadvantageous, due to the high frequency of polyploid formation coupled with slow reversal to the diploid state. Recently, Scarpino et al. (2014) extended the ratchet model to allow for diversification and polyploidization rates to depend on the ploidy level of a lineage. They then used approximate Bayesian computation to estimate the model parameters for 60 angiosperm genera (only five are shared with the Mayrose et al., 2011a dataset), chosen based on high coverage of chromosome number data. 
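As an aside on the first of these two studies, the lineage simulation of Mayrose et al. (2011a) summarized above can be sketched as follows. This is an independent, minimal Python sketch and not the authors' C++ code: the function and parameter names are hypothetical, the default target of 2000 lineages is far below the 300 000 used in the text (chosen here only to keep the run short), and the sketch is not claimed to reproduce the published range of 4.6–8.9.

```python
import random

def mean_polyploidy_events(n_target=2000, mu_over_lambda=0.6, f_het=0.15, seed=0):
    """Average number of polyploidization events in the history of a surviving lineage.

    Each lineage is represented by the count of polyploidization events
    along its ancestry. Diploids and polyploids share the same rates.
    """
    rng = random.Random(seed)
    # Probability that the next event is an extinction: mu / (lambda + mu).
    p_extinct = mu_over_lambda / (1.0 + mu_over_lambda)
    while True:  # restart if the whole clade dies out before reaching n_target
        counts = [0]  # a single diploid ancestor with no events in its history
        while 0 < len(counts) < n_target:
            i = rng.randrange(len(counts))  # pick a random species
            if rng.random() < p_extinct:
                counts[i] = counts[-1]      # swap-remove the extinct species
                counts.pop()
            else:
                # Speciation: the daughter inherits the parent's history and
                # is polyploidized with probability f_het.
                gained = 1 if rng.random() < f_het else 0
                counts.append(counts[i] + gained)
        if counts:
            return sum(counts) / len(counts)

if __name__ == "__main__":
    for f_het in (0.15, 0.30):
        print(f_het, mean_polyploidy_events(f_het=f_het))
```

Because the counts keep accumulating as the clade grows, the average depends on the target number of lineages, so a run at the full 300 000 would be needed before comparing against the values quoted in the text.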
Using the number of species at each ploidy level within a genus as data, the study reached a similar conclusion: diploids diversify at higher rates than polyploid congeners, due primarily to a higher speciation rate from the diploid state via polyploidization. The consistency of the conclusions, using different data and methods, confirms the view that polyploidization does not increase species diversification, at least over genus-level timescales. A related concern raised by Soltis et al. (2014), which we fully share, is that sampling biases may mislead inferences about diversification rates. In particular, if systematists avoid sampling polyploid species when inferring trees or avoid clades with speciose polyploid groups, then diversification rates could be biased. In addition, if higher level taxa, such as genera, are biased with respect to polyploidy (e.g. if taxonomists commonly break diverse polyploid clades into multiple genera) estimated diversification rates of polyploids could be affected. Such biases can be corrected, if known, but the extent of these biases is generally unknown. We endeavored to make the best use of the data available at the time and welcome efforts to revisit these conclusions with new data. Of great concern are the issues raised in Soltis et al. (2014) about inconsistency between the data analyzed in Mayrose et al. (2011a) and the original sources. As described later, the data that we used are consistent with the literature cited and follow the methods described in Mayrose et al. (2011a) (with two exceptions). The first concern raised about data consistency involved chromosome counts in Physalis. Soltis et al. (2014) wrote that we had failed to list three polyploid taxa based on their IPCN chromosome counts: P. angulata (2n = 48), P. hederaefolia (2n = 24, 48; correcting the typo stating ‘12, 24’ in Soltis et al., 2014; D. E. Soltis, pers. comm.), and P. peruviana (2n = 48, 72). 
Our compilation of chromosome numbers included additional sources besides IPCN (see citations in Mayrose et al., 2011a), and these indicated additional diploid counts of 2n = 24 for both P. angulata and P. peruviana (Fedorov, 1969), as confirmed by more recent chromosome counts (Rodríguez & Bueno, 2006; Wahua & Sam, 2013). Because all three species have diploid chromosome counts of 2n = 24, we did not treat any of these species as polyploid to avoid ephemeral within-species variation (see earlier). Soltis et al. (2014) then presented their Table 1 listing 21 groups in which they claimed that taxa were seemingly omitted from Mayrose et al. (2011a). We traced the causes of these inconsistencies (see additional details in Notes S2), as follows. In the first two rows of their Table 1, Soltis et al. (2014) report that we only used five out of 14 Sium species and 17 out of 32 of the species of Antirrhinum. However, the analyses reported in Mayrose et al. (2011a) were based on the full number of species (see data deposited in Dryad, Mayrose et al., 2011b). These two inconsistencies resulted from a bug in the software used by Soltis et al. (2014) to view the tree files (D. E. Soltis, pers. comm.). Because we analyzed diversification patterns at the species level, the inclusion of intraspecific variants would bias the results towards a higher diversification rate for those species with denser intraspecific sampling. Thus, we excluded subspecies and variants as detailed in the methods of Mayrose et al. (2011a) (see also details in Supporting Information), while Soltis et al. (2014) did not (affected: Phacelia, Achillea, Pelargonium, Houstonia, Digitalis/Isoplexis, Mimulus, Cerastium, Aichryson, Arisaema). As stated in the supporting information of Mayrose et al. (2011a), the outgroup taxa were used only to root the tree and were removed before our chromEvol and BiSSE analyses. As discussed in FitzJohn et al. 
(2009), outgroup taxa must be excluded in order to apply BiSSE, because they are typically not well sampled and their diversification pattern would not necessarily reflect that of the analyzed ingroup, biasing the estimated diversification rates of the ingroup. We thus removed outgroups, as stated in Mayrose et al. (2011a) (affected: Mimulus, Graptopetalum, Aristolochia, Trillium). Specifically, the GenBank entries for Lathyrus and Vaccinium did not match the accessions reported in the original studies, and accession numbers were missing for two species of Gunnera. These accessions were thus not used in Mayrose et al. (2011a). As described in Mayrose et al. (2011a), we chose to exclude species that lacked data for one or more of the genes sequenced in a study, rather than including all species with any amount of sequence data. We further excluded sequences stated to be of poor quality. In cases where requiring all loci resulted in a loss of too many ingroup species (> 20%), we narrowed our analysis to subsets of loci that maximized the number of species analyzed. These procedures were detailed in the Supporting Information of Mayrose et al. (2011a). While we agree that alternative inclusion criteria could have been chosen, for example, to use all loci regardless of coverage, we made this decision a priori without regard to any specific clade. Thus, these inclusion criteria should not bias the relative diversification rates of diploids and polyploids (affected: Gaura/Stenosiphon, Achillea, Penstemon, Mimulus, Cerastium, Coreopsis, Graptopetalum). In Cuphea, we mischaracterized C. sessiliflora and C. sessilifolia as duplicate entries. In Cerastium, we accidentally omitted the count for C. lithospermifolium (see later). Soltis et al. (2014) then reexamined three clades. In the Mayrose et al. (2011a) analysis of Tiquilia, several taxa reported by Moore et al. (2006) were dropped because they were not sequenced for the full set of genes (matK, ndhF, rps16, ITS and waxy). 
As the dropped taxa contained a disproportionate number of polyploids (six out of nine; Soltis et al., 2014), their exclusion could have biased our results. Indeed, based on the Soltis et al. (2014) reanalysis, the difference in diversification rate was substantial (% of MCMC steps with rD > rP: 82% in Mayrose et al., 2011a; using the corrected column headings, vs 24% in Soltis et al., 2014), largely due to a difference in speciation rate; the difference in extinction rates was fairly minor (% of MCMC steps with μD > μP: 8% in Mayrose et al., 2011a; vs 23% in Soltis et al., 2014). (Soltis et al., 2014 reported this as the % of MCMC steps with μD < μP, but this is inconsistent with their Fig. 2c.; verified by D. E. Soltis, pers. comm.) Thus, Tiquilia represents a potential example where a bias against genotyping polyploids led to their underrepresentation in our dataset. However, because the genes used to generate the Soltis et al. (2014) phylogeny (rps16 and ITS) were said by Moore et al. (2006) to be incongruent and, for ITS, difficult to align, reliable conclusions about Tiquilia must await a more complete and robust genetic dataset. In the second example, we had accidentally omitted the chromosome count for Cerastium lithospermifolium. As indicated by Soltis et al. (2014), the ancestral chromosome count is affected by properly including C. lithospermifolium. Doing so, Soltis et al. (2014) found that support for higher diploid diversification in this clade dropped from 100 to 76% (% of MCMC steps with rD > rP). In their last example, Soltis et al. (2014) revised the basal chromosome number for the GAMA clade of Greenovia/Aeonium/Monanthes/Aichryson by including Sedum (n = 8) as an outgroup not only to reconstruct the phylogeny (based on Mort et al., 2002), but also to infer ploidy shifts using ChromEvol (Mayrose et al., 2010). Doing so caused Soltis et al. 
(2014) to infer that a group of Aichryson (for which there were no chromosome counts in the dataset) were similar in chromosome number to the outgroup (n = 8) and to label these as diploids in Fig. 4 in Soltis et al. (2014). Both Fedorov (1969) and Uhl (1961), however, report several n = 15 chromosome counts for Aichryson, and the latter study concludes that this is the basal chromosome number for the genus (with subsequent polyploidization in A. pachycaulon and A. villosum). These data are thus inconsistent with Aichryson having the same chromosome count as the Sedum outgroup. Accordingly, the ancestral chromosome count for the base of the group once the outgroup is removed would be c. 15–18, as in our original analysis. Note also that in the diversification analysis of Soltis et al. (2014), the tree used included multiple accessions of two polyploid species (Aeonium pseudourbicum, Aeonium percarneum), which also artificially increased the inferred diversification rates of polyploid lineages reported in Soltis et al. (2014). Nevertheless, even if we use the three datasets as reanalyzed by Soltis et al. (2014), the preponderance of evidence across the 63 datasets continues to support higher diversification rates for diploids (53 out of 63 clades; P = 10−8, exact binomial test). Soltis et al. (2014) also highlighted several differences between the topologies of our trees and those in the original papers, although they noted that most of these differences were minor. The original papers typically reported nonultrametric likelihood- or parsimony-based trees. Because a requirement of BiSSE is that branch lengths be proportional to time (otherwise rates cannot be estimated), ultrametric trees had to be generated for each dataset from the original sequence data. To do so, we developed a pipeline that applied likelihood methods to infer trees with branch lengths proportional to time for each of the groups analyzed. 
This pipeline further standardized our data analysis across the dozens of plant groups studied, which we believe is a necessary step when conducting such large-scale analyses. Importantly, in our analyses we also accounted for phylogenetic uncertainty (in both topology and relative branch lengths) by conducting the diversification analysis across a sample of Bayesian trees rather than relying on estimates from a single inferred tree. We make no claim of superiority for these trees, only that they can be used for analyses that require time-proportionality for branch lengths.

An additional concern was that examining slightly more inclusive trees by including the next closest sister group could have dramatically changed the ploidy level assignments. Certainly, if one were to study older trees, then earlier polyploid events would be included, and these might well change ploidy assignments (e.g. incorporating all of Sedum as an ingroup alongside GAMA would lead to the inference that the entire GAMA clade is polyploid, albeit with a mixture of tetraploids and higher ploidy levels). If one were to conduct BiSSE using these older trees, one would be assessing the impact on diversification of earlier polyploid events. We welcome such analyses, but they would not change our result: that polyploidy events which have happened in the recent past (within the genus for most of our clades) tend to lower diversification rates. It is tantalizing to contemplate the possibility that earlier polyploidization events may tell a different story. For example, additional gene loci may have provided substantial ‘genetic degrees of freedom’ and have been highly beneficial during earlier rounds of polyploidization in angiosperms, but no longer. But evaluating this speculation awaits analyses of deeper trees.

A final concern raised in Soltis et al. (2014) is that we assumed that all genera are equal in age. This is not, however, an assumption of our method.
If one wanted to compare the diversification rate within one group (e.g. rD of Actinidia) to the diversification rate within another group (e.g. rD of Tiquilia), then it would indeed be critical that both diversification rates were measured in the same units of time. In our analysis, however, we compared the relative diversification rates of polyploids and diploids within a clade (e.g. rD to rP within Actinidia). No units of time are needed and the groups can be of any age when asking, for example, the percentage of MCMC steps where rD > rP.

The R script that was used for the diversification analysis presented in Mayrose et al. (2011a) is available at http://www.tau.ac.il/~itaymay/cp/pp_diversification/.

The majority of the philosophical, statistical and analytical arguments made in Soltis et al. (2014) stem from misinterpretations of our methodology, an incorrect assumption regarding our mislabeled column headings, and differences of opinion about how to define polyploids in a comparative analysis. However, we entirely agree with the broader point that our paper should not be considered the end of the story, that more data are needed from a wider variety of plant groups, that having more completely sampled clades would improve accuracy of the estimates, and that conducting analyses at deeper phylogenetic levels might reveal greater evolutionary success for polyploidization events earlier in plant evolution (just as deeper phylogenetic analyses carried out in fish did not find lower polyploid diversification; Zhan et al., 2014). We look forward to learning, as more data are gathered and analyzed, whether the patterns we inferred from the 63 genera included in Mayrose et al. (2011a) are or are not representative of the fate of polyploid plants more broadly.
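Two of the quantities discussed in this reply can be illustrated with a short sketch: the within-clade summary (the share of MCMC steps with rD > rP, which does not depend on the unit of time) and the exact binomial test on 53 of 63 clades. The sketch is in Python rather than the R script mentioned above, the function names `fraction_greater` and `sign_test_p` are hypothetical, and the sample values are invented for illustration; none of it is taken from the original analysis, and which tail convention was used for the reported P value is not stated in the text.

```python
from math import comb


def fraction_greater(r_d, r_p):
    """Fraction of paired posterior samples in which r_d exceeds r_p."""
    if len(r_d) != len(r_p) or not r_d:
        raise ValueError("need two non-empty samples of equal length")
    return sum(d > p for d, p in zip(r_d, r_p)) / len(r_d)


def sign_test_p(successes, trials):
    """Exact binomial tail probabilities under a fair-coin null (p = 0.5).

    Returns (one_sided, two_sided): P(X >= successes), and twice the
    smaller of the two tails, capped at 1.
    """
    upper = sum(comb(trials, k) for k in range(successes, trials + 1))
    lower = sum(comb(trials, k) for k in range(0, successes + 1))
    total = 2 ** trials
    one_sided = upper / total
    two_sided = min(1.0, 2 * min(upper, lower) / total)
    return one_sided, two_sided


# Hypothetical posterior samples for one clade (illustrative values only,
# not output from BiSSE or from any dataset in the paper).
r_d = [0.9, 1.2, 0.4, 1.1, 0.8, 1.5, 0.7, 1.0]
r_p = [0.5, 1.3, 0.2, 0.6, 0.9, 0.7, 0.1, 0.4]

support = fraction_greater(r_d, r_p)

# Changing the time unit multiplies every rate by the same positive constant
# (0.125 here), which leaves each pairwise ordering untouched.
rescaled = fraction_greater([0.125 * r for r in r_d], [0.125 * r for r in r_p])

# Counts taken from the text: 53 of 63 clades favor diploids.
one_sided, two_sided = sign_test_p(53, 63)

print(f"share of samples with rD > rP: {support:.3f} (rescaled: {rescaled:.3f})")
print(f"sign test, 53 of 63 clades: one-sided {one_sided:.1e}, two-sided {two_sided:.1e}")
```

The share of samples is the same before and after rescaling, which is the sense in which clade age and time units drop out of a within-clade comparison. Both tail probabilities for 53 of 63 come out on the order of 10⁻⁸, in line with the value quoted in the text.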
Nevertheless, we believe that model-based methodology applied over many character state transitions (either a meta-analysis as in Mayrose et al., 2011a, or on a single clade with many transitions) is the best currently available approach for studying the effects on diversification of traits, like polyploidy, both in terms of its power and its relative insensitivity to biases. Having reexamined the data and results, we stand by our original conclusion based on the 63 clades examined (Mayrose et al., 2011a): ‘our results indicate that polyploidy is most often an evolutionary dead end, but the possibility remains that the expanded genomic potential of those polyploids that do persist drives longer-term evolutionary success’. But we emphasize the wording. There is a critical difference between ‘polyploids are always evolutionary dead ends’ and ‘polyploids are most often evolutionary dead ends’. Because polyploidization occurs so often (Meyers & Levin, 2006) and because occasional polyploids are endowed with phenotypic attributes that improve their success (Levin, 1983), we should not wonder that plant evolution has been repeatedly shaped by polyploidization. Our evaluation of the evidence to date is that polyploidization, like most mutations, is typically deleterious, both at the within-species level (Ramsey & Schemske, 2002) and the between-species level (Mayrose et al., 2011a; Arrigo & Barker, 2012). That most mutations are harmful does not preclude the occasional adaptive mutation, whether caused by a change in nucleotide or a change in ploidy level; indeed, evolution depends on such changes.

We thank Emma Goldberg for helpful comments on the manuscript and for supplying simulation results and the anonymous referees for useful comments. This study was supported by the Israel Science Foundation grant no. 1265/12 (I.M.), the US–Israel Binational Science Foundation grant no. 2013286 (I.M.), NSERC Discovery grant (L.H.R.), Natural Sciences and Engineering Research Council of Canada postdoctoral fellowship (C.J.R.), and by a Natural Sciences and Engineering Research Council of Canada grant (S.P.O.).

Please note: Wiley Blackwell are not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing material) should be directed to the New Phytologist Central Office.

Fig. S1 Elevating polyploid variants to species status.

Notes S1 Diversification analysis of four plant clades (Table 1).

Notes S2 Clarification of taxon count discrepancies in Table 1 of Soltis et al. (2014).
