Stability of higher taxa in phylogenetic nomenclature — some comments on Moore (1998)
1999; Wiley; Volume: 28; Issue: 3-4; Language: English
10.1046/j.1463-6409.1999.00017.x
ISSN 1463-6409
Topic(s): Paleontology and Evolutionary Biology
Abstract

In a recent contribution, Moore (1998) compared aspects of the current zoological, botanical, and bacteriological codes of nomenclature with those of phylogenetic taxonomy (de Queiroz & Gauthier 1990, 1992). The suggestion that a formal code of phylogenetic nomenclature urgently needs to be drafted is well taken. Also, I am in broad agreement with the view that apomorphy-based definitions are not necessarily inferior to node- and stem-based definitions. Moore observes that problems with such definitions caused by convergence and reversals of the relevant trait (leading to recognition of paraphyletic and polyphyletic groups: Bryant 1994; Schander & Thollesson 1995) can be circumvented by adopting a new format, the ‘type-modified apomorphy definition’. This definition is perhaps best phrased along the lines of ‘the most restricted clade diagnosed by character C homologous with that in taxon T’. Such definitions will always refer to a discrete clade, even if character C apparently originates multiple times elsewhere on the phylogeny or reverses within the clade of interest. This is because the boundaries of the named taxon are based on a particular origin of a trait, not on the presence or absence of the trait. Obviously, however, it is best to further minimize ambiguity by attempting to choose traits that are unique and unreversed.

Moore’s phrasing of the type-modified apomorphy-based definition is as follows: ‘the clade stemming from the first ancestor of taxon T with character C’. This definition is potentially problematic as it invokes a hypothetical ancestor, a concept which some might find objectionable. Furthermore, such a definition might refer to an unintentionally large clade if character C has been gained and lost repeatedly along the stem lineage leading to taxon T.
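The difference between the two phrasings can be sketched in code. The example below is an illustration only: it assumes a single ancestral lineage recorded as a list of presence/absence states for character C, and it approximates homology by unbroken presence of C back from taxon T, which is a simplification. The function names and the lineage are hypothetical.

```python
from typing import List, Optional


def first_ancestor_with_character(lineage: List[bool]) -> Optional[int]:
    """Moore's phrasing: the earliest ancestor on the lineage showing C.

    `lineage` lists the state of C from the oldest ancestor (index 0)
    down to taxon T itself (last index).
    """
    for index, has_c in enumerate(lineage):
        if has_c:
            return index
    return None


def homologous_origin(lineage: List[bool]) -> Optional[int]:
    """Homology phrasing: origin of the particular instance of C seen in T.

    Assumption: homology is approximated by walking back from T while C
    is continuously present; the oldest ancestor in that unbroken run is
    taken as the origin.
    """
    if not lineage or not lineage[-1]:
        return None
    index = len(lineage) - 1
    while index > 0 and lineage[index - 1]:
        index -= 1
    return index


# Hypothetical stem lineage: C is gained, lost, then regained before T.
lineage = [False, True, True, False, False, True, True]
moore_index = first_ancestor_with_character(lineage)
homolog_index = homologous_origin(lineage)
```

With this lineage, Moore's phrasing selects the ancestor at index 1, whereas the homology-based phrasing selects the ancestor at index 5, which marks the smaller clade.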
For instance, the definition of Cetacea (whales) as ‘the clade stemming from the first ancestor of Delphinus delphis (dolphin) to possess fins’ will refer to the huge clade stemming from the first vertebrate to possess fins (a jawless fish), rather than, as intended, the smaller clade stemming from the first animal with fins homologous to those in dolphins (an ungulate mammal).

There is another possible advantage of apomorphy-based names that Moore did not discuss. Under currently accepted usage, the boundaries of most higher taxa correspond to major morphological innovations. For instance, Vertebrata is based on the possession of a backbone, Gnathostomata on the possession of jaws, Tetrapoda on the possession of dactylous limbs (i.e. limbs with digits), and Amniota on the possession of the amniote egg. Consider the simplified phylogeny in Fig. 1. Given this phylogeny, it is possible to define the name Tetrapoda using either node- or stem-based definitions such that all known organisms (i.e. organisms on the cladogram) with dactylous limbs are included within Tetrapoda while those without such limbs are excluded: for instance, ‘all organisms more closely related to the human than to the lungfish’, or ‘the least inclusive clade bounded by the frog and the human’.

Fig. 1. A simplified vertebrate phylogeny, with stem-, node- and apomorphy-based definitions of Tetrapoda indicated. Horizontal bar denotes the origin of the dactylous (i.e. ‘tetrapod’) limb. X denotes the position of the lobe-finned fish Eusthenopteron and Y denotes the position of the primitive amphibian Ichthyostega.

However, the boundary of Tetrapoda does not coincide precisely with the origin of dactylous limbs under either of these definitions. The stem-based definition of Tetrapoda might in future include some forms without a dactylous limb: for instance, if a fish is discovered which is more closely related to frogs, chickens and humans than is the lungfish.
The extinct lobe-finned fish Eusthenopteron would be an example of this (taxon X in Fig. 1). Conversely, the node-based definition can exclude some animals with dactylous limbs from Tetrapoda: for instance, if a dactylous-limbed vertebrate were discovered that falls outside the frog-chicken-human clade. The primitive amphibian Ichthyostega would be a case in point (taxon Y in Fig. 1). Thus, if one wishes to preserve current usage and ensure that the name Tetrapoda will always refer to animals with the diagnostic limbs, node- and stem-based phylogenetic definitions are inappropriate. However, the following apomorphy-based phylogenetic definition would serve the purpose: ‘the clade diagnosed by a dactylous limb homologous with that in humans’. Under this definition, the boundary of Tetrapoda is fixed at the origin of the trait ‘dactylous limb’, i.e. a limb with digits (Fig. 1).

There might still be ambiguities if this feature originates through a continuous transformation series, e.g. if animals are found with varying developments of ‘proto-digits’. The placement of these animals within or outside of Tetrapoda will be rather subjective and depend on an arbitrary decision as to which appendages are to be considered limbs and which are not. An apomorphy-based definition cannot place such taxa unambiguously with respect to the boundaries of Tetrapoda. However, even under traditional interpretations of Tetrapoda, the placement of such taxa will be uncertain. The important point is that the classification of forms that can be placed unambiguously under the apomorphy-based definition and under traditional usage will coincide exactly: all forms with ‘proper’ limbs will be included in Tetrapoda, and all forms clearly lacking limbs will be excluded. The same cannot be said for the other types of phylogenetic definitions.
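The three kinds of definition discussed above can be compared on a small tree. The following is a minimal sketch, assuming a topology loosely based on the description of Fig. 1; the internal node labels (n1 to n4), the placement of the limb origin at n2, and the function names are illustrative assumptions, not part of the original figure.

```python
from typing import Dict, FrozenSet, List, Optional

# Hypothetical topology (child -> parent), loosely following Fig. 1.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "lungfish": "root",
    "n1": "root",
    "Eusthenopteron": "n1",  # taxon X: no dactylous limb
    "n2": "n1",              # assumed origin of the dactylous limb
    "Ichthyostega": "n2",    # taxon Y: dactylous limb present
    "n3": "n2",
    "frog": "n3",
    "n4": "n3",
    "chicken": "n4",
    "human": "n4",
}


def ancestors(node: str) -> List[str]:
    """Return the path from `node` (inclusive) up to the root."""
    path = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def leaves_below(node: str) -> FrozenSet[str]:
    """Terminal taxa descended from `node` (or `node` itself if terminal)."""
    internal = {p for p in PARENT.values() if p is not None}
    return frozenset(
        t for t in PARENT if t not in internal and node in ancestors(t)
    )


def node_based(a: str, b: str) -> FrozenSet[str]:
    """Least inclusive clade bounded by a and b."""
    a_line = ancestors(a)
    mrca = next(n for n in ancestors(b) if n in a_line)
    return leaves_below(mrca)


def stem_based(inside: str, outside: str) -> FrozenSet[str]:
    """All taxa more closely related to `inside` than to `outside`."""
    out_line = set(ancestors(outside))
    stem = [n for n in ancestors(inside) if n not in out_line]
    return leaves_below(stem[-1])


def apomorphy_based(origin_node: str) -> FrozenSet[str]:
    """Clade stemming from the node where the apomorphy is assumed to arise."""
    return leaves_below(origin_node)


stem_clade = stem_based("human", "lungfish")
node_clade = node_based("frog", "human")
apo_clade = apomorphy_based("n2")
```

Under these assumptions, the stem-based clade includes Eusthenopteron, the node-based clade omits Ichthyostega, and only the apomorphy-based clade matches the set of taxa with dactylous limbs.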
Therefore, if one wishes to ensure that certain taxon names under a phylogenetic nomenclature coincide with existing, traditional usage, apomorphy-based definitions might be employed in preference to node- and stem-based definitions.

There are other arguments in Moore (1998) that appear to be less well founded. Some of these have already been addressed in earlier papers. However, because they continue to be raised (e.g. Lidén & Oxelman 1996; Dominguez & Wheeler 1997), they are discussed briefly here, and attention is drawn to the appropriate literature.

Moore notes that all higher taxon names under the current botanical and bacteriological codes, and family and genus names under the current zoological code, are defined via type species (and thus ultimately type specimens) and ranks. The type species determines the focal point of the higher taxon: the taxon must include at least the type species, and may include no, few, or many other species. The rank determines the inclusiveness of the higher taxon and thus its boundaries. However, as Moore (p. 562) and colleagues (e.g. Darwin 1859; Hennig 1966; Sundberg & Pleijel 1994; Schander & Thollesson 1995; de Queiroz 1997) have realized, ‘the assignment of rank is an arbitrary process’. There are no explicit rules regarding what degree of inclusiveness is meant by each rank. In other words, compared to the type species, how related or similar does another species have to be to warrant inclusion within the same ‘family’, and conversely, how different or distantly related does it have to be to warrant exclusion from this ‘family’? Generally, the higher the Linnaean rank, the further away the boundary is drawn, so that organisms progressively more different from, or more distantly related to, the type are included. Also, as noted above, boundaries for higher taxa tend to coincide with major evolutionary innovations or morphological gaps.
However, the codes contain no rules to make this approach mandatory, nor even any indication as to how such innovations might be recognized. Thus, the inclusiveness of higher taxa defined under the current codes is arbitrary: the contents of a family-level taxon include the type and whatever other species a taxonomist deems appropriate to be cofamilial. Moore (p. 565) concludes, validly, that names under the current codes ‘have types and no circumscriptions’. In agreement with Lidén & Oxelman (1996), Moore suggests that this lack of circumscription represents ‘flexibility’ and is thus advantageous: workers are free to alter the boundaries (inclusiveness) of a higher taxon in any way they wish. The alternative viewpoint adopted here sees this as a major shortcoming, and an inevitable source of instability. Without any objective criterion, there can never be agreement on the accepted boundaries of each higher taxon; every worker can draw the boundary in a slightly (or perhaps very) different place, resulting in endless and inconclusive arguments over splitting and lumping, even if there is complete agreement over the phylogeny (e.g. Hennig 1966; Sundberg & Pleijel 1994; de Queiroz 1997; Cantino et al. 1997).

To define the name of a higher taxon precisely, one therefore needs both a reference taxon (‘type’) and a circumscription. Current codes provide only the first criterion. Unlike the current codes, phylogenetic nomenclature provides both a reference taxon and a circumscription, either through a synapomorphy (apomorphy-based definitions), or additional reference taxa (stem- and node-based definitions).
Consider the following set of definitions:

Aves: the clade diagnosed by wings homologous with those in the chicken (Gallus domesticus).
Dinosauria: the clade diagnosed by erect gait homologous with that in the chicken (Gallus domesticus).
Amniota: the clade diagnosed by the amniote egg homologous with that in the chicken (Gallus domesticus).
Tetrapoda: the clade diagnosed by the tetrapod limb homologous with that in the chicken (Gallus domesticus).
Vertebrata: the clade diagnosed by a backbone homologous with that in the chicken (Gallus domesticus).

Here, names refer to a series of nested taxa, and each name is defined according to a common reference taxon and a distinctive synapomorphy. Successively larger clades are named by using successively more general synapomorphies of the reference taxon. As Moore (1998) noted, the above scheme is rather similar to traditional definitions, in that a single type is specified, along with some indication of the inclusiveness of each higher taxon. The notable difference is that the boundaries of all named higher taxa are delimited precisely, using diagnostic synapomorphies. Moore’s suggestion (p. 570) that apomorphy-based definitions do not place any circumscriptional limits on the taxon beyond inclusion of the type is incorrect; the boundary (and thus membership) is determined by the origin of that apomorphy. Aves therefore includes all the animals that have wings homologous with those in Gallus domesticus, instead of all the animals that ‘belong to the same class’ as Gallus domesticus. The former entity can be readily discerned and diagnosed on a cladogram; the latter entity is largely arbitrary, since there are no rules governing what degree of similarity or relatedness is required for organisms to be placed in the same class.
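The nesting implied by this set of definitions can be checked mechanically. The sketch below is illustrative only: the tree, the choice of additional terminal taxa (lungfish, frog, crocodile, Triceratops, sparrow), the internal node labels, and the node at which each synapomorphy is assumed to arise are all hypothetical and are not taken from the original text.

```python
from typing import Dict, FrozenSet, List

# Hypothetical tree (parent -> children); single-letter labels are internal nodes.
CHILDREN: Dict[str, List[str]] = {
    "v": ["lungfish", "t"],
    "t": ["frog", "a"],
    "a": ["crocodile", "d"],
    "d": ["Triceratops", "w"],
    "w": ["sparrow", "chicken"],
}

# Assumed node of origin for each synapomorphy on the lineage to the chicken.
ORIGIN: Dict[str, str] = {
    "Vertebrata": "v",  # backbone
    "Tetrapoda": "t",   # tetrapod limb
    "Amniota": "a",     # amniote egg
    "Dinosauria": "d",  # erect gait
    "Aves": "w",        # wings
}


def clade(node: str) -> FrozenSet[str]:
    """Return the terminal taxa descended from `node`."""
    kids = CHILDREN.get(node)
    if kids is None:  # terminal taxon
        return frozenset([node])
    members: FrozenSet[str] = frozenset()
    for kid in kids:
        members |= clade(kid)
    return members


clades = {name: clade(node) for name, node in ORIGIN.items()}

# Each name should refer to a proper subset of the next, more general one.
ordered = ["Aves", "Dinosauria", "Amniota", "Tetrapoda", "Vertebrata"]
nested = all(clades[a] < clades[b] for a, b in zip(ordered, ordered[1:]))
```

Every clade contains the common reference taxon (the chicken), and its boundary is set by where the relevant synapomorphy is assumed to originate rather than by a rank.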
Biologists are therefore free to arbitrarily expand or contract the limits of class Aves according to personal preference: for instance, Bakker & Galton (1974) suggested that birds are not sufficiently distant from dinosaurs (either phylogenetically or phenetically) to be placed in a separate class and thus formally included dinosaurs in the same class as birds. Many ornithologists might object to this. However, under current codes there is no criterion to determine which interpretation of Aves is correct, i.e. whether dinosaurs belong in the same class-level taxon as birds. Thus, the concept of Aves might become so unstable that it becomes of little use.

Other phylogenetic definitions (node- and stem-based) use a second reference taxon, rather than a synapomorphy, to help delimit the boundaries of the taxon. These methods are also superior to using ‘ranks’. Moore (1998; fide Lidén & Oxelman 1996) argued that the multitype system is problematic as it causes instability. On this view, traditional names with one type (and apomorphy-based names with one reference taxon) will be most stable, names with two reference taxa (e.g. many node- and stem-based definitions) slightly less stable, names based on three reference taxa less stable still, and so on. However, as he notes, it is possible to construct phylogenetic definitions using a single reference taxon, as in the apomorphy-based examples above, so even if multiple types decrease stability, they can be avoided in phylogenetic definitions. Furthermore, as discussed below, the assumption that multiple types decrease stability is not necessarily correct.

The observation that traditional definitions are ‘more flexible’ (less objective) than phylogenetic definitions (Moore 1998) is therefore correct.
It is then suggested (fide Lidén & Oxelman 1996) that traditional definitions are consequently more desirable because they are free to vary, including or excluding various forms according to the opinion of individual systematists. However, it is difficult to see how such imprecision and subjectivity in the definition of higher taxon names, which ensures the same name can (and often will) mean different things to different people, can be construed as desirable. Biological classification should be a scientific endeavour, and there are no instances in science where subjectivity and irreproducibility are favoured over objectivity and standardization.

Moore discusses three methods of increasing stability in phylogenetic definitions, but finds each of them problematic:

1) Avoiding organisms with uncertain relationships when choosing reference taxa in definitions (Schander & Thollesson 1995; Cantino et al. 1997). It is suggested that this approach is problematic because it means that some clades will have to go unnamed, whereas all clades must be named. Consider the phylogeny in Fig. 2, where the relationships of A–E are well corroborated, but those of X and Y are uncertain. Almost every grouping on the cladogram can be named using only taxa A–E, even dubious clades such as XYE (e.g. ‘the most inclusive clade consisting of E and not D’). The only grouping that cannot be named without mentioning a phylogenetically labile taxon (X or Y) is the highly dubious clade XY, which consists exclusively of such taxa. In any case, the suggestion that all clades should be named is questionable: the alternative view is that poorly corroborated clades should not be named (e.g. Gauthier et al. 1988; Lee 1996; Cantino et al. 1997), and clades where all taxa are of uncertain phylogenetic position are extreme examples of this.
Fig. 2. The phylogenetic relationships of A–E are well resolved but the positions of two taxa (X and Y) are uncertain; alternative positions for the latter are indicated by the dotted lines. The clade XY can only be named phylogenetically by mentioning either X or Y, but it is doubtful whether it is worth naming.

2) Including additional reference taxa in the definition (e.g. Schander & Thollesson 1995; Wyss & Meng 1996; Cantino et al. 1997; Lee 1998). It is suggested that multiple types will reduce stability. However, there is no clear relationship between the number of reference taxa mentioned in the definition of a name and the stability of the named clade. Under many circumstances increasing the number of taxa in the definition increases stability. The stability of names can be increased by specifically mentioning certain taxa in the definition so that they cannot be inadvertently included in or excluded from a named clade if ideas about relationships change (see Wyss & Meng 1996; Lee 1996, 1998; Cantino et al. 1997). Moore suggested that such ‘buffered’ definitions increase stability only ‘in the context of a given phylogeny’ (p. 569). This misses the point, since such definitions are specifically constructed to remain stable (i.e. refer to clades of similar content) regardless of which of the multiple possible phylogenies is eventually accepted. The further criticism that problems will occur if some of the multiple reference taxa are not themselves provided with formal phylogenetic definitions (Moore 1998) is certainly correct. There is also the additional problem that, even if the subordinate taxa are all formally defined, some of them might still be highly unstable in terms of inclusiveness. Definitions employing them as reference taxa might therefore also be unstable (Schander & Thollesson 1995; Bryant 1996).
However, as Moore acknowledges, both these problems can be circumvented by directly using particular species (and thus ultimately the type individuals of those species) in phylogenetic definitions, rather than using subordinate higher taxa that might themselves be defined according to multiple species of uncertain relationships (e.g. Schander & Thollesson 1995; Bryant 1996; Cantino et al. 1997).

3) Designated phylogenetic contexts (Bryant 1997) and adaptive definitions with contingency clauses (Schander & Thollesson 1995; Cantino et al. 1997). A definition phrased to refer to a particular clade on a phylogeny might refer to a very different clade in the context of a new phylogeny. A designated phylogenetic context prevents the definition and name being used in the context of radically different phylogenies (where it will refer to very different clades), while an adaptive definition ensures that if the new phylogeny is accepted the ‘working version’ of the definition will change so that it continues to refer to a similar clade. Thus, both methods help maintain the association of particular names with particular clades in the context of uncertain relationships. Moore suggests that this approach is undesirable because it ‘places limits on what the taxon can represent well beyond inclusion of the type’ (p. 569); in other words, because it rigidly circumscribes the contents of higher taxa. This reprises the argument, addressed above, that traditional codes are desirable because they do not have clear boundaries for higher taxa, while phylogenetic nomenclature is undesirable because such boundaries are clearly demarcated. It must be acknowledged, however, that such contingency clauses make the definitions of names more cumbersome (Moore 1998). Also, they do not completely eliminate the nomenclatural instability but partly shift it.
In the context of a new phylogeny, a ‘basic’ phylogenetic definition of a name would be unstable because it now refers to a very different clade. A definition attached to a designated phylogenetic context would have to be discarded, leading to the need for a new definition (and a new name) for the clade that corresponds closely to the previously named clade. If an adaptive definition were adopted, a new working version of the definition compatible with the new phylogeny would be used to continue attaching the name to a similar clade. Thus, it appears that under both methods, in order to ensure that names remain attached to similar clades in the context of new phylogenies, changes have to be made to the definitions themselves. In other words, such definitions are only applicable in the context of a particular phylogeny or subset of possible phylogenies (see Härlin 1998; Härlin & Sundberg 1998 for further discussion).

There is a trade-off. Either the definition remains the same and the associated clade changes (‘basic’ phylogenetic definitions), or the definition and name (designated phylogenetic contexts) or working definition (adaptive definitions) change but the associated clade remains constant. Systematists will have to decide which of these changes is less disruptive, perhaps on a case-by-case basis. For well-known names that are already entrenched to refer to a tightly circumscribed assemblage of organisms, a radical change in the associated clade might be more confusing than a slight modification of the definition, whereas for new or poorly known names, the opposite might be true.

Thus, the suggestion that phylogenetic nomenclature has more ‘bombs of instability’ (Moore 1998, p. 568) than traditional codes appears unfounded. Higher taxa with phylogenetic definitions have clear boundaries: these might shift with changes in phylogenetic hypotheses, but the impact of these changes can be minimized using the approaches above.
Higher taxa under traditional codes have no clear boundaries at all. As mentioned previously, these boundaries can shift according to individual opinion, i.e. arbitrary splitting and lumping decisions, even if the accepted phylogeny remains constant. Traditional codes therefore have no mechanisms to ensure stability, or, in Moore’s terms, are plagued by infinite ‘bombs of instability’.

Moore (1998) also emphasizes that in a phylogenetic taxonomy, which eschews standard ranks and endings, taxon names by themselves convey nothing about nesting; a cladogram or some other scheme (e.g. indenting) is required. This is true, but the implication is that standard codes, with standard ranks and rank-signifying endings, differ by conveying such information. This is a common misconception (e.g. Lidén & Oxelman 1996; Dominguez & Wheeler 1997). Dominguez and Wheeler claim, for instance, that suffixes show that Dasycerinae (a subfamily) is nested within Staphylinidae (a family). However, all that is conveyed by the endings of the names is that Dasycerinae is nested within (or possibly redundant with) some taxon ending in -idae. Without external information such as a cladogram or an indented classification, it cannot be inferred that the more inclusive -idae taxon is Staphylinidae (e.g. de Queiroz & Gauthier 1992; Lee 1998). Nesting of taxa would only be conveyed in the rare instances where names have the same root but different endings (e.g. Iguanidae, Iguaninae, Iguana).

It was also noted that homotypic synonyms will not necessarily be objective synonyms in a phylogenetic nomenclature, leading to confusion. In a traditional taxonomy, two names (Lamiaceae and Labiatae) each defined as ‘the family level taxon containing Lamium’ will be synonyms since they specify the same type and the same (though vague) level of inclusiveness. However, if Lamiaceae and Labiatae are given different phylogenetic definitions and attached to different clades, they will not be synonyms in a phylogenetic taxonomy.
This might be confusing if names universally understood as referring to identical groupings are redefined to refer to different clades in a phylogenetic taxonomy. However, this is not an inherent problem with phylogenetic taxonomy per se; if two names are entrenched as synonymous (identical) under traditional schemes, one can choose to use only one of these in a phylogenetic taxonomy.

Finally, Moore (1998, p. 575) suggests phylogenetic taxonomy can only depict hierarchical relationships, while the current codes of nomenclature are more flexible in that they are suited to ‘artificial, natural, evolutionary, phenetic, and cladistic classifications’. However, phylogenetic definitions can be phrased to refer to nonhierarchical relationships (e.g. paraphyletic or polyphyletic groups), although such definitions are more cumbersome (de Queiroz & Gauthier 1992). Also, it is debatable whether the current codes are indeed suited to evolutionary, phenetic, or cladistic classifications (it is uncertain what is meant by ‘artificial’ and ‘natural’ classifications). For instance, the current systems are based on a nested hierarchy, and this scheme is completely unsuited to portray the ancestor-descendant relationships inherent in the classical ‘evolutionary’ taxonomy of Simpson and Mayr (e.g. see Hull 1980; Charig 1982). Similarly, a truly phenetic classification would consist of a distance matrix, or a multivariate ordination plot. I fail to see how a hierarchical system, and uniform ranks, can be construed as being suited to represent phenetic resemblances. Such distance matrix data must be translated into a branching dendrogram, with consequent distortion and loss of information, before a hierarchical nomenclature can be applied, and the continuous range of differences in such dendrograms (from 0 to 100%) cannot be captured with a few discrete ranks.
Also, while a nested hierarchical system is indeed well suited to portray cladistic relationships, having a few fixed ranks cannot capture the complex fractal nature of cladograms (e.g. Hennig 1966; Wiley 1981; de Queiroz & Gauthier 1992; de Queiroz 1997; Knox 1998). In a cladistic classification, the only strictly equivalent groups are sister groups, and thus, each pair of sister groups must be given a unique rank (Knox 1998). Thus, contrary to Moore, it seems that current (Linnaean) codes are poorly suited to portray evolutionary, phenetic, and cladistic relationships.

Clearly, in order to represent these different patterns of relationships, different methods of classification will be required; no single system will suit all needs. If one wishes to represent patterns of phenetic similarity, then distance matrices and multivariate plots should be used. If one wishes to represent patterns of relationship, then a phylogenetic system should be used. The question therefore should be: what patterns of organismal diversity should form the general reference framework for biology? There seems to be a consensus that it should be patterns of relationship, rather than overall similarity, ecological niche, or some other criterion. If this is the case, then a system of nomenclature that adequately represents these patterns of relationship is required: a phylogenetic taxonomy.

This research is supported by a QEII fellowship and research grants from the Australian Research Council.