Article Open access Peer-reviewed

THE COMBINED EFFECTS OF RIVERS AND REFUGIA GENERATE EXTREME CRYPTIC FRAGMENTATION WITHIN THE COMMON GROUND SKINK ( SCINCELLA LATERALIS )

2009; Oxford University Press; Volume: 64; Issue: 2 Language: English

10.1111/j.1558-5646.2009.00840.x

ISSN

1558-5646

Authors

Nathan D. Jackson, Christopher C. Austin

Topic(s)

Identification and Quantification in Food

Summary

Evolution, Volume 64, Issue 2, p. 409-428. Free Access.

Nathan D. Jackson, Museum of Natural Science and Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803. E-mail: njacks4@lsu.edu
Christopher C. Austin, Museum of Natural Science and Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803. E-mail: ccaustin@lsu.edu

First published: 20 January 2010. https://doi.org/10.1111/j.1558-5646.2009.00840.x

Abstract

Rivers can act as both islands of mesic refugia for terrestrial organisms during times of aridification and barriers to gene flow, though evidence for long-term isolation by rivers is mixed.
Understanding the extent to which riverine barrier effects can be heightened for populations trapped in mesic refugia can help explain maintenance and generation of diversity in the face of Pleistocene climate change. Herein, we implement phylogenetic and population genetic approaches to investigate the phylogeographic structure and history of the ground skink, Scincella lateralis, using mtDNA and eight nuclear loci. We then test several predictions of a river–refugia model of diversification. We recover 14 well-resolved mtDNA lineages distributed east–west along the Gulf Coast with a subset of lineages extending northward. In contrast, ncDNA exhibits limited phylogenetic structure or congruence among loci. However, multilocus population structure is broadly congruent with mtDNA patterns and suggests that deep coalescence rather than differential gene flow is responsible for mtDNA–ncDNA discordance. The observed patterns suggest that most lineages originated from population vicariance due to riverine barriers strengthened during the Plio–Pleistocene by a climate-induced coastal distribution. Diversification due to rivers is likely a special case, contingent upon other environmental or biological factors that reinforce riverine barrier effects.

The evolutionary legacy imparted by Pleistocene climate change to naturally dispersing populations and species has been subject to considerable investigation (Smith 1957; Blair 1958; Lundelius et al. 1983; Hewitt 1996, 2000; Provan and Bennett 2008), in part due to what this legacy can tell us about the climatic conditions experienced during the Quaternary as well as about the processes of species formation and extinction (Waltari et al. 2007). Recent reviews of phylogeographic studies implemented across several continents (Brunsfeld et al. 2001; Hewitt 2004; Soltis et al.
2006; Gomez and Lunt 2007; Byrne 2008; Stork and Turton 2008; Zeisset and Beebee 2008) reveal remarkable heterogeneity in the population genetic responses of codistributed species to the Pleistocene, reflecting that climate change is not a deterministic driver of evolution but just one among many factors (e.g., ecological, genetic, geographical) that interact to shape population genetic patterns (Byrne 2008). Climatic fluctuations throughout the Pleistocene resulted in a variety of both diversifying (e.g., Lister 2004; Provan and Bennett 2008) and homogenizing (e.g., Zink et al. 2004; Loehle 2007) consequences depending on the taxa assayed, and understanding the interaction of climate change with other forces can help explain the distinctive evolutionary histories observed for species that have experienced the same climatic cycles. Rivers are one force that may interact with Pleistocene-induced range-shifting to maintain and even generate diversity in the face of reduced, consolidated, or extirpated populations (Haffer 1997; Schneider et al. 1999; Anthony et al. 2007). For many terrestrial organisms, rivers (and their surrounding regions) can act both as refugial mesic habitat in the face of xerification (Delcourt and Delcourt 1981; Haffer 1997) and as barriers to gene flow between populations on either side (Sick 1967). The refugial effect of rivers can both sequester pre-Pleistocene diversity that would otherwise not survive in the wider more arid environments (Evans et al. 2004) and generate new diversity by isolating previously interbreeding populations to refugial islands (Holder et al. 1999). This effect is however contingent upon the existence of such refugia and the ability of a species to reach and survive in potentially small, ephemeral, and isolated pockets of suboptimal habitat (Davis 1983). 
The barrier effect of rivers, likely strongest far from the headwaters where they are widest and most impenetrable to would-be migrants (Haffer 1992), can potentially generate or retain population genetic structure within a single refugium. Climate change can also strengthen the barrier effect of a river if wide-ranging populations normally able to circumnavigate a riverine barrier at the headwaters are forced to contract their ranges toward warmer, moister habitat near the mouth of a river where it is most isolating (Capparella 1991; Haffer 1997). This river–refugia effect is contingent upon the inability of organisms to migrate across wide rivers and the persistence of populations in multiple refugia.

The southeastern United States has been the focus of substantial phylogeographic inquiry over the past few decades (Auffenberg and Milstead 1965; Walker and Avise 1998; Avise 2000; Soltis et al. 2006), with much attention being paid to the identification of both geographic barriers to gene flow and hospitable mesic refugia that might have retained or produced its substantial diversity during glacial maxima (Delcourt 2002; Waltari et al. 2007). Much of this attention has highlighted the evolutionary importance of two major rivers in the region: the Mississippi and Apalachicola Rivers. Data from both phylogeographic inquiry (reviewed in Soltis et al. 2006) and pollen and macrofossils (Delcourt and Delcourt 1981; Watts 1983; Jackson et al. 2000) provide strong evidence for the existence of many warm mesic-adapted species within the Mississippi River Valley and Apalachicola River region of northern Florida throughout the Pleistocene. Additionally, well-established genetic discontinuities within a variety of taxa are concordant with the Mississippi (e.g., Brant and Orti 2002; Leaché and Reeder 2002), Apalachicola (e.g., Burbrink et al. 2000; Pauly et al. 2007), and to a lesser extent, Tombigbee Rivers (e.g., Gill et al. 1993; Gamble et al.
2008), indicating an important role for these rivers in driving diversification in the region. Although additional Pleistocene refugia in high-elevation regions such as the Ouachita (Shepard and Burbrink 2008), Ozark (Austin et al. 2004), and Appalachian mountains (Church et al. 2003; Kozak et al. 2006; Walker et al. 2009) must be invoked to explain extant phylogeographical patterns further north, identification of other refugia along the Gulf Coast (which largely lacks the topography allowing for mesic refugia in highlands) has been more difficult. Pockets of mesic refugia likely existed throughout the Gulf Coast (Davis 1983), particularly along the bluffs of major rivers (Delcourt 2002), but it is unclear how extensive or reliable such refugia were during the Pleistocene (Jackson et al. 2000). Additionally, beyond the Apalachicola, Mississippi, and Tombigbee Rivers, a potential long-term barrier effect of major rivers in southeastern North America has received little support for terrestrial organisms (but see Pounds and Jackson 1981; Kozak et al. 2006; Lemmon et al. 2007) and empirical evidence is mixed on whether such smaller, although still formidable, rivers are sufficiently impenetrable or enduring to permit evolutionarily significant genetic or phenotypic divergence to accumulate between populations on opposite banks (Haffer 1969; Capparella 1991). Given the identification of such pocket refugia along the Gulf Coast, the potential for the isolating power of major rivers to be strengthened by Pleistocene-induced range contractions of populations into riverine-associated refugia can be investigated. Here, we report on the finding of an unusual pattern of iterative mtDNA population fragmentation observed in the common ground skink (Scincella lateralis) that is consistent with diversification by a combination of Pleistocene refugia and rivers. 
These findings suggest a more consistent supply of mesic refugia along the Gulf Coast than has previously been shown for this region (Jackson et al. 2000; Loehle 2007; Gonzales et al. 2008). In this study, we describe and analyze the mtDNA pattern, assess the consistency of this pattern across the genome at eight nuclear loci, and test three predictions of a river–refuge origin of diversity (Capparella 1991; Haffer 1997): (1) Distinct lineage boundaries or genetic discontinuities should be delineated by rivers, with most rivers associated with divergence near the coast, and only the largest rivers associated with divergence further north, (2) Recent population growth of S. lateralis populations from hypothesized refugia should be evident, and (3) Estimates of divergence should date to or near the Plio-Pleistocene. Results from this study highlight the potential power of rivers as an evolutionary force, but also the contingency of rivers-as-barriers, their effectiveness being dependent upon other, often labile, environmental or biological factors. Scincella lateralis, one of the most abundant reptiles in the southeastern United States (Dundee and Rossman 1989; Akin 1998; Conant and Collins 1998) is well-suited for a study on the isolating-potential of rivers and refugia due to its low vagility (Brooks 1967; Fitch and Achen 1977) and wide-ranging distribution across most major rivers along the southeastern coastal plains. Populations exhibit a similar morphology across the species range (Lewis 1951; Johnson 1953; Brooks 1967) and no subspecies or geographical variants have been proposed. Although exhibiting a wide tolerance to a variety of habitats (Milstead et al. 1950; Dundee and Rossman 1989), S. lateralis is reliant on warm-temperate and mesic conditions (Milstead 1960; Ashton and Ashton 1985) best provided by the present-day deciduous and mixed forests of the southeastern United States. 
Materials and Methods

SAMPLING

We collected tissue from 367 ingroup specimens (Table S1) representing 100 sampling localities (Table S2) from throughout the range of the species, with some attention to sampling near major rivers (Fig. 1). Two additional samples, S. gemmingeri and Sphenomorphus cherriei, were included as outgroup specimens, in part based on previous phylogenetic analysis (Honda et al. 2003). The complete cytochrome b mitochondrial gene (cytb), including 33 base pairs of the downstream flanking tRNA-Thr gene (1177 total base pairs), was sequenced for all individuals. A geographically representative subsample of 63 individuals (including the outgroup S. gemmingeri) was selected for further sequencing at eight nuclear loci (4673 total base pairs; Table S3). These loci include one intron: selenoprotein T (SELT; 852 bp); one protein-coding gene: the prolactin receptor (PRLR; 558 bp); and six noncoding genomic loci (ranging from 443 to 641 bp), discussed below (Table S4).

Figure 1. Geographical distribution of localities sampled for this study, with numbers corresponding to locality numbers in Table S2. Circle colors correspond to 14 reconstructed mtDNA clades, where localities harboring multiple clades are indicated by circles with mixed colors filled in proportion to clade representation. Gray boxes outline areas depicted in more detail in Figure 3A–C. Latitudinal coordinates are indicated along the border of the map.

COLLECTION OF GENETIC MATERIAL

Either liver or tail tissue was sampled from each lizard and preserved in ≥95% ethanol and/or stored at −80°C. Genomic DNA was extracted using either salt-extraction (Fetzner 1999) or a Qiagen DNeasy extraction kit (Qiagen, Valencia, CA). Six noncoding nuclear loci were developed from select cloned sequences screened from a genomic microsatellite library created from a single S. lateralis individual (sample NDJ764; Table S1).
Initially, of 192 sequenced clones, 20 candidate loci were selected from clones either lacking microsatellites or containing substantial microsatellite flanking regions. Primer pairs were developed for these loci using Primer3 (Rozen and Skaletsky 2000) and then optimized for the sequencing of an initial screening set of eight individuals (representative of the major mtDNA variation as then understood for the species). Ascertainment bias was minimized by (1) using a large screening panel composed of divergent populations and (2) basing the inclusion of loci into the final dataset on our ability to obtain quality sequences rather than on the variability of observed polymorphisms (Rogers and Jorde 1996; Brumfield et al. 2003). All loci were compared against sequences in an online genome database using BLASTN (Altschul et al. 1997) to assess homology with known genomic regions. Polymerase chain reaction (PCR) of genomic DNA was carried out in accordance with standard protocols (Austin et al. 2009) and amplicons were purified by combining 5 μl PCR product with 0.25 μl Exonuclease I (20 units/μl), 0.25 μl of Antarctic phosphatase (5 units/μl), 0.25 μl 10× buffer (50 mM Bis-Tris-Propane-HCl, 1 mM MgCl2, 0.1 mM ZnCl2), and 4.25 μl purified water, followed by incubation for 20 min at 37°C and 15 min at 80°C. Double-stranded cycle-sequencing was carried out for each amplicon using a BigDye Terminator cycle-sequencing kit version 3.1 (Applied Biosystems, Foster City, CA). After sequences were cleaned using Sephadex, they were electrophoresed on a 3100 Genetic Analyzer (Applied Biosystems).

DNA ALIGNMENT, PHYLOGENETIC, AND NETWORK ANALYSIS

Sequences were edited and assembled into contigs using Sequencher version 4.6 (GeneCodes, Ann Arbor, MI) and cytb and PRLR datasets were translated into amino acids to check alignment. The alignment of noncoding regions was carried out using Clustal X version 2.0 (Larkin et al. 2007) and results were adjusted as necessary by eye.
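The translation check on coding sequences described above can be illustrated with a minimal sketch. It is not the procedure the authors ran in Sequencher; it only scans an in-frame mitochondrial sequence for premature stop codons, assuming the vertebrate mitochondrial genetic code (NCBI translation table 2). The function name and the example sequences are hypothetical.

```python
# Stop codons under the vertebrate mitochondrial genetic code (NCBI table 2).
VERT_MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def internal_stop_positions(seq, frame=0):
    """Return 0-based codon indices of in-frame stop codons.

    The final complete codon is excluded, because a terminal stop is expected
    in a full-length coding sequence. Trailing bases that do not fill a codon
    are ignored.
    """
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in VERT_MITO_STOPS]


# Hypothetical toy sequences: the first has a premature AGA stop at codon 1.
with_stop = internal_stop_positions("ATGAGAGCCTAA")
clean = internal_stop_positions("ATGGCCTAA")
```

An empty list is what one would hope to see for a correctly aligned, functional mitochondrial gene; a non-empty list flags a frame shift or a possible nuclear pseudogene copy.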
For nuclear datasets, haplotype phase was reconstructed using a Markov chain Monte Carlo (MCMC) algorithm as implemented in PHASE version 2.1 (Stephens et al. 2001; Stephens and Scheet 2005). The program was run multiple times for each dataset and haplotype frequency and goodness of fit were compared across runs to assess the consistency of results. Six nuclear sequences were unreadable due to heterozygous indels and removed from the analysis. We collapsed each dataset into unique haplotypes using Collapse version 1.2 (Posada 1999) and reconstructed gene phylogenies using both Bayesian inference (BI) and maximum likelihood (ML) optimality criteria. The most likely model of nucleotide substitution was selected (using Akaike information criterion) for each locus (and for codon-based partitions of datasets for protein-coding loci) after performing likelihood-ratio tests in Modeltest version 3.7 (Posada and Crandall 1998). For the cytb dataset, we performed a full likelihood heuristic search (200 replicates) in RAxML-HPC version 7.0.3 (Stamatakis 2006) assuming a general time-reversible (GTR) model of evolution with Γ-distributed rate heterogeneity, four rate categories, and an estimation of the proportion of invariable sites. To assess nodal support, we also performed 1000 replicates of nonparametric bootstrapping implemented in RAxML-HPC version 7.0.3. BI phylogenies for cytb and each nuclear locus were reconstructed using MrBayes version 3.1.2 (Ronquist and Huelsenbeck 2003). For each dataset, searches were carried out for two independent runs each consisting of four Markov chains that were permitted to run for 5–20 million generations, with sampling every 1000 generations. Convergence was confirmed by inspecting split frequency plots constructed using AWTY (Nylander et al. 2008) and 50% consensus trees were constructed using post-burn-in genealogies. 
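The model-selection step mentioned above (choosing a substitution model by Akaike information criterion) can be sketched as follows. This is a generic illustration of AIC ranking, not a reimplementation of Modeltest. The log-likelihoods and parameter counts below are hypothetical placeholders, and only substitution-model parameters are counted (branch lengths are assumed to be shared across models).

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def best_model_by_aic(models):
    """Rank candidate models by AIC.

    models maps a model name to (ln L, number of free parameters).
    Returns the best-scoring name and each model's AIC difference from it.
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in models.items()}
    best = min(scores, key=scores.get)
    deltas = {name: score - scores[best] for name, score in scores.items()}
    return best, deltas


# Hypothetical values for illustration only; not taken from the study.
candidates = {
    "JC": (-10500.0, 0),
    "HKY+G": (-10120.0, 5),
    "GTR+I+G": (-10040.0, 10),
}
best, deltas = best_model_by_aic(candidates)
```

With these made-up numbers the most parameter-rich model wins because its likelihood gain outweighs the penalty of 2 per extra parameter.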
All eight nuclear datasets were concatenated and a partition homogeneity test (100 replicates) was implemented in PAUP* version 4.0b10 (Swofford 2002) to assess the level of well-supported discordance among individual topologies. Simultaneous analysis was not carried out via concatenation given obvious topological differences among gene histories (Kubatko and Degnan 2007; Edwards 2009), and the significant degree of geographical contact and migration observed among some ncDNA populations (see Results) renders these datasets poorly suited for novel methods of multilocus phylogenetic reconstruction (Brumfield et al. 2008; Eckert and Carstens 2008). Finally, given the potential for multifurcations and reticulations within intraspecific gene trees (Rosenberg and Nordborg 2002), we also constructed haplotype networks for single-gene nuclear datasets in order to better visualize such nonbifurcating relationships (Posada and Crandall 2001). Networks were constructed using statistical parsimony (Templeton et al. 1992) as implemented in TCS version 1.21 (Clement et al. 2000) at the 95% confidence level.

POPULATION DIVERSITY AND STRUCTURE

Standard haplotype (Hd) and nucleotide (π) diversity indices were calculated for each single gene dataset and mtDNA lineage using Arlequin version 3.1.1 (Excoffier 2005). Average corrected (Kimura 2-parameter; K2P) and uncorrected pairwise genetic distances within and between major mtDNA lineages were calculated using MEGA version 3.1 (Kumar et al. 2004). Rough estimates of divergence times among mtDNA clades were calculated assuming a molecular clock that represents the minimum and maximum of the 1–2% range of cytb divergence rates estimated for various small-bodied lizards and lygosomine skinks (Austin 1995; Brown and Pestano 1998; Gübitz et al. 2000; Malhotra and Thorpe 2000; Poulakakis et al. 2005).
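A rough sketch of the distance and dating arithmetic described above follows. It computes the standard Kimura 2-parameter distance between two aligned sequences and brackets a divergence time with the 1–2% clock. It assumes the rates are pairwise sequence divergence per million years, which the text does not state explicitly; the function names and the toy sequences are hypothetical, and the study itself used MEGA for these distances.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance: d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)].

    P and Q are the proportions of transitions and transversions among
    sites where both sequences carry an unambiguous base.
    """
    n = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        n += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if n == 0:
        raise ValueError("no comparable sites")
    p = transitions / n
    q = transversions / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def divergence_time_range(distance, slow_rate=0.01, fast_rate=0.02):
    """Return (youngest, oldest) divergence time in million years.

    Assumption: rates are pairwise divergence per million years.
    """
    return distance / fast_rate, distance / slow_rate


# Toy example: one transition in ten sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
young, old = divergence_time_range(0.06)
```

Under the stated assumption, a corrected distance of 0.06 between two clades would translate to a split roughly 3 to 6 million years ago.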
We also investigated substitution rate heterogeneity among mtDNA lineages by performing a phylogenetically weighted relative rates test in RRTREE (Robinson-Rechavi and Huchon 2000). Population structure of the eight nuclear loci was investigated using a Bayesian clustering method implemented in Structure version 2.2.3 (Pritchard et al. 2000). We first estimated the most likely number of populations (K), as well as the most probable individual assignments to populations, given multilocus genotypic data. Applying the linkage model (Falush et al. 2003) to the raw phased sequence data (where linkage was allowed within, but not among loci), we ran Structure for all values of K between K = 1 and K = 8 ten times each for at least 150,000 generations (with an additional burn-in of 200,000). After it was determined that K = 3 had the highest likelihood (see below), we also ran Structure for each of these three populations separately three times each for K = 1 through K = 6. The mean likelihood estimate [ln Pr(X | K)] from replicate runs for each value of K was used to estimate the posterior probability for each K [Pr(K | X)] by assuming a uniform prior on K (Pritchard et al. 2000). Estimates of θ (= 4Neμ) and migration rates among the three Structure-inferred populations were obtained jointly for the ncDNA dataset by carrying out parameter searches using a Metropolis MCMC Bayesian-style sampling algorithm implemented in LAMARC version 2.1.3 (Kuhner et al. 1998; Kuhner 2006). Because the coalescent process as modeled here assumes no genealogical reticulation due to recombination, we used a dataset from which recombining regions were removed. Using the program IMgc (Woerner et al. 2007) we generated recombination-free blocks of data by filtering out all regions in the multilocus dataset that violate the four-gamete rule for recombination (Hudson and Kaplan 1985).
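The four-gamete rule referred to above can be illustrated with a short sketch. It only flags pairs of sites at which all four two-site haplotypes are present; it does not reproduce how IMgc then chooses which sites or sequences to discard. Sites are assumed to be biallelic, and the function names and toy haplotypes are hypothetical.

```python
from itertools import combinations


def violates_four_gamete(haplotypes, i, j):
    """True if sites i and j carry all four two-site combinations.

    Assumes both sites are biallelic, so four distinct combinations can
    arise only through recombination or recurrent mutation.
    """
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) == 4


def incompatible_site_pairs(haplotypes):
    """List every pair of sites that fails the four-gamete test."""
    n_sites = len(haplotypes[0])
    return [
        (i, j)
        for i, j in combinations(range(n_sites), 2)
        if violates_four_gamete(haplotypes, i, j)
    ]


# Toy phased haplotypes: sites 0 and 1 show AA, AT, TA and TT.
pairs = incompatible_site_pairs(["AAC", "ATC", "TAC", "TTC"])
```

A recombination-free block is then any stretch of the alignment in which no pair of sites is flagged.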
Several independent chains were run for 5–10 million generations, with sampling every 100–140 steps, using default priors, and convergence was assessed using Tracer version 1.4 (Drummond and Rambaut 2007). To convert scaled parameter estimates to demographic estimates, we assumed a generation time of 1.7 years estimated from S. lateralis population age structure and survivorship data using the equation G = α + [s/(1 − s)] (Sæther et al. 2005), where α = the mean age at first breeding (1 year) and s = the adult survival rate (0.41; Brooks 1967). We also assumed a mean nuclear clock calibrated to both 1% and 2% cytb rates by using divergence from the outgroup S. gemmingeri as a calibration point.

ASSOCIATION OF DIVERSITY WITH RIVERINE BARRIERS

Analysis of molecular variance (AMOVA; Excoffier et al. 1992) was used to compare ncDNA population structure with mtDNA phylogeny as well as assess the association of multilocus diversity with major rivers. To do this, we partitioned the concatenated ncDNA dataset in three different ways. (1) We first assessed the hierarchical allocation of genetic variation for "minor" Structure-inferred populations (depicted in Fig. 5B) nested within "major" Structure-inferred populations (depicted in Fig. 5A). (2) We compared these results with those from an ncDNA dataset similarly partitioned hierarchically into "minor" mtDNA clades (A–N) within "major" mtDNA clades ([A–D], [E–F], [G–H], [J–K], [N]). (3) Finally, to test population structure based on a model of divergence due to major rivers, we partitioned ncDNA into seven hypothesized populations isolated by six major rivers (Colorado, Red, Mississippi, Tombigbee, Choctawhatchee, and Apalachicola). We then grouped these populations in all possible ways, performing hierarchical AMOVAs on each partition scheme to determine which river(s) explain(s) the most variation.
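The generation-time figure quoted in the demographic conversion above follows directly from the stated equation and values, as the short worked sketch below shows. The second function, which converts θ = 4Neμ into an effective size, is an illustrative assumption about how such a conversion is commonly done (per-year rate multiplied by generation time); its name and inputs are hypothetical and it is not taken from the paper.

```python
def generation_time(first_breeding_age, adult_survival):
    """G = alpha + s / (1 - s), with alpha in years and s a yearly survival rate."""
    if not 0 <= adult_survival < 1:
        raise ValueError("adult survival must lie in [0, 1)")
    return first_breeding_age + adult_survival / (1 - adult_survival)


def effective_size(theta, mu_per_site_per_year, gen_time_years):
    """Ne = theta / (4 * mu), with mu expressed per site per generation.

    Assumption: the per-generation rate is the per-year rate times G.
    """
    mu_per_generation = mu_per_site_per_year * gen_time_years
    return theta / (4 * mu_per_generation)


# Values given in the text: alpha = 1 year, s = 0.41.
g = generation_time(1, 0.41)  # about 1.69, i.e. 1.7 years after rounding
```

Because Ne scales inversely with the assumed mutation rate, halving the clock rate from 2% to 1% doubles the demographic estimate for the same θ.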
All AMOVAs were performed using Arlequin version 3.1.1 where significance of FST analogues was tested against 10,000 nonparametric permutations.

Figure 5. Geographic distribution of population assignments inferred for eight nuclear loci using Bayesian clustering as implemented in the program Structure. (A) Results from analysis of the entire dataset and (B) results from analysis of datasets partitioned in accordance with populations I–III depicted in (A). Circles correspond to sampling localities, circle fill corresponds to sample assignment to one of three populations (I–III), and line design within circles corresponds to sample assignment to one of seven subpopulations (Ia–Ib, IIa–IIb, or IIIa–IIIc). Hybrid samples (any sample assigned to two populations with ≥20% probability) in (A) are depicted with mixed fill roughly in proportion to assignment probabilities. Hybrid samples in (B) are depicted with combined population line designs. Asterisks indicate localities from which two or more individuals were sequenced that had the same population assignment probability profile. Small squared letters indicate localities from which two or more individuals were sequenced that had different assignment profiles, one assigned as indicated on the map (hybrids in all cases) and the others assigned as follows: A = Ia, B = IIb, C = IIa, IIb, IIb, IIb, D = IIb, and E = IIIc.

Partial Mantel tests were carried out using the ncDNA dataset for these same six rivers to test whether genetic distance is greater between samples on opposite sides of rivers than between equidistant samples from the same side (Smouse et al. 1986), a pattern expected if rivers restrict gene flow. For each river, great circle geographic distances and pairwise patristic genetic distances were calculated between genotypes originating from either side of the river.
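Two of the inputs to the partial Mantel test described above, the great-circle distance matrix and the binary across-river matrix, can be sketched as follows. The haversine formula and a mean Earth radius of 6371 km are assumptions (the text does not say how great-circle distances were computed), and the function names and side labels are hypothetical. The permutation test itself, which the authors ran in the R package vegan, is not reproduced here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def river_barrier_matrix(sides):
    """Binary matrix: 1 if two samples lie on opposite banks, 0 otherwise."""
    return [[int(a != b) for b in sides] for a in sides]


# Hypothetical samples: two west of a river, one east of it.
barrier = river_barrier_matrix(["west", "west", "east"])
```

The partial Mantel test then asks whether genetic distance correlates with the barrier matrix after the correlation with geographic distance has been removed.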
The partial correlation between genetic distance and a binary matrix (describing whether any two samples are separated by the river) was determined once autocorrelation due to geographic distance was accounted for. Tests were performed using the vegan package in R (Oksanen et al. 2009; R Development Core Team 2009) where significance was assessed using 1000 randomizations. We also performed Fisher's exact tests for mtDNA clade pairs putatively delineated by 10 major rivers (Colorado, Red, Atchafalaya, Mississippi, Amite, Pascagoula, Tombigbee, Choctawhatchee, Chipola, and Apalachicola/Chattahoochee) to test the significance of the association between rivers and clade membership.

DEMOGRAPHIC EXPANSION FROM REFUGIA

The presence of a signature of recent demographic expansion within the dataset was tested using a variety of methods. First, values of Tajima's D (TD; Tajima 1989), Fu's FS (Fu 1997), and Ramos-Onsins and Rozas's R2 (Ramos-Onsins and Rozas 2002) were calculated in Arlequin version 3.1.1 and DnaSP version 4.5 (Rozas et al. 2003) for all lineages and loci because significantly negative values can indicate recent growth. Statistical significance of these values was assessed with 10,000 coalescent simulations, where the estimated recombination rate (calculated in DnaSP) was assumed for diploid loci. The frequency distribution of pairwise genetic differences (or mismatches) within mtDNA clades was investigated using Arlequin version 3.1.1. We compared observed mismatch distributions against simulated distributions (expected to be unimodal under a model of sudden population growth; Rogers and Harpending 1992). Fit of the data to each model and confidence intervals around parameters were determined using the sum of squares deviations (SSD) between the observed and expected mismatch distributions as well as the raggedness index (Harpending 1994) calculated from 2000 bootstrap replicates.
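The mismatch distribution and raggedness index mentioned above can be sketched in a few lines. The raggedness formula used here is the commonly cited form, r = Σ (x_i − x_{i−1})² summed over i = 1 to d + 1, where x_i is the relative frequency of pairs differing at i sites and d is the largest observed difference; treat that as an assumption, since the text does not spell it out. The function names and toy sequences are hypothetical, and Arlequin's bootstrap fitting is not reproduced.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequencies x[0..d] of pairwise differences among aligned sequences."""
    counts = Counter(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = sum(counts.values())
    d = max(counts)  # largest number of differences observed
    return [counts.get(i, 0) / n_pairs for i in range(d + 1)]


def raggedness(freqs):
    """Sum of squared steps between adjacent mismatch classes, with x[d+1] = 0."""
    x = list(freqs) + [0.0]
    return sum((x[i] - x[i - 1]) ** 2 for i in range(1, len(x)))


# Toy alignment of three sequences: pairwise differences are 1, 2 and 1.
freqs = mismatch_distribution(["AAAA", "AAAT", "AATT"])
r = raggedness(freqs)
```

A smooth, unimodal distribution gives a small r, whereas a multimodal distribution, as expected for a stable or subdivided population, gives a larger one.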
We also reconstructed demographic histories for most mtDNA clades using the Bayesian skyline plot (BSP) as implemented in BEAST version 1.4.8 (Drummond and Rambaut 2007). The MCMC was run at least three times for at least 20 million generations (taking 10,000 samples per analysis) for each clade under a GTR + Γ + I model of evolution assuming a relaxed uncorrelated lognormal molecular clock (Drummond et al. 2006). Effective sample size of parameters was assessed and plots were visualized using Tracer version 1.4. We used LAMARC version 2.1.3 to produce posterior probability distributions of parameters within the exponential growth model $\theta_t = \theta_0 e^{-gt\mu}$ to assess population growth rate using the ncDNA dataset. Searches were run for each of the three major populations estimated by the program Structure. Several independent chains were run for 5–10 million generations, sampling every 100–140 steps, and convergence was assessed using Tracer version 1.4. Several initial runs were carried out to explore the parameter space, with final runs implementing a flat growth rate prior (linearly scaled) of g = 1000–5000 for population I and g = 0–4000 for populations II and III. Finally, recent growth out of putative southern refugia would predict that southern populations exhibit higher levels of diversity than northern populations. To compare levels of diversity in these two regions while accounting for bias in sampling effort, we resampled 100 individuals (without replacement) from samples collected in the north and south (with 31° north latitude used as an arbitrary cut-off) separately for 500 iterations using R. We calculated mtDNA diversity indexes from each bootstrap using Arlequin version 3.1.1 and constructed distributions of values from which we obtained point estimates and 95% confidence intervals.
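The north versus south resampling comparison described above can be sketched as follows. The authors used R and Arlequin; this version is only an illustration. It uses haplotype diversity in its usual form, Hd = n/(n − 1) × (1 − Σ p_i²), and a simple percentile interval; the function names, the seed, and the percentile indexing are assumptions.

```python
import random
from collections import Counter


def haplotype_diversity(haplotypes):
    """Hd = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def resampled_diversity(haplotypes, sample_size, iterations, seed=0):
    """Mean and approximate 95% interval of Hd over subsamples drawn without replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    values = sorted(
        haplotype_diversity(rng.sample(haplotypes, sample_size))
        for _ in range(iterations)
    )
    lower = values[int(0.025 * (iterations - 1))]
    upper = values[int(0.975 * (iterations - 1))]
    return sum(values) / iterations, (lower, upper)


# Hypothetical pool of haplotype labels for one region.
pool = ["h1"] * 6 + ["h2"] * 3 + ["h3"]
mean_hd, (low_hd, high_hd) = resampled_diversity(pool, sample_size=5, iterations=200)
```

Drawing the same number of individuals from each region puts the two diversity estimates on a common footing, so a difference between regions is less likely to reflect unequal sampling effort.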
We repeated this procedure for the eight nuclear loci (resampling from the reduced ncDNA dataset) and for individual mtDNA lineages (for which we had ≥15 samples north or south of 31°), except that we resampled only 20 and 12 individuals per iteration, respectively.

Results

DESCRIPTION OF GENETIC DATA

Excluding the outgroup, nuclear datasets contained between 20 and 70 variable sites of which 11 to 68 were parsimony-informative (Table S5). All six anonymous loci are assumed to be noncoding given they do not appear translatable into proteins or match any annotated gene regions in GenBank. Most haplotypes were reconstructed with 100% posterior probability, and no effect on results was observed when analyses were performed either including or excluding loci phased with <95% probability. GenBank accession numbers for all sequences are provided in Table S1.

PHYLOGENETIC AND NETWORK ANALYSIS

For the cytb dataset, 314 unique haplotypes were recovered from 366 S. lateralis samples plus two outgroups. An absence of indels, nucleotide ambiguities, and stop codons indicates a mitochondrial origin for these sequences (Zhang and Hewitt 1996). Excluding outgroups, 389 sites are variable, of which 286 are parsimony-informative, and likelihood ratio tests determined that GTR + Γ + I is the most likely model of evolution for the overall dataset. After burn-in trees were removed, codon-partitioned BI analysis produced a 50% consensus genealogy with a mean ln-likelihood of −10035.4 (standard deviation = 22.81).

Reference(s)