Article; Open access; Peer reviewed

Genome‐wide association to fine‐scale ecological heterogeneity within a continuous population of Biscutella laevigata (Brassicaceae)

2008; Wiley; Volume: 178; Issue: 2; Language: English

10.1111/j.1469-8137.2007.02361.x

ISSN

1469-8137

Authors

Christian Parisod, Pascal‐Antoine Christin

Topic(s)

Plant Ecology and Taxonomy Studies

Abstract

New Phytologist, Volume 178, Issue 2, p. 436-447. Free Access

Christian Parisod, Pascal-Antoine Christin
Department of Ecology and Evolution, Biophore, quartier Sorges, University of Lausanne, CH–1015 Lausanne, Switzerland

First published: 17 January 2008
https://doi.org/10.1111/j.1469-8137.2007.02361.x
Citations: 53

Author for correspondence: Christian Parisod. Tel: +33 1 30 83 30 24. Fax: +33 1 30 83 30 99. Email: cparisod@versailles.inra.fr

Summary

• Gene flow, drift and selection can be detected through different signatures across the genome and the landscape.
Genetic discontinuities along with their correlation to environmental features can be used to tease out isolation-by-distance and isolation-by-time from processes related to selection.

• Using spatial statistics (spatial autocorrelation methods, canonical correspondence analysis and partial Mantel tests) suited to genome-wide amplified fragment length polymorphism (AFLP) markers for which Hardy–Weinberg assumptions are unlikely to hold, this study investigates 124 individuals within a continuous population of the autopolyploid Biscutella laevigata (Brassicaceae).

• Fine-scale spatial genetic structure was strong and the mosaic-like distribution of AFLP genotypes was consistently associated with habitat factors, even when controlled for geographical distances. The use of multivariate analyses enabled separation of the factors responsible for the distribution of the genetic variance and revealed a composite effect of isolation by distance, phenological divergence and local adaptation to habitats characterised by different solar radiation regimes.

• These results suggest that the immigrant inviability barrier facilitated the maintenance of subpopulations adapted to distinct environmental conditions at the local scale.

Introduction

The increased availability of highly variable genetic markers led to the development of population genomics, a discipline investigating the evolution of different regions across the genome in natural populations (Luikart et al., 2003). Meanwhile, improved accessibility of spatially accurate ecological data resulted in the emergence of landscape genetics, which maps genetic discontinuities among populations onto environmental features with the aim of better understanding the factors influencing the distribution and evolution of natural populations (Manel et al., 2003).
These closely related fields are currently merging into landscape genomics, which simultaneously investigates numerous genetic markers across the genome together with numerous sites across the landscape (Joost et al., 2007). However, despite guidelines and sophisticated tools, the links between genetic and environmental variables remain difficult to unambiguously quantify, because of the numerous disadvantages of conventional analytical tools (Sork et al., 1999; Escudero et al., 2003; Luikart et al., 2003; Manel et al., 2003; Storfer et al., 2007). Although differentiation of gene pools is easily envisioned in spatial isolation when a barrier to gene flow leads to the independent evolution of lineages and thereby genetic discontinuities, divergence is not as straightforward within populations (Levin, 2000). In continuous populations, recurrent processes, such as gene flow, genetic drift and selection, act in concert to shape the genetic structure (Lenormand, 2002; Latta, 2003). Theoretically, gene flow is expected to homogenize the distribution of genetic variation, unless selection or drift is strong. Fine-scale genetic differentiation has often been reported in plant populations (Vekemans & Hardy, 2004), even under substantial gene flow, suggesting that strong selective pressure promotes local adaptation at small scale in heterogeneous landscapes (Linhart & Grant, 1996). Nevertheless, restricted gene dispersal also increases genetic differentiation among populations as a function of geographical distance (Slatkin, 1993) and several evolutionary processes can create genetic heterogeneity, leading to a within-population structure similar to the expectations under selection (Latta, 2003). Purely demographic processes can be distinguished from selection, because they are expected to have different consequences across the genome and across the landscape. 
Demographic processes such as migration or genetic drift are expected to similarly affect all loci across the genome, while selection is supposed to act upon single loci or small portions of the genome. Drift and selection are also expected to leave different genetic signatures across the landscape. On the one hand, drift generates a strong, but stochastic spatial structure, which is therefore not expected to match environmental conditions. On the other hand, a consistent association of particular genotypes with environmental conditions suggests that populations are locally adapted (Latta, 2003). It has also been recently stressed that differential gene exchange influenced by plastic variation in the flowering phenology among local habitats may contribute to a fine-scale genetic variation associated with environmental heterogeneity (Hendry & Day, 2005). Unlike local adaptation, differences in phenology are anticipated to lead to genome-wide structure associated with environmental factors, and it is crucial to explore procedures that may distinguish local adaptation from phenological divergence.

Population genomics tools without Hardy–Weinberg assumptions

From a methodological point of view, if populations are understood as communities of loci varying across heterogeneous environments, it becomes attractive to adopt the statistical tools of community ecology in population genetics (Gram & Sork, 2001). Indeed, despite controversies over multivariate analyses in evolutionary genetics (Blows, 2007), genomic studies make great use of this statistical framework (Quackenbush, 2001) and the few population studies adopting it provided valuable insights about adaptation under natural conditions (Hamrick & Allard, 1972; Nevo et al., 1988; Angers et al., 1999; Gram & Sork, 2001; Manel et al., 2003; Volis et al., 2004).
Amplified fragment length polymorphism markers (AFLPs) particularly suit the aims of population genomics and several statistical tools have been developed to deduce allele frequencies from dominant molecular markers in diploid organisms (Bonin et al., 2007). However, Hardy–Weinberg assumptions on which these estimations rely are hardly verifiable and probably not met in several situations, in which no procedure seems yet available to deal with AFLPs. For example, autopolyploid lineages, which occur in many plant taxa (Soltis & Soltis, 2000; Soltis et al., 2007), certainly violate Hardy–Weinberg expectations because they often present multisomic inheritance at certain loci, resulting in segregation complexities (Bever & Felber, 1992; Ronfort et al., 1998). In order to investigate the evolutionary potential of all these insightful taxa, alternative statistical tools are needed. Principal component analysis (PCA) aims at extracting major gradients of variation in data and the related between-group eigenanalysis (BPCA) ordinates specified groups as to maximize their separation in some space, allowing testing for the significance of the between-group variance (Doledec & Chessel, 1987). When used in population genetics, PCA and BPCA are band-based approaches that are independent of any theoretical genetic models and conveniently allow investigation of genetic structure (Patterson et al., 2006), especially when Hardy-Weinberg equilibrium assumptions are a priori unexpected. Furthermore, when quantifying the effect of environmental heterogeneity on genetic structure and/or detecting candidate loci, spatial coincidence analyses are particularly suitable because they allow circumventing assumptions of neutrality. In that context, canonical correspondence analysis (CCA) suits the aims of population geneticists by attributing dependent variables (e.g. genetic data) to explanatory variables (e.g. ecological factors) and testing for their association. 
Canonical correspondence analysis integrates ordination and multiple regressions, taking all the AFLP loci and the environmental factors independently into account in a single analysis. It may represent an efficient method to explore the association between genetic differentiation and environmental heterogeneity (i.e. patterns of local adaptation; Manel et al., 2003; Storfer et al., 2007). Nevertheless, investigating the association between spatial patterns of genetic diversity and environmental heterogeneity is complicated by the presence of similar spatial autocorrelation in both independent datasets, which may artificially inflate their correlation (Volis et al., 2004). Such a situation is particularly likely in continuous populations at the local scale and inferences from spatial coincidence analyses have to be complemented by a quantification of purely geographic effects, which may be achieved by partial Mantel tests comparing distances among multiple datasets (Riginos & Nachman, 2001; Bekkevold et al., 2005).

Biscutella laevigata as a model species that does not follow Hardy–Weinberg expectations

Biscutella laevigata L. (Brassicaceae) is a spring-blooming, long-lived perennial species with a sporophytic self-incompatibility system (Olowokudejo & Heywood, 1984). Pollen dispersal is achieved by generalist Diptera and Lepidoptera, while seeds are passively dispersed by gravity and wind. Since the early studies of Manton (1937), B. laevigata is thought to have survived the Pleistocene vicissitudes as a diploid (2n = 2x = 18) in ice-free parts of the European continent and recolonized the Alps as an autotetraploid (2n = 4x = 36) after the ice ages. Using allozyme markers, Tremetsberger et al. (2002) showed multisomic segregation of alleles in the autopolyploids and recent plastid DNA phylogeography indicated that the species evolved through polytopic autopolyploidy before recolonizing previously glaciated areas with independent lineages (Parisod & Besnard, 2007).
Whereas many studies focus on allopolyploidy (i.e. the merging and doubling of differentiated genomes), the evolutionary advantages of autopolyploid lineages, which do not combine the attributes of divergent parents, are neither straightforward nor well understood (Soltis & Soltis, 2000; Comai, 2005; Soltis et al., 2007). Studies investigating multisomic taxa, such as B. laevigata, may thus provide valuable insights about the role of genome doubling under natural conditions. Using a systematic sampling strategy in a continuous population of Biscutella laevigata (Brassicaceae) and AFLP on 124 individuals, this study aims at separating isolation-by-distance and isolation-by-time processes from those influenced by selection. Using appropriate statistical procedures, an association between the genome-wide AFLPs and the environmental variance was detected, suggesting restricted gene flow among habitats in relation to local adaptation and phenology.

Material and Methods

Sampling site and sampling strategy

The present study population (6°58′/46°26′; JAM in Parisod & Besnard, 2007) is located in an area with a high level of plastid DNA haplotype diversity (Parisod & Besnard, 2007) and lies in putative refugia of the external area of the western Swiss Alps (Stehlik, 2000). It is therefore considered as a rear-edge population of one autopolyploid lineage. The altitude ranges between 1850 m and 2000 m, which represents the lower limit of the alpine belt. The bottom of the zone corresponds to subalpine habitats, while the plots located at higher elevation are in alpine conditions. Along this natural ecotone, contrasting habitats have a mosaic distribution according to the spatial distribution of ecological factors related to microtopography (C. Parisod et al., unpublished). Biscutella laevigata appeared in a linear, continuous population system over the drier areas of the study site.
Following a systematic strategy, the species was censused in 2 × 2 m plots every 12.5 m and, when at least six individuals were present, leaves of each individual located nearest to each corner (total of four individuals) were collected and dried in silica gel (Fig. 1). Using this strategy, the resulting sampling is representative of the species' natural distribution. The plot location was measured 100 times using a Trimble Geo-explorer 3 GPS (Sunnyvale, CA, USA) providing a precision of 1 m. Biscutella laevigata was present in 31 sampling plots, giving a dataset of 124 individuals for genetic analyses. The phenological stage of individuals was reported and coded as 1 for buds, 2 for buds and opened flowers, 3 for opened flowers, 4 for opened flowers and fruits, 5 for mature fruits and 6 for mature and dispersed fruits. Environmental values corresponding to the location of every plot were extracted from published GIS eco-climatic layers (Zimmermann & Kienast, 1999) and used as explanatory variables to explore their contribution to the phenology, but also their association with genetic markers (see later). Based on ecological modelling at the local and the regional scale (S. Joost and C. Parisod, unpublished), a few relevant environmental factors, which showed wide amplitude within the species habitat, were investigated. The following factors (presented with the range of values across the sampled plots and units between brackets) were taken into account (see the Supplementary material, Table S1): elevation (1851–1990 m), degree-days during the growing season (1124–1260 degree × days), slope (2–69°) and total solar radiation (18 057–75 418 kJ d−1).

Figure 1. Distribution of the sampled plots located within the continuous population of Biscutella laevigata and spatial representation of the genetic structure. Basemap presents the topography of the site with conventional northwest lighting and isoelevation lines every 50 m.
(a) Circles represent the sampled plots; colour indicates their quantiled score along the first between-group eigenanalysis (BPCA) axis and diameter is correlated with the genetic diversity, according to the panel on the right. (b) Plots are represented by a star, square or triangle indicating the mean phenological stage and coloured according to their quantiled score along the second BPCA axis, according to the panel on the right.

Genetic analysis

Total DNA was extracted from dried leaf material using the FastDNA kit (Q-Biogen) in a room devoted to DNA extraction. The manufacturer's instructions were followed except that extracted DNA was washed twice. DNA quality and concentration were checked on agarose gels. Dominant AFLP markers were generated for the whole population (124 profiles) following Vos et al. (1995) with minor modifications. Recommendations of Bonin et al. (2004) were followed in order to reduce genotyping errors. Restrictions and ligations were carried out in a Biometra thermocycler (Göttingen, Germany) at 37°C for 2 h each. Restrictions were performed on 150 ng of genomic DNA in 20-µl reactions of 10 × buffer for EcoRI with 2 U MseI, 5 U EcoRI and 0.3 µg bovine serum albumin (BSA). Then, 0.7 µM double-stranded EcoRI and MseI adapters were ligated in a further 20 µl of 10 × T4 buffer with 1 Weiss U of T4 DNA ligase, giving a 40 µl ligation mix. Two microlitres of ligation solution were used for a nonselective amplification in 20 µl by 28 preselective cycles (120 s at 94°C, 28 cycles for 45 s at 94°C, 45 s at 56°C and 60 s at 72°C and a final elongation of 600 s at 72°C). Three microlitres of 20-times diluted preamplification solution were then used for a selective amplification by 36 cycles of a touch-down polymerase chain reaction (PCR) (60 s at 94°C, 13 cycles of 30 s at 94°C, 30 s at 65°C to 56°C and 60 s at 72°C followed by 23 cycles of 30 s at 94°C, 30 s at 56°C, 60 s at 72°C with a final elongation of 300 s at 72°C).
Based on a preliminary assessment, where eight selective primer combinations were tested three times independently to check for polymorphisms and reproducibility, the complete sample set was processed using two selective primer combinations: M-CAG/E-ACA and M-CTC/E-AGG with the fluorescent E primer labelled by FAM and JOE, respectively. PCR products were visualized using an ABI-PRISM 377 (Applied Biosystems, Foster City, CA, USA) sequencer on a 6% Long Ranger denaturing gel for 5 h (2000 V, 50 A). Fragment sizes were estimated using the 500-ROX standard and scored manually as dominant markers using Genescan 3.1.2 (Applied Biosystems). Only clearly identifiable bands with high peaks were scored as present or absent and weak profiles were discarded. Following Vekemans et al. (2002), AFLP fragments shorter than 50 bp, which are reputed to be highly prone to homoplasy, were not taken into account. Since no significant correlation between band size and frequency was detected (P = 0.237), homoplasy was considered negligible here. The whole AFLP procedure was replicated twice for 20 individuals and an error rate was calculated as the number of band differences between the two profiles of the same individual divided by the total number of fragments scored. The estimated error rate was 1.8%, which is lower than commonly reported with similar techniques (Bonin et al., 2004).

Genetic structure with AFLPs

At the individual level, the pattern of spatial genetic structure was explored by spatial autocorrelation methods (Vekemans & Hardy, 2004) on the 124 individuals over the continuous population. The association between pairwise relationship coefficients calculated on AFLP markers (Hardy, 2003) in SPAGeDi (Hardy & Vekemans, 2002) and the logarithm of the geographic distance was tested by Mantel tests with 9999 permutations, using FSTAT (Goudet, 1995).
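The Mantel test between relationship coefficients and log-distance can be sketched as follows. This is a minimal NumPy illustration of the permutation logic, not the FSTAT implementation used in the study; the function name `mantel_test`, the simulated coordinates and the simulated relationship coefficients are all assumptions made for the example.

```python
import numpy as np

def mantel_test(a, b, n_perm=9999, seed=0):
    """Two-sided permutation Mantel test between two symmetric matrices."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    iu = np.triu_indices_from(a, k=1)       # count each pair once
    r_obs = np.corrcoef(a[iu], b[iu])[0, 1]
    rng = np.random.default_rng(seed)
    extreme = 0
    for _ in range(n_perm):
        p = rng.permutation(a.shape[0])     # relabel individuals in one matrix
        r = np.corrcoef(a[np.ix_(p, p)][iu], b[iu])[0, 1]
        extreme += abs(r) >= abs(r_obs)
    return r_obs, (extreme + 1) / (n_perm + 1)

# Hypothetical data: 40 individuals scattered over a 500 m square.
rng = np.random.default_rng(1)
n = 40
coords = rng.uniform(0, 500, size=(n, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
# Adding the identity only changes the diagonal, so log() never sees a zero.
log_dist = np.log(dist + np.eye(n))

# Simulated relationship coefficients decaying with log-distance, plus noise.
noise = rng.normal(scale=0.1, size=(n, n))
kinship = -0.05 * log_dist + 0.02 * (noise + noise.T)

r, p_value = mantel_test(kinship, log_dist, n_perm=999)
print(f"Mantel r = {r:.3f}, P = {p_value:.3f}")
```

Permuting rows and columns together keeps each individual's set of pairwise values intact, which is why the test remains valid even though the pairwise entries are not independent.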
An autocorrelogram was then constructed to assess correlations among genotypes at increasing geographical distance intervals with balanced number of individuals. Significance of distance-class mean relationship coefficients was assessed with 999 permutations and Bonferroni corrected (Vekemans & Hardy, 2004). Given the lack of knowledge concerning the segregation of dominant AFLP markers in the autopolyploid B. laevigata (i.e. disomic vs multisomic inheritance, rate of double reduction), the genetic structure of the continuous population was investigated by a multivariate analysis (Fig. 2). Since it is a band-based rather than allele frequency-based approach, this procedure does not assume Hardy–Weinberg equilibrium and is therefore convenient to explore the genetic structure of polyploid populations with multisomic inheritance. Using ADE-4 (Thioulouse et al., 1997), PCA on the covariance matrix was computed on AFLP profiles at the individual level and then at the plot level with BPCA in order to subsequently correlate genetic data with environmental factors. Principal component analysis conserves Euclidian distances and decomposes the covariance of all descriptors (here, loci) into components for each object (here, individuals or plots) along each of the full-ranked eigenvectors derived from general singular value decomposition (Doledec & Chessel, 1987; Patterson et al., 2006). Principal component analysis thus summarizes a maximum of variance into fewer, interpretable dimensions. Thereafter, BPCA (i.e. PCA between plots based on PCA among individuals) was performed. This analysis groups individual PCA profiles into sampling plots in order to maximize the between-group genetic variance. The plot-centroids are then projected in a new reduced space, along the full-ranked BPCA eigenvectors. Significance of the between-group variance was estimated by 9999 permutations using ADE-4 (Thioulouse et al., 1997). 
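The PCA-then-BPCA workflow described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the ADE-4 routine used in the study: the function names, the simulated 0/1 band matrix and the plot-specific band frequencies are hypothetical, and the between-group inertia ratio is used here only as a quantity analogous in spirit to the between-plot variance being tested.

```python
import numpy as np

def pca_scores(x, n_axes):
    """Covariance PCA via SVD; returns individual scores on the first axes."""
    xc = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(xc, full_matrices=False)
    return u[:, :n_axes] * s[:n_axes], (s**2 / np.sum(s**2))[:n_axes]

def between_group_ratio(scores, groups):
    """Share of total inertia carried by the group centroids."""
    centred = scores - scores.mean(axis=0)
    between = sum(
        np.sum(groups == g) * np.sum(centred[groups == g].mean(axis=0) ** 2)
        for g in np.unique(groups)
    )
    return between / np.sum(centred**2)

def bpca(scores, groups):
    """PCA of size-weighted group centroids (between-group eigenanalysis)."""
    centred = scores - scores.mean(axis=0)
    labels = np.unique(groups)
    centroids = np.array([centred[groups == g].mean(axis=0) for g in labels])
    weights = np.array([np.sum(groups == g) for g in labels], dtype=float)
    weighted = centroids * np.sqrt(weights / weights.sum())[:, None]
    _, s, vt = np.linalg.svd(weighted, full_matrices=False)
    return labels, centroids @ vt.T, s**2 / np.sum(s**2)

def permutation_test(scores, groups, n_perm=9999, seed=0):
    """Shuffle plot membership to test the between-group inertia ratio."""
    rng = np.random.default_rng(seed)
    observed = between_group_ratio(scores, groups)
    null = np.array([between_group_ratio(scores, rng.permutation(groups))
                     for _ in range(n_perm)])
    return observed, (np.sum(null >= observed) + 1) / (n_perm + 1)

# Hypothetical data: 31 plots x 4 individuals, 102 dominant loci.
rng = np.random.default_rng(42)
groups = np.repeat(np.arange(31), 4)
plot_freq = rng.uniform(0.1, 0.9, size=(31, 102))   # assumed band frequencies
bands = (rng.random((124, 102)) < plot_freq[groups]).astype(float)

scores, explained = pca_scores(bands, n_axes=40)
labels, plot_scores, axis_share = bpca(scores, groups)
ratio, p_value = permutation_test(scores, groups, n_perm=499)
print(f"between-plot inertia ratio = {ratio:.3f}, P = {p_value:.3f}")
```

Because the centroids are computed on band presence/absence rather than on inferred allele frequencies, nothing in this sketch relies on a segregation model.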
The mathematical details of the BPCA can be found elsewhere (Culhane et al., 2002; Pavoine et al., 2004), but variance partitioning by BPCA is a Euclidian discriminant approach and can be safely used with any combination of plots and loci. One interesting use of BPCA is to produce a set of univariate genotypic variables for each plot (BPCA scores) that can be further analysed by univariate and/or spatial statistics in order to help the interpretation of CCA (see later). Since BPCA is based on Euclidian distances, it can be considered as analogous to F-statistics (Parisod et al., 2005). However, those estimators, named βST, are not equivalent to F-statistics and βST values may be overestimated because BPCA maximizes the between-group variance. Nevertheless, relative βST represented the genetic structure of the population well. Indeed, pairwise βST, which were computed here as the Euclidian distance between the multidimensional BPCA scores of plot centroids, were slightly inflated but highly correlated with other traditional estimators of genetic differentiation (Supplementary material, Fig. S1), such as the GST calculated on Shannon diversity (Mantel test, r = 0.81, P-value < 0.001; Bussell, 1999) and the ΦST obtained from the AMOVA implemented in the ARLEQUIN software (Mantel test, r = 0.70, P-value < 0.001; Excoffier et al., 1992).

Figure 2. Conceptual diagram of the multivariate statistical procedure used to investigate the association of individual genetic amplified fragment length polymorphism (AFLP) profiles and environmental factors. Variance partitioning by between-group eigenanalyses (BPCA) is analogous to FST and can be used to summarize most of the individual genetic information in a subspace at the plot level. Similarly, most of the environmental information is combined into an environmental subspace at the plot level. i and n represent the number of individuals and plots, respectively.
j and p are the number of AFLP loci and environmental factors, respectively. After multivariate procedures (presented along the arrows), the columns of the matrix contain independent information (scores, j′ or j″ and p′) that can be further investigated by univariate and/or spatial analyses. The canonical correspondence analysis (CCA, at the bottom) provides the framework to regress the multivariate environmental subspace on the multivariate genetic subspace and test for their association by permutations. Euclidian distances between plots' multidimensional scores (j″ and p′) can be computed in each subspace (dashed arrows) and used to investigate the influence of the geographical distance between plots with partial Mantel tests (at the top).

Grouping individuals into sampling plots, the Shannon diversity index was calculated following Bussell (1999) for each AFLP locus and averaged in each plot. The relationships between the Shannon diversity index, plot size and density, as well as ecological factors were explored by stepwise multiple linear regressions and robust regressions for each factor.

Association between the genetic and environmental heterogeneity

The ecogenetic structure of the population was explored by CCA and partial Mantel tests (Fig. 2). Both procedures were computed in parallel because CCA cannot properly deal with geographical distance (but see Borcard et al., 1992) and has been claimed to possibly raise misleading positive correlations because of shared spatial autocorrelation between genetic and ecological traits (Volis et al., 2004). Conversely, partial Mantel tests account for geography but are limited to distance-type relationships among descriptors. Canonical correspondence analysis explored the association between the AFLP data summarized by the BPCA at the plot level and the environmental factors summarized by PCA on a correlation matrix (Angers et al., 1999; Gram & Sork, 2001; Volis et al., 2004; Blows, 2007).
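As an illustration of the per-plot Shannon diversity index mentioned above, the sketch below assumes the common two-state form for dominant markers, H = -(p log2 p + q log2 q), where p is the within-plot frequency of band presence at a locus and q = 1 - p, averaged over loci. That expression, the function name `shannon_per_plot` and the toy band matrix are assumptions for the example rather than the paper's exact formulation.

```python
import numpy as np

def shannon_per_plot(bands, plots):
    """Mean per-locus Shannon index for each plot from a 0/1 band matrix."""
    bands = np.asarray(bands, dtype=float)
    plots = np.asarray(plots)
    result = {}
    for plot in np.unique(plots):
        p = bands[plots == plot].mean(axis=0)    # band-presence frequency per locus
        h = np.zeros_like(p)
        for freq in (p, 1.0 - p):                # presence and absence states
            positive = freq > 0                  # 0 * log(0) is taken as 0
            h[positive] -= freq[positive] * np.log2(freq[positive])
        result[plot.item()] = float(h.mean())    # average over loci
    return result

# Toy example: two plots of four individuals, three loci.
bands = [
    [1, 1, 1], [1, 1, 0], [0, 1, 0], [0, 1, 0],   # plot 1: frequencies 0.5, 1.0, 0.25
    [1, 0, 1], [1, 0, 1], [1, 0, 1], [1, 0, 1],   # plot 2: all loci fixed
]
plots = [1, 1, 1, 1, 2, 2, 2, 2]
diversity = shannon_per_plot(bands, plots)
print(diversity)
```

A fixed locus contributes zero and a locus at frequency 0.5 contributes the maximum of 1 bit, so the plot average stays between 0 and 1 regardless of ploidy.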
Canonical correspondence analysis integrates ordination and multiple regression, resulting here in ordination axes ordered according to the genetic variance explained by linear combination of the independent ecological variables (Angers et al., 1999). The CCA was performed with ADE-4 (Thioulouse et al., 1997). Global significance of the association was tested by 9999 Monte Carlo permutations, which make few assumptions about the data distribution. In a single analysis, CCA produced a reduced space that maximized the joint structure between the genetic data (i.e. BPCA matrix derived from the individual AFLP band occurrences and maximizing the genetic structure between plots) and the environmental data (PCA on correlation matrix of ecological factors). Therefore, CCA summarized the pattern of genetic variation that was best explained by the environment within the population (Gram & Sork, 2001) and highlighted the contribution of single ecological factors as well as AFLP loci to the CCA axes by producing an ordination diagram showing sampling plots as well as vectors presenting their AFLP composition and ecological factors acting on plots. Patterns of genetic isolation-by-distance and spatial autocorrelation with ecological factors were investigated by Mantel tests with 9999 permutations, using FSTAT (Goudet, 1995). The correlation between pairwise βST was first tested separately with pairwise Euclidian distances among plots for each ecological factor and for the mean phenological stage and the logarithm of geographic distances among plots. All the environmental factors and the logarithmic geographic distance were then added as explanatory variables in a partial Mantel test. The best model was determined through a stepwise procedure, with the least significant explanatory variable removed until all the remaining factors were significant. This was repeated several times with the order of the explanatory factors randomised to verify that the results were consistent.
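The idea behind a partial Mantel test (association between genetic and environmental distances once geography is accounted for) can be sketched as follows. This is a simplified residual-based version with a single covariate, written with NumPy; it is not the FSTAT procedure or the stepwise model selection used in the study, and the function names and all simulated values (plot coordinates, solar radiation, genetic distances) are hypothetical.

```python
import numpy as np

def _upper(m):
    """Vector of the strictly upper-triangular entries of a square matrix."""
    m = np.asarray(m, dtype=float)
    return m[np.triu_indices_from(m, k=1)]

def _residuals(y, x):
    """Residuals of a simple linear regression of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

def partial_mantel(gen, env, geo, n_perm=9999, seed=0):
    """One-sided partial Mantel test of gen vs env, controlling for geo."""
    gen = np.asarray(gen, dtype=float)
    geo_v = _upper(geo)
    env_res = _residuals(_upper(env), geo_v)

    def stat(g):
        return np.corrcoef(_residuals(_upper(g), geo_v), env_res)[0, 1]

    r_obs = stat(gen)
    rng = np.random.default_rng(seed)
    extreme = 0
    for _ in range(n_perm):
        p = rng.permutation(gen.shape[0])      # relabel plots in the genetic matrix
        extreme += stat(gen[np.ix_(p, p)]) >= r_obs
    return r_obs, (extreme + 1) / (n_perm + 1)

# Hypothetical data for 31 plots.
rng = np.random.default_rng(2)
n = 31
coords = rng.uniform(0, 500, size=(n, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
geo = np.log(dist + np.eye(n))                 # log-distance, zero on the diagonal
radiation = rng.uniform(18000, 75000, size=n)  # assumed values, kJ per day
env = np.abs(radiation[:, None] - radiation[None, :]) / 1000.0

# Simulated genetic distances driven by both geography and environment.
noise = rng.normal(scale=0.05, size=(n, n))
gen = 0.1 * geo + 0.02 * env + (noise + noise.T) / 2
np.fill_diagonal(gen, 0.0)

r, p_value = partial_mantel(gen, env, geo, n_perm=999)
print(f"partial Mantel r = {r:.3f}, P = {p_value:.3f}")
```

Regressing both distance vectors on geography before correlating them is what removes the shared spatial autocorrelation that would otherwise inflate the gene-environment association.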
The same procedure was applied to determine the best models for the pairwise distances on the first and second axis of the BPCA.

Results

Population AFLP structure

One hundred and two reliable AFLP bands were generated. A Mantel test between pairwise relationship coefficients and logarithm of the geographic distances among individuals revealed significant isolation by distance (r = 0.097, P < 0.001) and the spatial autocorrelogram further showed fine-scale genetic structure (Fig. 3). The autocorrelogram showed that geographically nearby individuals were genetically related, as only the first three distance classes presented a mean relationship coefficient significantly greater than zero (0.234 between 1 and 21.3 m; 0.047 between 21.3 and 41.1 m and 0.036 between 41.1 and 60.5 m, respectively). The first significant distance class with a negative relationship coefficient (i.e. trough) was composed of individuals located c. 200 m apart and the relationship coefficient reached zero below 100 m. Congruently, individuals located further than 400 m apart were all negatively related. Although predominantly negative relationship coefficients were significant after conservative Bonferroni correction, consecutive peaks and troughs of similar amplitude succeeded one another on the correlogram with a period of c. 100 m (Fig. 3).

Figure 3. Fine-scale spatial genetic structure in the continuous population of Biscutella laevigata located at the rear edge of the species range. Autocorrelogram presenting the mean coefficient of relationship within 20 balanced log-distance classes. The distance scale is linear and the intervals of distance classes are presented between brackets. Closed circles represent significant relationship coefficients for the distance class after Bonferroni correction; open circles represent nonsignificant values. Similarly, intervals of significant distance classes are presented in black and nonsignificant ones in grey.
The first axis of the PCA on individual AFLP profiles summarized only 6.71% of the total inertia and the first 40 PCA axes, which together represented 89.79% of the information contained in the genetic dataset, were kept for further analyses. The BPCA showed that most of the genetic variation was between plots (βST = 0.485, P < 0.001). The first two ordination axes showed roughly even contribution of most AFLP loci and summarized 10.4% and 9.1%, respectively (Fig. 4). Amplified fragment length polymorphism loci showed a broadly even contribution to the first BPCA axes, without strict covariation (i.e. most arrows had similar length and were distributed around 360°), and no particular loci appeared as mainly responsible for the observed structure. The ordination of plot centroids did not show obvious clustering and, although plots from field-patches A and B were genetically differentiated from one another, those belonging to patch C were related to both (Fig. 4).

Figure 4. Representatio
