Article Open access Peer-reviewed

Multiplexing with three-primer PCR for rapid and economical microsatellite validation

2014; BioMed Central; Volume: 151; Issue: 2-3 Language: English

10.1111/hrd2.00044

ISSN

1601-5223

Authors

Salla Vartia, Patrick C. Collins, Thomas F. Cross, Richard D. Fitzgerald, David T. Gauthier, Philip McGinnity, Luca Mirimin, Jens Carlsson

Topic(s)

Environmental DNA in Biodiversity Studies

Abstract

Hereditas, Volume 151, Issue 2-3, p. 43-54. Research. Open Access.

Salla Vartia: Carna Research Station, Ryan Institute, National University of Ireland, Galway, Carna, Connemara, Co. Galway, Ireland; Area 52 Research Group, School of Biology & Environment Science, University College Dublin, Belfield, Dublin, Ireland
Patrick C. Collins: Area 52 Research Group, School of Biology & Environment Science, University College Dublin, Belfield, Dublin, Ireland
Thomas F. Cross: Beaufort Fish Genetics Programme, School of Biological, Earth and Environmental Sciences/Aquaculture and Fisheries Development Centre, University College Cork, Distillery Fields, North Mall, Cork, Ireland
Richard D. Fitzgerald: Carna Research Station, Ryan Institute, National University of Ireland, Galway, Carna, Connemara, Co. Galway, Ireland
David T. Gauthier: Department of Biological Sciences, Old Dominion University, Norfolk, VA, USA
Philip McGinnity: Beaufort Fish Genetics Programme, School of Biological, Earth and Environmental Sciences/Aquaculture and Fisheries Development Centre, University College Cork, Distillery Fields, North Mall, Cork, Ireland
Luca Mirimin: Carna Research Station, Ryan Institute, National University of Ireland, Galway, Carna, Connemara, Co. Galway, Ireland; Molecular Breeding and Biodiversity Lab, Department of Genetics, Stellenbosch University, Stellenbosch, Western Cape, South Africa
Jens Carlsson: Area 52 Research Group, School of Biology & Environment Science, University College Dublin, Belfield, Dublin, Ireland; Beaufort Fish Genetics Programme, School of Biological, Earth and Environmental Sciences/Aquaculture and Fisheries Development Centre, University College Cork, Distillery Fields, North Mall, Cork, Ireland

First published: 07 July 2014
https://doi.org/10.1111/hrd2.00044

Abstract

The next generation sequencing revolution has enabled rapid discovery of genetic markers; however, development of fully functioning new markers still requires a long and costly process of marker validation.
This study reports a rapid and economical approach for the validation and deployment of polymorphic microsatellite markers obtained from a 454 pyrosequencing library of Atlantic cod, Gadus morhua, Linnaeus 1758. Primers were designed from raw reads to amplify specific amplicon size ranges, allowing effective PCR multiplexing. Multiplexing was combined with a three-primer PCR approach using four universal tails to label amplicons with separate fluorochromes. A total of 192 primer pairs were tested, resulting in 73 polymorphic markers. Of these, 55 loci were combined in six multiplex panels each containing between six and eleven markers. Variability of the loci was assessed on G. morhua from the Celtic Sea (n = 46) and the Scotian Shelf (n = 46), two locations that have shown genetic differentiation in previous studies. Multilocus FST between the two samples was estimated at 0.067 (P = 0.001). After three loci potentially under selection were excluded, the global FST was estimated at 0.043 (P = 0.001). Our technique combines three-primer and multiplex PCR techniques, allowing simultaneous screening and validation of relatively large numbers of microsatellite loci.

Thirty years after their discovery in the 1980s, microsatellite-based genetic markers are still extensively used in studies of population structure, parentage analysis, genetic mapping, evolutionary processes and forensics (Bruford and Wayne 1993; Brockmann et al. 1994; Knapik et al. 1998; Goldstein et al. 1999; Primmer et al. 2000). These markers have a wide application due to high allelic diversity and co-dominance of alleles (Chambers and MacAvoy 2000). Many methodologies have been established to discover such markers, but only since the recent advent of next generation sequencing (NGS) technology can large numbers of markers be developed rapidly and economically from non-model organisms.
NGS approaches allow the fast discovery of large numbers of microsatellite-containing sequences; however, mining such data for suitable DNA fragments and validating candidate markers still pose challenges before fully operating new markers can be used.

The most common approach to date for de novo microsatellite marker development includes creation of repeat-enriched DNA libraries, fragment replication by cloning, and Sanger sequencing of clones containing potential microsatellites (Zane et al. 2002). These processes are laborious and time consuming, and typically have low marker yield, with the percentage of positive clones averaging 2–3% (Ashworth et al. 2004). The final marker yield is even lower, with a large portion of markers discarded during the isolation–characterisation process (Squirrell et al. 2003). Alternatively, microsatellite-containing sequences can be mined from existing molecular data such as genomic DNA or expressed sequence tag (EST) sequences (Li et al. 2004). These approaches are limited by the paucity of data on non-model organisms. EST-linked microsatellites can be relatively easy to identify but have a higher probability of being affected by selective processes, and hence may not be suitable for population analyses that assume that loci are selectively neutral (Ellis and Burke 2007).

Microsatellite markers developed for one species may also be applied to closely related species (Schlötterer et al. 1991). However, this approach is limited by varying levels of successful cross-species amplification (Moore et al. 1991). Even when cross-species amplification is successful, levels of variability tend to be lower than in the species for which the markers were developed (Primmer et al. 1996). Because of these limitations, it may be preferable to develop markers de novo for a species or population of interest to ensure optimal power of newly discovered markers (Carlsson et al. 2013).
Recently, several approaches have been presented for discovery of microsatellites using NGS-generated data (Abdelkrim et al. 2009; Allentoft et al. 2009). Large amounts of sequence data can be generated from either genomic DNA or microsatellite enriched libraries (Guichoux et al. 2011; Malausa et al. 2011) and then mined for microsatellite loci. With this approach, thousands of potential loci can be rapidly identified (Guichoux et al. 2011). Large-scale microsatellite identification has several advantages, including the ability to rigorously screen individual loci for presence of optimal primer-binding sites in flanking regions (Guichoux et al. 2011; Zalapa et al. 2012; Fernandez-Silva et al. 2013). Additionally, deployment criteria (e.g. higher levels of variability, neutrality and low linkage) are study-specific and cannot be assessed until the markers have been validated (Selkoe and Toonen 2006). Validation of a large number of markers enables selection of the most suitable loci (Selkoe and Toonen 2006). Even with the NGS approach, validation is labour intensive and a limiting bottleneck in microsatellite marker development (Squirrell et al. 2003; Malausa et al. 2011; Fernandez-Silva et al. 2013).

Typically, methods for validation and genotyping of microsatellites involve capillary gel electrophoresis with fluorescence-based amplicon detection (Edwards et al. 1991; Guichoux et al. 2011). The three-primer PCR method can be used to reduce the expense associated with fluorescently labelled primers (sensu Schuelke 2000; Diniz et al. 2007; Rubin et al. 2009). In three-primer PCR, the primers comprise an unlabelled forward primer with a universal tail attached to its 5′ end, a labelled universal primer matching the tail sequence and an unlabelled reverse primer (Steffens et al. 1993; Oetting et al. 1995; Neilan et al. 1997; Schuelke 2000).
The labelled universal primer can be used in combination with any appropriately tailed forward primer, thereby eliminating the need to synthesize a fluorescently labelled forward primer for every unique locus during the validation phase, in which a large proportion of loci may be excluded because of problems with amplification.

Following initial identification and PCR optimisation of successful markers, sets of primers are usually labelled with a fluorescent label on either the forward or the reverse primer (Guichoux et al. 2011). Markers are then amplified in single PCR reactions or combined into a multiplex PCR containing multiple markers (Guichoux et al. 2011). An improvement of the traditional multiplex PCR technique (Missiaggia and Grattapaglia 2006) employed human microsatellite primer sequences as universal tails and combined three universal tails with three dyes in a true multiplex PCR (sensu Guichoux et al. 2011). However, despite the obvious cost benefits of the three-primer PCR approach in combination with multiplexing, few studies have employed the method (Langen et al. 2011; Blacket et al. 2012). This may be attributable to poor amplification or poor quality chromatograms, resulting in difficulty in accurate genotyping of individuals (Hagell et al. 2013).

Here we present the development and application of a method for rapid validation and genotyping of novel microsatellites in Atlantic cod, Gadus morhua, Linnaeus 1758, using the three-primer approach with multiplex PCR. The main aim of this study is the fast and economic development and deployment of microsatellite multiplexes from raw NGS data, applicable to studies on a wide range of organisms.

MATERIAL AND METHODS

Sampling

Gadus morhua were obtained by trawling in 2009 and 2011 from the Celtic Sea, south of Ireland (n = 7, n = 46, respectively) and in 1996 from the Scotian Shelf, off Nova Scotia in eastern Canada (n = 46).
Previous research has shown that these two populations are genetically differentiated (Hutchinson et al. 2001; O'Leary et al. 2007). Fin clip samples were preserved in 100% ethanol.

DNA extraction

DNA was extracted from fin clips using a Chelex protocol as described in Mirimin et al. (2011). DNA from the Scotian Shelf samples was extracted using a standard phenol–chloroform method (O'Leary et al. 2007). DNA was quantified using a NanoDrop 1000 Spectrophotometer (Thermo Scientific) and normalised to a concentration of 50 ng μl⁻¹.

Microsatellite selection

The unpublished sequence data used here were generated for a previous study (Carlsson et al. 2013), in which microsatellite-containing sequences were obtained from five of the 2009 Celtic Sea individuals also used in the present study, using 454 pyrosequencing of a reduced representation library. Carlsson et al. (2013) identified a total of 11 341 microsatellite-containing sequences as suitable for primer design using the Primer3 plug-in (Rozen and Skaletsky 2000) for MISA ver. 1.0. Of these, 6424 were estimated to be unique. These microsatellite-containing sequences were used in the present study. To avoid excessive homoplasy (alleles identical in state but not in descent, cf. Estoup et al. 2002) and to ensure ease of genotyping, complex repeat motifs (i.e. compound and imperfect motifs) were excluded. To ensure sufficient space for primer design, reads that had less than 50 bp of sequence before and after the repeat-containing region were removed. In addition, to avoid excessively large allele size ranges, repeat sequences of more than 100 bp, and penta- and hexanucleotide repeats, were excluded. A subsample of the remaining microsatellite sequences (n = 1309) was visually inspected for primer design.

Primer design

Primers were designed using Primer3Plus (Rozen and Skaletsky 2000; Untergasser et al. 2007) with an optimal primer length of 20 bp and an optimal Tm of 60°C.
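The read-level filters listed under Microsatellite selection can be expressed as a single predicate. The following is a minimal sketch, not the authors' pipeline: the `Candidate` record, its field names and the coordinate convention are hypothetical, and both flanks are assumed to need at least 50 bp. Mononucleotide repeats are not addressed in the text, so the sketch only caps motif length at four.

```python
from dataclasses import dataclass

# Thresholds taken from the text; the constant names are illustrative.
MIN_FLANK_BP = 50       # minimum sequence on each side of the repeat
MAX_REPEAT_BP = 100     # repeats longer than this were excluded
MAX_MOTIF_LEN = 4       # penta- and hexanucleotide motifs were excluded


@dataclass
class Candidate:
    """Hypothetical record for one microsatellite-containing read."""
    read_id: str
    motif_length: int    # 2 = di, 3 = tri, 4 = tetra, ...
    repeat_start: int    # 0-based start of the repeat region in the read
    repeat_end: int      # end of the repeat region (exclusive)
    read_length: int
    is_perfect: bool     # False for compound or imperfect motifs


def passes_selection(c: Candidate) -> bool:
    """Return True if the read survives all four filters described in the text."""
    left_flank = c.repeat_start
    right_flank = c.read_length - c.repeat_end
    repeat_bp = c.repeat_end - c.repeat_start
    return (
        c.is_perfect
        and left_flank >= MIN_FLANK_BP
        and right_flank >= MIN_FLANK_BP
        and repeat_bp <= MAX_REPEAT_BP
        and c.motif_length <= MAX_MOTIF_LEN
    )
```

Reads passing such a filter would still need the visual inspection described above before primer design.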
Two sets of three size classes were used: the first set of size classes was separated by 30 bp (100–150, 180–250, 280–450 bp), and the second set was separated by 50 bp (100–150, 200–250, 300–450 bp). Equal numbers of markers were designed for each size class. Only primer pairs with a Tm difference of less than 1°C were accepted in order to facilitate PCR multiplexing. Designed primers were cross-referenced with the original sequence data set to identify primers that annealed to multiple regions (not unique) or originated from redundant sequences (different reads of the same sequence). Redundant sequences not detected in the previous steps (due to sequencing error in the primer regions) were identified by performing a de novo assembly of the remaining candidate loci sequences using the CAP3 plug-in of Geneious ver. 6.1.5 (created by Biomatters) with default settings (i.e. min overlap length = 40 bp, min overlap identity = 90%; Huang and Madan 1999). If two or more reads assembled together they were considered redundant and only one of them was kept for future analysis.

To minimise the risk that primer sequences were derived from contaminants, expressed G. morhua gene regions or previously published G. morhua microsatellites, microsatellite-containing sequences and primers were subjected to a BLAST search in the GenBank nucleotide database (Altschul et al. 1990). For possible contaminants, we considered a match with ≥ 95% coverage and 100% identity as a threshold for excluding reads. No threshold was employed for matches on G. morhua sequences. When such matches were encountered, primers were excluded from further analyses. In addition, validated primer sequences were subjected to BLAST searches against the G. morhua genome (Star et al. 2011) in the whole-genome shotgun contigs database in GenBank.

Universal primers

The universal dye-labelled primers used were T3: PET-5′ AATTAACCCTCACTAAAGGG 3′, M13 Reverse: NED-5′ GGATAACAATTTCACACAGG 3′ (Diniz et al. 2007), Hill: 6FAM-5′ TGACCGGCAGCAAAATTG 3′ (Tozaki et al. 2001) and Neomycin rev: VIC-5′ AGGTGAGATGACAGGAGATC 3′. Each forward primer had one of the above universal primer sequences added to its 5′ end. PIG-tails were added to the 5′ end of all the reverse primers. PIG-tailing leads to the addition of a non-templated adenosine nucleotide to the 3′ end of nearly 100% of PCR products, which reduces stutter caused by random addition of dATP (Brownstein et al. 1996). The tails were matched with the primers using OligoAnalyzer ver. 3.1 to minimise secondary structures. Equal numbers of primers were paired with each of the four different universal primers.

Microsatellite validation

Primers were combined into twelve multiplex PCR reactions containing 12 markers each (12-plex) and validated using all seven 2009 Celtic Sea individuals. Loci that amplified successfully and showed polymorphism were combined into further multiplexes. Multiplexes were constructed by successively adding and removing loci from sets of markers that had amplified together in the initial test panels until at least six loci were successfully combined in a panel. When amplified loci were monomorphic, the procedure was repeated on seven Scotian Shelf samples to assess whether they were also monomorphic in these individuals. Multiplex PCRs were performed in 5 μl reactions with 50 ng template DNA, 1 × Multiplex PCR Master Mix (Qiagen), 0.2 μM of each reverse primer, 0.05 μM of each unlabelled forward primer (modified with the appropriate universal tail) and 0.2 μM of labelled universal primer for each forward primer carrying the matching universal tail. Further adjustments made to optimize concentrations of primers in the PCR reactions are given in Table 1. PCR thermal cycling conditions were as follows: 1 × 95°C (15 min); 30 × 94°C (30 s), 60°C (90 s), 72°C (60 s); 8 × 94°C (30 s), 53°C (90 s), 72°C (60 s); 1 × 60°C (30 min).
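The primer construction described under Universal primers amounts to simple string concatenation. The sketch below illustrates it with the four tail sequences given in the text. The function names and the short tail keys are hypothetical, and the PIG-tail sequence `GTTTCTT` is an assumption read from the shared 5′ prefix of the reverse primers in Table 1 (the text does not spell it out).

```python
# Universal tails (5'->3') and their dyes, as listed in the text.
# The short keys follow the locus-name suffixes used in Table 1.
UNIVERSAL_TAILS = {
    "T3": ("PET", "AATTAACCCTCACTAAAGGG"),
    "M13": ("NED", "GGATAACAATTTCACACAGG"),
    "Hill": ("6FAM", "TGACCGGCAGCAAAATTG"),
    "Neo": ("VIC", "AGGTGAGATGACAGGAGATC"),
}

# Assumption: PIG-tail inferred from the reverse primers in Table 1.
PIG_TAIL = "GTTTCTT"


def tailed_forward(tail_name: str, locus_forward: str) -> str:
    """Prepend the chosen universal tail to the 5' end of a locus-specific forward primer."""
    _dye, tail = UNIVERSAL_TAILS[tail_name]
    return tail + locus_forward.upper()


def pigtailed_reverse(locus_reverse: str) -> str:
    """Prepend the PIG-tail to the 5' end of a locus-specific reverse primer."""
    return PIG_TAIL + locus_reverse.upper()


def primer_set(tail_name: str, locus_forward: str, locus_reverse: str) -> dict:
    """Return the three oligos needed for one locus in a three-primer PCR."""
    dye, tail = UNIVERSAL_TAILS[tail_name]
    return {
        "forward_unlabelled": tailed_forward(tail_name, locus_forward),
        "reverse_unlabelled": pigtailed_reverse(locus_reverse),
        "universal_labelled": f"{dye}-{tail}",
    }
```

For example, `primer_set("T3", "ATCTCGAGTGGCGCAGTAG", "GCACGCTGACAAGTGAAGAG")` reproduces the forward and reverse oligos listed for locus A08_T3 in Table 1. Only the four labelled universal primers need to carry a dye, regardless of how many loci are screened.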
No-template controls were included to monitor for potential contamination.

Table 1. Six multiplex panels for amplifying 55 loci. Underlined sequence in the forward primer signifies the universal tail sequence, whereas underlined sequence in the reverse primer signifies the PIG-tail sequence. Adjustment to primer concentration applies for all three primers used to amplify a locus (forward, reverse and universal primer). SRA accession number for the sequences is SRP041380.

| Multiplex | Name | Motif | Size range | Adjustment to primer concentration | Forward primer | Reverse primer |
|---|---|---|---|---|---|---|
| 1 | A08_T3 | tetra | 163–187 | | AATTAACCCTCACTAAAGGGATCTCGAGTGGCGCAGTAG | GTTTCTTGCACGCTGACAAGTGAAGAG |
| 1 | A16_M13 | tetra | 156–194 | | GGATAACAATTTCACACAGGCGTTCCCTCAGCTTGTCTCT | GTTTCTTCGTCCAACAGATCTATGCAATC |
| 1 | A43_T3 | tetra | 309–369 | | AATTAACCCTCACTAAAGGGCCACTTTAACCTGCGGTTTC | GTTTCTTGCCTGCTTGTATACGCTGA |
| 1 | B19_T3 | tri | 242–320 | | AATTAACCCTCACTAAAGGGAAAAGGATCTGCTTGCCTCA | GTTTCTTCGTGAGCTCAGTTTTGGCTA |
| 1 | B38_Neo | tri | 388–415 | | AGGTGAGATGACAGGAGATCGAATTGAGGAGGCATGGGTA | GTTTCTTGGTTAATTCCAGCCGTAGAGG |
| 1 | C15_Hill | tetra | 169–205 | | TGACCGGCAGCAAAATTGCCTTTCGTTCTCTCCGTCAG | GTTTCTTGAGGATTTGGTGGGATGAT |
| 1 | C28_Neo | tri | 255–282 | | AGGTGAGATGACAGGAGATCCAGCACAAGTGGTAGGGTCA | GTTTCTTGCGATCAGAAGTTGTGCTT |
| 1 | C36_Hill | tri | 319–352 | | TGACCGGCAGCAAAATTGGTTGGCTCACACAATCATCG | GTTTCTTATCCTTCAAACAGCCCTCAA |
| 1 | C40_M13 | tri | 276–277 | | GGATAACAATTTCACACAGGGGTCTTTGGGAGGTCTTCCT | GTTTCTTCATCCTGCTTGCGGACTTAT |
| 1 | D14_Hill | tetra | 228–276 | | TGACCGGCAGCAAAATTGGGCGATAATCTGCCATTTTG | GTTTCTTCGTGACAAGTGTCGATTTGC |
| 1 | D30_M13 | tetra | 317–333 | | GGATAACAATTTCACACAGGTTCACAAACGGGAACTACGA | GTTTCTTGACAAGTCCAAGATGTGTCATCA |
| 2 | A18_M13 | tetra | 203–243 | | GGATAACAATTTCACACAGGGACTGTCCGTTGAGGGTGTT | GTTTCTTGGGTCGAACTGGTCTGGTTA |
| 2 | A19_T3 | tetra | 242–274 | | AATTAACCCTCACTAAAGGGTCCTGGTTCCAACACATGAC | GTTTCTTCTAGCCAATGGTGCAAGT |
| 2 | A34_M13 | tetra | 297–321 | | GGATAACAATTTCACACAGGTCCTTAACGACAGGCACCTT | GTTTCTTCCTGACTTGTGTCGTTCCAG |
| 2 | A37_Neo | tetra | 265–489 | | AGGTGAGATGACAGGAGATCCCGGCAGTACAGCTAATGAA | GTTTCTTAAATGCTCAACCCATTGGAC |
| 2 | B03_T3 | di | 147–161 | | AATTAACCCTCACTAAAGGGCGATAATAGCGTTCCCATCC | GTTTCTTGGGGTACCTTGTGACCTGT |
| 2 | B12_T3 | di | 346–398 | 2x | AATTAACCCTCACTAAAGGGGCTTTGGCAACACTGTTTGA | GTTTCTTGTCGAGCAGACCAGAAGACC |
| 2 | B30_Neo | tri | 151–169 | | AGGTGAGATGACAGGAGATCTTGTACGGACAGGAAGTCCA | GTTTCTTGCGAACAGTGTGTAAATTGAA |
| 2 | C01_M13 | tetra | 115–195 | | GGATAACAATTTCACACAGGACCAGGAGGTTGGATCAGTG | GTTTCTTCCATTATTCATTCGTCATCCA |
| 2 | C14_Hill | tetra | 146–166 | | TGACCGGCAGCAAAATTGAGGTTCAGCCAGAAGCTGAT | GTTTCTTCCATTGGTTGTCGGTGATTA |
| 2 | C20_Neo | tri | 227–239 | | AGGTGAGATGACAGGAGATCCCGCCTATCACCCTAAATCTG | GTTTCTTCGTCTACATGTCGTGGTAGGG |
| 2 | D12_Hill | tetra | 251–291 | | TGACCGGCAGCAAAATTGAACGGCTCCTCAAGACAAAC | GTTTCTTAGGCATCTGCGTCCATACTC |
| 3 | A33_M13 | tetra | 233–297 | | GGATAACAATTTCACACAGGAGACACTGAGCTCGACAGCA | GTTTCTTATCAATGATCCCAGGCAAAC |
| 3 | A39_Hill | tetra | 356–416 | | TGACCGGCAGCAAAATTGCCTGTCCAAATGCACACAAG | GTTTCTTAGTGCTTGGATGGTGTGATG |
| 3 | B01_Hill | di | 129–213 | | TGACCGGCAGCAAAATTGTAGACTCTGGGGCTGGGTAA | GTTTCTTGCATCCGAGACTCTTGTTCC |
| 3 | B15_T3 | di | 170–178 | | AATTAACCCTCACTAAAGGGCGATGCGATTCTTGGTAAATG | GTTTCTTGGTGTCTCATCCCTCTTCA |
| 3 | B29_Neo | tri | 262–281 | | AGGTGAGATGACAGGAGATCGGGAAAGAGCCGGAAAAGTA | GTTTCTTGCTAATGTTGGCAGAACCA |
| 3 | C13_Neo | tri | 158–176 | | AGGTGAGATGACAGGAGATCGGGTGATTGAGGTTGCGATA | GTTTCTTGCTCACACATCCTACGAGCA |
| 3 | C17_M13 | tetra | 154–190 | | GGATAACAATTTCACACAGGCTTCTCGATGGCATGTTTCC | GTTTCTTCCTGCACAATGATCTGCAT |
| 3 | C22_Hill | tetra | 225–273 | | TGACCGGCAGCAAAATTGGGCTTGTCGTTGGTTCCTT | GTTTCTTGTTGAATGCAACCCCTCAGT |
| 3 | C30_Neo | tri | 337–343 | | AGGTGAGATGACAGGAGATCAGGTGGTCGCAGTGAAGAAG | GTTTCTTGGGTGAATGCCTCTTAATCG |
| 3 | C42_M13 | tri | 345–358 | | GGATAACAATTTCACACAGGGCTGAGGGGATGCGATAATA | GTTTCTTAGCCAAGGGTGAAGTGTTGT |
| 3 | D37_T3 | tetra | 313–333 | | AATTAACCCTCACTAAAGGGATGTGACACCGAATCACAGC | GTTTCTTACCCGTCCTGTACGTGAACT |
| 4 | A11_Hill | tetra | 163–175 | | TGACCGGCAGCAAAATTGCGACAGGGAGGCATAAAGAC | GTTTCTTGTTCACCTCCCTGGCTCTT |
| 4 | A22_Neo | tetra | 214–243 | | AGGTGAGATGACAGGAGATCGGTGAGGTTCTTGAGGGTCA | GTTTCTTGATTATTTCCCCCTGCTG |
| 4 | A31_T3 | tetra | 314–386 | | AATTAACCCTCACTAAAGGGGGATATGTGGGGATGAGCAC | GTTTCTTATGGGTCCTTCTCCTTTGGT |
| 4 | B33_M13 | tri | 120–135 | 0.5x | GGATAACAATTTCACACAGGCTACAGCAGGGGTTCCTCAG | GTTTCTTGTTTGTTGCTCCGATGGACT |
| 4 | C08_T3 | tetra | 145–185 | | AATTAACCCTCACTAAAGGGCTCGGACCCAGAGATCAAAA | GTTTCTTGCAGCATCTGAACTGAAACG |
| 4 | D15_Hill | tetra | 244–256 | | TGACCGGCAGCAAAATTGTGACTCAACGGAGGTACGTG | GTTTCTTCCATCAGGATCAGGACCACT |
| 4 | D21_Neo | tetra | 337–389 | | AGGTGAGATGACAGGAGATCAACACGCTTGCTGGGACTAC | GTTTCTTCACTGGAGTGTACGGTCTCTGA |
| 4 | D46_Neo | tetra | 139–155 | | AGGTGAGATGACAGGAGATCCCTCCCTAATACCATGTCACCA | GTTTCTTCGTCTGTTCACGGATGCAC |
| 5 | A04_Neo | tetra | 143–211 | | AGGTGAGATGACAGGAGATCACAATCAACCCTCCAACTCG | GTTTCTTCAGGTCCCGAATATCAAGG |
| 5 | B07_M13 | di | 163–187 | | GGATAACAATTTCACACAGGTGGACAATTACATTGAAAATCACAG | GTTTCTTCCTGAACTGCCTGTCAATG |
| 5 | B28_Neo | tri | 226–238 | | AGGTGAGATGACAGGAGATCCCCACCCCTTAATGTTTCAA | GTTTCTTGGCGTCATTCTCTTTGATG |
| 5 | D10_Neo | tetra | 308–312 | 2x | AGGTGAGATGACAGGAGATCCGCCAATGCAAATCTCTTTT | GTTTCTTATCTGAGTGCGTCGAGTGC |
| 5 | D35_M13 | tetra | 316–360 | | GGATAACAATTTCACACAGGTCCACACTTGGTCGATGAAA | GTTTCTTGACGAGTGTCAGCAGGTGTG |
| 5 | D43_T3 | tetra | 260–292 | 2x | AATTAACCCTCACTAAAGGGGTGCCGCTCACGCTACTAAT | GTTTCTTCGTGATCGCTCTCGATTC |
| 6 | A03_Hill | tetra | 158–170 | | TGACCGGCAGCAAAATTGGAGCGTGTTGAACGACTTGA | GTTTCTTCCTGAGCAGTGGAGTGACAA |
| 6 | A20_Neo | tetra | 164–182 | | AGGTGAGATGACAGGAGATCCGAGGCTACACAGCCTGTAA | GTTTCTTACTGTGGGCATGTAACAGCA |
| 6 | A30_T3 | tetra | 245–269 | | AATTAACCCTCACTAAAGGGAGTTGAACTGCGGGTTCTGT | GTTTCTTGCACGATGTCACAGCTGATT |
| 6 | B36_Hill | tri | 384–396 | | TGACCGGCAGCAAAATTGCCCGCCCAGACATAATAAGA | GTTTCTTCAGTCTCAGCCTCCACATCA |
| 6 | C31_Neo | tri | 332–341 | | AGGTGAGATGACAGGAGATCGCCAAGACAAGCATTTCCAT | GTTTCTTCGAGCCAGCGTTTACTTCTC |
| 6 | C35_M13 | tri | 233–348 | | GGATAACAATTTCACACAGGGGCAATGTCGTACACCTCAA | GTTTCTTCTGAACGGCAACACTTCGTA |
| 6 | D05_Hill | tetra | 230–270 | | TGACCGGCAGCAAAATTGACTGCCCCTGATAACAATGC | GTTTCTTAGGCATCGACCATTTGTAGC |
| 6 | D16_T3 | tetra | 424–436 | | AATTAACCCTCACTAAAGGGCCAGCAGCTTTCTGGGTAGT | GTTTCGAAGCGTTACTGCAGACAG |

A total of 1 μl of the multiplex PCR product was added to 9 μl of Super-DI Formamide (MCLAB) with 0.01 μl of Orange DNA Size Standard (MCLAB) and run on an ABI 3130xl Genetic Analyzer according to the manufacturer's recommendations. GeneMarker ver. 1.97 was used for fragment length analysis.

Microsatellite genotyping

All 46 Celtic Sea and 46 Scotian Shelf samples were genotyped with multiplex panels.
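Because loci within one panel are distinguished by dye and fragment size, it can be useful to list pairs of loci that share a universal tail (and therefore a dye) and have overlapping size ranges. The sketch below is an illustrative check, not a step reported by the authors. The function name and tuple layout are hypothetical, and the example values are the size ranges given for multiplex 5 in Table 1.

```python
from itertools import combinations


def same_dye_overlaps(panel):
    """Return pairs of loci that share a universal tail and overlap in size range.

    `panel` is a list of (name, tail, min_size_bp, max_size_bp) tuples.
    """
    hits = []
    for (n1, t1, lo1, hi1), (n2, t2, lo2, hi2) in combinations(panel, 2):
        # Two closed intervals overlap when each starts before the other ends.
        if t1 == t2 and lo1 <= hi2 and lo2 <= hi1:
            hits.append((n1, n2))
    return hits


# Size ranges for multiplex 5, copied from Table 1.
panel_5 = [
    ("A04_Neo", "Neo", 143, 211),
    ("B07_M13", "M13", 163, 187),
    ("B28_Neo", "Neo", 226, 238),
    ("D10_Neo", "Neo", 308, 312),
    ("D35_M13", "M13", 316, 360),
    ("D43_T3", "T3", 260, 292),
]

overlaps_5 = same_dye_overlaps(panel_5)
```

For multiplex 5 the list is empty. Some same-tail loci elsewhere in Table 1 do have overlapping observed ranges (for example A43_T3 at 309–369 and B19_T3 at 242–320 in multiplex 1), so a flagged pair is best read as a prompt to inspect the chromatograms rather than as an error.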
Genotype data were inspected with Micro-Checker ver. 2.2.3 for genotyping errors and presence of null alleles (van Oosterhout et al. 2004) using default settings. The 99% confidence interval was used when checking for null alleles to avoid false positives resulting from multiple tests. MSAnalyser ver. 4.05 (Dieringer and Schlötterer 2003), using default settings, was used to assess the number of alleles, allelic richness, allele size ranges, FST estimates and expected and observed heterozygosity. Data were analysed for possible departure from Hardy–Weinberg equilibrium, linkage disequilibrium, and excess and deficit of heterozygotes using Genepop ver. 4.2 with default settings (Raymond and Rousset 1995; Rousset 2008). False discovery rate (FDR) was used to correct for multiple comparisons (Benjamini and Yekutieli 2001) with initial α = 0.05. Lositan (Antao et al. 2008) was used to detect loci that could be under positive or balancing selection (the settings '"Neutral" mean FST' and 'Force mean FST' were used with 10 000 simulations under both the infinite allele model and the stepwise mutation model).

FST replicate sampling

The current study purposefully aimed to validate more markers than required for accurate evaluation of population differentiation (i.e. multilocus FST). To estimate the number of microsatellite loci that future studies on cod population structure may require, we investigated how many markers were needed to accurately estimate multilocus FST. Data sets were generated by randomly drawing 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 loci from the 55-locus dataset, with each condition (number of loci) replicated ten times. The average FST and 95% confidence interval of the ten replicates were calculated and plotted to visualise the variability of average FST estimates as a function of the number of markers (Fig. 1a–b).

Figure 1a–b. FST replicate sampling with all 55 loci (a) and with outliers excluded (b). 95% confidence interval is displayed.
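The replicate sampling scheme described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' script: the multilocus estimate is taken as a ratio of summed per-locus numerator and denominator components, the confidence interval is a Student t interval on the ten replicate estimates, and all names are hypothetical. The text does not specify how the authors computed either quantity.

```python
import random
import statistics

SUBSET_SIZES = (5, 10, 15, 20, 25, 30, 35, 40, 45, 50)
N_REPLICATES = 10
# Two-sided 95% Student t critical value for 9 degrees of freedom,
# i.e. valid only for the ten replicates used here.
T_CRIT_95_DF9 = 2.262


def multilocus_fst(loci):
    """Assumed estimator: ratio of summed per-locus (numerator, denominator) components."""
    numerator = sum(num for num, _den in loci)
    denominator = sum(den for _num, den in loci)
    return numerator / denominator


def replicate_fst(loci, seed=0):
    """Draw random locus subsets and summarise multilocus FST per subset size.

    Returns {subset_size: (mean, ci_low, ci_high)}.
    """
    rng = random.Random(seed)
    summary = {}
    for k in SUBSET_SIZES:
        # rng.sample draws k loci without replacement for each replicate.
        estimates = [multilocus_fst(rng.sample(loci, k)) for _ in range(N_REPLICATES)]
        mean = statistics.mean(estimates)
        half_width = T_CRIT_95_DF9 * statistics.stdev(estimates) / N_REPLICATES ** 0.5
        summary[k] = (mean, mean - half_width, mean + half_width)
    return summary
```

Calling `replicate_fst` on a list of 55 per-locus component pairs returns one mean and interval per subset size, which is the kind of summary plotted in Fig. 1. Narrowing intervals with increasing subset size would indicate how many loci suffice for a stable estimate.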
RESULTS

Primer design

Of 1309 candidate microsatellite loci, 559 were determined to be suitable for primer design upon visual inspection. A total of 349 primer pairs were rejected based on Tm difference, sequence redundancy or secondary structure with the 5′ tail. Another 18 were excluded due to a BLAST match (one match to a Gadus morhua microsatellite, nine to a G. morhua gene and eight to a possible contaminant). The remaining 192 primer pairs were chosen for validation. Of those, 51 failed to amplify and 45 were excluded due to low scorability. Unambiguous amplification of PCR products in the expected size range was successful in 96 of the 192 markers tested (50%), of which 73 showed polymorphism (38%). The 73 polymorphic loci were used to build multiplex panels. Of these, 13 were not included in the final multiplexes because of incompatible size, associated fluorochrome or failure to amplify with the other markers in a panel. As a result, 60 markers were combined into six multiplex panels ranging between eight and twelve loci. Five markers were not used in the final analysis due to ambiguous genotyping leading to a high chance of scoring errors, resulting in a final panel of 55 polymorphic markers combined into six multiplexes (Table 1). The results of the BLAST search on the validated primers against the G. morhua genome are presented in the Supplementary material Appendix 1 Table A1.

Application of markers to test populations

The mean allelic richness (RS) was 7.1 (SD = 4.11) in the Celtic Sea sample and 7.2 (SD = 4.24) in the Scotian Shelf sample. The minimum number of alleles was two for both the Celtic Sea and Scotian Shelf samples; the maximum number of alleles was 21 and 25, respectively (Supplementary material Appendix 1 Table A2). Micro-Checker analyses indicated no genotyping errors. However, ten loci had a different repeat pattern than the motif originally identified from the raw sequence (Supplementary material Appendix 1 Table A3).
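Several tests in this section are reported after the FDR correction named in the Methods (Benjamini and Yekutieli 2001, initial α = 0.05). As a minimal sketch, assuming the standard step-up procedure from that reference, the correction can be written as follows. The function name and example p-values are hypothetical, and the statistical packages used in the study may implement the details differently.

```python
def benjamini_yekutieli(p_values, alpha=0.05):
    """Return a list of booleans marking which tests are rejected under BY FDR control.

    Step-up rule: find the largest rank k with p_(k) <= k * alpha / (m * c(m)),
    where c(m) = sum_{i=1..m} 1/i, and reject the k smallest p-values.
    """
    m = len(p_values)
    if m == 0:
        return []
    c_m = sum(1.0 / i for i in range(1, m + 1))
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c_m):
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for four tests.
example = benjamini_yekutieli([0.001, 0.04, 0.03, 0.5])
```

With four tests the harmonic factor c(m) is about 2.08, so the threshold for the smallest p-value is roughly 0.006 and only the first test in this example is rejected. The correction is more conservative than the Benjamini–Hochberg procedure because of that extra factor.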
Null alleles were observed in 13 loci in the Celtic Sea sample and 11 in the Scotian Shelf sample (Supplementary material Appendix 1 Table A4). Twelve and eleven loci deviated significantly from Hardy–Weinberg equilibrium (after FDR correction) in the Celtic Sea and Scotian Shelf samples, respectively. Linkage disequilibrium was observed (after FDR correction) in locus pair A43_T3 x C01_M13 in the Celtic Sea sample, and in locus pairs A43_T3 x B19_T3 and C15_Hill x C17_M13 in the Scotian Shelf sample. Lositan identified loci A11_Hill, C40_M13, C42_M13 and D14_Hill as being potentially affected by positive selection (under both the infinite allele model and the stepwise mutation model). Only C40_M13 (global FST = 0.581), C42_M13 (global FST = 0.301) and D14_Hill (global FST = 0.246) remained significant after correction for multiple comparisons. Global multilocus FST was estimated at 0.067 (P = 0.001). After the three outlier loci identified by Lositan were excluded, the global FST was estimated at 0.043 (P = 0.001).

Reference(s)