Molecular Analysis of the Breast Cancer Genes BRCA1 and BRCA2 Using Amplicon-Based Massive Parallel Pyrosequencing
2012; Elsevier BV; Volume: 14; Issue: 6; Language: English
10.1016/j.jmoldx.2012.05.006
ISSN 1943-7811
Authors: G Michils, Silke Hollants, Luc Dehaspe, Jeroen Van Houdt, Yannick Bidet, Nancy Uhrhammer, Yves‐Jean Bignon, Joris Vermeesch, Harry Cuppens, Gert Matthijs
Topic(s): Gene expression and cancer classification
Abstract: The aim of this study was to implement massively parallel sequencing technology for diagnostic applications. We evaluated an amplicon-based method for the analysis of the BRCA1 and BRCA2 genes on the Roche 454 GS-FLX sequencer, to identify disease-causing mutations in breast and/or ovarian cancer patients. A first evaluation relied on the analysis of DNA fragments containing known mutations. Secondly, the entire coding regions of the BRCA1 and BRCA2 genes were interrogated in more than 400 patient samples, using a multiplex PCR-based assay. Variants were filtered on the basis of their frequency (20%) and sequencing depth (>25×). Special attention was given to sequencing accuracy in homopolymers. In the initial evaluation, all known heterozygous mutations were detected. The percentage of mutant reads ranged from 22% to 62%. For the multiplex assay, 95% sensitivity and 91% specificity were obtained. In addition, we were able to reliably distinguish mutations from noise through the analysis of the raw signal intensities in homopolymers. This work presents an evaluation of next-generation sequencing for use in diagnostics, based on a relatively high number of samples and experiments. We anticipate that the technique will improve further and will reduce the cost per analysis and the turnaround time, to the benefit of patients who undergo BRCA molecular testing.

Our aim has been to evaluate, and possibly to introduce, next-generation sequencing (NGS) technology for the screening of the BRCA1 and BRCA2 genes in a diagnostic context. We present an evaluation of amplicon-based sequencing on the 454 GS-FLX Titanium Genome Sequencer (Roche 454 Life Sciences, Branford, CT) [1: Margulies M, et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005; 437: 376-380] as a tool for the molecular diagnosis of hereditary breast and/or ovarian cancer.
This autosomal dominantly inherited cancer syndrome is caused by heterozygous germline mutations in the highly penetrant breast cancer genes BRCA1 (MIM 113705) and BRCA2 (MIM 600185). The great majority of the mutations are nonsense mutations and frameshifts [Breast Cancer Information Core (BIC)]. Exonic deletions or duplications have also been reported. The BRCA genes are commonly analyzed by direct sequencing [2: De Leeneer K, et al. Rapid and sensitive detection of BRCA1/2 mutations in a diagnostic setting: comparison of two high-resolution melting platforms. Clin Chem. 2008; 54: 982-989] [3: Ford D, et al; Breast Cancer Linkage Consortium. Genetic heterogeneity and penetrance analysis of the BRCA1 and BRCA2 genes in breast cancer families. Am J Hum Genet. 1998; 62: 676-689]. Mutation-scanning methods, such as denaturing high-performance liquid chromatography (dHPLC), denaturing gradient gel electrophoresis (DGGE), or high-resolution melting curve analysis (HRMCA), are also frequently used [4: Eng C, et al; Steering Committee of the Breast Cancer Information Core (BIC) Consortium. Interpreting epidemiological research: blinded comparison of methods used to estimate the prevalence of inherited mutations in BRCA1. J Med Genet. 2001; 38: 824-833] [5: Wagner T, et al. Denaturing high-performance liquid chromatography detects reliably BRCA1 and BRCA2 mutations. Genomics. 1999; 62: 369-376] [6: van der Hout AH, et al. A DGGE system for comprehensive mutation screening of BRCA1 and BRCA2: application in a Dutch cancer clinic setting. Hum Mutat. 2006; 27: 654-666]. Exonic deletions or duplications can be detected by multiplex ligation-dependent probe amplification (MLPA) [7: Schouten JP, et al. Relative quantification of 40 nucleic acid sequences by multiplex ligation-dependent probe amplification. Nucleic Acids Res. 2002; 30: e57] [8: Hogervorst FBL, et al. Large genomic deletions and duplications in the BRCA1 gene identified by a novel quantitative method. Cancer Res. 2003; 63: 1449-1453] [9: Agata S, et al. Large genomic deletions inactivate the BRCA2 gene in breast cancer families. J Med Genet. 2005; 42: e64], Southern blot, or (semi-)quantitative (multiplex) PCR [10: Puget N, et al. Screening for germ-line rearrangements and regulatory mutations in BRCA1 led to the identification of four new deletions. Cancer Res. 1999; 59: 455-461].

As only 10% to 20% of the familial cases are attributable to mutations in the BRCA1 and BRCA2 genes, no mutation is found in at least 80% of the families [11: Stratton MR, Rahman N. The emerging landscape of breast cancer susceptibility. Nat Genet. 2008; 40: 17-22]. Also, the cost per test is high, especially if Sanger sequencing is used [12: Sevilla C, et al. Testing for BRCA1 mutations: a cost-effectiveness analysis. Eur J Hum Genet. 2002; 10: 599-606]. This explains why most laboratories carefully select patients and families before analysis. The practical approach often consists of a combination of mutation-scanning strategies, to reduce the economic burden imposed by Sanger sequencing. NGS technology could change this approach if it effectively reduces the cost. Multiplexing is a way of improving efficiency, especially when a large number of amplicons and patients are to be screened. Indeed, specific tools for primer design and pooling, as described by Goossens et al [13: Goossens D, et al. Simultaneous mutation and copy number variation (CNV) detection by multiplex PCR-based GS-FLX sequencing. Hum Mutat. 2008; 30: 472-476], were recently commercialized.

To date, at least five NGS platforms are commercially available. They differ in many ways, including their sequencing capacity, chemistry, and applications [14: Zhang J, et al. The impact of next-generation sequencing on genomics. J Genet Genomics. 2011; 38: 95-109]. Specifically for the BRCA1 and BRCA2 genes, Walsh et al [15: Walsh T, et al. Detection of inherited mutations for breast and ovarian cancer using genomic capture and massively parallel sequencing. Proc Natl Acad Sci USA. 2010; 107: 12629-12633] proposed a capture-based method on the Genome Analyzer (Illumina, San Diego, CA) and showed that mutations could be identified in 10 high-risk breast cancer genes. De Leeneer et al [16: De Leeneer K, et al. Massive parallel amplicon sequencing of the breast cancer genes BRCA1 and BRCA2: opportunities, challenges, and limitations. Hum Mutat. 2011; 32: 1-10] used a multiplex PCR-based approach combined with an in-house data analysis pipeline on the 454 GS-FLX. We tested the 454 GS-FLX platform with a multiplex PCR design that differs from the one presented by De Leeneer et al. We extensively analyzed the NGS data with different software packages and compared the results to Sanger data. In addition, the balance between the coverage and the number of reads with a heterozygous variation was thoroughly examined. Finally, we tested a workflow aiming to improve the base calling and the interpretation of homopolymeric stretches.

The DNA samples used in this study were obtained from referrals for molecular testing of the breast cancer genes. All patients had received genetic counseling at our Center and provided informed consent for testing. Genomic DNA had been extracted from lymphocytes by standard semiautomated methods (Chemagic Magnetic Separator Module I, Chemagen, Baesweiler, Germany; Autopure LS, Qiagen, Hilden, Germany).
The amplicons generated for the GS-FLX System were all purified using MultiScreen PCR Filter Plates (Millipore, Billerica, MA) on a pipetting robot (Janus Automated Workstation, Perkin Elmer, Waltham, MA) using standard procedures. The PCR products were eluted in 30 μL of water. After DNA quantification using the Quant-iT PicoGreen dsDNA Assay kit (Invitrogen, Life Technologies, Grand Island, NY), the amplicons were pooled to generate an equimolar mix. Finally, emulsion PCR with the GS-FLX Titanium Lib-A emPCR kit and the sequencing reaction with the GS Titanium Sequencing XLR70 kit were performed according to the manufacturer's instructions.

For the first validation phase, we evaluated the NGS technology on the basis of a set of amplicons that covered the whole BRCA1 coding region. For this experiment, we chose to maintain the amplicon design that was originally defined for HRMCA [6: van der Hout AH, et al. A DGGE system for comprehensive mutation screening of BRCA1 and BRCA2: application in a Dutch cancer clinic setting. Hum Mutat. 2006; 27: 654-666]. For the BRCA1 gene, this represented a set of 45 amplicons, ranging from 172 bp to 370 bp (adaptor and key sequences excluded). For each amplicon, a specific DNA sample with a previously identified variation or a pathogenic mutation was used as template. In total, 37 heterozygous variations (21 substitutions and 16 frameshifts) were selected. These variations were distributed over 34 amplicons; wild-type DNA was used as template for the remaining amplicons.
The amplicons were all amplified separately and then pooled to generate an (artificial) control sample (Table 1). The amplicons were amplified using a two-step PCR protocol. In brief, after a first PCR of the individual amplicons with the (exon-)specific primers that contained universal tags, a second amplification was performed with primers containing the key and adaptor sequences (Roche, Technical Bulletin 005–2009). The PCR products were then prepared as described above.

Table 1. Heterozygous BRCA1 Variations Tested for the Evaluation of the AVA and SeqNext Software Packages

| Variation, cDNA (protein) level | Variant frequency*, AVA 2.5 | Variant frequency*, SeqNext 3.5.0 |
|---|---|---|
| c.1A>G (p.?) | 34 | 28 |
| c.68_69delAG | 47 | 28 |
| c.133A>C (p.Lys45Gln) | 56 | 62 |
| c.212+3A>G (IVS5+3A>G) | 56 | 58 |
| c.441G>C (p.Leu147Phe) | 55 | 55 |
| IVS7-34T>C | 56 | 57 |
| c.591C>T (p.Cys197Cys) | 47 | 49 |
| c.692C>T (p.Thr231Met) | 47 (2.3) | 47 (3.3.2) |
| c.744C>G (p.Thr248Thr) | 49 | 50 |
| c.1072delC | Not detected | 46 (3.3.2) |
| c.1115G>A (p.Trp372X) | 43 | 42 |
| c.1292delT | Not detected | 49 |
| c.1621C>T (p.Gln541X) | 55 | 55 |
| c.2082C>T (p.Ser694Ser) | 53 (2.3) | 46 (3.3.2) |
| c.2197del5 | 24 | 23 |
| c.2359dupG | Not detected | 50 |
| c.2612C>T (p.Pro871Leu) | 50 | 50 |
| c.2728delC | Not detected | 44 (3.3.2) |
| c.2898delT | Not detected | 36 |
| c.3113A>G (p.Glu1038Gly) | 53 | 54 |
| c.3119G>A (p.Ser1040Asn) | 47 | 47 |
| c.3331del4 | 26 | 22 |
| c.3481del11 | 58 | 56 |
| c.3548A>G (p.Lys1183Arg) | 59 | 59 |
| c.3756del4 | 44 | 33 |
| c.3891del3 | 48 | 27 |
| c.4088C>G (p.Ser1363X) | 46 | 46 |
| c.4165delAG | 48 | 50 |
| c.4308T>C (p.Ser1436Ser) | 48 | 48 |
| c.4327C>T (p.Arg1443X) | 50 | 51 |
| c.4391delCTAinsTT | 50 | 49 |
| c.4535G>T (p.Ser1512Ile) | 52 | 53 |
| c.4956G>A (p.Met1652Ile) | 43 | 43 |
| c.5137delG | Not detected | 54 |
| c.5177del4 | 49 | 47 |
| c.5266dupC | Not detected | 50 |
| c.5278-14C>G (IVS20-14C>G) | 45 | 43 |
| Mean variant frequency (±SD) | 47 ± 10 | 46 ± 10 |

Variant frequencies are given for each variant.
* Ratio, in percent (%), of the reads containing the variation versus the total number of reads.
In the second part of the study, the BRCA MASTR assay v1.2 (Multiplicom, Niel, Belgium) was evaluated in two phases. First, the sensitivity and specificity were calculated by analyzing seven samples for which Sanger data were available for the entire BRCA1 and BRCA2 coding regions. Secondly, 17 additional samples that had previously been screened with a combination of HRMCA and dHPLC were used to generate more NGS data, to evaluate the robustness of the method in terms of coverage. The DNA sample of each patient was amplified in 12 multiplexes through a two-step PCR method according to the manufacturer's instructions. Briefly, 20 ng of genomic DNA was used as a template in a first PCR that allowed the amplification of the targeted regions with tagged primers. A 1-μL quantity of a 1/1000 dilution of each of these PCR products was subsequently amplified using primers containing the universal tags flanked by the respective multiplex identifiers (ie, "barcode" sequences) and the key and adaptor sequences. The 12 final PCR products were pooled per patient using the mixing scheme provided by the manufacturer, purified as described above, and prepared for GS-FLX sequencing. Finally, the procedure was extended to 400 new patient samples.

The same strategy was applied to BRCA MASTR assay v2.0 (Multiplicom). The amplicon design was adapted for Titanium chemistry, allowing the amplification of longer fragments. Overall, 94 amplicons spanning the BRCA1/2 genes were amplified in five multiplexes. To evaluate this multiplex assay, 32 patient samples with 31 known pathogenic BRCA1/2 mutations (19 frameshifts and 12 substitutions) were selected and amplified according to the manufacturer's instructions. A summary of all experiments and validation steps is given in Table 2.

Table 2. Testing Strategy and Run Information

| Type of experiment | Evaluation software | Evaluation multiplex (1) | Patient screening | Evaluation multiplex (2) |
|---|---|---|---|---|
| No. samples/run | 1 | 24 | 32 | 16 |
| No. runs | 1 | 1 | 13 | 2 |
| No. analyzed samples | 1 | 24 | 400 | 32 |
| GS-FLX capacity/run | 6% (1/16) | 50% (1/2) | 50% (1/2) | 25% (1/4) |
| Average reads/run | 18,710 | 414,246 | 486,762 | 235,905 |
| Amplification design | Simplex | Multiplex, MASTR v. 1.2 | Multiplex, MASTR v. 1.2 | Multiplex, MASTR v. 2.0 |
| No. amplicons (gene) | 45 (BRCA1) | 169 (BRCA1/2) | 169 (BRCA1/2) | 94 (BRCA1/2) |
| Coverage/amplicon | 415 | 97 | 84 | 153 |
| % Amplicons >25× | 100 | 89 | 95 | 100 |
| % Amplicons >40× | 100 | 79 | 87 | 100 |
| Software for mutation analysis* | AVA; SeqNext | AVA; SeqNext | SeqNext | SeqNext |

All of the read data were determined by the AVA software.
* See Table 3 for additional details.

The signals were processed and the reads were filtered and assembled according to the standard parameters (the reader is referred to the 454 Sequencing System Software Manual, v 2.5.3, Part B, Roche). The quality of the reads was analyzed with a set of stringent filters. Reads were either discarded from further analysis (keypass filter, dot filter, mixed filter) or trimmed back from the 3′ end (primer filter, signal intensity filter, valley filter). Finally, PHRED-equivalent scores were also used to trim reads based on an internal calibrated signal (quality score trimming filter; Roche Diagnostics Belgium, personal communication, 2012). All filter-passed reads were aligned to the Ensembl files BRCA1_ENSG00000012048 and BRCA2_ENSG00000139618; the mutations were annotated according to the HGVS guidelines (transcript reference files BRCA1 NM_007294.3 and BRCA2 NM_000059.3). The coverage per amplicon was defined by the amplicon variant analyzer (AVA) software (Roche). A threshold of 25-fold sequencing depth per amplicon was used in the experiments; a 40× cut-off is shown on the figures as an additional indication. We used AVA, versions 2.3 to 2.5.3, as well as SeqNext software (JSI Medical Systems GmbH, Kippenheim, Germany), versions 3.3.2 and 3.5.0.
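The per-amplicon depth thresholds just described (25× as the working cut-off, 40× as an additional indication) can be illustrated with a minimal sketch. It is not part of the AVA or SeqNext software; the function name `coverage_summary`, the example depths, and the choice of "at or above" for the comparison are assumptions made for illustration only.

```python
from statistics import mean


def coverage_summary(depths, thresholds=(25, 40)):
    """Return the mean depth and the percentage of amplicons at or above each threshold.

    `depths` is a list of per-amplicon read depths for one sample or run.
    """
    if not depths:
        raise ValueError("no amplicon depths supplied")
    summary = {"mean_depth": mean(depths)}
    for t in thresholds:
        passed = sum(1 for d in depths if d >= t)
        summary[f"pct_ge_{t}x"] = 100.0 * passed / len(depths)
    return summary


# Hypothetical per-amplicon depths (illustrative values, not data from the study).
example_depths = [12, 30, 41, 85, 120, 26, 39, 250]
print(coverage_summary(example_depths))
```

A summary of this kind corresponds to the "Coverage/amplicon" and "% Amplicons" rows that a run report would contain; amplicons falling below the working cut-off would be candidates for repeat analysis.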
An extensive evaluation was performed of the ability of AVA and SeqNext to identify known mutations. Because breast and/or ovarian cancer is dominantly inherited, pathogenic mutations are expected to be heterozygous. A mutation will theoretically be present in 50% of the corresponding reads. In reality, however, the variant frequency (VF) is binomially distributed around this value. In this study, a threshold of 20% was chosen. This cut-off was considered useful to distinguish the signal (ie, true mutations) from the noise (ie, false-positive results). Because this threshold is lower than the one proposed by De Leeneer et al [16: De Leeneer K, et al. Massive parallel amplicon sequencing of the breast cancer genes BRCA1 and BRCA2: opportunities, challenges, and limitations. Hum Mutat. 2011; 32: 1-10], we decreased the minimum sequencing depth to 25×. Furthermore, the SeqNext software works with different thresholds simultaneously. The variants are sorted separately into a "distinct" table and an "other" table. To evaluate the sensitivity and specificity of the method, the initial cut-off of 20% was applied to both tables. This allowed sorting all variants that were sequenced at a minimum of 25× coverage and present in at least 20% of the reads (VF ≥ 20%).

After the evaluation phase, new DNA samples from BRCA patients were submitted to NGS analysis. To increase the power of detection and to avoid false-negative results due to stochastic events, a 20% VF was used for the "distinct" table and a second threshold of 10% was applied to the variants listed in the "other" table. Variants present in homopolymeric stretches (≥7 bp) were listed separately.

Light signal intensities (alias: flow values) in homopolymers were extracted from the dataset using the AVA software and sorted by intervals of 0.1.
This was done for homopolymers for which a mutant sample was available (ie, a sample with a true mutation, typically a deletion or a duplication). Theoretically, the signal intensities should correspond to the expected nominal value for the respective stretch (ie, the number of bases in the stretch). In practice, however, the signal intensities cluster around this value. By analyzing the number of reads mapping to each 0.1 interval, it is possible to derive relative ratios and to study the distribution of the reads in each stretch. For example, if 60 reads in total were mapped to a seven-nucleotide stretch, and six of these reads had a raw signal intensity of 7.2, a relative ratio of 10% could be calculated. These ratios were calculated for each 0.1 interval, the sum of the ratios for each patient being 100%. To present the data visually, the values were represented on a black-and-white scale (black indicating high, and white indicating low) in a "heatmap." Under perfect circumstances (100% of the reads with a signal intensity of exactly 7), only one black box corresponding to this interval would be present in the plot.

The first experiment was conducted on an artificial set of amplicons containing 37 known heterozygous variations. AVA did not succeed in reporting all variations: the 1-bp deletions or insertions were present in the global alignment but were not reported by the software (results not shown). All variations were successfully detected by the SeqNext software. The results showed that the variant frequency (mutant reads/total number of reads) of heterozygous variations is 46% ± 10% (range, 22% to 62%; Table 1).

Multiplexing was evaluated with the MASTR 1.2 kit in 24 samples containing a total of 63 known heterozygous variants, including 10 BRCA1 or BRCA2 deleterious mutations. Seven of the 63 variants tested were missed by AVA because of their nature (all seven were frameshifts of 1 or 2 bp); this gives an experimental sensitivity of 89%.
With the SeqNext software, all variations could be detected. Hence, based on these results, an analytical sensitivity of 100% is put forward for SeqNext (maximum sensitivity supported by the number of variants tested: 95%, at 95% confidence) [17: Mattocks CJ, et al; EuroGentest Validation Group. A standardized framework for the validation and verification of clinical molecular genetic tests. Eur J Hum Genet. 2010; 18: 1276-1288]. To calculate the specificity of this assay, data from seven of the 24 samples, which had been completely analyzed by Sanger sequencing, were used. This represented 1183 amplicons (7 × 169), but one amplicon was known to contain a mutation. At a cut-off of 20% and a minimum sequencing depth of 25×, 104 variants were called that were not detected by Sanger sequencing. Hence, the specificity is calculated from 104 false-positive results in 1182 amplicons, or 91%. All 104 false-positive results were located in homopolymer stretches. An overview of the software tools and their specificity and sensitivity is presented in Table 3.

Table 3. Results of the Software Evaluation

| Software | AVA | SeqNext |
|---|---|---|
| Version | 2.3 to 2.5.3 | 3.3.2, 3.5.0 |
| VF threshold (%) | 20 | 20 |
| Type of experiment: artificial control (BRCA1; 37 known mutations; 45 amplicons; max sensitivity: 92% [95% CI]) | | |
| Observed VF range | 24–59 | 22–62 |
| Average VF and SD | 47 ± 10 | 46 ± 10 |
| Analytical sensitivity | 81% (30/37)* | 100% (37/37) |
| Specificity | 82% (100 − 8/45)† | 84% (100 − 7/45) |
| Type of experiment: evaluation multiplex assay v. 1.2 (BRCA1 + BRCA2; 63 known mutations; 1182 amplicons; max sensitivity: 95% [95% CI]) | | |
| Observed VF range | 27–62 | 25–63 |
| Average VF and SD | 49 ± 7 | 46 ± 5 |
| Analytical sensitivity | 89% (56/63)* | 100% (63/63) |
| Specificity | 98% (100 − 25/1182)† | 91% (100 − 104/1182) |

* Indels between 1 and 3 bp were likely not called by the AVA software.
† False-positive results in homopolymers were not reported by the AVA software.

To evaluate the effect of the coverage on the variant frequency, samples were also analyzed in duplicate: once at a very low coverage, and once following a repeated analysis in which the expected average coverage was reached. In Figure 1, the variant frequency of 91 substitutions (27 distinct polymorphisms) was plotted against the coverage (range, 5× to 469×). As expected, the variant frequency came closer to 50% with higher coverage. The outliers were a VF of 88% at a coverage of 8× and, surprisingly, a VF of 11% at a coverage of 27×. In the latter example, detailed analysis showed that the polymorphism was present in only 6% (1/17) of the reads in the forward direction and in 20% (2/10) of the reads in the reverse direction.

Afterward, 400 new samples were analyzed in 13 experiments. These runs were performed under the same conditions (semiautomated workflow, pooling of multiplexes and samples). The coverage distribution showed that a mean coverage of 84× was obtained and that 87% of the amplicons reached 40× coverage (Figure 2). The analysis of the sequencing depth also showed that coverage varied considerably within and between amplicons. The coverage data were also plotted per experiment to give an indication of the reproducibility (Figure 3).
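As an aside on the relationship between coverage and variant frequency (Figure 1), the binomial model mentioned in Materials and Methods can be made concrete with a short sketch. It assumes an idealized heterozygous variant in which each read carries the variant with probability 0.5, independently of the other reads; amplification bias, strand effects, and sequencing errors are ignored, so the numbers are indicative only. The function name `prob_vf_below` and its parameters are hypothetical.

```python
from math import comb


def prob_vf_below(depth, vf_threshold_pct=20, p_variant=0.5):
    """Probability that the observed VF falls below the threshold.

    The number of variant reads is modeled as Binomial(depth, p_variant).
    The threshold is given in whole percent to avoid floating-point rounding.
    """
    # Smallest variant-read count that still meets the threshold (integer ceiling).
    min_reads = -(-vf_threshold_pct * depth // 100)
    return sum(
        comb(depth, k) * p_variant**k * (1 - p_variant) ** (depth - k)
        for k in range(min_reads)
    )


# Chance of missing an idealized heterozygous variant at a 20% VF cut-off.
for depth in (8, 25, 40, 100):
    print(depth, prob_vf_below(depth))
```

Under these assumptions the probability of a heterozygous variant dropping below the 20% cut-off shrinks rapidly as depth increases, which is consistent with the observation that variant frequencies converge toward 50% at higher coverage. Real data can deviate from this model, as the outlier at 27× coverage shows.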
Overall, the total number of passed reads in the AVA software was reproducible and the results were in accordance with the manufacturer's specifications (360 × 10^3 to 520 × 10^3 reads per half region). Outliers (amplicons with low coverage or with high coverage, >200×) were identified in each experiment.

A novel multiplex assay (MASTR 2.0, Multiplicom), containing 94 amplicons instead of 169, was also tested. The major changes were a reduction in the number of amplicons, achieved by increasing their length to better fit the Titanium chemistry, and a new distribution of the amplicons within the multiplexes. All amplicons were successfully amplified. Furthermore, >99.9% of the amplicons reached 40× coverage (only two of 3008 amplicons were below). Thus, the MASTR 2.0 assay easily allowed 16 patient samples to be pooled in one-quarter region of a GS-FLX run (Table 2).

We also focused on mutations in homopolymer stretches by analyzing the pyrosequencing data in more detail, according to the procedure developed by Y. Bidet and colleagues (personal communication, 2012). A real heterozygous deletion or insertion should generate two populations of read data, whereas for homozygous (normal) reads the signal intensity should cluster around the nominal value. Figure 4 shows representative examples of the distribution of homopolymer reads in three stretches of adenosines for which positive controls were available: c.1016delA (7A >6A) and c.1016dupA (7A >8A) in BRCA1; c.2175dupA (6A >7A) and c.9253dupA (7A >8A) in BRCA2. For each homopolymer, "heatmaps" were generated (see Materials and Methods) in which the proportion of the reads (%) was plotted according to their read length. The results showed that, for the samples containing the mutations, the deletion and the duplications can be distinguished by a left or right shift, respectively, in the read distribution.
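The binning of flow values behind these heatmaps (see Materials and Methods) can be sketched as follows. This is an illustration, not the procedure of Y. Bidet and colleagues or any AVA function: the name `flow_value_ratios` is hypothetical, and assigning each value to the nearest 0.1 (rather than, say, flooring) is an assumption, since the source only states that values were sorted by intervals of 0.1.

```python
from collections import Counter


def flow_value_ratios(flow_values):
    """Share of reads (in percent) per 0.1-wide flow-value bin.

    Bins are keyed in tenths (e.g. 72 stands for a signal intensity of 7.2)
    so that dictionary keys are exact integers rather than floats.
    """
    if not flow_values:
        raise ValueError("no flow values supplied")
    bins = Counter(round(v * 10) for v in flow_values)
    total = len(flow_values)
    return {tenths: 100.0 * n / total for tenths, n in sorted(bins.items())}


# Hypothetical reads for a seven-nucleotide stretch, mirroring the worked
# example in the text: 60 reads in total, six of them at 7.2.
reads = [7.0] * 54 + [7.2] * 6
for tenths, pct in flow_value_ratios(reads).items():
    print(f"{tenths / 10:.1f}: {pct:.0f}%")
```

Each row of a heatmap would then be one sample's dictionary of ratios, shaded from white (low) to black (high); a heterozygous deletion or duplication would be expected to show a second cluster of bins shifted to the left or right of the nominal value.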
A list of all mutations located near or in homopolymers is presented in Table 4.

Table 4. Variations Located Near or in Homopolymer Stretches, and Results of Mutation Detection with SeqNext Software

| Gene | Variation | Sequence context | Observed variant frequency* |
|---|---|---|---|
| BRCA1 | c.1016delA | 7A >6A | 52 |
| BRCA1 | c.1016dupA | 7A >8A | 35 |
| BRCA1 | c.3331_3334del4 | 2 nt after 6A | 41 |
| BRCA1 | c.3891del3 | 2 nt after 5T | 60 |
| BRCA2 | c.927A>G (p.Ser309Ser) | 2 nt after 5T | 47 |
| BRCA2 | c.1281C>T (p.Asp427Asp) | 3 nt after 5A | 47 |
| BRCA2 | c.1804G>A (p.Gly602Arg) | 2 nt before 8A | 44 |
| BRCA2 | c.2175dupA | 6A >7A | 50 |
| BRCA2 | c.4301A>T (p.Lys1434Ile) | in stretch of 5A | 49 |
| BRCA2 | c.4585G>A (p.Gly1529Arg) | 3 nt before 6A | 40 |
| BRCA2 | c.4935delA | 5A >4A | 50 |
| BRCA2 | c.5353delA | 2 nt after 7A | 43 |
| BRCA2 | c.5645C>A (p.Ser1882X) | C>A, 5A >6A | 40 |
| BRCA2 | c.7544delC | after 5A | 45 |
| BRCA2 | c.9253dupA | 7A >8A | 33 |

* Ratio, in percent (%), of the reads containing the variation versus the total number of reads.

Sequencing of DNA samples using NGS seems to be easy and straightforward. Nonetheless, numerous parameters affect the quality of the final result. Hence, in view of the acc