Article Open access Peer-reviewed

The Development of Next-Generation Sequencing Assays for the Mitochondrial Genome and 108 Nuclear Genes Associated with Mitochondrial Disorders

2013; Elsevier BV; Volume: 15; Issue: 4 Language: English

10.1016/j.jmoldx.2013.03.005

ISSN

1943-7811

Authors

Shale Dames, Lan-Szu Chou, Xiao Ye, Tyler Wayman, Jennifer Stocks, Marc Singleton, Karen Eilbeck, Rong Mao

Topic(s)

RNA modifications and cancer

Abstract

Sanger sequencing of multigenic disorders can be technically challenging, time consuming, and prohibitively expensive. High-throughput next-generation sequencing (NGS) can provide a cost-effective method for sequencing targeted genes associated with multigenic disorders. We have developed a NGS clinical targeted gene assay for the mitochondrial genome and for 108 selected nuclear genes associated with mitochondrial disorders. Mitochondrial disorders have a reported incidence of 1 in 5000 live births, encompass a broad range of phenotypes, and are attributed to mutations in the mitochondrial and nuclear genomes. Approximately 20% of mitochondrial disorders result from mutations in mtDNA, with the remaining 80% found in nuclear genes that affect mtDNA levels or mitochondrion protein assembly. In our NGS approach, the 16,569-bp mtDNA is enriched by long-range PCR and the 108 nuclear genes (which represent 1301 amplicons and 680 kb) are enriched by RainDance emulsion PCR. Sequencing is performed on Illumina HiSeq 2000 or MiSeq platforms, and bioinformatics analysis is performed using commercial and in-house developed bioinformatics pipelines. A total of 16 validation and 13 clinical samples were examined. All previously reported variants associated with mitochondrial disorders were found in validation samples, and 5 of the 13 clinical samples were found to have mutations associated with mitochondrial disorders in either the mitochondrial genome or the 108 nuclear genes. All variants were confirmed by Sanger sequencing.

CME Accreditation Statement: This activity ("JMD 2013 CME Program in Molecular Diagnostics") has been planned and implemented in accordance with the Essential Areas and policies of the Accreditation Council for Continuing Medical Education (ACCME) through the joint sponsorship of the American Society for Clinical Pathology (ASCP) and the American Society for Investigative Pathology (ASIP). ASCP is accredited by the ACCME to provide continuing medical education for physicians. The ASCP designates this journal-based CME activity ("JMD 2013 CME Program in Molecular Diagnostics") for a maximum of 48 AMA PRA Category 1 Credit(s)™.
Physicians should only claim credit commensurate with the extent of their participation in the activity. CME Disclosures: The authors of this article and the planning committee members and staff have no relevant financial relationships with commercial interests to disclose.

Genetically, mitochondrial disorders fall into two classes: those caused by mutations in the mitochondrial genome (mtDNA) and those caused by mutations in genes of the nuclear genome. mtDNA is a maternally inherited, circular, 16,569-bp haploid genome containing 37 genes. It is estimated that up to 1500 nuclear genes may encode mitochondrial proteins, and inheritance may be autosomal recessive, autosomal dominant, or sex-linked. [1: Koene S, Smeitink J. Mitochondrial medicine: entering the era of treatment. J Intern Med. 2009;265:193-209] [2: Wong LJ. Mitochondrial syndromes with leukoencephalopathies. Semin Neurol. 2012;32:55-61] [3: Li M, Schroeder R, Ko A, Stoneking M. Fidelity of capture-enrichment for mtDNA genome sequencing: influence of NUMTs. Nucleic Acids Res. 2012;40:e137] Mitochondrial disorders have an overall incidence of 1:5000, with approximately 20% of disease-causing variants found in the mtDNA and the remainder in the nuclear genome. [4: Distelmaier F, Koopman WJ, van den Heuvel LP, Rodenburg RJ, Mayatepek E, Willems PH, Smeitink JA. Mitochondrial complex I deficiency: from organelle dysfunction to clinical disease. Brain. 2009;132:833-842] [5: Di Donato S. Disorders related to mitochondrial membranes: pathology of the respiratory chain and neurodegeneration. J Inherit Metab Dis. 2000;23:247-263] Common characteristics of mitochondrial disorders include ptosis, external ophthalmoplegia, proximal myopathy, exercise intolerance, cardiomyopathy, sensorineural deafness, optic atrophy, pigmentary retinopathy, diabetes mellitus, and mid- to late-term pregnancy loss. [6: Duncan JG. Mitochondrial dysfunction in diabetic cardiomyopathy. Biochim Biophys Acta. 2011;1813:1351-1359] [7: Wang Q, Moley KH. Maternal diabetes and oocyte quality. Mitochondrion. 2010;10:403-410] [8: Schapira AH. Mitochondrial disease. Lancet. 2006;368:70-82] [9: Spinazzola A. Mitochondrial DNA mutations and depletion in pediatric medicine. Semin Fetal Neonatal Med. 2011;16:190-196] Central nervous system findings may include fluctuating encephalopathy, seizures, dementia, migraine, stroke-like episodes, ataxia, and spasticity. [10: Hudson G, Chinnery PF. Mitochondrial DNA polymerase-gamma and human disease. Hum Mol Genet. 2006;15 Spec No 2:R244-R252] [11: Palau F, Estela A, Pla-Martin D, Sánchez-Piris M. The role of mitochondrial network dynamics in the pathogenesis of Charcot-Marie-Tooth disease. Adv Exp Med Biol. 2009;652:129-137] [12: Reinecke F, Smeitink JA, van der Westhuizen FH. OXPHOS gene expression and control in mitochondrial disorders. Biochim Biophys Acta. 2009;1792:1113-1121] [13: Rötig A. Human diseases with impaired mitochondrial protein synthesis. Biochim Biophys Acta. 2011;1807:1198-1205] [14: Wallace DC. Mitochondrial defects in neurodegenerative disease. Ment Retard Dev Disabil Res Rev. 2001;7:158-166] Disease onset ranges from childhood to adulthood.

Because mitochondrial disorders encompass a wide range of phenotypes and a large number of genes, high-throughput next-generation sequencing (NGS) is well suited to variant detection. NGS allows for a low-cost, comprehensive mitochondrial disorder panel, one that would be financially and technically difficult to perform by Sanger sequencing. Furthermore, the ability to sequence at high coverage allows for detection of low-level heteroplasmy for mtDNA mutations, which is not easily detected by Sanger sequencing. For the present study, 108 nuclear genes were chosen (Supplemental Table S1), based on known genes implicated in mitochondrial disorders at the time of assay design. The 108 sequenced nuclear genes include genes required for mtDNA integrity, fatty acid metabolism, transcription, and mitochondrial maintenance, as well as other genes associated with oxidative phosphorylation. NGS has been used for detection of mitochondrial disorders. [15: Haack TB, Haberberger B, Frisch EM, Wieland T, Iuso A, Gorza M, Strecker V, Graf E, Mayr JA, Herberg U, Hennermann JB, Klopstock T, Kuhn KA, Ahting U, Sperl W, Wilichowski E, Hoffmann GF, Tesarova M, Hansikova H, Zeman J, Plecko B, Zeviani M, Wittig I, Strom TM, Schuelke M, Freisinger P, Meitinger T, Prokisch H. Molecular diagnosis in mitochondrial complex I deficiency using exome sequencing. J Med Genet. 2012;49:277-283] [16: Graham BH. Diagnostic challenges of mitochondrial disorders: complexities of two genomes. Methods Mol Biol. 2012;837:35-46] [17: Elo JM, Yadavalli SS, Euro L, Isohanni P, Gotz A, Carroll CJ, Valanne L, Alkuraya FS, Uusimaa J, Paetau A, Caruso EM, Pihko H, Ibba M, Tyynismaa H, Suomalainen A. Mitochondrial phenylalanyl-tRNA synthetase mutations underlie fatal infantile Alpers encephalopathy. Hum Mol Genet. 2012;21:4521-4529] [18: Calvo SE, Compton AG, Hershman SG, Lim SC, Lieber DS, Tucker EJ, Laskowski A, Garone C, Liu S, Jaffe DB, Christodoulou J, Fletcher JM, Bruno DL, Goldblatt J, Dimauro S, Thorburn DR, Mootha VK. Molecular diagnosis of infantile mitochondrial disease with targeted next-generation sequencing. Sci Transl Med. 2012;4:118ra10] [19: Dhillon S, Hellings JA, Butler MG. Genetics and mitochondrial abnormalities in autism spectrum disorders: a review. Curr Genomics. 2011;12:322-332] [20: Danhauser K, Iuso A, Haack TB, Freisinger P, Brockmann K, Mayr JA, Meitinger T, Prokisch H. Cellular rescue-assay aids verification of causative DNA-variants in mitochondrial complex I deficiency. Mol Genet Metab. 2011;103:161-166] Enrichment techniques include long-range PCR (LR-PCR) for the mitochondrial genome and various in-solution and chip-based capture methods for nuclear genes. Enrichment using RainDance emulsion PCR libraries (RainDance Technologies, Lexington, MA) has not previously been reported for mitochondrial disorders.
RainDance emulsion PCR has been described for the enrichment of genes associated with aortopathies, glycosylation disorders, and selected genes on the X chromosome, among others. [21: Wooderchak-Donahue WL, O'Fallon BD, Furtado LV, Durtschi JD, Plant P, Ridge PG, Rope AF, Yetman AT, Bayrak-Toydemir P. A direct comparison of next generation sequencing enrichment methods using an aortopathy gene panel-clinical diagnostics perspective. BMC Med Genomics. 2012;5:50] [22: Mondal K, Shetty AC, Patel V, Cutler DJ, Zwick ME. Targeted sequencing of the human X chromosome exome. Genomics. 2011;98:260-265] [23: Jones MA, Bhide S, Chin E, Ng BG, Rhodenizer D, Zhang VW, Sun JJ, Tanner A, Freeze HH, Hegde MR. Targeted polymerase chain reaction-based enrichment and next generation sequencing for diagnostic testing of congenital disorders of glycosylation. Genet Med. 2011;13:921-932] RainDance libraries are synthesized with a single primer pair encapsulated in each emulsion droplet. Approximately 1 million droplets containing target-specific, individual PCR primer sets are used for enrichment. A PCR amplification mix containing target DNA is then emulsified and fused (merged) with individual RainDance emulsion droplets. The merged RainDance library/PCR mix is deposited into a PCR tube and amplified. [24: Tewhey R, Warner JB, Nakano M, Libby B, Medkova M, David PH, Kotsopoulos SK, Samuels ML, Hutchison JB, Larson JW, Topol EJ, Weiner MP, Harismendy O, Olson J, Link DR, Frazer KA. Microdroplet-based PCR enrichment for large-scale targeted sequencing. Nat Biotechnol. 2009;27:1025-1031 (erratum in Nat Biotechnol. 2010;28:17)] After amplification, the PCR fragments are concatenated and sonicated to obtain inserts of the desired length for NGS libraries.
Without this concatenation step, PCR fragments longer than the read lengths associated with a given NGS platform would not have internal sequence information. The main advantage of PCR enrichment methods (such as LR-PCR and RainDance) is their greater specificity compared with capture-based methods. [3: Li M, Schroeder R, Ko A, Stoneking M. Fidelity of capture-enrichment for mtDNA genome sequencing: influence of NUMTs. Nucleic Acids Res. 2012;40:e137] [25: Dames S, Durtschi J, Geiersbach K, Stephens J, Voelkerding KV. Comparison of the Illumina Genome Analyzer and Roche 454 GS FLX for resequencing of hypertrophic cardiomyopathy-associated genes. J Biomol Tech. 2010;21:73-80] As with previously reported mtDNA assays, LR-PCR was chosen for enrichment based on ease, reproducibility, and cost. In the present study, we assessed the performance of LR-PCR, RainDance enrichment, and NGS for detection of mutations associated with mitochondrial disorders. We also addressed differences in bioinformatics analysis for mtDNA and RainDance-enriched genes. All 37 genes of the mtDNA were sequenced. The 108 nuclear genes comprised all nuclear genes known (at the time of assay design) to be involved in mitochondrial disorders. [26: Wang J, Zhan H, Li FY, Pursley AN, Schmitt ES, Wong LJ. Targeted array CGH as a valuable molecular diagnostic approach: experience in the diagnosis of mitochondrial and metabolic disorders. Mol Genet Metab. 2012;106:221-230] [27: Vasta V, Ng SB, Turner EH, Shendure J, Hahn SH. Next generation sequence analysis for mitochondrial disorders. Genome Med. 2009;1:100]

Seven samples with known mtDNA mutations and two normal samples were obtained from the Coriell Cell Repositories at the Coriell Institute for Medical Research (Camden, NJ) and were used for validation of the mtDNA NGS assay.
The seven mutation samples were NA10742, MT-ND4, m.11778G>A, p.Arg340His; NA10744, MT-ND4, m.11778G>A, p.Arg340His; NA11605, MT-ND1, m.3460G>A, p.Ala52Thr; NA11906, m.8344A>G, tRNA-Lys/wt; NA13411, MT-ATP6, m.8993T>G, p.Leu156Arg; NA13740, MT-ATP6, m.8993T>G, p.Leu156Arg; and NA13741, MT-ATP6, m.8993T>G, p.Leu156Arg. The two other samples, NA12878 and NA19240, had no reported clinically associated mtDNA variants and were used as normal controls. Three Coriell samples with no reported deleterious variants in the 108 nuclear genes examined were used as normal controls [NA11605 (which harbors a mtDNA mutation in MT-ND1), NA12878, and NA19240]. Four research samples (R459, R460, R461, and R463) were used for nuclear gene panel validation. The four research samples were shown to have no mtDNA mutations by Sanger sequencing. Two of these samples (R461 and R463) had reported mutations in genes targeted in the nuclear panel; the other two samples (R459 and R460) had no known mutations. All research samples were deidentified before enrichment and NGS sequencing. Clinical samples were deidentified according to the ARUP Laboratories institutional review board protocol (no. 7725). A total of 13 clinical samples were processed for the mitochondrial disorder assay (Table 1).
Six mtDNA samples, five 108-gene panel samples, and two samples with both mtDNA and the 108-gene panel were sequenced.

Table 1. Clinical Samples Interrogated for Mitochondrial Disorder Assay

| Sample ID | Age (sex) | Assay* | Reported phenotype or presentation |
|---|---|---|---|
| AR1 | 8 months (F) | mtDNA | Leigh syndrome |
| AR2 | 30 years (F) | mtDNA | Muscle weakness, tingling, facial droop |
| AR3 | 25 years (F) | mtDNA | Chronic migrating muscle pain, slightly elevated CK and lactate |
| AR4 | 4 years (F) | mtDNA | Severe, progressive encephalopathy |
| AR5 | 60 years (F) | both | Hearing loss, glaucoma, cataracts, stroke, osteopenia, ocular albinism and multiple congenital anomalies |
| AR6 | 27 years (M) | mtDNA | Suspected MELAS, encephalopathy and seizures, disinhibition, aggressive behavior |
| AR7 | 14 years (M) | mtDNA | Recurrent vomiting, abdominal pain, migraines, seizures, abnormal EEG but normal MRI, celiac disease |
| AR8 | 7 months (M) | both | Bilateral pneumothoraces, hypertonia, abnormal brain MRI |
| AR9 | 18 years (M) | 108 | Rule out mt disease |
| AR10 | 12 years (M) | 108 | Rule out mt disease |
| AR11 | 8 years (M) | 108 | Rule out mt disease |
| AR12 | 4 years (F) | 108 | Severe, progressive encephalopathy |
| AR13 | 6 months (F) | 108 | Possible Kearns-Sayre syndrome |

F, female; M, male; CK, creatine kinase; EEG, electroencephalography; MELAS, mitochondrial myopathy, encephalopathy, lactic acidosis, and stroke-like episodes; MRI, magnetic resonance imaging; mt, mitochondrial.
* mtDNA, only LR-PCR assay performed; 108, only 108-gene panel performed; both, both mtDNA and 108-gene assay performed.

Primer sequences for mtDNA LR-PCR were as listed in Table 2.
Two sets of primers were used to independently amplify the mitochondrial genome in duplicate: Mito1, Mito2, and Mito3 primers and hmt1 and hmt2 primers (Figure 1). The two sets of primers were used to avoid allele dropouts. However, because of the heteroplasmy levels associated with mtDNA, this does not ensure that all mtDNA targeted regions are amplified. Primers were synthesized by Integrated DNA Technologies (Coralville, IA) with a 5′ amino C6 modification to avoid sequence overrepresentation of amplicon ends. [28: Harismendy O, Frazer K. Method for improving sequence coverage uniformity of targeted genomic intervals amplified by LR-PCR using Illumina GA sequencing-by-synthesis technology. Biotechniques. 2009;46:229-231] Sequences and primers for emulsion PCR enrichment were designed and supplied by RainDance Technologies based on our list of the 108 nuclear genes. All coding and noncoding exons for each transcript and a minimum of ±40 bases of intronic sequence were specified. A total of 1301 amplicons representing 1208 unique loci enriched 680,235 bases of the nuclear genome. Of these amplified bases, 459,116 bases represented exons and intron boundaries of interest, with the remaining 221,119 bases amplifying intergenic or intragenic regions. The average amplicon length was 523 bp (±330 bp), with a minimum of 193 bp and a maximum of 1500 bp.

Table 2. LR-PCR Primers for mtDNA Amplification

| Primer | Sequence* | Length† (bp) |
|---|---|---|
| Mito1 F | 5′-ACATAGCACATTACAGTCAAATCCCTTCTCGTCCC-3′ | 3968 |
| Mito1 R | 5′-TGAGATTGTTTGGGCTACTGCTCGCAGTGC-3′ | |
| Mito2 F | 5′-TACTCAATCCTCTGATCAGGGTGAGCATCAAACTC-3′ | 5513 |
| Mito2 R | 5′-GCTTGGATTAAGGCGACAGCGATTTCTAGGATAGT-3′ | |
| Mito3 F | 5′-TCATTTTTATTGCCACAACTAACCTCCTCGGACTC-3′ | 7814 |
| Mito3 R | 5′-CGTGATGTCTTATTTAAGGGGAACGTGTGGGCTAT-3′ | |
| hmt F1 569 | 5′-AACCAAACCCCAAAGACACC-3′ | 9289 |
| hmt R1 9819 | 5′-GCCAATAATGACGTGAAGTCC-3′ | |
| hmt F2 9611 | 5′-TCCCACTCCTAAACACATCC-3′ | 7626 |
| hmt R2 626 | 5′-TTTATGGGGTGATGTGAGCC-3′ | |

F, forward; R, reverse.
* Each primer has a 5′ amino C6 modification.
† Length represents the total length in base pairs of the amplicon derived from the primer pairs (including primers).

mtDNA was coisolated with genomic DNA isolated from blood samples by PureGene DNA extraction (Qiagen, Valencia, CA). The PCR master mix contained 2.5 U TaKaRa LA Taq (Clontech Laboratories, Mountain View, CA), 1× TaKaRa LA Buffer II, 1.6 mmol/L dNTPs, 1 μmol/L forward and reverse primers, and 10 ng human genomic DNA (final concentrations). PCR was performed in an ABI 9700 system (Life Technologies, Carlsbad, CA) using the following conditions: 95°C for 2 minutes, then 30 cycles of 95°C for 15 seconds and 68°C for 10 minutes, followed by 68°C for 20 minutes and finally a 4°C hold. Amplicons were analyzed by agarose gel electrophoresis, individually purified by QIAquick PCR purification (Qiagen), quantified with a NanoDrop 8000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA), and pooled in equimolar amounts. The pooled DNA (1 μg) was sheared using a Covaris S2 ultrasonicator (Woburn, MA) for two cycles at 9°C, as follows: duty cycle 10%, intensity 5, 200 cycles per burst, and time 60 seconds per cycle.
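The equimolar pooling step can be illustrated with a short calculation. The following is a minimal sketch, not the laboratory's documented procedure: it assumes an average mass of 660 g/mol per base pair of double-stranded DNA, uses the Mito1 to Mito3 amplicon lengths from Table 2, and takes hypothetical spectrophotometer readings; the function names are illustrative only.

```python
# Approximate average molar mass of one base pair of double-stranded DNA (g/mol).
# This is a common rule-of-thumb value and an assumption of this sketch.
BP_MASS_G_PER_MOL = 660.0


def molarity_nmol_per_l(ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA concentration in ng/uL to nmol/L for a fragment of length_bp."""
    return ng_per_ul * 1e6 / (BP_MASS_G_PER_MOL * length_bp)


def equimolar_volumes(amplicons: dict, fmol_each: float) -> dict:
    """Return the volume (uL) of each amplicon that contributes fmol_each femtomoles.

    amplicons maps a name to (length_bp, measured concentration in ng/uL).
    """
    volumes = {}
    for name, (length_bp, ng_per_ul) in amplicons.items():
        fmol_per_ul = molarity_nmol_per_l(ng_per_ul, length_bp)  # 1 nmol/L == 1 fmol/uL
        volumes[name] = fmol_each / fmol_per_ul
    return volumes


# Amplicon lengths are from Table 2; the concentrations are hypothetical readings.
example = {
    "Mito1": (3968, 45.0),
    "Mito2": (5513, 60.0),
    "Mito3": (7814, 52.0),
}

for name, vol in equimolar_volumes(example, fmol_each=100.0).items():
    print(f"{name}: {vol:.2f} uL")
```

Because each amplicon contributes the same number of molecules, longer amplicons contribute proportionally more mass to the pool, which is what keeps per-base coverage roughly even across the genome.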
The pooled amplicons were processed using a SPRI-TE nucleic acid extractor (Beckman Coulter, Brea, CA) to create indexed, paired-end libraries for sequencing on the Illumina MiSeq (2 × 150 base reads) or HiSeq (2 × 100 base reads) platform, according to the manufacturer's protocol (Illumina, San Diego, CA).

For RainDance enrichment of the 108-gene panel, genomic DNA (10 μg) was sheared for 10 cycles at 20°C (duty cycle 20%, intensity 0.1, 1000 cycles per burst, and time 60 seconds per cycle). The RainDance emulsion PCR library enrichment was performed according to the manufacturer's protocol. The PCR master mix contained 3.75 U Platinum HiFi Taq polymerase (Life Technologies), 14.4% w/v droplet stabilizer (RainDance), 1× HiFi PCR buffer, 0.68 mmol/L dNTPs, 2.5 mmol/L MgCl2, 7.2% w/v dimethyl sulfoxide, 576 mmol/L betaine, and 1.5 μg sheared DNA (all final concentrations). After the merge, the RainDance library was amplified using the following conditions: 94°C for 2 minutes and then 55 cycles of 94°C for 15 seconds, 54°C for 15 seconds, and 68°C for 30 seconds, followed by 68°C for 10 minutes and a 4°C hold. After amplification, emulsions were disrupted and the PCR products were purified before end repair and ligation according to the manufacturer's protocol. The ligated amplicon DNA was sheared for three cycles at 20°C (duty cycle 10%, intensity 5, 2000 cycles per burst, and time 60 seconds per cycle) to create fragments in the range of 300 to 600 bp for Illumina libraries. The sonicated DNA was processed using a Beckman Coulter SPRI-TE nucleic acid extractor to create indexed, paired-end libraries for sequencing on the Illumina NGS platforms.

All libraries were quantified by quantitative real-time PCR (qPCR) (Kapa Biosystems, Woburn, MA) to estimate concentration. Samples were indexed during Illumina library preparation and pooled to leverage platform throughput. After pooling, a second qPCR was performed to estimate the final concentration before NGS sequencing.
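How much sequence each pooled library needs follows from the target size and the desired depth. The sketch below works through that arithmetic using figures stated for this assay: a 16,569-bp mtDNA target at 5000-fold, 680,235 enriched bases for the 108-gene panel at 750-fold, and a 40% to 60% on-target read fraction for the panel. The function names and the assumption of uniform coverage are illustrative and are not part of the published method.

```python
MTDNA_BP = 16_569    # mitochondrial genome length reported in the text
PANEL_BP = 680_235   # bases enriched for the 108-gene panel


def required_yield_mb(target_bp: int, mean_coverage: float) -> float:
    """Aligned sequence (Mb) needed for a mean coverage, assuming every read is on target."""
    return target_bp * mean_coverage / 1e6


def mean_on_target_coverage(yield_mb: float, target_bp: int, on_target: float) -> float:
    """Mean coverage when only a fraction on_target of the yield aligns to the target."""
    return yield_mb * 1e6 * on_target / target_bp


print(f"mtDNA at 5000-fold: {required_yield_mb(MTDNA_BP, 5000):.1f} Mb")
print(f"Panel at 750-fold:  {required_yield_mb(PANEL_BP, 750):.1f} Mb")

# Effect of the reported 40% to 60% on-target fraction on a 500 Mb panel run.
for fraction in (0.4, 0.6):
    cov = mean_on_target_coverage(500, PANEL_BP, fraction)
    print(f"500 Mb at {fraction:.0%} on target: about {cov:.0f}-fold mean coverage")
```

The first two results correspond to the approximately 83 Mb and 510 Mb pooling targets. The mean values in the loop are averages only; uneven amplicon representation in the panel means individual regions can fall well below the mean, which is why a per-region minimum is specified separately.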
For pooling, the target average coverages were 5000-fold (approximately 83 Mb) for the mtDNA assay and 750-fold (approximately 510 Mb) for the 108-gene panel. The high targeted coverage reflects the goal of very deep sequencing data for both assays. Although the mtDNA assay generally has equimolar amplification product representation, the 108-gene panel does not, and the proportion of reads aligned to the regions of interest is between 40% and 60%. To maintain a minimum of 50-fold coverage for each targeted region in the 108-gene panel, 500 Mb of sequence reads are required.

After NGS, the raw data are parsed by index using CASAVA (Consensus Assessment of Sequence and Variation) software (version 1.8 series; Illumina) and converted to fastq file format. Two parallel analyses were performed for the mtDNA and the 108-gene panel: one with CLC Genomics Workbench software (version 4.7+; CLC Bio, Cambridge, MA) and a second using an internally developed pipeline. The two analyses were performed by two different bioinformaticists to determine variant concordance and ensure sample integrity. In the first analysis, the CLC Bio software was used to perform alignments, single nucleotide polymorphism and deletion/insertion polymorphism calls, and alignment viewing. Variant filtering was performed using in-house data sets derived from exome, whole-genome, and previous mitochondrial disorder sequencing experiments. Further filtering was performed using the dbSNP database of single-nucleotide polymorphisms (http://www.ncbi.nlm.nih.gov/projects/SNP, build 132), the HGMD Human Gene Mutation Database (http://www.hgmd.cf.ac.uk/ac/index.php), and the MITOMAP database (http://www.mitomap.org/mitomap). The second analysis used a pipeline consisting of the Burrows-Wheeler Aligner (BWA version 0.6.1+; http://bio-bwa.sourceforge.net), [29: Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754-1760] [30: Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078-2079] Picard utilities (version 1.50+; http://picard.sourceforge.net), [29] SAMtools utilities (version 0.1.16+; http://samtools.sourceforge.net), [30] the GATK genome analysis toolkit (version 1.6+; http://www.broadinstitute.org/gatk), [31: McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297-1303] [32: DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, McKenna A, Fennell TJ, Kernytsky AM, Sivachenko AY, Cibulskis K, Gabriel SB, Altshuler D, Daly MJ. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43:491-498] the IGV integrative genomics viewer (version 2.0; http://www.broadinstitute.org/igv), [33: Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. Integrative genomics viewer. Nat Biotechnol. 2011;29:24-26] and the Annovar genetic variant annotation tool (http://www.openbioinformatics.org/annovar). [34: Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38:e164] The CLC and internal pipeline analyses are to be performed by two different individuals and are used to determine alignment and variant calling concordance between pipelines and bioinformaticist reports.

The paired-end mtDNA fastq files were aligned, with PCR duplicates removed, to the NC_012920 reference sequence; the human genome was not included. Hard cutoffs for alignment and variant calling were coverage of >1000-fold, heteroplasmy levels of >10%, and a minimum Q-score of 30. Disease-causing variants and variants of unknown significance were verified by Sanger sequencing. The minimum required output for variant reports contains nucleotide position in the mitochondrial genome, gene name, gene symbol, coverage, reference nucleotide, alternative nucleotide, percentages of forward/reverse reads, percentage of variant reads, mitochondrial sequence m. number, protein sequence p. number, amino acid change, the dbSNP RefSNP rs number link (if available), and MitoMap reference data and link (if available). All variants of interest contain reference identifier links to publications, OMIM, and previous clinical interpretation (if available). Quality metrics included average Q-score for aligned reads, total variants detected, average coverage, and the percentage of positions with coverage at >1000-fold.

The RainDance 108-gene panel was sequenced as paired-end reads, but was analyzed as single-end reads because of the concatenation step required for RainDance library preparation. Single-end fastq sequences were aligned to a masked reference genome (human genome build hg19) based on amplicon.bed coordinates supplied by RainDance.
Unlike in whole-genome sequencing, targeted exome sequencing, or the mtDNA assay, PCR duplicates are not removed. The data output is further filtered to report regions
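The hard cutoffs described for mtDNA variant calls (coverage >1000-fold, heteroplasmy >10%, minimum Q-score of 30) can be expressed as a simple filter. This is a minimal sketch under stated assumptions: the `VariantCall` record and its fields are hypothetical, heteroplasmy is taken as the fraction of reads carrying the alternative allele, and the Q-score threshold is applied to a mean base quality at the site, which the text does not specify. The read counts in the example are invented for illustration.

```python
from dataclasses import dataclass

MIN_COVERAGE = 1000       # calls require coverage greater than this
MIN_HETEROPLASMY = 0.10   # calls require a variant fraction greater than this
MIN_QSCORE = 30           # minimum acceptable quality score


@dataclass
class VariantCall:
    """Hypothetical per-site record; field names are assumptions of this sketch."""
    position: int
    ref: str
    alt: str
    coverage: int
    alt_reads: int
    mean_base_quality: float

    @property
    def heteroplasmy(self) -> float:
        """Fraction of reads at this position that carry the alternative allele."""
        return self.alt_reads / self.coverage if self.coverage else 0.0


def passes_cutoffs(v: VariantCall) -> bool:
    """Apply the three hard cutoffs to one call."""
    return (
        v.coverage > MIN_COVERAGE
        and v.heteroplasmy > MIN_HETEROPLASMY
        and v.mean_base_quality >= MIN_QSCORE
    )


# m.8993T>G is one of the validation variants; the counts below are made up.
calls = [
    VariantCall(8993, "T", "G", coverage=5200, alt_reads=1300, mean_base_quality=36.0),
    VariantCall(3460, "G", "A", coverage=4800, alt_reads=240, mean_base_quality=35.0),
    VariantCall(11778, "G", "A", coverage=800, alt_reads=790, mean_base_quality=37.0),
]

for call in calls:
    status = "report" if passes_cutoffs(call) else "filtered"
    print(f"m.{call.position}{call.ref}>{call.alt}: "
          f"{call.heteroplasmy:.1%} at {call.coverage}x -> {status}")
```

A filter of this kind only decides which calls reach the variant report; verification by Sanger sequencing, as described for the assay, remains a separate step.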

Reference(s)