An Oligonucleotide Microarray for High-Throughput Sequencing of the Mitochondrial Genome
2006; Elsevier BV; Volume: 8; Issue: 4; Language: English
10.2353/jmoldx.2006.060008
ISSN 1943-7811
Authors: Shaoyu Zhou, Keyaunoosh Kassauei, David J. Cutler, Giulia C. Kennedy, David Sidransky, Anirban Maitra, Joseph A. Califano
Topic(s): Molecular Biology Techniques and Applications
Abstract: Previously we developed an oligonucleotide sequencing microarray (MitoChip) as an array-based sequencing platform for rapid and high-throughput analysis of mitochondrial DNA. The first-generation MitoChip, however, was not tiled with probes for the noncoding D-loop region, a site frequently mutated in human cancers. Here we report the development of a second-generation MitoChip (v2.0) with oligonucleotide probes to sequence the entire mitochondrial genome. In addition, the MitoChip v2.0 contains redundant tiling of sequences for 500 of the most common haplotypes, including single-nucleotide changes, insertions, and deletions. Sequencing results from 14 primary head and neck tumor tissues demonstrated that the v2.0 MitoChips detected a larger number of variants than the original version. Multiple coding region variants detected only by the second-generation MitoChips, and not by the earlier chip version, were further confirmed with conventional sequencing. Moreover, 31 variations in the noncoding region were identified using MitoChips v2.0. Replicate experiments demonstrated >99.99% reproducibility in the second-generation MitoChip. In seven head and neck cancer samples with matched lymphocyte DNA, the MitoChip v2.0 detected at least one cancer-associated mitochondrial mutation in four (57%) samples. These results indicate that the second-generation MitoChip is a high-throughput platform for identification of mitochondrial DNA mutations in primary tumors.

Identification of sensitive and specific molecular markers for early diagnosis of cancer is an area of active research, with immense potential benefits to patients. For example, patients with head and neck squamous cell carcinoma demonstrate a 90% 5-year survival rate if detected at stage I laryngeal cancer compared to a [...] 99.99% reproducibility. However, a major limitation of the earlier version of the MitoChip was its lack of tiled oligonucleotide probes for sequencing the mitochondrial displacement loop (D-loop). The D-loop is a noncoding region in the mitochondrial DNA molecule that controls both replication and transcription.
Many of the somatic mitochondrial DNA mutations of primary tumors are found in the D-loop region (ref. 17: Sanchez-Cespedes M, Parrella P, Nomoto S, Cohen D, Xiao Y, Esteller M, Jeronimo C, Jordan RC, Nicol T, Koch WM, Schoenberg M, Mazzarelli P, Fazio VM, Sidransky D. Identification of a mononucleotide repeat as a major target for mitochondrial DNA alterations in human tumors. Cancer Res. 2001;61:7015-7019; ref. 18: Tamori A, Nishiguchi S, Nishikawa M, Kubo S, Koh N, Hirohashi K, Shiomi S, Inoue M. Correlation between clinical characteristics and mitochondrial D-loop DNA mutations in hepatocellular carcinoma. J Gastroenterol. 2004;39:1063-1068). Additional conventional sequencing of the D-loop region is therefore required to provide a complete mitochondrial genomic profile of cancer samples when the MitoChip v1.0 is used as a screening tool.

To fully realize a high-throughput strategy for analysis of mtDNA, we present here a second-generation version of the MitoChip, referred to as MitoChip v2.0. The major difference is that the v2.0 chips are tiled with oligonucleotide probes for the entire 16.5-kb sequence of mtDNA (Table 1). In addition, to increase the sensitivity of mutation detection, redundant probes are tiled on the MitoChip v2.0 for 500 of the most common haplotypes observed in the MitoMap public database (http://www.mitomap.org/; accessed April 2004). The reduced feature size of 8 μm confers another advantage of the v2.0 MitoChips over the v1.0 MitoChips because it reduces oligonucleotide synthesis costs and decreases bleed-through between adjacent tiled sequences. To evaluate the second-generation MitoChip, we performed a comparison study between the v1.0 and v2.0 chips. The sequencing results of 14 primary tumor tissues demonstrated comparable sensitivity in the v2.0 MitoChips with >99.99% reproducibility, as reported previously for the v1.0 array.
In addition, 31 variations in the D-loop region were successfully detected by the v2.0 chips. With sequencing of paired lymphocytes, somatic mtDNA mutations were detected in four of seven head and neck tumors with the MitoChip v2.0. The second-generation MitoChip has thus proved to be a high-throughput sequencing microarray for sequencing the entire mitochondrial genome for early detection and clinical screening of human tumors.

Table 1. Design of v2.0 MitoChip
Total double-stranded DNA sequenced per MitoChip: 16,569 bp
Control (plasmid) DNA: 980 bp
Mitochondrial coding sequence: 15,451 bp (includes RCRS 573 through 16,024)
Mitochondrial D-loop sequence: 1,118 bp (includes RCRS 16,025 through 572)

Materials and Methods

Design of Human Mitochondrial v2.0 Oligonucleotide Microarray

The MitoChip v2.0 was obtained from Affymetrix (commercially available GeneChip Human Mitochondrial Resequencing Array 2.0; Santa Clara, CA). Sequences comprising both strands of the entire 16,568-bp human mitochondrial genome were synthesized as overlapping 25-mers on high-density oligonucleotide arrays with 8 × 8-μm features. The Cambridge Reference Sequence was tiled, as well as sequences representing 500 of the most common haplotypes observed in the MitoMap public database (http://www.mitomap.org/), which include single-nucleotide changes, insertions, and deletions. The MitoChip was fabricated using standard photolithography and solid-phase DNA synthesis by Affymetrix, as described previously (ref. 19: Pease AC, Solas D, Sullivan EJ, Cronin MT, Holmes CP, Fodor SP. Light-generated oligonucleotide arrays for rapid DNA sequence analysis. Proc Natl Acad Sci USA. 1994;91:5022-5026; ref. 20: Lipshutz RJ, Fodor SP, Gingeras TR, Lockhart DJ. High density synthetic oligonucleotide arrays. Nat Genet. 1999;21:20-24). To query any given site from the human mitochondrial reference sequence, four features are tiled on the MitoChip. The four features differ only by the central (13th) base, which consists of each of the four possible nucleotides.

DNA Sample Source and Preparation

The performance of the v1.0 and v2.0 MitoChips was evaluated using DNA from 14 head and neck cancer samples from 14 individual patients. Of the 14 samples, seven had matched normal (lymphocyte DNA) samples to identify cancer-associated mutations. All tumor and normal specimens were collected before surgical resection with prior consent from patients. Tumor specimens were frozen and microdissected on a cryostat so that the tumor samples contained greater than 70% neoplastic cells. DNA from tumor sections was digested with 1% sodium dodecyl sulfate/proteinase K, extracted with phenol-chloroform, and ethanol precipitated. Control DNAs from peripheral lymphocytes were processed in the same manner, as described previously (ref. 21: Ahrendt SA, Halachmi S, Chow JT, Wu L, Halachmi N, Yang SC, Wehage S, Jen J, Sidransky D. Rapid p53 sequence analysis in primary lung cancer using an oligonucleotide probe array. Proc Natl Acad Sci USA. 1999;96:7382-7387).

Polymerase Chain Reaction (PCR) Amplification

The entire mitochondrial DNA sequence was amplified in three overlapping long PCR fragments, with each reaction containing 50 ng of genomic DNA. The primers for PCR amplification were the same as in the previous report for MitoChip v1.0 (ref. 16: Maitra A, Cohen Y, Gillespie SE, Mambo E, Fukushima N, Hoque MO, Shah N, Goggins M, Califano J, Sidransky D, Chakravarti A. The human MitoChip: a high-throughput sequencing microarray for mitochondrial mutation detection. Genome Res. 2004;14:812-819) and were selected using the Amplify 1.2 program as described (ref. 22: Cutler DJ, Zwick ME, Carrasquillo MM, Yohn CT, Tobin KP, Kashuk C, Mathews DJ, Shah NA, Eichler EE, Warrington JA, Chakravarti A. High-throughput variation detection and genotyping using microarrays. Genome Res. 2001;11:1913-1925). Amplification was accomplished in 50-μl PCRs performed in thin-walled polypropylene plates using the high-fidelity TaKaRa LA Taq (TaKaRa Mirus Bio, Madison, WI), as described in the previous report (ref. 22) with slight modifications. Specifically, all PCRs were optimized to perform under standard conditions without the addition of 5% dimethyl sulfoxide, which was previously reported to aid in the amplification of GC-rich regions (ref. 22). The cycling conditions for all reactions were as follows: 1) 95°C for 2 minutes, 2) 95°C for 15 seconds, 3) 68°C for 7 minutes, 4) repeat steps 2 and 3 a further 29 times, 5) final extension for 12 minutes. As a control for PCR amplification and subsequent hybridization, a 7.5-kb plasmid DNA (Tag IQ-EX template) was amplified concomitantly with the test samples, using forward and reverse primers included in the CustomSeq control kit (Affymetrix, Inc.). The specificity of the reactions was confirmed by agarose gel electrophoresis. The PCR products were purified using the QIAQuick PCR Clean up kit (Qiagen, Inc., Valencia, CA), and the resultant purified DNA was resuspended in 30 to 40 μl of EB buffer (Affymetrix, Inc.).
The concentration of each purified PCR product was determined spectrophotometrically, and a yield of 100 to 200 ng/μl of PCR reactions was routinely obtained.

Fragment Pooling, DNA Fragmentation, Labeling, and Chip Hybridization

The procedures on sample pooling, DNA fragmentation, and labeling were identical for both v1.0 and v2.0 MitoChips. Specifically, to obtain optimal performance across the microarray, we pooled equimolar amounts from the three amplified fragments to ensure that an equal number of targets existed for each probe. The pooled DNA fragments were digested with DNase I for 15 minutes in a 50-μl reaction containing Affymetrix fragmentation reagent (0.2 U of DNase I/μg DNA), 5 μl of OnePhorAll buffer (Amersham Life Sciences, Arlington Heights, IL), and EB buffer. Samples were then incubated at 95°C for 15 minutes to inactivate DNase I. Fragmented DNA was labeled by adding 2.0 μl of GeneChip DNA labeling reagent and 3.4 μl of 30 U/μl terminal deoxynucleotidyl transferase (both from Affymetrix).

Prehybridization, hybridization, washing, and scanning of the MitoChip were performed as described in the Affymetrix CustomSeq Resequencing protocol. The prehybridizations were performed for 15 minutes in 80-μl (for v2.0 chips) or 200-μl (for v1.0 chips) solution containing 3 mol/L tetramethylammonium chloride, 0.1% Tween 20, and 10 mmol/L Tris, pH 7.8. The chips were hybridized for 16 hours at 48°C with 60 rpm rotation in a hybridization solution containing 3 mol/L tetramethylammonium chloride, 100 μg/ml herring sperm DNA, 500 μg/ml bovine serum albumin, 10 mmol/L Tris, pH 7.8, 0.01% Tween 20, and 200 pmol/L control oligo.
The chips were then washed on the Affymetrix fluidics station using the preprogrammed CustomSeq Resequencing wash protocols.

Automated Batch Analysis of Microarray Data

The analysis of microarray data for both v1.0 and v2.0 MitoChips was done using RA tools, a modified version of the previously described adaptive background genotype-calling scheme (ABACUS) (ref. 22); the open source software is available at http://www.dpgp.org/. Briefly, RA tools uses an objective statistical framework to assign each genotype call a quality score, which is the difference between the log (base 10) likelihood of the best-fitting and the second-best-fitting statistical model for assigning a genotype at any position on the sequencing array. The total quality score threshold (totThresh) is the quality score that a given base has to exceed to be called. Increasing this value requires increased support for base calls and, as a consequence, fewer bases are called. Bases that fail to reach this threshold are called "N." The optimum total threshold quality score was determined empirically to be 12, which yields the highest base call rate with the lowest discrepancy between genotypes for replicate samples (see below). The original ABACUS algorithm and the RA tools version have been successfully applied for high-throughput variation detection in human and pathogenic organisms (ref. 22; ref. 23: Wong CW, Albert TJ, Vega VB, Norton JE, Cutler DJ, Richmond TA, Stanton LW, Liu ET, Miller LD. Tracking the evolution of the SARS coronavirus using high-throughput, high-density resequencing arrays. Genome Res. 2004;14:398-405), as well as for detection of mitochondrial sequence variations in human embryonic stem cells (ref. 24: Maitra A, Arking DE, Shivapurkar N, Ikeda M, Stastny V, Kassauei K, Sui G, Cutler DJ, Liu Y, Brimble SN, Noaksson K, Hyllner J, Schulz TC, Zeng X, Freed WJ, Crook J, Abraham S, Colman A, Sartipy P, Matsui S, Carpenter M, Gazdar AF, Rao M, Chakravarti A. Genomic alterations in cultured human embryonic stem cells. Nat Genet. 2005;37:1099-1103).

Conventional Dye Terminator Sequencing of Mitochondrial DNA

To confirm a subset of the variations identified in the coding region by the v2.0 MitoChip, six pairs of primers were designed to amplify mitochondrial segments. The forward and reverse primers were, respectively, 5′-CGATCAAAAGGGACAAGCAT-3′ and 5′-GGTTTGGGGCTAGGTTTAG-3′ for verifying site 921; 5′-GCTAAGACCCAAACTGGGATT-3′ and 5′-GGCCCTGTTCAACTAAGCAC-3′ for sites 1189 and 1393; 5′-TGACCGCTCTGAGCTAAACC-3′ and 5′-CTACCTTTGCACGGTTAGGG-3′ for site 1811; 5′-AACATCACCTCTAGCATCACCA-3′ and 5′-CGTCAGCGAAGGGTTGTAGT-3′ for site 3196; 5′-AGCATTCCCCCTCAAACCTA-3′ and 5′-GAGAGGAGGGTGGATGGAAT-3′ for site 4792; and 5′-CCATCCCTACGCATCCTTTA-3′ and 5′-TCCGAGGAGGTTAGTTGTGG-3′ for site 8392. To confirm variations identified in the D-loop region, two overlapping fragments of 613 and 679 bp were PCR amplified and sequenced using primers reported previously (ref. 25: Suzuki M, Toyooka S, Miyajima K, Iizasa T, Fujisawa T, Bekele NB, Gazdar AF. Alterations in the mitochondrial displacement loop in lung cancers. Clin Cancer Res. 2003;9:5636-5641). The purified DNA products were then sequenced on a dye terminator platform using the ABI BigDye cycle sequencing kit (Applied Biosystems, Foster City, CA).

Results

Total Mitochondrial Base Calls

Samples used in the present study included 14 head and neck tumor DNA samples and seven matched lymphocyte DNA samples. The entire mitochondrial genome was amplified with three PCRs. To simplify the experiments, we modified the PCR conditions used previously (ref. 16) and obtained an optimal PCR program for all of the reactions, as described in the Materials and Methods. In addition, we reduced the total amount of DNA to 150 ng for the amplification of the entire mitochondrial genome. With this DNA we performed 47 MitoChip assays. Of the 47 assays, 14 were v1.0 MitoChip assays of the tumor DNAs; the same DNA was also applied to v2.0 MitoChips for the purpose of comparison. Four samples selected from the original 14 samples were repeated with v2.0 MitoChips to examine the reproducibility of the v2.0 chips. Each sample was repeated three times; thus, a total of 12 comparisons was made to estimate the reproducibility of the v2.0 MitoChips. Finally, seven matched lymphocyte DNA samples (serving as normal controls) were sequenced with the v2.0 MitoChips to identify the mutations. The same procedures for DNA fragmentation and hybridization were used for both v1.0 and v2.0 MitoChips. A different washing program was applied to the two versions of the MitoChips because of the different designs. Using a totThresh of 12, RA tools assigned a mean base call rate of 96.8% (ranging from 95.8 to 97.8%) in v1.0 chips.
For v2.0 chips the overall call rate across all of the chips was 94.6% (ranging from 89.2 to 96.8%) (Table 2). Although most chips exhibited a call rate of ∼95%, one v2.0 MitoChip had a call rate of 89.2%. Although the call rate was slightly lower in v2.0 MitoChips compared with v1.0 chips, no significant differences were found in the mitochondrial base call rates between the v1.0 and v2.0 MitoChips.

Table 2. Summary of v1.0 and v2.0 MitoChip Assays
Total MitoChip assays: 47
Total v1.0 MitoChip assays: 14
  Total mitochondrial DNA sequenced by v1.0 chips: 15,451 × 14 = 216,314 bp
  Total percentage of bases called in v1.0 chips: 96.8%
Total v2.0 MitoChip assays: 33
  Total mitochondrial DNA sequenced by v2.0 chips: 16,569 × 33 = 546,777 bp
  Total percentage of bases called in v2.0 chips: 94.6%

Sensitivity of mtDNA Variation Detection between the v1.0 and v2.0 MitoChips

To evaluate the application of the v2.0 MitoChips, we first compared the sensitivity of variation detection between the v1.0 and v2.0 chips. Independent aliquots of the same PCR products were used for hybridization to both v1.0 and v2.0 MitoChips. For comparison purposes, the use of the same PCR products was necessary to rule out any possibility of PCR-induced artifacts, although the polymerase used is known to have extremely high fidelity. Sequence variants were identified by comparison with the mitochondrial reference sequence (http://www.mitomap.org). A difference in sequence when compared to the reference sequence was referred to as a variation. When a variation was present in a tumor but absent in the matched normal tissue, we termed it a somatic mutation. Of the 120 variations, 75 were detected by both v1.0 and v2.0 MitoChips, 17 were detected only by MitoChip v1.0, and the other 28 were detected only by MitoChip v2.0 (Figure 1). From these comparison experiments the v2.0 MitoChips seem to exhibit a higher detection sensitivity.
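The totals in Table 2 and the overlap counts reported above are simple arithmetic, so they can be re-derived as a quick consistency check. The sketch below is illustrative only: the variable names are hypothetical, the figures are copied from the text and Table 2, and nothing here is part of RA tools or any MitoChip software.

```python
# Consistency check of the assay and variation counts reported in the text.
# All numbers are copied from the Results and Table 2; names are illustrative.

V1_BASES_PER_CHIP = 15_451  # bp sequenced per v1.0 chip (Table 2)
V2_BASES_PER_CHIP = 16_569  # bp sequenced per v2.0 chip (Table 2)

# Assays described in the Results.
v1_assays = 14                 # 14 tumor DNAs on v1.0 chips
v2_tumor_assays = 14           # the same 14 tumor DNAs on v2.0 chips
v2_replicate_assays = 4 * 3    # four samples, each repeated three times
v2_lymphocyte_assays = 7       # seven matched lymphocyte DNAs

v2_assays = v2_tumor_assays + v2_replicate_assays + v2_lymphocyte_assays
total_assays = v1_assays + v2_assays

v1_total_bp = V1_BASES_PER_CHIP * v1_assays
v2_total_bp = V2_BASES_PER_CHIP * v2_assays

# Partition of the variations compared between the two chip versions.
both, v1_only, v2_only = 75, 17, 28
total_variations = both + v1_only + v2_only
v1_detected = both + v1_only   # everything seen on v1.0 chips
v2_detected = both + v2_only   # everything seen on v2.0 chips

print(f"assays: total={total_assays}, v1.0={v1_assays}, v2.0={v2_assays}")
print(f"bp sequenced: v1.0={v1_total_bp:,}, v2.0={v2_total_bp:,}")
print(f"variations: total={total_variations}, "
      f"v1.0={v1_detected}, v2.0={v2_detected}")
```

Under these counts, the v2.0 chips detected more of the compared variations than the v1.0 chips, which is the basis for the sensitivity statement above.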
A question arises whether the variations detected only by the v2.0 chips were truly present or were an artifact of array-based sequencing. Since the MitoChip v1.0 has been evaluated before (ref. 16), the 75 variations detected by both types of MitoChips should be real. Therefore, we randomly selected eight of the 28 variations detected only by v2.0 MitoChips and sequenced these on a conventional dye terminator platform. All eight variations were confirmed by conventional sequencing (Table 3). It is worth noting that 17 variations identified by the v1.0 chips went undetected by the v2.0 chips. This was mainly attributable to the inability to call some nucleotide positions in the v2.0 chips (call rate of 94.2%). As mentioned above, a new feature of the v2.0 MitoChip is its tiling of probes for detecting variations in the mitochondrial D-loop region. Indeed, 31 variations were detected in the v2.0 MitoChips (Figure 1). Of the 31 variations, 20 were homoplasmic and 11 were heteroplasmic. The criterion for the identification of heteroplasmic variations was the ability of the RA sequencing software to make a reliable call for heteroplasmy, using preset threshold values. Thus, this MitoChip may be able to detect even low-level heteroplasmy. We sequenced all 31 variations detected in the D-loop region. Nineteen of the twenty homoplasmic variations were confirmed on the conventional dye terminator platform. Five of the eleven heteroplasmic variations were confirmed, and the remaining six appeared to be homoplasmic variations. It is known that the MitoChip possesses higher sensitivity than the conventional sequencing assay.
As shown in Figure 2, the heteroplasmic variation (C-C/T) in sample 2039 is clearly displayed in the sequencing chromatogram. By examining the chromatograms, two heteroplasmic variations were found to exist in sample 2075. However, in sample 2007, conventional sequencing detected a homoplasmic variation (A-G) rather than the heteroplasmic variation (A-A/G) identified by the MitoChip (Figure 2). It is likely that this heteroplasmic variation is real but cannot be detected by the conventional sequencing assay because of its low sensitivity. Further studies, including dilution experiments, may be needed to fully evaluate the ability of the MitoChip v2.0 to detect heteroplasmic variations.

Table 3. Confirmation of Mitochondrial DNA Variations by Conventional Sequencing
Sample code   Variation site   Variation (normal-variation)   Confirmation
1759          4791             A-G                            Yes
1759          1393             G-A                            Yes
2007          3197             T-C                            Yes
2051          921              T-C                            Yes
2039          1189             T-C                            Yes
2039          1811             A-G                            Yes
2188          3197             T-C                            Yes
2195          8393             C-T                            Yes

Figure 2. Chromatograms showing conventional sequencing of the D-loop in representative samples. Arrows indicate the variation sites identified by the MitoChips. The corresponding sequencing results of the MitoChip and the conventional assay are presented to the left of the corresponding chromatograms. Nucleotide sites are specified in parentheses.
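The threshold-based calling that underlies these results (a quality score defined as the difference between the log10 likelihoods of the best and second-best models, a totThresh of 12, and "N" for bases that do not exceed it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the RA tools or ABACUS implementation: the function names, the dictionary of per-model log10 likelihoods, and the example values are all hypothetical.

```python
from typing import Dict, List

TOT_THRESH = 12.0  # total quality score threshold reported in the text


def quality_score(log10_likelihoods: Dict[str, float]) -> float:
    """Best minus second-best log10 likelihood across candidate models."""
    ranked = sorted(log10_likelihoods.values(), reverse=True)
    return ranked[0] - ranked[1]


def call_genotype(log10_likelihoods: Dict[str, float],
                  tot_thresh: float = TOT_THRESH) -> str:
    """Return the best-fitting genotype, or 'N' if the score does not
    exceed the threshold."""
    best = max(log10_likelihoods, key=log10_likelihoods.get)
    if quality_score(log10_likelihoods) > tot_thresh:
        return best
    return "N"


def call_rate(calls: List[str]) -> float:
    """Fraction of positions that received a call other than 'N'."""
    return sum(c != "N" for c in calls) / len(calls)


# Hypothetical per-site model likelihoods (log10); the values are invented
# purely to exercise the rule and do not come from the paper.
sites = [
    {"A": -5.0, "C": -40.0, "G": -38.0, "T": -41.0},   # well separated
    {"A": -20.0, "C": -22.0, "G": -35.0, "T": -36.0},  # too close to call
    {"C": -8.0, "C/T": -30.0, "T": -45.0},             # well separated
]
calls = [call_genotype(s) for s in sites]
print(calls, round(call_rate(calls), 3))
```

Raising `tot_thresh` demands more support for each call and therefore lowers the call rate, which is the trade-off described in the Materials and Methods.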