Validation of Whole Genome Methylation Profiling Classifier for Central Nervous System Tumors
2022; Elsevier BV; Volume: 24; Issue: 8; Language: English
10.1016/j.jmoldx.2022.04.009
ISSN: 1943-7811
Authors: Lucas Santana-Santos, Kwok Ling Kam, David Dittmann, Stephanie De Vito, Matthew McCord, Pouya Jamshidi, Hailie Fowler, Xinkun Wang, Alan Aalsburg, Daniel J. Brat, Craig Horbinski, Lawrence J. Jennings
Topic(s): Brain Metastases and Treatment
Abstract: The 2021 WHO Classification of Tumors of the Central Nervous System includes several tumor types and subtypes for which the diagnosis relies at least partially on whole genome methylation profiling. The current approach to array DNA methylation profiling uses a reference library of tumor DNA methylation data and a machine learning-based tumor classifier. This approach was pioneered and popularized by the German Cancer Research Center (DKFZ) and University Hospital Heidelberg. This research group has kindly made their classifier for central nervous system tumors freely available as a research tool via a web-based portal. However, their classifier is not maintained in a clinical testing environment. Therefore, the Northwestern Medicine (NM) classifier was developed and validated. The NM classifier was validated using the same training and validation data sets as the DKFZ group. Using the DKFZ validation data set, the NM classifier showed high concordance (92%) and comparable accuracy (specificity 94.0% versus 84.9% for DKFZ, sensitivity 88.6% versus 94.7% for DKFZ). Receiver operating characteristic curves showed areas under the curve of 0.964 versus 0.966 for the NM and DKFZ classifiers, respectively. In addition, in-house validation was performed and performance was compared using both classifiers. The NM classifier performed comparably well and is currently offered for clinical testing.

Historically, central nervous system (CNS) tumor classification and grading have been based on histology. Although histology provides crucial diagnostic information, there is inherent subjectivity and interobserver variability in diagnoses based solely on morphology. The importance of advanced molecular testing for diagnosis, prognosis, and targeted therapy has made ancillary molecular studies integral for classification and grading of CNS tumors today [1: Horbinski C. Ligon K.L. Brastianos P. Huse J.T. Venere M. Chang S. Buckner J. Cloughesy T. Jenkins R.B. Giannini C. Stupp R. Nabors L.B. Wen P.Y. Aldape K.J. Lukas R.V. Galanis E. Eberhart C.G. Brat D.J. Sarkaria J.N. The medical necessity of advanced molecular testing in the diagnosis and treatment of brain tumor patients. Neuro Oncol. 2019; 21: 1498-1508]. The 2016 and, more recently, the 2021 World Health Organization CNS tumor classification systems [2: Louis D.N. Perry A. Wesseling P. Brat D.J. Cree I.A. Figarella-Branger D. Hawkins C. Ng H.K. Pfister S.M. Reifenberger G. Soffietti R. von Deimling A. Ellison D.W. The 2021 WHO Classification of Tumors of the Central Nervous System: a summary. Neuro Oncol. 2021; 23: 1231-1251] list many important genetic alterations for diagnosis (eg, IDH, ATRX, TP53, TERT promoter, H3F3A, BRAF, MYB, MYBL1, MN1). Capitalizing on the genomic signature of a tumor, array DNA methylation profiling has proven valuable for the classification of tumors with diagnostically challenging or unusual morphology. It is also useful for identifying subtypes within certain tumor families [3: Capper D. Jones D.T.W. Sill M. Hovestadt V. Schrimpf D. Sturm D. et al. DNA methylation-based classification of central nervous system tumours. Nature. 2018; 555: 469-474] [4: Capper D. Stichel D. Sahm F. Jones D.T.W. Schrimpf D. Sill M. Schmid S. Hovestadt V. Reuss D.E. Koelsche C. Reinhardt A. Wefers A.K. Huang K. Sievers P. Ebrahimi A. Schöler A. Teichmann D. Koch A. Hänggi D. Unterberg A. Platten M. Wick W. Witt O. Milde T. Korshunov A. Pfister S.M. von Deimling A. Practical implementation of DNA methylation and copy-number-based CNS tumor diagnostics: the Heidelberg experience. Acta Neuropathol. 2018; 136: 181-210]. Furthermore, it has proven valuable in characterizing new tumor classes [5: Sturm D. Orr B.A. Toprak U.H. Hovestadt V. Jones D.T.W. Capper D. et al. New brain tumor entities emerge from molecular classification of CNS-PNETs. Cell. 2016; 164: 1060-1072].

For instance, ependymomas are a morphologically similar family of tumors. However, DNA methylation profiling has defined biologically distinct types, which occur in distinct anatomic locations and distinct patient populations [6: Witt H. Gramatzki D. Hentschel B. Pajtler K.W. Felsberg J. Schackert G. Löffler M. Capper D. Sahm F. Sill M. von Deimling A. Kool M. Herrlinger U. Westphal M. Pietsch T. Reifenberger G. Pfister S.M. Tonn J.C. Weller M. German Glioma Network. DNA methylation-based classification of ependymomas in adulthood: implications for diagnosis and treatment. Neuro Oncol. 2018; 20: 1616-1624] [7: Ellison D.W. Aldape K.D. Capper D. Fouladi M. Gilbert M.R. Gilbertson R.J. Hawkins C. Merchant T.E. Pajtler K. Venneti S. Louis D.N. cIMPACT-NOW update 7: advancing the molecular classification of ependymal tumors. Brain Pathol. 2020; 30: 863-866]. Some ependymoma types are characterized by distinct genetic alterations (ie, ZFTA1 and YAP1 fusions). Other types, such as posterior fossa groups A and B, lack distinct mutations and are most reliably distinguished by DNA methylation profiling [7]. Medulloblastoma molecular subgroups were originally defined by transcriptome analysis [8: Taylor M.D. Northcott P.A. Korshunov A. Remke M. Cho Y.-J. Clifford S.C. Eberhart C.G. Parsons D.W. Rutkowski S. Gajjar A. Ellison D.W. Lichter P. Gilbertson R.J. Pomeroy S.L. Kool M. Pfister S.M. Molecular subgroups of medulloblastoma: the current consensus. Acta Neuropathol. 2012; 123: 465-472]. They can also be distinguished by DNA methylation profiling [9: Schwalbe E.C. Williamson D. Lindsey J.C. Hamilton D. Ryan S.L. Megahed H. Garami M. Hauser P. Dembowska-Baginska B. Perek D. Northcott P.A. Taylor M.D. Taylor R.E. Ellison D.W. Bailey S. Clifford S.C. DNA methylation profiling of medulloblastoma allows robust subclassification and improved outcome prediction using formalin-fixed biopsies. Acta Neuropathol. 2013; 125: 359-371] [10: Korshunov A. Sahm F. Zheludkova O. Golanov A. Stichel D. Schrimpf D. Ryzhova M. Potapov A. Habel A. Meyer J. Lichter P. Jones D.T.W. von Deimling A. Pfister S.M. Kool M. DNA methylation profiling is a method of choice for molecular verification of pediatric WNT-activated medulloblastomas. Neuro Oncol. 2019; 21: 214-221]. Morphology-based grading of meningiomas is complex, and certain morphologic grading criteria are subject to interobserver variability [11: Rogers C.L. Perry A. Pugh S. Vogelbaum M.A. Brachman D. McMillan W. Jenrette J. Barani I. Shrieve D. Sloan A. Bovi J. Kwok Y. Burri S.H. Chao S.T. Spalding A.C. Anscher M.S. Bloom B. Mehta M. Pathology concordance levels for meningioma classification and grading in NRG Oncology RTOG Trial 0539. Neuro Oncol. 2016; 18: 565-574]. DNA methylation profiling has proven to be an objective, reliable tool for predicting risk of recurrence in meningioma [12: Nassiri F. Mamatjan Y. Suppiah S. Badhiwala J.H. Mansouri S. Karimi S. Saarela O. Poisson L. Gepfner-Tuma I. Schittenhelm J. Ng H.-K. Noushmehr H. Harter P. Baumgarten P. Weller M. Preusser M. Herold-Mende C. Tatagiba M. Tabatabai G. Sahm F. von Deimling A. International Consortium on Meningiomas Zadeh G. Aldape K.D. DNA methylation profiling to predict recurrence risk in meningioma: development and validation of a nomogram to optimize clinical management. Neuro Oncol. 2019; 21: 901-910] [13: Maas S.L.N. Stichel D. Hielscher T. Sievers P. Berghoff A.S. Schrimpf D. et al. Integrated molecular-morphologic meningioma classification: a multicenter retrospective analysis, retrospectively and prospectively validated. J Clin Oncol. 2021; 39: 3839-3852]. Methylation profiling has also proven useful in identifying pleomorphic xanthoastrocytoma and anaplastic pleomorphic xanthoastrocytoma, which can have quite heterogeneous morphology [14: Vaubel R. Zschernack V. Tran Q.T. Jenkins S. Caron A. Milosevic D. Smadbeck J. Vasmatzis G. Kandels D. Gnekow A. Kramm C. Jenkins R. Kipp B.R. Rodriguez F.J. Orr B.A. Pietsch T. Giannini C. Biology and grading of pleomorphic xanthoastrocytoma-what have we learned about it? Brain Pathol. 2021; 31: 20-32] [15: Kam K.-L. Snuderl M. Khan O. Wolinsky J.-P. Gondi V. Grimm S. Horbinski C. Using methylation profiling to diagnose systemic metastases of pleomorphic xanthoastrocytoma. Neurooncol Adv. 2020; 2: vdz057]. DNA methylation profiling revealed that the tumor class formerly known as primitive neuroectodermal tumors of the CNS, which are characterized by small round blue cell histology, is actually composed of a number of biologically distinct tumor groups with unique genetic drivers [5].

The current approach to array DNA methylation profiling uses a reference library of tumor DNA methylation data and a machine learning-based tumor classifier. This approach was pioneered and popularized by the German Cancer Research Center (DKFZ) and University Hospital Heidelberg [3] [4]. This research group has kindly made their classifier for central nervous system tumors freely available as a research tool via a web-based portal. However, this classifier is not maintained in a clinical testing environment. Therefore, the Northwestern Medicine (NM) classifier of CNS tumors was developed and validated. The NM classifier was trained using the same 2801 publicly available samples and the same normalization strategies that were used by the DKFZ group, and it was validated using the same 1104 publicly available samples. Additional validation data were collected in-house and analyzed using both classifiers for comparison.

A previously published data set [3] comprising 3905 central nervous system tumor samples was downloaded from the Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo; accession number GSE109381). From that data set, 2801 samples were used in the original classifier training, and 1104 were used for validation; these are labeled reference set and validation set, respectively, in the GEO series metadata. In-house validation samples are available in GEO under the accession number GSE198855. Signal intensities were extracted from the methylation array raw data (IDAT files) using the minfi R package (version 1.40.0) [16: Aryee M.J. Jaffe A.E. Corrada-Bravo H. Ladd-Acosta C. Feinberg A.P. Hansen K.D. Irizarry R.A. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014; 30: 1363-1369]. Raw signals were normalized according to methods previously described [3]. Normalized intensities were used to generate beta values according to Illumina's guidelines (Illumina, San Diego, CA). No batch effects were detected, and therefore, no correction was applied (data not shown). After normalization, probes in the following categories were removed: i) probes targeting sex chromosomes (n = 11,551); ii) probes containing single nucleotide polymorphisms in the targeted CpG or within 5 bp of it (n = 7998); and iii) probes matching more than one region in the human genome (n = 3965). Publicly available data were generated using the Illumina Infinium HumanMethylation450 BeadChip (450k), whereas in-house samples were analyzed using the Illumina Infinium MethylationEPIC BeadChip (EPIC, 850K). Therefore, to make the classifier compatible with both arrays, probes not shared by the two arrays were removed (n = 33,199). A total of 428,799 probes were used in downstream analysis.

Unsupervised clustering was performed using the same methodology as previously described [3]. First, the 32,000 most variable probes from the 2801 training samples were selected. Eigenvalue decomposition was performed, and 94 nontrivial components were used for t-Distributed Stochastic Neighbor Embedding (t-SNE) analysis. The RSpectra version 0.16-0 (R Project for Statistical Computing; https://cran.r-project.org/package=RSpectra) and Rtsne version 0.15 (R Project for Statistical Computing; https://cran.r-project.org/web/packages/Rtsne/index.html) packages were used for principal component analysis and t-SNE analysis, respectively.
The following parameters were used for t-SNE analysis: theta = 0, pca = F, max_iter = 2500.

Previous studies showed good classification performance for CNS tumors with a random forest classifier followed by score recalibration using an L2-penalized multinomial logistic regression [3]. For the NM classifier, a similar framework was used, and training was performed using the original training data set of 2801 samples. First, the 100,000 most variable probes of the 428,799 probes were selected. This filter reduced computational requirements without affecting classifier performance. Next, the most important features were selected by training a random forest classifier on the filtered 100,000 probes. The 10,000 most important features were selected and used in classifier training. Importance was measured by the decrease in class-specific classification accuracy when the feature was permuted. Features were ranked by the minimal rank of the variable importance across all classes. The final classification model was trained by fitting 10,000 trees (mtry parameter = 100) with the 10,000 most important probes, selected as previously described [3]. The random forest classifier was trained using the R package randomForest version 4.7-1 (R Project for Statistical Computing; https://cran.r-project.org/web/packages/randomForest/index.html). Score recalibration is necessary to translate raw random forest scores into meaningful class probabilities that can be used in a clinical context.
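The feature-ranking rule described above (rank each probe by importance within every class, then keep the probes whose best rank in any class is smallest) can be illustrated with a small sketch. This is a plain-Python stand-in rather than the randomForest workflow used for the NM classifier; the function names `rank_descending` and `select_by_minimal_rank` and the toy importance matrix are hypothetical and chosen only for illustration.

```python
def rank_descending(values):
    """Return 1-based ranks where rank 1 is the largest value.

    Ties are broken by original index order (sorted() is stable).
    """
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, idx in enumerate(order, start=1):
        ranks[idx] = position
    return ranks


def select_by_minimal_rank(importance, n_keep):
    """Pick features by their best (minimal) rank across all classes.

    importance[c][f] is the class-specific importance of feature f for
    class c, e.g. the permutation-based decrease in accuracy.
    Returns the indices of the n_keep selected features.
    """
    n_features = len(importance[0])
    per_class_ranks = [rank_descending(row) for row in importance]
    best_rank = [min(r[f] for r in per_class_ranks) for f in range(n_features)]
    order = sorted(range(n_features), key=lambda f: best_rank[f])
    return order[:n_keep]


# Hypothetical toy example: 2 classes, 4 features.
toy_importance = [
    [0.9, 0.1, 0.3, 0.0],  # class 0: feature 0 matters most
    [0.0, 0.2, 0.1, 0.8],  # class 1: feature 3 matters most
]
selected = select_by_minimal_rank(toy_importance, n_keep=2)
```

Ranking by the minimal rank keeps a probe that is highly informative for a single class even when it is uninformative for every other class, which an average over classes would tend to discard.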
For score recalibration, an L2-penalized multinomial logistic regression was fitted, which takes random forest scores as explanatory variables and methylation class as the response variable. To prevent overfitting and biased estimates for unseen cases, the logistic regression must be fitted on independent random forest scores. Therefore, random forest scores generated by a nested fivefold cross-validation were used. The glmnet package version 4.1-3 (R Project for Statistical Computing; https://cran.r-project.org/web/packages/glmnet/index.html) was used to fit this model.

During training, the random forest classifier accuracy was assessed using fivefold nested cross-validation following the workflow previously described [3]. In each cross-validation iteration, one-fifth of the data was used as an independent validation set, whereas the remaining four-fifths was used as a training set. After the fifth iteration, all 2801 training samples had been predicted by an independently trained predictor. The final random forest classifier and score recalibration logistic regression were also validated using an independent data set of 1104 CNS tumor cases. Additional validation was performed by profiling samples collected in-house and analyzing them with both the DKFZ and NM classifiers for comparison.

All samples included in the study were human brain tissue that had been formalin-fixed for 8 hours, paraffin-embedded (FFPE), and stored at room temperature. All samples were extracted using either Covaris (FFPE DNA Kit; Covaris, Inc., Woburn, MA) or Purigen Biosystems (FFPE DNA Kit; Purigen Biosystems, Pleasanton, CA) kits following the manufacturer's instructions. Twenty samples were extracted using both methods as part of the assessment of precision.
The DNA from FFPE blocks was checked for suitability for FFPE restoration, as indicated by the Infinium HD FFPE QC Assay (Illumina, Inc.), by performing a quantitative PCR with 2 ng of FFPE DNA. The delta quantification cycle (ΔCq) was calculated by subtracting the average Cq value of the interrogated sample from the Cq value of a standard provided by the manufacturer. All FFPE samples had a ΔCq <5, which is the recommended threshold for suitability for FFPE restoration. Samples ranging from 50 ng to 500 ng of FFPE DNA were pipetted into strip tubes and processed using the EZ-96 DNA Methylation kit (Zymo Research Corp., Tustin, CA) following the manufacturer's recommendations for Infinium assays.

Bisulfite-converted DNA (8 μL) from FFPE samples was processed as previously described [17: Moran S. Vizoso M. Martinez-Cardús A. Gomez A. Matías-Guiu X. Chiavenna S.M. Fernandez A.G. Esteller M. Validation of DNA methylation profiling in formalin-fixed paraffin-embedded samples using the Infinium HumanMethylation450 Microarray. Epigenetics. 2014; 9: 829-833]. Briefly, the DNA was denatured with 4 μL of 0.1 N NaOH for 10 minutes at room temperature. A 1-hour reaction at 37°C was then performed with the Primer Pre Restore and Amplification Mix Restore reagents supplied by the kit manufacturer (Illumina, Inc.). DNA was cleaned with a ZR-96 DNA Clean & Concentrator-5 kit (Zymo Research Corp.) and eluted in 13 μL of Elution Restore Buffer reagent (Illumina, Inc.). Cleaned DNA was then denatured for 2 minutes at 95°C, followed by ligation incubation at 37°C for 1 hour with Convert Master Mix reagents (Illumina, Inc.). The resulting material was cleaned with a ZR-96 DNA Clean & Concentrator-5 kit (Zymo Research Corp.) and eluted in 10 μL of deionized water. The MethylationEPIC BeadChip assay was performed according to the Infinium HD Methylation Assay Protocol, as previously described with minor changes [16]. Eight microliters of restored FFPE bisulfite-converted DNA was used for each sample, and a hybridization volume of 28 μL was used when loading the microarray.

To establish a comprehensive CNS tumor reference cohort, genome-wide DNA methylation profiles of 2801 samples were used as previously described [3]. Unsupervised clustering was performed using t-SNE (Figure 1), which shows the presence of 91 distinct methylation classes, in agreement with previous observations [3]. A classifier was developed using the random forest algorithm as previously described [3]. Class probability is determined using a fitted L2-penalized multinomial logistic regression model that transforms the raw score into a probability, or calibrated score. To assess the validity of this classifier, an independent group of 1104 samples was analyzed [3]. The area under the curve was calculated for both the DKFZ and NM classifiers (Figure 2).
Both classifiers showed comparable areas under the curve: 0.966 and 0.964 for the DKFZ and NM classifiers, respectively (Figure 2). For the NM classifier, a score of 0.976 achieves a specificity of 1 and a sensitivity of 0.775, whereas the maximum Youden index corresponds to a score of 0.929, with a specificity of 0.976 and a sensitivity of 0.863. A score of 0.9 achieves a better balance between specificity and sensitivity (0.940 and 0.886, respectively) than the previous two thresholds and was used for subsequent analyses (Figure 2). In comparison, the DKFZ classifier achieved a specificity of 0.849 and a sensitivity of 0.947 when the recommended score of 0.9 was applied. Concordance with the DKFZ classifier for the 1104 samples is shown in Figure 3. Inaccurately classified samples are shown, and some have high calibrated scores with one classifier or the other, but not both, demonstrating the potential to reduce misclassification by using more than one classifier. DKFZ and NM concordance was 92% (1012/1104) when a cutoff of 0.9 was applied for both classifiers (Supplemental Table S1). Of the 92 nonconcordant cases, 88 had the same classification and differed only in the magnitude of the scores between classifiers. The classifiers also showed similar score distributions: the median NM score was 0.997 (SD 0.197), and the median DKFZ score was 0.999 (SD 0.167).

To assess the overall accuracy of the methylation array, genome-wide DNA methylation profiling was performed on 53 unique samples across a range of tumor types (Supplemental Table S2). The diagnoses were determined using the combination of histology, location, and history together with ancillary studies when available (eg, immunohistochemistry, next-generation sequencing, single nucleotide polymorphism microarray, and methylation array).
Accuracy of the NM classifier, when compared with the integrated sample diagnosis, was 69.8% (37/53) when a cutoff of 0.9 was applied, and 90.5% (48/53) when the classification was considered without a score threshold (Supplemental Table S2). In comparison, DKFZ accuracy for in-house samples was 77.5% (38/49) when a cutoff of 0.9 was applied, and 91.8% (45/49) when the classification was considered without a score threshold. It is important to note that four samples failed analysis on the DKFZ website, and two of these samples did not match using the NM classifier, which contributed to the lower accuracy figures for the NM classifier. The score correlation between NM and DKFZ was 0.81, with the highest variation among samples with the lowest scores (<0.5), as seen in Figure 4.

Samples that were run more than once are shown in Table 1. The following variables were changed to ensure that sample processing was reproducible: date, extraction method, and technologist. Twenty-four samples were run two or more times. Classification was concordant across replicates for all 24 sample sets (100%). There was no appreciable difference in performance for any of the parameters that were assessed. A paired t-test showed no statistically significant difference in the scores obtained by samples extracted with the Purigen and Covaris kits (n = 10, P = 0.5911).
Likewise, a paired t-test showed no statistically significant difference in the scores obtained by samples processed by technologist 1 versus technologist 2 (n = 7, P = 1.00).

Table 1. Precision Sample Results

Sample set | Extraction method | Technician | Run date | Class | Score | Status
--- | --- | --- | --- | --- | --- | ---
1 | Purigen | 1 | July 23, 2021 | Craniopharyngioma, papillary | 0.15 | No match
 | Purigen | 1 | July 23, 2021 | | 0.18 | No match
2 | Purigen | 1 | July 23, 2021 | Meningioma | 1.00 | Match
 | Purigen | 1 | July 7, 2021 | | 1.00 | Match
3 | Purigen | 1 | July 23, 2021 | Glioblastoma, IDH wildtype, H3.3 G34 mutant | 0.98 | Match
 | Purigen | 1 | July 7, 2021 | | 0.99 | Match
4 | Purigen | 1 | July 30, 2021 | No match | 0.00 | No match
 | Purigen | 1 | July 23, 2021 | Control tissue, inflammatory tumor microenvironment | 0.11 | No match
5 | Purigen | 1 | July 30, 2021 | Control tissue, reactive tumor microenvironment | 1.00 | Match
 | Purigen | 1 | September 7, 2021 | | 0.98 | Match
6 | Purigen | 1 | July 30, 2021 | IDH glioma | 1.00 | Match
 | Purigen | 1 | September 7, 2021 | | 1.00 | Match
7 | Purigen | 1 | July 23, 2021 | Ependymoma, myxopapillary | 1.00 | Match
 | Purigen | 1 | September 7, 2021 | | 1.00 | Match
8 | Purigen | 2 | July 23, 2021 | Infantile hemispheric glioma | 0.50 | No match
 | Purigen | 1 | July 23, 2021 | | 0.48 | No match
9 | Purigen | 1 | July 23, 2021 | Craniopharyngioma, adamantinomatous | 1.00 | Match
 | Covaris | 1 | July 30, 2021 | | 1.00 | Match
10 | Purigen | 2 | July 23, 2021 | Pineal parenchymal tumor | 1.00 | Match
 | Purigen | 2 | July 23, 2021 | | 1.00 | Match
 | Purigen | 1 | July 23, 2021 | | 1.00 | Match
11 | Purigen | 2 | July 23, 2021 | IDH glioma | 1.00 | Match
 | Purigen | 2 | July 23, 2021 | | 1.00 | Match
12 | Purigen | 2 | July 23, 2021 | Schwannoma | 0.95 | Match
 | Covaris | 1 | July 30, 2021 | | 0.96 | Match
 | Purigen | 1 | September 7, 2021 | | 0.94 | Match
13 | Purigen | 2 | July 23, 2021 | Solitary fibrous tumor/hemangiopericytoma | 1.00 | Match
 | Covaris | 1 | July 30, 2021 | | 1.00 | Match
 | Purigen | 1 | September 7, 2021 | | 1.00 | Match
14 | Purigen | 2 | July 23, 2021 | Pituitary adenoma, FSH/LH | 0.99 | Match
 | Purigen | 2 | July 23, 2021 | | 0.99 | Match
15 | Purigen | 2 | July 16, 2021 | Plexus tumor | 1.00 | Match
 | Purigen | 2 | July 23, 2021 | | 1.00 | Match
16 | Purigen | 2 | July 16, 2021 | IDH glioma | 1.00 | Match
 | Covaris | 2 | July 16, 2021 | | 1.00 | Match
17 | Purigen | 2 | July 16, 2021 | Meningioma | 0.82 | No match
 | Covaris | 2 | July 16, 2021 | | 0.82 | No match
18 | Purigen | 2 | July 16, 2021 | Glioblastoma, IDH wildtype | 1.00 | Match
 | Covaris | 2 | July 16, 2021 | | 0.99 | Match
19 | Purigen | 2 | July 16, 2021 | Hemangioblastoma | 1.00 | Match
 | Covaris | 2 | July 16, 2021 | | 1.00 | Match
 | Purigen | 1 | September 7, 2021 | | 1.00 | Match
20 | Purigen | 2 | July 16, 2021 | Meningioma | 1.00 | Match
 | Covaris | 2 | July 16, 2021 | | 1.00 | Match
21 | Covaris | 1 | July 30, 2021 | Medulloblastoma, SHH | 0.95 | Match
 | Purigen | 2 | July 23, 2021 | | 0.94 | Match
 | Purigen | 2 | July 23, 2021 | | 0.93 | Match
22 | Purigen | 2 | July 23, 2021 | Low-grade glioma, dysembryoplastic neuroepithelial tumor | 1.00 | Match
 | Purigen | 2 | July 23, 2021 | | 1.00 | Match
23 | Purigen | 2 | July 23, 2021 | Pilocytic astrocytoma | 1.00 | Match
 | Purigen | 2 | July 23, 2021 | | 1.00 | Match
24 | Covaris | 1 | July 30, 2021 | Medulloblastoma group 3 and 4 | 1.00 | Match
 | Purigen | 2 | July 23, 2021 | | 1.00 | Match
 | Purigen | 1 | July 23, 2021 | | 1.00 | Match

Samples with scores <0.9 are assigned a no match status. A blank Class cell indicates the same class as the first row of that sample set. FSH, follicle-stimulating hormone; IDH, isocitrate dehydrogenase; LH, luteinizing hormone; SHH, sonic hedgehog.

Samples with scores ≥0.9 are considered a match for class, whereas scores >0.5 are considered a match for subclass. There is no reference range for this brain tumor classifier. Although nontumor samples were included in the development and validation of the classifier (eg, pituitary gland anterior lobe, cerebellar hemisphere, hemispheric cortex, hypothalamus, pineal gland, white matter), these diagnostic categories are not considered to be the normal or reference range.

The greatest challenge of validating whole genome methylation profiling of brain tumors for clinical testing was the development and validation of the NM classifier. Fortunately, the extensive work of other investigators was made publicly available [3]. Using the same data sets and normalization methods, the NM classifier was developed, validated with an independent set of data files, and shown to be comparably accurate.
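One way to put the sensitivity and specificity pairs reported for the 1104-sample validation set on a single scale is the Youden index, J = sensitivity + specificity - 1. The sketch below only recomputes J from the values quoted in this article; the dictionary labels and variable names are hypothetical, and the article does not state that the authors performed the comparison this way.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# (sensitivity, specificity) pairs as reported for the validation set.
reported = {
    "NM @ 0.976": (0.775, 1.000),
    "NM @ 0.929": (0.863, 0.976),
    "NM @ 0.900": (0.886, 0.940),
    "DKFZ @ 0.900": (0.947, 0.849),
}

j_values = {
    name: round(youden_index(se, sp), 3)
    for name, (se, sp) in reported.items()
}

# NM threshold with the largest J among the three thresholds discussed.
best_nm = max(
    (name for name in j_values if name.startswith("NM")),
    key=j_values.get,
)

for name, j in j_values.items():
    print(f"{name}: J = {j:.3f}")
```

With these inputs, J is largest for the NM classifier at a score of 0.929, consistent with the text identifying that score as the Youden optimum. The 0.9 cutoff gives up a small amount of J in exchange for higher sensitivity and uses the same cutoff recommended for the DKFZ classifier.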
Interestingly, some misclassified samples received a high-confidence score from one classifier but not from both. This variation is attributed to feature selection in the development of the classifier, and it shows the potential advantage of having more than one classifier to assess a given sample. When applied to in-house validation samples and compared with the integrated sample diagnosis, NM classifier accuracy was 69.8% (37/53) when a cutoff of 0.9 was applied and 90.5% (48/53) when the classification was considered without a score threshold, versus 77.5% (38/49) and 91.8% (45/49) for DKFZ. It should be noted that the denominator for DKFZ was 49 because four samples failed analysis on the DKFZ website, and two of these samples did not match using the NM classifier, which contributed to the lower accuracy figures for the NM classifier. The accuracy without a threshold closely matched the accuracy of both classifiers (92%) on the 1104-sample data set, suggesting that the apparently lower accuracy for the in-house samples could be attributable to lower confidence scores. The lower confidence scores are likely due to the samples used in the validation, which were older stored DNA samples. Therefore, when confidence scores are low, sample quality, including low tumor cellularity, should be considered. A chromosomal copy number plot can be very useful for assessing the quality and cellularity of the DNA for a particular sample. Poorer-quality DNA will show more variance, or noise, in the copy number estimates, and lower tumor cellularity will produce smaller shifts. If the quality is deemed good, other reasons for a low confidence score include a new or rare tumor type that is not represented in the classifier or a limitation of the classifier itself. Correlation with clinical history, histology, and ancillary studies is therefore essential for cases with lower calibrated scores.
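The two accuracy figures discussed above differ only in whether the calibrated-score cutoff is enforced. The sketch below shows that distinction under one reading of the text: a call counts as correct when the predicted class equals the integrated diagnosis and, if a cutoff is given, the score also reaches it. The function name `accuracy` and the toy data are hypothetical; no in-house results are reproduced here.

```python
def accuracy(calls, cutoff=None):
    """Fraction of calls that agree with the reference diagnosis.

    calls: list of (predicted_class, calibrated_score, reference_diagnosis).
    cutoff: if given, a call below this score is counted as not matching,
            even when the predicted class is right.
    """
    correct = 0
    for predicted, score, reference in calls:
        passes_cutoff = cutoff is None or score >= cutoff
        if passes_cutoff and predicted == reference:
            correct += 1
    return correct / len(calls)


# Hypothetical toy data, not taken from the study.
toy_calls = [
    ("class A", 1.00, "class A"),  # right class, high score
    ("class B", 0.95, "class B"),  # right class, high score
    ("class A", 0.82, "class A"),  # right class, score below 0.9
    ("class C", 0.50, "class C"),  # right class, score below 0.9
    ("class D", 0.40, "class E"),  # wrong class
]

without_threshold = accuracy(toy_calls)            # 4 of 5 calls agree
with_threshold = accuracy(toy_calls, cutoff=0.9)   # 2 of 5 calls agree
```

In this toy set the gap between the two values comes entirely from correct calls with scores below 0.9, which is the pattern the text attributes to lower confidence scores in the in-house samples.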
The t-SNE plot can also be particularly useful for samples with lower confidence scores. t-SNE is a dimension-reduction approach for visualizing high-dimensional data. The plot shows an unsupervised clustering of all 2801 training samples and uses a different algorithm from the random forest used to build the tumor classifier. For high-score samples (score >0.9), t-SNE plots closely align with the classifier results. When the score is <0.9, the t-SNE plot may show proximity to a particular group, providing additional information that, together with copy number variations, mutations, histology, immunohistochemistry, and clinical presentation, helps ascertain the most likely scenario (eg, low tumor cellularity or a novel subtype).
These validation studies show methylation array testing on FFPE tissue samples to be highly robust and reproducible. Sample-specific issues, such as blood clot, necrosis, and estimated tumor cellularity, had no significant effect on the calibrated score. Also, repeat testing of 24 samples showed no difference between extraction methods, technologists, or runs, and sample input as low as 50 ng showed good performance.
As demonstrated previously (4: Capper D. et al. Practical implementation of DNA methylation and copy-number-based CNS tumor diagnostics: the Heidelberg experience. Acta Neuropathol. 2018; 136: 181-210; 5: Sturm D. et al. New brain tumor entities emerge from molecular classification of CNS-PNETs. Cell. 2016; 164: 1060-1072; 6: Witt H. et al. DNA methylation-based classification of ependymomas in adulthood: implications for diagnosis and treatment. Neuro Oncol. 2018; 20: 1616-1624; 7: Ellison D.W. et al. cIMPACT-NOW update 7: advancing the molecular classification of ependymal tumors. Brain Pathol. 2020; 30: 863-866; 8: Taylor M.D. et al. Molecular subgroups of medulloblastoma: the current consensus. Acta Neuropathol. 2012; 123: 465-472; 9: Schwalbe E.C. et al. DNA methylation profiling of medulloblastoma allows robust subclassification and improved outcome prediction using formalin-fixed biopsies. Acta Neuropathol. 2013; 125: 359-371; 10: Korshunov A. et al. DNA methylation profiling is a method of choice for molecular verification of pediatric WNT-activated medulloblastomas. Neuro Oncol. 2019; 21: 214-221), whole genome methylation profiling of brain tumors can be especially useful for pediatric brain tumors, unconventional tumors with atypical morphology and immunohistochemistry profile, subtyping of group 3 and 4 medulloblastomas, and subgrouping of ependymoma/pleomorphic xanthoastrocytoma. In addition, it could be used for other diagnostically challenging cases and for the discovery of new entities that may have therapeutic and prognostic implications. Moreover, other clinically useful information can be obtained through a methylation array, including MGMT methylation status and copy number changes. Indeed, others (18: Bady P., Delorenzi M., Hegi M.E. Sensitivity analysis of the MGMT-STP27 model and impact of genetic and epigenetic context to predict the MGMT methylation status in gliomas and other tumors. J Mol Diagn. 2016; 18: 350-361) have shown accurate classification of MGMT methylation status using 2 positions on the methylation array. This approach was compared with MGMT methylation status by pyrosequencing as a separate validation and performed very well (data not shown). For copy number changes, the methylation array has some inherent limitations compared with a single nucleotide polymorphism array, including an inability to call loss of heterozygosity or tetraploidy. However, large copy number changes can be easily discerned, and validation of this approach to identify clinically relevant copy number variations in brain tumors is in process.
In summary, whole-genome methylation profiling of brain tumors for clinical testing has been developed and validated. A crucial part of this validation was the development and validation of a classifier that can be adapted and controlled within a Clinical Laboratory Improvement Amendments–certified clinical laboratory setting.
Future developments will include updating the classifier to accommodate additional data and clinical entities. Indeed, sharing of deidentified data sets across institutions will be important for the continuing evolution of this exciting area of clinical testing; such endeavors are currently underway.
D.J.B., C.H., and L.J.J. secured funding and formulated the validation plan; K.L.K., M.M., and P.J. performed case review and selection; K.L.K., D.D., S.D.V., H.F., X.W., and A.M.A. performed testing and analyzed primary data; L.S.-S. developed the classifier and performed statistical analysis; L.S.-S., K.L.K., S.D.V., P.J., and L.J.J. wrote the manuscript; L.S.-S., D.J.B., C.H., and L.J.J. reviewed and edited the final draft. L.J.J. is the guarantor of this work and, as such, had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Supplemental Table S1
Supplemental Table S2
References