Detection of Lung Cancer Lymph Node Metastases from Whole-Slide Histopathologic Images Using a Two-Step Deep Learning Approach

Article, Open access, Peer-reviewed


2019; Elsevier BV; Volume: 189; Issue: 12; Language: English

10.1016/j.ajpath.2019.08.014

ISSN

1525-2191

Authors

Hoa Pham, Mitsuru Futakuchi, Andrey Bychkov, T Furukawa, Kishio Kuroda, Junya Fukuoka

Topic(s)

Colorectal Cancer Screening and Detection

Abstract

The application of deep learning for the detection of lymph node metastases on histologic slides has attracted worldwide attention due to its potentially important role in patient treatment and prognosis. Despite this attention, false-positive predictions remain problematic, particularly in the case of reactive lymphoid follicles. In this study, a novel two-step deep learning algorithm was developed to address the issue of false-positive prediction while maintaining accurate cancer detection. Three hundred forty-nine whole-slide lung cancer lymph node images, including 233 slides for algorithm training, 10 slides for validation, and 106 slides for evaluation, were collected. In the first step, a deep learning algorithm was used to eliminate frequently misclassified noncancerous regions (lymphoid follicles). In the second step, a deep learning classifier was developed to detect cancer cells. Using this two-step approach, errors were reduced by 36.4% on average and up to 89% in slides with reactive lymphoid follicles. Furthermore, 100% sensitivity was reached in cases of macrometastases, micrometastases, and isolated tumor cells. To reduce the small number of remaining false positives, a receiver-operating characteristic curve was created using foci size thresholds of 0.6 mm and 0.7 mm, achieving sensitivity and specificity of 79.6% and 96.5%, and 75.5% and 98.2%, respectively. A two-step approach can be used to detect lung cancer metastases in lymph node tissue effectively and with few false positives.

Traditional microscopes and glass slides have been used by pathologists to diagnose disease since the mid-19th century. The conventional workflow involves manual review of numerous glass slides and requires a significant amount of time and effort on the part of the pathologist. [1] Raab S.S. Grzybicki D.M. Janosky J.E. Zarbo R.J. Meier F.A. Jensen C. Geyer S.J. Clinical impact and frequency of anatomic pathology errors in cancer diagnoses. Cancer. 2005; 104: 2205-2213. [2] Nakhleh R.E. Error reduction in surgical pathology. Arch Pathol Lab Med. 2006; 130: 630-632. [3] Elmore J.G. Longton G.M. Carney P.A. Geller B.M. Onega T. Tosteson A.N.A. Nelson H.D. Pepe M.S. Allison K.H. Schnitt S.J. O'Malley F.P. Weaver D.L.
Diagnostic concordance among pathologists interpreting breast biopsy specimens. JAMA. 2015; 313: 1122-1132.

The emergence of slide scanning machines capable of outputting high-resolution digital slides has brought traditional pathology into the digital era, providing numerous advantages to the pathology workflow. One of these advantages is the ability to use computational techniques, including automated image analysis, to aid pathologists in the examination and quantification of slides, thus reducing the time required for manual screening and improving the pathologist's accuracy, reproducibility, and workflow efficiency. [4] Gurcan M.N. Boucheron L.E. Can A. Madabhushi A. Rajpoot N.M. Yener B. Histopathological image analysis: a review. IEEE Rev Biomed Eng. 2009; 2: 147-171. [5] Holten-Rossing H. Talman M.-L.M. Jylling A.M.B. Lænkholm A.-V. Kristensson M. Vainer B. Application of automated image analysis reduces the workload of manual screening of sentinel lymph node biopsies in breast cancer. Histopathology. 2017; 71: 866-873. [6] Steiner D.F. MacDonald R. Liu Y. Truszkowski P. Hipp J.D. Gammage C. Thng F. Peng L. Stumpe M.C. Impact of deep learning assistance on the histopathologic review of lymph nodes for metastatic breast cancer. Am J Surg Pathol. 2018; 42: 1636-1646.

Recently, the application of deep learning techniques to assist diagnosis has attracted considerable interest in pathology. Convolutional neural networks (CNNs), in particular, have demonstrated enormous potential in medical image recognition tasks. [7] Ghaznavi F. Evans A. Madabhushi A. Feldman M. Digital imaging in pathology: whole-slide imaging and beyond. Annu Rev Pathol. 2013; 8: 331-359. [8] Krizhevsky A. Sutskever I. Hinton G.E.
ImageNet classification with deep convolutional neural networks. Commun ACM. 2017; 60: 84-90. [9] Szegedy C. Liu W. Jia Y. Sermanet P. Reed S. Anguelov D. Erhan D. Vanhoucke V. Rabinovich A. Going deeper with convolutions. Institute of Electrical and Electronics Engineers (IEEE), Piscataway, NJ 2015: 1-9. [10] Litjens G. Sánchez C.I. Timofeeva N. Hermsen M. Nagtegaal I. Kovacs I. Hulsbergen-van de Kaa C. Bult P. van Ginneken B. van der Laak J. Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Sci Rep. 2016; 6: 26286. In pathology, CNNs have been used in several image recognition applications with valuable results, from tumor cell detection in primary breast cancer, to grading glioma and prostate cancer, to counting mitoses and segmentation of tumor-associated stroma, to building whole-slide image (WSI)-based prognostic data. [10] [11] Cireşan D.C. Giusti A. Gambardella L.M. Schmidhuber J. Mitosis detection in breast cancer histology images with deep neural networks. Med Image Comput Comput Assist Interv. 2013; 16: 411-418. [12] Xu J. Luo X. Wang G. Gilmore H. Madabhushi A. A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images. Neurocomputing. 2016; 191: 214-223. [13] Cruz-Roa A. Basavanhally A. González F. Gilmore H. Feldman M. Ganesan S. Shih N. Tomaszewski J. Madabhushi A. Automatic detection of invasive ductal carcinoma in whole slide images with convolutional neural networks. in: Gurcan M.N. Madabhushi A. SPIE Proceedings Vol.
9041: Medical Imaging 2014: Digital Pathology. International Society for Optics and Photonics, Bellingham, WA 2014: 904103. [14] Ertosun M.G. Rubin D.L. Automated grading of gliomas using deep learning in digital pathology images: a modular approach with ensemble of convolutional neural networks. AMIA Annu Symp Proc. 2015; 2015: 1899-1908. [15] Ehteshami Bejnordi B. Mullooly M. Pfeiffer R.M. Fan S. Vacek P.M. Weaver D.L. Herschorn S. Brinton L.A. van Ginneken B. Karssemeijer N. Beck A.H. Gierach G.L. van der Laak J.A.W.M. Sherman M.E. Using deep convolutional neural networks to identify and classify tumor-associated stroma in diagnostic breast biopsies. Mod Pathol. 2018; 31: 1502-1512. [16] Bychkov D. Linder N. Turkki R. Nordling S. Kovanen P.E. Verrill C. Walliander M. Lundin M. Haglund C. Lundin J. Deep learning based tissue analysis predicts outcome in colorectal cancer. Sci Rep. 2018; 8.

A review of lymph nodes is critical for staging cancer and making appropriate therapeutic decisions. [17] Gress D.M. Edge S.B. Greene F.L. Washington M.K. Asare E.A. Brierley J.D. Byrd D.R. Compton C.C. Jessup J.M. Winchester D.P. Amin M.B. Gershenwald J.E. Principles of cancer staging. in: Amin M.B. Edge S.B. Greene F.L. Byrd D.R. Brookland R.K. Washington M.K. Gershenwald J.E. Compton C.C. Hess K.R. Sullivan D.C. Jessup J.M. Brierley J.D. Gaspar L.E. Schilsky R.L. Balch C.M. Winchester D.P. Asare E.A. Madera M. Gress D.M. Meyer L.R. AJCC Cancer Staging Manual. Springer International Publishing, Cham, Switzerland 2017: 3-30. Involvement of multiple lymph node levels is a key factor in determining prognosis, and careful assessment of lymph node status is required for accurate staging.
However, manually screening numerous slides can be tedious and time-consuming for the pathologist, and humans are prone to mistakes because they have to keep track of which areas they have examined. [18] van Diest P.J. Histopathological workup of sentinel lymph nodes: how much is enough? J Clin Pathol. 1999; 52: 871-873. [19] Weaver D.L. Krag D.N. Manna E.A. Ashikaga T. Harlow S.P. Bauer K.D. Comparison of pathologist-detected and automated computer-assisted image analysis detected sentinel lymph node micrometastases in breast cancer. Mod Pathol. 2003; 16: 1159-1163. For this reason, the detection of metastases in lymph node tissue is an area that could benefit from developments in automated tissue classification using machine learning approaches.

In 2016 and 2017, a series of competitive international challenges, Cancer Metastases in Lymph Nodes Challenge 2016 and 2017 (CAMELYON16 and CAMELYON17), were held to identify machine learning algorithms capable of detecting and staging breast cancer metastases. [20] Ehteshami Bejnordi B. Veta M. Johannes van Diest P. van Ginneken B. Karssemeijer N. Litjens G. et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017; 318: 2199-2210. [21] Bandi P. Geessink O. Manson Q. van Dijk M. Balkenhol M. Hermsen M. et al. From detection of individual metastases to classification of lymph node status at the patient level: the CAMELYON17 challenge. IEEE Trans Med Imaging. 2019; 38: 550-560. Some of the top-scoring entries in these challenges demonstrated better performance in detecting micrometastases than a pathologist working under time constraints, as is usual for most pathologists in busy practices. [20]

As in breast cancer, lymph node metastases in lung cancer play an important role in evaluating disease stage, selecting treatment options, and determining prognosis. [22] Andre F. Grunenwald D. Pignon J.-P. Dujon A. Pujol J.L. Brichon P.Y. Brouchet L. Quoix E. Westeel V. Le Chevalier T. Survival of patients with resected N2 non–small-cell lung cancer: evidence for a subclassification and implications. J Clin Oncol. 2000; 18: 2981-2989. [23] Betticher D.C. Hsu Schmitz S.-F. Tötsch M. Hansen E. Joss C. von Briel C. Schmid R.A. Pless M. Habicht J. Roth A.D. Spiliopoulos A. Stahel R. Weder W. Stupp R. Egli F. Furrer M. Honegger H. Wernli M. Cerny T. Ris H.-B. Mediastinal lymph node clearance after docetaxel-cisplatin neoadjuvant chemotherapy is prognostic of survival in patients with stage IIIA pN2 non–small-cell lung cancer: a multicenter phase II trial. J Clin Oncol. 2003; 21: 1752-1759. Among all cancers, lung cancer is the leading cause of cancer-related death worldwide. [24] Bray F. Ferlay J. Soerjomataram I. Siegel R.L. Torre L.A. Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018; 68: 394-424. Although lymph node metastases in lung cancer and breast cancer share some characteristics, they have distinct histologic features attributable to their respective cancer subtypes. In addition, mediastinal lymph nodes frequently show more prominent reactive histologic changes, including multiple hyperplastic lymphoid follicles and abundant anthracotic pigment-laden macrophages.
These findings are unusual in extramediastinal locations and may make it difficult to distinguish tumor from nontumor components in lymph node tissue using machine learning algorithms.

For tumor detection tasks, there is a tradeoff between achieving high sensitivity in detecting micrometastases and a high false-positive error rate, especially for the identification of isolated tumor cells (ITC). [21] Interestingly, errors made by deep learning algorithms do not strongly correlate with human errors, and are more often attributed to technical issues with the digital slide, such as out-of-focus areas or folds, misclassification of tumor-confounding histologic patterns, or benign components of the lymph node that share morphological similarity with tumor tissue, including germinal centers, macrophages, and stroma. [6] [25] Wang D. Khosla A. Gargeya R. Irshad H. Beck A.H. Deep Learning for Identifying Metastatic Breast Cancer. arXiv 2016. [26] Liu Y. Gadepalli K. Norouzi M. Dahl G.E. Kohlberger T. Boyko A. Venugopalan S. Timofeev A. Nelson P.Q. Corrado G.S. Hipp J.D. Peng L. Stumpe M.C.
Detecting Cancer Metastases on Gigapixel Pathology Images. arXiv 2017. Although technical errors can be prevented by more careful preparation of slides, histologic tumor mimics, especially hyperplastic lymph nodes with reactive lymphoid follicles and enlarged germinal centers, remain an issue without an effective solution. Because these are common components of all lymph node sections, this limitation significantly restricts the clinical utility of algorithms used for metastatic tumor detection.

In this study, a deep learning–based software program with an integrated CNN algorithm was applied to the detection of lung cancer lymph node metastases in WSIs. A new method for metastatic tumor detection in lung cancer is proposed, involving two steps of deep learning tissue classification: the first step excludes germinal centers and the second detects tumor cells. The hypothesis is that this new approach can reduce false positives caused by tumor mimics and increase accuracy in the detection of lymph node metastases compared with using only one deep learning algorithm (one-step approach).

This study was approved by the ethical board of Nagasaki University Hospital (19021824) and Kameda Medical Center (18-210). A total of 349 lymph node slides from 101 lung cancer patients with various histologic tumor types and stages were included. Slides were collected at Nagasaki University Hospital, Japan, from 2014 to 2018, and at Kameda General Hospital, Japan, from 2007 to 2018. Details on the WSI data used in this study are shown in Table 1. Of 349 slides, 233 slides were used for training algorithms, 10 slides were used for validation, and 106 slides were used for testing.
The validation set, which was separate from the training and testing sets, was used in the first step of the study and for all parameter tuning and model design choices in the trials.

Table 1. Data for the Whole-Slide Lymph Node Images Used for the Lung Cancer Metastasis Experiments

Category          Training   Validation   Testing   Total
Macrometastasis   100        4            24        128
Micrometastasis   7          0            23        30
ITC               0          0            2         2
Nonmetastasis     126        6            57        189
Total             233        10           106       349

Data are numbers of slides per category. ITC, isolated tumor cells.

Further details on the histologic subtypes can be found in Supplemental Table S1. Metastases were classified following the clinical practice guidelines as macrometastases (the largest tumor deposit had a diameter ≥ 2 mm), micrometastases (0.2 to 2 mm), and ITC (<0.2 mm). [17] [27] James D.B. Mary K.G. Christian W. TNM Classification of Malignant Tumours. ed 8. John Wiley & Sons, Hoboken, NJ 2018.

Glass slides were scanned into digital slides using an Aperio Scanscope CS2 digital slide scanner (Leica Biosystems, Buffalo Grove, IL) with a 40× objective (0.2517 μm/pixel). Digital slides were imported into HALO software version 2.2 (Indica Labs, Corrales, CA) for all subsequent steps, including annotation, training, and classification of digital slides.
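The size categories and scanner resolution described above can be illustrated with a short sketch. This is not part of the study's HALO workflow; the function names `pixels_to_mm` and `classify_metastasis` are hypothetical, and the sketch assumes the category boundaries are read as ITC < 0.2 mm, micrometastasis from 0.2 mm up to (but not including) 2 mm, and macrometastasis ≥ 2 mm.

```python
# Illustrative sketch only; names are hypothetical, not from the study.
UM_PER_PIXEL = 0.2517  # scanner resolution reported for the 40x objective


def pixels_to_mm(length_px: float, um_per_pixel: float = UM_PER_PIXEL) -> float:
    """Convert a length measured in pixels to millimeters."""
    return length_px * um_per_pixel / 1000.0


def classify_metastasis(largest_diameter_mm: float) -> str:
    """Map the largest tumor deposit diameter (mm) to a size category.

    Assumed boundary handling: >= 2 mm is a macrometastasis,
    0.2 mm to < 2 mm is a micrometastasis, < 0.2 mm is ITC.
    """
    if largest_diameter_mm <= 0:
        raise ValueError("diameter must be positive for a tumor deposit")
    if largest_diameter_mm >= 2.0:
        return "macrometastasis"
    if largest_diameter_mm >= 0.2:
        return "micrometastasis"
    return "ITC"


if __name__ == "__main__":
    # A deposit spanning 4000 pixels at 0.2517 um/pixel is about 1 mm wide.
    diameter = pixels_to_mm(4000)
    print(round(diameter, 4), classify_metastasis(diameter))
```

At the reported resolution, the 0.2 mm ITC boundary corresponds to roughly 795 pixels, which gives a sense of how small the smallest deposits are relative to a whole-slide image.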
Tissue classification was performed using the HALO Tissue Classifier analysis module (random forest algorithm) and HALO AI (CNN, VGG network). Annotations used for training the tissue classification algorithms were drawn by one pathologist with 7 years of experience in pathology (H.H.N.P.), with supervision by an expert pulmonary pathologist (J.F.).

First, a single classifier with two classes, tumor and nontumor, was developed using a CNN algorithm. Around 4000 annotations, including various polygonal outlines, were provided for training of each class using high-resolution images (0.25 μm per pixel). The classifier was trained for 3 × 10^6 iterations. The results showed that, although all metastases were detected at the slide-based level (100% sensitivity), many false-positive foci were found in both metastatic and nonmetastatic slides (0% specificity). Lymphoid follicles were found to be a common tumor mimic and a frequent cause of false-positive foci using this classifier (Figure 1A).

In the next step, a second classifier was created with three classes: tumor, lymphoid follicle, and other tissue. The HALO AI CNN was trained with 2371 annotations labeled lymphoid follicle, 3902 annotations labeled tumor, and 3030 annotations labeled other. Training and classification were performed at high resolution (0.25 μm per pixel) for 7.4 × 10^5 training iterations. The hypothesis was that by separating lymphoid follicles into a distinct class, the tumor detection algorithm would better differentiate them from tumors. When this classifier was applied to the validation set, its sensitivity and specificity were similar to those of the first classifier, owing to misclassification of lymphoid follicles, especially reactive follicles (Figure 1B). Therefore, these two classifiers were not used further in this study.

A step-wise approach in which two separate classifiers were developed and linked was then tested.
The first classifier was designed to exclude lymphoid follicles from the rest of the tissue (including any tumor that might be present), and the second classifier was designed to detect tumor cells in the lymphoid-excluded tissue. In the first step, two different algorithms were tested to build the lymphoid follicle exclusion classifier: a random forest machine learning algorithm and a deep learning CNN algorithm. In the next step, a deep learning algorithm was used to detect tumor cells. This method was based on the idea that a two-step algorithm could reduce most false-positive findings and produce a more accurate tumor detection tool than the one-step strategy (Figure 1, C and D).

In the training step, annotations were divided into three separate component classes: tumor, lymphoid follicle, and other. Depending on the specific task of each machine learning model, different classes were chosen as the input data for training that model (Figure 2). Annotations of tumor were provided in various sizes, mimicking macrometastases, micrometastases, and ITC, to maximize the learning ability of the models.

To determine the best model for lymphoid follicle detection, two different models were created. Model 1 was a random forest classifier [Lymphoid Follicle Random Forest Model (LFRFM)]. Random forest classifiers can obtain a good result with a small amount of training data. [28] Han T. Jiang D. Zhao Q. Wang L. Yin K. Comparison of random forest, artificial neural networks and support vector machine for intelligent diagnosis of rotating machinery. Trans Inst Meas Control. 2018; 40: 2681-2693. In addition, because the HALO software settings (analysis of image texture and color) favor few, small training regions, 65 annotations in two classes were provided for training the model: lymphoid follicles (20 annotations) and others (45 annotations, including tumor area).
Training and classification were performed at a low resolution of 4.4 μm per pixel.

Model 2 was a deep learning classifier [Lymphoid Follicle CNN (LFCNN)]. Because a deep learning algorithm typically requires a considerable amount of training data for its multiple deep layers to improve accuracy, [28] more annotations were provided, with 2332 training regions in total representing two classes: lymphoid follicles (1243 annotations) and others (1089 annotations, containing the tumor area). Training and classification were performed at a medium resolution of 1.04 μm per pixel with 3.35 × 10^4 training iterations.

The two models were then applied to the validation set to test their ability to detect lymphoid follicles, and the model that best identified lymphoid follicles on WSIs was chosen. All lymphoid follicles would then be excluded, and the layer without lymphoid follicles would be further analyzed in a second step using another deep learning model, model 3 [Tumor Detection CNN (TDCNN)], to detect cancer cells.

The task of the second step was to detect cancer cells on lymph node slides using model 3 (TDCNN). In this step, the HALO AI CNN was trained with 10,155 total training annotations representing two classes: tumor (4196 annotations) and others (5959 annotations). Training and classification were performed at a high resolution of 0.25 μm per pixel, which is equal to the high-power field of a microscope used in pathologic diagnoses. The model was trained for 1 × 10^7 iterations.
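The two-step idea described above (exclude lymphoid follicles first, then detect tumor in what remains) can be sketched on toy data. This is a minimal illustration under stated assumptions, not the HALO implementation: the function names are hypothetical, the inputs are small per-pixel grids rather than whole-slide images, and the default threshold of 0.5 mirrors the 50% tumor probability threshold the study reports for HALO AI.

```python
# Illustrative sketch only; names and data layout are assumptions.
from typing import List

Grid = List[List[float]]
Mask = List[List[bool]]


def one_step_tumor_mask(tumor_prob: Grid, threshold: float = 0.5) -> Mask:
    """One-step approach: threshold the tumor probability map directly."""
    return [[p > threshold for p in row] for row in tumor_prob]


def two_step_tumor_mask(
    tumor_prob: Grid, follicle_mask: Mask, threshold: float = 0.5
) -> Mask:
    """Two-step approach: ignore pixels flagged as lymphoid follicle in
    step 1, then threshold the tumor probability on the remaining tissue."""
    return [
        [(p > threshold) and not is_follicle
         for p, is_follicle in zip(prob_row, follicle_row)]
        for prob_row, follicle_row in zip(tumor_prob, follicle_mask)
    ]


if __name__ == "__main__":
    # Toy 2x2 example: the bottom-left pixel is a follicle that the tumor
    # classifier scores highly (a tumor mimic).
    tumor_prob = [[0.9, 0.4], [0.8, 0.6]]
    follicle_mask = [[False, False], [True, False]]
    print(one_step_tumor_mask(tumor_prob))
    print(two_step_tumor_mask(tumor_prob, follicle_mask))
```

In this toy case the one-step mask keeps the follicle pixel as a false positive, while the two-step mask drops it and leaves the other predictions unchanged.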
Apart from its primary purpose of detecting metastatic tumors in the follicle-free layer obtained from the first step, the TDCNN model was also used to predict metastases in a single step for comparison with the two-step deep learning algorithm. In this study, HALO AI settings were fixed at a probability threshold of 50% on the tumor heatmap for the outcome, meaning that only pixels with a tumor probability of more than 50% were labeled as positive for the cancer class on the WSIs.

The GeForce GTX TITAN X graphics card (NVIDIA, Santa Clara, CA) provided the required GPU for HALO AI. HALO AI utilizes the Caffe engine and a fully convolutional version of the VGG architecture with all padding removed. [29] Long J. Shelhamer E. Darrell T. Fully Convolutional Networks for Semantic Segmentation. arXiv 2014. Training was conducted on patches of 435 × 435 at the defined resolution. The patches were generated by picking a random class (with equal probability for each class), a random image containing annotation for the chosen class, and a random point inside a region of the chosen class and image. The patches were cropped around the chosen point and further augmented with random rotations and random shifts in hue, saturation, contrast, and brightness. The model was pre-trained on ImageNet and then trained for the defined number of iterations using RMSProp (delta of 0.9) with a learning rate of 1e-3, a reduction in the learning rate by 10% every 10,000 iterations, and an L2 regularization of 5e-4. [30] Tieleman T. Hinton G. Lecture 6.5: RMSProp: divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning. 2012; 4: 26-31. Because there was no padding in the model during analysis, the tile size was increased to 1867 × 1867, increasing the performance without changing the output.
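The patch sampling and learning-rate schedule described above can be sketched as follows. This is a simplified illustration, not HALO AI's code: all function names are hypothetical; annotated regions are represented as precomputed lists of candidate points rather than polygons; the patch size is assumed to be in pixels; and "a reduction in the learning rate by 10% every 10,000 iterations" is read as multiplying the rate by 0.9 at each step, which is one possible interpretation.

```python
# Illustrative sketch only; names and data layout are assumptions.
import random
from typing import Dict, List, Tuple

Point = Tuple[int, int]
# class name -> image id -> candidate points inside regions of that class
Annotations = Dict[str, Dict[str, List[Point]]]


def sample_patch_center(annotations: Annotations, rng: random.Random):
    """Pick a class uniformly, then an image annotated with that class,
    then a point inside one of that class's regions in that image."""
    cls = rng.choice(sorted(annotations))
    image_id = rng.choice(sorted(annotations[cls]))
    center = rng.choice(annotations[cls][image_id])
    return cls, image_id, center


def crop_box(center: Point, patch_size: int = 435) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a patch cropped around center."""
    x, y = center
    half = patch_size // 2
    return (x - half, y - half, x - half + patch_size, y - half + patch_size)


def learning_rate(iteration: int, base_lr: float = 1e-3,
                  decay: float = 0.9, step: int = 10_000) -> float:
    """Step decay: multiply the rate by `decay` every `step` iterations
    (assumed reading of the reported 10% reduction)."""
    return base_lr * decay ** (iteration // step)


if __name__ == "__main__":
    rng = random.Random(0)
    toy = {"tumor": {"slide_a": [(1000, 1000)]},
           "other": {"slide_b": [(500, 700), (520, 710)]}}
    cls, image_id, center = sample_patch_center(toy, rng)
    print(cls, image_id, crop_box(center))
    print(learning_rate(0), learning_rate(25_000))
```

Sampling the class first, rather than sampling annotations uniformly, keeps the classes balanced during training even when one class has many more annotated regions than another.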
In the first step (lymphoid follicle exclusion), two models were tested in the described trial slide set to evaluate their ability to identify lymphoid follicles. Each image was divided into small patches (100 × 100 μm) and compared with the annotation of the pathologist (ground truth for lymphoid follicle detection). The patch would be considered as i) a true positive if the predicted area overlapped by more than 50% with ground truth, ii) a false positive if there was less than 50% overlap with ground truth, iii) a true negative if there was no positive prediction outside of ground truth areas, or iv) a false negative if there was no positive prediction inside a ground truth area. To evaluate the outcome of each model, accuracy was calculated based on the sum of the accuracy of all patches for all images.

In the second step, evaluation was performed at the slide level. After analysis, slides were labeled as metastasis or no metastasis based on the presence or absence of a tumor classification on the slide, and as macrometastases, micrometastases, or ITC based on the largest diameter of positive area measured manually with a ruler in HALO software. The maximum positive area identified by the algorithm was chosen if multiple metastatic foci were identified on a single slide. This result was compared against the ground truth, which was provided by the recorded diagnosis of the expert pathologist.

For optimal assessment of false-positive reduction on slides, the testing data set was split into two groups, with and without lymphoid follicles, because not all lymph node slides contain lymphoid follicles. In this study, lymphoid follicles were defined as reactive lymphoid follicles with enlargement in shape and size, a prominent germinal center, a mantle zone, and numerous tingible body macrophages with mixed centroblasts and centrocytes.
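The patch-level evaluation of follicle detection described above can be sketched as below. This is one simplified reading of the four rules, not the study's actual scoring code: the names are hypothetical; each patch is summarized by whether the model predicted follicle, whether the pathologist annotated follicle, and the fraction of the predicted area that overlaps ground truth; an overlap of exactly 50% (not specified in the text) is treated as a false positive; and accuracy is taken as (TP + TN) / all patches.

```python
# Illustrative sketch only; names and the exact rule reading are assumptions.
from collections import Counter
from typing import Iterable, Tuple

# (predicted_positive, ground_truth_positive, overlap_fraction)
Patch = Tuple[bool, bool, float]


def label_patch(pred_positive: bool, gt_positive: bool, overlap: float) -> str:
    """Assign TP / FP / TN / FN to one 100 x 100 um patch."""
    if pred_positive:
        # Positive prediction: true positive only if it mostly overlaps
        # the pathologist's follicle annotation.
        return "TP" if overlap > 0.5 else "FP"
    # No positive prediction: missed follicle or correctly negative.
    return "FN" if gt_positive else "TN"


def patch_accuracy(patches: Iterable[Patch]) -> float:
    """Fraction of patches labeled TP or TN across all images."""
    counts = Counter(label_patch(*patch) for patch in patches)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no patches to evaluate")
    return (counts["TP"] + counts["TN"]) / total


if __name__ == "__main__":
    toy_patches = [
        (True, True, 0.8),    # TP: prediction mostly inside ground truth
        (True, False, 0.1),   # FP: prediction mostly outside ground truth
        (False, False, 0.0),  # TN: nothing predicted, nothing annotated
        (False, True, 0.0),   # FN: annotated follicle missed
    ]
    print(patch_accuracy(toy_patches))
```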
The slides in the group without lymphoid follicles had only small lymphocyte aggregations or a few small-to-normal–sized lymphoid follicles. The two groups were then analyzed using both the one-step and the two-step methods, to evaluate the amount of error reduction achieved by the two-step method. The false-positive area reduction was calculated as:

\[
\text{Reduction of false positive (FP)} = \frac{\text{FP area of one step} - \text{FP area of two step}}{\text{FP area of one step}}
\]

Statistical significance was determined using a one-sided t-test, with P < 0.05 considered statistically significant. Statistical analysis was performed using the Stata statistical software package version 14.2 (StataCorp LP, College Station, TX).

The two models performed differently with respect to lymphoid follicle prediction. The accuracy of model 1 (LFRFM) and model 2 (LFCNN) was 51.7% and 94.5%, respectively. The LFCNN predictions closely matched the shape of the original lymphoid follicles seen in hematoxylin and eosin images, whereas the LFRFM showed many false positives in which tumor cells were misclassified as lymphoid follicles (Figure 3). On the basis of these results, the LFCNN model was chosen to eliminate all lymphoid follicles from the slides prior to the second step.

To evaluate the tumor detection results at the slide level in detail, a confusion matrix was created showing accuracy in percentage terms for tumor prediction in different sized metastases (Table 2). The two-step deep learning algorithm performed well in identifying positive slides, including all macrometastases, micrometastases, and ITC with 100% accuracy. Examples of metastasis prediction are displayed in Figure 4. By co
