Generative Deep Learning in Digital Pathology Workflows
2021; Elsevier BV; Volume: 191; Issue: 10; Language: English
10.1016/j.ajpath.2021.02.024
ISSN: 1525-2191
Authors: David Morrison, David Harris‐Birtill, Peter D. Caie
Topic(s): Cell Image Analysis Techniques
Abstract: Many modern histopathology laboratories are in the process of digitizing their workflows. Digitization of tissue images has made it feasible to research the augmentation or automation of clinical reporting and diagnosis. The application of modern computer vision techniques, based on deep learning, promises systems that can identify pathologies in slide images with a high degree of accuracy. Generative modeling is an approach to machine learning and deep learning that can be used to transform and generate data. It can be applied to a broad range of tasks within digital pathology, including the removal of color and intensity artifacts, the adaptation of images in one domain into those of another, and the generation of synthetic digital tissue samples. This review provides an introduction to the topic, considers these applications, and discusses future directions for generative models within histopathology.

Clinical histopathology is undergoing an exciting paradigm shift, with many laboratories replacing traditional microscopy with high-resolution scanners and large digital displays. Unlike traditional slides, digital images can be shared electronically, marked up simultaneously by multiple pathologists, and assessed automatically [1]. The deployment into clinical practice of systems that automate and augment diagnostic reporting is expected to lead to a significant increase in assessment capacity alongside quicker reporting times. This article provides a brief introduction to deep generative models, reviews their current use in digital pathology, and envisions their future applications within the field. To contextualize this work, deep generative models are discussed in relation to the current state-of-the-art deep learning techniques for pathology and the problems that generative techniques can solve within a conventional pipeline.

Before discussing the place that generative models could take in the field of automated histopathology, it is necessary to describe the current typical workflow of machine learning in digital pathology and some of the common issues that can hinder downstream reporting tasks. A taxonomy of data science tasks, independent of pathology, organized into five categories undertaken sequentially (obtain, scrub, explore, model, and interpret; dataists, http://www.dataists.com/2010/09/a-taxonomy-of-data-science, last accessed September 18, 2020) is shown in Figure 1.
This model can be used to understand the process of applying machine learning in digital pathology. Data are obtained through the fixing, staining, and scanning of tissue to transform it into a set of whole slide images. These images are then scrubbed, or preprocessed, to remove artifacts and prepare them for use in the modeling phase. Tasks such as stain normalization, data augmentation, and patch generation fall into this category. In the exploration phase, the resulting scrubbed data are analyzed, either automatically or by a human, to determine an appropriate modeling technique, such as a specific neural network architecture. A large number of different pathologies and tissue types may be of interest in digital pathology, which makes it impractical to iteratively try every possible modeling technique and, in the case of ensemble learning, every combination of techniques. The machine learning system is trained and evaluated during the modeling phase. In the interpretation phase, human pathologists are presented with the predictions of the model, which can be used for clinical or research work.

The automation of whole slide image (WSI) analysis and diagnosis presents several significant challenges [2]. Foremost is the issue of data size: whole slide images are multi-gigabyte images in the range of approximately 100,000 × 100,000 pixels. This makes a direct application of modern computer vision algorithms on non-specialist computing hardware impractical. Typical solutions are to downsample the image or to break it up into smaller subimages called patches, as in the sketch below.
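As an illustration of the patching step, the following is a minimal sketch using the OpenSlide library; the file name and patch size are hypothetical choices, and in practice patches would be filtered for tissue content and written to disk rather than held in memory.

import openslide

PATCH = 512  # patch edge length in pixels (an arbitrary choice)

slide = openslide.OpenSlide("slide.svs")  # hypothetical WSI file
width, height = slide.dimensions          # full-resolution (level 0) size

patches = []
for y in range(0, height - PATCH + 1, PATCH):
    for x in range(0, width - PATCH + 1, PATCH):
        # read_region returns an RGBA PIL image at the requested pyramid level
        patch = slide.read_region((x, y), 0, (PATCH, PATCH)).convert("RGB")
        patches.append(patch)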
Second, data availability is problematic for most researchers. Supervised machine learning requires labels for each sample. In WSI analysis, this may mean assigning a category to each slide as a whole, identifying a set of points of interest on the tissue, or drawing around areas to segment tissue types or pathologies. For each of these, a trained specialist in histopathology is required. The process is time-consuming and expensive, and there is often substantial interobserver and intraobserver variability in the labels provided by pathologists. As a result, the data sets used to train automated digital pathology models tend to be small compared with those available in other computer vision subfields, such as ImageNet [3], where non-specialists can straightforwardly provide labels (eg, labeling a cat versus a dog). This situation, however, has been improved by the release of tissue-annotated open data sets, such as Camelyon16 [4] and Camelyon17 [5]. Furthermore, initiatives such as iCAIRD (iCAIRD, https://icaird.com) and PathLAKE (PathLAKE, https://www.pathlake.org, both last accessed September 18, 2020) provide large, well-annotated, and curated WSI data sets linked to clinicopathologic data. These make rich digital pathology training material widely available, albeit within narrowly defined clinical reporting tasks and specific tissue types.

Third, WSI analysis experiences several domain-specific image artifacts caused by the process of surgically removing, fixing, cutting, staining, and scanning the tissue. These can include folds in the tissue, retraction artifact, variations in the application of chemicals in the staining process, small cracks and imperfections in the glass slide and coverslip, partial blurring of the image caused by focusing errors, and image resolution and compression differences between scanners and file formats. Despite these challenges, computer vision techniques based on supervised and weakly supervised learning have been used to successfully automate some common assessment tasks in histopathology, including, for example, cell nucleus identification, pathology classification, and cancer segmentation [6]. Unsurprisingly, state-of-the-art results on slide classification tasks, such as the work by Campanella et al [7] on prostate cancer, basal cell carcinoma, and breast cancer nodal metastases, rely on large data sets.

This section briefly introduces required terminology from computer vision, deep learning, and generative modeling before describing their uses in a digital pathology workflow. First, an image filter, or kernel, is a rectangular matrix that can be applied to parts of a digital image to extract information, called features, from it. To apply a filter, a dot-product is performed (component-wise multiplication followed by a sum) between the filter and a section of the image with the same dimensions. In computer vision, this operation is referred to as a convolution. By sliding the filter across the image and performing the convolution at each point, this operation produces a new matrix, known as a feature map. Filters that recognize primitive features, such as horizontal or vertical lines, can be hand crafted; however, more complex features must be learned by the model. Neural networks are among the most commonly used machine learning approaches [8]. A convolutional neural network [9] is a machine learning approach that enables image filters to be learned from data rather than programmed explicitly.
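To make the filter-and-feature-map idea concrete, the following is a minimal NumPy sketch of a single convolution (strictly, a cross-correlation, as commonly implemented in deep learning libraries); the vertical-edge filter is a hypothetical hand-crafted example.

import numpy as np

def convolve2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over `image`, taking a dot-product at each position."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    feature_map = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # component-wise multiplication followed by a sum
            feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return feature_map

# A hand-crafted filter that responds to vertical edges
vertical_edge = np.array([[1.0, 0.0, -1.0],
                          [1.0, 0.0, -1.0],
                          [1.0, 0.0, -1.0]])

gray_patch = np.random.rand(8, 8)             # stand-in for a grayscale patch
fmap = convolve2d(gray_patch, vertical_edge)  # 6 x 6 feature map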
Generative models are an approach to machine learning in which systems attempt to estimate the probability of a specific sample being picked at random, based on training data [10]. Once there is an estimate for the probability density function over the training set, the model can be used to generate new examples. For example, a model can be trained to generate new images of cats by training it on a large number of images of cats. Generative models are contrasted with discriminative models, which estimate the probability of an output value given an input value (this includes classification and regression problems). Recently, generative models based on deep learning have shown promise in generating novel data across a range of domains and tasks. The most effective techniques, such as generative adversarial networks (GANs) [10] and variational autoencoders [11], come from a class of models known as latent variable generative models. In such systems, a model is trained that takes a lower-dimensional representation of the data, called the latent space vector, and generates high-dimensional data from it. GANs and variational autoencoders differ in the way they are trained, but both conceptualize generation as decoding. By changing what data are passed in as the latent space vector, model parameters can be learned that enable the model to perform data translation tasks. Figure 2 shows an example of a latent space vector and generated images. In their recent review of GANs in pathology, Tschuchnig et al [12] split GANs up based on the kind of translation task the model is trained for, putting the emphasis on the task (eg, image-to-image translation versus label-to-image translation). The rest of this review describes how different generative models have been trained to perform different translation tasks and how these could be usefully applied to the automated reporting of a clinical task within a digital pathology pipeline.

Generative adversarial networks [10] are a class of generative model in which a network, known as the generator, is trained by having it attempt to trick a second model, known as the discriminator. The discriminator and the generator are trained simultaneously. During training, the generator is sampled from by having it translate noise into fake data. The discriminator is then trained on a combination of the fake data, labeled as fake, and the real data, labeled as real. The generator is then trained by having it generate fake data and asking the discriminator to predict labels for it. The training loss for the generator is based on how well the discriminator can tell the two apart, ie, how well the generator can fool it by producing fake data that the discriminator classes as real. This simultaneous training procedure can cause GANs both to be computationally expensive and to experience difficulty in converging to an accurate solution.
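The alternating procedure described above can be summarized in a short PyTorch sketch. The two networks and the latent dimension are hypothetical placeholders, and `data_loader` is assumed to yield batches of flattened real images; this is a minimal illustration, not a production training recipe.

import torch
import torch.nn as nn

# Hypothetical placeholder networks; real architectures would be deeper.
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

for real in data_loader:  # assumed: batches of shape (batch, 784)
    b = real.size(0)
    # 1) Train the discriminator on real (label 1) and fake (label 0) data
    z = torch.randn(b, 64)               # noise serves as the latent vector
    fake = G(z).detach()                 # sample from the generator
    loss_d = bce(D(real), torch.ones(b, 1)) + bce(D(fake), torch.zeros(b, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Train the generator to fool the discriminator: its loss is low
    #    when the discriminator labels the fake data as real.
    fake = G(torch.randn(b, 64))
    loss_g = bce(D(fake), torch.ones(b, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()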
Generative models have the potential to overcome several issues that arise when developing computer vision systems for digital diagnosis and reporting. For example, data sets stained at different institutions often exhibit substantial variation in color and intensity. It can be expensive and time-consuming to acquire high-quality labeled training data; generative models can generate synthetic data sets to overcome this. They can also be used to virtually stain tissue, reducing the tissue preparation overhead.

During tissue preparation, particularly staining, variations in color and intensity can be introduced between different whole slide images. These artifacts can complicate the interpretation of the slide by pathologists and computers: similar tissue features can present differently, or different ones similarly. Such artifacts are introduced from several sources, such as differences between scanners, the thickness of the cut tissue samples, and the amounts and concentrations of chemicals used in varying staining protocols. These issues can be mitigated in three ways: ignoring color information, training models to learn features insensitive to the artifacts [13], or normalizing images to account for differences. By converting the image to grayscale, much of the information provided by the staining process is lost. Analysis techniques for grayscale pathology images have to rely on other features (eg, texture and morphology) [14,15], leading to lower performance on downstream tasks. In other situations, artifacts can be compensated for by applying a large number of color perturbations to the training data so that a wide range of variations is presented to the model during training [13]. This technique requires the perturbations to be statistically similar to the color and intensity variations across the data to be assessed, information that is not always available, and it increases computational and memory overheads because of the large amount of data augmentation.
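A common way to implement such perturbations is with standard augmentation transforms. The sketch below uses torchvision's ColorJitter; the perturbation strengths are illustrative assumptions, not values taken from the works cited, and would need tuning against the target data.

from torchvision import transforms

# Randomly perturb the color properties of each training patch.
stain_jitter = transforms.Compose([
    transforms.ColorJitter(brightness=0.25, contrast=0.25,
                           saturation=0.25, hue=0.04),
    transforms.ToTensor(),
])

# augmented = stain_jitter(patch)  # `patch` is a PIL image of a tissue patch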
Ruifrok and Johnston [16] proposed a novel method based on color deconvolution that depends on user-determined color information to reconstruct images for each stain. This method provides state-of-the-art results for stain normalization but is limited in its applicability to extensive studies because the user needs to estimate the values used in the deconvolution manually. Magee et al [17] presented a method for estimating the required color deconvolution parameters from the image data, eliminating the need for user input. This work was extended by Khan et al [18] to account for image-specific color variations and to improve the training data used to separate the different stains. A limitation of color deconvolution techniques is their failure to take into account information outside of the image color (eg, tissue structure or texture). Generative models are able to address this limitation.
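Classical color deconvolution of this kind is available in standard libraries. The following sketch separates hematoxylin and eosin contributions using scikit-image's built-in stain matrix, which is based on the Ruifrok and Johnston vectors; the input image path is hypothetical.

import numpy as np
from skimage import io
from skimage.color import rgb2hed

rgb = io.imread("he_patch.png")[:, :, :3]  # hypothetical H&E patch, RGB

# Project the RGB image onto the hematoxylin-eosin-DAB stain space using
# fixed, pre-published stain vectors.
hed = rgb2hed(rgb)
hematoxylin = hed[:, :, 0]  # per-pixel hematoxylin concentration
eosin = hed[:, :, 1]        # per-pixel eosin concentration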
Stain normalization can be thought of as an image generation problem. Generative models have proved useful for image generation and recently have been applied to generate normalized pathology slides. Three different approaches have been applied to this task: stain-style transfer [19], CycleGAN-based image-to-image translation [20], and Pix2Pix-based translation [21]. Neural style transfer [22] is an image translation technique that transfers the style of one source image onto the content of another to generate a target image. The terms style and content can be a little misleading at first: content refers to aspects of the image such as the shape and arrangement of nuclei and cells and the tissue architecture that they comprise, whereas style refers to aspects such as color, like the hematoxylin and eosin (H&E) shades, and texture (eg, the nuclear chromatin). Style representations are derived from correlations between the same location in different activation maps of the same layer of a neural network. For example, there might be a filter that recognizes blue pixels and another that recognizes a curve. If they consistently activate together, then this would represent that curves are generally blue.
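In practice, these correlations are typically captured by a Gram matrix computed over a layer's feature maps. Below is a minimal sketch of that computation, assuming `activations` holds one layer's output for a single image.

import torch

def gram_matrix(features: torch.Tensor) -> torch.Tensor:
    """Correlations between the feature maps of one layer.

    `features` has shape (channels, height, width): one activation map per
    learned filter. Entry (i, j) of the result is large when filters i and
    j tend to activate at the same spatial locations.
    """
    c, h, w = features.shape
    flat = features.view(c, h * w)        # each row: one flattened map
    return flat @ flat.t() / (c * h * w)  # normalized pairwise dot-products

# style_repr = gram_matrix(activations)  # `activations` from a chosen layer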
Stain normalization can be thought of as a kind of style transfer from the source to the target; however, it is important that only the color distribution is transformed, not other histopathologic features. Stain-style transfer [19] uses a modification of GANs to perform color normalization, as demonstrated by its application to patches extracted from the Camelyon16 data set [4]. The normalized patches improve tumor classification. In this technique, the input into the GAN generator is changed from noise to the unnormalized image. A conditional GAN [23] is then used, in which both the generator and discriminator are trained to generate and discriminate class labels for each patch, in this case tumor or nontumor, in addition to the fake or real labels. On its own, this produces distortion in the patches' noncolor histopathologic features. To address this, two other loss functions were added to the system: a reconstruction loss, to minimize the difference between the source and generated images; and a feature-preserving loss, derived by comparing the activations of the final layer of the discriminator when the source and generated images are passed through the network. This approach improves the classification accuracy of a convolutional neural network-based model trained on image patches extracted from the Camelyon16 data set. BenTaieb and Hamarneh [24] propose a similar approach in which the generator architecture is replaced with a U-Net encoder-decoder style network, called the stain transfer network, and the discriminator is given an additional classification task. This approach was assessed on both classification and segmentation tasks, across three separate data sets, showing it can be used to improve the identification of a wide range of tissue and pathology types.

Pix2Pix [21] is an extension of conditional GANs that, like other image-to-image translation models, learns the mapping from one image domain to another. The difference with Pix2Pix is that it also learns a loss function to train the translation model. This means that models based on Pix2Pix can be trained to translate between different domains without the need to specify a loss function for that translation, something that is otherwise hard to do. Like conditional GANs for image-to-image translation, Pix2Pix requires image pairs, one from each domain, as example translations. Salehi and Chalechale [25] applied this approach successfully to stain normalization using five different H&E data sets. The method involves destaining the patches by reducing them to grayscale, before synthetically restaining them in a way that ensures that the color is consistent. This is similar to the artificial staining proposed by Rana et al [26], discussed under data adaptation, and has been shown to perform well across a range of statistical measurements comparing ground-truth stained images against those restained using the GAN. This indicates that it may improve downstream assessment tasks, such as tumor classification and segmentation, in a similar way to the stain-style transfer techniques [19,24].

One of the key disadvantages of Pix2Pix is the need for paired images from the source and target domains (eg, coregistered images before and after staining). CycleGAN [20] bypasses this requirement, allowing models to be trained to translate from a source to a target domain without the need for paired examples. This is done by training an inverse mapping, from the target domain back to the source, at the same time as training the translation itself. By comparing the original image with one that has had the forward and inverse transformations applied to it, a loss called the cycle-consistency loss is derived. When the generator is trained, the cycle-consistency loss is minimized alongside the conventional adversarial loss derived from trying to fool the discriminator. de Bel et al [27] showed that modifying the original CycleGAN [20] to use a U-Net [28] style architecture made it more suitable for use with pathology images; this system can be used to artificially stain images to a high quality. The technique was applied to two data sets of renal tissue sections stained with periodic acid-Schiff from different staining centers. Models trained using the normalized data had increased accuracy when segmenting various objects of interest within the renal slides, such as arteries, tubuli, and glomeruli. However, the system was able to generate changes in texture, something that breaks the constraint that the transform should preserve noncolor tissue features and potentially introduces unwanted bias into the generated data sets.
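The cycle-consistency loss described above reduces to a few lines of code. In this sketch, `G` maps source-domain images to the target domain, `F` is the learned inverse mapping, and both are hypothetical stand-in networks.

import torch
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_consistency_loss(G: nn.Module, F: nn.Module,
                           x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Penalize G and F when a round trip fails to reproduce the input.

    x: batch of source-domain images; y: batch of target-domain images.
    """
    forward_cycle = F(G(x))   # source -> target -> source
    backward_cycle = G(F(y))  # target -> source -> target
    return l1(forward_cycle, x) + l1(backward_cycle, y)

# total generator loss = adversarial loss + lambda * cycle_consistency_loss(...)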
Data adaptation is the task of taking data in one domain, such as H&E WSIs, and translating them into images that resemble those in a different domain, such as immunofluorescence WSIs. This can be useful as a data augmentation technique, allowing images labeled in one domain to be used effectively for learning in another domain. Doing this relies on the image translation process retaining the correct labels: for example, if something is labeled as a cell nucleus, it has to still look like a cell nucleus once it has been translated. One possible use of this data adaptation is to enrich patches with additional channels showing different fluorescence labels that highlight different kinds of information. This is called multiplexing and has traditionally been achieved through relabeling the same tissue multiple times and scanning in each fluorophore separately. There are two issues with this: after multiple rounds of relabeling, the tissue quality begins to degrade; and scanning requires the slides to be precisely aligned to allow the tissue to be coregistered. By doing virtual staining, the tissue is not degraded, and because a single scan is used, there are no issues related to alignment. A histopathologic-to-immunofluorescence translation model that uses Pix2Pix [21]
References

1. Dimitriou N, Arandjelovic O, Caie PD: Deep learning for whole slide image analysis: an overview. Front Med (Lausanne) 2019, 6:264
2. Komura D, Ishikawa S: Machine learning methods for histopathological image analysis. Comput Struct Biotechnol J 2018, 16:34-42
3. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L: ImageNet: a large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, New York, NY, 2009, pp 248-255
4. Ehteshami Bejnordi B, Veta M, van Diest PJ, van Ginneken B, Karssemeijer N, Litjens G, van der Laak JAWM; the CAMELYON16 Consortium: Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 2017, 318:2199-2210
5. Bándi P, Geessink O, Manson Q, van Dijk M, Balkenhol M, Hermsen M, et al: From detection of individual metastases to classification of lymph node status at the patient level: the CAMELYON17 challenge. IEEE Trans Med Imaging 2018, 38:550-560
6. Janowczyk A, Madabhushi A: Deep learning for digital pathology image analysis: a comprehensive tutorial with selected use cases. J Pathol Inform 2016, 7:29
7. Campanella G, Hanna MG, Geneslaw L, Miraflor A, Werneck Krauss Silva V, Busam KJ, Brogi E, Reuter VE, Klimstra DS, Fuchs TJ: Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat Med 2019, 25:1301-1309
8. Goodfellow I, Bengio Y, Courville A: Deep Learning. Vol 1. MIT Press, Cambridge, MA, 2016
9. LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD: Backpropagation applied to handwritten zip code recognition. Neural Comput 1989, 1:541-551
10. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y: Generative adversarial nets. Advances in Neural Information Processing Systems. Curran Associates, Red Hook, NY, 2014, pp 2672-2680
11. Hinton GE, Zemel RS: Autoencoders, minimum description length and Helmholtz free energy. Advances in Neural Information Processing Systems 1994, 6:3-10
12. Tschuchnig ME, Oostingh GJ, Gadermayr M: Generative adversarial networks in digital pathology: a survey on trends and future potential. Patterns (N Y) 2020, 1:100089
13. Liu Y, Gadepalli K, Norouzi M, Dahl GE, Kohlberger T, Boyko A, Venugopalan S, Timofeev A, Nelson PQ, Corrado GS, Hipp JD, Peng L, Stumpe MC: Detecting cancer metastases on gigapixel pathology images. arXiv 2017, [Preprint] arXiv:1703.02442
14. Mosquera-Lopez C, Agaian S, Velez-Hoyos A, Thompson I: Computer-aided prostate cancer diagnosis from digitized histopathology: a review on texture-based systems. IEEE Rev Biomed Eng 2014, 8:98-113
15. Liu Y-Y, Chen M, Ishikawa H, Wollstein G, Schuman JS, Rehg JM: Automated macular pathology diagnosis in retinal OCT images using multi-scale spatial pyramid and local binary patterns in texture and shape encoding. Med Image Anal 2011, 15:748-759
16. Ruifrok AC, Johnston DA: Quantification of histochemical staining by color deconvolution. Anal Quant Cytol Histol 2001, 23:291-299
17. Magee D, Treanor D, Crellin D, Shires M, Smith K, Mohee K, Quirke P: Colour normalisation in digital histopathology images. Proc Optical Tissue Image Analysis in Microscopy, Histopathology and Endoscopy (MICCAI Workshop). Vol 100. MICCAI, Springer Nature, London, UK, 2009, pp 100-111
18. Khan AM, Rajpoot N, Treanor D, Magee D: A nonlinear mapping approach to stain normalization in digital histopathology images using image-specific color deconvolution. IEEE Trans Biomed Eng 2014, 61:1729-1738
19. Cho H, Lim S, Choi G, Min H: Neural stain-style transfer learning using GAN for histopathological images. JMLR Workshop and Conference Proceedings 2017, 80:1-10 (ACML 2017)
20. Zhu J-Y, Park T, Isola P, Efros AA: Unpaired image-to-image translation using cycle-consistent adversarial networks. Proceedings of the IEEE International Conference on Computer Vision. IEEE, New York, NY, 2017, pp 2223-2232
21. Isola P, Zhu J-Y, Zhou T, Efros AA: Image-to-image translation with conditional adversarial networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, New York, NY, 2017, pp 1125-1134
22. Gatys LA, Ecker AS, Bethge M: A neural algorithm of artistic style. J Vis 2016, 16:326
23. Mirza M, Osindero S: Conditional generative adversarial nets. arXiv 2014, [Preprint] arXiv:1411.1784
24. BenTaieb A, Hamarneh G: Adversarial stain transfer for histopathology image analysis. IEEE Trans Med Imaging 2017, 37:792-802
25. Salehi P, Chalechale A: Pix2Pix-based stain-to-stain translation: a solution for robust stain normalization in histopathology images analysis. 2020 International Conference on Machine Vision and Image Processing (MVIP). IEEE, New York, NY, 2020, pp 1-7
26. Rana A, Yauney G, Lowe A, Shah P: Computational histological staining and destaining of prostate core biopsy RGB images with generative adversarial neural networks. 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE, New York, NY, 2018, pp 828-834
27. de Bel T, Hermsen M, Kers J, van der Laak J, Litjens G: Stain-transforming cycle-consistent generative adversarial networks for improved segmentation of renal histopathology. Proceedings of the 2nd International Conference on Medical Imaging with Deep Learning. PMLR, London, UK, 2019, pp 151-163
28. Ronneberger O, Fischer P, Brox T: U-Net: convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Berlin, Germany, 2015, pp 234-241