Moving from ImageNet to RadImageNet for Improved Transfer Learning and Generalizability
Radiology: Artificial Intelligence, 2022; Vol. 4, No. 5. Radiological Society of North America.
DOI: 10.1148/ryai.220126
ISSN: 2638-6100
Topic(s): Radiomics and Machine Learning in Medical Imaging
Alexandre Cadrin-Chênevert

From the Department of Medical Imaging, CISSS Lanaudière, 1000 Boulevard Sainte-Anne, Saint-Charles-Borromée, QC, Canada J6E 6J2; and the Department of Radiology and Nuclear Medicine, Laval University, Quebec City, Canada. Address correspondence to the author (email: [email protected]).

Published online: August 10, 2022. https://doi.org/10.1148/ryai.220126

See also the article by Mei et al in this issue.

Alexandre Cadrin-Chênevert, MD, BEng, is a diagnostic and interventional radiologist at CISSS Lanaudière and a clinical professor at Laval University. He previously served as chief of the medical imaging department. As a Kaggle competition master, he has successfully participated in many machine learning competitions. He is an early member of the Canadian Association of Radiologists (CAR) Artificial Intelligence (AI) Standing Committee. His current research interests include deep learning, computer vision, object detection, self-supervised learning, model generalizability, and public medical imaging datasets.

The last decade has seen very rapid development of the computer vision field, driven by the union of three main factors: the advent of deep learning algorithms, the exponential growth of computing power, and the public sharing of large image datasets. Among these factors, the creation and curation of the publicly available ImageNet dataset (1) is arguably the one that most catalyzed the development of the field. ImageNet is used to measure the performance of classification algorithms and architectures, and it also serves as a basis for pretraining deep learning networks for efficient transfer learning to subsequent downstream tasks.

When the number of available images is limited, transfer learning from a network pretrained on ImageNet generally achieves substantially better performance than training the same model from randomly initialized weights. The underlying explanation is the transfer of the basic visual representations learned by the original network: through its internal representations, the network can recognize simple shapes and then assemble them into more complex ones. Moreover, the stable regularization provided by the transferred network generally improves robustness to shifts in the data distribution.

In medical imaging, the application and benefit of transfer learning from ImageNet have been demonstrated repeatedly in the literature. However, the medical imaging domain has distinctly different properties from the so-called natural (photographic) images included in the ImageNet dataset. In particular, medical images are generally formatted only in shades of gray rather than the red-green-blue channels of full-color natural images. Because of the physical properties of their acquisition, medical images are usually much noisier than natural images, which limits the transfer of texture representations. Finally, many complex shapes of the human body, in two or three dimensions, are rarely represented in a natural image dataset such as ImageNet.
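To make the channel mismatch concrete, the following is a minimal sketch (illustrative only, not drawn from this commentary or from Mei et al) of one common workaround: collapsing the RGB filters of an ImageNet-pretrained network's first convolution into a single grayscale channel, which keeps the pretrained spatial patterns while accepting one-channel input. The simpler alternative of replicating the grayscale image across three channels is equally common.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a ResNet-50 with ImageNet-pretrained weights (torchvision >= 0.13 API).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# The original stem expects 3-channel input: conv1.weight has shape (64, 3, 7, 7).
rgb_filters = model.conv1.weight.data

# Replace the stem with a 1-channel convolution initialized with the mean of
# the RGB filters, preserving the pretrained spatial patterns.
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.conv1.weight.data = rgb_filters.mean(dim=1, keepdim=True)

# Sanity check on a dummy grayscale batch of shape (batch, 1, H, W).
x = torch.randn(2, 1, 224, 224)
print(model(x).shape)  # torch.Size([2, 1000])
```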
Thus, in this context, transfer learning from ImageNet can sometimes drift toward unintended biases found in a specific dataset of medical images. These nongeneralizable biases are frequently unrelated to the anatomic region or pathology being studied. This learning drift consequently undermines the robustness of deep learning networks to new distributions on the same task.

Several authors have tried to overcome these domain differences by using either self-supervised or unsupervised pretraining on medical images (2–4). The results were promising, but none of these studies made their images publicly available, which substantially limits the scope and reproducibility of the results. In addition, several very large medical imaging datasets have been made publicly available, but these datasets were generally focused on one or a few modalities, such as chest radiography, head CT, or mammography (5–9). Even though the number of images in these previously shared datasets was very large, their lack of diversity limits the ability to apply transfer learning widely to various downstream tasks.

The creation and sharing of a large and diverse dataset of medical images can be considered the holy grail for those interested in computer vision in medical imaging. The article by Mei et al in this issue of Radiology: Artificial Intelligence embarks successfully on this fundamental but challenging quest (10). The East River Medical Imaging group provided nearly 1.35 million CT, MRI, and US images with annotated categories extracted from key images as determined by the radiologist's interpretation. Several parts of the human body are represented in this dataset, including the head, spine, thorax, abdomen, pelvis, thyroid, and upper and lower limbs.

The authors, from the Biomedical Engineering and Imaging Institute at the Icahn School of Medicine at Mount Sinai, methodically trained deep convolutional neural networks on this shared medical imaging dataset and tested the performance of transfer learning from this pretraining on several different medical imaging datasets. They convincingly demonstrate the performance benefit of transfer learning from RadImageNet compared with transfer learning from ImageNet for both classification and semantic segmentation tasks. Although the RadImageNet annotations can be considered weak labels with a presumably large variation in interpretation, this approach, using 165 classification categories, clearly allows the training of well-regularized deep convolutional neural networks that can efficiently transfer learned medical representations to subsequent medical imaging tasks. The authors also analyzed and compared the gain in classification performance between small and large datasets: the performance gain from transfer learning tends to be more substantial on smaller datasets, although it is also present with larger ones.

The value of this project is even more remarkable considering that the authors have shared their pretrained models with the scientific community. It is also possible to request access to the images directly from the data provider. Such sharing, even if conditional, offers researchers a real option to test the reproducibility of the study. This access to models and images also opens a whole new perspective for future projects.
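As an illustration of the recipe the study compares, the sketch below fine-tunes a ResNet-50 initialized from a pretrained checkpoint on a small downstream classification task. This is a generic PyTorch sketch under stated assumptions: the checkpoint file name is hypothetical, the actual RadImageNet weights released by the authors may require their own loading code, and the 165-way pretraining head is discarded in favor of the downstream head.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 2  # e.g., a binary downstream task

# Start from an uninitialized backbone and load pretrained weights.
backbone = models.resnet50(weights=None)
state_dict = torch.load("radimagenet_resnet50.pt", map_location="cpu")  # hypothetical file
# Drop the pretraining classification head; its 165-way shape would not match
# the downstream head.
state_dict = {k: v for k, v in state_dict.items() if not k.startswith("fc.")}
backbone.load_state_dict(state_dict, strict=False)

# Swap in a new head sized for the downstream task.
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_CLASSES)

# Optionally freeze early layers so only higher-level features adapt,
# a common choice when the downstream dataset is small.
for name, param in backbone.named_parameters():
    if not name.startswith(("layer4", "fc")):
        param.requires_grad = False

optimizer = torch.optim.Adam(
    (p for p in backbone.parameters() if p.requires_grad), lr=1e-4
)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch.
x = torch.randn(4, 3, 224, 224)
y = torch.randint(0, NUM_CLASSES, (4,))
optimizer.zero_grad()
loss = criterion(backbone(x), y)
loss.backward()
optimizer.step()
```

Swapping the checkpoint for ImageNet weights in the same script would yield the kind of head-to-head comparison the authors report.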
For example, new model architectures could be pretrained with the available images and their performance compared with that of the published baseline models. Several comparative studies of pretraining on ImageNet versus RadImageNet, followed by transfer learning on other medical imaging data, could be performed. It will also be exciting to compare the robustness of networks pretrained with RadImageNet to shifts in data distribution, such as variations in medical equipment, reconstruction algorithms, or pathology prevalence in a given population. Ultimately, such analyses could make the clinical deployment of deep learning algorithms in diverse conditions and populations safer and more robust.

The RadImageNet dataset does not currently include standard radiographs, which is a practical limitation to applying optimal transfer learning to this modality. Interestingly, the authors still demonstrated a transfer learning performance gain over ImageNet on a chest radiograph dataset and a pneumonia classification task. This result suggests that part of the benefit is explained by pretraining exclusively in shades of gray rather than in color. However, it would be promising to see a future dataset version that includes standard radiographs and other imaging modalities. An amalgamation of RadImageNet with other public datasets that specifically include standard radiographs could potentially expand the overall representation and transfer capability of the pretrained models.

Thus, for the computer vision research community interested in medical imaging, RadImageNet can have a substantial positive and catalytic impact in the short term. This impact can be direct, through use of the shared pretrained models or the image dataset as a basis for transfer learning. It can also be indirect, by reusing the concept of specialized pretraining for medical imaging with other approaches, such as unsupervised learning on very large amounts of unannotated data. For clinical radiologists and patients, the effect of RadImageNet will not be easily perceptible in the short term, but it could substantially improve the performance and robustness of several clinically deployed algorithms over the foreseeable future.

Disclosures of conflicts of interest: A.C.C. Payment from Société Canadienne Française de Radiologie for artificial intelligence webinar; payment for travel expenses from Radiological Society of North America AI Summit 2018; associate editor for Radiology: Artificial Intelligence. The author declared no funding for this work.

References

1. Deng J, Dong W, Socher R, Li LJ, Li K, Li FF. ImageNet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, June 20–25, 2009. Piscataway, NJ: IEEE; 2009; 248–255.
2. Parakh A, Lee H, Lee JH, Eisner BH, Sahani DV, Do S. Urinary stone detection on CT images using deep convolutional neural networks: evaluation of model performance and generalization. Radiol Artif Intell 2019;1(4):e180066.
3. Xie Y, Richmond D. Pre-training on grayscale ImageNet improves medical image classification. In: Leal-Taixé L, Roth S, eds. Computer Vision – ECCV 2018 Workshops. ECCV 2018. Lecture Notes in Computer Science, vol 11134. Cham, Switzerland: Springer; 2019; 476–484.
4. Ghesu FC, Georgescu B, Mansoor A, et al. Self-supervised learning from 100 million medical images. arXiv:2201.01283 [preprint]. https://arxiv.org/abs/2201.01283. Posted January 4, 2022. Accessed June 23, 2022.
5. Irvin J, Rajpurkar P, Ko M, et al. CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison. Proc AAAI Conf Artif Intell 2019;33(1):590–597.
6. Flanders AE, Prevedello LM, Shih G, et al. Construction of a machine learning dataset through collaboration: The RSNA 2019 Brain CT Hemorrhage Challenge. Radiol Artif Intell 2020;2(3):e190211. [Published correction appears in Radiol Artif Intell 2020;2(4):e209002.]
7. Dembrower K, Lindholm P, Strand F. A multi-million mammography image dataset and population-based screening cohort for the training and evaluation of deep neural networks—the Cohort of Screen-Aged Women (CSAW). J Digit Imaging 2020;33(2):408–413.
8. Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM. ChestX-Ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, July 21–26, 2017. Piscataway, NJ: IEEE; 2017; 3462–3471.
9. Johnson AEW, Pollard TJ, Berkowitz SJ, et al. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci Data 2019;6(1):317.
10. Mei X, Liu Z, Robson PM, et al. RadImageNet: An open radiological deep learning research dataset for effective transfer learning. Radiol Artif Intell 2022;4(5):e210315.

Article History
Received: June 27, 2022. Revision requested: June 29, 2022. Revision received: June 30, 2022. Accepted: July 7, 2022. Published online: August 10, 2022.