Artigo Acesso aberto Revisado por pares

Overcoming data scarcity in biomedical imaging with a foundational multi-task model

2024; Nature Portfolio; Volume: 4; Issue: 7 Linguagem: Inglês

10.1038/s43588-024-00662-z

ISSN

2662-8457

Autores

Raphael Schäfer, Till Nicke, Henning Höfener, Annkristin Lange, Dorit Merhof, Friedrich Feuerhake, Volkmar Schulz, Johannes Lotz, Fabian Kießling,

Tópico(s)

AI in cancer detection

Resumo

Abstract Foundational models, pretrained on a large scale, have demonstrated substantial success across non-medical domains. However, training these models typically requires large, comprehensive datasets, which contrasts with the smaller and more specialized datasets common in biomedical imaging. Here we propose a multi-task learning strategy that decouples the number of training tasks from memory requirements. We trained a universal biomedical pretrained model (UMedPT) on a multi-task database including tomographic, microscopic and X-ray images, with various labeling strategies such as classification, segmentation and object detection. The UMedPT foundational model outperformed ImageNet pretraining and previous state-of-the-art models. For classification tasks related to the pretraining database, it maintained its performance with only 1% of the original training data and without fine-tuning. For out-of-domain tasks it required only 50% of the original training data. In an external independent validation, imaging features extracted using UMedPT proved to set a new standard for cross-center transferability.

Referência(s)