A retrieval-augmented chatbot based on GPT-4 provides appropriate differential diagnosis in gastrointestinal radiology: a proof of concept study
2024; Springer Science+Business Media; Volume: 8; Issue: 1; Language: English
DOI: 10.1186/s41747-024-00457-x
ISSN: 2509-9280
Authors: Stephan Rau, Alexander Rau, Johanna Nattenmüller, Anna Maria Fink, Fabian Bamberg, Marco Reisert, Maximilian Frederik Russe
Topic(s): Radiomics and Machine Learning in Medical Imaging
Abstract

Background: We investigated the potential of an imaging-aware GPT-4-based chatbot in providing diagnoses based on imaging descriptions of abdominal pathologies.

Methods: Utilizing zero-shot learning via the LlamaIndex framework, GPT-4 was enhanced using the 96 documents from the Radiographics Top 10 Reading List on gastrointestinal imaging, creating a gastrointestinal imaging-aware chatbot (GIA-CB). To assess its diagnostic capability, 50 cases on a variety of abdominal pathologies were created, comprising radiological findings in fluoroscopy, MRI, and CT. We compared the GIA-CB to the generic GPT-4 chatbot (g-CB) in providing the primary and 2 additional differential diagnoses, using interpretations from senior-level radiologists as ground truth. The trustworthiness of the GIA-CB was evaluated by investigating the source documents as provided by the knowledge-retrieval mechanism. Mann–Whitney U test was employed.

Results: The GIA-CB demonstrated a high capability to identify the most appropriate differential diagnosis in 39/50 cases (78%), significantly surpassing the g-CB in 27/50 cases (54%) (p = 0.006). Notably, the GIA-CB offered the primary differential in the top 3 differential diagnoses in 45/50 cases (90%) versus g-CB with 37/50 cases (74%) (p = 0.022) and always with appropriate explanations. The median response time was 29.8 s for GIA-CB and 15.7 s for g-CB, and the mean cost per case was $0.15 and $0.02, respectively.

Conclusions: The GIA-CB not only provided an accurate diagnosis for gastrointestinal pathologies, but also direct access to source documents, providing insight into the decision-making process, a step towards trustworthy and explainable AI. Integrating context-specific data into AI models can support evidence-based clinical decision-making.

Relevance statement: A context-aware GPT-4 chatbot demonstrates high accuracy in providing differential diagnoses based on imaging descriptions, surpassing the generic GPT-4. It provided formulated rationale and source excerpts supporting the diagnoses, thus enhancing trustworthy decision-support.

Key points
• Knowledge retrieval enhances differential diagnoses in a gastrointestinal imaging-aware chatbot (GIA-CB).
• GIA-CB outperformed the generic counterpart, providing formulated rationale and source excerpts.
• GIA-CB has the potential to pave the way for AI-assisted decision support systems.
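
The knowledge-retrieval idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the LlamaIndex API: it assumes a toy bag-of-words retriever in plain Python, and the document names, the excerpt texts, and the functions `retrieve` and `build_prompt` are all hypothetical placeholders. It shows only the general pattern of selecting the most relevant source excerpts for a findings description and passing them to a language model together with source labels, so that an answer could be traced back to its documents.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder excerpts; NOT taken from the reading list used in the study.
DOCS = {
    "doc_a": "Target sign on CT is a typical finding of bowel intussusception.",
    "doc_b": "Bird beak narrowing of the distal esophagus on fluoroscopy is typical of achalasia.",
    "doc_c": "Apple core lesion of the colon on barium enema suggests colorectal carcinoma.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the names of the k documents most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), name)
              for name, text in documents.items()]
    # Highest score first; ties are broken by document name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:k] if score > 0]


def build_prompt(findings, documents, k=2):
    """Assemble a prompt that carries the retrieved excerpts with source labels."""
    sources = retrieve(findings, documents, k)
    context = "\n".join(f"[{name}] {documents[name]}" for name in sources)
    prompt = (
        "Use the excerpts below to give a primary and two additional "
        "differential diagnoses, citing the excerpt labels.\n\n"
        f"Excerpts:\n{context}\n\nFindings: {findings}"
    )
    return prompt, sources


if __name__ == "__main__":
    prompt, sources = build_prompt("CT shows a target sign in the bowel", DOCS, k=1)
    print(prompt)
    print("Sources:", sources)
```

In a real system the resulting prompt would be sent to the language model, and the returned source labels are what would allow a reader to inspect the supporting excerpts. A production retriever would typically use embedding-based similarity rather than raw term counts; the term-count version is used here only to keep the sketch self-contained.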