Enhancing Pulmonary Disease Prediction Using Large Language Models With Feature Summarization and Hybrid Retrieval-Augmented Generation: Multicenter Methodological Study Based on Radiology Report.

Publication date: Jun 11, 2025

Rapid advances in natural language processing, particularly the development of large language models (LLMs), have opened new avenues for managing complex clinical text data. However, the inherent complexity and specificity of medical texts present significant challenges for the practical application of prompt engineering in diagnostic tasks. This paper explores combining LLMs with a new prompt engineering strategy to enhance model interpretability and to improve on the pulmonary disease prediction performance of a traditional deep learning model. A retrospective dataset of 2965 chest computed tomography (CT) radiology reports was constructed. The reports came from 4 cohorts: healthy individuals and patients with pulmonary tuberculosis, lung cancer, or pneumonia. A novel prompt engineering strategy was then proposed that integrates feature summarization (F-Sum), chain of thought (CoT) reasoning, and a hybrid retrieval-augmented generation (RAG) framework. The feature summarization approach used term frequency-inverse document frequency (TF-IDF) and K-means clustering to extract and distill key radiological findings related to the 3 diseases. In parallel, the hybrid RAG framework combined dense and sparse vector representations to enhance the LLMs' comprehension of disease-related text. In total, 3 state-of-the-art LLMs, GLM-4-Plus, GLM-4-air (Zhipu AI), and GPT-4o (OpenAI), were integrated with the prompt strategy to evaluate how well they recognize pneumonia, tuberculosis, and lung cancer. A traditional deep learning model, BERT (Bidirectional Encoder Representations from Transformers), was included as a baseline for comparison with the LLMs. Finally, the proposed method was tested on an external validation dataset consisting of 343 chest CT reports from another hospital.
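To make the "dense plus sparse" idea concrete, the following is a minimal sketch of hybrid retrieval scoring, not the authors' implementation. It assumes the sparse side is a TF-IDF vector, the dense side is a precomputed embedding, and the two cosine similarities are fused by a weighted sum. The function names, the weight `alpha`, the tokenized report snippets, and the 2-dimensional "embeddings" are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF dict per tokenized document, plus the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF so that every term gets a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (cnt / len(doc)) * idf[t] for t, cnt in tf.items()})
    return vectors, idf


def cosine_sparse(a, b):
    """Cosine similarity between two sparse dict vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def cosine_dense(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_rank(query_tokens, query_emb, docs, doc_embs, alpha=0.5):
    """Rank document indices by alpha * dense + (1 - alpha) * sparse similarity."""
    vectors, idf = tfidf_vectors(docs)
    q_tf = Counter(query_tokens)
    # Query terms unseen in the corpus carry no sparse weight.
    q_vec = {t: (c / len(query_tokens)) * idf[t] for t, c in q_tf.items() if t in idf}
    scored = []
    for i, (d_vec, d_emb) in enumerate(zip(vectors, doc_embs)):
        score = alpha * cosine_dense(query_emb, d_emb) + (1 - alpha) * cosine_sparse(q_vec, d_vec)
        scored.append((score, i))
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical tokenized snippets standing in for summarized disease features.
docs = [
    ["cavity", "upper", "lobe", "tree-in-bud"],    # tuberculosis-like
    ["spiculated", "mass", "lobulated"],           # lung-cancer-like
    ["patchy", "consolidation", "ground-glass"],   # pneumonia-like
]
# Toy 2-D vectors standing in for real sentence embeddings.
doc_embs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]

ranking = hybrid_rank(["spiculated", "mass"], [0.1, 0.9], docs, doc_embs)
print(ranking)  # document indices, best match first
```

In this sketch the sparse score rewards exact term overlap while the dense score can still rank a snippet that shares no words with the query; `alpha` controls the balance between the two and would need tuning on real data.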
Compared with the BERT-based prediction model and various other prompt engineering techniques, our method with GLM-4-Plus achieved the best performance on the test dataset, attaining an F1-score of 0.89 and an accuracy of 0.89. On the external validation dataset, the proposed method with GPT-4o achieved the highest F1-score (0.86) and accuracy (0.92). Compared with the popular strategy of manually selected typical samples (few-shot) and CoT designed by doctors (F1-score=0.83 and accuracy=0.83), the proposed method, which summarized disease characteristics (F-Sum) with an LLM and generated CoT automatically, performed better (F1-score=0.89 and accuracy=0.90). Although the BERT-based model achieved similar results on the test dataset (F1-score=0.85 and accuracy=0.88), its predictive performance decreased significantly on the external validation set (F1-score=0.48 and accuracy=0.78). These findings highlight the potential of LLMs to revolutionize pulmonary disease prediction, particularly in resource-constrained settings, by surpassing traditional models in both accuracy and flexibility. The proposed prompt engineering strategy not only improves predictive performance but also enhances the adaptability of LLMs in complex medical contexts, offering a promising tool for advancing disease diagnosis and clinical decision-making.
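The results above report both F1-score and accuracy, and the two can diverge sharply, as in the external validation figures for the BERT-based model. The sketch below shows one common way such a gap can arise, using invented labels rather than the study's data. It assumes a macro-averaged F1 (the abstract does not state which averaging was used), and the function names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Invented, imbalanced toy labels: a model that always predicts the majority class.
y_true = ["healthy"] * 8 + ["tuberculosis", "lung cancer"]
y_pred = ["healthy"] * 10

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, macro F1={f1:.2f}")
```

Here accuracy stays high because the majority class dominates, while macro F1 drops because the two minority classes are never predicted. Whether a similar mechanism explains the reported figures cannot be determined from the abstract alone.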


Concepts: Adaptability; Cancer; F1; Pulmonary; Tomography

Keywords: Deep Learning; Humans; Large Language Models; large language models; LLM; Lung Diseases; Lung Neoplasms; Natural Language Processing; prompt engineering; pulmonary disease prediction; RAG; retrieval-augmented generation; Retrospective Studies; Tomography, X-Ray Computed

Semantics

Type Source Name
disease MESH Pulmonary Disease
disease MESH pulmonary tuberculosis
disease MESH lung cancer
disease MESH pneumonia
drug DRUGBANK Medical air
disease MESH tuberculosis
pathway KEGG Tuberculosis
drug DRUGBANK Tropicamide
disease MESH morbidity
disease IDO quality
disease MESH viral pneumonia
drug DRUGBANK Gold
drug DRUGBANK Coenzyme M
disease IDO process
disease MESH privacy
disease IDO algorithm
drug DRUGBANK Isoxaflutole
disease IDO entity
drug DRUGBANK Ademetionine
disease MESH confusion
drug DRUGBANK Methyprylon
disease MESH fibrosis
disease MESH tics
drug DRUGBANK L-Valine
disease MESH granulomas
disease MESH miliary tuberculosis
disease MESH Mycoplasma pneumonia
disease MESH pancreatic cancer
pathway KEGG Pancreatic cancer
disease MESH emergency
disease IDO intervention
disease IDO cell
disease IDO pathogen
disease MESH infections
pathway REACTOME Reproduction

