A Novel Method for Ultrasound Report Generation Using Multi-modal Feature Fusion
Authors: Cheddi, F., Habbani, A. and Nait-charif, H.
Journal: Lecture Notes in Networks and Systems
Volume: 1397 LNNS
Pages: 302-307
eISSN: 2367-3389
ISSN: 2367-3370
DOI: 10.1007/978-3-031-90921-4_42
Abstract: The integration of deep learning into medical imaging has shown significant promise for enhancing diagnostic accuracy and efficiency. This paper presents a novel deep learning-based approach to ultrasound report generation, using a multimodal fusion strategy to combine image and text data effectively. We employ a combination of convolutional neural networks (CNNs) and vision transformers (ViTs) to extract features, and use DistilGPT2 to generate the medical reports. The proposed system is trained and validated on extensive ultrasound datasets, demonstrating its capability to understand and describe complex medical scenarios. Experimental results show that our method outperforms previous approaches, attaining a BLEU-3 score of 0.471 and a ROUGE-L score of 0.765, demonstrating superior performance in producing coherent and clinically relevant reports.
Source: Scopus
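
The abstract describes fusing CNN and ViT image features before conditioning a language model. The following is a minimal sketch of one common way such a fusion step can work (early fusion by concatenation followed by a linear projection); the function names, feature dimensions, and projection scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_features(cnn_feat: np.ndarray, vit_feat: np.ndarray,
                  w: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Concatenate CNN and ViT feature vectors, then project linearly.

    The fused vector could serve as a visual prefix that conditions a
    language model such as DistilGPT2 during report generation.
    """
    fused = np.concatenate([cnn_feat, vit_feat])  # early fusion by concatenation
    return w @ fused + b                          # learned linear projection

# Assumed dimensions: 2048 (ResNet-style CNN), 768 (ViT-Base), 768 (LM hidden size)
cnn_dim, vit_dim, out_dim = 2048, 768, 768
w = rng.standard_normal((out_dim, cnn_dim + vit_dim)) * 0.01
b = np.zeros(out_dim)

cnn_feat = rng.standard_normal(cnn_dim)  # stand-in for extracted CNN features
vit_feat = rng.standard_normal(vit_dim)  # stand-in for extracted ViT features
prefix = fuse_features(cnn_feat, vit_feat, w, b)
print(prefix.shape)  # (768,)
```

In a full pipeline, the projected vector (or a sequence of such vectors) would be prepended to the token embeddings fed into the decoder, but the exact conditioning mechanism used in the paper is not specified in the abstract.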