Enhancing Medical Dialogue Summarization: A MediExtract Distillation Framework

Authors: Liu, X., Huang, M., Rusnachenko, N., Ive, J., Chang, J. and Zhang, J.J.

Venue: Proceedings - 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024

Pages: 6466-6473

DOI: 10.1109/BIBM62325.2024.10822640

Abstract:

Automatic summarization of medical dialogues, which converts colloquial doctor-patient conversations into concise notes, is increasingly important due to the growing complexity of healthcare data. However, the complexity of medical language and the lack of annotated datasets pose challenges for summarization models. In this paper, we propose the MediExtract Distillation Framework (MEDF), a novel hybrid teacher-student distillation framework that leverages the information-capture capabilities of Large Language Models (LLMs) to enhance the performance of a smaller student model. Using medical key information generated by GPT-3.5-Turbo, the model training involves two feedforward branches per iteration: one using the ground truth as labels and another using the generated structured medical key information as auxiliary supervision. We validated our method on the MTS-Dialogue dataset, achieving a +2.1% improvement in BLEURT over previous methods, demonstrating its effectiveness in summarizing medical dialogues. Additionally, using UMLS-based BERTScore, we observed a +1.8% increase in MedBERTScore for medical term extraction, highlighting our model's practical benefits in clinical information processing. Our framework is publicly available at: https://github.com/Xiaoxiao-Liu/distill-d2n.git
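The two-branch training described in the abstract can be sketched in minimal form: each iteration combines a loss against the ground-truth summary with an auxiliary loss against the LLM-generated key information. The function name and the weighting parameter `aux_weight` below are illustrative assumptions, not the paper's actual implementation.

```python
def combined_distillation_loss(gt_loss: float, aux_loss: float,
                               aux_weight: float = 0.5) -> float:
    """Illustrative per-iteration objective: the ground-truth branch loss
    plus a weighted auxiliary branch loss from GPT-generated structured
    medical key information. `aux_weight` is a hypothetical hyperparameter."""
    return gt_loss + aux_weight * aux_loss

# In a real training loop, this combined scalar would be backpropagated
# through the student model once per iteration.
total = combined_distillation_loss(gt_loss=1.2, aux_loss=0.8, aux_weight=0.5)
```

In practice the two branch losses would be token-level cross-entropy values from the student summarizer; the sketch only shows how the two supervision signals could be merged into a single objective.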

Source: Scopus