Do Language Models Help or Harm? The Role of LLM-Generated Explanations in Human-AI Image Classification Tasks

Authors: Naiseh, M., Zieni, B., Chiara, N., Bouchachia, H.

Journal: 2026 IEEE Conference on Artificial Intelligence, CAI 2026

Publication Date: 01/01/2026

Pages: 244-249

DOI: 10.1109/CAI68641.2026.11536442

Abstract:

As large language models (LLMs) are increasingly integrated into explainable AI (XAI) pipelines, there is growing interest in whether their fluent, human-like explanations improve or hinder decision-making in AI-assisted tasks. In this study, we examine how LLM-generated narrative explanations affect user understanding, confidence, and accuracy in a human-AI computer vision setting. Participants completed a fine-grained image classification task involving dog breeds, supported by either visual-only explanations (Grad-CAM) or visual + narrative explanations generated using GPT-4o. Using a 2×2 within-subjects design, we evaluated the effects of explanation type and model correctness on participant agreement with the AI, confidence ratings, decision accuracy, and confidence-accuracy calibration. Our results reveal a double-edged effect: narrative explanations increased confidence, especially when the model was correct, but did not improve overall accuracy. Critically, participants were more likely to accept incorrect predictions when a narrative explanation was present, suggesting a risk of overtrust.

Source: Scopus