Multimodal hierarchical classification using cascade-of-thought
Authors: Hou, J., Tan, Z., Hu, Q., Wang, P., Gong, Y.
Journal: Information Processing and Management
Publication Date: 01/04/2026
Volume: 63
Issue: 3
eISSN: 1873-5371
ISSN: 0306-4573
DOI: 10.1016/j.ipm.2025.104555
Abstract: We propose Cascade-of-Thought (CSOT), a novel prompt-based method for multimodal hierarchical classification (MHC) that requires no training or labeled exemplars. Inspired by the LLM-as-a-Judge (LaaJ) paradigm, CSOT decomposes classification into rationale generation, confidence scoring, and decision ranking, each implemented via structured prompts to a vision-language model (VLM). Experiments on two public MHC benchmarks demonstrate that CSOT yields substantial performance gains, particularly for weaker VLMs, while also enhancing the output quality of near-ceiling models. CSOT offers a flexible, generalizable solution for real-world MHC tasks.
Source: Scopus
Multimodal hierarchical classification using cascade-of-thought
Authors: Hou, J., Tan, Z., Hu, Q., Wang, P., Gong, Y.
Journal: Information Processing & Management
Publication Date: 04/2026
Volume: 63
Issue: 3
eISSN: 1873-5371
ISSN: 0306-4573
DOI: 10.1016/j.ipm.2025.104555
Source: Web of Science