Multimodal hierarchical classification using cascade-of-thought

Authors: Hou, J., Tan, Z., Hu, Q., Wang, P., Gong, Y.

Journal: Information Processing and Management

Publication Date: 01/04/2026

Volume: 63

Issue: 3

eISSN: 1873-5371

ISSN: 0306-4573

DOI: 10.1016/j.ipm.2025.104555

Abstract:

We propose Cascade-of-Thought (CSOT), a novel prompt-based method for multimodal hierarchical classification (MHC) that requires no training or labeled exemplars. Inspired by the LLM-as-a-Judge (LaaJ) paradigm, CSOT decomposes classification into rationale generation, confidence scoring, and decision ranking, each implemented via structured prompts to a vision-language model (VLM). Experiments on two public MHC benchmarks demonstrate that CSOT yields substantial performance gains, particularly for weaker VLMs, while also enhancing the output quality of near-ceiling models. CSOT offers a flexible, generalizable solution for real-world MHC tasks.
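
The three stages named in the abstract (rationale generation, confidence scoring, decision ranking) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the prompt wording, the `vlm` callable interface, the dictionary-based label hierarchy, the greedy top-down descent, and the names `parse_score` and `cascade_classify` are all assumptions made for the example.

```python
import re
from typing import Callable, Dict, List

# Hypothetical interface (assumption): a VLM is any callable that takes
# (prompt, image_reference) and returns the model's text reply.
VLM = Callable[[str, str], str]


def parse_score(reply: str) -> float:
    """Return the first number found in the reply, clamped to [0, 1]; 0.0 if none."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        return 0.0
    return max(0.0, min(1.0, float(match.group())))


def cascade_classify(
    vlm: VLM, image: str, hierarchy: Dict[str, List[str]], root: str = "ROOT"
) -> List[str]:
    """Walk the label hierarchy top-down, choosing one child label per level."""
    path: List[str] = []
    node = root
    while hierarchy.get(node):
        scored = []
        for label in hierarchy[node]:
            # Stage 1 (assumed prompt): ask for a short rationale for this label.
            rationale = vlm(
                f"Explain briefly whether the image shows '{label}'.", image
            )
            # Stage 2 (assumed prompt): ask for a confidence given that rationale.
            reply = vlm(
                f"Rationale: {rationale}\n"
                f"Rate from 0 to 1 how confident you are that the label "
                f"'{label}' is correct. Answer with a number only.",
                image,
            )
            scored.append((parse_score(reply), label))
        # Stage 3: rank candidates by confidence; ties go to the first-listed label.
        _, best_label = max(scored, key=lambda pair: pair[0])
        path.append(best_label)
        node = best_label
    return path
```

Because `vlm` is passed in as a plain callable, the same sketch can be exercised with a stub function for testing or wired to a real model client. The paper itself should be consulted for the actual prompts and ranking procedure.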

Source: Scopus

Multimodal hierarchical classification using cascade-of-thought

Authors: Hou, J., Tan, Z., Hu, Q., Wang, P., Gong, Y.

Journal: INFORMATION PROCESSING & MANAGEMENT

Publication Date: 04/2026

Volume: 63

Issue: 3

eISSN: 1873-5371

ISSN: 0306-4573

DOI: 10.1016/j.ipm.2025.104555

Source: Web of Science