Multi-scale feature enhancement using EfficientNet-B7 and PANet in Faster R-CNN for small object detection
Authors: Nazir, A. and Wani, M.A.
Journal: International Journal of Information Technology Singapore
eISSN: 2511-2112
ISSN: 2511-2104
DOI: 10.1007/s41870-025-02790-9
Abstract: Small object detection remains one of the most challenging tasks in computer vision due to limited semantic information and reduced spatial resolution. This paper presents an enhanced two-stage Faster R-CNN model that integrates EfficientNet-B7 as the backbone for multi-scale feature extraction and a Path Aggregation Network (PANet) for feature fusion. EfficientNet-B7, which is built on compound scaling, extracts deep, high-resolution feature representations, while PANet fuses multi-scale features via top-down and bottom-up pathways. The two modules are trained together so that they learn complementary representations that improve small object detection. The proposed hybrid model shows significant performance gains in detecting small-scale objects on the MS COCO and PASCAL VOC benchmark datasets, in terms of both accuracy and computational efficiency. The paper also describes the training methodology, loss functions, and optimization methods used, and analyses the architectural components and evaluation metrics employed in the implementation. These findings highlight the efficacy of integrating a high-capacity attention-rich backbone with advanced feature fusion modules, offering a promising direction for improving small-scale object detection performance.
Source: Scopus
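
Illustrative note: the abstract describes PANet as fusing multi-scale features via top-down and bottom-up pathways. The sketch below shows only that data flow on toy single-channel feature maps and is not the authors' implementation. Several things here are assumptions made for illustration: the helper names (`upsample2x`, `downsample2x`, `panet_fuse`, `constant_map`), the use of nearest-neighbour upsampling, and 2x2 max pooling standing in for the learned stride-2 convolutions a real PANet would use. Learned lateral and smoothing convolutions are omitted entirely.

```python
# Toy PANet-style fusion on single-channel 2D feature maps (nested lists).
# Assumption: learned convolutions are replaced by fixed resampling plus
# element-wise addition, so only the top-down / bottom-up data flow is shown.

def upsample2x(fmap):
    """Nearest-neighbour upsampling by a factor of 2 in both dimensions."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def downsample2x(fmap):
    """2x2 max pooling with stride 2 (stand-in for a stride-2 convolution)."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]), 2)]
        for r in range(0, len(fmap), 2)
    ]

def add(a, b):
    """Element-wise sum of two equally sized maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def panet_fuse(levels):
    """Fuse backbone maps ordered fine to coarse, each half the previous size."""
    # Top-down pathway: push coarse, semantically strong features to finer levels.
    top_down = [levels[-1]]
    for fmap in reversed(levels[:-1]):
        top_down.insert(0, add(fmap, upsample2x(top_down[0])))
    # Bottom-up pathway: push fine, well-localised features back to coarser levels.
    fused = [top_down[0]]
    for fmap in top_down[1:]:
        fused.append(add(fmap, downsample2x(fused[-1])))
    return fused

def constant_map(size, value):
    """Hypothetical helper: a size x size map filled with one value."""
    return [[value] * size for _ in range(size)]

# Three pyramid levels at 8x8, 4x4 and 2x2 resolution.
c3, c4, c5 = constant_map(8, 1.0), constant_map(4, 2.0), constant_map(2, 4.0)
n3, n4, n5 = panet_fuse([c3, c4, c5])
```

In this toy setup every output level ends up combining information from all three input levels, which is the property the abstract attributes to the fusion stage. In a detector, the fused maps would then feed the region proposal network and the detection head.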