Unsupervised Salient Object Detection on Light Field with High-Quality Synthetic Labels

Authors: Zheng, Y., Luo, Z., Cao, Y., Yang, X., Xu, W., Lin, Z., Yin, N. and Wang, P.

Journal: IEEE Transactions on Circuits and Systems for Video Technology

eISSN: 1558-2205

ISSN: 1051-8215

DOI: 10.1109/TCSVT.2024.3514754

Abstract:

Most current Light Field Salient Object Detection (LFSOD) methods require full supervision with labor-intensive pixel-level annotations. Unsupervised Light Field Salient Object Detection (ULFSOD) has gained attention due to this limitation. However, existing methods use traditional handcrafted techniques to generate noisy pseudo-labels, which degrades the performance of models trained on them. To mitigate this issue, we present a novel learning-based approach to synthesize labels for ULFSOD. We introduce a prominent focal stack identification module that utilizes light field information (focal stack, depth map, and RGB color image) to generate high-quality pixel-level pseudo-labels, aiding network training. Additionally, we propose a novel model architecture for LFSOD, combining a multi-scale spatial attention module for focal stack information with a cross fusion module for RGB and focal stack integration. Through extensive experiments, we demonstrate that our pseudo-label generation method significantly outperforms existing methods in label quality. Our proposed model, trained with our labels, shows significant improvement on ULFSOD, achieving new state-of-the-art scores across public benchmarks.

https://eprints.bournemouth.ac.uk/40606/

Source: Scopus

Unsupervised Salient Object Detection on Light Field with High-Quality Synthetic Labels

Authors: Zheng, Y., Luo, Z., Cao, Y., Yang, X., Xu, W., Lin, Z., Yin, N. and Wang, P.

Journal: IEEE Transactions on Circuits and Systems for Video Technology

Publisher: IEEE

eISSN: 1558-2205

ISSN: 1051-8215

https://eprints.bournemouth.ac.uk/40606/

Source: Manual

Unsupervised Salient Object Detection on Light Field with High-Quality Synthetic Labels

Authors: Zheng, Y., Luo, Z., Cao, Y., Yang, X., Xu, W., Lin, Z., Yin, N. and Wang, P.

Journal: IEEE Transactions on Circuits and Systems for Video Technology

Publisher: IEEE

ISSN: 1051-8215

https://eprints.bournemouth.ac.uk/40606/

Source: BURO EPrints