Non-uniform label smoothing for diabetic retinopathy grading from retinal fundus images with deep neural networks

Authors: Galdran, A., Chelbi, J., Kobi, R., Dolz, J., Lombaert, H., Ayed, I.B. and Chakor, H.

Journal: Translational Vision Science and Technology

Volume: 9

Issue: 2 Special Issue

Pages: 1-8

eISSN: 2164-2591

DOI: 10.1167/tvst.9.2.34

Abstract:

Purpose: Introducing a new technique to improve deep learning (DL) models designed for automatic grading of diabetic retinopathy (DR) from retinal fundus images by enhancing predictions’ consistency. Methods: A convolutional neural network (CNN) was optimized in three different manners to predict DR grade from eye fundus images. The optimization criteria were (1) the standard cross-entropy (CE) loss; (2) CE supplemented with label smoothing (LS), a regularization approach widely employed in computer vision tasks; and (3) our proposed non-uniform label smoothing (N-ULS), a modification of LS that models the underlying structure of expert annotations. Results: Performance was measured in terms of quadratic-weighted κ score (quad-κ) and average area under the receiver operating curve (AUROC), as well as with suitable metrics for analyzing diagnostic consistency, like weighted precision, recall, and F1 score, or Matthews correlation coefficient. While LS generally harmed the performance of the CNN, N-ULS statistically significantly improved performance with respect to CE in terms of quad-κ score (73.17 vs. 77.69, P < 0.025), without any performance decrease in average AUROC. N-ULS achieved this while simultaneously increasing performance for all other analyzed metrics. Conclusions: For extending standard modeling approaches from DR detection to the more complex task of DR grading, it is essential to consider the underlying structure of expert annotations. The approach introduced in this article can be easily implemented in conjunction with deep neural networks to increase their consistency without sacrificing per-class performance. Translational Relevance: A straightforward modification of current standard training practices of CNNs can substantially improve consistency in DR grading, better modeling expert annotations and human variability.
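Illustration: The abstract contrasts uniform label smoothing with a non-uniform variant that respects the ordering of DR grades, but it does not give the formula. The sketch below is therefore only a plausible reading, not the authors' formulation. It assumes five ordinal grades (0 to 4) and uses a Gaussian weighting over grade distance; the function names and the parameters `eps` and `sigma` are hypothetical and chosen for illustration.

```python
import numpy as np

NUM_GRADES = 5  # assumption: DR graded on an ordinal 0..4 scale


def uniform_smooth(label, num_classes=NUM_GRADES, eps=0.1):
    """Standard label smoothing: spread eps evenly over all classes."""
    target = np.full(num_classes, eps / num_classes)
    target[label] += 1.0 - eps
    return target


def non_uniform_smooth(label, num_classes=NUM_GRADES, sigma=0.75):
    """Illustrative non-uniform smoothing (assumed Gaussian form):
    grades closer to the annotated one receive more probability mass."""
    grades = np.arange(num_classes)
    weights = np.exp(-0.5 * ((grades - label) / sigma) ** 2)
    return weights / weights.sum()


def cross_entropy(target, logits):
    """Cross-entropy between a soft target and softmax(logits)."""
    shifted = logits - logits.max()  # shift for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(-(target * log_probs).sum())


if __name__ == "__main__":
    label = 2
    logits = np.array([0.1, 1.2, 2.0, 0.3, -0.5])
    print("uniform target:    ", uniform_smooth(label))
    print("non-uniform target:", non_uniform_smooth(label))
    print("CE (uniform LS):   ", cross_entropy(uniform_smooth(label), logits))
    print("CE (non-uniform):  ", cross_entropy(non_uniform_smooth(label), logits))
```

With uniform smoothing, a grade-2 image assigns the same small probability to grade 0 and grade 1. Under the non-uniform target, adjacent grades receive more mass than distant ones, which is one way to encode the idea that a disagreement between neighboring grades is more plausible than one between distant grades.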

Source: Scopus

Non-uniform Label Smoothing for Diabetic Retinopathy Grading from Retinal Fundus Images with Deep Neural Networks.

Authors: Galdran, A., Chelbi, J., Kobi, R., Dolz, J., Lombaert, H., Ben Ayed, I. and Chakor, H.

Journal: Transl Vis Sci Technol

Volume: 9

Issue: 2

Pages: 34

ISSN: 2164-2591

DOI: 10.1167/tvst.9.2.34

Source: PubMed

Non-uniform Label Smoothing for Diabetic Retinopathy Grading from Retinal Fundus Images with Deep Neural Networks

Authors: Galdran, A., Chelbi, J., Kobi, R., Dolz, J., Lombaert, H., ben Ayed, I. and Chakor, H.

Journal: TRANSLATIONAL VISION SCIENCE & TECHNOLOGY

Volume: 9

Issue: 2

ISSN: 2164-2591

DOI: 10.1167/tvst.9.2.34

Source: Web of Science (Lite)

Non-uniform Label Smoothing for Diabetic Retinopathy Grading from Retinal Fundus Images with Deep Neural Networks.

Authors: Galdran, A., Chelbi, J., Kobi, R., Dolz, J., Lombaert, H., Ben Ayed, I. and Chakor, H.

Journal: Translational vision science & technology

Volume: 9

Issue: 2

Pages: 34

eISSN: 2164-2591

ISSN: 2164-2591

DOI: 10.1167/tvst.9.2.34

Source: Europe PubMed Central