Revisiting single-step adversarial training for robustness and generalization
Authors: Li, Z., Yu, D., Wu, M., Chan, S., Yu, H. and Han, Z.
Journal: Pattern Recognition
Volume: 151
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2024.110356
Abstract: Recently, single-step adversarial training has received considerable attention because it offers both robustness and efficiency. However, a phenomenon referred to as "catastrophic overfitting" has been observed; it is prevalent in single-step defenses and may frustrate attempts to use FGSM adversarial training. To address this issue, we propose a novel method, Stable and Efficient Adversarial Training (SEAT). SEAT mitigates catastrophic overfitting by harnessing local properties that differentiate a robust model from one prone to catastrophic overfitting. The proposed SEAT is underpinned by solid theoretical justification: minimizing the SEAT loss is shown to promote a smoother empirical risk, which in turn enhances robustness. Experimental results demonstrate that the proposed method successfully mitigates catastrophic overfitting, yielding superior performance amongst efficient defenses. Our single-step method reaches 51% robust accuracy on CIFAR-10 under ℓ∞ perturbations of radius 8/255 against a strong PGD-50 attack, matching the performance of a 10-step iterative method at merely 3% of the computational cost.
Source: Scopus
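
For context, the sketch below shows plain single-step (FGSM) adversarial training, the baseline whose catastrophic overfitting the paper addresses. It is a minimal PyTorch illustration under assumed settings (ε = 8/255, inputs scaled to [0, 1]); it is not the authors' SEAT method, whose loss the abstract does not specify in enough detail to reproduce.

```python
# Minimal sketch of single-step (FGSM) adversarial training -- the baseline
# discussed in the abstract, NOT the authors' SEAT loss. Model, optimizer,
# and hyperparameters (eps = 8/255) are illustrative assumptions.
import torch
import torch.nn.functional as F

def fgsm_adv_train_step(model, x, y, optimizer, eps=8/255):
    """One training step on FGSM adversarial examples (single-step attack)."""
    # Craft the single-step perturbation: eps times the sign of the
    # input gradient of the loss, then clip back to the valid image range.
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    x_adv = (x + eps * grad.sign()).clamp(0.0, 1.0).detach()

    # Train on the perturbed batch instead of the clean one.
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```

In this setting, catastrophic overfitting typically manifests as accuracy against multi-step attacks (e.g., PGD) collapsing mid-training while accuracy against the single-step FGSM attack itself remains high, which is the failure mode SEAT is designed to prevent.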