Revisiting single-step adversarial training for robustness and generalization
Authors: Li, Z., Yu, D., Wu, M., Chan, S., Yu, H. and Han, Z.
Journal: Pattern Recognition
Volume: 151
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2024.110356
Abstract: Recently, single-step adversarial training has received considerable attention because it offers both robustness and efficiency. However, a phenomenon referred to as "catastrophic overfitting" has been observed; it is prevalent in single-step defenses and may frustrate attempts to use FGSM adversarial training. To address this issue, we propose a novel method, Stable and Efficient Adversarial Training (SEAT). SEAT mitigates catastrophic overfitting by harnessing local properties that differentiate a robust model from one prone to catastrophic overfitting. The proposed SEAT is underpinned by solid theoretical justification: minimizing the SEAT loss is shown to promote a smoother empirical risk, which in turn enhances robustness. Experimental results demonstrate that the proposed method successfully mitigates catastrophic overfitting, yielding superior performance amongst efficient defenses. Our single-step method reaches 51% robust accuracy on CIFAR-10 under ℓ∞ perturbations of radius 8/255 against a strong PGD-50 attack, matching the performance of a 10-step iterative method at merely 3% of the computational cost.
Source: Scopus
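
For context, the sketch below shows plain single-step (FGSM) adversarial training, the baseline whose catastrophic overfitting the paper addresses. It is a minimal PyTorch illustration under assumed settings (ε = 8/255, inputs scaled to [0, 1]); it is not the authors' SEAT method, whose loss the abstract does not specify in enough detail to reproduce.

```python
# Minimal sketch of single-step (FGSM) adversarial training -- the baseline
# discussed in the abstract, NOT the authors' SEAT loss. Model, optimizer,
# and hyperparameters (eps = 8/255) are illustrative assumptions.
import torch
import torch.nn.functional as F

def fgsm_adv_train_step(model, x, y, optimizer, eps=8/255):
    """One training step on FGSM adversarial examples (single-step attack)."""
    # Craft the single-step perturbation: eps times the sign of the
    # input gradient of the loss, then clip back to the valid image range.
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    x_adv = (x + eps * grad.sign()).clamp(0.0, 1.0).detach()

    # Train on the perturbed batch instead of the clean one.
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```

In this setting, catastrophic overfitting typically manifests as accuracy against multi-step attacks (e.g., PGD) collapsing mid-training while accuracy against the single-step FGSM attack itself remains high, which is the failure mode SEAT is designed to prevent.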