Adversarial robustness via attention transfer

Authors: Li, Z., Feng, C., Wu, M., Yu, H., Zheng, J. and Zhu, F.

Journal: Pattern Recognition Letters

Volume: 146

Pages: 172-178

ISSN: 0167-8655

DOI: 10.1016/j.patrec.2021.03.011

Abstract:

Deep neural networks are known to be vulnerable to adversarial attacks. The empirical analysis in our study suggests that attacks tend to induce diverse network architectures to shift their attention to irrelevant regions. Motivated by this observation, we propose a regularization technique that enforces well-aligned attention via a knowledge transfer mechanism, thereby encouraging robustness. The resultant model exhibits unprecedented robustness, securing 63.81% adversarial accuracy on the CIFAR-10 dataset under PGD attacks, where the prior art achieves 51.59%. In addition, we go beyond performance to analytically investigate the proposed method as an effective defense. A significantly flattened loss landscape can be observed, demonstrating the promise of the proposed method for improving robustness and thus for deployment in security-sensitive settings.
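
Illustrative sketch (not taken from the paper): the abstract does not give the form of the attention-alignment regularizer, so the snippet below assumes a common activation-based formulation from the attention transfer literature, in which a spatial attention map is the channel-wise sum of squared activations, L2-normalised, and the penalty is the L2 distance between two such maps. The names attention_map, attention_alignment_loss, reference_features and target_features are hypothetical, as is the choice of which two feature maps are compared.

```python
import math


def attention_map(features):
    """Flattened, L2-normalised spatial attention map.

    `features` is a list of C channels, each an H x W list of lists of floats.
    ASSUMPTION: attention = sum over channels of squared activations.
    """
    height, width = len(features[0]), len(features[0][0])
    flat = [
        sum(channel[i][j] ** 2 for channel in features)
        for i in range(height)
        for j in range(width)
    ]
    norm = math.sqrt(sum(v * v for v in flat))
    # Leave an all-zero map unchanged to avoid dividing by zero.
    return [v / norm for v in flat] if norm > 0 else flat


def attention_alignment_loss(reference_features, target_features):
    """L2 distance between the two normalised attention maps (hypothetical regularizer)."""
    ref = attention_map(reference_features)
    tgt = attention_map(target_features)
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(ref, tgt)))


# One channel, 2 x 2 spatial grid: attention in the top-left corner ...
focused = [[[1.0, 0.0], [0.0, 0.0]]]
# ... versus attention moved to the bottom-right corner.
shifted = [[[0.0, 0.0], [0.0, 1.0]]]

print(attention_alignment_loss(focused, focused))
print(attention_alignment_loss(focused, shifted))
```

Under these assumptions the penalty is zero when both maps attend to the same region and grows as attention moves elsewhere. In training, such a term would typically be added to the classification loss with a weighting coefficient; the abstract does not state how the authors combine them.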

https://eprints.bournemouth.ac.uk/35804/

Source: Scopus