TBAC: Transformers Based Attention Consensus for Human Activity Recognition

Authors: Yadav, S.K., Kera, S.B., Gonela, R.V., Tiwari, K., Pandey, H.M. and Akbar, S.A.

Journal: Proceedings of the International Joint Conference on Neural Networks

Volume: 2022-July

ISBN: 9781728186719

DOI: 10.1109/IJCNN55064.2022.9892906

Abstract:

Human Activity Recognition is an important task in Computer Vision that uses the spatio-temporal features of videos to classify human actions. The temporal dimension of a video carries information that is vital for accurate classification. However, common Deep Learning methods simply average the temporal features, giving all frames equal importance irrespective of their relevance, which reduces the accuracy of the model. To counter this effect, this paper proposes a novel Transformer Based Attention Consensus (TBAC) module. The TBAC module can be used in a plug-and-play manner as an alternative to the conventional consensus methods of any existing video action recognition network. The TBAC module contains four components: (i) Query Sampling Unit, (ii) Attention Extraction Unit, (iii) Softening Unit, and (iv) Attention Consensus Unit. Our experiments demonstrate that using the TBAC module in place of classical consensus can improve the performance of CNN-based action recognition models, such as Channel Separated Convolutional Network (CSN), Temporal Shift Module (TSM), and Temporal Segment Network (TSN). We also propose the Decision Consensus (DC) algorithm, which uses a novel fusion algorithm to combine multiple independent but related action recognizer models and improve upon the performance of most of these constituent models. Results have been obtained on two benchmark human action recognition datasets, HMDB51 and HAA500. The proposed TBAC module, used together with Decision Consensus, achieves state-of-the-art performance, with classification accuracies of 85.23% and 83.73% on HMDB51 and HAA500, respectively. The code will be made publicly available.
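
Illustrative note (not from the paper): the contrast the abstract draws between plain averaging and a relevance-weighted consensus can be sketched as follows. This is a minimal sketch under stated assumptions, not the TBAC module itself: the per-frame `relevance` scores are a hypothetical input standing in for whatever the attention units would produce, and the function names are invented for illustration.

```python
import math


def average_consensus(frame_scores):
    """Uniform averaging: every frame contributes equally to the result."""
    n = len(frame_scores)
    return [sum(column) / n for column in zip(*frame_scores)]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_consensus(frame_scores, relevance):
    """Weighted averaging: frames with higher relevance contribute more.

    `relevance` is a hypothetical per-frame score (an assumption of this
    sketch); the paper's actual attention computation is not reproduced here.
    """
    weights = softmax(relevance)
    return [
        sum(w * s for w, s in zip(weights, column))
        for column in zip(*frame_scores)
    ]


# Three frames, two classes; the middle frame disagrees with the others.
scores = [[0.2, 0.8], [0.9, 0.1], [0.1, 0.9]]
uniform = average_consensus(scores)
weighted = attention_consensus(scores, [0.0, 50.0, 0.0])
```

With equal relevance for every frame the weighted version reduces to the plain average; with one dominant relevance score the result is driven almost entirely by that frame.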

https://eprints.bournemouth.ac.uk/36995/

Source: Scopus

TBAC: Transformers Based Attention Consensus for Human Activity Recognition

Authors: Yadav, S.K., Kera, S.B., Gonela, R.V., Tiwari, K., Pandey, H.M. and Akbar, S.A.

Journal: 2022 International Joint Conference on Neural Networks (IJCNN)

ISSN: 2161-4393

DOI: 10.1109/IJCNN55064.2022.9892906

https://eprints.bournemouth.ac.uk/36995/

Source: Web of Science (Lite)

TBAC: Transformers Based Attention Consensus for Human Activity Recognition

Authors: Yadav, S., Kera, S.B., Gonela, R.V., Tiwari, K., Pandey, H. and Akbar, S.A.

Conference: IEEE WCCI 2022 International Joint Conference on Neural Networks (IJCNN 2022)

Dates: 18-23 July 2022

https://eprints.bournemouth.ac.uk/36995/

Source: Manual

TBAC: Transformers Based Attention Consensus for Human Activity Recognition

Authors: Yadav, S., Kera, S.B., Gonela, R.V., Tiwari, K., Pandey, H. and Akbar, S.A.

Conference: IEEE WCCI 2022 International Joint Conference on Neural Networks (IJCNN 2022)

https://eprints.bournemouth.ac.uk/36995/

Source: BURO EPrints