FaTNET: Feature-alignment transformer network for human pose transfer

Authors: Luo, Y., Yuan, C., Gao, L., Xu, W., Yang, X. and Wang, P.

Journal: Pattern Recognition

Volume: 165

ISSN: 0031-3203

DOI: 10.1016/j.patcog.2025.111626

Abstract:

Pose-guided person image generation converts an image of a person from a source pose to a target pose. The task is challenging because of large pose variability and frequent occlusion. Existing methods rely heavily on CNN-based architectures, which are constrained by their local receptive fields and often fail to preserve details of style and shape. To address this problem, we propose a novel transformer-based framework for human pose transfer that exploits global dependencies while preserving local features. The proposed framework consists of a transformer encoder, a feature alignment network and a transformer synthetic network, enabling the generation of realistic person images in the desired poses. The core idea of our framework is to obtain a novel prior image aligned with the target image through the feature alignment network in an embedded and disentangled feature space, and then to synthesize the final refined image through the transformer synthetic network by recurrently warping the result of the previous stage with the correlation matrix computed between the aligned features and the source image. In contrast to previous convolutional and non-local methods, ours employs a global receptive field while also preserving detailed features. Qualitative and quantitative experiments demonstrate the superiority of our model in human pose transfer.
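The warping step the abstract describes — building a correlation matrix between aligned features and source features, then using it to pull source content toward the target layout — can be sketched roughly as attention-style feature warping. The following minimal NumPy sketch is an illustrative assumption, not the paper's implementation: function names, feature shapes, and the cosine-similarity/softmax choices are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def warp_by_correlation(aligned_feat, source_feat):
    """Warp source features toward target-aligned features.

    aligned_feat: (N, C) flattened target-aligned feature vectors
    source_feat:  (M, C) flattened source-image feature vectors
    Returns an (N, C) array where each target position is a
    correlation-weighted combination of source features.
    """
    # Cosine-normalize so the correlation matrix is scale-invariant.
    a = aligned_feat / (np.linalg.norm(aligned_feat, axis=1, keepdims=True) + 1e-8)
    s = source_feat / (np.linalg.norm(source_feat, axis=1, keepdims=True) + 1e-8)
    corr = a @ s.T                 # (N, M) correlation matrix
    attn = softmax(corr, axis=1)   # rows sum to 1: soft correspondence
    return attn @ source_feat      # (N, C) warped source features


rng = np.random.default_rng(0)
aligned = rng.standard_normal((16, 8))
source = rng.standard_normal((16, 8))
warped = warp_by_correlation(aligned, source)
print(warped.shape)  # (16, 8)
```

In the paper's recurrent scheme this warping would be applied stage by stage, with the output of one stage re-entering the correlation computation at the next; the sketch shows only a single pass.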

eISSN: 1873-5142

Sources: Scopus, Web of Science (Lite)