Forecasting Distillation: Enhancing 3D Human Motion Prediction with Guidance Regularization
Authors: Du, Y., Wang, Z., Li, Y., Yang, X. and Wu, C.
Journal: Proceedings of the International Joint Conference on Neural Networks
DOI: 10.1109/IJCNN60899.2024.10650336
Abstract: Human motion prediction aims to forecast future body poses from historically observed sequences, a task made challenging by the complex dynamics of motion. Existing methods mainly focus on dedicated network structures to model the spatial and temporal dependencies, and the current training pipeline requires the predicted results to match the training samples strictly under an L2 loss. It should be noted that most approaches predict the next frame conditioned on the previously predicted sequence, so a small error in the initial frame can accumulate significantly. In addition, recent work indicates that different prediction stages can play different roles. Hence, this paper considers a new direction by introducing a model learning framework with motion guidance regularization to reduce uncertainty. The guidance information is extracted from a designed Fusion Feature Extraction network (FE-Net), while knowledge distillation is conducted through intermediate supervision to improve the multi-stage prediction network during training. Incorporated with baseline models, our guidance design exhibits clear performance gains in terms of 3D mean per joint position error (MPJPE) on the Human3.6M, CMU Mocap, and 3DPW benchmark datasets. Related code will be available at https://github.com/tempAnonymous2024/MotionPredict-GuidanceReg.
https://eprints.bournemouth.ac.uk/40539/
Source: Scopus
Publisher: IEEE
Place of Publication: Piscataway, NJ
Source: BURO EPrints