GAMAFlow: Estimating 3D Scene Flow via Grouped Attention and Global Motion Aggregation

Authors: Li, Z., Yang, X. and Zhang, J.

Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Pages: 3955-3959

ISSN: 1520-6149

ISBN: 9798350344851

Publisher: IEEE

Place of Publication: New York

DOI: 10.1109/ICASSP48485.2024.10447849

Abstract:

The estimation of 3D motion fields, known as scene flow estimation, is an essential task in autonomous driving and robotic navigation. Existing learning-based methods either predict scene flow through flow-embedding layers or rely on local search to establish soft correspondences. However, these methods often neglect distant points that are, in fact, the true matching elements. To address this challenge, we introduce GAMAFlow, a point-voxel architecture that models both local and global motion to predict scene flow iteratively. In particular, GAMAFlow integrates the advantages of (i) a point Transformer with Grouped Attention and (ii) Global Motion Aggregation to boost the efficacy of point-voxel correlation. This approach facilitates learning long-distance dependencies between the current frame and the next. Experiments demonstrate the performance gains achieved by GAMAFlow over existing works on both the FlyingThings3D and KITTI benchmarks.
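The global motion aggregation idea named in the abstract can be illustrated with a short, hedged sketch: attention weights derived from context-feature similarity are used to pool motion features across the entire point cloud, so that distant but similar points can exchange motion information. The PyTorch module below, its GlobalMotionAggregation name, and the tensor shapes are illustrative assumptions, not the authors' GAMAFlow implementation.

import torch
import torch.nn as nn


class GlobalMotionAggregation(nn.Module):
    """Toy attention-based motion aggregation over a point cloud.

    Assumption: per-point context features drive the attention weights,
    and per-point motion features are the values being aggregated. This
    mirrors the general idea of global motion aggregation, not GAMAFlow's
    exact grouped-attention, point-voxel design.
    """

    def __init__(self, context_dim: int, motion_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.to_q = nn.Linear(context_dim, hidden_dim, bias=False)
        self.to_k = nn.Linear(context_dim, hidden_dim, bias=False)
        self.to_v = nn.Linear(motion_dim, motion_dim, bias=False)
        self.scale = hidden_dim ** -0.5
        # Learnable gate controlling how much aggregated motion is mixed in.
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, context: torch.Tensor, motion: torch.Tensor) -> torch.Tensor:
        # context: (B, N, context_dim); motion: (B, N, motion_dim)
        q = self.to_q(context)                                # (B, N, hidden_dim)
        k = self.to_k(context)                                # (B, N, hidden_dim)
        v = self.to_v(motion)                                 # (B, N, motion_dim)
        # Global attention over all N points of the frame.
        attn = torch.softmax((q @ k.transpose(1, 2)) * self.scale, dim=-1)  # (B, N, N)
        aggregated = attn @ v                                 # (B, N, motion_dim)
        # Residual mix of local motion features and globally aggregated ones.
        return motion + self.gamma * aggregated


if __name__ == "__main__":
    gma = GlobalMotionAggregation(context_dim=128, motion_dim=128)
    ctx = torch.randn(2, 1024, 128)   # per-point context features, 2 clouds of 1024 points
    mot = torch.randn(2, 1024, 128)   # per-point motion features
    print(gma(ctx, mot).shape)        # torch.Size([2, 1024, 128])

In a full scene-flow pipeline, aggregation of this kind would sit inside each refinement iteration alongside the local point-voxel correlation that the abstract describes.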

https://eprints.bournemouth.ac.uk/40135/

Source: Scopus; BURO EPrints
