Multi-View Multi-Person 3D Pose Estimation With Plane Sweep Stereo

CVPR 2021

Jiahao Lin, Gim Hee Lee
Department of Computer Science, National University of Singapore



Existing approaches for multi-view multi-person 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views and then solve for the 3D pose of each person. Establishing cross-view correspondences is challenging in multi-person scenes, and incorrect correspondences lead to sub-optimal performance in such multi-stage pipelines. In this work, we present a multi-view 3D pose estimation approach based on plane sweep stereo that jointly addresses cross-view fusion and 3D pose reconstruction in a single shot. Specifically, we propose to perform depth regression for each joint of each 2D pose in a target camera view. Cross-view consistency constraints are implicitly enforced by multiple reference camera views via the plane sweep algorithm to facilitate accurate depth regression. We adopt a coarse-to-fine scheme that first regresses the person-level depth and then estimates per-person joint-level relative depths. 3D poses are obtained by a simple back-projection given the estimated depths. We evaluate our approach on benchmark datasets, where it outperforms previous state-of-the-art methods while being remarkably efficient.
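The geometric idea behind the abstract can be illustrated with a small sketch. This is not the released implementation: the paper regresses depths with a learned network, whereas the toy code below simply picks, from a list of depth hypotheses, the one whose reprojection into a reference view lands closest to a detected 2D joint. The `Camera` class and the functions `back_project`, `project`, and `sweep_depth` are hypothetical names introduced here, and a distortion-free pinhole model with world-to-camera extrinsics is assumed.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Camera:
    """Hypothetical pinhole camera with x_cam = R @ x_world + t (no distortion)."""
    fx: float
    fy: float
    cx: float
    cy: float
    R: Sequence[Sequence[float]]  # 3x3 world-to-camera rotation
    t: Sequence[float]            # world-to-camera translation, length 3


def back_project(cam: Camera, u: float, v: float, depth: float) -> Tuple[float, ...]:
    """Lift pixel (u, v) at a camera-frame depth to a 3D point in world coordinates."""
    # Point in the camera frame from the pinhole model.
    xc = ((u - cam.cx) / cam.fx * depth, (v - cam.cy) / cam.fy * depth, depth)
    # Invert the extrinsics: x_world = R^T (x_cam - t).
    d = [xc[i] - cam.t[i] for i in range(3)]
    return tuple(sum(cam.R[i][j] * d[i] for i in range(3)) for j in range(3))


def project(cam: Camera, X: Sequence[float]) -> Tuple[float, float]:
    """Project a 3D world point into the image plane of `cam`."""
    xc = [sum(cam.R[i][j] * X[j] for j in range(3)) + cam.t[i] for i in range(3)]
    return (cam.fx * xc[0] / xc[2] + cam.cx, cam.fy * xc[1] / xc[2] + cam.cy)


def sweep_depth(
    target: Camera,
    reference: Camera,
    joint_uv: Tuple[float, float],
    ref_joint_uv: Tuple[float, float],
    depths: Sequence[float],
) -> float:
    """Return the depth hypothesis most consistent with the reference-view detection.

    Simplification for illustration: consistency is the squared pixel distance
    between the reprojected joint and a single reference-view 2D joint.
    """
    def reprojection_error(depth: float) -> float:
        X = back_project(target, joint_uv[0], joint_uv[1], depth)
        u, v = project(reference, X)
        return (u - ref_joint_uv[0]) ** 2 + (v - ref_joint_uv[1]) ** 2

    return min(depths, key=reprojection_error)
```

Once a depth has been selected for a joint, calling `back_project` on the target-view pixel with that depth yields the 3D joint position, which mirrors the final back-projection step described above. A learned approach would replace the hand-written `reprojection_error` with scores aggregated over all reference views and fed to a regression network.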

[Paper] [arXiv] [Code]


@InProceedings{Lin_2021_CVPR,
    author    = {Lin, Jiahao and Lee, Gim Hee},
    title     = {Multi-View Multi-Person 3D Pose Estimation With Plane Sweep Stereo},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {11886-11895}
}


