Multi-View Multi-Person 3D Pose Estimation with Uncalibrated Camera Networks

Yan Xu (Carnegie Mellon University)*, Kris Kitani (Carnegie Mellon University)
The 33rd British Machine Vision Conference


Existing efforts in multi-view multi-person 3D human pose estimation often rely on 6 DoF camera poses to obtain cross-view body joint matches for solving 3D poses. Other efforts use networks trained specifically for each dataset to regress 3D human poses. These methods do not generalize well to in-the-wild scenarios because they require calibrated camera poses or large amounts of training data. We present an approach that requires neither. Our key insight is to combine (1) well-developed 2D human detection and description networks, which can be pre-trained on open datasets, with (2) multi-view geometry and optimization algorithms that generalize to arbitrary settings. Using 2D human appearance embeddings as input, we solve cross-view human matching as an optimization problem constrained by the number of cameras, the number of people, and the fact that a person cannot be matched to another person in the same view. With the cross-view matches, we estimate the camera poses and 3D human poses simultaneously using multi-view geometry and bundle adjustment. On open datasets, our approach achieves lower pose estimation error than previous works with fewer requirements on camera poses and model training. We also evaluate our approach on three in-the-wild datasets with varied settings, including indoor and outdoor environments and static and dynamic cameras. The approach shows excellent generalization ability across these settings.
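The cross-view matching idea can be illustrated with a toy two-view example. The sketch below is not the authors' formulation: it assumes each detected person is already described by an appearance embedding (a plain list of floats), uses cosine distance as the matching cost, and brute-forces a one-to-one assignment so that no person is matched to two people in the other view. The function names `cosine_distance` and `match_two_views` are hypothetical, and the exhaustive search is only practical for a handful of people.

```python
from itertools import permutations
from math import sqrt


def cosine_distance(a, b):
    """Cosine distance between two embedding vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def match_two_views(view_a, view_b):
    """Match people across two views by minimizing total embedding distance.

    Returns a sorted list of (index_in_view_a, index_in_view_b) pairs.
    Each person appears in at most one pair, which encodes the constraint
    that a person cannot be matched to two people in the same view.
    """
    # Always assign from the smaller view into the larger one.
    swapped = len(view_a) > len(view_b)
    small, large = (view_b, view_a) if swapped else (view_a, view_b)

    cost = [[cosine_distance(s, l) for l in large] for s in small]

    best_total, best_assign = float("inf"), ()
    # Exhaustive search over one-to-one assignments (toy-sized inputs only).
    for assign in permutations(range(len(large)), len(small)):
        total = sum(cost[i][j] for i, j in enumerate(assign))
        if total < best_total:
            best_total, best_assign = total, assign

    pairs = list(enumerate(best_assign))
    if swapped:
        pairs = [(j, i) for i, j in pairs]
    return sorted(pairs)


# Made-up embeddings: two people in view A, three in view B.
view_a = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.1]]
view_b = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
matches = match_two_views(view_a, view_b)
print(matches)  # [(0, 1), (1, 0)]; person 2 in view B stays unmatched
```

A setup with more than two cameras would additionally need matches to agree across all view pairs, which this two-view sketch does not attempt.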



author    = {Yan Xu and Kris Kitani},
title     = {Multi-View Multi-Person 3D Pose Estimation with Uncalibrated Camera Networks},
booktitle = {33rd British Machine Vision Conference 2022, {BMVC} 2022, London, UK, November 21-24, 2022},
publisher = {{BMVA} Press},
year      = {2022},
url       = {}

Copyright © 2022 The British Machine Vision Association and Society for Pattern Recognition
