CroCPS: Addressing Photometric Challenges in Self-Supervised Category-Level 6D Object Poses with Cross-Modal Learning


Pengyuan Wang (Technical University of Munich)*, Lorenzo Garattoni (Toyota Motor Europe), Sven Meier (Toyota Motor Europe), Nassir Navab (Technical University of Munich), Benjamin Busam (Technical University of Munich)
The 33rd British Machine Vision Conference

Abstract

Estimating 6D object poses for everyday household objects is a crucial and challenging task for robotic applications. Recent advances in category-level object pose estimation show great potential in this direction. Because training these networks relies heavily on ground-truth 6D poses, which are expensive to annotate in real environments, self-supervised methods have become a realistic way to overcome the domain gap between synthetic and real images. However, these methods perform poorly on photometrically challenging objects because of missing depth or artifacts in RGBD data. We propose to use polarization cues to overcome the drawbacks of RGBD images and improve detection performance for objects with specular surfaces in the self-supervision stage. To this end, we generate a synthetic dataset containing cutlery of various shapes and sizes, and a markerless real dataset with accurate 6D pose annotations. We introduce several novel self-supervision losses based on inputs from multiple modalities, which fully utilize the polarization information. Experimental results show that the proposed method improves both 2D detection and the 3D IoU of the predicted bounding boxes over state-of-the-art (SOTA) methods without using annotated ground truth. This work constitutes the first solution for self-supervision on challenging reflective objects and explores the use of polarization images. We validate the proposed pipeline through thorough evaluations on the proposed synthetic and real data.

Citation

@inproceedings{Wang_2022_BMVC,
author    = {Pengyuan Wang and Lorenzo Garattoni and Sven Meier and Nassir Navab and Benjamin Busam},
title     = {CroCPS: Addressing Photometric Challenges in Self-Supervised Category-Level 6D Object Poses with Cross-Modal Learning},
booktitle = {33rd British Machine Vision Conference 2022, {BMVC} 2022, London, UK, November 21-24, 2022},
publisher = {{BMVA} Press},
year      = {2022},
url       = {https://bmvc2022.mpi-inf.mpg.de/0390.pdf}
}


Copyright © 2022 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
