Copy-Pasting Coherent Depth Regions Improves Contrastive Learning for Urban-Scene Segmentation


Liang Zeng (Delft University of Technology), Attila Lengyel (Delft University of Technology), Nergis Tomen (Delft University of Technology), Jan C van Gemert (Delft University of Technology)*
The 33rd British Machine Vision Conference
Best Student Paper Award Honourable Mention

Abstract

In this work, we leverage estimated depth to boost self-supervised contrastive learning for segmentation of urban scenes, where unlabeled videos are readily available for training self-supervised depth estimation. We argue that the semantics of a coherent group of pixels in 3D space is self-contained and invariant to the contexts in which they appear. We group coherent, semantically related pixels into coherent depth regions given their estimated depth and use copy-paste to synthetically vary their contexts. In this way, cross-context correspondences are built in contrastive learning and a context-invariant representation is learned. For unsupervised semantic segmentation of urban scenes, our method surpasses the previous state-of-the-art baseline by +7.14% in mIoU on Cityscapes and +6.65% on KITTI. For fine-tuning on Cityscapes and KITTI segmentation, our method is competitive with existing models, yet, we do not need to pre-train on ImageNet or COCO, while we are also more computationally efficient. Our code is available on https://github.com/LeungTsang/CPCDR.

Video



Citation

@inproceedings{Zeng_2022_BMVC,
author    = {Liang Zeng and Attila Lengyel and Nergis  Tomen and Jan C van Gemert},
title     = {Copy-Pasting Coherent Depth Regions Improves Contrastive Learning for Urban-Scene Segmentation},
booktitle = {33rd British Machine Vision Conference 2022, {BMVC} 2022, London, UK, November 21-24, 2022},
publisher = {{BMVA} Press},
year      = {2022},
url       = {https://bmvc2022.mpi-inf.mpg.de/0893.pdf}
}


Copyright © 2022 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).

Imprint | Data Protection