Improving Dense Representation Learning by Superpixelization and Contrasting Cluster Assignment


Robin Karlsson (Nagoya University),* Tomoki Hayashi (Nagoya University), Keisuke Fujii (Nagoya University / RIKEN), Alexander Carballo (Nagoya University), Kento Ohtani (Nagoya University), Kazuya Takeda (Nagoya University)
The 33rd British Machine Vision Conference

Abstract

Recent self-supervised models have demonstrated performance equal to or better than supervised methods, opening the way for AI systems to learn visual representations from practically unlimited data. However, these methods are typically classification-based and thus ineffective for learning high-resolution feature maps that preserve precise spatial information. This work introduces superpixels to improve self-supervised learning of dense, semantically rich visual concept embeddings. Decomposing images into a small set of visually coherent regions reduces the computational complexity by $\mathcal{O}(1000)$ while preserving detail. We experimentally show that contrasting over regions improves the effectiveness of contrastive learning methods, extends their applicability to high-resolution images, and improves overclustering performance; that superpixels are better than grids; and that regional masking improves performance. The expressiveness of our dense embeddings is demonstrated by improving on the SOTA in unsupervised semantic segmentation on the Cityscapes benchmark, and for convolutional models on COCO.
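The idea of replacing per-pixel embeddings with one embedding per region can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `region_mean_pool`, the use of simple mean pooling, and the toy label map are assumptions made for illustration. In practice the label map would come from a superpixel algorithm; here it is written by hand.

```python
import numpy as np


def region_mean_pool(features, labels):
    """Average a dense feature map over regions (hypothetical helper).

    features: (H, W, D) float array of per-pixel embeddings.
    labels:   (H, W) int array giving a region index in 0..R-1 per pixel.
    Returns an (R, D) array holding one mean embedding per region.
    """
    h, w, d = features.shape
    flat_feats = features.reshape(h * w, d)
    flat_labels = labels.reshape(h * w)
    num_regions = int(flat_labels.max()) + 1

    # Accumulate the embeddings of all pixels that share a region index.
    sums = np.zeros((num_regions, d), dtype=np.float64)
    np.add.at(sums, flat_labels, flat_feats)

    # Divide by the pixel count of each region (guarding empty regions).
    counts = np.bincount(flat_labels, minlength=num_regions)
    return sums / np.maximum(counts, 1)[:, None]


# Toy example: a 4x6 feature map with 8 channels, split into two regions.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 6, 8))
labels = np.zeros((4, 6), dtype=np.int64)
labels[:, 3:] = 1  # right half of the image is region 1

region_embeddings = region_mean_pool(features, labels)
print(features.shape[0] * features.shape[1], "pixels ->",
      region_embeddings.shape[0], "regions")
```

Any loss computed over the pooled embeddings then scales with the number of regions rather than the number of pixels, which is the source of the complexity reduction the abstract describes.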


Citation

@inproceedings{Karlsson_2022_BMVC,
author    = {Robin Karlsson and Tomoki Hayashi and Keisuke Fujii and Alexander Carballo and Kento Ohtani and Kazuya Takeda},
title     = {Improving Dense Representation Learning by Superpixelization and Contrasting Cluster Assignment},
booktitle = {33rd British Machine Vision Conference 2022, {BMVC} 2022, London, UK, November 21-24, 2022},
publisher = {{BMVA} Press},
year      = {2022},
url       = {https://bmvc2022.mpi-inf.mpg.de/0699.pdf}
}


Copyright © 2022 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
