Task Generalizable Spatial and Texture Aware Image Downsizing Network

Lin Ma (Samsung)*, Weiming Li (Samsung Research China – Beijing (SRC-B)), Hongsheng Li (The Chinese University of Hong Kong), Qiang Wang (Samsung Research China, Beijing), Ji-Yeon Kim (Samsung Advanced Institute of Technology)
The 33rd British Machine Vision Conference


CNN pipelines often downsize input images to a fixed size so that batch normalization can be used efficiently. With bilinear interpolation, the most commonly used downsizing method, information loss may occur because only the relative distance is considered when computing the interpolation coefficients. To preserve more image information, we propose DownsizeNet, a simple yet efficient interpolation method that extracts local texture information and fuses it into the interpolation using a modified CNN. Specifically, it encodes the relative distance as a map and aligns it spatially with CNN texture features through our specially designed floating-type pooling. DownsizeNet can be trained end-to-end together with the downstream CNN task and can be embedded seamlessly in various CNN networks at little extra cost. Experimental results on seven architectures across two tasks (four object detection pipelines and three classical segmentation pipelines) and on four datasets (Pascal VOC2007, MS COCO, Pascal VOC2012 Segmentation and Cityscapes) demonstrate that our method consistently yields a smaller accuracy drop than bilinear interpolation. We further demonstrate that our interpolation module generalizes well to different pipelines without re-training.
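To make the baseline concrete, the following is a minimal NumPy sketch of plain bilinear downsizing for a 2-D grayscale image. It is not DownsizeNet; it only illustrates the point that the interpolation coefficients depend on the relative distance alone and never on image content. The function names `relative_distance_map` and `bilinear_downsize` and the half-pixel sampling convention are assumptions made for this illustration, not taken from the paper.

```python
import numpy as np


def relative_distance_map(in_size, out_size):
    """Return the top-left source indices and fractional offsets for each
    output row and column. Assumes a half-pixel sampling convention and an
    input of at least 2x2 pixels (both are assumptions of this sketch)."""
    in_h, in_w = in_size
    out_h, out_w = out_size
    # Project output pixel centers into input coordinates.
    ys = (np.arange(out_h) + 0.5) * in_h / out_h - 0.5
    xs = (np.arange(out_w) + 0.5) * in_w / out_w - 0.5
    ys = np.clip(ys, 0, in_h - 1)
    xs = np.clip(xs, 0, in_w - 1)
    # Top-left neighbor, clamped so that the +1 neighbor stays in bounds.
    y0 = np.minimum(np.floor(ys).astype(int), in_h - 2)
    x0 = np.minimum(np.floor(xs).astype(int), in_w - 2)
    # Relative distances: the only quantities bilinear weights depend on.
    return y0, x0, ys - y0, xs - x0


def bilinear_downsize(image, out_size):
    """Bilinear resize of a 2-D array; weights ignore the image content."""
    image = np.asarray(image, dtype=float)
    y0, x0, dy, dx = relative_distance_map(image.shape, out_size)
    r, c = y0[:, None], x0[None, :]
    dy, dx = dy[:, None], dx[None, :]
    return ((1 - dy) * (1 - dx) * image[r, c]
            + (1 - dy) * dx * image[r, c + 1]
            + dy * (1 - dx) * image[r + 1, c]
            + dy * dx * image[r + 1, c + 1])


if __name__ == "__main__":
    # An 8x8 image whose only non-zero pixel lies outside the sampled
    # neighborhoods: the detail vanishes after downsizing to 2x2.
    img = np.zeros((8, 8))
    img[0, 0] = 1.0
    print(bilinear_downsize(img, (2, 2)))
```

In this sketch, downsizing 8x8 to 2x2 reads only rows and columns 1, 2, 5 and 6 of the input, so any detail elsewhere is discarded regardless of the local texture. This is the kind of information loss the abstract attributes to distance-only coefficients.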



author    = {Lin Ma and Weiming Li and Hongsheng Li and Qiang Wang and Ji-Yeon Kim},
title     = {Task Generalizable Spatial and Texture Aware Image Downsizing Network},
booktitle = {33rd British Machine Vision Conference 2022, {BMVC} 2022, London, UK, November 21-24, 2022},
publisher = {{BMVA} Press},
year      = {2022},
url       = {https://bmvc2022.mpi-inf.mpg.de/0315.pdf}

Copyright © 2022 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
