Progressive Multi-stage Interactive Training in Mobile Network for Fine-grained Classification

Zhenxin Wu (Department of Computer Science, Jinan University),* Qingliang Chen (Jinan University), Yongjian Huang (Guangzhou Xuanyuan Research Institute Co., Ltd.)
The 33rd British Machine Vision Conference


Fine-grained Visual Classification (FGVC) aims to distinguish objects at the subcategory level. It is a very challenging task because the differences between classes are subtle. Existing research uses large-scale convolutional neural networks or visual transformers as the feature extractor, which is extremely computationally expensive. Real-world fine-grained recognition scenarios often require a more lightweight mobile network that can be used offline. However, the feature extraction capability of a basic mobile network is weaker than that of large-scale models, so it performs poorly on FGVC. In this paper, based on the lightweight MobileNetV2, we propose a Progressive Multi-Stage Interactive training method with a Recursive Mosaic Generator (RMG-PMSI). First, we propose a Recursive Mosaic Generator (RMG) that generates images with different granularities in different phases. Then, the features of different stages pass through a Multi-Stage Interaction (MSI) module, which strengthens and complements the corresponding features of different stages. Finally, using progressive training (P), the features extracted by the model in different stages can be fully utilized and fused. Experiments on three well-known fine-grained benchmarks show that RMG-PMSI significantly improves the performance of mobile networks and shows good transferability.
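
The abstract does not specify how the RMG builds its images, so the following is only an illustrative sketch of one plausible reading of "images with different granularities in different phases": an image is split into four quadrants, the quadrants are shuffled, and the same step is applied recursively inside each quadrant. The function name `recursive_mosaic`, the 2x2 split, and the `depth` parameter are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def recursive_mosaic(image, depth, rng):
    """Hypothetical mosaic generator (an assumption, not the paper's RMG).

    Splits `image` (H x W x C) into 2x2 quadrants, shuffles them, and
    repeats the step inside every quadrant `depth` times. A larger
    `depth` yields smaller intact patches, i.e. a finer granularity.
    """
    if depth == 0:
        return image.copy()

    h, w = image.shape[:2]
    if h % 2 or w % 2:
        raise ValueError("height and width must be even at every level")
    hh, hw = h // 2, w // 2

    # Quadrants in row-major order: top-left, top-right, bottom-left, bottom-right.
    quadrants = [image[r:r + hh, c:c + hw] for r in (0, hh) for c in (0, hw)]

    # Shuffle the quadrants, then build a finer mosaic inside each of them.
    order = rng.permutation(4)
    tiles = [recursive_mosaic(quadrants[i], depth - 1, rng) for i in order]

    top = np.concatenate(tiles[:2], axis=1)
    bottom = np.concatenate(tiles[2:], axis=1)
    return np.concatenate([top, bottom], axis=0)


# One image per training phase, from the finest mosaic to the original image.
rng = np.random.default_rng(0)
image = np.arange(8 * 8 * 3).reshape(8, 8, 3)
phase_inputs = [recursive_mosaic(image, depth, rng) for depth in (3, 2, 1, 0)]
```

Under this assumption, early phases would see heavily shuffled images that emphasize local detail, and the final phase would see the unmodified image. How these inputs are paired with network stages, and how the MSI module combines the stage features, is not described in the abstract and is left out of the sketch.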



author    = {Zhenxin Wu and Qingliang Chen and Yongjian Huang},
title     = {Progressive Multi-stage Interactive Training in Mobile Network for Fine-grained Classification},
booktitle = {33rd British Machine Vision Conference 2022, {BMVC} 2022, London, UK, November 21-24, 2022},
publisher = {{BMVA} Press},
year      = {2022},
url       = {}

Copyright © 2022 The British Machine Vision Association and Society for Pattern Recognition
