Towards a more efficient few-shot learning-based human gesture recognition via dynamic vision sensors


Linglin Jing (Loughborough University), Yifan Wang (Loughborough University), Tailin Chen (Newcastle University), Shirin Dora (Loughborough University), ZHIGANG JI (Liverpool John Moores University), Hui Fang (Loughborough University)*
The 33rd British Machine Vision Conference

Abstract

For the human gesture recognition task, recent fully supervised deep learning models have achieved impressive performance when sufficient samples of predefined gesture classes are provided. However, these models do not generalise well to new classes, which limits their accuracy on unforeseen gesture categories. Few-shot learning based human gesture recognition (FSL-HGR) addresses this problem by supporting faster learning using only a few samples from new gesture classes. In this paper, we develop a novel FSL-HGR method which enables energy-efficient inference across a large number of classes. Specifically, we adapt a surrogate gradient-based spiking neural network (SNN) model to efficiently process video sequences collected via dynamic vision sensors. With a focus on energy efficiency, we design two strategies, spiking noise suppression and emission sparsity learning, to significantly reduce the spike emission rate in all layers of the network. Additionally, we introduce dual-speed stream contrastive learning to achieve high accuracy without increasing the computational burden associated with inference using dual-stream processing. Our experimental results demonstrate the effectiveness of our approach. We achieve state-of-the-art accuracy of 84.75% and 92.82% on the 5way-1shot and 5way-5shot learning tasks, with 60.02% and 58.21% fewer emitted spikes respectively, compared to a standard SNN architecture trained without our learning strategies, when processing the DVS128 Gesture dataset.
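
To make the spike emission figures above more concrete, the following is a minimal sketch of how a spike count and a relative reduction could be computed for a single leaky integrate-and-fire neuron. It is not the paper's model: the function names, the `threshold` and `decay` parameters, the hard reset, and the toy input are all assumptions chosen purely for illustration.

```python
def lif_spike_count(inputs, threshold=1.0, decay=0.5):
    """Simulate one leaky integrate-and-fire neuron (illustrative assumption,
    not the architecture from the paper) and return how many spikes it emits."""
    membrane = 0.0
    spikes = 0
    for current in inputs:
        # Leak the previous potential, then integrate the new input.
        membrane = decay * membrane + current
        if membrane >= threshold:
            spikes += 1
            membrane = 0.0  # hard reset after a spike
    return spikes


def emission_rate(spikes, n_steps):
    """Average number of spikes per time step."""
    return spikes / n_steps


def relative_reduction(baseline, reduced):
    """Percentage by which `reduced` is lower than `baseline`."""
    return 100.0 * (baseline - reduced) / baseline


# Hypothetical constant input over 10 time steps.
toy_input = [0.6] * 10
n_spikes = lif_spike_count(toy_input)
rate = emission_rate(n_spikes, len(toy_input))
```

Under these assumptions, a reported reduction of roughly 60% corresponds to `relative_reduction(baseline, reduced)` where the sparser network emits about two spikes for every five emitted by the baseline.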

Citation

@inproceedings{Jing_2022_BMVC,
author    = {Linglin Jing and Yifan Wang and Tailin Chen and Shirin Dora and ZHIGANG JI and Hui Fang},
title     = {Towards a more efficient few-shot learning-based human gesture recognition via dynamic vision sensors},
booktitle = {33rd British Machine Vision Conference 2022, {BMVC} 2022, London, UK, November 21-24, 2022},
publisher = {{BMVA} Press},
year      = {2022},
url       = {https://bmvc2022.mpi-inf.mpg.de/0938.pdf}
}


Copyright © 2022 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
