Compressing Video Calls using Synthetic Talking Heads

Madhav Agarwal (IIIT-Hyderabad),* Anchit Gupta (IIIT Hyderabad), Rudrabha Mukhopadhyay (IIIT Hyderabad), Vinay Namboodiri (University of Bath), C.V. Jawahar (IIIT-Hyderabad)

The 33^rd British Machine Vision Conference

Abstract

We leverage the modern advancements in talking head generation to propose an end-to-end system for talking head video compression. Our algorithm transmits pivot frames intermittently while the rest of the talking head video is generated by animating them. We use a state-of-the-art face reenactment network to detect key points in the non-pivot frames and transmit them to the receiver. A dense flow is then calculated to warp a pivot frame to reconstruct the non-pivot ones. Transmitting key points instead of full frames leads to significant compression. We propose a novel algorithm to adaptively select the best-suited pivot frames at regular intervals to provide a smooth experience. We also propose a frame-interpolater at the receiver's end to improve the compression levels further. Finally, a face enhancement network improves reconstruction quality, significantly improving several aspects like the sharpness of the generations. We evaluate our method both qualitatively and quantitatively on benchmark datasets and compare it with multiple compression techniques. We release a demo video and additional information at https://cvit.iiit.ac.in/research/projects/cvit-projects/talking-video-compression.

Video

Citation

@inproceedings{Agarwal_2022_BMVC,
author    = {Madhav Agarwal and Anchit Gupta and Rudrabha Mukhopadhyay and Vinay Namboodiri and C.V. Jawahar},
title     = {Compressing Video Calls using Synthetic Talking Heads},
booktitle = {33rd British Machine Vision Conference 2022, {BMVC} 2022, London, UK, November 21-24, 2022},
publisher = {{BMVA} Press},
year      = {2022},
url       = {https://bmvc2022.mpi-inf.mpg.de/0289.pdf}
}

Copyright © 2022 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).

Imprint | Data Protection