Implicit texture mapping for multi-view video synthesis

Mohamed I Lakhal* (Huawei), Oswald Lanz (Free University of Bozen-Bolzano), Andrea Cavallaro (Queen Mary University of London, UK)
The 33rd British Machine Vision Conference


Multi-view video synthesis generates the scene dynamics as seen from a target viewpoint, given a source view and one or more modalities of the target view. In this paper, we frame video synthesis as a feature learning problem and solve it as target-view motion synthesis followed by spatial refinement. Specifically, we propose a motion synthesis network with a novel recurrent neural layer that learns the spatio-temporal representation of the target view. A refinement network then corrects the generated coarse texture by learning the residual (i.e., the high-frequency textures) through a UNet generator. Experimental results show that the proposed pipeline improves visual quality over state-of-the-art methods.
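
The two-stage pipeline can be read as the following sketch, under assumed notation. The symbols $x^{s}$ (source-view video), $m^{t}$ (target-view modalities), $M_\theta$ (motion synthesis network) and $R_\phi$ (refinement network) are illustrative and are not taken from the paper. The additive form of the second equation is the usual reading of "learning the residual" and is likewise an assumption.

\[
\tilde{y}^{t}_{1:T} = M_\theta\big(x^{s}_{1:T},\, m^{t}_{1:T}\big),
\qquad
\hat{y}^{t}_{1:T} = \tilde{y}^{t}_{1:T} + R_\phi\big(\tilde{y}^{t}_{1:T}\big)
\]

Here $\tilde{y}^{t}_{1:T}$ denotes the coarse target-view video over $T$ frames and $\hat{y}^{t}_{1:T}$ the refined output. Whether the refinement network receives inputs beyond the coarse video is not stated in the abstract, so the argument of $R_\phi$ should be treated as a placeholder.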



