TaylorSwiftNet: Taylor Driven Temporal Modeling for Swift Future Frame Prediction

Mohammad Saber Pourheydari (University of Bonn), Emad Bahrami (University of Bonn),* Mohsen Fayyaz (Microsoft), Gianpiero Francesca (Toyota-Europe), Mehdi Noroozi (Bosch GmbH), Jürgen Gall (University of Bonn)
The 33rd British Machine Vision Conference


While recurrent neural networks (RNNs) demonstrate outstanding capabilities for future video frame prediction, they model dynamics in a discrete time space, i.e., they predict the frames sequentially with a fixed temporal step. RNNs are therefore prone to accumulating error as the number of future frames increases. In contrast, partial differential equations (PDEs) model physical phenomena such as dynamics in a continuous time space. However, the estimated PDE for frame forecasting needs to be solved numerically, which requires discretizing the PDE and forfeits most of its advantages over discrete models. In this work, we therefore propose to approximate the motion in a video by a continuous function using the Taylor series. To this end, we introduce TaylorSwiftNet, a novel convolutional neural network that learns to estimate the higher-order terms of the Taylor series for a given input video. TaylorSwiftNet can swiftly predict future frames in parallel, and it allows the temporal resolution of the forecast frames to be changed on the fly. Experimental results on various datasets demonstrate the superiority of our model.
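To illustrate the idea behind the abstract, the following minimal sketch evaluates a truncated Taylor series, f(t) ≈ Σ_k f^(k)(t0) (t − t0)^k / k!, at several future time points. It is not the authors' implementation: in the paper a convolutional network learns the higher-order terms for an input video, whereas here the derivative terms are simply given as plain lists, and the function name `taylor_forecast` and its parameters are hypothetical. The sketch only shows why such a representation lets each future time be evaluated independently (hence in parallel) and at any temporal resolution.

```python
from math import factorial


def taylor_forecast(derivatives, t0, times):
    """Evaluate a truncated Taylor series around t0 at each time in `times`.

    derivatives[k] holds the k-th temporal derivative at t0 (k = 0 is the
    value itself), given as a list of floats, e.g. a flattened feature vector.
    In this sketch the terms are assumed to be known; a learned model would
    estimate them from the observed frames.
    """
    forecasts = []
    for t in times:  # each time point is independent of the others
        dt = t - t0
        value = [0.0] * len(derivatives[0])
        for k, term in enumerate(derivatives):
            coeff = dt ** k / factorial(k)
            value = [v + coeff * d for v, d in zip(value, term)]
        forecasts.append(value)
    return forecasts


# Toy example: one "pixel" following f(t) = 1 + 2t + 3t^2 around t0 = 0,
# so f(0) = 1, f'(0) = 2, f''(0) = 6.
derivs = [[1.0], [2.0], [6.0]]
print(taylor_forecast(derivs, 0.0, [0.5, 1.0, 2.0]))  # [[2.75], [6.0], [17.0]]
```

Because the series is a continuous function of t, the list of query times can be changed freely (for example to half-steps) without re-estimating the terms, which loosely mirrors the on-the-fly change of temporal resolution described above.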



author    = {Mohammad Saber Pourheydari and Emad Bahrami and Mohsen Fayyaz and Gianpiero Francesca and Mehdi Noroozi and Jürgen Gall},
title     = {TaylorSwiftNet: Taylor Driven Temporal Modeling for Swift Future Frame Prediction},
booktitle = {33rd British Machine Vision Conference 2022, {BMVC} 2022, London, UK, November 21-24, 2022},
publisher = {{BMVA} Press},
year      = {2022},
url       = {https://bmvc2022.mpi-inf.mpg.de/0389.pdf}

Copyright © 2022 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
