Style2NeRF: An Unsupervised One-Shot NeRF for Semantic 3D Reconstruction

James Charles (University of Cambridge), Wim Abbeloos (Toyota Motor Europe), Daniel Olmeda Reino (Toyota Motor Europe), Roberto Cipolla (University of Cambridge)
The 33rd British Machine Vision Conference


We present Style2NeRF, an unsupervised model for one-shot recovery of the 3D pose, shape and appearance of symmetric objects. Style2NeRF contains a transcoder which disentangles 2D representations from pretrained StyleGANs, then maps them to a semantically editable 3D NeRF generator. As such, the generative NeRF inherits StyleGAN's expressiveness and image editing properties, translating them to 3D. We make four key contributions: (i) we provide a novel model that accurately estimates an object's 3D pose, shape and appearance without any human supervision during training; (ii) we show how to map between semantically meaningful 2D and 3D representations using a novel disentangled generative NeRF; (iii) we introduce the pose and viewpoint ambiguity problem (suffered by existing 3D GAN methods) and propose a solution that improves pose estimation accuracy; (iv) via transfer learning, we show that our model can be trained on real car images where the pose distribution is unknown. Style2NeRF outperforms the state of the art on the CARLA cars dataset, and also outperforms a fully supervised model on the task of car pose estimation, both on ShapeNet-cars and on a new dataset of real car images.



