Doctoral thesis
Abstract: Who has never immersed themselves in the memories of a photo album or in the universe rendered by a movie? Cameras were invented to record a part of the world at a precise moment, for recollection, information, or entertainment purposes. However, besides the ever-increasing quality of the captured images, only a single viewpoint, held in a flat image, is recorded. To increase immersion, new methods have emerged to reconstruct and render the three dimensions of a scene, allowing the user to better understand, measure, or simply relive the captured moment.

Due to the public's and industry's interest in immersive representations of the real world (3D cinema, metaverse, virtual reality, ...), the MPEG-I group (the subgroup of MPEG devoted to immersive video coding) launched a vast standardization effort for immersive video coding. Research on view synthesis, the technique of creating new viewpoints from a given set of input images, has been tackled by companies such as Google and NVIDIA, with remarkably photo-realistic results. However, these methods are not yet widely accessible, due to the acquisition setups, the reconstruction process, the hardware cost, or the processing time needed to render new viewpoints.

In this manuscript, we present a view synthesis method relying on the principle of depth image-based rendering: given a set of real-world images representing a scene, along with geometry information represented by depth maps, we reconstruct new images taken from any location in the scene. We aim at photo-realistic results, a wide navigation range, real-time processing on midrange hardware, and support for multiple input modalities.

We demonstrate the quality of the images synthesized by this method in comparison with other state-of-the-art view synthesis approaches. Thanks to a GPU implementation, we reach real-time performance and high visual quality with fewer than ten input images. We thus show that our method is suitable not only for real-time virtual reality applications, but also for holography and 3D displays.

We also address the challenge of rendering non-Lambertian objects (e.g. mirrors, glasses, transparent liquids, ...). While omnipresent, they are left aside or barely mentioned in most view synthesis methods. Indeed, their particular interactions with light violate the Lambertian assumption, which is the implicit basis of most reconstruction and rendering methods. To reach a faithful rendering of the scene, and not merely a plausible approximation of it, we extended the depth image-based rendering paradigm to reproduce their changing appearance.

Finally, we studied the particular case of plenoptic cameras. These devices, mimicking multi-faceted insect eyes, capture several viewpoints of the scene in a single shot: exactly what is needed to bridge the gap between a flat photograph and the 3D scene in which we can immerse ourselves. However, view synthesis with plenoptic cameras, and the related methods that make it possible (calibration, geometry estimation), are still taking their first steps compared to analogous methods for regular imaging devices. As an invitation to journey into the plenoptic world, we explored the calibration of plenoptic cameras for single- and multi-view datasets, contributed to the synthesis of new images within a small navigation range, and extended our software to free-viewpoint view synthesis using plenoptic cameras with calibrated parameters.

This thesis contributes to the development of photo-realistic, multi-input view synthesis. It extends the understanding of depth image-based rendering by pushing its limits to non-Lambertian scenes and plenoptic cameras. Furthermore, it makes the method more widely accessible through the release of open-source datasets and software, the low requirements of the proposed approaches (pattern-free calibration, portable methods), and extensive comparisons with other state-of-the-art methods. In this sense, it opens the door to fascinating new challenges and applications in the domain of view synthesis.
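
For readers less familiar with the depth image-based rendering principle mentioned above, its core operation can be summarized by the standard per-pixel warping equation below. This is a generic textbook formulation given for illustration only, with assumed notation ($K_s$, $K_t$, $R$, $\mathbf{t}$); it is not quoted from the thesis itself.

\[
\tilde{\mathbf{p}}_t \;\simeq\; K_t \left( R \, z \, K_s^{-1} \, \tilde{\mathbf{p}}_s + \mathbf{t} \right)
\]

where $\tilde{\mathbf{p}}_s$ is a source pixel in homogeneous coordinates, $z$ its depth read from the depth map, $K_s$ and $K_t$ the source and target intrinsic matrices, $(R, \mathbf{t})$ the relative pose between the two cameras, and $\simeq$ denotes equality up to scale. Each source pixel is thus unprojected to 3D using its depth and reprojected into the target view; blending the warped contributions of several input images yields the synthesized viewpoint.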