Abstract
A densely-sampled light field provides richer appearance and geometry information of a 3D scene than a sparsely-sampled one, enabling a range of applications including 3D reconstruction, image post-refocusing, and virtual reality. However, capturing densely-sampled light fields with existing devices remains challenging. Earlier light field capturing devices, e.g., camera arrays or computer-controlled gantries, are bulky and expensive for dense sampling, while recent cost-effective commercial light field cameras suffer from a trade-off between spatial and angular resolution because of the limited sensor resolution. Although the coded aperture camera can obtain a light field with the same spatial resolution as the camera sensor, it requires post-processing algorithms to reconstruct the full 4D light field from 2D coded measurements.

In this thesis, we explore deep learning-based frameworks for reconstructing densely-sampled light fields from inputs with severely incomplete angular information, i.e., an extremely sparse light field with a wide baseline and coded aperture measurements. These two kinds of input remain highly challenging for existing light field reconstruction methods, whose reconstruction quality is rather limited. Moreover, the proposed frameworks extend to other reconstruction tasks, i.e., light field denoising, light field spatial super-resolution, and novel view synthesis on multi-view datasets, and achieve significantly better performance than other state-of-the-art methods on these tasks. The two frameworks are summarized as follows:
(1) We propose a novel learning-based framework for reconstructing high-quality light fields from acquisitions via learned coded apertures. The proposed method incorporates the measurement observation into the deep learning framework to avoid relying entirely on data-driven priors for light field reconstruction. Specifically, we first formulate compressive light field reconstruction as an inverse problem with an implicit regularization term. We then construct the regularization term with a deep, efficient spatial-angular separable convolutional sub-network trained with local and global residual learning, which explores the signal distribution comprehensively while avoiding the limited representation ability and inefficiency of deterministic mathematical modeling (an illustrative sketch of the separable convolution is given after this summary). Furthermore, we extend this pipeline to light field denoising and spatial super-resolution, which can be regarded as variants of coded aperture imaging with different degradation matrices. Extensive experimental results demonstrate that the proposed methods significantly outperform state-of-the-art approaches both quantitatively and qualitatively, i.e., the reconstructed light fields not only achieve much higher PSNR/SSIM but also better preserve the light field parallax structure on both real and synthetic light field benchmarks; and
(2) We propose content-aware warping, which adaptively learns the interpolation weights for pixels of a relatively large neighborhood from their contextual information via a lightweight neural network (sketched after this summary). Based on this learnable warping module, we propose a new end-to-end learning-based framework for novel view synthesis from a set of input source views, in which two additional modules, namely confidence-based blending and feature-assistant spatial refinement, are naturally introduced to handle occlusion and to capture the spatial correlation among pixels of the synthesized view, respectively. In addition, we propose a weight-smoothness loss term to regularize the network. Experimental results on light field datasets with wide baselines and on multi-view datasets show that the proposed method significantly outperforms state-of-the-art methods both quantitatively and visually.
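For framework (1), the following is a minimal, hypothetical PyTorch sketch of one spatial-angular separable convolutional block with a local residual connection, as referenced in the summary above. The tensor layout, the module name `SpatialAngularSepConv`, and the channel and kernel sizes are illustrative assumptions rather than the implementation used in the thesis.

```python
import torch
import torch.nn as nn

class SpatialAngularSepConv(nn.Module):
    """One spatial-angular separable block on a light field stored as
    (B, C, U*V, H, W), where (U, V) index the views and (H, W) the pixels."""
    def __init__(self, channels, ang_size):
        super().__init__()
        self.ang_size = ang_size                      # (U, V)
        # 2D convolution over the spatial dimensions of each view.
        self.spatial = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # 2D convolution over the angular dimensions at each pixel.
        self.angular = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        b, c, a, h, w = x.shape                       # a = U * V angular samples
        u, v = self.ang_size
        # Spatial pass: fold the views into the batch dimension.
        s = x.permute(0, 2, 1, 3, 4).reshape(b * a, c, h, w)
        s = self.act(self.spatial(s))
        s = s.reshape(b, a, c, h, w).permute(0, 2, 1, 3, 4)
        # Angular pass: fold the pixels into the batch dimension.
        t = s.permute(0, 3, 4, 1, 2).reshape(b * h * w, c, u, v)
        t = self.act(self.angular(t))
        t = t.reshape(b, h, w, c, a).permute(0, 3, 4, 1, 2)
        return x + t                                  # local residual learning

lf = torch.rand(1, 16, 25, 64, 64)                   # 5x5 views, 64x64 pixels
out = SpatialAngularSepConv(16, ang_size=(5, 5))(lf) # same shape as the input
```

Factorizing the 4D filtering into a spatial and an angular 2D convolution keeps the parameter count and computation far below that of a full 4D convolution, which is the efficiency argument behind the separable design.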
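For framework (2), the sketch below illustrates the idea of content-aware warping under simplifying assumptions: a lightweight network predicts per-pixel interpolation weights over a k x k neighborhood of a (pre-aligned) source view, and each target pixel is the weighted sum of those neighbors; a total-variation-style penalty stands in for the weight-smoothness loss. The names `ContentAwareWarp` and `weight_smoothness_loss` and all sizes are hypothetical, and the confidence-based blending and spatial-refinement modules are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContentAwareWarp(nn.Module):
    def __init__(self, feat_channels, k=3):
        super().__init__()
        self.k = k
        # Lightweight weight-prediction network: contextual features in,
        # one interpolation weight per neighborhood pixel out.
        self.weight_net = nn.Sequential(
            nn.Conv2d(feat_channels, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, k * k, 3, padding=1),
        )

    def forward(self, src_img, context_feat):
        # src_img: (B, C, H, W) source view; context_feat: (B, F, H, W).
        b, c, h, w = src_img.shape
        # Predict and normalize the interpolation weights per neighbor.
        weights = torch.softmax(self.weight_net(context_feat), dim=1)  # (B, k*k, H, W)
        # Gather the k x k neighborhood of every pixel of the source view.
        patches = F.unfold(src_img, self.k, padding=self.k // 2)       # (B, C*k*k, H*W)
        patches = patches.view(b, c, self.k * self.k, h, w)
        # Weighted sum over the neighborhood gives the warped pixel.
        warped = (patches * weights.unsqueeze(1)).sum(dim=2)           # (B, C, H, W)
        return warped, weights

def weight_smoothness_loss(weights):
    # Penalize differences between neighboring pixels' weight maps.
    dx = (weights[..., :, 1:] - weights[..., :, :-1]).abs().mean()
    dy = (weights[..., 1:, :] - weights[..., :-1, :]).abs().mean()
    return dx + dy
```

Because the weights are predicted from contextual features rather than fixed by a hand-crafted kernel, the warping can adapt to scene content, which is the property the framework relies on for wide-baseline view synthesis.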
| Date of Award | 12 Jul 2023 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Junhui HOU (Supervisor) |