Towards Immersive 3-D Telepresence: Compact Light-field Representation and Beyond
Description

Light-field (LF) data can provide viewers with realistic 3-D visual perception in a glasses-free and fatigue-free manner with the help of LF display technology. LF is therefore considered a promising paradigm for immersive 3-D telepresence, which can be widely applied to entertainment, telerehabilitation, business, and other fields, and could significantly reshape human work and lifestyles. LF also enables a wide range of interesting applications, such as image editing, post-capture refocusing, and 3-D reconstruction. More importantly, recent advances in commercial hand-held LF cameras make it possible to acquire LF images conveniently in a single snapshot, which dramatically boosts LF-based research and applications.

However, the enormous volume of LF data poses great challenges to its storage, transmission, and display/rendering. Moreover, because the inherent structure of LF data is difficult to characterize, conventional image/video coding schemes are not well suited to LF data. Compression frameworks tailored to LF data are therefore urgently needed to advance the deployment of LF technology.

Unlike conventional architectures based on predictive and transform coding, we will investigate novel frameworks from two new perspectives, which are expected to characterize the inherent structure accurately. Specifically, from the statistical learning perspective, we will investigate computationally efficient and memory-efficient deep learning architectures and training strategies for high-quality spatial and angular reconstruction, which offer a novel way to address the challenge posed by the high dimensionality of LF.

From the geometric modeling perspective, to exploit the multidimensional correlations of LF data simultaneously for the best global performance, we propose to formulate a multi-sparsity and low-rank regularized compact matrix factorization, assisted by a proposed accurate depth-reasoning method with occluded-region regularization.

Moreover, in contrast to conventional explicit-model-based rate-distortion optimization (RDO), which requires time-consuming pre-encoding, it is promising to study learning-based RDO with efficient random forests/trees and deep features to determine the optimal encoding parameters faster and more accurately.

With our novel algorithms, which draw on multidisciplinary knowledge, and our promising preliminary experimental verification, our frameworks have great potential to achieve a 30% reduction in data size at the same reconstruction quality compared with current state-of-the-art methods, which could be a significant boost to LF-based immersive 3-D telepresence. Beyond compact representation, our algorithms can address other fundamental issues of LF acquisition and applications, including, but not limited to, compressive LF sensing and precise distance measurement/perception.
Effective start/end date: 1/01/19 → …