Depth estimation and enhancement for 3D video view synthesis
Student thesis: Doctoral Thesis
The Depth-enhanced 3D Video (3DV) format can achieve high efficiency and is highly compatible with 3D video systems, such as stereoscopic and multiview video systems, in which color videos with corresponding depth images are used to synthesize virtual views using the Depth Image Based Rendering (DIBR) technique. The quality of the synthesized virtual views depends largely on the accuracy of the depth image. To improve the quality of 3D video synthesis, advanced depth estimation and enhancement techniques are developed in this work. In practice, a depth image can be estimated from a single 2-dimensional image according to the depth cues in the scene. Based on motion parallax, block-matching motion estimation is the most commonly used approach for automatic depth estimation. Due to its block-based structure, staircase artifacts are inevitably created in the estimated depth image. To minimize these artifacts, a color fusion process is proposed to enhance the depth image quality. Color segmentation is first applied to the color image. The accurate boundary information of the color-segmented image is then fused with the blocky depth image obtained from motion estimation to eliminate the staircase artifacts, yielding a depth image with well-defined boundaries. In some applications, however, the automatic approach cannot provide a depth image of sufficiently high quality because the depth cues are difficult to extract accurately. To tackle this problem, a semi-automatic approach that uses user-defined labels is investigated, but its computational requirement is very high. Thus, a fast semi-automatic depth estimation algorithm using Watershed segmentation and Random Walks is proposed, which achieves depth estimation performance similar to that of conventional semi-automatic methods with much lower computational requirements.
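The color fusion step described above can be illustrated with a minimal sketch. The abstract does not state the fusion rule used in the thesis, so the per-segment median below is an assumption chosen for simplicity, and the function name `fuse_depth_with_segments` and the toy arrays are hypothetical. The sketch takes a blocky depth map (as block-matching would produce) and a label map from any color segmentation, and gives each segment a single depth value so that depth edges follow segment boundaries instead of block boundaries.

```python
import numpy as np

def fuse_depth_with_segments(blocky_depth, labels):
    """Give every pixel the median depth of the color segment it belongs to.

    Assumption: a per-segment median stands in for the fusion rule, which
    the abstract does not specify.
    """
    fused = np.empty_like(blocky_depth)
    for seg_id in np.unique(labels):
        mask = labels == seg_id
        # The median resists the minority of pixels that a misplaced
        # block boundary assigned to the wrong depth.
        fused[mask] = np.median(blocky_depth[mask])
    return fused

# Toy 4x8 frame. The true object boundary lies between columns 2 and 3,
# but 4x4 block matching placed the depth edge between columns 3 and 4.
labels = np.zeros((4, 8), dtype=np.int32)
labels[:, 3:] = 1                      # segment 1 starts at column 3

blocky_depth = np.full((4, 8), 50, dtype=np.uint8)
blocky_depth[:, :4] = 200              # left 4x4 block is "near"

fused = fuse_depth_with_segments(blocky_depth, labels)
print(fused[0])                        # first row of the fused depth map
```

In this toy case, column 3 belongs to the right-hand segment, so its depth is pulled to that segment's majority value and the depth edge moves onto the segment boundary.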
On the other hand, the quality of the views synthesized by DIBR also depends largely on the alignment between the depth image edges and the object boundaries of the color image. The misalignment of sharp depth image edges is the major cause of the unwanted artifacts in the disoccluded regions of the synthesized views. The conventional smoothing filter approach blurs the depth image to reduce the disoccluded regions. Its drawbacks are the degradation of 3D perception in the reconstructed 3D videos and the destruction of texture in background regions. The conventional edge-preserving filter uses the color image information to align the depth edges with the color edges. Unfortunately, the characteristics of color edges and depth edges are very different, which causes unwanted boundary artifacts in the synthesized virtual views. In this work, a new depth image pre-processing approach is proposed. It uses the Watershed color segmentation method to correct the depth image misalignment and then extends the depth image object boundaries to cover the transitional edge regions of the color image. This approach can handle sharp depth image edges that lie inside or outside the 2D object boundaries. The quality of the disoccluded regions of the synthesized views can be significantly improved, and unknown depth values can also be estimated.
In Multiview Video plus Depth (MVD) format, virtual views are generated through DIBR from decoded texture videos and their corresponding decoded depth images. 3DV-ATM is a reference model for the H.264/AVC based Multiview Video Coding (MVC) and aims at achieving high coding efficiency for 3D video in MVD format. Depth images are first downsampled and then coded by 3DV-ATM; however, the sharp object boundaries characteristic of depth images do not match well with the transform-coding-based nature of H.264/AVC in 3DV-ATM. Depth boundaries are often blurred and accompanied by ringing artifacts in the decoded depth images, which ultimately results in noticeable artifacts in the synthesized virtual views. A low-complexity adaptive depth truncation filter is proposed to recover the sharp object boundaries of the depth images, using adaptive block repositioning and expansion to increase the accuracy of depth value refinement. This new approach is very efficient: it avoids false depth boundary refinement when block boundaries lie around the depth edge regions, and it ensures that the processing block contains sufficient information for depth layer classification.
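To make the idea of recovering sharp depth boundaries by depth layer classification concrete, here is a minimal sketch. It is not the adaptive depth truncation filter of the thesis: the midpoint threshold, the per-layer median, the `min_range` parameter, and the function name `sharpen_block` are all assumptions made for illustration, and the adaptive block repositioning and expansion steps are not modeled. The sketch splits one processing block into two depth layers and snaps each pixel to its layer's representative value, which removes the blur across the edge.

```python
import numpy as np

def sharpen_block(block, min_range=16):
    """Snap a blurred depth block to two flat depth layers.

    Assumptions (not from the thesis): layers are split at the midpoint
    of the block's depth range, and each layer is represented by its median.
    """
    lo, hi = int(block.min()), int(block.max())
    if hi - lo < min_range:
        # Nearly flat block: no depth edge to recover, leave it untouched.
        return block.copy()
    threshold = (lo + hi) / 2.0
    near = block > threshold           # pixels classified as the near layer
    out = np.empty_like(block)
    out[near] = np.median(block[near])
    out[~near] = np.median(block[~near])
    return out

# Toy decoded block: a depth edge smeared over several pixels by coding.
row = np.array([50, 50, 52, 90, 160, 198, 200, 200], dtype=np.uint8)
decoded = np.tile(row, (4, 1))

sharpened = sharpen_block(decoded)
flat = sharpen_block(np.full((4, 8), 120, dtype=np.uint8))
print(sharpened[0])                    # first row after sharpening
```

A block that straddles only a sliver of one layer would give an unreliable layer estimate in this sketch, which is the kind of situation the repositioning and expansion steps mentioned above are meant to address.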
- Image processing, Digital techniques, 3-D video (Three-dimensional imaging)