Motion compensated and disparity compensated prediction techniques for video coding


Student thesis: Doctoral Thesis


  • Ka Man WONG

Award date: 3 Oct 2011


Inter-frame prediction plays a vital role in the high coding efficiency of advanced video coding standards. It uses reference frames to predict the content of the current frame, and block-matching techniques are used to estimate and compensate for the spatial differences between frames. In Motion Compensated Prediction (MCP), the reference frames are taken from the same scene at different times, and block matching models the movement of objects over time. In Disparity Compensated Prediction (DCP), by contrast, the reference frames are captured from other viewing angles. Although the characteristics of MCP and DCP reference frames are quite different, existing video coding standards handle both with the same set of coding tools, which assume block-based pure translation between frames and do not consider the actual motion and disparity effects. This thesis proposes different strategies for MCP and DCP to improve prediction accuracy and coding efficiency in practical video codec implementations.

For MCP, motion can be very complex and is normally not purely translational. Attempts at a general motion model are usually too complex in parameter estimation and too demanding in memory for practical implementation. In this study, a translation and zoom motion model is investigated, and Block-matching Translation and Zoom Motion-Compensated Prediction (BTZMCP) is proposed to extend the assumed model, in a practical way, to a more general one that covers both zoom and translation. A novel subsampling technique, applied to the interpolated frame used for subpixel motion estimation, implements the spatial scaling of temporal reference frames without additional memory. Although BTZMCP requires block matching over a larger candidate pool, no additional zoom-related parameter estimation is involved.
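
The idea of obtaining zoomed candidates by subsampling an interpolated reference frame can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the 4x interpolation factor, the candidate steps (3, 4, 5), the exhaustive search, the SAD cost, and all function names. With a 4x interpolated frame, reading every 4th sample gives the reference at its original scale, while reading every 3rd or 5th sample gives it rescaled by 4/3 or 4/5.

```python
# Illustrative sketch only. The interpolation factor, candidate steps and
# all names are assumptions, not details from the thesis.

INTERP = 4  # assumed interpolation factor of the reference frame


def sample_block(interp_ref, x0, y0, step, size):
    """Read a size x size block from the interpolated frame, taking every
    `step`-th sample from (x0, y0). step == INTERP means no zoom."""
    return [[interp_ref[y0 + j * step][x0 + i * step] for i in range(size)]
            for j in range(size)]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def search(cur_block, interp_ref, steps=(3, 4, 5), max_pos=8):
    """Exhaustive block matching over position and subsampling step.
    Returns (best_sad, x0, y0, step) for the lowest-cost candidate."""
    size = len(cur_block)
    best = None
    for step in steps:
        for y0 in range(max_pos):
            for x0 in range(max_pos):
                cand = sample_block(interp_ref, x0, y0, step, size)
                cost = sad(cur_block, cand)
                if best is None or cost < best[0]:
                    best = (cost, x0, y0, step)
    return best


# Synthetic interpolated reference: a ramp in which every sample is unique.
interp_ref = [[x + 1000 * y for x in range(64)] for y in range(64)]
# A "current" block that matches the reference only at a rescaled size.
cur = sample_block(interp_ref, 3, 2, 5, 4)
best = search(cur, interp_ref)
```

In this sketch the subsampling step is simply one more dimension of the candidate pool, so the zoom factor falls out of the block-matching search itself and no second, rescaled copy of the reference frame is stored.
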
Experimental results show that BTZMCP can achieve a significant improvement in coding efficiency.

For DCP, the reference frames are usually captured at the same moment, so there are known correlations among the frames from different views. The conventional motion compensated prediction approach can be used to predict the disparity among views. In a multi-view scenario, however, the same object is deformed to different extents in different views, so accurate disparity prediction cannot be achieved with such a simple translational model. As in the single-view video coding case, attempts at more accurate disparity prediction usually involve affine transformations and are too complex for practical implementation. As zooming effects do not usually exist in DCP, stretching, compression and shearing (SCSH) effects are investigated in this study to better model the disparity. Subsampling patterns designed specifically for SCSH effects are used for block matching among frames from different views in disparity estimation and compensation. The proposed disparity compensated prediction algorithm requires no affine parameter estimation and no additional frame buffers, and the overall increase in memory requirement and computational complexity is moderate. Experimental results show that the Rate-Distortion (RD) performance is also improved significantly.

As the proposed techniques for MCP and DCP can be applied to current video coding standards, their development in reference software is also reported to show their feasibility.
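
A subsampling pattern for stretching, compression and shearing can be sketched in the same spirit. This is a hedged illustration, not the patterns designed in the thesis: it assumes a 4x interpolated reference frame, assumes the deformation acts along the horizontal axis only, and uses hypothetical names (`scsh_coords`, `step_x`, `shear`).

```python
# Illustrative sketch only. The interpolation factor, the horizontal-only
# deformation and all names are assumptions, not details from the thesis.

INTERP = 4  # assumed interpolation factor of the reference frame


def scsh_coords(x0, y0, size, step_x=INTERP, shear=0):
    """Coordinates (x, y), in the interpolated frame, of the samples that
    form one size x size candidate block.

    step_x != INTERP changes the horizontal sampling step, which stretches
    or compresses the candidate relative to the plain one.
    shear shifts each successive row by `shear` interpolated samples.
    """
    return [[(x0 + i * step_x + j * shear, y0 + j * INTERP)
             for i in range(size)]
            for j in range(size)]


def sample(interp_ref, coords):
    """Read the samples at the given coordinates from the interpolated frame."""
    return [[interp_ref[y][x] for (x, y) in row] for row in coords]


plain = scsh_coords(0, 0, 2)               # ordinary translational candidate
scaled = scsh_coords(0, 0, 2, step_x=3)    # horizontal step changed from 4 to 3
sheared = scsh_coords(0, 0, 2, shear=1)    # each row shifted by one sample
```

Each deformed candidate is only a different way of indexing the same interpolated frame, which is consistent with the abstract's statement that no additional frame buffers are required.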

    Research areas

  • Video compression, Coding theory, Digital video