High efficiency computing technologies for digital video coding
Student thesis: Doctoral Thesis
Digital video coding plays an important role in telecommunication and multimedia applications where bandwidth and storage capacity are limited. It aims to reduce the amount of data needed to represent a video without losing too much visual quality. Over the past three decades, video coding has developed rapidly, resulting in a number of video coding standards, such as H.262/MPEG-2 Video, H.263, MPEG-4 Visual, H.264/Advanced Video Coding (AVC), and High Efficiency Video Coding (HEVC). H.264/AVC and HEVC can be regarded as the two successful video coding standards, owing to several advanced coding tools such as Rate Distortion (RD) optimization, variable block-size partitions, and variable block-size Motion Estimation (ME). These leading standards greatly improve coding efficiency. However, this efficiency comes at the cost of high computational complexity, which limits their use in real-time multimedia applications. Hence, reducing the computational complexity is vital for these standards. To reduce the computational complexity while maintaining comparable RD performance, four novel algorithms are proposed in this thesis. Firstly, a predictive and distribution-oriented fast ME algorithm is proposed for H.264/AVC. Based on the Motion Vector (MV) selection correlation among spatially and temporally neighboring blocks, three predictive models are adopted to predict an initial search point. Then, a small diamond search pattern and an asymmetrical cross search pattern are jointly employed to predict the best matching block, based on the learned distribution of the best MVs in natural video sequences. Compared with other algorithms in the literature, the proposed algorithm achieves significant encoding time savings with negligible RD performance degradation.
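As an illustration of the general idea behind a predictive fast ME search (predict a start point from neighboring MVs, then refine with a small diamond pattern), a minimal sketch follows. It is not the algorithm of this thesis: the component-wise median predictor, the SAD cost, the function names, and the `search_range` parameter are all assumptions made for the example, and the asymmetrical cross pattern and the three predictive models are not modeled.

```python
# Illustrative sketch only: predicted start point + small diamond refinement.
# All names and the median predictor are assumptions, not taken from the thesis.

SMALL_DIAMOND = [(0, -1), (-1, 0), (1, 0), (0, 1)]  # (dx, dy) offsets


def sad(cur, ref, bx, by, mvx, mvy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (mvx, mvy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + mvy][bx + x + mvx])
    return total


def median_predictor(neighbor_mvs):
    """Component-wise median of the neighboring blocks' MVs (assumed predictor)."""
    xs = sorted(mv[0] for mv in neighbor_mvs)
    ys = sorted(mv[1] for mv in neighbor_mvs)
    mid = len(xs) // 2
    return xs[mid], ys[mid]


def small_diamond_search(cur, ref, bx, by, n, start, search_range):
    """Greedy refinement: move to the best of the four diamond neighbors
    until no neighbor has a lower SAD than the current center."""
    height, width = len(ref), len(ref[0])

    def in_bounds(mvx, mvy):
        return (abs(mvx) <= search_range and abs(mvy) <= search_range
                and 0 <= bx + mvx and bx + mvx + n <= width
                and 0 <= by + mvy and by + mvy + n <= height)

    # Fall back to the zero MV if the predicted start point is not usable.
    best = start if in_bounds(*start) else (0, 0)
    best_cost = sad(cur, ref, bx, by, best[0], best[1], n)
    while True:
        cx, cy = best
        improved = False
        for dx, dy in SMALL_DIAMOND:
            cand = (cx + dx, cy + dy)
            if not in_bounds(*cand):
                continue
            cost = sad(cur, ref, bx, by, cand[0], cand[1], n)
            if cost < best_cost:
                best, best_cost = cand, cost
                improved = True
        if not improved:
            return best, best_cost
```

A call such as `small_diamond_search(cur, ref, bx, by, 4, median_predictor(neighbor_mvs), 4)` returns the refined MV and its SAD. The closer the predicted start point is to the true motion, the fewer candidate positions are evaluated, which is where the time saving of predictive search comes from.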
Moreover, the proposed fast ME algorithm works well on video sequences with various motion activities and formats, and is more suitable for real-time multimedia applications. Secondly, an efficient ME and Disparity Estimation (DE) algorithm is proposed to reduce the computational complexity of the multiview video coding encoder. In this work, an early DIRECT mode decision algorithm is proposed according to the characteristics of the coded block pattern and the RD cost. After that, based on the characteristics of the initial search point in the ME and DE process and the principle that the best point is center-biased, an early ME/DE termination strategy is proposed. If the ME/DE early termination condition is not satisfied, the ME/DE search window is reduced by applying optimal stopping theory. Lastly, two block matching search strategies are proposed to predict the best point for the ME/DE. Experimental results show that the proposed algorithm achieves encoding time savings of 50.05% to 77.61%, 64.83% on average, while the RD performance degradation is negligible. In addition, the proposed algorithm can be applied to both the odd views and the even views. Thirdly, a fast mode decision algorithm is proposed to reduce the computational complexity of multiview depth video coding. It is based on the mode selection correlation between the depth video and its corresponding texture video, the macroblock motion activity, and the coded block pattern. Experimental results show that the proposed algorithm achieves encoding time savings of 67.18% and 69.90% for the even and odd views, respectively, while maintaining comparable RD performance. With this dramatic reduction in encoding time, the proposed algorithm becomes more suitable for real-time multimedia applications. Finally, an early MERGE mode decision algorithm is proposed to reduce the computational complexity of the HEVC encoder.
First, based on the all-zero block (AZB) and the ME information of the INTER 2Nx2N mode, an early MERGE mode decision is proposed for the root Coding Units (CUs) (i.e., 64x64 CUs). Then, an early MERGE mode decision is proposed for the child CUs (i.e., 32x32, 16x16, and 8x8 CUs) by considering the mode selection correlation between the root CU and its child CUs. To maximize the computational complexity reduction, when the root CUs are encoded in non-MERGE modes, the AZB and ME information are also used for early termination of the child CUs. Experimental results demonstrate that, compared with the state-of-the-art published algorithm, the proposed algorithm achieves an average encoding time saving of about 35% while the RD performance degradation is negligible.
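The two-level structure of the early MERGE decision described above can be sketched roughly as follows. The abstract does not state the exact decision conditions, so the rule used here (an all-zero block whose ME motion vector coincides with a MERGE candidate) is purely an assumption for illustration, as are all class, field, and function names.

```python
# Illustrative sketch only: the concrete conditions below are assumptions,
# not the decision rules of the thesis.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Inter2Nx2NInfo:
    """Hypothetical summary of what INTER 2Nx2N coding produced for one CU."""
    all_zero_block: bool                      # every quantized coefficient is zero
    mv: Tuple[int, int]                       # best MV found by ME
    merge_candidate_mvs: List[Tuple[int, int]]  # MVs offered by the MERGE candidates


def is_all_zero_block(quantized_coeffs):
    """A block is an AZB when all of its quantized coefficients are zero."""
    return all(c == 0 for row in quantized_coeffs for c in row)


def root_cu_early_merge(info):
    """Assumed rule for a root CU: choose MERGE early when the residual is an
    AZB and the ME result adds nothing beyond an existing MERGE candidate."""
    return info.all_zero_block and info.mv in info.merge_candidate_mvs


def child_cu_early_merge(root_is_merge, info):
    """Assumed rule for a child CU: follow the root CU when it chose MERGE
    (mode selection correlation); otherwise fall back to the child's own
    AZB and ME information."""
    if root_is_merge:
        return True
    return root_cu_early_merge(info)
```

When either function returns `True`, an encoder following this sketch would skip the remaining mode checks for that CU, which is the source of the complexity reduction.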
- Digital video, Video compression, Computational complexity, Coding theory