Fast Algorithms and Efficient Implementations for Video Coding

視頻編碼的快速方法與高效實現

Student thesis: Doctoral Thesis

Detail(s)

Award date: 6 Sep 2017

Abstract

With the growing demand for higher-quality digital video, such as High Definition (HD) video, network bandwidth has been expanded to meet the requirements, alongside digital video coding technology, which reduces the size of the original video with no or negligible loss of visual quality. Over the past decades, various video coding technologies have been standardized as H.262, H.263, MPEG-4, and H.264. The High Efficiency Video Coding (HEVC) standard, also known as H.265, is considered the successor to H.264, the most successful digital video coding standard in history. HEVC adopts new techniques that achieve an average bitrate reduction of 50%, but its encoding process is several times more complex than that of H.264. As HEVC targets Ultra High Definition (UHD) video, reducing encoding and decoding time is expected to be a major focus.
This thesis covers techniques ranging from algorithms to architectures, aiming to produce time-efficient encoding algorithms and hardware-friendly implementations for HEVC and to address the challenges introduced by its new block structure. At the algorithm level, fast encoding algorithms are proposed, including fast intra and inter block partitioning and mode selection. Fast intra block partitioning is based on gradient analysis within each frame, while the fast inter mode decision method uses motion analysis across frames. At the hardware architecture level, the thesis presents efficient implementations of key HEVC modules, yielding a high-throughput architecture for HEVC processing.
The gradient-based complexity reduction methods for HEVC intra coding in this thesis include single-scale and multi-scale analysis. In the single-scale method, the block partitioning decision is based on the gradient of a block, which is derived from gradient computations. When corner points are detected within a block, the block is considered a multiple-gradient region and is split into sub-blocks. A block without corner points is treated as non-split when its rate-distortion (RD) cost is small according to the statistics of previous frames. Because block partitioning in HEVC adopts a quad-tree structure, gradient-based fast algorithms should consider a block together with its sub-blocks. Instead of traditional single-scale gradient analysis, this thesis therefore adopts multi-scale gradient analysis, using the proposed global and local edge complexities. Both are computed in the horizontal, vertical, 45-degree diagonal, and 135-degree diagonal directions and are used to decide the partitioning of a coding unit (CU). With its four sub-CUs handled in the same way, a CU is classified as split, non-split, or undetermined at each depth. The thresholds for the edge complexities are derived from randomly selected frames of the test sequences. With multi-scale gradient analysis of a block and its sub-blocks, coding efficiency is better retained than with the single-scale method, at a similar reduction in encoding complexity.
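
The three-way split decision described above can be illustrated with a minimal Python sketch. It is not the thesis's method: the edge-complexity measure (mean absolute luma difference between neighbouring samples in each direction), the decision rule, the function names, and the thresholds `low` and `high` are all assumptions chosen for illustration. In the thesis the thresholds are derived from randomly selected frames of the test sequences.

```python
from typing import Dict, List

Block = List[List[int]]  # square block of luma samples


def edge_complexities(block: Block) -> Dict[str, float]:
    """Mean absolute difference between neighbouring samples in four directions.

    This measure is an assumption made for illustration only.
    """
    n = len(block)
    # (dy, dx) offset of the neighbour used for each direction
    offsets = {"horizontal": (0, 1), "vertical": (1, 0),
               "diag45": (1, -1), "diag135": (1, 1)}
    result = {}
    for name, (dy, dx) in offsets.items():
        total, count = 0, 0
        for y in range(n):
            for x in range(n):
                ny, nx = y + dy, x + dx
                if 0 <= ny < n and 0 <= nx < n:
                    total += abs(block[y][x] - block[ny][nx])
                    count += 1
        result[name] = total / count if count else 0.0
    return result


def quadrants(block: Block) -> List[Block]:
    """Split a square block into its four equally sized sub-blocks."""
    h = len(block) // 2
    return [[row[x0:x0 + h] for row in block[y0:y0 + h]]
            for y0 in (0, h) for x0 in (0, h)]


def decide_cu(block: Block, low: float = 1.0, high: float = 8.0) -> str:
    """Classify a CU as 'split', 'non-split', or 'undetermined'.

    Global complexity is measured on the CU, local complexity on its four
    sub-CUs. The rule and both thresholds are hypothetical.
    """
    global_peak = max(edge_complexities(block).values())
    local_peak = max(max(edge_complexities(q).values())
                     for q in quadrants(block))
    if global_peak < low and local_peak < low:
        return "non-split"
    if global_peak > high and local_peak > high:
        return "split"
    return "undetermined"  # fall back to the regular RD search


flat = [[128] * 8 for _ in range(8)]
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
ramp = [[4 * x for x in range(8)] for _ in range(8)]
print(decide_cu(flat), decide_cu(checker), decide_cu(ramp))
```

In this sketch a CU left as "undetermined" would still go through the normal rate-distortion evaluation, so only the confident cases skip work.
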
As inter block partitioning reduces redundancy between frames, the proposed fast inter algorithm is based on spatial and temporal motion correlation. Motion information is to inter CU decision what gradient information is to intra mode decision. The fast inter algorithm consists of four individual methods that cover the whole CU decision process, from mode evaluation to CU termination and CU splitting. Analysis of spatial motion correlation shows that early termination is possible when SKIP mode is chosen as the optimal mode for over half of the CUs. Temporal motion correlation between two neighboring frames is exploited, and the motion complexity (MC) of the split collocated CU in the reference frame is used as a parameter for the CU splitting decision. Meanwhile, the optimal block partitioning of spatially neighboring blocks is used to constrain the search range. The proposed fast inter algorithms reduce coding time by over 50% with negligible loss of coding efficiency and video quality.
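
As a rough illustration of how SKIP-based early termination and an MC-based splitting decision might fit together, consider the Python sketch below. The abstract does not define MC or the exact termination rule, so everything here is an assumption: MC is taken to be the mean L1 deviation of the collocated CU's motion vectors from their average, the "over half" SKIP test is applied to a list of neighbouring CU modes, and `mc_threshold` and all function names are hypothetical.

```python
from typing import List, Tuple

MotionVector = Tuple[int, int]  # (horizontal, vertical) displacement


def motion_complexity(mvs: List[MotionVector]) -> float:
    """Mean L1 deviation of motion vectors from their average.

    A hypothetical stand-in for the thesis's MC measure.
    """
    if not mvs:
        return 0.0
    mean_x = sum(x for x, _ in mvs) / len(mvs)
    mean_y = sum(y for _, y in mvs) / len(mvs)
    return sum(abs(x - mean_x) + abs(y - mean_y) for x, y in mvs) / len(mvs)


def inter_cu_decision(best_mode: str,
                      neighbor_modes: List[str],
                      collocated_mvs: List[MotionVector],
                      mc_threshold: float = 2.0) -> str:
    """Return 'terminate', 'split', or 'full-search' for the current CU.

    The rule is an assumption: stop early when SKIP is the best mode here
    and for over half of the neighbouring CUs; split directly when the
    collocated CU in the reference frame shows complex motion.
    """
    skip_neighbors = sum(mode == "SKIP" for mode in neighbor_modes)
    if best_mode == "SKIP" and 2 * skip_neighbors > len(neighbor_modes):
        return "terminate"
    if motion_complexity(collocated_mvs) > mc_threshold:
        return "split"
    return "full-search"


print(inter_cu_decision("SKIP", ["SKIP", "SKIP", "MERGE"], [(0, 0)] * 4))
print(inter_cu_decision("INTER_2Nx2N", ["SKIP", "INTER_2Nx2N"],
                        [(0, 0), (8, 0), (0, 8), (-8, -8)]))
```

The "full-search" outcome stands for the unmodified encoder path, so the shortcuts only apply when the motion statistics are clear-cut.
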
Efficient hardware architecture for the key modules of HEVC is also essential, as HEVC runs on many different platforms. This thesis proposes a hardware architecture for intra coding, showing how to design high-throughput circuits for digital video coding and how to use hardware resources efficiently for the computations in HEVC. Specifically, a fully pipelined hardware architecture for HEVC intra prediction is presented. The proposed techniques include: 1) a new depth-traversal-based reference buffer; 2) a new mode-dependent scanning order and reference selection scheme; 3) a new inverse reference extension method. As a result, the proposed hardware architecture is the only one capable of operating in full pipeline, and it provides the best throughput compared with existing works.