ACKNOWLEDGEMENTS
Without the help of many people and parties, this thesis could not have been completed. First of all, I would like to express my heartfelt gratitude to my supervisor, Dr Ko Chi Chung, for his valuable advice and guidance during the various phases of my study, and especially for his serious research attitude and his encouragement to take on challenges. Secondly, I thank NUS for giving me the opportunity to pursue further research and study in such a beautiful university. I miss all the lecturers and classmates with whom I spent a great time. Thirdly, I would like to thank the Institute for Infocomm Research for its support throughout my research. Without this support, the timely completion of the thesis would have been unimaginable. Last but not least, I deeply appreciate my family for their understanding. Thanks to my wife for her love, patience and encouragement throughout my Ph.D. study. Thanks to my lovely daughter, whose birth has opened a new chapter of my life and given me a new angle from which to look at this world. Thanks to my parents for taking good care of my daughter, and for their constant confidence in my capability, determination and passion to excel.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
SUMMARY
NOMENCLATURE
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTION
1.1 Background
1.2 Objectives
1.3 Thesis Contributions
1.4 Organization of the Thesis
CHAPTER 2 H.264 AND LITERATURE SURVEY
2.1 H.264
2.2 H.264 Encoder
2.3 H.264 Decoder
2.4 Predictive Coding
2.4.1 Intra Coding
2.4.2 Inter Coding
2.5 Motion Estimation
2.6 Mode Decision
2.7 Literature
CHAPTER 3 FAST INTRA MODE DECISION FOR H.264
3.1 Overview of Intra Coding in H.264
3.2 Determining the Primary Edge Direction in the Image Block
3.2.2.1 4 x 4 luma block edge direction histogram
3.2.2.2 Edge direction histogram for 16 x 16 luma and 8 x 8 chroma
3.3 Mode Decision for Intra Prediction
3.3.1 4 x 4 Luma Block Prediction Modes
3.3.2 16 x 16 Luma Block Prediction Modes
3.3.3 8 x 8 Chroma Prediction Mode
3.3.4 Algorithm Complexity Analysis
3.4.1 Experiments on IPPPP Sequences
3.4.2 Experiments on All Intra Frames Sequences
3.4.3 Experiments on IBBPBB Sequences
3.4.4 Comparison of Different Fast Intra Prediction Methods
4.1.1 Inter Mode Decision in H.264/AVC
4.1.2 Observations and Motivation
4.2 Determination of Homogeneity and Stationarity
4.2.1 Homogeneous Regions Determination
4.2.2 Stationary Regions Detection
4.3.1 Experiments on IPPP Sequences
4.3.2 Experiments on IBBP Sequences
CHAPTER 5 FAST INTRA 4X4 MODE ELIMINATION APPROACHES FOR H.264
5.2 Fast Intra 4 x 4 Mode Elimination
CHAPTER 6 ADAPTIVE INTERPOLATION APPROACHES FOR H.264
6.3 General Interpolation Approaches
6.4.1 Approach One
7.2.2 Reordering Motion Estimation Steps for Different Block Sizes
REFERENCE
SUMMARY
The new international video coding standard H.264, also known as Advanced Video Coding (AVC), has been proposed by the Joint Video Team (JVT). Compared to previous video coding standards, H.264/AVC achieves a much better peak signal-to-noise ratio (PSNR) and visual quality at the same bit rate. This is due to a number of new techniques, including directional prediction for intra coded blocks, variable block size for inter coded blocks, multi-reference frame motion estimation, integer transform, in-loop filter, context-based adaptive binary arithmetic coding (CABAC), and rate distortion optimization (RDO). Unfortunately, this performance is obtained at the expense of very high computational complexity. Therefore, the main aim of this thesis is to develop fast algorithms that improve the encoding speed of H.264 without significant loss of visual quality.
A fast mode decision algorithm is first presented for intra prediction in H.264 video coding. By making use of the edge direction histogram, the number of mode combinations for luma and chroma blocks in a macroblock (MB) that take part in the rate distortion optimization calculation is reduced significantly, from 592 to as low as 132. This greatly reduces the complexity and computational load of the encoder. Experimental results show that the fast algorithm has negligible loss of PSNR compared to the original H.264 scheme.
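The following is a minimal sketch of the two ideas in this paragraph, not the implementation used in the thesis. The baseline count follows from H.264 having 9 intra 4x4 luma modes, 4 intra 16x16 luma modes and 4 chroma modes, which gives 4 x (16 x 9 + 4) = 592 combinations. The breakdown used for 132 (2 chroma modes, 4 candidates per 4x4 block, 2 candidates for 16x16) is only one combination consistent with that total and is an assumption here. The Sobel kernels, the use of |gx| + |gy| as edge amplitude, the four direction bins and the function names are likewise illustrative choices.

```python
import math

# Sobel kernels: a common edge detector, assumed here for illustration.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def rdo_combinations(n_chroma, n_luma4, n_luma16):
    """Mode combinations per MB: chroma modes x (16 blocks x 4x4 modes + 16x16 modes)."""
    return n_chroma * (16 * n_luma4 + n_luma16)


def edge_direction_histogram(block, bins=4):
    """Sum edge amplitude into `bins` orientation bins covering [0, 180) degrees."""
    h, w = len(block), len(block[0])
    width = 180.0 / bins
    hist = [0.0] * bins
    for y in range(h):
        for x in range(w):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    # Replicate border pixels so every pixel has a 3x3 neighbourhood.
                    yy = min(max(y + j - 1, 0), h - 1)
                    xx = min(max(x + i - 1, 0), w - 1)
                    gx += SOBEL_X[j][i] * block[yy][xx]
                    gy += SOBEL_Y[j][i] * block[yy][xx]
            amplitude = abs(gx) + abs(gy)
            if amplitude == 0:
                continue
            # The edge runs perpendicular to the gradient.
            angle = (math.degrees(math.atan2(gy, gx)) + 90.0) % 180.0
            # Bin k is centred on k * width degrees (0 = horizontal edge).
            k = int(((angle + width / 2) % 180.0) // width)
            hist[k] += amplitude
    return hist


def dominant_bin(hist):
    """Index of the direction bin holding the largest accumulated amplitude."""
    return hist.index(max(hist))


print(rdo_combinations(4, 9, 4))  # full search: 592
print(rdo_combinations(2, 4, 2))  # assumed reduced candidate set: 132
```

With four bins, a block containing a vertical edge falls into bin 2 (centred on 90 degrees) and a block containing a horizontal edge into bin 0. Only the prediction modes close to the dominant bin would then be passed to the RDO calculation.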
Secondly, a fast inter mode decision algorithm is proposed to decide the best mode in the inter coding of H.264. It makes use of the spatial homogeneity and the temporal stationarity characteristics of the textures of video objects. Specifically, the homogeneity decision for a block is based on edge information inside the block, and the difference between co-sited MBs is used to decide whether the MB is temporally stationary. Based on the homogeneity and stationarity of video objects, only a small number of inter modes are used in RDO. The experimental results show that the fast algorithm substantially reduces encoding time, with negligible PSNR loss.
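The decision logic described above can be sketched as follows. This is an illustration under stated assumptions rather than the algorithm of Chapter 4: the thresholds, the use of a summed absolute difference as the co-sited MB difference, and in particular the mapping from the two tests to a reduced mode list are hypothetical.

```python
# Inter partition modes, from largest to smallest block size.
ALL_MODES = ("SKIP", "16x16", "16x8", "8x16", "8x8", "8x4", "4x8", "4x4")


def is_homogeneous(edge_amplitudes, threshold):
    """Treat a block as homogeneous when its total edge amplitude is small."""
    return sum(sum(row) for row in edge_amplitudes) < threshold


def is_stationary(current_mb, cosited_mb, threshold):
    """Compare an MB with the co-sited MB of the previous frame, pixel by pixel."""
    diff = sum(
        abs(a - b)
        for row_cur, row_ref in zip(current_mb, cosited_mb)
        for a, b in zip(row_cur, row_ref)
    )
    return diff < threshold


def candidate_inter_modes(homogeneous, stationary):
    """Hypothetical mapping: smoother or more static content gets fewer modes."""
    if stationary:
        return ALL_MODES[:2]   # SKIP and 16x16 only
    if homogeneous:
        return ALL_MODES[:4]   # large partitions only
    return ALL_MODES           # fall back to the full search
```

For example, `candidate_inter_modes(False, True)` returns only `("SKIP", "16x16")`, so RDO is evaluated for two modes instead of eight.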
Thirdly, two fast intra 4x4 mode elimination methods are proposed for H.264 coding. The lossless method checks the cost after each 4x4 block intra mode decision, and terminates if the cost is higher than the minimum cost of inter mode coding. The lossy method, on the other hand, uses some low-cost preprocessing to make a prediction, and terminates if the cost is higher than some fraction of this minimum cost. Experimental results show that the lossless method can reduce the encoding time without any sacrifice of visual quality. The lossy method can further reduce encoding time with negligible PSNR loss or bit rate increase.
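A minimal sketch of the early termination test is given below. It assumes that each 4x4 block contributes a non-negative cost, so that once the running total exceeds the best inter cost, intra 4x4 coding can no longer win and stopping changes nothing in the final decision. The function name, its return values and the `fraction` parameter are illustrative; the preprocessing that the lossy method uses to choose its fraction is not reproduced here.

```python
def intra4x4_with_early_exit(block_costs, min_inter_cost, fraction=1.0):
    """Accumulate per-block intra 4x4 costs and stop once the limit is exceeded.

    block_costs    -- iterable giving the best intra cost of each 4x4 block
    min_inter_cost -- minimum cost already found among the inter modes
    fraction       -- 1.0 gives the lossless test; a value below 1.0 is the
                      lossy variant (the value itself is an assumption)

    Returns (running_cost, blocks_checked, completed).
    """
    limit = fraction * min_inter_cost
    running = 0
    checked = 0
    for cost in block_costs:
        running += cost
        checked += 1
        if running > limit:
            # Intra 4x4 cannot beat the limit any more: skip the remaining blocks.
            return running, checked, False
    return running, checked, True
```

Passing the costs as a generator means the mode decision for the remaining blocks is never evaluated after termination, which is where the time saving comes from.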
Finally, this thesis presents two adaptive interpolation methods that can significantly reduce the number of interpolation operations in H.264 coding. By making use of a flag matrix data structure and performing interpolation on demand, the proposed methods are able to increase the encoder speed greatly without any PSNR loss or increase in bit rate.
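The idea of a flag matrix combined with on-demand interpolation can be sketched as below. This is an illustration only: the class name, the choice of one flag per block, and the rounding average used as the interpolation filter are assumptions. H.264 itself derives luma half-sample values with a 6-tap filter, which is replaced here by a two-tap average to keep the example short.

```python
class OnDemandInterpolator:
    """Interpolate a reference frame block by block, only when a block is requested."""

    def __init__(self, frame, block=16):
        self.frame = frame
        self.block = block
        rows = (len(frame) + block - 1) // block
        cols = (len(frame[0]) + block - 1) // block
        # Flag matrix: True once the block at (row, col) has been interpolated.
        self.flags = [[False] * cols for _ in range(rows)]
        self.cache = {}
        self.interpolations = 0

    def _interpolate(self, by, bx):
        # Stand-in filter: horizontal half-sample as a rounding average of neighbours.
        h, w = len(self.frame), len(self.frame[0])
        y0, x0 = by * self.block, bx * self.block
        out = []
        for y in range(y0, min(y0 + self.block, h)):
            row = self.frame[y]
            out.append([
                (row[x] + row[min(x + 1, w - 1)] + 1) >> 1
                for x in range(x0, min(x0 + self.block, w))
            ])
        return out

    def get(self, by, bx):
        """Return the interpolated block, computing it on first use only."""
        if not self.flags[by][bx]:
            self.cache[(by, bx)] = self._interpolate(by, bx)
            self.flags[by][bx] = True
            self.interpolations += 1
        return self.cache[(by, bx)]
```

With 16 x 16 blocks, a Quarter CIF frame of 176 x 144 pixels needs a flag matrix of 9 rows by 11 columns. Blocks that motion estimation never references keep their flag unset and are never interpolated, and since the interpolated values themselves are unchanged, the coded output is unaffected.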
NOMENCLATURE
CABAC context-based adaptive binary arithmetic coding
CAVLC context-based adaptive variable length coding
CIF common intermediate format
DC direct current
ISO International Organization for Standardization
ITU International Telecommunication Union
MPEG Moving Picture Experts Group
MV motion vector
QP quantization parameter
SAD sum of absolute differences
ME/MC motion estimation/motion compensation
FIR finite impulse response
LIST OF FIGURES
Figure 2.1 H.264 encoder architecture
Figure 2.2 H.264 decoder architecture
Figure 2.3 Intra 4x4 prediction modes
Figure 2.4 Intra 16x16 prediction modes
Figure 2.5 Variable partition sizes employed in inter coding
Figure 2.6 Multi-frame motion estimation/motion compensation
Figure 3.1 An example of intra prediction
Figure 3.2 Examples of 4x4 edge patterns and their preferred intra prediction directions
Figure 3.3 Edge direction histogram of 4 x 4 blocks
Figure 3.4 Intra 8 x 8 and 16 x 16 prediction mode directions
Figure 3.5 Edge direction histogram of 16 x 16 luma and 8 x 8 chroma blocks
Figure 3.6 News, Ch_Psnr = -0.067 dB, Ch_Bits = 1.226%
Figure 3.7 Mobile, Ch_Psnr = -0.018 dB, Ch_Bits = 0.451%
Figure 3.8 Time saving at different intra periods
Figure 3.9 Time saving at different search area sizes
Figure 3.10 News, Ch_Psnr = -0.294 dB, Ch_Bits = 3.902%
Figure 3.11 Mobile, Ch_Psnr = -0.255 dB, Ch_Bits = 3.168%
Figure 3.12 News, Ch_Psnr = -0.156 dB, Ch_Bits = 3.106%
Figure 3.13 Mobile, Ch_Psnr = -0.013 dB, Ch_Bits = 0.379%
Figure 4.1 Different partitions in an MB
Figure 4.2 Segmentation of video objects in H.264 algorithm
Figure 4.3 RD curve for ‘News’ (IPPP)
Figure 4.4 RD curve for ‘Mobile’ (IPPP)
Figure 4.5 RD curve for ‘News’ (IBBP)
Figure 4.6 RD curve for ‘Mobile’ (IBBP)
Figure 5.1 Overall mode decision process
Figure 5.2 Border difference of 4 x 4 block
Figure 6.1 Half pixel interpolation
Figure 6.2 Quarter pixel interpolation
Figure 6.3 Match between current block and reference block
Figure 6.4 Active and Inactive macroblocks
Figure 6.5 Reference blocks in the reference frame
Figure 6.6 Interpolated frame memory reorganization
Figure 6.7 Flag matrix for a Quarter CIF frame
Figure 6.8 Flow chart of the second approach
LIST OF TABLES
Table 3.1 Number of candidate modes
Table 3.2 Results for IPPPP sequences
Table 3.3 Results for IIIII sequences
Table 3.4 Results for IBBPB sequences
Table 3.5 Comparison of different fast intra prediction methods
Table 4.1 Results for IPPP sequences
Table 4.2 Results for IBBP sequences
Table 4.3 Results for ‘News’ (IPPP)
Table 4.4 Results for ‘Mobile’ (IPPP)
Table 4.5 Results for ‘News’ (IBBP)
Table 4.6 Results for ‘Mobile’ (IBBP)
Table 5.1 Codec Performance (QP=28)
Table 5.2 Codec Performance (QP=32)
Table 5.3 Codec Performance (QP=36)
Table 5.4 Codec Performance (QP=40)
Table 6.1 Speed increase at QP = 16
Table 6.2 Speed increase at QP = 32