Video Coding Fundamentals

Juan Francisco Rodríguez Herrera
Vicente González Ruiz

July 10, 2017

Contents

1 Sources of redundancy
2 Memory requirements of PCM video
3 Block-based ME (Motion Estimation)
4 Sub-pixel accuracy
5 Matching criteria (similitude between macroblocks)
6 Searching strategies
7 The GOP (Group Of Pictures) concept
8 Lossy predictive video coding
9 MCTF (Motion Compensated Temporal Filtering)
10 t+2d vs. 2d+t vs. 2d+t+2d
11 Deblocking filtering
12 Bit-rate allocation
13 Quality scalability
14 Temporal scalability
15 Spatial scalability

1 Sources of redundancy

  1. Spatial redundancy: Neighboring pixels tend to have similar values, and textures tend to repeat within an image.
  2. Temporal redundancy: Temporally adjacent images are typically very similar.
  3. Visual redundancy: Humans barely perceive high spatial and temporal frequencies (we are more sensitive to low frequencies).

2 Memory requirements of PCM video

3 Block-based ME (Motion Estimation)

PIC

4 Sub-pixel accuracy

PIC

5 Matching criteria (similitude between macroblocks)

6 Searching strategies

7 The GOP (Group Of Pictures) concept

PIC

8 Lossy predictive video coding

Let $V_i$ be the $i$-th image of the video sequence and $V_i[q]$ an approximation of $V_i$ with quality $q$ (most video compressors are lossy). In this context, a hybrid video codec (t+2d) has the following structure:

PIC
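
As a rough illustration of the temporal prediction loop in such a codec, the following minimal sketch predicts each image from the previous reconstructed image and quantizes the residual. It rests on several assumptions that are not in the text above: images are flat lists of integer pixel values, motion is ignored (all motion vectors are zero), the 2D transform and entropy coding stages are omitted, and the quality q is controlled by a hypothetical quantization step `step`. The function names are illustrative only.

```python
def quantize(residual, step):
    """Uniform scalar quantization: the lossy stage (larger step = lower quality)."""
    return [round(r / step) for r in residual]


def dequantize(indexes, step):
    """Map quantization indexes back to approximate residual values."""
    return [k * step for k in indexes]


def encode(frames, step):
    """Closed-loop predictive encoder (assumed structure, zero motion).

    Each frame is predicted from the previous *reconstructed* frame, so the
    encoder and the decoder always work with the same reference.
    """
    reference = [0] * len(frames[0])  # assumption: first frame predicted from zeros
    stream = []
    for frame in frames:
        residual = [p - r for p, r in zip(frame, reference)]
        indexes = quantize(residual, step)
        stream.append(indexes)
        # Reproduce the decoder's reconstruction inside the encoder.
        reference = [r + d for r, d in zip(reference, dequantize(indexes, step))]
    return stream


def decode(stream, step):
    """Rebuild the approximated frames from the quantized residuals."""
    reference = [0] * len(stream[0])
    frames = []
    for indexes in stream:
        reference = [r + d for r, d in zip(reference, dequantize(indexes, step))]
        frames.append(reference)
    return frames


# Three tiny 4-pixel "images" that change slowly over time.
video = [[10, 20, 30, 40], [12, 21, 29, 44], [15, 25, 28, 50]]
approximation = decode(encode(video, step=4), step=4)
```

Because the prediction uses reconstructed images rather than original ones, the quantization error does not accumulate from image to image: in this sketch each reconstructed pixel differs from the original by at most half the quantization step.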

9 MCTF (Motion Compensated Temporal Filtering)

PIC

10 t+2d vs. 2d+t vs. 2d+t+2d

PIC

11 Deblocking filtering

PIC

12 Bit-rate allocation

PIC

13 Quality scalability

PIC

14 Temporal scalability

PIC

$$V^t = \{V_{2^t \times i};\ 0 \le i < \frac{\#V}{2^t}\} = \{V^{t-1}_{2i};\ 0 \le i < \frac{\#V^{t-1}}{2}\}, \tag{1}$$

where #V is the number of pictures in V and t denotes the Temporal Resolution Level (TRL).
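
Read as a subsampling rule, Eq. (1) says that each additional temporal resolution level keeps every other picture of the previous level. A minimal sketch of both forms of the definition follows, assuming the video is held as a Python list of pictures (here replaced by their indexes); the function names are hypothetical.

```python
def temporal_level(pictures, t):
    """Direct form: keep the pictures whose index is a multiple of 2**t."""
    if t < 0:
        raise ValueError("t must be non-negative")
    return pictures[::2 ** t]


def temporal_level_recursive(pictures, t):
    """Recursive form: level t keeps every other picture of level t-1."""
    if t < 0:
        raise ValueError("t must be non-negative")
    if t == 0:
        return list(pictures)
    return temporal_level_recursive(pictures, t - 1)[::2]


# Stand-in for an 8-picture video: each picture is represented by its index.
V = list(range(8))
level_1 = temporal_level(V, 1)  # pictures 0, 2, 4, 6
level_2 = temporal_level(V, 2)  # pictures 0, 4
```

With t = 0 the full sequence is returned, and each increment of t halves the frame rate, which is what allows a decoder to discard the higher-rate pictures without affecting the ones it keeps.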

15 Spatial scalability

PIC