Video Coding Fundamentals

Juan Francisco Rodríguez Herrera
Vicente González Ruiz

September 27, 2014

Contents

1 Memory requirements of PCM video
2 Sources of redundancy
3 Block-based ME (Motion Estimation)
4 Sub-pixel accuracy
5 Matching criteria (similitude between macroblocks)
6 Searching strategies
7 The GOP (Group Of Pictures) concept
8 Lossy predictive video coding
9 MCTF (Motion Compensated Temporal Filtering)
10 t+2d vs. 2d+t vs. 2d+t+2d
11 Deblocking filtering
12 The bit-rate allocation problem
13 Quality scalability
14 Temporal scalability
15 Spatial scalability

1 Memory requirements of PCM video
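
As a rough illustration, the storage needed by uncompressed (PCM) video is the product of the spatial resolution, the bits per pixel, the frame rate and the duration. The sketch below assumes 1920x1080 pictures, 12 bits per pixel (8-bit samples with 4:2:0 chroma subsampling), 25 pictures per second and 60 seconds. These figures and the function name are illustrative assumptions, not values from these notes.

```python
def pcm_video_bytes(width, height, bits_per_pixel, fps, seconds):
    """Bytes needed to store uncompressed (PCM) video."""
    bits = width * height * bits_per_pixel * fps * seconds
    return bits // 8

# Illustrative parameters (assumptions): Full HD, 4:2:0 at 8 bits, 25 fps, 1 minute.
size = pcm_video_bytes(1920, 1080, 12, 25, 60)
print(size)            # 4665600000 bytes
print(size / 2 ** 30)  # about 4.3 GiB
```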

2 Sources of redundancy

  1. Spatial redundancy: Neighboring pixels tend to be very similar, and textures tend to repeat.
  2. Temporal redundancy: Temporally adjacent images are typically very alike.
  3. Visual redundancy: Humans hardly perceive high spatial and temporal frequencies (we are more sensitive to low frequencies).
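
The first two kinds of redundancy can be made visible with a small experiment: when neighboring samples are alike, the differences between them are much smaller than the samples themselves, and small values are cheaper to encode. The sketch below uses two synthetic 8x8 pictures (a smooth ramp and a slightly brighter copy). The data and the helper names are assumptions made only for illustration.

```python
def mean_abs(values):
    """Mean absolute value of a sequence of numbers."""
    values = list(values)
    return sum(abs(v) for v in values) / len(values)

def spatial_residual(frame):
    """Difference between each pixel and its left neighbor."""
    return [row[x] - row[x - 1] for row in frame for x in range(1, len(row))]

def temporal_residual(prev, curr):
    """Pixel-wise difference between two consecutive pictures."""
    return [c - p for rp, rc in zip(prev, curr) for p, c in zip(rp, rc)]

# Synthetic pictures (assumption): a smooth ramp and a slightly brighter copy.
frame0 = [[100 + 2 * x + y for x in range(8)] for y in range(8)]
frame1 = [[v + 1 for v in row] for row in frame0]

samples = [v for row in frame1 for v in row]
print(mean_abs(samples))                            # 111.5
print(mean_abs(spatial_residual(frame1)))           # 2.0
print(mean_abs(temporal_residual(frame0, frame1)))  # 1.0
```

Real pictures are less regular than this ramp, but the same effect is what predictive and transform coders exploit.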

3 Block-based ME (Motion Estimation)

PIC

4 Sub-pixel accuracy

PIC

5 Matching criteria (similitude between macroblocks)
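
Two widely used criteria are the Sum of Absolute Differences (SAD) and the Mean Squared Error (MSE). Both are 0 for identical blocks and grow as the blocks differ. The notes do not say which criteria are covered here, so the following is only a minimal sketch, with blocks represented as lists of rows (an assumption).

```python
def sad(block_a, block_b):
    """Sum of Absolute Differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def mse(block_a, block_b):
    """Mean Squared Error between two equally sized blocks."""
    n = sum(len(row) for row in block_a)
    return sum((a - b) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b)) / n

a = [[1, 2], [3, 4]]
b = [[1, 3], [5, 4]]
print(sad(a, b))  # 3
print(mse(a, b))  # 1.25
```

SAD avoids multiplications, which is one reason it is popular in practice. MSE penalizes large errors more strongly.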

6 Searching strategies
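
The simplest strategy is the exhaustive (full) search: every displacement inside a search window is tested and the one with the lowest matching cost is kept. The sketch below assumes SAD as the cost, a 4x4 block and a search radius of 2 pixels. The pictures are synthetic and all names are hypothetical. Faster strategies test only a subset of these candidates.

```python
def sad(curr, cx, cy, ref, rx, ry, size):
    """SAD between the block of curr at (cx, cy) and the block of ref at (rx, ry)."""
    return sum(abs(curr[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size)
               for i in range(size))

def full_search(curr, ref, x0, y0, size=4, radius=2):
    """Return (dx, dy, cost) of the best match in ref for the block of curr at (x0, y0)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = x0 + dx, y0 + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(curr, x0, y0, ref, rx, ry, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best

def texture(x, y):
    """Synthetic, non-repeating picture content (assumption)."""
    return 20 + 10 * x + y * y

# The current picture is the reference moved 2 pixels right and 1 pixel down.
ref = [[texture(x, y) for x in range(12)] for y in range(12)]
curr = [[texture(x - 2, y - 1) for x in range(12)] for y in range(12)]

print(full_search(curr, ref, 4, 4))  # (-2, -1, 0)
```

The returned vector points from the block in the current picture to its match in the reference picture, so content that moved right and down yields a negative displacement.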

7 The GOP (Group Of Pictures) concept

PIC

8 Lossy predictive video coding

Let $V_i$ be the $i$-th image of the video sequence and $V_i^{[q]}$ an approximation of $V_i$ with quality $q$ (most video compressors are lossy). In this context, a hybrid video codec (t+2d) has the following structure:

PIC
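
The key idea of a lossy predictive coder can be sketched in a few lines: the encoder predicts each image from the previously reconstructed one (not from the original), quantizes the prediction error, and reconstructs exactly what the decoder will reconstruct. The sketch below is a strong simplification and not necessarily the structure shown in the figure: it assumes zero motion (no motion compensation), no 2D transform, no entropy coding, and a uniform quantizer whose step plays the role of the quality. All names are hypothetical.

```python
def encode(frames, step):
    """Closed-loop predictive encoder: returns the quantized residuals."""
    height, width = len(frames[0]), len(frames[0][0])
    # Reconstruction of the previous image, as the decoder will see it.
    reference = [[0] * width for _ in range(height)]
    stream = []
    for frame in frames:
        indices = [[round((frame[y][x] - reference[y][x]) / step)
                    for x in range(width)] for y in range(height)]
        stream.append(indices)
        # Update the reference with the decoded (lossy) image.
        reference = [[reference[y][x] + indices[y][x] * step
                      for x in range(width)] for y in range(height)]
    return stream

def decode(stream, step):
    """Decoder: accumulates the dequantized residuals."""
    height, width = len(stream[0]), len(stream[0][0])
    reference = [[0] * width for _ in range(height)]
    frames = []
    for indices in stream:
        reference = [[reference[y][x] + indices[y][x] * step
                      for x in range(width)] for y in range(height)]
        frames.append(reference)
    return frames

video = [[[10, 20], [30, 40]],
         [[12, 21], [29, 45]],
         [[15, 19], [31, 50]]]
decoded = decode(encode(video, 4), 4)
```

Because the encoder predicts from the reconstructed images, the quantization error does not accumulate from image to image: in this sketch every decoded pixel stays within half a quantization step of the original.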

9 MCTF (Motion Compensated Temporal Filtering)

PIC

10 t+2d vs. 2d+t vs. 2d+t+2d

PIC

11 Deblocking filtering

PIC

12 The bit-rate allocation problem

PIC

13 Quality scalability

PIC

14 Temporal scalability

PIC

$$
V^t = \{V_{2^t \times i};\ 0 \le i < \frac{\#V}{2^t}\} = \{V^{t-1}_{2i};\ 0 \le i < \frac{\#V^{t-1}}{2}\},
\tag{1}
$$

where $\#V$ is the number of pictures in $V$ and $t$ denotes the Temporal Resolution Level (TRL).
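
Equation (1) amounts to keeping one picture out of every $2^t$. A minimal sketch, assuming the video is a Python list of pictures whose length is a multiple of $2^t$ (the function name is hypothetical):

```python
def temporal_level(video, t):
    """Return V^t: the pictures of V whose index is a multiple of 2^t."""
    return video[::2 ** t]

# Pictures are represented by their indices for illustration.
V = list(range(16))
print(temporal_level(V, 0))  # all 16 pictures
print(temporal_level(V, 1))  # [0, 2, 4, 6, 8, 10, 12, 14]
print(temporal_level(V, 2))  # [0, 4, 8, 12]

# Second form of Eq. (1): V^t is V^{t-1} subsampled by 2.
assert temporal_level(V, 2) == temporal_level(temporal_level(V, 1), 1)
```

Each additional TRL halves the frame rate, so a decoder can choose the level that matches its resources.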

15 Spatial scalability

PIC