We can compress data if each symbol is translated into a code-word and, on
average, the lengths of the code-words are smaller than the lengths of the
symbols.
The encoder and the decoder share a probabilistic model that supplies the
variable-length encoder/decoder with the probability of each symbol
(see Figure 1).
The most probable symbols are represented by the shorter code-words, and
vice versa.
Figure 1: Block diagram of entropy encoding/decoding.
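For illustration, consider a toy example (a minimal sketch in Python, assuming a hypothetical four-symbol alphabet and a hand-made prefix code, not the actual codec of Figure 1): assigning shorter code-words to the more probable symbols brings the average code-word length below the 2 bits/symbol of a fixed-length representation.

    # Hypothetical 4-symbol source and a hand-made prefix code in which
    # the most probable symbols get the shortest code-words.
    probabilities = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
    code = {"a": "0", "b": "10", "c": "110", "d": "111"}  # prefix-free

    # Average code-word length (bits/symbol) under this model.
    avg_length = sum(probabilities[s] * len(code[s]) for s in probabilities)

    # A fixed-length code for 4 symbols needs 2 bits/symbol.
    print(f"average length = {avg_length} bits/symbol (vs. 2 bits/symbol fixed-length)")

For this distribution the average length is 1.75 bits/symbol, so the variable-length code compresses relative to the fixed-length representation.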
2 Bit of data and bit of information
Data is the representation of information.
Lossless data compression uses a shorter representation of the same information.
By definition, a bit of data stores a bit of information if and only if it
represents the occurrence of an equiprobable event (an event that can be
true or false with the same probability).
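For example, the outcome of a fair coin toss is an equiprobable event, so a single bit of data (say, 0 for heads and 1 for tails) stores exactly one bit of information.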
By definition, a symbol $s$ with probability $p(s)$ stores

$$I(s) = -\log_2 p(s)$$    (Eq:symbol_information)

bits of information.
The length of the code-word therefore depends on the probability as

$$l(s) \approx -\log_2 p(s) = I(s).$$
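As a quick numerical check (a minimal sketch, using a few arbitrary probabilities rather than any particular source), the information content of a symbol, and hence its ideal code-word length, grows as its probability decreases:

    import math

    # Information content I(s) = -log2 p(s) for a few assumed probabilities;
    # this is also the ideal (possibly fractional) code-word length in bits.
    for p in (0.5, 0.25, 0.1, 0.01):
        information = -math.log2(p)
        print(f"p = {p:<5} -> I = {information:.2f} bits")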
3 Entropy of an information source
The entropy $H$ measures the amount of information per symbol that a source
of information produces, on average, i.e.,

$$H = -\sum_{s=1}^{N} p(s)\,\log_2 p(s)$$    (1)

bits-of-information/symbol, where $N$ is the size of the source alphabet
(the number of different symbols).
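Eq. (1) can be computed directly; the following sketch (reusing the hypothetical four-symbol distribution from the earlier example) gives $H = 1.75$ bits/symbol, which matches the average code-word length of that prefix code:

    import math

    def entropy(probabilities):
        """Entropy of a source in bits-of-information/symbol, as in Eq. (1)."""
        return -sum(p * math.log2(p) for p in probabilities if p > 0)

    # Hypothetical source with N = 4 symbols.
    H = entropy([0.5, 0.25, 0.125, 0.125])
    print(f"H = {H} bits/symbol")  # 1.75, matching the average code-word length above

This is no coincidence: the entropy is a lower bound on the average number of bits per symbol that any lossless code can achieve for that source.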