LZW

Juan Francisco Rodríguez Herrera
Vicente González Ruiz

July 10, 2017

Contents

1 Fundamentals
2 Encoder
3 Decoder

1 Fundamentals

2 Encoder

  1. w next input symbol.
  2. While the input is not exhausted:
    1. k next input symbol.
    2. If wk is found in the dictionary, then:
      1. w address of wk in the dictionary.
    3. Else:
      1. Output w.
      2. Insert wk in the dictionary.
      3. w k.

Example

Input w k Output Comment





ab 97 b 97 D[257] ab
a 98 a 98 D[258] ba
b 97 b w = 257
c 257 c 257 D[259] abc
b 99 b 99 D[260] cb
a 98 a w = 258
b 258 b 258 D[261] bab
a 98 a w = 258
b 258 b w = 261
a 261 a 261 D[262] baba
a 97 a 97 D[263] aa
a 97 a w = 263
a 263 a 263 D[264] aaa
a 97 a w = 263

     

  1. w next input symbol.
  2. While the input is not exhausted:
    1. k next input symbol.
    2. If wk is found in the dictionary, then:
      1. w address of wk in the dictionary.
    3. Else:
      1. Output w.
      2. Insert wk in the dictionary.
      3. w k.

Address String w k




0 0 NULL
97 0 a
98 0 b
99 0 c
256
reserved for the ESC code
257 ab 97 b
258 ba 98 a
259 abc 257 c
260 cb 99 b
261 bab 258 b
262 baba 261 a
263 aa 97 a
264 aaa 263 a

3 Decoder

  1. code first input code-word.
  2. Output code.
  3. old_code code.
  4. While the input is not exhausted:
    1. code next input code-word.
    2. w old_code.
    3. If code is found in the dictionary, then:
      1. Output string(code).
    4. Else:
      1. Output string(w).
      2. Output k.
    5. k first symbol of the last output.
    6. Insert wk in the dictionary.
    7. old_code code.

Example

Input code w k Output Comment






97 97 a Initialization
98 98 97 b b code < 257, D[257] ab
257 257 98 a ab code < 258, D[258] ba
99 99 257 c c code < 259, D[259] abc
258 258 99 b ba code < 260, D[260] cb
261 261 258 b bab code = 261, D[261] bab
97 97 261 a a code < 262, D[262] baba
263 263 97 a aa code = 263, D[263] aa

  1. code first input code-word.
  2. Output code.
  3. old_code code.
  4. While the input is not exhausted:
    1. code next input code-word.
    2. w old_code.
    3. If code is found in the dictionary, then:
      1. Output string(code).
    4. Else:
      1. Output string(w).
      2. Output k.
    5. k first symbol of the last output.
    6. Insert wk in the dictionary.
    7. old_code code.

Let’s go to the lab!

  1. Download http://www.ace.ual.es/~vruiz/doctorado/text-compression.tar.gz and make lzw15v.
  2. Encode each image of The Test Image Corpus using the LZW codec. Continue filling the table:
        Codec | lena boats pepers zelda Average  
    ----------+--------------------------------  
            : |    :     :      :     :       :  
       lzw15v | ....  ....   ....  ....    ....

  3. Again, check the reversibility of the LZW codec!

References

[1]   Compuserve Incorporated. Graphics Interchange Format (GIF) Specification, June 1987.

[2]   D. Hamaker. Compress and Compact Discussed Further. Commun. ACM, 31:1139–1140, 1988.

[3]   T.A. Welch. A Technique for High-Performance Data Compression. IEEE Computer, pages 8–19, 1984.