Vorbis

Vicente González Ruiz

September 27, 2014

Contents

1 Introduction
 1.1 What is Vorbis?
 1.2 What is Ogg Vorbis?
 1.3 Why was Vorbis born?
 1.4 How is Vorbis used?
 1.5 Who uses Vorbis?
 1.6 Licensing
2 How does Vorbis work?
 2.1 The Vorbis encoder
 2.2 Overlapped processing
 2.3 Windowing
 2.4 MDCT (Modified Discrete Cosine Transform)
 2.5 SAM (pSycho Acoustic Model) [3]
  2.5.1 ATH (Absolute Threshold of Hearing) model [4]
  2.5.2 Frequency resolution and simultaneous masking
  2.5.3 Temporal masking
 2.6 Quantization
 2.7 Floor and residue encoding
 2.8 Vorbis’s VQ (Vector Quantization)
 2.9 Huffman coding
 2.10 Packet “peeling”
 2.11 Channel coupling
3 Ogg
 3.1 What is Ogg?
 3.2 The Ogg format

Part 1
Introduction

1.1 What is Vorbis?

1.2 What is Ogg Vorbis?

1.3 Why was Vorbis born?

1.4 How is Vorbis used?

1.5 Who uses Vorbis?





Spotify, Audacity, WinAmp, GStreamer, VLC Media Player, Firefox, Chrome, Android, HTML5




1.6 Licensing

Part 2
How does Vorbis work?

2.1 The Vorbis encoder

          PCM   +---------+  Ogg  
          ----->| Encoder |------->  
          audio +---------+ stream  
               /           \  
            /                 \  
         /                       \  
     /                               \  
 /                                       \  
+--------+    +-----+    +---+    +-------+  
| W+MDCT |--->| SAM |--->| Q |--->| VQ+HE |  
+--------+    +-----+    +---+    +-------+  
 
W+MDCT = Windowed Modified Discrete Cosine Transform  
SAM = pSycho Acoustic Model  
Q = Quantization  
VQ+HE = Vector Quantization + Huffman Encoding

2.2 Overlapped processing

0              N-1            2N-1            3N-1  
+---------------+---------------+---------------+ s[n]  
<--------Transform Step--------->  
                <---------Transform Step-------->
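
The diagram above can be read as a framing rule: each transform step takes 2N samples of s[n], and consecutive steps start N samples apart, so every block shares half of its samples with the next one. A minimal sketch of that rule follows; the function name `overlapped_blocks` and the toy signal are hypothetical, chosen only for illustration.

```python
def overlapped_blocks(s, N):
    """Yield (start, block) pairs: blocks of 2N samples taken every N samples."""
    for start in range(0, len(s) - 2 * N + 1, N):
        yield start, s[start:start + 2 * N]

# Toy stand-in for 3N PCM samples with N = 4 (hypothetical data).
s = list(range(12))
blocks = list(overlapped_blocks(s, 4))

for start, block in blocks:
    print(start, block)
```

With 3N samples this gives exactly the two transform steps drawn above: one covering samples 0 to 2N-1 and one covering samples N to 3N-1.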

2.3 Windowing

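As a sketch of the window shape, the Vorbis I specification defines its window slope as w[x] = sin(pi/2 * sin^2(pi * (x + 0.5) / n)) for a window of n samples. The code below evaluates it for two equal-size overlapped blocks (n = 2N) and checks that the squared windows of overlapping halves add up to one, which is what lets overlapped blocks be recombined without amplitude ripple. The function name `vorbis_window` is hypothetical, and the long/short block transitions of real Vorbis streams are not covered here.

```python
import math

def vorbis_window(n):
    """Window of even length n: w[x] = sin(pi/2 * sin^2(pi * (x + 0.5) / n))."""
    return [math.sin(0.5 * math.pi * math.sin(math.pi * (x + 0.5) / n) ** 2)
            for x in range(n)]

N = 8                      # hop size; the window spans 2N samples
w = vorbis_window(2 * N)

# Power complementarity: w[x]^2 + w[x + N]^2 should equal 1 for every x < N.
max_err = max(abs(w[x] ** 2 + w[x + N] ** 2 - 1.0) for x in range(N))
print(max_err)
```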

2.4 MDCT (Modified Discrete Cosine Transform)
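
A minimal sketch of the transform, assuming the textbook MDCT definition X[k] = sum over n of x[n] cos(pi/N (n + 1/2 + N/2)(k + 1/2)), which maps 2N input samples to N coefficients. The direct O(N^2) form is used for readability; real codecs use a fast algorithm. The names `mdct` and `imdct` are hypothetical, and the sine window is only a stand-in that satisfies the same power-complementarity condition as the Vorbis window. The demo checks that overlap-adding the inverse transforms of two consecutive blocks restores the samples they share.

```python
import math

def mdct(block, window):
    """Map 2N windowed samples to N coefficients (direct O(N^2) form)."""
    N = len(block) // 2
    return [sum(window[n] * block[n]
                * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, window):
    """Map N coefficients back to 2N windowed, time-aliased samples."""
    N = len(coeffs)
    return [window[n] * (2.0 / N)
            * sum(coeffs[k]
                  * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                  for k in range(N))
            for n in range(2 * N)]

N = 4
# Sine window (stand-in): w[n]^2 + w[n + N]^2 = 1.
window = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]
# Hypothetical test signal of 3N samples.
signal = [0.0, 1.0, -2.0, 3.0, 0.5, -1.5, 2.5, -0.5, 1.0, 0.0, -1.0, 2.0]

# Transform every overlapped block, invert it, and overlap-add the results.
out = [0.0] * len(signal)
for start in range(0, len(signal) - 2 * N + 1, N):
    y = imdct(mdct(signal[start:start + 2 * N], window), window)
    for n in range(2 * N):
        out[start + n] += y[n]

# Only samples N..2N-1 are covered by two blocks, so only they are restored.
middle_error = max(abs(out[n] - signal[n]) for n in range(N, 2 * N))
print(middle_error)
```

Each block on its own is not invertible (2N samples go to N coefficients); the aliasing introduced by one block is cancelled by its neighbour during the overlap-add.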

2.5 SAM (pSycho Acoustic Model) [3]

2.5.1 ATH (Absolute Threshold of Hearing) model [4]

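A widely quoted closed-form approximation of the absolute threshold of hearing, usually attributed to Terhardt [4], is ATH(f) = 3.64 (f/1000)^-0.8 - 6.5 exp(-0.6 (f/1000 - 3.3)^2) + 10^-3 (f/1000)^4, in dB SPL with f in Hz. The sketch below simply evaluates it; the function name `ath_db_spl` is hypothetical, and it is an assumption that this curve matches the one intended here, since an encoder may store its threshold as a table instead.

```python
import math

def ath_db_spl(f_hz):
    """Approximate absolute threshold of hearing in dB SPL at f_hz (Hz)."""
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

for f in (100, 1000, 3300, 16000):
    print(f, round(ath_db_spl(f), 2))
```

The curve is lowest around 3 to 4 kHz, where hearing is most sensitive, and rises steeply toward both ends of the audible range.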

2.5.2 Frequency resolution and simultaneous masking

2.5.3 Temporal masking

2.6 Quantization

2.7 Floor and residue encoding

2.8 Vorbis’s VQ (Vector Quantization)

2.9 Huffman coding

2.10 Packet “peeling”

2.11 Channel coupling

Part 3
Ogg

3.1 What is Ogg?

3.2 The Ogg format

struct Ogg_Stream {  
  struct Ogg_page* Pages;            /* A stream is a sequence of pages */  
};  
 
struct Ogg_page {  
  uint8[4] Ogg_Magic_Number = "OggS"; /* the Ogg magic number */  
  uint8    Version = 0;  
  uint8    Header_Type;              /* flags for this page: Continuation, BOS or EOS */  
  uint64   Granule_Position;         /* A time marker */  
  uint32   Bitstream_Serial_Number;  /* Identifies the stream in multi-stream seqs */  
  uint32   Page_Sequence_Number;  
  uint32   CRC32;  
  uint8    Page_Segments;            /* Number of segments in this page */  
  struct   Segment_Table;            /* Page_Segments entries */  
};  
 
struct Segment_Table {  
  uint8* Segment_Length; /* One length per segment, in bytes (0 to 255) */  
};
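
The layout above can be read with a fixed 27-byte header followed by the segment table. A minimal parsing sketch follows, assuming the little-endian field order of the Ogg specification (RFC 3533) and its header-type flag bits (0x01 continuation, 0x02 BOS, 0x04 EOS). The names `HEADER` and `parse_page_header` and the hand-built example page are hypothetical; the CRC is read but not verified.

```python
import struct

# magic, version, header type, granule position, serial number,
# page sequence number, CRC32, number of segments: 27 bytes, no padding.
HEADER = struct.Struct("<4sBBQIIIB")

def parse_page_header(data):
    """Parse one Ogg page header located at the start of `data`."""
    (magic, version, header_type, granule, serial,
     sequence, crc, n_segments) = HEADER.unpack_from(data, 0)
    if magic != b"OggS":
        raise ValueError("not an Ogg page")
    # One length byte per segment follows the fixed header.
    segment_table = list(data[HEADER.size:HEADER.size + n_segments])
    return {
        "version": version,
        "continuation": bool(header_type & 0x01),
        "bos": bool(header_type & 0x02),
        "eos": bool(header_type & 0x04),
        "granule_position": granule,
        "serial": serial,
        "sequence": sequence,
        "crc32": crc,
        "segment_table": segment_table,
        "payload_size": sum(segment_table),
    }

# Hand-built header for illustration (CRC left at 0, so not a valid checksum).
example = HEADER.pack(b"OggS", 0, 0x02, 0, 1234, 0, 0, 2) + bytes([255, 45])
page = parse_page_header(example)
print(page["bos"], page["segment_table"], page["payload_size"])
```

The sum of the segment lengths gives the size of the page payload, which is how a reader knows where the next page starts.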

Bibliography

[1]   The Xiph Open Source Community. Vorbis audio compression. http://xiph.org/vorbis.

[2]   Xiph.Org Foundation. Ogg Vorbis documentation. http://xiph.org/vorbis/doc.

[3]   Erik Montnémery and Johannes Sandvall. Ogg/Vorbis in embedded systems. PhD thesis, Lunds Tekniska Högskola, Lunds Universitet, 2004.

[4]   E. Terhardt. Calculating virtual pitch. Hearing Res., 1:155–182, 1979.