doc/06-floor0.tex

   1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
   2 %!TEX root = Vorbis_I_spec.tex
   3 \section{Floor type 0 setup and decode} \label{vorbis:spec:floor0}
   4
   5 \subsection{Overview}
   6
   7 Vorbis floor type zero uses Line Spectral Pair (LSP, also
   8 known as Line Spectral Frequency or LSF) representation to encode a
   9 smooth spectral envelope curve as the frequency response of the LSP
  10 filter.  This representation is equivalent to a traditional all-pole
  11 infinite impulse response filter as would be used in linear predictive
  12 coding; LSP representation may be converted to LPC representation and
  13 vice-versa.
  14
  15
  16
  17 \subsection{Floor 0 format}
  18
  19 Floor zero configuration consists of six integer fields and a list of
  20 VQ codebooks for use in coding/decoding the LSP filter coefficient
  21 values used by each frame.
  22
  23 \subsubsection{header decode}
  24
  25 Configuration information for instances of floor zero decodes from the
  26 codec setup header (third packet).  Configuration decode proceeds as
  27 follows:
  28
  29 \begin{Verbatim}[commandchars=\\\{\}]
  30   1) [floor0\_order] = read an unsigned integer of 8 bits
  31   2) [floor0\_rate] = read an unsigned integer of 16 bits
  32   3) [floor0\_bark\_map\_size] = read an unsigned integer of 16 bits
  33   4) [floor0\_amplitude\_bits] = read an unsigned integer of 6 bits
  34   5) [floor0\_amplitude\_offset] = read an unsigned integer of 8 bits
  35   6) [floor0\_number\_of\_books] = read an unsigned integer of 4 bits and add 1
  36   7) array [floor0\_book\_list] = read a list of [floor0\_number\_of\_books] unsigned integers of 8 bits each
  37 \end{Verbatim}
  38
  39 An end-of-packet condition during any of these bitstream reads renders
  40 this stream undecodable.  In addition, any element of the array
  41 \varname{[floor0\_book\_list]} that is greater than the maximum codebook
  42 number for this bitstream is an error condition that also renders the
  43 stream undecodable.
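
As a rough illustration of the read order above, the following Python
sketch decodes the seven header fields.  The \texttt{BitReader} class,
the \texttt{decode\_floor0\_header} function, the dictionary keys and
the \texttt{codebook\_count} parameter are names assumed for this
example only.  The reader assumes the LSb-first packing of the Vorbis
bitpacking convention, and both error conditions are signalled by
raising \texttt{ValueError}, which is just one possible way to report
an undecodable stream.

```python
class BitReader:
    """Hypothetical LSb-first bit reader (illustration only)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # absolute bit position within data

    def read(self, nbits: int) -> int:
        if self.pos + nbits > 8 * len(self.data):
            raise EOFError("end of packet")
        value = 0
        for k in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (self.pos & 7)) & 1
            value |= bit << k  # first bit read becomes the LSb
            self.pos += 1
        return value


def decode_floor0_header(reader, codebook_count):
    """Read the floor 0 setup fields in the order of steps 1 to 7."""
    try:
        cfg = {
            "order": reader.read(8),             # step 1
            "rate": reader.read(16),             # step 2
            "bark_map_size": reader.read(16),    # step 3
            "amplitude_bits": reader.read(6),    # step 4
            "amplitude_offset": reader.read(8),  # step 5
        }
        number_of_books = reader.read(4) + 1     # step 6
        cfg["book_list"] = [reader.read(8) for _ in range(number_of_books)]
    except EOFError:
        raise ValueError("end of packet in floor 0 header: stream undecodable")
    # Codebooks are assumed to be numbered 0 .. codebook_count - 1.
    if any(book >= codebook_count for book in cfg["book_list"]):
        raise ValueError("codebook number out of range: stream undecodable")
    return cfg
```

The fixed part of the configuration occupies 8 + 16 + 16 + 6 + 8 + 4 = 58
bits, followed by 8 bits per entry of \varname{[floor0\_book\_list]}.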
  44
  45
  46
  47 \subsubsection{packet decode} \label{vorbis:spec:floor0-decode}
  48
  49 Extracting a floor0 curve from an audio packet consists of first
  50 decoding the curve amplitude and \varname{[floor0\_order]} LSP
  51 coefficient values from the bitstream, and then computing the floor
  52 curve, which is defined as the frequency response of the decoded LSP
  53 filter.
  54
  55 Packet decode proceeds as follows:
  56 \begin{Verbatim}[commandchars=\\\{\}]
  57   1) [amplitude] = read an unsigned integer of [floor0\_amplitude\_bits] bits
  58   2) if ( [amplitude] is greater than zero ) \{
  59        3) [coefficients] is an empty, zero length vector
  60        4) [booknumber] = read an unsigned integer of \link{vorbis:spec:ilog}{ilog}( [floor0\_number\_of\_books] ) bits
  61        5) if ( [booknumber] is greater than the highest number decode codebook ) then packet is undecodable
  62        6) [last] = zero;
  63        7) vector [temp\_vector] = read vector from bitstream using codebook number [floor0\_book\_list] element [booknumber] in VQ context.
  64        8) add the scalar value [last] to each scalar in vector [temp\_vector]
  65        9) [last] = the value of the last scalar in vector [temp\_vector]
  66       10) concatenate [temp\_vector] onto the end of the [coefficients] vector
  67       11) if ( length of vector [coefficients] is less than [floor0\_order] ) continue at step 6
  68
  69      \}
  70
  71  12) done.
  72
  73 \end{Verbatim}
  74
  75 Take note of the following properties of decode:
  76 \begin{itemize}
  77  \item An \varname{[amplitude]} value of zero must result in a return code that indicates this channel is unused in this frame (the output of the channel will be all-zeroes in synthesis).  Several later stages of decode don't occur for an unused channel.
  78  \item An end-of-packet condition during decode should be considered a
  79 nominal occurrence; if end-of-packet is reached during any read
  80 operation above, floor decode is to return 'unused' status as if the
  81 \varname{[amplitude]} value had read zero at the beginning of decode.
  82
  83  \item The book number used for decode
  84 can, in fact, be stored in the bitstream in \link{vorbis:spec:ilog}{ilog}( \varname{[floor0\_number\_of\_books]} -
  85 1 ) bits.  Nevertheless, the above specification is correct and values
  86 greater than the maximum possible book value are reserved.
  87
  88  \item The number of scalars read into the vector \varname{[coefficients]}
  89 may be greater than \varname{[floor0\_order]}, the number actually
  90 required for curve computation.  For example, if the VQ codebook used
  91 for the floor currently being decoded has a
  92 \varname{[codebook\_dimensions]} value of three and
  93 \varname{[floor0\_order]} is ten, the only way to fill all the needed
  94 scalars in \varname{[coefficients]} is to read a total of twelve
  95 scalars as four vectors of three scalars each.  This is not an error
  96 condition, and care must be taken not to allow a buffer overflow in
  97 decode. The extra values are not used and may be ignored or discarded.
  98 \end{itemize}
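
The packet decode steps and the properties listed above can be
sketched in Python as follows.  The callables \texttt{read\_bits} and
\texttt{read\_vector}, the \texttt{cfg} dictionary and the function
names are assumptions made for this example; both callables are
expected to raise \texttt{EOFError} at end of packet.  The sketch
mirrors the numbered listing literally, including the return to step 6
on every pass, and it treats a \varname{[booknumber]} that does not
index \varname{[floor0\_book\_list]} as undecodable, which is one
reading of step 5.  An unused channel is reported as \texttt{None}.

```python
def ilog(x: int) -> int:
    """Return the position of the highest set bit (0 for x <= 0)."""
    return x.bit_length() if x > 0 else 0


def decode_floor0_packet(read_bits, read_vector, cfg):
    """Return (amplitude, coefficients), or None for an unused channel."""
    order = cfg["order"]
    book_list = cfg["book_list"]
    try:
        amplitude = read_bits(cfg["amplitude_bits"])      # step 1
        if amplitude == 0:                                 # step 2
            return None
        coefficients = []                                  # step 3
        booknumber = read_bits(ilog(len(book_list)))       # step 4
        if booknumber >= len(book_list):                   # step 5 (assumed reading)
            raise ValueError("reserved book number: packet undecodable")
        while True:
            last = 0.0                                     # step 6
            temp = read_vector(book_list[booknumber])      # step 7
            temp = [value + last for value in temp]        # step 8
            last = temp[-1]                                # step 9
            coefficients.extend(temp)                      # step 10
            if len(coefficients) >= order:                 # step 11
                break
    except EOFError:
        # End of packet is a nominal occurrence: report 'unused'.
        return None
    # Scalars beyond floor0_order are not needed and are discarded here.
    return amplitude, coefficients[:order]
```

With a codebook of three dimensions and \varname{[floor0\_order]} equal
to ten, this loop reads four vectors (twelve scalars) and keeps the
first ten.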
  99
 100
 101
 102
 103 \subsubsection{curve computation} \label{vorbis:spec:floor0-synth}
 104
 105 Given an \varname{[amplitude]} integer and \varname{[coefficients]}
 106 vector from packet decode as well as the \varname{[floor0\_order]},
 107 \varname{[floor0\_rate]}, \varname{[floor0\_bark\_map\_size]}, \varname{[floor0\_amplitude\_bits]} and
 108 \varname{[floor0\_amplitude\_offset]} values from floor setup, and an output
 109 vector size \varname{[n]} specified by the decode process, we compute a
 110 floor output vector.
 111
 112 If the value \varname{[amplitude]} is zero, the return value is a
 113 length \varname{[n]} vector with all-zero scalars.  Otherwise, begin by
 114 assuming the following definitions for the given vector to be
 115 synthesized:
 116
 117    \begin{displaymath}
 118      \mathrm{map}_i = \left\{
 119        \begin{array}{ll}
 120           \min (
 121             \mathtt{floor0\texttt{\_}bark\texttt{\_}map\texttt{\_}size} - 1,
 122             foobar
 123           ) & \textrm{for } i \in [0,n-1] \\
 124           -1 & \textrm{for } i = n
 125         \end{array}
 126       \right.
 127     \end{displaymath}
 128
 129     where
 130
 131     \begin{displaymath}
 132     foobar =
 133       \left\lfloor
 134         \mathrm{bark}\left(\frac{\mathtt{floor0\texttt{\_}rate} \cdot i}{2n}\right) \cdot \frac{\mathtt{floor0\texttt{\_}bark\texttt{\_}map\texttt{\_}size}} {\mathrm{bark}(.5 \cdot \mathtt{floor0\texttt{\_}rate})}
 135       \right\rfloor
 136     \end{displaymath}
 137
 138     and
 139
 140     \begin{displaymath}
 141       \mathrm{bark}(x) = 13.1 \arctan (.00074x) + 2.24 \arctan (.0000000185x^2) + .0001x
 142     \end{displaymath}
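
The three definitions above translate fairly directly into code.  The
Python sketch below is an illustration under the assumption that
ordinary double-precision arithmetic is acceptable; the function names
\texttt{bark} and \texttt{floor0\_map} are chosen for this example, and
rounding at the \texttt{floor} step may differ from other
implementations.

```python
import math


def bark(x: float) -> float:
    """Bark-scale value of a frequency x, as defined above."""
    return (13.1 * math.atan(0.00074 * x)
            + 2.24 * math.atan(0.0000000185 * x * x)
            + 0.0001 * x)


def floor0_map(n: int, rate: int, bark_map_size: int) -> list:
    """Return map_0 .. map_n; the final entry is the -1 sentinel for i = n."""
    scale = bark_map_size / bark(0.5 * rate)
    result = []
    for i in range(n):
        foobar = math.floor(bark(rate * i / (2 * n)) * scale)
        result.append(min(bark_map_size - 1, foobar))
    result.append(-1)
    return result
```

The returned list has \varname{[n]} + 1 entries, so that element
\varname{[n]} can serve as the terminating value of the map.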
 143
 144 The above is used to synthesize the LSP curve on a Bark-scale frequency
 145 axis, then map the result to a linear-scale frequency axis.
 146 Similarly, the below calculation synthesizes the output LSP curve \varname{[output]} on a log
 147 (dB) amplitude scale, mapping it to linear amplitude in the last step:
 148
 149 \begin{enumerate}
 150  \item  \varname{[i]} = 0
 151  \item  \varname{[$\omega$]} = $\pi$ * map element \varname{[i]} / \varname{[floor0\_bark\_map\_size]}
 152  \item if ( \varname{[floor0\_order]} is odd ) \{
 153   \begin{enumerate}
 154    \item calculate \varname{[p]} and \varname{[q]} according to:
 155            \begin{eqnarray*}
 156              p & = & (1 - \cos^2\omega)\prod_{j=0}^{\frac{\mathtt{floor0\texttt{\_}order}-3}{2}} 4 (\cos([\mathtt{coefficients}]_{2j+1}) - \cos \omega)^2 \\
 157              q & = & \frac{1}{4} \prod_{j=0}^{\frac{\mathtt{floor0\texttt{\_}order}-1}{2}} 4 (\cos([\mathtt{coefficients}]_{2j}) - \cos \omega)^2
 158            \end{eqnarray*}
 159
 160   \end{enumerate}
 161   \} else \varname{[floor0\_order]} is even \{
 162   \begin{enumerate}[resume]
 163    \item calculate \varname{[p]} and \varname{[q]} according to:
 164            \begin{eqnarray*}
 165              p & = & \frac{(1 - \cos\omega)}{2} \prod_{j=0}^{\frac{\mathtt{floor0\texttt{\_}order}-2}{2}} 4 (\cos([\mathtt{coefficients}]_{2j+1}) - \cos \omega)^2 \\
 166              q & = & \frac{(1 + \cos\omega)}{2} \prod_{j=0}^{\frac{\mathtt{floor0\texttt{\_}order}-2}{2}} 4 (\cos([\mathtt{coefficients}]_{2j}) - \cos \omega)^2
 167            \end{eqnarray*}
 168
 169   \end{enumerate}
 170   \}
 171
 172  \item calculate \varname{[linear\_floor\_value]} according to:
 173          \begin{displaymath}
 174            \exp \left( .11512925 \left(\frac{\mathtt{amplitude} \cdot \mathtt{floor0\texttt{\_}amplitude\texttt{\_}offset}}{(2^{\mathtt{floor0\texttt{\_}amplitude\texttt{\_}bits}}-1)\sqrt{p+q}}
 175                   - \mathtt{floor0\texttt{\_}amplitude\texttt{\_}offset} \right) \right)
 176          \end{displaymath}
 177
 178  \item \varname{[iteration\_condition]} = map element \varname{[i]}
 179  \item \varname{[output]} element \varname{[i]} = \varname{[linear\_floor\_value]}
 180  \item increment \varname{[i]}
 181  \item if ( map element \varname{[i]} is equal to \varname{[iteration\_condition]} ) continue at step 5
 182  \item if ( \varname{[i]} is less than \varname{[n]} ) continue at step 2
 183  \item done
 184 \end{enumerate}
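
The iteration above can be sketched in Python as shown below.  This is
a minimal illustration, not a reference implementation: the function
name \texttt{floor0\_curve}, the \texttt{cfg} dictionary and the
\texttt{bark\_map} argument (a list of \varname{[n]} + 1 map elements
ending in $-1$) are assumptions of the example, and no guard is
included for the case $p + q = 0$.  The constant .11512925 is
approximately $\ln(10)/20$, the factor that turns a decibel value into
the exponent of a linear amplitude.

```python
import math


def floor0_curve(amplitude, coefficients, cfg, bark_map, n):
    """Return the length-n floor curve; bark_map must end with -1."""
    if amplitude == 0:
        return [0.0] * n
    order = cfg["order"]
    offset = cfg["amplitude_offset"]
    max_amplitude = 2 ** cfg["amplitude_bits"] - 1
    output = [0.0] * n
    i = 0                                                      # step 1
    while i < n:                                               # step 9
        omega = math.pi * bark_map[i] / cfg["bark_map_size"]   # step 2
        cos_w = math.cos(omega)
        p = q = 1.0
        if order % 2 == 1:                                     # step 3, odd order
            for j in range((order - 3) // 2 + 1):
                p *= 4.0 * (math.cos(coefficients[2 * j + 1]) - cos_w) ** 2
            for j in range((order - 1) // 2 + 1):
                q *= 4.0 * (math.cos(coefficients[2 * j]) - cos_w) ** 2
            p *= 1.0 - cos_w * cos_w
            q *= 0.25
        else:                                                  # step 3, even order
            for j in range((order - 2) // 2 + 1):
                p *= 4.0 * (math.cos(coefficients[2 * j + 1]) - cos_w) ** 2
                q *= 4.0 * (math.cos(coefficients[2 * j]) - cos_w) ** 2
            p *= (1.0 - cos_w) / 2.0
            q *= (1.0 + cos_w) / 2.0
        db = amplitude * offset / (max_amplitude * math.sqrt(p + q)) - offset
        linear_floor_value = math.exp(0.11512925 * db)         # step 4
        iteration_condition = bark_map[i]                      # step 5
        while True:
            output[i] = linear_floor_value                     # step 6
            i += 1                                             # step 7
            if bark_map[i] != iteration_condition:             # step 8
                break
    return output
```

Because consecutive output elements often share one map value, the
inner loop reuses a single \varname{[linear\_floor\_value]} for each run
of equal map elements instead of recomputing it.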
 185
 186 \paragraph{Errata 20150227: Bark scale computation}
 187
 188 Due to a typo when typesetting this version of the specification from the original HTML document, the Bark scale computation previously erroneously read:
 189
 190     \begin{displaymath}
 191       \hbox{\sout{$
 192       \mathrm{bark}(x) = 13.1 \arctan (.00074x) + 2.24 \arctan (.0000000185x^2 + .0001x)
 193       $}}
 194     \end{displaymath}
 195
 196 Note that the last parenthesis is misplaced.  This document now uses the correct equation as it appeared in the original HTML spec document:
 197
 198     \begin{displaymath}
 199       \mathrm{bark}(x) = 13.1 \arctan (.00074x) + 2.24 \arctan (.0000000185x^2) + .0001x
 200     \end{displaymath}
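
To give a sense of how much the misplaced parenthesis matters, the
following Python sketch evaluates both forms of the equation.  The
function names are chosen for this example only.  The two forms agree
at $x = 0$ and drift apart as $x$ grows; at $x = 22050$ the correct
form is larger by roughly two.

```python
import math


def bark_correct(x: float) -> float:
    """Bark scale as given in the corrected equation."""
    return (13.1 * math.atan(0.00074 * x)
            + 2.24 * math.atan(0.0000000185 * x * x)
            + 0.0001 * x)


def bark_misprinted(x: float) -> float:
    """Erroneous form with the linear term inside the second arctan."""
    return (13.1 * math.atan(0.00074 * x)
            + 2.24 * math.atan(0.0000000185 * x * x + 0.0001 * x))


for x in (0.0, 1000.0, 22050.0):
    print(x, bark_correct(x), bark_misprinted(x))
```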
 201