United States Patent [19]
Cellier et al.

[11] Patent Number: 5,884,269
[45] Date of Patent: Mar. 16, 1999

US005884269A
`
4,841,299   6/1989  Weaver
4,882,754  11/1989  Weaver et al.

OTHER PUBLICATIONS

C. Cellier et al., "Lossless Audio Data Compression for Real
Time Applications," 95th AES Convention, Oct. 1993.
`
Primary Examiner—Chi H. Pham
Assistant Examiner—Ricky Q. Ngu
Attorney, Agent, or Firm—Kenyon & Kenyon
`
[57] ABSTRACT

An audio signal compression and decompression method
and apparatus that provide lossless, realtime performance.
The compression/decompression method and apparatus are
based on an entropy encoding technique using multiple
Huffman code tables. Uncompressed audio data samples are
first processed by a prediction filter which generates prediction
error samples. An optimum coding table is then selected
from a number of different preselected tables which have
been tailored to different probability density functions of the
prediction error. For each frame of prediction error samples,
an entropy encoder selects the one Huffman code table
which will yield the shortest encoded representation of the
frame of prediction error samples. The frame of prediction
error samples is then encoded using the selected Huffman
code table. A block structure for the compressed data and a
decoder for reconstructing the original audio signal from the
compressed data are also disclosed.
`
`14 Claims, 7 Drawing Sheets
`
[54] LOSSLESS COMPRESSION/DECOMPRESSION
     OF DIGITAL AUDIO DATA

[75] Inventors: Claude Cellier, Chexbres, Switzerland;
     Pierre Chénes, Ferreyres, Switzerland

[73] Assignee: Merging Technologies, Puidoux,
     Switzerland

[21] Appl. No.: 422,457

[22] Filed: Apr. 17, 1995
`
[51] Int. Cl.⁶ ............... G10L 7/00; G10L 3/02
[52] U.S. Cl. ............... 704/501; 704/504; 371/37.8; 341/64
[58] Field of Search ........ 395/2.38, 2.39; 375/242, 243, 254;
     371/30, 37.1, 37.2, 37.7, 37.8; 382/56; 358/246, 261.1,
     261.2, 427; 704/200, 500, 501, 503, 504; 341/50, 51, 64, 65
`
`
[56]                References Cited

           U.S. PATENT DOCUMENTS

4,396,906   8/1983  Weaver .................. 340/347
4,449,536   5/1984  Weaver .................. 128/696
4,503,510   5/1985  Weaver .................. 364/715
4,535,320   8/1985  Weaver .................. 340/347
4,546,342  10/1985  Weaver .................. 340/347
4,754,483   6/1988  Weaver .................. 381/36
4,802,222   1/1989  Weaver .................. 381/35
`
`
[Representative drawing: FIG. 1 encoder block diagram, showing the
predictor, best table selector, compact Huffman tables dictionary (104),
compact Huffman weight tables, and Huffman coding and frame coding,
with output/storage.]
`
`Samsung Exhibit 1050
`
`Samsung v. Affinity
`IPR2014-01181
`
`Page 00001
`
`
`
`
U.S. Patent          Mar. 16, 1999          Sheet 1 of 7          5,884,269

[FIG. 1: encoder block diagram, showing the predictor, best table
selector, compact Huffman tables dictionary (104), compact Huffman
weight tables, and Huffman coding and frame coding, with output/storage.]

[FIG. 2: decoder block diagram, showing input/storage, frame decoding,
the Huffman tables dictionary, Huffman decoding, the inverse predictor,
and output.]
`
`
`
U.S. Patent          Mar. 16, 1999          Sheet 2 of 7          5,884,269

[FIG. 3: graph of the probability density function of the prediction
error, probability versus error value, showing the continuous Laplacian
curve and its discrete (binned) model.]
`
`
`
U.S. Patent          Mar. 16, 1999          Sheet 3 of 7          5,884,269

[FIG. 4: flowchart of the best-table search:
401: init all bin hit counters to 0, N=0
402: get err sample [N]
403: find bin hit for err sample [N]
404-405: increment N; loop back until N = frame length (YES exits)
406: init cost of each table to 0, table pointer T=0
407: compute cost for table [T]
408: T=0 or COST[T] < MINCOST ?
409: MINCOST = COST[T]; BESTTABLE = T
410: increment T
411: T = last table ? (YES ends the search)]
`
`
`
U.S. Patent          Mar. 16, 1999          Sheet 4 of 7          5,884,269

[FIG. 5(A): flowchart of a first procedure for finding the bin of an
error sample (step 403 of FIG. 4); the detail is not recoverable from
the scan.]
`
`
`
U.S. Patent          Mar. 16, 1999          Sheet 5 of 7          5,884,269

[FIG. 5(B): flowchart of a second procedure for finding the bin of an
error sample:
From 404 (FIG. 4):
  take absolute value of err sample: ABS ERR(N)
  LBYTE = ABS ERR(N) AND 0xFF
  HBYTE = ABS ERR(N) AND 0xFF00, shifted 8 bits right
  if ABS ERR > 255: INC BINHIT[HIGH TABLE[HBYTE]]
  if ABS ERR < 256: INC BINHIT[LOW TABLE[LBYTE]]
To 404 (FIG. 4).]
`
`
`
U.S. Patent          Mar. 16, 1999          Sheet 6 of 7          5,884,269

[FIG. 6: flowchart of the coding cost of one Huffman table:
  COST = 0, level pointer L=0
  COST = COST + (BINHIT[L] * WEIGHT[L]), repeated over all levels L.]
`
`
`
U.S. Patent          Mar. 16, 1999          Sheet 7 of 7          5,884,269

[FIG. 7: organization of compressed audio information, a sequence of
blocks, each consisting of a header (with sync word and coding
parameters) followed by compressed audio data; the detail is not
recoverable from the scan.]
`
`
`
LOSSLESS COMPRESSION/DECOMPRESSION OF
DIGITAL AUDIO DATA
`
`FIELD OF THE INVENTION
`
`The present invention relates to an apparatus and method
`for digitally compressing and decompressing audio signals.
`More specifically, the present invention relates to an appa-
`ratus and method for the lossless compression and decom-
`pression of digital audio data.
BACKGROUND OF THE INVENTION
`
Sampled digital audio data, from sources such as speech
or musical instruments, particularly in the form of linear
Pulse Code Modulation (PCM) samples, tends to include a
high degree of redundancy because of the high degree of
dependence between successive sample values.
With the proliferation of multimedia applications, several
compression/decompression algorithms have been promoted.
U.S. Pat. No. 4,396,906 to Weaver describes a
system which includes means for digitizing analog signals,
for compression filtering the digital samples, and for Huffman
encoding the compressed digital samples for recording
or transmission. The U.S. Pat. No. 4,396,906 patent also
describes a receiving system which includes a Huffman
decoder, a digital reconstruction filter and means for converting
the decoded, reconstructed digital signals back to
analog form. A similar system is described in an article by U.
E. Ruttimann et al., entitled "Compression of the ECG by
Prediction or Interpolation and Entropy Encoding", IEEE
Transactions on Biomedical Engineering, Vol. BME-26, No.
11, November 1979, pp. 613-623. Another system is
described in an article by K. L. Ripley et al., entitled "A
Computer System for Capturing Electrocardiographic
Data", Proc. Comput. Cardiol., 1976, pp. 439-445.
`To achieve constant compression rates, however, existing
`schemes have sacrificed audio integrity, losing some of the
`information contained in the original audio signal. There are
`lossless compression algorithms which have been used to
`compress text and data files with completely accurate recov-
`ery of the primary data upon decompression. These
`techniques, however, are optimized for text and data and are
`only marginally effective in compressing audio data.
`Some methods of audio compression are based on psy-
`choacoustics. Such perceptual coding algorithms drop psy-
`choacoustically imperceptible audio information. While
`acceptable for most consumer audio delivery formats, such
`as MiniDisc, DAT and CD-ROMs, such an approach is
`inadequate for professional audio production, where mate-
`rial may go through multiple iterations of compression and
`decompression before being mastered onto the final delivery
medium. Any loss of audio information is compounded with
each iteration, causing a serious compromise in the audio
integrity of the finished product. There clearly exists a need
for a truly lossless, realtime compression technology.
`SUMMARY OF THE INVENTION
`
`
`The present invention provides an apparatus and method
`of compressing and decompressing digital audio data in
`realtime, while being lossless.
`In accordance with the present invention, digital audio
`samples are first applied to a digital compression filter or
`predictor in order to reduce the correlation between samples,
`while simultaneously reducing the average variance over a
block of samples. The error output of the predictor, i.e., the
difference between the actual sampled value and the predicted
value for that sample, is then provided to an entropy
encoder, which can be based on a Huffman or arithmetic
coding scheme, to encode digital audio samples into a
compressed form using code words of varying length.
An object of the present invention is to provide a highly
efficient and compact way of mapping the statistics of the
actual audio signal, on a block basis, in order to select the
optimum encoding table from a number of different preselected
tables which have been tailored to different probability
density functions (PDFs) of the predictor error. It has
been discovered that the long-term statistics of the prediction
error of commonly used predictors follow a Laplacian
distribution. For each block of audio data, the entropy
encoder of the present invention selects one of a number of
encoding tables, each of which closely matches a Laplacian
distribution with a given variance. The encoding tables
correspond to a series of Laplacian distributions whose
variances are in a geometric progression.
While an ideal matching to each PDF would suggest a
different probability for each value, such an implementation,
for a 16-bit signal, would require 64K entries for each
encoding table. In accordance with the present invention, the
range of possible values is divided into sub-ranges or "bins"
with boundaries at integer powers of two, with each bin
having one entry in each encoding table. As a result, the
number of entries per table is only n+1, where n is the
number of bits of the predictor error signal to be encoded.
Each Huffman encoding table includes a prefix code for each
of the n+1 bin entries. Each Huffman table will yield a
minimum coding length for a different bin and as such will
yield a different coding cost depending on the variance of the
predictor error values over a frame. The Huffman table
having the lowest coding cost for each frame is selected and
used to encode the predictor error values for that frame.
Another object of the present invention is to provide an
efficient and flexible compressed data file format. In accordance
with the present invention, compressed audio data is
organized in blocks having a header and a body of user data,
i.e., the encoded audio data. The start of each block is
delimited by a unique sync word in the header. The
variable-length codes in the encoding tables are of such a
format that they can be combined in any order without
forming the same pattern as the unique code that is used for
the sync word in the block header. This feature adds a level
`of error detection making it possible to prevent the propa-
`gation of an error beyond the frame in which the error
`occurred. The use of a unique sync word also allows for easy
`synchronization of a random access system fetch to an
`arbitrary location in a compressed audio data file and for
`subsequently quickly initializing and synchronizing the
`decoding process of the following frame.
`The block header structure in accordance with the present
`invention also provides a means of rapidly jumping forward
`to the following block header, or backwards to the preceding
`block header, by indicating the size of the compressed block.
`Each block header also optionally includes a Sample
`Address Count (SAC) element to help identify the current
`block of data during a random access.
The block header data can also be stored in a separate file
that can be used to facilitate the search for a specific location
in a compressed audio file. This feature is particularly
helpful in conjunction with lossless compression, which
inherently provides a variable rate of compression and thus
prohibits the use of a simple search or seek routine through
a linearly mapped storage medium. In accordance with the
present invention, the contents of each block are organized so
as to add only minimally to the overhead of the compressed
data.
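As an illustration, such a separate header file can be reduced to one (byte offset, first sample address) pair per block, located by binary search. The layout below is hypothetical; the patent does not specify the file format:

```python
import bisect

def build_index(block_headers):
    """block_headers: iterable of (byte_offset, first_sample_address)
    pairs, one per compressed block, in file order.
    Returns two parallel lists for fast seeking."""
    offsets = [off for off, _ in block_headers]
    starts = [start for _, start in block_headers]
    return offsets, starts

def seek_block(starts, target_sample):
    """Index of the block whose first sample address is the largest
    one not exceeding target_sample."""
    return bisect.bisect_right(starts, target_sample) - 1
```

With 1152-sample frames, for example, sample 2000 falls in the second block (index 1), and its byte offset can then be read from the parallel list.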
`
With the present invention, compression ratios better than
3:1 are possible, depending on the uncompressed audio
source material, while the reconstructed output data is
bit-for-bit identical to the original uncompressed audio data.
As such, unlike with known methods and apparatus, audio
data can go through an unlimited number of compression
and decompression cycles with no distortion or loss of data
integrity.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
FIG. 1 is a block diagram of a lossless audio data
compression encoder in accordance with the present invention.

FIG. 2 is a block diagram of a decoder of compressed
audio data in accordance with the present invention.

FIG. 3 is a graph of the probability density function of the
prediction error of a predictor of the encoder of the present
invention.

FIG. 4 is a flow-chart of a method for selecting a Huffman
encoding table in accordance with the present invention.

FIGS. 5(A) and 5(B) are flow-charts of two methods for
determining the distribution of prediction error values over
a frame of audio data.
`
FIG. 6 is a flow-chart of a method for determining the
`coding cost of encoding a frame of audio data using a
`particular Huffman encoding table.
`FIG. 7 shows the organization of compressed audio
`information in accordance with the present invention.
`DETAILED DESCRIPTION OF THE DRAWING
`
FIG. 1 shows a block diagram of an audio data compression
encoder in accordance with the present invention. In the
exemplary embodiment of the encoder of FIG. 1, digital
audio data organized as a serial sequence of audio samples
grouped in frames is serially input to the encoder at an input
101. A frame of input digital audio samples is any suitable
sequence of audio samples from which the encoder extracts
relevant statistics (described below) in order to optimize the
coding process. A frame of 1152 samples (which is the same
length used in the MPEG L-II audio coding process) provides
a good compromise between coding delay and the
ability to track varying audio statistics. The coding efficiency
of the encoder of the present invention has been
found to be relatively constant over a range of frame lengths
of 512 to 2048 samples. For frame lengths below 512
samples, framing overhead becomes a factor in reducing
coding efficiency. For frame lengths longer than 2048
samples, coding efficiency for rapidly changing input audio
sample streams will tend to suffer.
Upon input to the encoder of the present invention, the
input audio data is provided first to a predictor (or prediction
filter) 102. Using previous audio samples, the predictor 102
generates a prediction of the current audio sample and
provides at its output the difference between the actual
sample and the predicted value, also referred to as the
prediction error. The predictor 102 thus serves to remove the
redundancy between consecutive audio samples, thereby
narrowing the probability density function of the prediction
error about zero. The predictor 102 can also be thought of as
a filter that is optimized to output only the unpredictable
components of the input audio signal. The general expression
for the function of a predictor is as follows:
`
      e(n) = x(n) − Σ_{j=1}^{N} h_j · x(n−j)

where x(n) are the input audio sample values, e(n) are the
corresponding prediction error sample values and h_j are the
prediction filter coefficients.
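In code, the expression above reads as follows. This is a minimal sketch, not the patent's implementation; samples before the start of the stream (n − j < 0) are assumed to be zero, a start-up convention the text does not address:

```python
def prediction_error(x, h):
    """e(n) = x(n) - sum_{j=1}^{N} h_j * x(n-j), with h = [h_1, ..., h_N].

    Out-of-range history (n - j < 0) is treated as zero (an assumption)."""
    return [x[n] - sum(h[j - 1] * x[n - j]
                       for j in range(1, len(h) + 1) if n - j >= 0)
            for n in range(len(x))]
```

A linear ramp, for example, is predicted exactly by the second-order rule x(n) = 2·x(n−1) − x(n−2), so its error samples settle to zero after the first two samples.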
Various implementations of the predictor 102 can be used
in the encoder of the present invention. The filter coefficients
h_j can either be fixed, as in a simple finite impulse response
(FIR) filter, or can vary as a function of the original signal
x(n), as in a forward adaptive prediction filter, or as a
function of the predicted signal x'(n), as in a backward
adaptive prediction filter. The predictor 102 can also be
implemented as an array of fixed FIR filters with a selector
to select the filter providing the best prediction error for a
given frame. In this case, however, a reconstruction filter
must be informed of which filter was used to encode a
particular block of audio data. This information can be
provided in a block header. The predictor 102 can also be
designed as a hybrid fixed FIR/adaptive implementation;
i.e., the predictor 102 can switch between a fixed FIR filter
or an adaptive filter to process a given block of audio data
depending on which of the two would yield a better prediction.
As in the case of multiple fixed FIR filters, some
indication must be provided to the reconstruction filter as to
whether the fixed FIR or adaptive filter was used to encode
a given block of audio data. It should be clear that there is
no need to provide any such information in a single filter
implementation (whether fixed coefficient or adaptive), in
which there is no switching between different filters.

An FIR filter with fixed coefficients has the advantage of
being implementable on a DSP with few instructions.
Although the adaptive implementations will provide better
prediction performance and thus better compression, they
are more complex, requiring more instructions and storage
space. In an exemplary embodiment of the present
invention, the predictor 102 is implemented as a dual-FIR
design in which the predictor selects between two different
fixed FIR filters depending on which provides better performance.
The first filter has three coefficients: h_1 = 3, h_2 = −3
and h_3 = 1; while the second filter has two coefficients:
h_1 = 1.5 and h_2 = −0.5. A bit in the header of each block of
compressed audio data is used to indicate to the decoder the
filter that was used to encode the audio data.
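A sketch of the dual-FIR choice follows. The selection metric (sum of absolute prediction errors over the frame) is an assumption; the patent says only that the better-predicting filter is chosen and signalled by a header bit:

```python
def frame_errors(x, h):
    # e(n) = x(n) - sum_j h[j-1] * x(n-j); missing history treated as zero
    return [x[n] - sum(h[j - 1] * x[n - j]
                       for j in range(1, len(h) + 1) if n - j >= 0)
            for n in range(len(x))]

def choose_filter(frame):
    """Return 0 or 1: the index of the fixed FIR filter of the exemplary
    embodiment ([3, -3, 1] or [1.5, -0.5]) giving the smaller total
    absolute prediction error over the frame (assumed metric)."""
    filters = [[3, -3, 1], [1.5, -0.5]]
    costs = [sum(abs(e) for e in frame_errors(frame, h)) for h in filters]
    return costs.index(min(costs))
```

A smooth quadratic run favors the third-order filter, which predicts polynomials up to degree two exactly, while an alternating noise-like run favors the gentler second-order filter.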
As shown in FIG. 1, the prediction error output of the
predictor 102 is provided to an entropy coding block comprised
of a best table selector 103, a compact Huffman tables
dictionary 104 and compact Huffman weight tables 105. For
each frame of input audio data, the best table selector 103
selects one of a plurality of Huffman tables stored in the
compact Huffman tables dictionary 104 on the basis of a
minimum cost search. In other words, the table selector 103
selects that Huffman table which, when used to encode the
current frame of error samples, will yield the most compact
encoded representation. The best table selector 103 performs
the minimum cost search by using the compact Huffman
weight tables 105 to determine, for each entry or "bin" in
each Huffman table, an encoding "cost" in terms of the
number of code bits needed to represent the prediction error
samples falling within each bin. The operation of the best
table selector 103 and the contents of the Huffman code and
weight tables will be described in greater detail further
below.
`Once the selector 103 has selected the best Huffman table
`for a given frame, the prediction error values of the frame are
`encoded by a Huffman coding and frame coding block 106.
`
The coding block 106 encodes each prediction error sample
with a corresponding codeword stored in the Huffman table
selected by the selector 103 from the table dictionary 104. In
addition to encoding each error sample with a Huffman table
codeword, the coding block 106 also forms each frame of
encoded error samples into a block by appending thereto a
header which includes a unique block sync bit pattern, an
indication of the Huffman table used to encode the error
samples in that block and other information described further
below. As such, the coding element 106 generates at its
output blocks of compressed audio data which can then be
stored, transmitted, etc. and later decoded to recreate the
original uncompressed audio data stream. To facilitate storage
and retrieval, the coding block 106 appends a number of
bits to the end of the encoded audio data bitstream of each
block in order to align all consecutive block headers on byte
boundaries or a multiple thereof. In the exemplary embodiment
of FIG. 1, block headers are aligned on 32-bit (or 4-byte)
boundaries in order to accommodate many current
generation processors which handle 32-bit data. The format
and contents of each block of encoded audio data will be
described in greater detail below.
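The alignment arithmetic is simple; a sketch (the values of the filler bits themselves are not specified here):

```python
def filler_bits(payload_bits, align_bits=32):
    """Number of bits appended after a block's payload so the next block
    header starts on an align_bits boundary (32 bits, i.e. 4 bytes, in
    the exemplary embodiment)."""
    return (align_bits - payload_bits % align_bits) % align_bits
```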
FIG. 2 shows a decoder in accordance with the present
invention for decoding the audio data encoded with the
encoder of FIG. 1 to generate an exact recreation of the
original audio data. The encoded data, which is provided
serially to the decoder, for example from a storage device or
by transmission, is first processed by a frame decoding block
202. The decoding block 202 first scans the incoming bit
stream for a valid sync pattern delimiting the start of a block.
By ensuring that each block starts at a 32-bit boundary, the
process of scanning the incoming bit stream for a sync
pattern is appreciably accelerated. Once a sync pattern has
been detected, the decoding block 202 determines, from the
header information, the Huffman table that was used to
encode the frame of compressed audio data within the block
currently being decoded. The decoding block 202 also reads
any ancillary information included in each block of incoming
data.
The frame decoder 202 outputs to a Huffman decoder 203
the compressed audio data and the identity of the encoding
Huffman table for each block of compressed audio data
received. Using the identity of the encoding Huffman table,
the Huffman decoder 203 retrieves the appropriate Huffman
table from a Huffman tables dictionary and uses the retrieved
table to convert the encoded bitstream of compressed audio
data in the current block back into prediction error samples.
The decoded prediction error samples are then provided to
an inverse predictor (or inverse prediction filter) 205 which
adds each error sample to a corresponding predicted sample
value to recreate each actual audio data sample. The operation
of the inverse predictor will not be described in further
detail since it should be well known to a person of ordinary
skill.
`
`The operation of the encoder of the present invention will
`now be described in greater detail.
It has been determined that the prediction error output of
the predictor 102 of the encoder of FIG. 1 exhibits a
Laplacian probability density function (pdf) expressed as
follows:

      p(e) = A · e^(−k·|e−μ|/σ),  where  A = k / (2σ)

and where σ is the estimated standard deviation of the pdf,
μ is the estimated mean and k is a parameter which can be
used to tailor the pdf to the type of audio signal represented.
For example, it has been determined that a value of k of
approximately 1.1 minimizes the mean quadratic error
between the model and the actual measured signal for music,
whereas a value of 1.7 is better suited for voice. A continuous
representation of the probability density function of the
prediction error is shown as the smooth curve in the graph
of FIG. 3.
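As a sketch, the model density can be evaluated directly; the normalization A = k/(2σ) makes the density integrate to one:

```python
import math

def laplacian_pdf(e, sigma, mu=0.0, k=1.1):
    """Model pdf of the prediction error:
    p(e) = A * exp(-k * |e - mu| / sigma) with A = k / (2 * sigma).
    Per the text, k of about 1.1 suits music and about 1.7 suits voice."""
    A = k / (2.0 * sigma)
    return A * math.exp(-k * abs(e - mu) / sigma)
```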
In accordance with the encoding method of the present
invention, prediction error values that have a high probability
of occurrence are encoded using shorter codes, while
prediction error values with a low probability of occurrence
are encoded using longer codes. In other words, the encoding
method of the present invention tracks the entropy of the
input audio signal by assigning codes of length inversely
proportional to the probability of occurrence of the prediction
error value to which each code is assigned.

In an exemplary embodiment in which the input audio
signal is quantized into 16-bit samples, each sample could
take on a value between −32768 and +32767. The probability
of occurrence of each value is different. This is even more
true of the prediction error of a prediction filter operating on
the input audio samples. The most probable prediction error
values will be grouped around zero, with the probability
density exponentially decreasing for larger values, as shown
by the Laplacian distribution of FIG. 3. In an ideal system,
a different codeword length can be selected for each of the
2^16 possible values of the prediction error samples. Such a
system, however, would require much memory in order to
store all the possible code values and require much processing
time in order to select the optimal code for each sample.
The method and apparatus of the present invention drastically
reduce the number of different codeword entries in
each Huffman table while closely tracking the probability of
occurrence of each error sample value over the entire range
of possible error sample values. In accordance with the
present invention, error sample values are grouped into
groups of values or "bins". In each Huffman table, each bin
has a corresponding codeword prefix entry. In other words,
all error sample values falling within the same bin will have
the same codeword prefix. The prefix is placed before a
suffix representing the magnitude of the error sample value.
Moreover, all error sample values falling within the same bin
will have the same encoded length.

The bins are defined by sub-dividing the entire range of
error sample values into unequal-size sub-ranges having
boundaries at successive powers of 2. This sub-division is
illustrated in FIG. 3 by the discrete model of the Laplacian
probability density function. Furthermore, since the probability
density function of the error sample distribution is
essentially symmetrical about zero, a further simplification
is achieved by grouping positive and negative values with
the same absolute value into the same bin. In the exemplary
embodiment, the first bin is assigned to the value 0, the
second bin to the values −1 and +1, the third bin to the values
−3, −2, +2 and +3, the fourth bin to the values −7 to −4 and
+4 to +7, etc., with the second-to-last bin assigned to the
values −32767 to −16384 and +16384 to +32767 and the last
bin assigned to the value −32768. It should be clear that the
definition of bins can be readily adapted to any quantization
`
resolution such as 8-, 18- or 20-bit audio. For a quantization
resolution of n bits, the number of bins, and thus the number
of entries in each Huffman table, will be n+1. Thus for 16-bit
audio, each Huffman table will have 17 entries. It should be
noted that the sizes of selected bins can be varied to fine-tune
the encoding process. For example, the bin for absolute
values between 128 and 255 can be sub-divided into one bin
from 128 to 195 and a second bin from 196 to 255. Although
sub-dividing bins provides even better fine-tuned code
lengths, the cost is slightly greater processing complexity
and memory requirements.
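The power-of-two bin layout maps neatly onto the binary length of the magnitude. The sketch below also assembles a codeword as prefix plus suffix; the exact suffix layout (a sign bit followed by the in-bin position) is an assumption, since the text says only that the suffix carries position and sign:

```python
def bin_of(v):
    """Bin 0 holds the value 0; bin b >= 1 holds |v| in
    [2**(b-1), 2**b - 1]. For n-bit audio this yields n+1 bins
    (17 bins for 16-bit audio; -32768 lands in the last bin)."""
    return abs(v).bit_length()

def encode_sample(v, prefixes):
    """prefixes: per-bin prefix codes as '0'/'1' strings (hypothetical).
    Assumed suffix: one sign bit, then b-1 bits giving the position of
    |v| within bin b, so all samples in a bin share one encoded length."""
    b = bin_of(v)
    code = prefixes[b]
    if b > 0:
        code += '1' if v < 0 else '0'
        if b > 1:
            code += format(abs(v) - 2 ** (b - 1), '0%db' % (b - 1))
    return code
```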
As the probability density function of an audio signal
varies from frame to frame, the encoder of the present
invention tracks the varying statistics by selecting, for each
frame, a Huffman table that is optimal for the probability
density function of each frame. For predictable audio
signals, most prediction error sample values will fall in the
first few bins about zero, with very few, if any, values falling
in the higher bins. For unpredictable audio signals, such as
noisy signals, the prediction error sample values will be
more evenly spread over all bins. Although the probability
density function will vary from frame to frame, it will
nonetheless have a Laplacian form and will be centered
substantially about zero. What will vary appreciably,
however, is the variance of the Laplacian probability density
function. To account for the varying statistics of the input
audio signal, an exemplary embodiment of the encoder of
the present invention uses 15 different Huffman tables, with
each table corresponding to a different variance of the
probability density function. A listing of the Huffman tables
is included in the Appendix hereto. The 15 Huffman tables
used in the exemplary embodiment are geometrically spaced
to correspond to standard deviations ranging from 1 to 5000.
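A sketch of such a geometric spacing follows. The endpoints (1 and 5000) and the table count come from the text; the exact spacing rule beyond "geometrically spaced" is an assumption:

```python
def table_sigmas(n_tables=15, sigma_min=1.0, sigma_max=5000.0):
    """Geometrically spaced standard deviations, one per Huffman table,
    covering sigma_min .. sigma_max with a constant ratio between
    consecutive tables."""
    ratio = (sigma_max / sigma_min) ** (1.0 / (n_tables - 1))
    return [sigma_min * ratio ** i for i in range(n_tables)]
```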
The contents of each Huffman table will now be described
with reference to the Huffman tables listed in the Appendix.
In a given Huffman table, each of the 17 bins 0-16
(assuming 16-bit quantized audio) is assigned a unique
prefix code. The encoded representation of each error
sample is comprised of the prefix for the bin in which the
sample falls and a suffix. The suffix indicates the relative
position of the sample within the bin and the sign of the
sample. In accordance with the present invention, the prefix
codes for each table are selected so that the length of
encoded error samples within a bin corresponds inversely to
the probability that an error sample falls within the bin.
It will be noticed from the Huffman tables listed in the
Appendix that each prefix code includes at least one "0" bit.
This guarantees that the bitstream generated by the Huffman
coder will never include a bit pattern that can be misinterpreted
as the sync pattern (in the block header), which
consists of 32 consecutive "1" bits.
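This property can be checked mechanically. The sketch below, using made-up codewords rather than the patent's tables, verifies that no concatenation of codewords can contain the 32-one sync run, because every code holds at least one "0" and is at most 16 bits long:

```python
def max_ones_run(bits):
    """Longest run of consecutive '1' characters in a bit string."""
    run = best = 0
    for b in bits:
        run = run + 1 if b == '1' else 0
        best = max(best, run)
    return best

def sync_safe(codes, sync_len=32):
    """True if no concatenation of the given codewords can contain
    sync_len consecutive '1' bits. Since every code must contain a '0',
    a run can span at most the trailing '1's of one code plus the
    leading '1's of the next."""
    if any('0' not in c for c in codes):
        return False
    lead = max(len(c) - len(c.lstrip('1')) for c in codes)
    trail = max(len(c) - len(c.rstrip('1')) for c in codes)
    inner = max(max_ones_run(c) for c in codes)
    return inner < sync_len and trail + lead < sync_len
```

With prefixes no longer than 16 bits, each containing a "0", the worst case across a codeword boundary is 15 + 15 = 30 consecutive ones, safely below 32.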
It should also be observed that the maximum length of any
prefix in any table does not exceed 16 bits. This feature
makes it possible to store each table entry in no more than
two bytes of memory. Thus, for the exemplary embodiment
of the encoder of the present invention for 16-bit audio
samples, the entire Huffman tables dictionary 104 can be
stored in 510 bytes (i.e., 15 tables × 17 entries × 2 bytes/entry).
FIG. 4 is a flow-chart describing the operation of the best
table selector 103. As discussed above, the selector 103
selects, for each frame of error sample values, the one
Huffman table which can be used to encode the frame most
efficiently.

Operation of the best table selector begins at step 401, in
which a set of bin hit counters and a sample counter N are
initialized to 0. There are 17 bin hit counters, one for each
bin, that are used to count the number of prediction error
samples falling within each bin. At step 402, the table
selector 103 fetches an error sample output from the predictor
102. At step 403, a search of the 17 bins is conducted
to determine in which bin the current error sample falls. In
step 403, the bin hit counter of the bin in which the current
error sample falls is incremented. Step 403 is described in
greater detail below in conjunction with FIGS. 5(A) and
5(B). At step 404, the sample counter is incremented and at
step 405, it is determined whether the sample count N is
equal to the frame length (i.e., the number of error samples
in each frame). If the sample count N is not yet equal to the
frame length, operation loops back to step 402 and steps
402-405 are repeated. If it is determined in step 405 that the
sample count N is equal to the frame length, operation
continues to step 406. At this point, a frame of error samples
has been read in and a distribution of the error samples over
the 17 bins has been determined in terms of the number of
error samples falling within each bin, as represented by the
contents of the 17 bin hit counters.
At step 406, a set of cost variables, one variable for each
of the 15 Huffman tables, and a table pointer T are initialized
to zero. Operation then proceeds to step 407, in which the
coding cost of a Huffman table (indexed by the pointer T) is
determined using the set of 17 bin hit counts determined
above. The coding cost associated with a Huffman table is
the number of bits required to encode the current frame
using that table. Step 407 is described more fully below in
conjunction with FIG. 6. Operation then proceeds to step
408, in which it is determined whether T=0 or whether the
coding cost of the table currently under consideration is less
than a variable MINCOST. MINCOST is the lowest coding
cost calculated so far among the tables that have been
considered. If it is determined in step 408 that T=0 or that the
coding cost of the currently considered table is less than
MINCOST, operation proceeds to step 409, in which MINCOST
is set to the coding cost of the currently considered
table and a variable BESTTABLE, indicating the best table
determined so far, is set to T, the current table pointer value.
Operation then proceeds to step 410, in which the pointer T
is incremented. If it was determined at step 408 that T is not
equal to 0 and that the cost of the currently considered table
is not less than MINCOST, then operation bypasses step
409 and proceeds directly to step 410. After step 410, it is
determined at step 411 whether the table pointer T is equal
to the number of tables. If not, operation loops back to step
407 and steps 407-411 are repeated using another Huffman
table. If T is equal to the number of tables, the search for the
best table for the current frame is considered finished, with
the best table being indicated by the variable BESTTABLE.
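The whole search reduces to a weighted histogram and an argmin. A compact sketch, where the per-bin weights are hypothetical bit costs standing in for the compact Huffman weight tables 105:

```python
def best_table(bin_hits, weight_tables):
    """FIG. 4 in miniature: the cost of table T is the sum over bins of
    bin_hits[bin] * weight_tables[T][bin] (encoded bits per sample in
    that bin); the first table attaining the minimum cost wins."""
    mincost = besttable = None
    for t, weights in enumerate(weight_tables):
        cost = sum(h * w for h, w in zip(bin_hits, weights))
        if besttable is None or cost < mincost:
            mincost, besttable = cost, t
    return besttable, mincost
```

A frame whose hits are concentrated near zero picks the table with cheap low bins, matching the narrow-variance case described above.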
FIG. 5(A) is a flow-chart of a first alternative procedure
for carrying out step 403, described above, in which the bin
in which the current error sample value falls is determined