`WORLD INTELLECTUAL PROPERTY ORGANIZATION
`International Bureau
`INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)
(11) International Publication Number: WO 00/51243 A1

(51) International Patent Classification 7: H03M 7/40, H04N 7/50
`
(43) International Publication Date: 31 August 2000 (31.08.00)

(21) International Application Number: PCT/KR99/00764

(22) International Filing Date: 11 December 1999 (11.12.99)
`
`(81) Designated States: AU, BR, CA, CN, DE, ES, GB, IN, JP,
`RU, US, European patent (AT, BE, CH, CY, DE, DK, ES,
`FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE).
`
`Published
`With international search report.
`Before the expiration of the time limit for amending the
`claims and to be republished in the event of the receipt of
`amendments.
`
(30) Priority Data: 1999/6157, 24 February 1999 (24.02.99), KR
`
(71)(72) Applicant and Inventor: YOU, Soo, Geun [KR/KR]; Jamwon Hansin Apt. 1-1103, 56-3, Jamwon-dong, Seocho-gu, Seoul 137-030 (KR).

(72) Inventor; and
(75) Inventor/Applicant (for US only): PARK, Jung, Jae [KR/KR]; 6516, Taepyeong 1-dong, Sujeong-gu, Seongnam, Kyunggi-do 461-191 (KR).

(74) Agent: PARK, Lae, Bong; 4F TLBS B/D, 464-1, Kunja-dong, Kwangjin-gu, Seoul 143-150 (KR).
`
`(54) Title: A BACKWARD DECODING METHOD OF DIGITAL AUDIO DATA
`
[Front-page drawing: block diagram with reference numerals 100, 120, 130, 140; input label "MPEG Audio Bitstr"]
`
`(57) Abstract
`
This invention provides a method of backward decoding compressed digital audio data into analog audio data reversed in time. The method according to this invention comprises the steps of: locating a header of a last frame of the compressed digital audio data; dequantizing a plurality of data blocks constituting the frame based on information contained in the located header; extracting time signals of each frequency subband from the dequantized data blocks while reducing discontinuities between the dequantized data blocks; and synthesizing the extracted time signals of all subbands backward into a real audio signal reversed in time. This invention therefore enables the decoded analog signal to be recorded on both tracks of a magnetic tape simultaneously while the tape travels in one direction, with little increase in computation load and memory size, resulting in high-speed recording.
`
`
`HULU LLC
`Exhibit 1012
`IPR2018-01195
`
`
`
`Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT.
`
`FOR THE PURPOSES OF INFORMATION ONLY
`
AL  Albania
AM  Armenia
AT  Austria
AU  Australia
AZ  Azerbaijan
BA  Bosnia and Herzegovina
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Cote d'Ivoire
CM  Cameroon
CN  China
CU  Cuba
CZ  Czech Republic
DE  Germany
DK  Denmark
EE  Estonia
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GH  Ghana
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IL  Israel
IS  Iceland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgyzstan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakstan
LC  Saint Lucia
LI  Liechtenstein
LK  Sri Lanka
LR  Liberia
LS  Lesotho
LT  Lithuania
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
MK  The former Yugoslav Republic of Macedonia
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
MX  Mexico
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SG  Singapore
SI  Slovenia
SK  Slovakia
SN  Senegal
SZ  Swaziland
TD  Chad
TG  Togo
TJ  Tajikistan
TM  Turkmenistan
TR  Turkey
TT  Trinidad and Tobago
UA  Ukraine
UG  Uganda
US  United States of America
UZ  Uzbekistan
VN  Viet Nam
YU  Yugoslavia
ZW  Zimbabwe
`
`
`
`
`DESCRIPTION
`
`A BACKWARD DECODING METHOD OF DIGITAL AUDIO DATA
`
`1. Technical Field
`
The present invention relates to a method of decoding compressed digital audio data backward and, more particularly, to a method of backward decoding MPEG (Moving Picture Experts Group) encoded audio data into an analog audio signal with little increase in computation load and memory size.
`
2. Background Art

Digital audio signals are in general more robust to noise than analog signals, and thus their quality is not subject to degradation during copying or transmission over a network. Digital audio signals are, moreover, transmitted more rapidly and stored on storage media of smaller capacity owing to the effective compression methods recently developed.
`
Many compression methods have been proposed to effectively encode audio signals into digital data. The MPEG (Moving Picture Experts Group) audio coding schemes have been used as the standard in this area. The MPEG audio standards standardized by ISO (International Organization for Standardization), namely MPEG audio layer-1, layer-2, and layer-3, were devised to encode high-quality stereo audio signals with little or no perceptible loss of quality. They have been widely adopted in digital music broadcasting and have also been used with the MPEG video standards to encode multimedia data. In addition to MPEG-1, standard specifications for digital environments have been proposed: MPEG-2 includes standards on compression of multimedia data, and standards for object-oriented multimedia communication are included in MPEG-4, which is in progress.
`
MPEG-1 consists of five coding standards for compressing and storing moving picture and audio signals in digital storage media. The MPEG audio standard includes three audio coding methods: layer-1, layer-2, and layer-3. The MPEG audio layer-3 (hereinafter referred to as "MP3") algorithm uses a much more refined approach than layer-1 and layer-2 to achieve a higher compression ratio and sound quality, as will be described briefly below.
`
MPEG audio layer-1, layer-2, and layer-3 compress audio data using perceptual coding techniques that exploit the way the human auditory system perceives sound waves. To be specific, they take advantage of the human auditory system's inability to hear quantization noise under conditions of auditory masking. "Masking" is a perceptual property of the human ear which occurs whenever the presence of a strong audio signal makes weaker audio signals in its temporal or spectral neighborhood imperceptible. Suppose that a pianist plays the piano in front of an audience. While the pianist is not touching the keyboard, the audience can hear the trailing sounds, but at the instant a key is struck the audience can no longer hear them. This is because, in the presence of masking sounds, i.e., the newly generated sounds, the trailing sounds that fall inside the frequency bands centered on the masking sound, the so-called critical bands, and whose loudness is lower than a masking threshold are not audible. This phenomenon is called the spectral masking effect. The masking ability of a given signal component depends on its frequency position and its loudness. The masking threshold is low in the sensitive frequency bands of the human ear, i.e., 2 kHz to 5 kHz, but high in other frequency bands.
`
There is also a temporal masking phenomenon in the human auditory system. That is, after hearing a loud sound, it takes a period of time before we are able to hear a new sound that is not louder than that sound. For instance, it takes 5 milliseconds before we can hear a new sound of 40 dB after hearing a sound of 60 dB for 5 milliseconds. This temporal delay also depends on the frequency band.
`
Based on a psychoacoustic model of the human ear, MP3 works by dividing the audio signal into frequency subbands that approximate critical bands and then quantizing each subband according to the audibility of quantization noise within that band, so that the quantization noise is inaudible owing to spectral and temporal masking.

The MP3 encoding process is described below in detail, step by step, with reference to FIGS. 1 and 2.
`
(1). Subband coding and MDCT (Modified Discrete Cosine Transform)

In the MP3 encoder, a PCM-format audio signal is first windowed and converted into spectral subband components via a filter bank 10, shown in FIG. 1, which consists of 32 equally spaced bandpass filters. The filtered bandpass output signals are critically sub-sampled at 1/32 of the sampling rate and then encoded.

A polyphase filterbank is, in general, used to cancel the aliasing of adjacent overlapping bands that would otherwise occur because of the low sampling rate at the sub-sampling step. As another method, an MDCT (Modified Discrete Cosine Transform) unit 20 and an aliasing reduction unit 30 are adopted to cancel the aliasing, thereby preventing deterioration of the quality.

Because the MDCT is essentially a critically sampled DCT (Discrete Cosine Transform), the input PCM audio signal can be reconstructed perfectly in the absence of quantization errors. Because quantization is carried out, however, discontinuities occur between transformed blocks.

For each subband, the number of quantization bits is allocated by taking into account the masking effect of neighboring subbands. That is, quantization and bit allocation are performed so as to keep the quantization noise in all critical bands below the masking threshold.
`
(2). Scaling

Samples in each of the 32 subbands are normalized by a scale factor such that the sample of the largest magnitude is unity, and the scale factor is encoded for use in the decoder. With the scaling process, the amplitude of the signal is compressed; therefore, the quantization noise is reduced and becomes inaudible owing to the psychoacoustic phenomenon.
`
(3). Huffman Coding

Variable-length Huffman codes are used to obtain a better compression rate for the quantized samples. Huffman coding is a form of entropy coding whereby redundancy reduction is carried out based on the statistical properties of the digital data. The principle behind Huffman coding is that short codewords are assigned to symbols having higher probability, while long codewords are assigned to symbols with lower probability. In effect, the average length of the encoded data is made as small as possible.

Let us consider an example for illustration. The quantized samples are 00, 01, 10, and 11, and their probabilities are 0.6, 0.2, 0.1, and 0.1, respectively. If codewords of a constant length of, say, 2 bits are used, the average codeword length is 2 bits (2×0.6 + 2×0.2 + 2×0.1 + 2×0.1 = 2). However, if variable-length codewords are used, i.e., 1 bit is assigned to 00, which has the highest probability, 2 bits to 01, which has the second highest probability, and 3 bits to each of 10 and 11, the average codeword length becomes 1.6 bits (1×0.6 + 2×0.2 + 3×0.1 + 3×0.1 = 1.6).
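The Huffman example above can be checked with a short sketch. It builds code lengths for the four symbols with a textbook Huffman construction and computes the probability-weighted average length. The function name `huffman_code_lengths` is hypothetical, and the construction shown is the generic algorithm, not the fixed code tables that MP3 actually uses.

```python
import heapq
from itertools import count


def huffman_code_lengths(probs):
    """Return {symbol: codeword length} for a generic Huffman code."""
    tie = count()  # tie-breaker so equal weights never compare the dicts
    heap = [(p, next(tie), {sym: 0}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)  # two least probable subtrees
        p2, _, b = heapq.heappop(heap)
        # Merging two subtrees makes every symbol in them one bit longer.
        merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]


probs = {"00": 0.6, "01": 0.2, "10": 0.1, "11": 0.1}
lengths = huffman_code_lengths(probs)
average = sum(probs[s] * lengths[s] for s in probs)
fixed = sum(probs[s] * 2 for s in probs)  # constant 2-bit codewords
print(lengths, average, fixed)
```

The average is the sum of length times probability over all symbols; no further division by the number of symbols is needed, since the probabilities already sum to one.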
`
In addition, in order to achieve a high compression rate, MP3 adopts a bit reservoir buffering technique whereby unused bits from frames whose coded data are relatively small are used when the encoder needs more bits than the average number of bits to code a frame.

After the above processes, the audio signal is formatted into a bitstream. FIG. 3 shows the arrangement of the various fields in a frame of an MP3-encoded bitstream.
`
Without data reduction, digital audio signals typically consist of 16-bit samples recorded at a sampling rate more than twice the actual audio bandwidth (e.g., 32 kHz, 44.1 kHz, or 48 kHz). For a two-channel stereo audio signal at a sampling rate of 44.1 kHz with 16 bits per sample, the bit rate is 16 × 44100 × 2 = 1,411,200 bits per second, or about 1.4 Mbps. By using MP3 audio coding, the original sound data can be encoded at a bit rate of 128 to 256 kbps. That is, 1.5 to 3 bits are needed per sample on average instead of 16 bits, and therefore MP3 makes it possible to shrink the original sound data from a CD-DA by a factor of about 12 without loss of sound quality.
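The bit-rate arithmetic above can be reproduced with a few lines. This is only a worked check of the figures in the text; the helper name `pcm_bit_rate` is hypothetical.

```python
def pcm_bit_rate(bits_per_sample, sample_rate_hz, channels):
    """Uncompressed PCM bit rate in bits per second."""
    return bits_per_sample * sample_rate_hz * channels


pcm = pcm_bit_rate(16, 44_100, 2)          # CD-DA stereo
samples_per_second = 44_100 * 2            # both channels together

for mp3_rate in (128_000, 256_000):
    bits_per_sample = mp3_rate / samples_per_second
    ratio = pcm / mp3_rate
    print(f"{mp3_rate} bps: {bits_per_sample:.2f} bits/sample, ratio {ratio:.1f}")
```

Under these assumptions the ratio at 128 kbps comes out at roughly 11, which is of the same order as the factor of about 12 quoted in the text.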
`
Despite these advantages, digital audio recorders and players are still in their infancy for several reasons, and analog audio recorders and players still make up the majority of the market. Accordingly, it would be commercially attractive if digital audio signals could be recorded on analog signal storage media such as magnetic tapes, because users could then enjoy digital audio without buying new digital audio recorders and players.

Digital audio data are first decoded and then recorded on either track of a magnetic tape on which a forward track and a backward track are provided. That is, when the tape travels in the forward (backward) direction, the audio signals are recorded on the forward (backward) track. After the audio signals have been recorded on the forward track, the tape begins to travel in the backward direction and the audio signals are recorded on the backward track. As a result, recording the digital audio signals on a magnetic tape takes the time of two tape travels.

For fast recording, it is possible to encode analog audio signals that were reproduced backward, and then to decode the encoded signals and record them on the tape during only one tape travel. However, this method has two weak points: it needs additional storage space for the encoded backward-reproduced signals on top of the encoded forward-reproduced signals, and it reproduces the audio signals imperfectly, because MP3 encoding exploits the masking phenomenon, so that a small-amplitude sound which precedes a large-amplitude sound in normal reproduction is suppressed when the backward-reproduced audio signal is encoded.
`
3. Disclosure of Invention

It is a primary object of the present invention to provide a method of backward decoding MPEG digital audio data into analog audio data, which enables the decoded analog signal to be recorded on analog signal storage media such as magnetic tapes at high speed with little increase in computation load and memory size.

To achieve this object, the present invention provides a method of backward decoding MPEG audio data into analog audio data, comprising the steps of: locating a header of a last frame of the compressed digital audio data; dequantizing a plurality of data blocks constituting the frame based on information contained in the located header; extracting time signals of each frequency subband from the dequantized data blocks while reducing discontinuities between the dequantized data blocks; and synthesizing the extracted time signals of all subbands backward into a real audio signal reversed in time.

With the method of backward decoding MPEG audio data according to the present invention, when MPEG audio data are to be recorded on a magnetic tape at high speed, the MPEG audio data can be decoded and recorded on both tracks of the magnetic tape simultaneously while the tape travels in one direction. The backward decoding method according to the present invention therefore enables fast recording of MPEG audio data on both tracks of the magnetic tape.
`
4. Brief Description of Drawings

The accompanying drawings, which are included to provide a further understanding of the invention, illustrate the preferred embodiment of this invention and, together with the description, serve to explain the principles of the present invention.

In the drawings:

FIGS. 1 and 2 are block diagrams showing an MPEG audio encoder;

FIG. 3 shows the arrangement of the various bit fields in a frame of MPEG audio data;

FIG. 4 is a block diagram showing an MPEG audio decoder;

FIG. 5 is a schematic diagram showing an illustration of the bit reservoir within a fixed length frame structure;

FIG. 6 is a schematic diagram illustrating the overlap of inverse-modified-discrete-cosine-transformed blocks;

FIG. 7 is a flow graph showing a synthesis filterbank;

FIG. 8 is a flowchart showing an algorithm implementing the synthesis filterbank of FIG. 7;

FIG. 9 is a block diagram of the flowchart of FIG. 8;

FIG. 10 is a flow graph showing a synthesis filterbank for backward decoding according to the present invention;

FIG. 11 is a flowchart showing an algorithm implementing the synthesis filterbank of FIG. 10; and

FIG. 12 is a block diagram of the flowchart of FIG. 11.
`
5. Modes for Carrying out the Invention

The preferred embodiments of the present invention will be described hereinafter in detail with reference to the accompanying drawings.

FIG. 4 shows a block diagram of an MP3 audio decoder to which an embodiment of the present invention is applied, comprising a demultiplexer 100 for dividing an MP3 audio bitstream into several data of different types; a side-information decoder 110 for decoding side-information contained in the bitstream; a Huffman-decoder 120 for Huffman-decoding the divided audio data; a dequantizer 130 for obtaining actual frequency energies from the Huffman-decoded data; an inverse MDCT (IMDCT) unit 140 for applying the IMDCT to the energies; and a synthesis filterbank 150 for synthesizing the subband values into PCM samples.

With reference to the MP3 audio decoder of FIG. 4, the method of backward decoding MP3-encoded audio data is described below step by step.
`
(1). Identifying Frame Header

The first step in the backward decoding of an MP3 bitstream is to find where decoding should start in the bitstream. In MPEG audio, frames are independent of each other, and consequently the first step is to locate a frame header in the bitstream, which requires knowing the frame length. All MPEG bitstreams are generally divided into separate chunks of bits called frames. There is a fixed number of frames per second for each MPEG format, which means that, for a given bit rate and sampling frequency, each input frame has a fixed length and produces a fixed number of output samples.

In order to obtain the actual frame length, it is necessary to locate a frame header in the bitstream and to get the required information from it, because the frame length depends on the bit rate and the sampling frequency. A header is located by searching for a synchronization bit-pattern marked within the header. However, this search can fail, because some audio data may contain the same bit pattern as the synchronization bit-pattern.

To alleviate this problem, on the assumption that neither the bit rate nor the sampling frequency changes within an MP3 audio clip, the demultiplexer 100 analyzes the first header in the stream and obtains from it the length of a frame having no padding bit. By using this frame length, the header of the last frame is located while traversing the MP3 audio clip from the end.

If a padding bit is added to a frame, the frame length is increased by 1 byte. That is, the frame length may change from frame to frame owing to the padding bit. Because it is uncertain whether the last frame has a padding bit, the search for the header of the last frame needs to examine whether the last frame header lies the frame length away from the end of the clip or one byte further away.
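The last-header search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the clip is assumed to be MPEG-1 Layer III with constant bit rate and sampling frequency and nothing appended after the last frame; the function names are hypothetical; the frame length uses the usual Layer III relation of 144 × bit rate / sampling frequency bytes plus one byte of padding; and the sync test checks the eleven leading one-bits of the header.

```python
def frame_length(bit_rate, sample_rate, padding):
    """Byte length of an MPEG-1 Layer III frame (padding is 0 or 1)."""
    return 144 * bit_rate // sample_rate + padding


def looks_like_header(data, pos):
    """True if a 4-byte header with the sync pattern could start at pos."""
    return (0 <= pos <= len(data) - 4
            and data[pos] == 0xFF
            and (data[pos + 1] & 0xE0) == 0xE0)


def find_last_header(data, base_length):
    """Try the two candidate offsets: no padding, then one padding byte."""
    for padding in (0, 1):
        pos = len(data) - base_length - padding
        # The padding flag is bit 1 of the third header byte; it must agree
        # with the frame length that was assumed for this candidate.
        if looks_like_header(data, pos) and ((data[pos + 2] >> 1) & 1) == padding:
            return pos
    return None


base = frame_length(128_000, 44_100, 0)   # 417 bytes without padding
plain = bytes([0xFF, 0xFB, 0x90, 0x00]) + bytes(base - 4)
padded = bytes([0xFF, 0xFB, 0x92, 0x00]) + bytes(base + 1 - 4)
print(find_last_header(plain + padded, base))
```

Checking that the padding flag matches the candidate offset reduces the chance of accepting audio data that merely imitates the sync pattern, which is the failure mode the text warns about.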
`
(2). Obtaining Side-information

After the frame header is found, the demultiplexer 100 divides the input MP3 audio bitstream into side-information describing how the frame was encoded, scale factors specifying the gain of each frequency band, and Huffman-coded data. The side-information decoder 110 decodes the side-information so that the decoder knows what to do with the data contained in the frame.

The number of bits required for MP3 encoding at equal sound quality depends on the acoustic characteristics of the samples to be encoded. The coded data therefore do not necessarily fit into a fixed-length frame in the coded bitstream. For this reason, MP3 uses a bit reservoir technique whereby bits may be borrowed from previous frames in order to provide more bits to demanding parts of the input signal. To be specific, the encoder donates bits to a reservoir when it needs less than the average number of bits to code a frame. Later, when the encoder needs more than the average number of bits to code a frame, it borrows bits from the reservoir. The encoder can only borrow, within limits, bits donated by past frames; it cannot borrow from future frames. On the decoder's side, the frame currently being decoded may include audio data belonging to frames that will be presented subsequently. The starting byte of the audio data for the current frame lies at most 511 bytes before that frame.

A 9-bit pointer is included in each frame's side-information that points to the location of the starting byte of the audio data for that frame, as shown in FIG. 5. That is, the audio data for the frame currently being decoded, i.e., the scale factors and Huffman-coded data, may be included in the data regions of previous frames that lie within a distance of 511 bytes from that frame. When MP3 audio data are decoded forward, if it is determined that the current frame contains data for subsequent frames, those data are kept until the subsequent frames are decoded. In backward decoding of MP3 audio data, on the other hand, when the current frame is decoded, it is checked whether decoding the current frame needs data contained in the preceding frames, and if so, the data are obtained by identifying the headers of the preceding frames and the data belonging to those frames.
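The look-back step for backward decoding can be sketched as below. The sketch assumes that the data region of every frame (the bytes after header and side-information) has already been extracted into a list, and that the 9-bit pointer of the current frame is available as a byte count; the names `gather_audio_data`, `data_regions`, and `pointer` are hypothetical.

```python
def gather_audio_data(data_regions, index, pointer):
    """Collect the bytes available to frame `index`.

    data_regions: per-frame data regions in stream order.
    pointer: value of the 9-bit pointer (0..511), i.e. how many bytes
             before this frame its audio data starts.
    """
    if not 0 <= pointer <= 511:
        raise ValueError("a 9-bit pointer cannot exceed 511")
    needed = pointer
    borrowed = b""
    j = index - 1
    # Walk back through preceding frames until enough bytes are collected.
    while needed > 0 and j >= 0:
        tail = data_regions[j][-needed:]
        borrowed = tail + borrowed
        needed -= len(tail)
        j -= 1
    if needed > 0:
        raise ValueError("pointer reaches before the start of the clip")
    # The side-information tells the decoder how many of these bytes
    # actually belong to the current frame; that step is omitted here.
    return borrowed + data_regions[index]


regions = [b"aaaa", b"bbb", b"cc"]
print(gather_audio_data(regions, 2, 5))   # borrows from two earlier frames
```

In forward decoding the same bytes would simply be left over in a buffer from earlier frames; in backward decoding they have to be fetched explicitly, which is why the preceding frame headers must be located first.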
`
(3). Huffman decoding

Once the audio data have been obtained, the Huffman decoder 120 starts to Huffman-decode the audio data (including the data contained in the preceding frames) based on the side-information and the Huffman trees that were constructed and used in the encoding process according to the data contents.

This step is the same as that of forward decoding. However, a frame is encoded in two granules (granule 0 and granule 1), and the data of granule 0 must be decoded in order to locate granule 1. Consequently, whereas in forward decoding the MP3-encoded data can be decoded sequentially from granule 0 to granule 1, in backward decoding both granules must be decoded at a time in order to output granule 1.
`
(4). Dequantizing and descaling

When the Huffman-decoder 120 has decoded the audio data, they have to be dequantized by the dequantizer 130 and descaled using the scale factors into real spectral energy values. For example, if the Huffman-decoded value is Y, the real spectral energy value is obtained by multiplying Y^(4/3) by the scale factors.

If the bitstream is a stereo signal, each channel can be transmitted separately in every frame, but transmission of the sum of and the difference between the two channels is often adopted to reduce the redundancy between them. If the bitstream was encoded in this way, the decoder has to perform stereo processing to recover the original two channels.
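Both operations in this step can be illustrated with a small sketch. It assumes that the sign of Y is preserved when the 4/3 power is applied to its magnitude, that all descaling is folded into a single hypothetical `gain` argument, and that the sum and difference channels are normalized by 1/√2 as in the usual MPEG mid/side convention; none of these details are spelled out in the text above.

```python
import math


def dequantize(y, gain):
    """Return sign(y) * |y|^(4/3) * gain for a Huffman-decoded value y."""
    return math.copysign(abs(y) ** (4.0 / 3.0), y) * gain


def sum_difference_to_left_right(mid, side):
    """Recover left/right from sum (mid) and difference (side) channels."""
    s = 1.0 / math.sqrt(2.0)
    return (mid + side) * s, (mid - side) * s


print(dequantize(8, 1.0))    # 8^(4/3) is 16, up to rounding
left, right = 0.5, -0.25
mid = (left + right) / math.sqrt(2.0)
side = (left - right) / math.sqrt(2.0)
print(sum_difference_to_left_right(mid, side))
```

The 4/3 power expands the non-uniform quantizer steps used by the encoder, so large values are spaced further apart than small ones.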
`
(5). IMDCT (inverse modified discrete cosine transform)

So far the signals have all been in the frequency domain, and to synthesize the output samples a transform is applied that is the reverse of the time-to-frequency transform used in the encoder.

In MPEG layer-3, the MDCT is used to obtain better frequency resolution than in the other layers. The MDCT is essentially a critically sampled DCT, implying that if no quantization had been done, the original signal would be reconstructed perfectly. However, because quantization is performed for each data block in the encoding process, discontinuities between data blocks inevitably occur. A single data block is the unit block of output samples of the decoder and corresponds to a granule in the inverse MDCT.

To avoid discontinuities between the granules, which would lead to perceptible noise and clicks, the inverse MDCT uses 50% overlap, i.e., every inverse-MDCT-transformed granule is overlapped with half of the previous transformed granule to smooth out any discontinuities.

To be specific, the IMDCT produces 36 output samples, and the second-half 18 samples of the previous granule are added to the first-half 18 samples of the current granule, as shown in FIG. 6. For backward decoding, the order in which the granules are added must be reversed, i.e., the second-half 18 samples of the current granule are added to the first-half 18 samples of the previously decoded granule. For the end frame, which is decoded first in the backward decoding process, the second granule of that frame is added to zeros or simply used without overlapping.
`
The IMDCT process in forward decoding is expressed by the following equation:

\[ x_i(n) = y_i(n) + y_{i-1}(n+18), \qquad 0 \le n < 18, \quad i = 1, 2, \ldots, 2N, \]

where $x_i(n)$ is a target output sample, $y_i(n)$ is an inverse-modified-discrete-cosine-transformed sample, $i$ is the granule index, $N$ is the total number of frames, and $y_0(n+18)$ are all zeros for $0 \le n < 18$.

For the IMDCT process in backward decoding, the above equation must be changed into the following equation:

\[ x_i(n) = y_i(n+18) + y_{i+1}(n), \qquad 0 \le n < 18, \quad i = 2N, 2N-1, \ldots, 1, \]

where $y_{2N+1}(n)$ are all zeros for $0 \le n < 18$. The overlapping procedure is the same as that of forward decoding, and therefore the computation and memory size needed are identical.
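The two overlap orders can be compared with a short sketch for a single subband. It assumes that each IMDCT output block is a list of 36 values and uses random integers in place of real IMDCT output; the function names are hypothetical. Forward decoding adds the stored second half of the previous block, while backward decoding adds the stored first half of the block decoded just before, which is the next block in playback order.

```python
import random

HALF = 18  # samples in each half of a 36-sample IMDCT output block


def overlap_forward(blocks):
    prev_tail = [0.0] * HALF                 # nothing precedes the first block
    out = []
    for y in blocks:                         # playback order
        out.append([y[n] + prev_tail[n] for n in range(HALF)])
        prev_tail = y[HALF:]
    return out


def overlap_backward(blocks):
    next_head = [0.0] * HALF                 # nothing follows the last block
    out = []
    for y in reversed(blocks):               # last granule first
        out.append([y[n + HALF] + next_head[n] for n in range(HALF)])
        next_head = y[:HALF]
    return out


random.seed(0)
blocks = [[float(random.randint(-9, 9)) for _ in range(2 * HALF)]
          for _ in range(6)]
fwd = overlap_forward(blocks)
bwd = overlap_backward(blocks)
print(bwd[1:] == fwd[:0:-1])   # same overlapped blocks, reverse block order
```

Both loops keep exactly one 18-sample buffer and perform 18 additions per block, in line with the statement that the computation and memory requirements are unchanged. The samples inside each block are still in forward order at this stage; reversing them is left to the synthesis step.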
`
(6). Synthesis of Subband signals

Once the transformed blocks have been overlapped after the IMDCT process, the final step in obtaining the output audio samples is to synthesize the 32 subband samples. The subband synthesis operation interpolates the 32 subband samples into audio samples in the time domain.

A subband synthesis filter needs the delayed inputs of previous frames, but in backward decoding the subband samples are presented to the synthesis filter in the reverse of the order used in forward decoding. Therefore, the MPEG standard synthesis filterbank has to be redesigned to perform the backward decoding operation. The MPEG standard synthesis filterbank for forward decoding is described below in detail, and then the synthesis filterbank for backward decoding according to the present invention is explained in detail.
`
FIG. 7 shows a flow graph of an MPEG standard synthesis filterbank for forward decoding, whereby 32 subband samples are synthesized into a time series of audio samples in a way similar to frequency division multiplexing. To be specific, 32 subband samples $x_r(mT_{s1})$, each of which is critically sampled at a sampling period of $T_{s1}$, are synthesized into output samples $s(nT_{s2})$, a signal critically sampled at a sampling period of $T_{s2}$ (= $T_{s1}/32$).

Here, $x_r(mT_{s1})$ is the r-th subband sample and $x_r(nT_{s2})$ is obtained by up-sampling $x_r(mT_{s1})$ by a factor of 32 such that thirty-one zeros are inserted into the interval between $(m-1)T_{s1}$ and $mT_{s1}$. This up-sampling generates 31 images of the baseband centered at harmonics of the original sampling frequency, $k f_{s1}$ ($k = 1, 2, \ldots, 31$). That is, the sampling frequency is increased from $f_{s1}$ (= $1/T_{s1}$) to $f_{s2}$ (= $1/T_{s2}$) for the original subband samples $x_r(mT_{s1})$.
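The zero-insertion up-sampling described above can be sketched in a few lines. The function name `upsample` is hypothetical, and the example uses a factor of 4 instead of 32 only to keep the printed result short.

```python
def upsample(samples, factor=32):
    """Insert factor - 1 zeros between consecutive subband samples."""
    out = [0.0] * (len(samples) * factor)
    out[::factor] = samples      # every factor-th slot holds a real sample
    return out


print(upsample([1.0, 2.0], factor=4))
```

Because only one slot in every 32 is non-zero, most products in the subsequent band-pass filtering vanish, which is what later allows the number of multiplications to be reduced.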
`
For each subband, $x_r(nT_{s2})$ is processed by a band-pass filter $H_r(z)$ that passes the signal belonging to the frequency band allocated to that filter. The band-pass filter has 512 coefficients and is constructed by phase-shifting a prototype low-pass filter.
`
The flow graph of FIG. 7 is expressed by the equation

\[
\begin{aligned}
S_t(nT_{s2}) &= \sum_{r=0}^{31} \sum_{k=0}^{511} x_r\big((32t+n-k)T_{s2}\big)\, H_r(kT_{s2}) \\
&= \sum_{r=0}^{31} \sum_{k=0}^{511} x_r\big((32t+n-k)T_{s2}\big)\, h(kT_{s2})\, N_r(k) \\
&= \sum_{r=0}^{31} \sum_{k=0}^{511} x_r\big((32t+n-k)T_{s2}\big)\, h(kT_{s2}) \cos\!\left(\frac{(2r+1)(k+16)\pi}{64}\right)
\end{aligned}
\tag{1}
\]

where $r$ is the subband index ranging from 0 to 31, $n$ is the output sample index ranging from 0 to 31, and $S_t(nT_{s2})$ is the synthesized output sample at time $t$. That is, $S_t(nT_{s2})$ represents the synthesized output sample of the 32 subband samples $x_r(tT_{s1})$ at time $t$.

Equation (1) implies the convolution of $x_r(kT_{s2})$ and $H_r(kT_{s2})$, which has 512 coefficients and is constructed as the product of the prototype low-pass filter $h(kT_{s2})$ and $N_r(k)$, which is used to phase-shift it.

The number of computations, i.e., multiplications and additions, in equation (1) can be reduced. By utilizing the symmetry property of the cosine terms and the zeros that are filled into $x_r(kT_{s2})$ at the time of up-sampling, equation (1) leads to equation (2). Hereinafter, the sampling period is omitted from the equations for convenience and is $T_{s2}$ unless explicitly expressed.
`
\[
\begin{aligned}
S_t(n) &= \sum_{i=0}^{15} h(n+32i)\,(-1)^{[i/2]}\, g_t\big(n+64i+32\times(i\,\%\,2)\big) \\
&= \sum_{i=0}^{15} d(n+32i)\, g_t\big(n+64i+32\times(i\,\%\,2)\big)
\end{aligned}
\tag{2}
\]

\[
g_t(k+64i) = \sum_{r=0}^{31} x_r(32t-32i)\cos\!\left(\frac{(2r+1)(k+16)\pi}{64}\right)
\tag{3}
\]

where $r$ is the subband index ranging from 0 to 31; $n$, $i$, and $k$ are computation indices ($n = 0, 1, 2, \ldots, 31$; $i = 0, 1, 2, \ldots, 15$; $k = 0, 1, 2, \ldots, 63$); and $t$ represents the time at which the subband sample is presented to the decoder. % is the modulo operator, and $[x]$ represents the largest integer that is not greater than $x$.
`
For each subband, one sample is presented and multiplied by $N_r(k)$, resulting in 64 samples. The 64 samples are stored in a 1024-sample FIFO (First In First Out) buffer, the samples already stored therein being shifted by 64. 32 PCM output samples are obtained by multiplying the samples in the 1024-sample FIFO buffer by the coefficients of the time window.
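The forward synthesis step of equations (2) and (3) can be sketched as follows. This is an illustrative reading of the equations, not the patent's flowchart: the class and function names are hypothetical, and the 512 window coefficients d are taken as an argument because the actual values are tabulated in the MPEG audio standard and are not reproduced here. The example uses an all-ones placeholder window purely to make the buffer indexing visible.

```python
import math


def matrixing(x):
    """Equation (3): turn 32 subband samples into 64 values."""
    return [sum(x[r] * math.cos((2 * r + 1) * (k + 16) * math.pi / 64)
                for r in range(32))
            for k in range(64)]


class ForwardSynthesis:
    def __init__(self, d):
        if len(d) != 512:
            raise ValueError("the window must have 512 coefficients")
        self.d = list(d)
        self.fifo = [0.0] * 1024   # newest 64 values at the front

    def step(self, x):
        """Consume 32 subband samples, return 32 PCM samples (equation (2))."""
        # Shift the FIFO by 64 and store the new 64 values at the front.
        self.fifo = matrixing(x) + self.fifo[:-64]
        return [sum(self.d[n + 32 * i] * self.fifo[n + 64 * i + 32 * (i % 2)]
                    for i in range(16))
                for n in range(32)]


bank = ForwardSynthesis([1.0] * 512)             # placeholder window
first = bank.step([1.0] + [0.0] * 31)            # only subband 0 is non-zero
second = bank.step([0.0] * 32)
print(first[0], second[0])
```

Each call depends on the 15 previous calls through the FIFO contents, which is the dependence on past samples that prevents the unmodified filterbank from being fed in reverse order.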
`
The synthesis filterbank for backward decoding according to the present invention will be described below in detail with reference to the MPEG standard synthesis filterbank for forward decoding.

It should be noted that, for backward decoding, subband samples are presented to the decoder in the reverse of their playback order. For example, given N samples for each subband, while the forward decoder decodes the samples in increasing order (t = 0, 1, 2, ..., N-1), the samples have to be decoded in decreasing order (t = N-1, N-2, ..., 0) for backward decoding.

Because the MPEG standard synthesis filterbank requires past samples to synthesize PCM audio samples, it cannot use the previous samples if the samples are presented in reverse order for backward decoding. As a result, the MPEG standard synthesis filterbank must be modified to perform backward decoding. Its structure is explained below.
`
FIG. 10 depicts a flow graph showing the synthesis filterbank for backward decoding according to the present invention, which is identical to the forward decoding synthesis filterbank except that $H_r(z)$ is replaced by $B_r(z)$. Note that $x_r(mT_{s1})$ is presented to the filterbank in decreasing order, i.e., m =
`
`Equation (1) is changed to equation (4) in accord