United States Patent [19]
Kim

US005649052A
[11] Patent Number: 5,649,052
[45] Date of Patent: *Jul. 15, 1997

[54] ADAPTIVE DIGITAL AUDIO ENCODING SYSTEM

[75] Inventor: Jong-Il Kim, Incheon, Rep. of Korea

[73] Assignee: Daewoo Electronics Co., Ltd., Seoul, Rep. of Korea

[*] Notice: The term of this patent shall not extend beyond the
expiration date of Pat. No. 5,402,495.

[21] Appl. No.: 366,144

[22] Filed: Dec. 29, 1994

[30] Foreign Application Priority Data
Jan. 18, 1994 [KR] Rep. of Korea ........ 94-758

[51] Int. Cl.6: G10L 3/02; G10L 9/00
[52] U.S. Cl.: 395/2.35; 395/2.37
[58] Field of Search: 395/2.35, 2.1, 395/2.37, 2.91; 381/2, 58, 56,
94, 36, 29-40

[56] References Cited

U.S. PATENT DOCUMENTS

4,706,290  11/1987  Lim ................. 381/58
5,115,469   5/1992  Taniguchi et al. .... 381/36
5,402,495   3/1995  Kim ................. 381/94

OTHER PUBLICATIONS

Johnston, "Sum-Difference Stereo Transform Coding", ICASSP '92:
Acoustics, Speech & Signal Processing Conference, pp. II-569-II-572,
'92.
Johnston, "Transform Coding of Audio Signals Using Perceptual Noise
Criteria", IEEE Journal on Selected Areas in Communications, vol. 6,
No. 2, Feb. 1988.

Primary Examiner: Allen R. MacDonald
Assistant Examiner: Patrick N. Edouard
Attorney, Agent, or Firm: Anderson Kill & Olick P.C.

[57] ABSTRACT

`An encoding system employing a novel perceptual spectrum
`difference estimation device improves the coding efficiency
`and audio quality of a digitized audio signal. The system
`comprises M number of encoding means arranged in parallel
`for encoding the input digital audio signal of a current frame,
`respectively; M number of decoding means arranged in
`parallel for decoding each of the encoded digital audio
`signals; a first estimator for estimating a power density
`spectrum for a difference signal between the input digital
`audio signal and each of the decoded digital audio signals;
`a second estimator for estimating a power density spectrum
`for the input digital audio signal of the current frame and for
`determining a masking threshold therefor based on the
`power density spectrum for the input digital audio signal; a
`third estimator for estimating a set of perceptual spectrum
`distances based on the power density spectrum for each of
`the difference signals and the masking threshold; and a
`circuit for selecting an encoded digital audio signal having
`a smallest perceptual spectrum distance.
`
`1 Claim, 2 Drawing Sheets
`
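[Front-page drawing: excerpt of FIG. 1 of the encoding system 100,
showing the perceptual spectrum distance estimator (output PSD2(i)),
the comparator 60, the selector 70 and the formatting circuit 80, with
the output going to the transmitter.]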
`
`IPR2016-01710
`UNIFIED EX1026
`
`

[Sheet 1 of 2. FIG. 1: block diagram of the digital audio encoding
system 100. Labeled blocks: input digital audio signal x(n,i);
encoding device 10 (encoders 10A, 10B); decoding device 20 (decoders
20A, 20B; outputs y1(n,i), y2(n,i)); power density spectrum estimation
units 30 and 34 (subtractors 31 and 35, power density spectrum
estimators 32 and 36; signals e1(n,i), e2(n,i), E1(k,i), E2(k,i));
masking threshold estimation unit 40 (power density spectrum estimator
41, masking threshold estimator 42; signals X(k,i), M(k,i));
perceptual spectrum distance estimator 50 (outputs PSD1(i), PSD2(i));
comparator 60; selector 70; formatting circuit 80; output to
transmitter.]
`

`
U.S. Patent    Jul. 15, 1997    Sheet 2 of 2    5,649,052

[FIG. 2: block diagram of the power density spectrum estimator 32,
showing the difference signal e1(n,i) applied to the windowing circuit
32A, followed by the FFT circuit 32B.]
`

`
ADAPTIVE DIGITAL AUDIO ENCODING SYSTEM
`
`FIELD OF THE INVENTION
`
`The present invention relates to a digital audio encoding
`system; and, more particularly, to an improved digital audio
`encoding system capable of providing an encoded audio
`signal with a minimum distortion measured in accordance
`with the human auditory perception.
`
`DESCRIPTION OF THE PRIOR ART
`
Transmission of digitized audio signals makes it possible to deliver
high quality audio signals comparable to those of a standard compact
disc or digital audio tape. When an audio signal is expressed in a
digital form, a substantial amount of data needs to be transmitted,
especially in the case of a high definition television system. Since,
however, the available frequency bandwidth assigned to such audio
signals is limited, in order to transmit the substantial amounts of
digital data, e.g., 768 Kbits per second for a 16 bit PCM (Pulse Code
Modulation) audio signal with 48 KHz sampling frequency, through the
limited audio bandwidth of, e.g., about 128 KHz, it is inevitable to
compress the audio signal. At the receiving end of the digital
transmission, the compressed audio signal is decoded.
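The 768 Kbits per second figure quoted above follows directly from the sample width and the sampling frequency. A minimal arithmetic check is sketched below; the function name `pcm_bit_rate` and the assumption that the figure refers to a single audio channel are illustrative and not taken from the patent.

```python
def pcm_bit_rate(bits_per_sample: int, sample_rate_hz: int, channels: int = 1) -> int:
    """Raw (uncompressed) PCM bit rate in bits per second."""
    return bits_per_sample * sample_rate_hz * channels


# 16-bit PCM at a 48 kHz sampling frequency, one channel (assumed).
rate = pcm_bit_rate(16, 48_000)
print(rate)  # 768000 bits per second, i.e. 768 Kbits per second
```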
The quality of the decoded audio signal is largely dictated by the
compression technique employed for the encoding thereof. Sometimes, in
order to selectively generate an audio signal with the least
distortion, the digital audio encoding system is provided with a
plurality of encoders employing different compression techniques, a
corresponding number of decoders and an audio distortion measuring
device. In such a case, the encoders are arranged in a parallel
fashion in order to carry out the encoding of the input digital audio
signal simultaneously; and each of the decoders is coupled to its
corresponding encoder for the decoding of the encoded digital audio
signal therefrom. In such an arrangement, the digital audio encoding
system selectively generates the one of the encoded digital audio
signals which causes the least audio distortion.
Audio distortions are usually measured in terms of "Total Harmonic
Distortion (THD)" and "Signal to Noise Ratio (SNR)", wherein said THD
is an RMS (root-mean-square) sum of all the individual
harmonic-distortion components and/or IMDs (Intermodulation
Distortions), which consist of sum and difference products generated
when two or more signals pass through an encoder; and said SNR
represents the ratio between the amplitude of an input digital signal
and the amplitude of an error signal.
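As an illustration of the conventional SNR measure described above, the following sketch computes the ratio of the input amplitude to the error amplitude for one frame. It is only a sketch under stated assumptions: "amplitude" is taken as the RMS value, the result is expressed in dB, and the names `rms` and `snr_db` are hypothetical.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(reference: Sequence[float], decoded: Sequence[float]) -> float:
    """Ratio of input amplitude to error amplitude, in dB (assumed convention)."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("frames must be non-empty and of equal length")
    error = [x - y for x, y in zip(reference, decoded)]
    noise = rms(error)
    signal = rms(reference)
    if noise == 0.0:
        return math.inf      # decoded frame is identical to the input
    if signal == 0.0:
        return -math.inf     # silent input with a non-zero error
    return 20.0 * math.log10(signal / noise)


# A decoded frame that is 10% too small in amplitude gives about 20 dB.
value = snr_db([1.0, -1.0, 1.0, -1.0], [0.9, -0.9, 0.9, -0.9])
```

Such a figure treats every error component alike, regardless of whether it is audible.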
Such a THD or SNR measurement, however, is a physical value which has
no direct bearing on the human auditory faculty and, accordingly, the
conventional digital audio encoding system having such an audio
distortion measuring device has a limited ability to provide an
encoded digital audio signal which best reflects the human auditory
perception.
`
SUMMARY OF THE INVENTION

It is a primary object of the invention to provide a novel digital
audio encoding system for adaptively encoding an input digital audio
signal in a manner closely matching the human auditory perception.
In accordance with the present invention, there is provided a novel
system for encoding an input digital audio signal having a plurality
of frames, which comprises: M number of encoding means arranged in
parallel for encoding the input digital audio signal in a current
frame, respectively; M number of decoding means arranged in parallel
for decoding each of the encoded digital audio signals, respectively;
first estimation means for estimating a power density spectrum of a
difference signal between the input digital audio signal and each of
the decoded digital audio signals; second estimation means for
estimating a power density spectrum of the input digital audio signal
in the current frame and for determining a masking threshold thereof
based on the power density spectrum of the input digital audio signal;
third estimation means for estimating a set of perceptual spectrum
distances based on the power density spectrum for each of the
difference signals and the masking threshold; and means for selecting
an encoded digital audio signal having a smallest perceptual spectrum
distance.

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other objects and features of the present invention will
become apparent from the following description of preferred
embodiments taken in conjunction with the accompanying drawings, in
which:
FIG. 1 is a block diagram illustrating a novel digital audio encoding
system in accordance with the present invention; and
FIG. 2 is a detailed block diagram depicting the power density
spectrum estimator shown in FIG. 1.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring to FIG. 1, there is shown a block diagram illustrating a
digital audio encoding system 100 in accordance with the present
invention.
The encoding system 100 comprises an encoding device 10, a decoding
device 20, first and second power density spectrum estimation units 30
and 34, a masking threshold estimation unit 40, a perceptual spectrum
distance estimator 50, a comparator 60, a selector 70 and a formatting
circuit 80.
An input digital audio signal x(n,i) of an ith frame, or a current
frame, which includes N samples, i.e., n = 0, 1, 2, ..., N-1, is
applied to the encoding device 10 which is adapted to perform an
encoding operation of the input digital audio signal at a
predetermined bit rate, wherein N is a positive integer and one frame
includes L, e.g., 32, subbands. A "frame" used herein denotes a part
of the digital audio signal which corresponds to a fixed number of
audio samples and is a processing unit for the encoding and decoding
of the digital audio signal.
As shown, the encoding device 10 includes a plurality of encoders,
e.g., two encoders 10A and 10B, which are coupled in a parallel manner
in order to simultaneously receive the input digital audio signal of
the current frame and encode the input digital audio signal by using
various compression techniques. For instance, the encoder 10A may
carry out an encoding operation of the input digital audio signal of
the ith frame by employing an intra-frame bit allocation technique
which adaptively assigns bits to each subband included within one
frame based on a perceptual entropy for each of the subbands therein,
and the encoder 10B may perform an encoding operation of the input
digital audio signal by using an inter-frame bit allocation technique
which adaptively assigns bits to each frame included within a
predetermined group of frames based on a perceptual entropy for each
frame; and, alternatively, the encoders 10A and 10B may include
non-uniform and uniform quantizers, respectively.
The perceptual entropy PE(i) for the ith frame, as is well known in
the art, may be represented as:

$$PE(i) = \frac{1}{L}\sum_{m=0}^{L-1} \mathrm{MAX}\left[0,\ \frac{1}{2}\log_2\frac{P(m)}{M(m)}\right]\ \text{dB} \qquad \text{Eq. (1)}$$
`
wherein m is a subband index with m = 0, 1, ..., L-1, L being the
total number of subbands in a frame; P(m), a sound pressure level in
subband m estimated from a Fast Fourier Transform (FFT) technique; and
M(m), a masking threshold in subband m.
The encoded digital audio signal from each of the encoders is applied
to the selector 70 and the decoding device 20 which includes a
plurality of decoders, e.g., 20A and 20B. Each of the decoders is
adapted to decode a corresponding encoded digital audio signal from
the encoders. The decoded digital audio signals y1(n,i) and y2(n,i)
from the decoders 20A and 20B are applied to the first and second
power density spectrum estimation units 30 and 34, respectively,
wherein each of said power density spectrum estimation units includes
a subtractor 31 (35) and a power density spectrum estimator 32 (36),
respectively. The subtractor 31 included in the first power density
spectrum estimation unit 30 generates a difference signal e1(n,i)
representative of the difference between the input digital audio
signal x(n,i) to the system and the decoded digital audio signal
y1(n,i) from the decoder 20A, which may be represented as:

$$e_1(n,i) = x(n,i) - y_1(n,i) \qquad \text{Eq. (2)}$$

wherein both x(n,i) and y1(n,i) are P (e.g., 16)-bit pulse code
modulation (PCM) audio signals.
Subsequently, the difference signal is provided to the power density
spectrum estimator 32 which serves to carry out the Fast Fourier
Transform conversion thereof from the time domain to the frequency
domain.
Turning now to FIG. 2, the power density spectrum estimator 32
includes a windowing circuit 32A and a Fast Fourier Transform (FFT)
circuit 32B.
The windowing circuit 32A receives the difference signal e1(n,i) from
the subtractor 31; and performs the windowing process by multiplying
the difference signal e1(n,i) with a predetermined Hanning window. The
predetermined Hanning window h(n) may be represented as:

$$h(n) = 0.5\sqrt{8/3}\,\left\{1 - \cos(2\pi n/N)\right\} \qquad \text{Eq. (3)}$$

wherein N is a positive integer and n = 0, 1, 2, ..., N-1.
Accordingly, the output w1(n,i) from the windowing circuit 32A may be
represented as:

$$w_1(n,i) = e_1(n,i)\cdot h(n) \qquad \text{Eq. (4)}$$

wherein i is a frame index and n has the same meaning as previously
defined.
The output w1(n,i) from the windowing circuit 32A is then provided to
the FFT circuit 32B which estimates the power density spectrum
thereof; and, as a preferred embodiment of the present invention,
includes a 512 point FFT for Psychoacoustic Model I [or MPEG (moving
pictures expert group)-Audio Layer I]. Accordingly, the power density
spectrum E1(k,i) for the difference signal e1(n,i) of the ith frame,
as is well known in the art, may be calculated as follows:

$$E_1(k,i) = 10\log_{10}\left|\frac{1}{N}\sum_{n=0}^{N-1} w_1(n,i)\, e^{-j2\pi kn/N}\right|^2\ \text{dB} \qquad \text{Eq. (5)}$$

wherein k = 0, 1, ..., (N/2)-1, and N and n have the same meanings as
previously defined.
Referring back to FIG. 1, the second power density spectrum unit 34 is
substantially identical to the first power density spectrum unit 30
excepting that the power density spectrum E2(k,i) for a difference
signal e2(n,i) representative of the difference between the input
digital audio signal x(n,i) and the decoded digital audio signal
y2(n,i) from the decoder 20B is calculated therein. The difference
signal e2(n,i) from the subtractor 35 may be represented as:

$$e_2(n,i) = x(n,i) - y_2(n,i) \qquad \text{Eq. (6)}$$

wherein n and i have the same meanings as previously defined.
Therefore, it should be appreciated that the power density spectrum
E2(k,i) for the difference signal e2(n,i) can be obtained by windowing
the difference signal e2(n,i) with the Hanning window h(n) as is done
for the difference signal e1(n,i) in Eq. (4). Said power density
spectrum E2(k,i) of the difference signal e2(n,i) for the ith frame
may be obtained as:

$$E_2(k,i) = 10\log_{10}\left|\frac{1}{N}\sum_{n=0}^{N-1} w_2(n,i)\, e^{-j2\pi kn/N}\right|^2\ \text{dB} \qquad \text{Eq. (7)}$$

wherein N, n, k, and i have the same meanings as previously defined,
with w2(n,i) = e2(n,i)·h(n).
In the meanwhile, the masking threshold estimation unit 40 is adapted
to receive the input digital audio signal x(n,i) of the ith frame and
to estimate the masking threshold thereof. The masking threshold
estimation unit 40 includes a power density spectrum estimator 41 and
a masking threshold estimator 42. The power density spectrum estimator
41 is substantially identical to the power density spectrum estimator
included in the first or second power density spectrum estimation unit
excepting that the power density spectrum X(k,i) of the input digital
audio signal x(n,i) for the ith frame is calculated therein. The power
density spectrum X(k,i) of the input digital audio signal x(n,i) for
the ith frame may be obtained as:

$$X(k,i) = 10\log_{10}\left|\frac{1}{N}\sum_{n=0}^{N-1} w(n,i)\, e^{-j2\pi kn/N}\right|^2\ \text{dB} \qquad \text{Eq. (8)}$$

wherein N, n, k, and i have the same meanings as previously defined,
with w(n,i) = x(n,i)·h(n).
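The chain of difference signal, Hanning window and power density spectrum described in Eqs. (2) through (8) may be sketched as follows. This is a minimal illustration rather than the circuit of FIG. 2: it evaluates the transform as a direct O(N²) sum using only the Python standard library instead of a 512 point FFT, and the function names and the `floor_db` guard against taking the logarithm of zero are assumptions introduced here.

```python
import cmath
import math
from typing import List, Sequence


def difference_signal(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Eqs. (2)/(6): e(n,i) = x(n,i) - y(n,i) for one frame."""
    if len(x) != len(y):
        raise ValueError("input and decoded frames must have the same length")
    return [xn - yn for xn, yn in zip(x, y)]


def hanning_window(N: int) -> List[float]:
    """Eq. (3): h(n) = 0.5 * sqrt(8/3) * (1 - cos(2*pi*n/N))."""
    scale = 0.5 * math.sqrt(8.0 / 3.0)
    return [scale * (1.0 - math.cos(2.0 * math.pi * n / N)) for n in range(N)]


def power_density_spectrum_db(frame: Sequence[float],
                              floor_db: float = -200.0) -> List[float]:
    """Eqs. (4), (5), (7), (8): windowed power spectrum in dB, k = 0..N/2-1.

    floor_db is an assumed lower clamp (not in the source) that avoids log10(0).
    """
    N = len(frame)
    if N == 0:
        raise ValueError("frame must not be empty")
    h = hanning_window(N)
    w = [s * hn for s, hn in zip(frame, h)]            # Eq. (4)
    spectrum = []
    for k in range(N // 2):
        # Direct evaluation of the sum in Eq. (5); an FFT gives the same values.
        acc = sum(w[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        power = abs(acc / N) ** 2
        level = 10.0 * math.log10(power) if power > 0.0 else floor_db
        spectrum.append(max(level, floor_db))
    return spectrum


# Usage: a cosine centred on bin 8 of a 64-sample frame, with an all-zero
# stand-in for the decoder output, so that e1(n,i) equals x(n,i).
N = 64
x = [math.cos(2.0 * math.pi * 8 * n / N) for n in range(N)]
y1 = [0.0] * N
E1 = power_density_spectrum_db(difference_signal(x, y1))
print(E1.index(max(E1)))  # 8
```

The same `power_density_spectrum_db` call applied to x(n,i) itself corresponds to X(k,i) of Eq. (8).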
The power density spectrum of the input digital audio signal, X(k,i),
estimated at the power density spectrum estimator 41 is then provided
to the masking threshold estimator 42 which serves to estimate a
masking threshold depending on the power density spectrum of the input
digital audio signal.
The masking threshold represents an audible limit closely reflecting
the human auditory perception, which is a sum of the intrinsic audible
limit or threshold of a sound and an increment caused by the presence
of another (masking) contemporary sound in the frequency domain, as
described in an article, which is incorporated herein by reference,
entitled "Coding of Moving Pictures and Associated Audio",
ISO/IEC/JTC1/SC29/WG11 N0501 MPEG 93 (Jul. 1993), wherein the
so-called Psychoacoustic Models I and II are discussed for the
calculation of the masking threshold. In a preferred embodiment of the
present invention, Psychoacoustic Model I is advantageously employed
in the masking threshold estimator 42.
The power density spectrums E1(k,i) and E2(k,i) and the masking
threshold M(k,i) are simultaneously provided to the perceptual
spectrum distance estimator 50 which is adapted to derive first and
second perceptual spectrum distances PSD1(i) and PSD2(i)
representative of the audio distortions for the power density
spectrums E1(k,i) and E2(k,i) as perceived by the human auditory
faculty with the masking effect taken into consideration. The first
perceptual spectrum distance PSD1(i) for the power density spectrum
E1(k,i) from the power density spectrum estimator 32 may be
represented as:
`
$$\mathrm{PSD1}(i) = \frac{1}{N/2}\sum_{k=0}^{(N/2)-1} \mathrm{MAX}\left[0,\ E_1(k,i) - M(k,i)\right] \qquad \text{Eq. (9)}$$
`
`wherein k and i are the same as previously defined.
`Similarly, the second perceptual spectrum distance PSD2
`(i) for the power density spectrum E2(k,i) from the power
`density spectrum estimator 36 may be defined as:
`
$$\mathrm{PSD2}(i) = \frac{1}{N/2}\sum_{k=0}^{(N/2)-1} \mathrm{MAX}\left[0,\ E_2(k,i) - M(k,i)\right] \qquad \text{Eq. (10)}$$

`
`wherein k and i are the same as previously defined.
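Eqs. (9) and (10) share one form, which may be sketched as a single function. The name `perceptual_spectrum_distance` and the list-based inputs are illustrative assumptions; both inputs are taken to be in dB and to hold the N/2 values for k = 0, ..., (N/2)-1.

```python
from typing import Sequence


def perceptual_spectrum_distance(error_psd_db: Sequence[float],
                                 masking_threshold_db: Sequence[float]) -> float:
    """Eqs. (9)/(10): mean over k of MAX[0, E(k,i) - M(k,i)]."""
    if len(error_psd_db) != len(masking_threshold_db) or not error_psd_db:
        raise ValueError("spectra must be non-empty and of equal length")
    # Only the part of the error spectrum above the masking threshold counts.
    excess = [max(0.0, e - m)
              for e, m in zip(error_psd_db, masking_threshold_db)]
    return sum(excess) / len(excess)


# Four-bin toy example: the excesses are 5, 0, 10 and 0 dB.
E = [10.0, 0.0, 30.0, -5.0]
M = [5.0, 5.0, 20.0, 0.0]
print(perceptual_spectrum_distance(E, M))  # 3.75
```

Bins where the error lies below the masking threshold contribute nothing, however large the error is in physical terms.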
As can be seen from Eqs. (9) and (10), the perceptual spectrum
distance for the ith frame is estimated from the portion of the power
density spectrum of the difference signal which exceeds the masking
threshold. The first and second perceptual spectrum distances PSD1(i)
and PSD2(i) are applied to the comparator 60 which serves to generate
a selection signal identifying the least distorted digital audio
signal among the two encoded digital audio signals from the encoders,
e.g., 10A and 10B, by comparing their perceptual spectrum distances.
The selection signal from the comparator 60 is then provided to the
selector 70 and the formatting circuit 80.
In response to the selection signal from the comparator 60, the
selector 70 selects the least distorted digital audio signal among the
encoded digital audio signals from the encoders to thereby provide the
selected audio signal to the formatting circuit 80.
At the formatting circuit 80, the selection signal from the comparator
60 and the selected audio signal from the selector 70 are formatted
and transmitted to a transmitter (not shown) for the transmission
thereof.
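The roles of the comparator 60, the selector 70 and the formatting circuit 80 may be sketched as below. The patent does not specify a bitstream layout, so the one-byte selection header in `format_frame` is purely hypothetical, as are the function names; encoded frames are represented as `bytes` for illustration only.

```python
from typing import Sequence, Tuple


def select_least_distorted(encoded_frames: Sequence[bytes],
                           distances: Sequence[float]) -> Tuple[int, bytes]:
    """Comparator and selector: pick the frame with the smallest distance."""
    if len(encoded_frames) != len(distances) or not distances:
        raise ValueError("need one distance per encoded frame")
    # The index plays the role of the selection signal; ties go to the
    # first encoder.
    selection = min(range(len(distances)), key=lambda idx: distances[idx])
    return selection, encoded_frames[selection]


def format_frame(selection: int, payload: bytes) -> bytes:
    """Hypothetical framing: one selection byte followed by the payload."""
    return bytes([selection]) + payload


# Two candidate encodings of one frame with their PSD1(i) and PSD2(i).
selection, payload = select_least_distorted([b"A", b"B"], [3.75, 1.5])
print(selection, format_frame(selection, payload))  # 1 b'\x01B'
```

Carrying the selection signal with each frame lets a receiver know which decoder to apply.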
`While the present invention has been shown and
`described with reference to the particular embodiments, it
`will be apparent to those skilled in the art that many changes
`and modifications may be made without departing from the
`spirit and scope of the invention as defined in the appended
`claim.
`What is claimed is:
`1. A system for adaptively encoding an input digital audio
`signal having a plurality of frames, which comprises:
`M number of encoding means arranged in parallel for
`encoding the input digital audio signal in a current
`frame, respectively;
`M number of decoding means arranged in parallel for
`decoding each of the encoded digital audio signals;
`first estimation means for estimating a power density
`spectrum of a difference signal between the input
`digital audio signal and each of the decoded digital
`audio signals;
`second estimation means for estimating a power density
`spectrum of the input digital audio signal in the current
`frame and for determining a masking threshold thereof
`based on the power density spectrum of the input
`digital audio signal;
third estimation means for deriving a set of perceptual
spectrum distances based on the power density spectrum
of each of the difference signals and the masking
threshold; and
means for selecting an encoded digital audio signal having
a smallest perceptual spectrum distance.
`
`* * * * *
