`Werner et al.
`
`USOO6668088B1
`(10) Patent No.:
`US 6,668,088 B1
`(45) Date of Patent:
`Dec. 23, 2003
`
`(54) DIGITAL SIGNAL COMPRESSION
`ENCODING WITH IMPROVED
`QUANTISATION
`
`(75) Inventors:
`
`list RSSSM,
`Wii R, SMNEmes
`Knee, Petersfield (GB)
`(73) Assignees: British Broadcasting Corporation,
`London (GB); Snell & Wilcox Limited,
`Middlesex (GB)
`
`(*) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 0 days.
`(21) Appl. No.:
`09/380,047
`
`Feb. 25, 1998
`PCT/GB98/00582
`
`(22) PCT Fed:
`(86) PCT N
`O.
`S371 (c)(1),
`(2), (4) Date: Nov. 29, 1999
`(87) PCT Pub. No.: WO98/38800
`PCT Pub. Date: Sep. 3, 1998
`Foreign Application Priority Data
`
`(30)
`b. 25
`
`5,293,434 A * 3/1994 Feig et al. .................. 382/234
`5,301.242 A
`4/1994 Gonzales et al. ........... 382/239
`5,412,429 A * 5/1995 Glover .................. 375/240.11
`5,521,643 A 5/1996 Yim ........................... 348/419
`5,768,436 A
`6/1998 Keesman .................... 382/248
`5,778,192 A * 7/1998 Schuster et al. ............ 709/247
`5,933,194. A * 8/1999 Kim et al. ............... 348/403.1
`FOREIGN PATENT DOCUMENTS
`35 11 659
`10/1986
`............ HO3M/7/38
`O 478 230
`4/1992
`... HO4N/7/30
`OSO9576
`10/1992 .......... HO4N/7/133
`O 513520
`11/1992 .......... HO4N/7/133
`O 599 258
`6/1994 .......... HO4N/7/133
`92, S.
`1.
`- - - - -
`- - - - EN2:
`O 711 O79
`5/1996 . Ho4N,750
`O 720 375
`7/1996 ..... ... HO4N/7/26
`O 739 138
`10/1996
`..... ... HO4N/7/26
`95 35628
`12/1995
`..... ... HO4N/7/26
`96 34496
`10/1996
`............ HO4N/7/30
`
`DE
`EP
`EP
`EP
`EP
`E.
`EP
`EP
`EP
`WO
`WO
`
`* cited by examiner
`
`Primary Examiner Timothy M. Johnson
`(74) Attorney, Agent, or Firm-McDermott, Will & Emery
`57
`ABSTRACT
`(57)
`In compression encoding of a digital signal, Such as
`
`E. :5. 2. SR - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - OS MPEG2, transform coefficients are quantised with the lower
`CO. Z. If JES) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
`bound of each interval being controlled by a parameter). In
`(51) Int. Cl." ............................. G06K 9/36; G06K 9/46
`the MPEG2 reference coder, for example, ) =0.75. Because
`(52) U.S. Cl. ................... 382/239; 375/240.03; 382/251
`the quantised coefficients are variable length coded,
`(58) Field of Search ................................. 382/236, 239,
`improved quality or reduced bit rates can be achieved by
`382/245, 246, 248, 250, 251; 375/240.02-240.07,
`trolling 2.
`to vary dy
`lly the bound of each
`COLOS A SO SO WW CWCa C OOUC O CaC
`240.18, 240.2, 240.23; 348/404.1, 405.1,
`interval with respect to the associated representation level.
`419.1, 395.1
`The parameter can vary with coefficient amplitude, with
`frequency, or with quantisation Step size. In a transcoding
`operation, 2 can also vary with parameters in the initial
`coding operation.
`g Op
`
`(56)
`
`References Cited
`U.S. PATENT DOCUMENTS
`5.245,427 A
`9/1993 Kunihiro ................. 348/400.1
`
`
`
`DCT
`COEFFS.
`OUANTIZER
`SCALE
`
`4 Claims, 3 Drawing Sheets
`
`62
`
`64
`
`TRUNCATE
`
`LINEAR
`CUANTIZER
`
`COMPARATORS . CALCUAE 58
`LAMBDA/2
`
`
`
`
`
`BUILD
`HISTOGRAM
`
`ESTIMATE
`pdi
`PARAMETERS
`
`Amazon / WAG Acquisition
`Exhibit 1016
`Page 1
`
`
`
`U.S. Patent
`
`Dec. 23, 2003
`
`Sheet 1 of 3
`
`US 6,668,088 B1
`
`Fig.1.
`
`q1
`
`d
`
`r1
`
`d1 (+1)
`
`r1 (+1)
`
`d1 (+2)
`
`X
`
`DCT
`COEFF.
`
`GRUANTIZER
`SCALE
`
`
`
`DIVIDE BY
`2d
`
`Amazon / WAG Acquisition
`Exhibit 1016
`Page 2
`
`
`
`U.S. Patent
`
`Dec. 23, 2003
`
`Sheet 2 of 3
`
`US 6,668,088 B1
`
`Fig.3.
`
`DCT
`COEFF. DIVIDE BY W
`
`DIVIDE BY q
`
`
`
`
`
`OUTPUT
`
`OUANTIZER SCALE
`
`
`
`TRUNCATE
`
`LAMBDA/2
`
`Fig.4.
`
`SCALED DCT COEFF.
`
`
`
`
`
`
`
`LAMBDA/2
`
`TRUNCATE
`
`OUTPUT
`CODE
`
`Amazon / WAG Acquisition
`Exhibit 1016
`Page 3
`
`
`
`U.S. Patent
`
`Dec. 23, 2003
`
`Sheet 3 of 3
`
`US 6,668,088 B1
`
`Fig.5.
`
`
`
`62
`
`TRUNCATE
`
`DCT
`COEFFS.
`
`
`
`
`
`OUANTIZER
`SCALE
`
`
`
`
`
`
`
`
`
`LINEAR
`CUANTIZER
`
`
`
`COMPARATORS
`
`CALCULATE
`LAMBDA/2
`
`BUD
`HISTOGRAM
`
`ESTMATE
`pdi
`PARAMETERS
`
`54
`
`56
`
`SCALED
`
`?ector
`
`
`
`
`
`
`TRUNCAE
`
`
`
`Fig.6.
`
`
`
`76
`
`OUTPUT
`CODE
`
`78
`
`
`
`TRUNCATE
`
`CODNG COST
`PS P
`ABLE
`
`E5AE
`
`SPSS
`
`STO
`
`70
`
`Amazon / WAG Acquisition
`Exhibit 1016
`Page 4
`
`
`
`US 6,668,088 B1
`
`1
`DIGITAL SIGNAL COMPRESSION
`ENCODING WITH IMPROVED
`QUANTISATION
`
`FIELD OF THE INVENTION
`This invention relates to the compression of digital Video,
`audio or other Signals.
`
`2
`coded, Such that a bound of each interval is controlled by a
`parameter ). The transformation process may take a large
`variety of forms, including block-based transforms Such as
`the DCT of MPEG2, and sub-band coding.
`SUMMARY OF THE INVENTION
`It is an object of one aspect of the present invention to
`provide an improvement in Such a method which enables
`higher quality to be achieved at a given bitrate or a reduction
`in bitrate for a given level of quality.
`Accordingly, the present invention is in one aspect char
`acterised in that ) is controlled So as to vary dynamically the
`bound of each interval with respect to the associated repre
`Sentation level.
`Suitably, wherein each value is arithmetically combined
`with 2.
`Advantageously, ) is:
`a function of the quantity represented by the value;
`where the transformation is a DCT, a function of hori
`Zontal and vertical frequency;
`a function of the quantisation Step size; or
`a function of the amplitude of the value.
`In a particular form of the present invention, the digital
`Signal to be encoded has been Subjected to previous encod
`ing and decoding processes and ) is controlled as a function
`of a parameter in Said previous encoding and decoding
`proceSSeS.
`In a further aspect, the present invention consists in a (q,
`2) quantiser operating on a set of transform coefficients x
`representative of respective frequency indices f in which 2.
`is dynamically controlled in dependence upon the values of
`X and f.
`Advantageously, ) is dynamically controlled to minimise
`a cost function D+uH where D is a measure of the distortion
`introduced by the quantisation in the uncompressed domain
`and H is a measure of compressed bit rate.
`BRIEF DESCRIPTION OF THE DRAWINGS
`The invention will now be described by way of example
`with reference to the accompanying drawings, in which:
`FIG. 1 is a diagram illustrating the relationships between
`representation levels, decision levels and the value of ),
`FIG. 2 is a block diagram representation of the quantiza
`tion process in the MPEG2 reference coder;
`FIG. 3 is a block diagram representation of a simplified
`and improved quantization process,
`FIG. 4 is a block diagram representation of the core
`elements of FIG. 3;
`FIG. 5 is a block diagram representation of a quantization
`process according to one aspect of the present invention; and
`FIG. 6 is a block diagram representation of a quantization
`process according to a further aspect of the present inven
`tion.
`
`15
`
`BACKGROUND OF THE INVENTION
`Compression encoding generally involves a number of
`Separate techniques. These will usually include a
`transformation, Such as the block-based discrete cosine
`transform (DCT) of MPEG-2; an optional prediction step; a
`quantisation Step and variable length coding. This invention
`is particularly concerned in this context with quantisation.
`The quantisation Step maps a range of original amplitudes
`onto the same representation level. The quantisation proceSS
`is therefore irreversible. MPEG-2, (in common with other
`compression standards such as MPEG-1, JPEG, CCITT/
`ITU-T Rec.H.261 and ITU-T Rec.H.263) defines represen
`tation levels and leaves undefined the manner in which the
`original amplitudes are mapped onto a given Set of repre
`Sentation levels.
`In general terms, a quantizer assigns to an input value,
`which may be continuous or may previously have been
`Subjected to a quantisation process, a code usually Selected
`from quantization levels immediately above and immedi
`ately below the input value. The error in Such a quantization
`will generally be minimised if the quantization level closest
`to the input value is Selected. In a compression System, it is
`further necessary to consider the efficiency with which
`respective quantization levels may be coded. In variable
`length coding, the quantization levels which are employed
`most frequently are assigned the Shortest codes.
`Typically, the Zero level has the shortest code. A decision
`to assign a higher quantization level, on the basis that it is
`the closest, rather than a lower level (and especially the Zero
`level) will therefore decrease coding efficiency. In MPEG2,
`the overall bit rate of the compressed signal is maintained
`beneath a pre-determined limit by increasing the Separation
`of quantization levels in response to a tendency toward
`higher bit rate. Repeated decisions to assign quantization
`levels on the basis of which is closest, may through coding
`inefficiency thus lead to a coarser quantization process.
`The behaviour of a quantizer in this respect may be
`characterised through a parameter ), which is arithmetically
`combined with the input value, with one value of 2.
`(typically 2-1) representing the Selection of the closest
`quantization level or “rounding”. A different value of 2.
`(typically 2-0) will in contrast represent the automatic
`choice of the lower of the two nearest quantization levels, or
`“truncating”. In the MPEG2 reference coder, an attempt is
`made to compromise between the nominal reduction in error
`which is the attribute of rounding and the tendency toward
`bit rate efficiency which is associated with truncating, by
`setting a standard value for 2 of 2–0.75.
`Whilst particular attention has here been paid to MPEG2
`60
`coding, Similar considerations apply to other methods of
`compression encoding of a digital Signal, which including
`the Steps of conducting a transformation process to generate
`values and quantising the values through partitioning the
`amplitude range of a value into a set of a adjacent intervals,
`whereby each interval is mapped onto a respective one of a
`Set of representation levels which are to be variable length
`
`25
`
`35
`
`40
`
`45
`
`50
`
`55
`
`DETAILED DESCRIPTION OF THE
`INVENTION
`In the Specifically mentioned compression Standards, the
`original amplitude X results from a discrete cosine transform
`(DCT) and is thus related to a horizontal frequency index
`fe and a vertical frequency index f. Whilst this approach
`is taken as an example in what follows, the invention is not
`restricted in this regard.
`In general, a quantiser describes a mapping from an
`original amplitude X of frequencies f
`and f
`onto an
`
`65
`
`Amazon / WAG Acquisition
`Exhibit 1016
`Page 5
`
`
`
`US 6,668,088 B1
`
`4
`The present invention referS Specifically to quantization of
`intra DCT coefficients in MPEG2 video coding but can be
`applied to non-intra coefficients, to other video compression
`Schemes and to compression of Signals other than video. In
`MPEG2, the prior art is provided by what is known as Test
`Model 5 (TM5). The quantization scheme of TM5 for
`positive intra coefficients is illustrated in FIG. 2.
`In order to Simplify the description, the above diagram
`will be replaced by FIG. 3, which illustrates essentially the
`Same quantizer except for Small values of q, where it
`corrects an anomaly as described in the Paper.
`In this quantizer, the incoming coefficients are first
`divided by quantizer weighting matrix values, W, which
`depend on the coefficient frequency but which are fixed
`acroSS the picture, and then by a quantizer Scale value q
`which can vary from one macroblock to the next but which
`is the same for all coefficient frequencies. Prior to the adder,
`the equivalent inverse quantizer reconstruction levels are
`Simply the integers 0, 1, 2 . . . . A fixed number 2/2, is then
`added to the value and the result truncated. The Significance
`of ) is that a value of 0 makes the quantizer (of the value
`input to the adder) a simple truncation, while a value of 1
`makes it a rounding operation. In TM5, the value of ) is
`fixed at 0.75.
`Attention will hereafter be focused on the operation of the
`“core quantizer shown in FIG. 4.
`In a class of MPEG-2 compatible quantisers for, intra
`frame coding, non-negative original dct-coefficients X (or
`the same coefficients after division by weighting matrix
`values W) are mapped onto the representation levels as:
`
`3
`amplitude y=Q(x). The mapping performed by the quantiser
`is fully determined by the set of representation levels {r}
`and by the corresponding decision levels {d} as illustrated
`in FIG. 1. All original amplitudes in the ranged,s x<d.
`are mapped onto the same representation level y=Q(x)=r. AS
`can be seen from FIG. 1, consecutive decision levels are
`related by the quantisation Step Size q: and for a given
`representation level r, the corresponding decision level is
`d=d-q
`(1)
`
`calculated as:
`
`d = r - q
`2
`
`(2)
`
`15
`
`The quantiser is fully Specified by the quantisation Step
`Size q and the parameter ), for a given set of representation
`levels {r}. Therefore, a quantiser that complies with equa
`tions (1) and (2) can be referred to as a (q2) quantiser.
`Currently proposed quantisers, as described in the refer
`ence coders for the H.261, H.263, MPEG-1 and MPEG-2
`Standards, all apply a special type of (q, , ) quantiser in that
`a fixed value of 2 is used: for example =0.75 in the
`MPEG-2 reference coder or ) = 1.0 in the MPEG-1 reference
`coder for quantisation of intra-DCT-coefficients.
`is not
`According to one aspect of this invention, )
`constant but is a function that depends on the horizontal
`frequency index fi, the Vertical frequency indeX f, the
`quantisation Step-Size q and the amplitude X:
`(3)
`W=A(for fier, q, x)
`Examples of ways in which the function may usefully be
`derived to improve picture quality in Video compression at
`a given bit-rate-or to reduce the required bit-rate at a given
`picture quality-will be set out below.
`The invention extends also to the case of transcoding
`when a first generation amplitude y=Q(x) is mapped onto
`a second generation amplitude y=Q(y) to further reduce
`the bit-rate from the first to the second generation without
`having access to the original amplitude X. In this case, the
`first generation quantiser Q and the Second generation
`quantiser Q2 are described as a (q1, 2)-type quantiser and
`a (q2, .2)-type quantiser, respectively. The Second genera
`tion a value is described as a function:
`
`25
`
`35
`
`40
`
`45
`
`The floor functional extracts the integer part of the given
`argument a.
`Negative values are mirrored:
`
`The amplitude range of the quantisation Step-Size q in eq.
`(1) is standardised; q has to be transmitted as Side informa
`tion in every MPEG-2 bit stream. This does not hold for the
`parameter 2 in eq. (1). This parameter is not needed for
`reconstructing the dct-coefficients from the bit Stream, and is
`therefore not transmitted. However, the 2-value controls the
`mapping of the original dct-coefficients X onto the given Set
`of representation levels
`
`50
`
`(7)
`According to eq. (1), the (positive) X-axis is partitioned by
`the decision levels
`
`(8)
`
`55
`
`60
`
`Each X ed., d) is mapped onto the representation level
`y=r. As a special case, the interval 0, d) is mapped onto
`y=0.
`The parameter 2 can be adjusted for each quantisation
`Step-Size q, resulting in a distortion rate optimised quanti
`sation: the mean-Squared-error
`
`65
`
`is minimised under a bit rate constraint imposed on the
`coefficients y. In order to Simplify the analysis, the first order
`Source entropy
`
`(4)
`The parameter 22, that appears in Eqn. (4) is applied in
`a reference (q2, 22)-type quantiser. This reference quan
`tiser bypasses the first generation and directly maps an
`original amplitude X onto a Second generation reference
`amplitude y2 =Q(x).
`The functional relationship of Eqn. (4) can be used to
`minimise the error (y-ya) or the error (y-X). In the first
`case, the resulting Second generation quantiser may be called
`a maximum a-posteriori (MAP) quantiser. In the Second
`case, the resulting Second generation quantiser may be called
`a mean Squared error (MSE) quantiser. Examples of the
`Second generation (q2, 22 AP)-type and (q2, .2se)-type
`quantisers are given below. For a more detailed explanation
`of the theoretical background, reference is directed to the
`paper “Transcoding of MPEG-2 intra frames'-Oliver
`Werner-IEEE Transactions on Image Processing 1998,
`which will for ease of reference be referred to hereafter as
`“the Paper”. A copy of the Paper is appended to British
`patent application No. 9703831 from which the present
`application claims priority.
`
`Amazon / WAG Acquisition
`Exhibit 1016
`Page 6
`
`
`
`US 6,668,088 B1
`
`H=X-P, log P.
`
`(10)
`
`and further from eq. (16)
`
`of the coefficients y instead of the MPEG-2 codeword table
`is taken to calculate the bit rate. It has been verified in the
`Paper that the entropy H can be used to derive a reliable
`estimate for the number of bits that result from the MPEG-2
`codeword table. In Eqn. (10), P, denotes the probability for
`the occurrence of the coefficient y=r.
`The above constrained minimisation problem can be
`Solved by applying the Lagrange multiplier method, intro
`ducing the Lagrange multiplier u. One then gets the basic
`equation to calculate the quantisation parameter ).
`
`= 0
`
`(11)
`
`15
`
`Note, that the solution for 2 that one obtains from Eqn.
`(11) depends on the value of u. The value of u is determined
`by the bit rate constraint.
`
`His Ho
`
`(12)
`
`where Ho Specifies the maximum allowed bit rate for encod
`ing the coefficients y. In general, the amplitude range of the
`Lagrange multiplier is 0<u<OO. In the special case of Ho so,
`one obtains u->0. Conversely for Ho->0, one obtains in
`general u->OO.
`The Laplacian probability density function (pdf) is an
`appropriate model for describing the Statistical distribution
`of the amplitudes of the original dict-coefficients. This model
`is now applied to evaluate analytically Eqn. (11). One then
`obtains a distortion-rate optimised quantiser characteristic
`by inserting the resulting value for 2 in eq. (5).
`Due to the Symmetric quantiser characteristic for positive
`and negative amplitudes in Eqns. (5) and (6), we introduce
`a pdf p for describing the distribution of the absolute original
`amplitudes x. The probability Po for the occurrence of the
`coefficient y=0 can then be specified as
`
`(13)
`
`25
`
`35
`
`40
`
`45
`
`Similarly, the probability P, for the coefficiently becomes
`
`P =
`
`t+1-}a
`p(x) dix i = 1, 2, ...
`(-; – a
`
`(14)
`
`50
`
`With Eqns. (13) and (14), the partial derivative of the
`entropy H of eq. (10) can be written after a straightforward
`calculation as
`
`55
`
`It can be seen from eq. (17) that
`
`P-0 if 0s as 1
`a > v II Us As
`
`(17)
`
`(18)
`
`Thus, when ) is increased from Zero to one, the resulting
`distortion D is monotonically decreasing until the minimum
`value is reached for = 1. The latter is the Solution to the
`unconstrained minimisation of the mean-Squared-error,
`however, the resulting entropy H will in general not fulfil the
`bit rate constraint of eq. (12).
`Under the assumption of P,2P, in eq. (15), we see that
`6H/0). 20. Thus, there is a monotonic behaviour: when ) is
`increased from Zero to one, the resulting distortion D mono
`tonically decreases, at the same time the resulting entropy H
`monotonically increases. Immediately, an iterative algo
`rithm can be derived from this monotonic behaviour. The
`parameter ) is initially Set to 2-1, and the resulting entropy
`H is computed. If H is larger than the target bit rate Ho, the
`value of ) is decreased in further iteration steps until the bit
`rate constraint, eq. (12) is fulfilled. While this iterative
`procedure forms the basis of a simplified distortion-rate
`method proposed for transcoding of I-frames, we continue to
`derive an analytical Solution for 2.
`Eqns. (15) and (17) can be evaluated for the Laplacian
`model:
`
`p(x) = f3. a . e. if x - d = (1-3); a
`2
`
`(19)
`
`After inserting the model pdf of Eqn. (19) in Eqns. (15)
`and (17), it can be shown that the basic equation (11) leads
`then to the analytical Solution for 2,
`A = 1 - (h(c) + (1-z)-log(L)
`
`(20)
`
`P
`
`with Z=e' and the 'Z'-entropy
`(21)
`h(z)=-2. log22-(1-z). log2(1-z)
`Eqn. (20) provides only an implicit Solution for 2, as the
`probability Po on the right hand side depends on 2 according
`to eq. (13). In general, the value of Po can be determined
`only for known) by applying the quantiser characteristic of
`Eqns. (5) and (6) and counting the relative frequency of the
`event y=0. However, eq. (20) is a fixed-point equation for 2.
`which becomes more obvious if the right hand side is
`described by the function
`
`of 14.
`2
`
`(15)
`
`g(l) = 1 - f - h(z) + (1-z): log. 1.)
`
`(22)
`
`60
`
`resulting in the classical fixed-point form =g(0). Thus, it
`follows from the fixed point theorem of Stefan Banach that
`the Solution for 2 can be found by an iterative procedure
`with
`
`65
`
`(23)
`in the (+1)-th iteration step. The iteration of (23) converges
`towards the solution for an arbitrary initial value 0
`if the
`
`Amazon / WAG Acquisition
`Exhibit 1016
`Page 7
`
`
`
`7
`function g is self-contracting, i.e. Lipschitz-continuous
`with a Lipschitz-constant Smaller than one. As an applica
`tion of the mean theorem for the differential calculus, it is
`not difficult to prove that g is always self-contracting if the
`absolute value of the partial derivative is less than one. This
`yields the convergence condition
`
`C
`il
`1
`Ög
`1 > -- . - . (1 - Z.). -
`
`2. Ino, (15 p.
`
`>
`
`(24)
`
`A distortion-rate optimised quantisation method will now
`be derived based on the results obtained above. As an
`example, a technique is outlined for quantising the
`AC-coefficients of MPEG-2 intra frames. It is straightfor
`ward to modify this technique for quantising the dct
`coefficients of MPEG-2 inter frames, i.e. P- and B-frames.
`Firstly, one has to take into account that the 63
`AC-coefficients of an 8x8 dct-block do not share the same
`distribution. Thus, an individual Laplacian model pdf
`according to eq. (19) with parameter C is assigned to each
`AC-frequency index i. This results in an individual quantiser
`characteristic according to Eqns. (5) and (6) with parameter
`2. Furthermore, the quantisation Step-Size q depends on the
`Visual weight W and a frequency-independent qScale param
`eter aS
`
`15
`
`25
`
`di F
`
`w; qScale
`16
`
`(25)
`
`8
`Where u' is now an a priori constant linking distortion to bit
`rate.
`A theoretical argument based on coding white noise gives
`a law of 6 dB per bit per coefficient. In practice, observation
`of actual coding results at different bit rates gives a law of
`k dB per bit, where k takes values from about 5 to about 8
`depending on the overall bit rate. In practice, the intuitive 6
`dB law corresponds well with observation.
`Additionally, the qScale parameter can be changed to meet
`the bit rate constraint of Eqn. (25). In principle, the visual
`weights W. offer another degree of freedom but for Simplicity
`we assume a fixed weighting matrix as in the MPEG-2
`reference decoder. This results in the following distortion
`rate optimised quantisation technique which can be Stated in
`a C-language-like form:
`
`/* Begin of quantising the AC-coefficients in MPEG-2 intra frames */
`Dmin
`for (qscale = qmin: qscale is qmax; qscale = qscale + 2)
`/* linear qscale table*/
`
`do {
`Step 1: determine W1, W2, . . . , Wes by minimising D + u H;
`Step 2: calculate H = XH (W);
`At = u + 8; /*ö to be selected appropriately */
`while (H > H);
`Step 3: calculate D = c : X D, (W);
`if (D < Di){
`qscaleep = qscale;
`for (i = 1; is 63; i = i + 1)Wiep = Wi:
`Dmin = D;
`for (i = 1; is 63; i = i + 1)
`
`W; qScale
`9;...opt —
`
`US 6,668,088 B1
`
`For a given step-size q, the quantisation results in a
`distortion D,0) and a bit rate H(0) for the AC-coefficients
`of the same frequency index i. As the dct is an orthogonal
`transform, and as the distortion is measured by the mean
`Squared-error, the resulting distortion D in the Spatial
`(sample/pixel) domain can be written as
`
`35
`
`{
`
`D = c X D; ()
`
`(26)
`
`y = Q(x) - lx y do sgn(X)
`q;
`2 J "P"
`quantise all AC-coefficients of frequency-index i by
`
`40
`
`with Some positive normalising constant c. Alternatively the
`distortion can measured in the weighted coefficient domain
`in order to compensate for the variation in the human visual
`response at different spatial frequencies.
`Similarly, the total bit rate H becomes
`
`H = XH (A)
`
`(27)
`
`For a distortion rate optimised quantisation, the 63 param
`eters of have to be adjusted Such that the cost function
`D+iu.H
`(28)
`is minimised. The non-negative Lagrange multiplier u is
`determined by the bit rate constraint
`(29)
`His Ho
`Alternatively, if the distortion is expressed in the logarithmic
`domain as:
`
`45
`
`50
`
`55
`
`60
`
`D'=20 logo D dB
`
`The cost function to be minimised becomes:
`
`(28a)
`
`65
`
`f*End of quantising the AC-coefficients in MPEG-2 intra frames */
`
`There are several options for performing Step 1-Step 3:
`1. Options for performing Step 1
`The parameters 2, 2, . . .
`.
`can be determined
`a) analytically by applying Eqns. (20)-(23) of Section 3.
`b) iteratively by dynamic programming of D+u-H, where
`either of the options described in the next points can be
`used to calculate D and H.
`2. Options for performing Step 2
`H=XH(0) can be calculated
`a) by applying the Laplacian model pdf, resulting in
`
`H =X h(P)+ (1-P) is
`
`(32)
`
`where h(Po) and h(Z) are the entropies as defined in
`eq. (21) of P. (eq. (13)) and Z=e,", respectively.
`Note that Po in Eqn. (32) can be determined by
`counting for each dct-frequency index i the relative
`frequency of the Zero-amplitude y=Q(x)=0.
`Interestingly, eq. (32) shows that the impact of the
`quantisation parameters 2 on the resulting bit rate H
`only consists in controlling the Zero-amplitude prob
`abilities Po.
`
`Amazon / WAG Acquisition
`Exhibit 1016
`Page 8
`
`
`
`US 6,668,088 B1
`
`10
`
`b) from a histogram of the original dct-coefficients, result
`ing with Eqns. (10), (13) and (14) in
`
`H = -X X. P. log P.
`
`i
`
`(33)
`
`il
`A(x) = A = 1 - 5 log.
`4;
`
`P 1 ),
`P
`
`(39)
`
`(l=1,..., L)
`Similar to eqs. (13), (14), the probabilities in eq. (39)
`depend on the lambda parameters,
`
`n- a
`P = ?
`p(x) dix and
`
`O
`
`I-1
`
`p(x) dix is: 1
`
`15
`
`(40)
`
`41
`(41)
`
`c) by applying the MPEG-2 codeword table
`3. Options for performing Step 3
`D=c XD (0) can be calculated
`a) by applying the Laplacian model pdf of Eqn. (19) and
`evaluating Eqn. (16).
`b) by calculating D=EI(X-y) directly from a histogram
`of the original dict-coefficients X.
`Depending on which options are chosen for Step 1-Step
`3, the proposed method results in a single pass encoding
`Scheme if the Laplacian model pdf is chosen or in a multi
`pass scheme if the MPEG-2 codeword table is chosen.
`Furthermore, the method can be applied on a frame, mac
`roblock or on a 8x8-block basis, and the options can be
`chosen appropriately. The latter is of particular interest for
`any rate control Scheme that Sets the target bit rate Ho either
`locally on a macroblock basis or globally on a frame basis.
`Furthermore, we note that the proposed method skips
`automatically high-frequency dct-coefficients if this is the
`best option in the rate-distortion Sense. This is indicated if
`the final quantisation parameter 2
`has a value close to
`one for low-frequency indices i but a Small value, e.g. Zero,
`for high-frequency indices.
`A distortion-rate optimised quantisation method for
`MPEG-2 compatible coding has been described, with sev
`eral options for an implementation. The invention can imme
`diately be applied to Standalone (first generation) coding. In
`particular, the results help designing a Sophisticated rate
`control Scheme.
`The quantiser characteristic of eqS. (5) and (6) can be
`generalised to
`
`25
`
`35
`
`At q(x)
`
`(34)
`
`40
`
`for non-negative amplitudes X. The floor-functional in eq.
`(34) returns the integer part of the argument a. Negative
`amplitudes are mirrored,
`
`45
`
`The generalisation is reflected by the amplitude dependent
`values u(x), q(x), r(x) in eq. (34). For a given set of
`representation levels .
`. . <r-1-r/-ra . . . and a given
`amplitude X, the pair of consecutive representation levels is
`Selected that fulfils
`
`50
`
`rt six<r
`
`55
`
`(36)
`
`The value of the local representation level is then set to
`
`r(x)=r,
`
`(37)
`
`The value of the local quantisation Step-Size results from
`
`60
`
`A Straightforward extension of the rate-distortion concept
`detailed above yields for the local lambda parameter, very
`Similar to eq. (20).
`
`65
`
`(38)
`
`Therefore, eq. (39) represents a system of non-linear
`equations for determining the lambda parameters 2, ..., 0.
`In general, this System can only be Solved numerically.
`However, eq. (39) can be simplified if the term log2(P.
`1/P) is interpreted as the difference
`
`P 1
`- 1 = log(f)
`
`(42)
`
`(43)
`
`of optimum codeword lengths
`I=-log2P, I-1=-log2P 1
`asSociated with the representation levels r, r.
`A practical implementation of the above will now be
`described.
`Once the probability distribution, parametric or actual, of
`the unquantized coefficients is known, it is possible to
`choose a Set of quantizer decision levels that will minimise
`the cost function B, because both the entropy H and the
`distortion Dare known as functions of the decision levels for
`a given probability distribution. This minimization can be
`performed off-line and the calculated Sets of decision levels
`stored for each of a set of probability distributions.
`In general, it will be seen that the optimum value of 2.
`corresponding to each decision level is different for different
`coefficient amplitudes. In practice, it appears that the great
`est variation in the optimum value of 2 with amplitude is
`apparent between the innermost quantizer level (the one
`whose reconstruction level is 0) and all the other levels. This
`means that it may be Sufficient in Some cases to calculate, for
`each coefficient index and for each value (Suitably
`quantized) of the probability distribution parameter, two
`values of), one for the innermost quantizer level and one for
`all the others.
`A practical approach following the above description is
`shown in FIG. 5.
`The DCT coefficients are taken to a linear quantizer 52
`providing the input to a histogram building unit 54. The
`histogram is thus based on linearly quantized versions of the
`input DCT coefficients. The level spacing of that linear
`quantizer 52 is not critical but should probably be about the
`Same as the average value of q. The extent of the histogram
`function required depends on the complexity of the para
`metric representation of the pdf, in the case of a Laplacian
`or Gaussian distribution it may be Sufficient to calculate the
`mean or variance of the coefficients, while in the Zero
`excluded Laplacian used in the Paper it is Sufficient to
`calculate the mean and the proportion of Zero values. This
`histogram, which may be built up over a picture period or
`longer, is used in block 56 as the basis of an estimate of the
`
`Amazon / WAG Acquisition
`Exhibit 1016
`Page 9
`
`
`
`US 6,668,088 B1
`
`1O
`
`15
`
`25
`
`11
`pdf parameter or parameters, providing one of the inputs to
`the calculation of in block 58.
`Another input to the calculation of ) is from a set of
`comparatorS 60 which are in effect a coarse quantizer,
`determining in which range of values the coefficient to be
`quantized falls. In the most likely case described above, it is
`Sufficient to compare the value with the innermost non-Zero
`reconstruction level. The final input required to calculate 2.
`is the quantizer Scale.
`In general, an analytical equation for 2 cannot be
`obtained. Instead, a Set of values can be calculated numeri
`cally for various combinations of pdf parameters, compara
`tor outputs and quantizer Scale values, and the results Stored
`in a lookup table. Such a table need not be very large (it may,
`for example, contain fewer than 1000 values) because the
`optima are not very sharp.
`The value of 2 calculated is then divided by 2 and added
`in adder 62 to the coefficient prior to the final truncation
`operation in block 64.
`Instead of using variable codeword lengths that depend on
`the current probabilities according to eq. (43), a fixed table
`of variable codeword lengths Co., . . . , C, can be applied to
`Simplify the process. The values of Co, .
`. . , C, c