PCT        WORLD INTELLECTUAL PROPERTY ORGANIZATION
           International Bureau

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification 6: H04N 1/41

(11) International Publication Number: WO 96/15620 A1

(43) International Publication Date: 23 May 1996 (23.05.96)

(21) International Application Number: PCT/CA95/00635

(22) International Filing Date: 8 November 1995 (08.11.95)

(30) Priority Data: 9422738.6   10 November 1994 (10.11.94)   GB

(71) Applicant (for all designated States except US): THE CHINESE UNIVERSITY OF HONG KONG [GB/GB]; Research Administration Office, Room 226, Pi Chiu Building, Shatin, New Territories (HK).

(71)(72) Applicant and Inventor: WU, Xiaolin [CA/CA]; 17-683 Windermere Road, London, Ontario N5X 3T9 (CA).

(72) Inventor; and
(75) Inventor/Applicant (for US only): MEMON, Nasir [IN/US]; 128 Buena Vista Drive, DeKalb, IL 60115 (US).

(74) Agent: MANNING, Gavin, N.; Oyen Wiggs Green & Mutala, 480 - 601 West Cordova Street, Vancouver, British Columbia V6B 1G1 (CA).

(81) Designated States: CA, JP, US, European patent (AT, BE, CH, DE, DK, ES, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE).

Published
    With international search report.
    Before the expiration of the time limit for amending the claims and to be republished in the event of the receipt of amendments.

(54) Title: CONTEXT-BASED, ADAPTIVE, LOSSLESS IMAGE CODEC
`
[Figure: block diagram of the codec system 10, showing gradient-adjusted prediction, conditional probabilities estimation, error modeling, and entropy coding stages.]
`
(57) Abstract

An encoding/decoding method is provided for lossless (reversible) compression of digital pictures of all types, including continuous-tone images, graphics, multimedia images of mixed text, graphics and photographs, binary documents and drawings. Continuous-tone mode and binary mode are identified on a pixel-by-pixel basis (12). In continuous-tone mode, context modeling and prediction (20, 22, 26) are employed, involving mostly integer arithmetic and simple logic in a conceptually sophisticated scheme. Both the encoding and decoding techniques are suitable for sequential and progressive transmission, although different specific algorithms may be employed for the different specific cases. The system is symmetric, meaning that the encoder and decoder have the same time and space complexities.
`
`Sony, Ex. 1023, p.1
`
`

`
FOR THE PURPOSES OF INFORMATION ONLY

Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT.

AT  Austria
AU  Australia
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Côte d'Ivoire
CM  Cameroon
CN  China
CS  Czechoslovakia
CZ  Czech Republic
DE  Germany
DK  Denmark
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgyzstan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakhstan
LI  Liechtenstein
LK  Sri Lanka
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SI  Slovenia
SK  Slovakia
SN  Senegal
TD  Chad
TG  Togo
TJ  Tajikistan
TT  Trinidad and Tobago
UA  Ukraine
US  United States of America
UZ  Uzbekistan
VN  Viet Nam
`
`

`
WO 96/15620

PCT/CA95/00635
`
CONTEXT-BASED, ADAPTIVE, LOSSLESS IMAGE CODEC

CLAIM OF PRIORITY

The present application claims partial priority of British Provisional Patent Application Serial No. 9422738.6, filed 10 November 1994.
`
`BACKGROUND OF THE INVENTION
`
With rapidly-advancing computer, telecommunication, and digital imaging technologies, there is an astronomical amount of image data for a wide range of applications such as education, entertainment, medical imaging, space exploration, electronic publishing, visual arts, etc. This rapid growth of image data puts punishing burdens on computer storage and visual communication bandwidth. Thus image compression becomes a pressing technical challenge in visual communications and computing, without which it will be difficult to build, deploy, and use cost-effective multimedia information systems.
`
Lossless compression is a form of compression where an image can be reconstructed without any loss of information. Lossless image compression is required by medical imaging, satellite/aerial imaging, image archiving, preservation of precious art work and documents, the press, or any applications demanding ultra high image fidelity. Furthermore, lossless image coding is the necessary last step of many lossy image compression systems, such as lossless compression of codeword indices in vector quantization (VQ), and lossless compression of transform coefficients in Discrete Cosine Transform (DCT) and wavelet/subband-based coding.
`
There exists a large body of literature on lossless image compression algorithms and systems, such as the IBM Q-coder and the JPEG lossless coder. Among notable patents and publications are the US patents and research publications listed below:
`
`

`
4,463,342   1984      IBM.
4,749,983   1988      Langdon.
4,969,204   11/1989   Melnychuck et al.
5,050,230   09/1990   Jones et al.

Universal Modeling and Coding -- J. Rissanen and G. Langdon, 1981, IEEE Trans. Inform. Theory, vol. IT-27.

A Universal Data Compression System -- J. Rissanen, 1983, IEEE Trans. Inform. Theory, vol. IT-29.

Parameter Reduction and Context Selection for Compression of the Gray-Scale Images -- S. Todd, G. Langdon, and J. Rissanen, 1985, IBM J. Res. & Develop.

Comparing the Lossless Image Compression Standards and Universal Context Modelling -- R. Arps, M. Weinberger, T. Truong, and J. Rissanen, Proc. of the Picture Coding Symposium, Sacramento, September 1994.

On the JPEG Model for Lossless Image Compression -- G. Langdon, A. Gulati, and E. Seiler, Proc. of 1992 Data Compression Conf.

New Methods for Lossless Image Compression Using Arithmetic Coding -- P. Howard and J. Vitter, 1992, Inf. Proc. & Manag., vol. 28.
`
The currently achievable lossless compression ratio is still modest, typically from 1.5:1 to 2.5:1. For instance, in contrast to the success of JPEG's lossy compression standard, the current JPEG lossless compression standard has sufficiently poor coding efficiency that it is seldom used in practice.
`
In 1994, ISO and JPEG solicited proposals for the next international standard for lossless image compression. The present invention is a result of the inventors' response to the ISO solicitation. The lead inventor, Xiaolin Wu, developed a context-based, adaptive, lossless image coding and decoding technique (herein CALIC). Among nine proposals that were submitted to ISO for its initial evaluation as candidates for the lossless image compression standard in 1995, the present CALIC system ranked first according to a criterion that accounts for both coding efficiency and algorithm simplicity.
`
`
Known prior art on lossless compression of continuous-tone images is based on the principle of predictive coding. An image is traversed, and pixels are encoded in a fixed order, typically in raster scan sequence. Previously encoded pixels that are known to both the encoder and the decoder are used to predict the upcoming pixels. The prediction errors, rather than the pixels themselves, are entropy encoded by Huffman or arithmetic coding. The original image is reconstructed by adding the error term back to the prediction value. Predictive coding works because the histogram of the errors is much more concentrated (heavily biased toward 0) than the histogram of the pixel values, resulting in a significantly smaller zero-order entropy for the former than for the latter. Among numerous prediction schemes in the literature, the simplest type is a fixed linear predictor such as those used under the current lossless JPEG standard.
`
A linear predictor can be optimized on an image-by-image or even block-by-block basis via linear regression. However, such an optimization is expensive and brings only modest improvement in coding efficiency. Moreover, the performance of linear predictors is not robust in the areas of edges. Adaptive, non-linear predictors can adjust parameters according to the local edge strengths and orientations, if edges exist. The adjustment of predictor parameters can be made very efficient since it is based on local information.
Historically, lossless image compression inherited the theoretical framework and methodology of text compression. Statistical modeling of the source being compressed plays a central role in any data compression system. Suppose that we encode a finite source x1, x2, ..., xn sequentially. The optimal code length of the sequence in bits is then

-log Π_{i=0}^{n-1} p(x_{i+1} | x_i, ..., x_1),        (1)

given the assignments of conditional probabilities. Arithmetic coding can approach this code length of the source.
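As a small illustration of (1) (not from the patent; the probability values below are hypothetical), the ideal code length follows directly from whatever conditional probabilities the model assigns to the symbols actually seen:

```python
import math

def code_length_bits(cond_probs):
    """Ideal code length, in bits, of a sequence whose i-th symbol was
    assigned conditional probability cond_probs[i] by the model."""
    return -sum(math.log2(p) for p in cond_probs)

# A model that assigns high conditional probabilities to the actual
# sequence yields a short code; a uniform model over 256 values does not.
good_model = [0.9, 0.8, 0.95, 0.85]    # hypothetical assignments
uniform    = [1.0 / 256] * 4
print(code_length_bits(good_model))    # ≈ 0.78 bits
print(code_length_bits(uniform))       # 32.0 bits (4 symbols at 8 bits each)
```

This is exactly why the modeling problem below centers on assigning high conditional probabilities to the given sequence.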
`
`
The challenge is to assign the conditional probabilities p(x_{i+1} | x_i, ..., x_1) to the given sequence so as to maximize the product given above, and hence minimize the code length. The achievable compression is governed by a scheme, or a model, that can assign high conditional probability distributions p(x_{i+1} | x_i, ..., x_1) to the given sequence.
`
Fitting a given source well with statistical models is a difficult and computationally very expensive task. Context modeling of continuous-tone images is made even more difficult by the large number of model states (contexts) and the associated time and space complexities for modeling. If the number of model parameters is too large with respect to the image resolution, there will be insufficient samples to obtain good estimates of the conditional probabilities on the model states, leading to poor coding efficiency. This is known as the "context dilution problem." This problem was theoretically formulated by Rissanen, in the framework of stochastic complexity, as the "model cost." Rissanen's work proves that the high complexity of a model can reduce coding efficiency, as observed by many data compression practitioners. What is needed are innovative algorithmic techniques that reduce the model complexity, improving both coding and computational efficiency.
`
SUMMARY OF THE INVENTION

According to the invention, an encoding/decoding method is provided for lossless (reversible) compression of digital pictures of all types, including continuous-tone images, graphics, multimedia images of mixed text, graphics and photographs, binary documents and drawings. Continuous-tone mode and binary mode are identified on a pixel-by-pixel basis. In continuous-tone mode, context modeling and prediction are employed, involving mostly integer arithmetic and simple logic in a conceptually sophisticated scheme. Both the encoding and decoding techniques are suitable for sequential and progressive transmission, although different specific algorithms may be employed for the different specific cases. The system is symmetric, meaning that the encoder and decoder have the same time and space complexities.
`
A primary reason for the improved coding efficiency by context modeling of errors and error feedback lies in the fact that the prediction error sequence is a composition of multiple sources with distinct underlying statistics. The use of proper contexts to separate the statistically distinct sources of errors from the code stream can yield a significant reduction in conditional entropies.
`
The invention will be better understood by reference to the following detailed description in connection with the accompanying drawings.
`
BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a schematic description of an encoding process for compression according to the invention, in which the decoding process is the reverse.

Fig. 2 illustrates the labeling of neighboring pixels for context-based prediction and modeling.

Fig. 3 shows four possible convexity relationships between a GAP predictor Î and two neighboring pixels.

Fig. 4 illustrates a two-stage adaptive prediction scheme via context modeling of prediction errors and error feedback.

Fig. 5 is a chart depicting a method for determining a possible hierarchy of contexts (causal templates around a coded pixel marked by '?') in which the optimal modeling context is selected.

Fig. 6 is a two-part graph illustrating a sign flipping technique to sharpen conditional error probability density functions for entropy reduction without increasing modeling space.

Fig. 7 is a block diagram of a four-pass encoding scheme for progressive transmission according to the invention.

Fig. 8 is a diagram and key illustrating interlaced sampling for the first pass and the configuration of prediction and modeling context for such a pass.

Fig. 9 is a diagram and key illustrating interlaced sampling for the second pass and the configuration of prediction and modeling context for such a pass.

Fig. 10 is a diagram and key illustrating functionally lossless reconstruction at the third pass and the configuration of prediction and modeling context used in this pass.
`
DESCRIPTION OF SPECIFIC EMBODIMENTS

Coding Processes
`
Referring to Fig. 1, before actual compression, a system 10 operating according to the invention includes a decision maker 12 which dynamically selects, on parts of the image, between a compression scheme for continuous-tone image data and a compression scheme for binary image data, based on the context of the pixel data. In binary mode, the image data is compressed by a first encoder 14 using context-based, adaptive, ternary entropy encoding. In the continuous-tone mode, the image data is compressed by a second encoder 16 using context-based, adaptive, non-linear predictive coding. Before encoding or decoding an image, the system is designed to provide for or select between sequential transmission and progressive transmission, to encode or decode image data by the appropriate types of context-sensitive entropy-encoding and decoding processors. Image data encoded and decoded in raster scan order in a single pass through the image is preferably transmitted by sequential transmission. The coding process uses prediction contexts that involve only the previous two or three scan lines of coded pixels. Consequently, the encoding and decoding processes require only a simple double buffer 18 that holds only the two or three rows of pixels that immediately precede the current pixel. (Where greater buffer space is available, the system can achieve progressive transmission via interlaced subsampling that encodes, and accordingly decodes, an image in four passes.)
`
In coding a picture, the CALIC system dynamically operates in two modes: binary and continuous-tone. The binary mode is for the situation in which the current locality of interest in the input image has no more than two distinct intensity values; hence it is designed for a more general class of images or subimages than the class of black-and-white images. An innovation of the system is to distinguish between binary and continuous-tone types of pictures on a local, rather than a global, basis. The system automatically selects one of the two modes depending on the context of the current pixel. In the binary mode, the context-based adaptive entropy coder 14 is used to code three symbols, including an escape symbol. This distinction between binary and continuous-tone modes is important because the coding methodologies are very different in the two modes, and it contributes to the universality and robustness of the CALIC system over a wide range of pictures, including those that mix text, graphics and photographs in multimedia applications.
The selection between the continuous-tone and binary modes is based on pixel context. The mode selection is automatic and completely transparent to the user. No side information about mode switching is required. The coded image only needs a header of five bytes: two for width, two for height, and one for depth, or the number of bits in intensity resolution.
`
In the continuous-tone mode, a mechanism 20 is provided to compress pixel values by context-based, adaptive, predictive coding. To predict the current pixel value, a predictor 22 is provided for a local intensity gradient to be estimated based on the neighboring pixels in the horizontal and vertical directions. These estimates are used to detect the magnitude and orientation of edges in the input image so that adjustments can be made to the prediction accordingly. This gradient-adjusted prediction (GAP), denoted by Î, of I is further corrected based on the particular pattern that the neighboring pixels exhibit (shown here as an output of an error modeling function 26). This is necessary because gradients alone cannot adequately characterize some of the more complex relationships between the predicted pixel and its surroundings. Context modeling can exploit higher-order structures such as texture patterns in the image. This is done by quantizing the neighborhood into different conditioning classes, and estimating the expectation of the GAP prediction error e = I - Î in each of these classes separately. The reasons for estimating conditional expectations rather than the conditional density functions, as commonly suggested in the literature, are to prevent the problem of context dilution (sample counts are insufficient to obtain good estimates), and to reduce time and space complexities. The conditional error expectation estimate ē is then fed back to the GAP predictor 22 to generate a second and improved prediction, denoted by Ĩ, which is equal to Î + ē. The net effect is a context-based, adaptive, non-linear predictor that can correct itself by learning from its past mistakes made in the same context. This is a key feature distinguishing the prediction scheme of the CALIC system from existing prediction schemes.
`
The final prediction error ε = I - Ĩ is entropy encoded. In driving the entropy coder 16, a modest number (between 8 and 32) of conditional probabilities of prediction errors ε are estimated (by an estimator 24) depending on an estimate of the prediction error energy. The error energy estimator is a linear function of the horizontal and vertical gradients and of the prediction errors at the preceding pixels. It can be optimized off line via linear regression for specific types of images. The conditioning classes for entropy coding are generated by quantizing the error energy estimator. The quantization can also be optimized via off-line dynamic programming to minimize the total code length of a set of training images. In addition, the CALIC system uses a novel and elegant technique to "sharpen" the conditional probabilities (thus reducing the underlying entropy) for entropy coding, by a simple sign flipping mechanism 28, without any increase in modeling space.
`
Gradient-Adjusted Prediction
`
A nonlinear predictor according to the invention adapts itself to the image gradients near the predicted pixel. This predictor improves the robustness of traditional DPCM predictors, particularly in areas of edges. The basic function is now explained.

Denote an input image of width W and height H by I[i,j], 0 ≤ i < W, 0 ≤ j < H. In order not to obscure the concept of the proposed compression algorithm, we consider in the following development only the sequential coding of the interior pixels I[i,j], 2 ≤ i < W-2, 2 ≤ j < H-2. Many treatments of the boundary pixels are possible, and they do not make a significant difference in the final compression ratio due to the small population of boundary pixels. For instance, boundary pixels can be coded by a simple DPCM scheme as they are encountered in the raster scan.

To facilitate the prediction of I[i,j] and the entropy coding of the prediction error via context modeling, we compute the following quantities:
`
dh = |I[i-1,j] - I[i-2,j]| + |I[i,j-1] - I[i-1,j-1]| + |I[i+1,j-1] - I[i,j-1]|
dv = |I[i-1,j] - I[i-1,j-1]| + |I[i,j-1] - I[i,j-2]| + |I[i+1,j-1] - I[i+1,j-2]|        (2)
`
Clearly, dh and dv are estimates, within a scaling factor, of the gradients of the intensity function near pixel I[i,j] in the horizontal and vertical directions. The values of dh and dv are used to detect the magnitude and orientation of edges in the input image, and to make the necessary adjustments in the prediction. We aim to alleviate the problem that the precision of existing DPCM-type predictors can be adversely affected by edges. In Equation (2), three absolute differences are used for d in each direction; this has been found to give the best compression results. Two or one absolute differences can be used here for lower complexity, with a small loss in performance. Efficient incremental and/or parallel schemes for evaluating dh and dv are straightforward. For instance, to avoid unnecessary repeated computations, one can store the values of |I[·,·] - I[·,·]| associated with the preceding pixels for future reference. This only requires an array of size W.
`
For simple notation in the sequel, we let

n = I[i,j-1],  w = I[i-1,j],  ne = I[i+1,j-1],
nw = I[i-1,j-1],  nn = I[i,j-2],  ww = I[i-2,j],        (3)

meaning north, west, northeast, northwest, north-north, and west-west, respectively. The locations of these pixels with respect to I[i,j] are given in Fig. 2.
`
Based on dh and dv, a simple technique, described by the following conditional statements, is used to make a gradient-adjusted prediction (GAP) Î[i,j] of I[i,j]:
`
`
IF (dv - dh > 80)  {sharp horizontal edge}
    Î[i,j] = w
ELSE IF (dv - dh < -80)  {sharp vertical edge}
    Î[i,j] = n
ELSE {
    Î[i,j] = (w+n)/2 + (ne-nw)/4;
    IF (dv - dh > 32)  {horizontal edge}
        Î[i,j] = (Î[i,j]+w)/2
    ELSE IF (dv - dh > 8)  {weak horizontal edge}
        Î[i,j] = (3Î[i,j]+w)/4
    ELSE IF (dv - dh < -32)  {vertical edge}
        Î[i,j] = (Î[i,j]+n)/2
    ELSE IF (dv - dh < -8)  {weak vertical edge}
        Î[i,j] = (3Î[i,j]+n)/4
}

The procedure given above is parallelizable. This technique differs from the existing linear predictors in that it weights the neighboring pixels of I[i,j] according to the estimated gradients of the image. In effect, Î[i,j] is a simple, adaptive, nonlinear predictor. The predictor coefficients and thresholds given above were empirically chosen. A major criterion in choosing these coefficients is the ease of computation. For instance, most coefficients are powers of 2, to avoid multiplications and divisions. The multiples of 3 in the procedure above can be computed by a bit shift and an addition.
`
It is possible, albeit quite expensive, to optimize the coefficients and thresholds for an image or a class of images, so that a norm of the expected prediction error E{||I - Î||} is minimized. It is not recommended that such an optimization process be carried out on an image-by-image basis. However, it is important to point out that the coefficients and thresholds in computing Î[i,j] can be set by the user, if the user knows the optimal or nearly optimal coefficients and thresholds for the target images.
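As an illustrative sketch (not the patent's reference code), the gradient estimates of Equation (2) and the threshold procedure above translate directly into Python. The row-major layout `img[j][i]` and the function name are assumptions of this sketch, and Python's floor division stands in for the integer arithmetic:

```python
def gap_predict(img, i, j):
    """Gradient-adjusted prediction (GAP) of pixel I[i,j].
    img[j][i] holds the pixel at column i, row j; i and j must be
    interior coordinates so that every neighbor below exists."""
    I = lambda x, y: img[y][x]
    n, w   = I(i, j-1), I(i-1, j)
    ne, nw = I(i+1, j-1), I(i-1, j-1)

    # Horizontal and vertical gradient estimates, Equation (2).
    d_h = (abs(I(i-1, j) - I(i-2, j)) + abs(I(i, j-1) - I(i-1, j-1))
           + abs(I(i+1, j-1) - I(i, j-1)))
    d_v = (abs(I(i-1, j) - I(i-1, j-1)) + abs(I(i, j-1) - I(i, j-2))
           + abs(I(i+1, j-1) - I(i+1, j-2)))

    d = d_v - d_h
    if d > 80:          # sharp horizontal edge
        return w
    if d < -80:         # sharp vertical edge
        return n
    pred = (w + n) // 2 + (ne - nw) // 4
    if d > 32:          # horizontal edge
        pred = (pred + w) // 2
    elif d > 8:         # weak horizontal edge
        pred = (3 * pred + w) // 4
    elif d < -32:       # vertical edge
        pred = (pred + n) // 2
    elif d < -8:        # weak vertical edge
        pred = (3 * pred + n) // 4
    return pred
```

On a locally flat area all gradient terms vanish and the sketch reduces to (w+n)/2 + (ne-nw)/4, i.e., the local intensity itself.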
`
`
Error Energy Quantization for Minimum Entropy
`
Although the nonlinear predictor Î[i,j] outperforms linear predictors, it does not completely remove the statistical redundancy in the image. The variance of the prediction errors e = I - Î strongly correlates to the smoothness of the image around the predicted pixel I[i,j]. To model this correlation at a small computational cost, we define an error energy estimator to be

Δ = a·dh + b·dv + c·|ew|,        (4)

where dh and dv are defined in Equation (2) and are reused here to quantify the variability of pixel values, and ew = I[i-1,j] - Î[i-1,j]. |ew| is chosen because large errors tend to occur consecutively. The coefficients a, b, and c can be determined, in an off-line design process, by the standard linear regression technique, so that Δ is the least-squares estimator of the error strength |e| based on dh, dv, and |ew|. For algorithm efficiency, we recommend a = b = 1 and c = 2 in Equation (4).
`
In our experiments we also found that other definitions of Δ can sometimes estimate the error energy better, such as

Δ = a·min{dh, dv} + b·max{|ew|, |en|},

depending on the images, where en = I[i,j-1] - Î[i,j-1].
`
By conditioning the error distribution on Δ, we can separate the prediction errors into classes of different variances. Thus entropy coding of the errors using the estimated conditional probability p(e|Δ) improves coding efficiency over using p(e). For time and space efficiency, we quantize Δ to L levels. In practice, L = 8 is found to be sufficient; larger L improves coding efficiency only marginally. Denote the Δ quantizer by Q, i.e., Q(Δ) ∈ {0,1,...,7}. In entropy coding of prediction errors, we estimate and use eight conditional probabilities p(e|Q(Δ)). Since Δ is a random variable, it requires only scalar quantization. The quantization criterion is to minimize the conditional entropy of the errors.
`
In an off-line design process, we get a training set of (e,Δ) pairs from test images, and use the standard dynamic programming technique to choose 0 = q0 < q1 < ... < q(L-1) < qL = ∞, partitioning Δ into L intervals, such that

Σ_{d=0}^{L-1} ( - Σ_{q_d ≤ Δ < q_{d+1}} p(e) log p(e) )

is minimized. In practice, we found that an image-independent Δ quantizer with fixed bins,

q1 = 5,  q2 = 15,  q3 = 25,  q4 = 42,  q5 = 60,  q6 = 85,  q7 = 140,        (7)

worked almost as well as the optimal image-dependent Δ quantizer.
`
Estimating L = 8 conditional error probabilities p(e|Q(Δ)) requires only a modest amount of memory in the phase of entropy coding of prediction errors. Furthermore, the small number of involved conditional error probabilities means that even small images will provide enough samples for context modeling to learn p(e|Q(Δ)) quickly in adaptive entropy coding.
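A minimal sketch of the estimator of Equation (4) with the recommended a = b = 1, c = 2, and the fixed bins of (7); the function names are hypothetical, and `bisect` locates the half-open interval q_d ≤ Δ < q_{d+1}:

```python
import bisect

# Fixed bin boundaries q1..q7 from (7); Q(delta) is a level in 0..7.
Q_BINS = [5, 15, 25, 42, 60, 85, 140]

def error_energy(d_h, d_v, e_w, a=1, b=1, c=2):
    """Error energy estimator, Equation (4): Delta = a*dh + b*dv + c*|ew|."""
    return a * d_h + b * d_v + c * abs(e_w)

def quantize(delta):
    """Map Delta to one of L = 8 conditioning levels, with level d
    covering q_d <= Delta < q_{d+1}."""
    return bisect.bisect_right(Q_BINS, delta)
```

For instance, `quantize(error_energy(3, 2, 1))` evaluates Δ = 3 + 2 + 2 = 7, which falls in the second bin (level 1).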
`
Context Modeling of Prediction Errors

The precision of the nonlinear GAP predictor Î[i,j] can be significantly improved via context modeling, because gradients alone cannot adequately characterize some of the more complex relationships between the predicted pixel I[i,j] and its surroundings. Context modeling of the prediction error e = I - Î can exploit higher-order structures such as texture patterns in the image for further compression gains.

Denote by a K-tuple C = {x0, x1, ..., x(K-1)} a modeling context of I[i,j] that consists of K events xi.
For both space and time efficiency, and to avoid the problem of context dilution, we need to drastically reduce the number of possible contexts C by vector quantization of C. Specifically, we consider a prediction context of eight events

C = {x0, x1, ..., x6, x7} = {n, w, nw, ne, nn, ww, 2n-nn, 2w-ww}        (8)

and quantize the context C to an 8-bit binary number B = b7 b6 ... b0, using the prediction value Î as the threshold, namely

bk = 0 if xk ≥ Î[i,j];  bk = 1 if xk < Î[i,j];  0 ≤ k < K = 8.        (9)
`
Clearly, the number B (a binary vector codeword) encodes the texture patterns in the modeling context, which are indicative of the behavior of e. Also note that an event xi in a prediction context need not be a neighboring pixel of I[i,j]; it can be a function of some neighboring pixels. By letting x6 = 2n-nn and x7 = 2w-ww in the above, and consequently setting b6 and b7 depending on whether 2n-nn < Î[i,j] and 2w-ww < Î[i,j], we can detect whether the prediction value Î[i,j] forms a convex or concave waveform with respect to the neighboring pixels in the vertical and horizontal directions, as depicted by Fig. 3. This convexity information is useful in the context modeling of e.
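The quantization in (8) and (9) amounts to an 8-bit thresholding pass. A sketch (illustrative only; the function name is an assumption, and bit k of B corresponds to event xk):

```python
def texture_pattern(n, w, nw, ne, nn, ww, pred):
    """Quantize the prediction context of (8) against the GAP value
    `pred` (Î[i,j]) per (9): bit k of B is 1 iff event x_k < pred."""
    events = [n, w, nw, ne, nn, ww, 2*n - nn, 2*w - ww]   # x0..x7
    B = 0
    for k, x in enumerate(events):
        if x < pred:
            B |= 1 << k
    return B
```

In a perfectly flat neighborhood every event equals the prediction, so no bit is set and B = 0.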
`
Since the variability of the neighboring pixels also influences the error distribution, we combine the quantized error energy 0 ≤ ⌊Q(Δ)/2⌋ < L/2 with the quantized texture pattern 0 ≤ B < 2^8 to form compound prediction contexts, denoted by C(⌊Q(Δ)/2⌋, B). This scheme can be viewed as a product quantization of two independently treated image features: spatial texture patterns and the energy of prediction errors.

At a glance, we would seemingly use 4 · 2^8 = 1024 different compound contexts, since 0 ≤ ⌊Q(Δ)/2⌋ < L/2 = 4 and 0 ≤ B < 2^8 = 256. However, not all 2^8 binary codewords of B under the quantizer defined by (9) are possible. If the prediction value Î[i,j] lies in between n and nn, then the condition 2n-nn < Î[i,j] is either always true or always false. In other words, the outcome of the test 2n-nn < Î[i,j] in (9) is uniquely determined if n > Î[i,j] and nn < Î[i,j], or if n < Î[i,j] and nn > Î[i,j]. This should be apparent from Fig. 3. By referring to (8) and (9), we see that in B, b6 is fixed if b0 and b4 are different. Likewise, b7 is fixed if b1 and b5 are different. Because b6, b0, and b4 are not independent, they can only yield 2·2 + 2·1 = 6 combinations. The same holds for b7, b1, and b5. Thus, B represents only 6·6·4 = 144 rather than 256 valid texture patterns. Finally, the total number of valid compound contexts in error modeling is only 4 · 144 = 576.
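The count of 144 valid patterns can be checked by brute force under one consistent reading of the constraint above (when b0 and b4 differ, b6 is forced to agree with b0; likewise b7 with b1). This enumeration is an illustration, not part of the patent:

```python
def is_valid(B):
    """True if the 8-bit codeword B is reachable under the dependency
    between the convexity bits (b6, b7) and the pixel bits."""
    b = [(B >> k) & 1 for k in range(8)]
    # When n and nn fall on opposite sides of the prediction, the test
    # on 2n-nn is forced to agree with the test on n (b6 = b0);
    # likewise 2w-ww is forced to agree with w (b7 = b1).
    ok_v = (b[0] == b[4]) or (b[6] == b[0])
    ok_h = (b[1] == b[5]) or (b[7] == b[1])
    return ok_v and ok_h

valid = [B for B in range(256) if is_valid(B)]
print(len(valid))        # 144 valid texture patterns
print(4 * len(valid))    # 576 compound contexts
```

Each dependent triple contributes 2·2 + 2·1 = 6 combinations, the two free bits (for nw and ne) contribute 4, and 6·6·4 = 144.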
`
The model cost is still too high to estimate p(e|C(⌊Q(Δ)/2⌋, B)) for 576 compound contexts. Therefore, we estimate only the conditional expectations E{e|C(δ,B)}, writing δ for ⌊Q(Δ)/2⌋, using the corresponding sample means ē(δ,B) for the different compound contexts. Computing ē(δ,B) involves only accumulating the error terms in each compound context and maintaining a count of the occurrences of each context. For each compound context C(δ,B), 0 ≤ δ < 4, 0 ≤ B < 144, we count the number N(δ,B) of occurrences of C(δ,B) for the purpose of computing the conditional sample mean ē(δ,B). We use only one byte to store the occurrence count. Whenever the count reaches some value nmax < 256, we scale down the count but leave the corresponding ē(δ,B) intact. We recommend nmax = 128.
In order to update ē(δ,β) in error modeling, we use an accumulator S(δ,β) of e(δ,β), and compute

    ē(δ,β) = S(δ,β)/N(δ,β)

whenever a compound context C(δ,β) occurs. We use a 16-bit signed integer to store S(δ,β), whose range is more than sufficient in practice. It takes more than 128 consecutive e(δ,β) = −255 or e(δ,β) = 255 to violate the bounds
−2^15 < S(δ,β) < 2^15, but we have n_max = 128. Once N(δ,β) reaches 128, we rescale the variables by setting

    S(δ,β) = S(δ,β)/2;   N(δ,β) = 64        (11)

In summary, the coding system needs one byte for N(δ,β) and two bytes for S(δ,β) for each of the 576 compound contexts. Thus the total size of the memory required by context modeling of prediction errors can be as small as 1728 bytes.
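The update-and-rescale bookkeeping described above can be sketched as follows. This is a minimal illustration, not the patented implementation; the class and variable names are ours, and we assume integer halving for the accumulator in the rescaling step.

```python
# Minimal sketch of per-context error statistics: one occurrence count N and
# one error accumulator S per compound context C(delta, beta), with the
# rescaling of (11) applied when the count reaches n_max = 128.
N_MAX = 128

class ContextErrorStats:
    def __init__(self, n_delta=4, n_beta=144):
        self.N = [[0] * n_beta for _ in range(n_delta)]  # occurrence counts
        self.S = [[0] * n_beta for _ in range(n_delta)]  # error accumulators

    def mean(self, d, b):
        # Conditional sample mean e_bar(delta, beta) = S/N (0 for unseen context).
        return self.S[d][b] / self.N[d][b] if self.N[d][b] else 0.0

    def update(self, d, b, e):
        self.S[d][b] += e
        self.N[d][b] += 1
        if self.N[d][b] >= N_MAX:   # rescaling step (11)
            self.S[d][b] //= 2      # halve the accumulator (assumed integer halving)
            self.N[d][b] = 64       # reset the count

stats = ContextErrorStats()
for _ in range(130):                # enough occurrences to force one rescale
    stats.update(0, 0, 2)
print(stats.mean(0, 0))             # 2.0
```

Note how the sample mean survives the rescale: halving S while setting N to 64 preserves the ratio, which is what keeps ē(δ,β) intact while aging older observations.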
Besides being a technique to reduce memory use, rescaling also has the beneficial side effect of aging the observed data. This is an inexpensive way of adapting the context error modeling to time-varying sources. Indeed, the scaling technique gave slightly better compression in our experiments.
Context-based Adaptive Predictor via Error Feedback
The idea of gaining coding efficiency by context modeling of the expected errors E{e|C(δ,β)} came from the observation that the conditional mean ē(δ,β) is generally not zero in a given context C(δ,β). This does not contradict the well-known fact that the DPCM prediction error, without conditioning on the contexts, follows a Laplacian (symmetric exponential) distribution, and hence is zero-mean for most continuous-tone images. The Laplacian distribution can be viewed as a composition of distributions of different means and different variances. Context modeling of prediction errors is a means to separate these distributions. Therefore, the more biased ē(δ,β) is from zero, the more effective the modeling is.
Since the conditional mean ē(δ,β) is the most likely prediction error in a given compound context C(δ,β), we can improve the prediction by feeding back ē(δ,β) and adjusting the prediction Ī to Ĩ = Ī + ē(δ,β).
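A self-contained sketch of this feedback step follows. It is a simplified illustration under our own naming, with a dictionary standing in for the S(δ,β) and N(δ,β) tables; it accumulates the raw error e = I − Ī and omits the refinement discussed next.

```python
# Hypothetical sketch of error feedback: adjust the raw prediction by the
# conditional sample mean of past errors observed in the same compound context.
S = {}  # per-context error accumulator, like S(delta, beta)
N = {}  # per-context occurrence count, like N(delta, beta)

def adjusted_prediction(raw_pred, ctx):
    # I_tilde = I_bar + e_bar(delta, beta); e_bar is 0 for an unseen context
    return raw_pred + (S[ctx] / N[ctx] if N.get(ctx) else 0.0)

def record_error(ctx, actual, raw_pred):
    # Accumulate e = I - I_bar for this context.
    S[ctx] = S.get(ctx, 0) + (actual - raw_pred)
    N[ctx] = N.get(ctx, 0) + 1

# A context that has consistently erred by +3 shifts future predictions by +3:
for _ in range(10):
    record_error((0, 5), actual=103, raw_pred=100)
print(adjusted_prediction(100, (0, 5)))  # 103.0
```

The point of the sketch is that the correction is purely context-conditional: two pixels with the same raw prediction but different compound contexts receive different adjustments.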
In order not to over-adjust the GAP predictor, we actually consider the new prediction error ε = I − Ĩ rather than e = I − Ī in context-based error modeling. The context modeling of ε in turn leads to an improved predictor for