Beretta et al.

US005850484A

[11] Patent Number: 5,850,484
[45] Date of Patent: Dec. 15, 1998
`
[54] TEXT AND IMAGE SHARPENING OF JPEG COMPRESSED IMAGES IN THE FREQUENCY DOMAIN
`
[75] Inventors: Giordano Beretta, Palo Alto; Vasudev Bhaskaran, Mountain View; Konstantinos Konstantinides, San Jose, all of Calif.

[73] Assignee: Hewlett-Packard Co., Palo Alto, Calif.

[21] Appl. No.: 940,695

[22] Filed: Sep. 30, 1997
`
Related U.S. Application Data

[63] Continuation of Ser. No. 411,369, Mar. 27, 1995, abandoned.

[51] Int. Cl.6 ................ G06K 9/36
[52] U.S. Cl. ................ 382/250; 382/251; 382/239; 358/432; 348/404
[58] Field of Search ........ 382/298, 233, 251, 244, 232, 253, 250, 274, 252, 238, 236, 166, 280, 270; 358/427, 426, 432, 261.3, 448, 261.1, 433, 261.2, 430, 458; 348/404, 432, 405, 433, 403, 391, 384, 422, 393, 430, 394, 409, 395, 390
`
[56] References Cited

U.S. PATENT DOCUMENTS

4,776,030  10/1988  Tzou ................... 358/432
4,780,761  10/1988  Daly et al. ............ 358/133
5,063,608  11/1991  Siegel ................. 382/56
5,073,820  12/1991  Nakagawa et al. ........ 358/133
5,333,212   7/1994  Ligtenberg ............. 382/56
5,410,352   4/1995  Watanabe ............... 348/405
5,465,164  11/1995  Sugiura ................ 358/432
5,488,570   1/1996  Agarwal ................ 364/514 R

FOREIGN PATENT DOCUMENTS

0444884A2  2/1991  European Pat. Off.
0513520A2  4/1992  European Pat. Off. ..... H04N 7/133
0593159A2  9/1993  European Pat. Off. ..... G06F 15/64
07087491   3/1995  Japan ................... H04N 7/30
07143343   6/1995  Japan ................... H04N 1/41
`
OTHER PUBLICATIONS

G. B. Beretta et al., "Experience with the New Color Facsimile Standard", ISCC Annual Meeting, Apr. 23-25, 1995, pp. 1-7.
Albert J. Ahumada, Jr. et al., "Luminance-Model-Based DCT Quantization for Color Image Compression", Human Vision, Visual Processing, and Digital Display III, 1666, 365-374, SPIE, 1992.

(List continued on next page.)
`
Primary Examiner-Jose L. Couso
Assistant Examiner-Matthew C. Bella

[57] ABSTRACT
`
The text and image enhancing technique according to the invention is integrated into the decoding or inverse quantization step that is required by the JPEG standard. The invention integrates the two by using two different quantization tables: a first quantization table (QE) used to quantize the image data during the compression step, and a second quantization table (QD) used during the decode or inverse quantization step of the decompression process. The second quantization table QD is related to the first quantization table according to a predetermined function of the energy in a reference image and the energy in a scanned image. The energy of the reference image lost during the scanning process, as represented by the energy in the scanned image, is restored during the decompression process by appropriately scaling the second quantization table according to the predetermined function. The difference between the two tables, in particular the ratio of the two tables, determines the amount of image enhancement performed in the two steps. Because the image enhancing and inverse quantization steps are integrated, the method requires no computations beyond those already required for the compression and decompression processes.
`
35 Claims, 7 Drawing Sheets
OTHER PUBLICATIONS
`
Kenneth R. Alexander et al., "Spatial-Frequency Characteristics of Letter Identification", J. Opt. Soc. Am. A, 11, 9, 2375-2382, 1994.
Wen-Hsiung Chen et al., "Adaptive Coding of Monochrome and Color Images", IEEE Transactions on Communications, COM-25, 1285-1292, 1977.
Bowonkoon Chitprasert et al., "Human Visual Weighted Progressive Image Transmission", IEEE Transactions on Communications, COM-38, 7, 1040-1044, 1990.
R. J. Clarke, "Spectral Responses of the Discrete Cosine and Walsh-Hadamard Transforms", IEE Proc., 130, Part F, 309-313, 1983.
K. K. De Valois et al., "Color-Luminance Masking Interactions", Seeing Contour and Colour, J. J. Kulikowski, C. M. Dickinson and I. J. Murray, Editors, Pergamon Press, Oxford, 1989.
J. Raymond Edinger, Jr., "A Measure for Stairstepping in Digitized Text that Correlates with the Subjective Impression of Quality", IS&T's Tenth International Congress on Advances in Non-Impact Printing Technologies, 552-558, 1994.
Yasushi Hoshino et al., "Applicability of a Standardized Discrete Cosine Transform Coding Method to Character Images", J. Electronic Imaging, 1, 3, 322-327, 1992.
Chansik Hwang et al., "Human Visual System Weighted Progressive Image Transmission Using Lapped Orthogonal Transform/Classified Vector Quantization", Optical Engineering, 32, 7, 1524-1530, 1993.
International Organization for Standardization: "Information Technology-Digital Compression and Coding of Continuous-Tone Still Images-Part 1: Requirements and Guidelines", ISO/IEC IS 10918-1, Oct. 20, 1992.
International Telecommunication Union: "Amendments to ITU-T Rec. T.30 for Enabling Continuous-Tone Colour and Gray-Scale Modes for Group 3", COM 8-43-E, Question 5/8, Mar. 1994.
International Telecommunication Union: "Amendments to ITU-T Rec. T.4 for Enabling Continuous-Tone Colour and Gray-Scale Modes for Group 3", COM 8-44-E, Question 5/8, Mar. 1994.
Gordon E. Legge, "Reading: Effects of Contrast and Spatial Frequency", Applied Vision, OSA Technical Digest Series, 16, 90-93, 1989.
Gordon E. Legge et al., "Contrast Masking in Human Vision", J. Opt. Soc. Am., 70, 12, 1458-1471, 1980.
David L. McLaren et al., "Removal of Subjective Redundancy from DCT-Coded Images", IEE Proceedings-I, 138, 5, 345-350, 1991.
I. Miyagawa et al., "Color-Facsimile System for Mixed-Color Documents", SID 94 Digest, 887-890, 1994.
Kathy T. Mullen, "The Contrast Sensitivity of Human Colour Vision to Red-Green and Blue-Yellow Chromatic Gratings", J. Physiol., 359, 381-400, 1985.
David H. Parish et al., "Object Spatial Frequencies, Retinal Spatial Frequencies, Noise, and the Efficiency of Letter Discrimination", Vision Res., 31, 7/8, 1399-1415, 1991.
Denis G. Pelli et al., "Visual Factors in Letter Identification", IS&T's 47th Annual Conference/ICPS, p. 411, 1994.
Heidi A. Peterson et al., "An Improved Detection Model for DCT Coefficient Quantization", Human Vision, Visual Processing, and Digital Display IV, 1913, 191-201, SPIE, 1993.
Ricardo L. de Queiroz et al., "Human Visual Sensitivity-Weighted Progressive Image Transmission Using the Lapped Orthogonal Transform", J. Electronic Imaging, 1, 3, 328-338, 1992.
Ricardo L. de Queiroz et al., "Modulated Lapped Orthogonal Transforms in Image Coding", Digital Video Compression on Personal Computers: Algorithms and Technologies, 2187, 80-91, SPIE, 1993.
Robert J. Safranek et al., "A Perceptually Tuned Sub-Band Image Coder with Image Dependent Quantization and Post-Quantization Data Compression", Proc. ICASSP 89, 3, 1945-1948, 1989.
Robert J. Safranek, "JPEG Compliant Encoder Utilizing Perceptually Based Quantization", Human Vision, Visual Processing, and Digital Display V, 1913, 117-126, SPIE, 1993.
Andrew B. Watson, "DCT Quantization Matrices Visually Optimized for Individual Images", Human Vision, Visual Processing, and Digital Display IV, 1913, 202-216, SPIE, 1993.
Andrew B. Watson et al., "Discrete Cosine Transform (DCT) Basis Function Visibility: Effects of Viewing Distance and Contrast Masking", Human Vision, Visual Processing, and Digital Display V, 2179, 99-108, SPIE, 1994.
`
[FIG. 1 (PRIOR ART), Sheet 1 of 7: Block diagram of a typical JPEG compression engine. Source image data passes through raster-to-block translation, FDCT, quantization (using the Q tables), and entropy coding (using the H tables) to produce compressed image data.]
[FIG. 2, Sheet 2 of 7: Typical format of a JPEG compressed file, comprising a JPEG header, Q tables, H tables, and the compressed image data.]
[FIG. 3 (PRIOR ART), Sheet 3 of 7: Block diagram of a prior art JPEG decompression engine. Compressed image data passes through header extraction (recovering the H tables and Q tables), entropy decoding, inverse quantization, IDCT, and block-to-raster translation to produce source image data.]
[FIG. 4, Sheet 4 of 7: Flow chart for forming a scaled quantization table: generate reference image and determine its average energy (V*); select scanned image and determine its average energy (Vy); compute the scaling matrix S; scale the Q table.]
[FIG. 5, Sheet 5 of 7: JPEG compressed file including a quantization table scaled according to the invention, comprising a JPEG header, scaled Q tables, H tables, and the compressed image data.]
[FIG. 6, Sheet 6 of 7: Block diagram of a JPEG decompression engine according to the invention. Compressed image data passes through header extraction (recovering the H table and, via a scaler, the scaled Q table), entropy decoding, inverse quantization, IDCT, and block-to-raster translation to produce source image data.]
[FIG. 7, Sheet 7 of 7: Block diagram of a color fax machine. An image document is scanned, passed through a corrections and transformations engine, a JPEG compression engine, and a G3/G4 encapsulation engine, and sent as compressed image data over transmission means; on the receiving side, a G3/G4 decoding engine, a JPEG decompression engine, and a corrections and transformations engine drive a printer that produces the reproduced image.]
TEXT AND IMAGE SHARPENING OF JPEG COMPRESSED IMAGES IN THE FREQUENCY DOMAIN

CROSS REFERENCE TO RELATED APPLICATION

This is a continuation of application Ser. No. 08/411,369 filed on Mar. 27, 1995, now abandoned.

RELATED APPLICATION DATA

This application incorporates subject matter disclosed in commonly-assigned application entitled METHOD FOR SELECTING JPEG QUANTIZATION TABLES FOR LOW BANDWIDTH APPLICATIONS, Ser. No. 08/935,517, filed on even date herewith.
`
BACKGROUND OF THE INVENTION

This invention relates to data compression using the JPEG compression standard for continuous-tone still images, both grayscale and color.

A committee known as "JPEG," which stands for "Joint Photographic Experts Group," has established a standard for compressing continuous-tone still images, both grayscale and color. This standard represents a compromise between reproducible image quality and compression rate. To achieve acceptable compression rates (the ratio of the uncompressed image size to the compressed image size), the JPEG standard adopted a lossy compression technique. The lossy compression technique was required given the inordinate amount of data needed to represent a color image, on the order of 10 megabytes for a 200 dots per inch (DPI) 8.5"x11" image. By carefully implementing the JPEG standard, however, the loss in the image can be confined to imperceptible areas of the image, which produces a perceptually lossless uncompressed image. The achievable compression rates using this technique are in the range of 10:1 to 50:1.

FIG. 1 shows a block diagram of a typical implementation of the JPEG compression standard. The block diagram will be referred to as a compression engine. The compression engine 10 operates on source image data, which represents a source image in a given color space such as CIELAB. The source image data has a certain resolution, which is determined by how the image was captured. Each individual datum of the source image data represents an image pixel. The pixel further has a depth, which is determined by the number of bits used to represent the image pixel.

The source image data is typically formatted as a raster stream of data. The compression technique, however, requires the data to be represented in blocks. These blocks represent a two-dimensional portion of the source image data. The JPEG standard uses 8x8 blocks of data. Therefore, a raster-to-block translation unit 12 translates the raster source image data into 8x8 blocks of source image data. The source image data is also shifted from unsigned integers to signed integers to put it into the proper format for the next stage in the compression process. These 8x8 blocks are then forwarded to a discrete cosine transformer 16 via bus 14.

The discrete cosine transformer 16 converts the source image data into transformed image data using the discrete cosine transform (DCT). The DCT, as is known in the art of image processing, decomposes the 8x8 block of source image data into 64 DCT elements or coefficients, each of which corresponds to a respective DCT basis vector.
These basis vectors are unique 2-dimensional (2D) "spatial waveforms," which are the fundamental units in the DCT space. These basis vectors can be intuitively thought of as representing unique images, wherein any source image can be decomposed into a weighted sum of these unique images. The discrete cosine transformer uses the forward discrete cosine transform (FDCT) shown below, hence the name.
`
`cos
`
`(2x+ 1)k:Jt
`16
`
`(2y + 1)ln
`16
`
`]
`
`cos
`
`15 where:
`C(k), C(l)=1/v'2 for k,l=O; and
`C(k), C(l)=1 otherwise
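As a concrete illustration of the transform just defined, the following is a minimal Python/NumPy sketch for a single 8x8 block. The function name and the direct four-loop evaluation are illustrative choices, not part of the patent.

```python
import numpy as np

def fdct_8x8(block: np.ndarray) -> np.ndarray:
    """Forward DCT of one 8x8 block of (level-shifted) samples,
    following the FDCT definition given above."""
    assert block.shape == (8, 8)
    coeffs = np.zeros((8, 8))
    c = lambda i: 1.0 / np.sqrt(2.0) if i == 0 else 1.0  # C(k), C(l)
    for k in range(8):
        for l in range(8):
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += block[x, y] \
                        * np.cos((2 * x + 1) * k * np.pi / 16.0) \
                        * np.cos((2 * y + 1) * l * np.pi / 16.0)
            coeffs[k, l] = 0.25 * c(k) * c(l) * s
    return coeffs
```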
The output of the transformer 16 is an 8x8 block of DCT elements or coefficients, corresponding to the DCT basis vectors. This block of transformed image data is then forwarded to a quantizer 20 over a bus 18. The quantizer 20 quantizes the 64 DCT elements using a 64-element quantization table 24, which must be specified as an input to the compression engine 10. Each element of the quantization table is an integer value from one to 255, which specifies the stepsize of the quantizer for the corresponding DCT coefficient. The purpose of quantization is to achieve the maximum amount of compression by representing DCT coefficients with no greater precision than is necessary to achieve the desired image quality. Quantization is a many-to-one mapping and, therefore, is fundamentally lossy. As mentioned above, quantization tables have been designed which limit the lossiness to imperceptible aspects of the image so that the reproduced image is not perceptually different from the source image.

The quantizer 20 performs a simple division operation between each DCT coefficient and the corresponding quantization table element. The lossiness occurs because the quantizer 20 disregards any fractional remainder. Thus, the quantization function can be represented as shown in Equation 2 below.
`
YQ[k,l] = Integer Round ( Y[k,l] / Q[k,l] )          (2)

where Y[k,l] represents the (k,l)-th DCT element and Q[k,l] represents the corresponding quantization table element.

To reconstruct the source image, this step is reversed, with the quantization table element being multiplied by the corresponding quantized DCT coefficient. The inverse quantization step can be represented by the following expression:

Y'[k,l] = YQ[k,l] x Q[k,l]
As should be apparent, the fractional part discarded during the quantization step is not restored. Thus, this information is lost forever. Because of the potential impact of the quantization step on image quality, considerable effort has gone into designing the quantization tables. These efforts are described further below, following a discussion of the final step in the JPEG compression technique.
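Equation 2 and its inverse can be stated compactly in code. The following Python sketch is illustrative only (the names are not from the patent); it assumes the 8x8 DCT block and quantization table are NumPy arrays.

```python
import numpy as np

def quantize(dct_block: np.ndarray, q_table: np.ndarray) -> np.ndarray:
    """Equation 2: divide each DCT coefficient by its table entry and
    round to the nearest integer, discarding the fractional remainder."""
    return np.rint(dct_block / q_table).astype(np.int32)

def dequantize(quantized: np.ndarray, q_table: np.ndarray) -> np.ndarray:
    """Inverse quantization: multiply each quantized coefficient by the
    corresponding table entry.  The discarded fraction is not recovered."""
    return quantized.astype(np.float64) * q_table
```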
The final step of the JPEG standard is entropy encoding, which is performed by an entropy encoder 28. The entropy encoder 28 is coupled to the quantizer 20 via a bus 22 for receiving the quantized image data therefrom. The entropy encoder achieves additional lossless compression by encoding the quantized DCT coefficients more compactly based on their statistical characteristics.
The JPEG standard specifies two entropy coding methods: Huffman coding and arithmetic coding. The compression engine of FIG. 1 assumes Huffman coding is used. Huffman encoding, as is known in the art, uses one or more sets of Huffman code tables 30. These tables may be predefined or computed specifically for a given image. Huffman encoding is a well-known encoding technique that produces high levels of lossless compression. Accordingly, the operation of the entropy encoder 28 is not further described.
Referring now to FIG. 2, a typical JPEG compressed file is shown generally at 34. The compressed file includes a JPEG header 36, the quantization (Q) tables 38 and the Huffman (H) tables 40 used in the compression process, and the compressed image data 42 itself. From this compressed file 34, a perceptually indistinguishable version of the original source image can be extracted when an appropriate Q-table is used. This extraction process is described below with reference to FIG. 3.
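As a rough model of this container (the field names here are illustrative, not taken from the patent or the JPEG specification), the decoder-visible pieces could be represented as:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class JpegCompressedFile:
    """Pieces of a JPEG compressed file as described for FIG. 2."""
    header: bytes                 # JPEG header 36
    q_tables: list[np.ndarray]    # quantization tables 38 (8x8 each)
    h_tables: list[dict]          # Huffman tables 40
    entropy_coded_data: bytes     # compressed image data 42
```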
A JPEG decompression engine 43 is shown in FIG. 3. The decompression engine essentially operates in reverse of the compression engine 10. The decompression engine receives the compressed image data at a header extraction unit 44, which extracts the H tables, Q tables, and compressed image data according to the information contained in the header. The H tables are then stored in H tables 46 while the Q tables are stored in Q tables 48. The compressed image data is then sent to an entropy decoder 50 over a bus 52. The entropy decoder decodes the Huffman-encoded compressed image data using the H tables 46. The output of the entropy decoder 50 is the quantized DCT elements.

The quantized DCT elements are then transmitted to an inverse quantizer 54 over a bus 56. The inverse quantizer 54 multiplies the quantized DCT elements by the corresponding quantization table elements found in Q tables 48. As described above, this inverse quantization step does not yield the original source image data because the quantization step truncated or discarded the fractional remainder before transmission of the compressed image data.
The inverse-quantized DCT elements are then passed to an inverse discrete cosine transformer (IDCT) 57 via bus 59, which transforms the data back into the spatial domain using the inverse discrete cosine transform (IDCT). The inverse-transformed data is then transferred to a block-to-raster translator 58 over a bus 60, where the blocks of DCT elements are translated into a raster string of decompressed source image data. From the decompressed source image data, a facsimile of the original source image can be reconstructed. The reconstructed source image, however, is not an exact replication of the original source image. As described above, the quantization step produces some lossiness in the process of compressing the data. By carefully designing the quantization tables, however, the prior art methods have constrained the loss to visually imperceptible portions of the image. These methods, and their shortcomings, are described below.
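For completeness, a minimal sketch of the inverse transform described above for one 8x8 block, mirroring the FDCT sketch given earlier (again an illustrative Python rendering, not the patent's implementation):

```python
import numpy as np

def idct_8x8(coeffs: np.ndarray) -> np.ndarray:
    """Inverse DCT of one 8x8 block of dequantized coefficients back
    into spatial-domain samples."""
    assert coeffs.shape == (8, 8)
    block = np.zeros((8, 8))
    c = lambda i: 1.0 / np.sqrt(2.0) if i == 0 else 1.0
    for x in range(8):
        for y in range(8):
            s = 0.0
            for k in range(8):
                for l in range(8):
                    s += c(k) * c(l) * coeffs[k, l] \
                        * np.cos((2 * x + 1) * k * np.pi / 16.0) \
                        * np.cos((2 * y + 1) * l * np.pi / 16.0)
            block[x, y] = 0.25 * s
    return block
```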
The JPEG standard includes two examples of quantization tables, one for luminance channels and one for chrominance channels. See International Organization for Standardization: "Information Technology-Digital Compression and Coding of Continuous-Tone Still Images-Part 1: Requirements and Guidelines," ISO/IEC IS 10918-1, Oct. 20, 1992. These tables are known as the K.1 and K.2 tables, respectively. These tables have been designed based on the perceptually lossless compression of color images represented in the YUV color space.

These tables result in visually pleasing images, but yield a rather low compression ratio for certain applications. The compression ratio can be varied by setting a so-called Q-factor or scaling factor, which is essentially a uniform multiplicative parameter that is applied to each of the elements in the quantization tables. The larger the Q-factor, the larger the achievable compression rate. Even if the original tables are carefully designed to be perceptually lossless, however, a large Q-factor will introduce artifacts in the reconstructed image, such as blockiness in areas of constant color or ringing in text-scale characters. Some of these artifacts can be effectively cancelled by post-processing of the reconstructed image, by passing it through a tone reproduction curve correction stage, or by segmenting the image and processing the text separately. However, such methods easily introduce new artifacts. Therefore, these methods are not ideal.
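A minimal sketch of Q-factor scaling, assuming the one-to-255 integer range for table entries mentioned earlier (the function name and the clamping are illustrative details, not specified by the patent):

```python
import numpy as np

def apply_q_factor(q_table: np.ndarray, q_factor: float) -> np.ndarray:
    """Uniformly scale every quantization table entry; larger factors give
    larger step sizes, higher compression, and more visible artifacts."""
    scaled = np.rint(q_table * q_factor)
    return np.clip(scaled, 1, 255).astype(np.int32)
```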
As a result of the inadequacy of the Q-factor approach, additional design methods for JPEG quantization tables have been proposed. These methods can be categorized as either perceptual, which means based on the human visual system (HVS), or based on information theory criteria. These methods are also designated as being based on the removal of subjective or statistical redundancy, respectively. These methods are discussed in the copending application entitled "Method for Selecting JPEG Quantization Tables for Low Bandwidth Applications," commonly assigned to the present assignee and incorporated herein by reference.
Quantization is not the only cause of image degradation. The color source image data itself might be compromised. For scanned color images, the visual quality of the image can be degraded because of the inherent limitations of color scanners. These limitations are mainly of two kinds: limited modulation transfer function (MTF) and misregistration. The modulation transfer function refers to the mathematical representation or transfer function of the scanning process. There are inherent limitations in representing the scanning process by the MTF, and these limitations are the main cause of pixel aliasing, which produces fuzzy black text glyphs of grayish appearance. Misregistration, on the other hand, refers to the relative misalignment of the scanner sensors for the various frequency bands. For example, the Hewlett-Packard ScanJet IIc has a color misregistration tolerance of +/-0.076 mm for red and blue with respect to green. This amount of misregistration is significant considering the size of an image pixel (e.g., 0.08 mm at 300 dots per inch (dpi)).
These limitations significantly degrade text in color images because sharp edges are very important for reading efficiency. The visual quality of text can be improved, however, using prior art edge enhancement techniques. Edge enhancement can be performed in either the spatial or the frequency domain. In the spatial domain (i.e., RGB), edge crispening can be performed by discrete convolution of the scanned image with an edge enhancement kernel. This approach is equivalent to filtering the image with a high-pass filter. However, this technique is computationally intensive: an MxN convolution kernel, for example, requires MN multiplications and additions per pixel.
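For reference, a brute-force sketch of that spatial-domain approach. The 3x3 kernel values and all names are illustrative assumptions; the cost is visibly M*N multiply-adds per output pixel, as stated above.

```python
import numpy as np

# Example 3x3 edge-enhancement (high-pass) kernel; values are illustrative.
SHARPEN_KERNEL = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]], dtype=np.float64)

def convolve_sharpen(channel: np.ndarray,
                     kernel: np.ndarray = SHARPEN_KERNEL) -> np.ndarray:
    """Discrete convolution of one image channel with an MxN kernel:
    M*N multiplications and additions per output pixel."""
    m, n = kernel.shape
    pad_y, pad_x = m // 2, n // 2
    padded = np.pad(channel, ((pad_y, pad_y), (pad_x, pad_x)), mode="edge")
    out = np.zeros_like(channel, dtype=np.float64)
    for i in range(channel.shape[0]):
        for j in range(channel.shape[1]):
            out[i, j] = np.sum(padded[i:i + m, j:j + n] * kernel)
    return out
```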
For edge sharpening in the frequency domain, the full image is first transformed into the frequency domain using the Fast Fourier Transform (FFT) or the Discrete Fourier Transform (DFT), low-frequency components are dropped, and then the image is transformed back into the spatial domain. This frequency-domain method, like the spatial-domain method, is computationally intensive. Moreover, it uses a transformation different from that required by the JPEG standard.

Accordingly, the need remains for a computationally efficient method for improving the visual quality of images, and in particular text, in scanned images.
SUMMARY OF THE INVENTION

The invention is a method of compressing and decompressing images which comprises using one quantization table (QE) for compressing the image and a second quantization table (QD) for decompressing the image. In general, compression and decompression are performed in conformance with the JPEG standard. The second quantization table QD is related to the first quantization table according to the following general expression:

QD = S x QE + B

where S is a scaling matrix having each element S[k,l] formed according to the following expression:

S[k,l] = V*[k,l] / Vy[k,l]

where V* is a variance matrix of a reference image and Vy is a variance matrix of a scanned image; and where B is a brightness matrix, which can include zero or non-zero elements. By using the scaling matrix S, the high-frequency components of the DCT elements can be "enhanced" without any additional computational requirements. According to the invention, the quantization table QD is transmitted with the encoded quantized image data, and is used in decompression to recover the image.

The reference image is a preselected continuous-tone image, either grayscale or color depending on the images to be processed. The reference image is rendered into a target image file. The target image file is not generated by a scanner, so the data therein is not compromised by any of the inherent limitations of a color scanner. Thus, the variance of the target image data, which is a statistical representation of the energy or frequency content of the image, retains the high-frequency components. The reference image can be any continuous-tone image, but in the preferred embodiment the reference image includes text with a serif font because the serif font has good visual quality which the method preserves.

The scanned image, although it can be any image, in the preferred embodiment is a printed version of the reference image. Thus, the variance of the scanned image represents the energy or frequency composition of the reference image but which is compromised by the inherent limitations of the scanner. The scaling matrix, therefore, boosts the frequency components that are compromised by the scanning process.

A preferred embodiment of the invention is described herein in the context of a color facsimile (fax) machine. The color fax machine includes a scanner for rendering a color image into color source image data that represents the color image, a compression engine that compresses the color source image data to compressed image data, a means for encapsulating the compressed image data, and a means for transmitting the encapsulated data. The compression engine includes means for storing two quantization tables. The first quantization table is used to quantize the image data transformed using the discrete cosine transform (DCT). The second quantization table is encapsulated with the encoded quantized image data for use in decompressing the image. The second quantization table is related to the first quantization table in the manner described above. When used to transmit and receive color images between two locations, the machine transfers the images with higher quality than prior systems.

The second quantization table can be precomputed and stored in the compression engine, in which case there are no additional computational requirements for the compression engine to implement the image enhancing method of the invention. This capability results in a lower cost color facsimile product than is possible using the prior art image enhancement techniques.

The foregoing and other objects, features and advantages of the invention will become more readily apparent from the following detailed description of a preferred embodiment of the invention which proceeds with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a prior art JPEG compression engine.

FIG. 2 is a drawing of a typical format of a JPEG compressed file.

FIG. 3 is a block diagram of a prior art JPEG decompression engine.

FIG. 4 is a flow chart of a method of forming a scaled quantization table according to the invention.

FIG. 5 is a drawing of a JPEG compressed file including a quantization table scaled according to the invention.

FIG. 6 is a block diagram of a JPEG decompression engine according to the invention.

FIG. 7 is a block diagram of a color fax machine including JPEG compression and decompression engines according to the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Overview of the Quantization Process
`
The text and image enhancing technique according to the invention is integrated into the decoding or inverse quantization step that is required by the JPEG standard. The invention integrates the two by using two different quantization tables: a first quantization table (QE) for use in quantizing the image data during the compression step and a second quantization table (QD) for use during the decode or inverse quantization step of the decompression process. The difference between the two tables, in particular the ratio of the two tables, determines the amount of image enhancing that is done in the two steps. By integrating the image enhancing and inverse quantization steps, the method requires no computations in addition to those already required for the compression and decompression processes.

In order to understand the operation of the invention, the following mathematical derivation is necessary. Let QD be the second quantization table used during the decoding or inverse quantization step. Then let QD be related to the first quantization table QE, used during the quantization step, by the following expression:
QD = S x QE + B          (1)
`
where S is a scaling matrix, which scales each element of the first quantization table QE to a corresponding element in the second quantization table QD. The scaling matrix S is not used in a true matrix multiplication; rather, the multiplication is an element-by-element multiplication. Each element in the first quantization table QE has a corresponding element in the scaling matrix S that, when the two are multiplied together, produces the corresponding element in the second quantization table QD.

The matrix B is a so-called brightness matrix because it can affect the brightness of the image by changing the DC level of the DCT elements. The elements of the B matrix can include zero or non-zero values depending on the desired brightness. For purposes of the following discussion and derivation, however, it will be assumed that the B matrix contains zero elements only, to simplify the derivation.
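Expression (1) and the element-by-element convention can be sketched in Python as follows; the function name is an illustrative assumption, and B defaults to the all-zero matrix assumed in this derivation.

```python
import numpy as np

def scaled_decode_table(q_encode: np.ndarray,
                        s_matrix: np.ndarray,
                        b_matrix=None) -> np.ndarray:
    """Expression (1): QD = S x QE + B, where the multiplication is
    element-by-element, not a true matrix product."""
    if b_matrix is None:
        # The derivation assumes B contains only zero elements.
        b_matrix = np.zeros_like(q_encode, dtype=np.float64)
    return s_matrix * q_encode + b_matrix
```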
The text and image enhancing technique of the invention uses a variance matrix to represent the statistical properties of an image. The variance matrix is an MxM matrix, where each element in the variance matrix is equal to the variance of a corresponding DCT coefficient over the entire image. The variance is computed in the traditional manner, as is known in the art.
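One plausible way to compute such a variance matrix from an image's 8x8 DCT blocks, and to form the scaling matrix S[k,l] = V*[k,l]/Vy[k,l] of the summary, is sketched below. The estimator details and names are assumptions, since the patent only says the variance is computed in the traditional manner.

```python
import numpy as np

def variance_matrix(dct_blocks: np.ndarray) -> np.ndarray:
    """Given an array of shape (num_blocks, 8, 8) holding the DCT
    coefficients of every 8x8 block in an image, return the 8x8 matrix
    whose (k,l) entry is the variance of coefficient (k,l) over the image."""
    return np.var(dct_blocks, axis=0)

def scaling_matrix(v_reference: np.ndarray, v_scanned: np.ndarray) -> np.ndarray:
    """S[k,l] = V*[k,l] / Vy[k,l]: ratio of reference-image variance to
    scanned-image variance for each DCT coefficient."""
    return v_reference / v_scanned
```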
`The edge enhancement technique in essence tries