`[19]
`[11] Patent Number:
`5,850,484
`
`Beretta et al.
`[45] Date of Patent:
`Dec. 15, 1998
`
`U8005850484A
`
[54] TEXT AND IMAGE SHARPENING OF JPEG
COMPRESSED IMAGES IN THE
FREQUENCY DOMAIN
`
0593159A2   9/1993  European Pat. Off. ...... G06F 15/64
07087491    3/1995  Japan .................... H04N 7/30
07143343    6/1995  Japan .................... H04N 1/41
`
`[75]
`
`Inventors: Giordano Beretta, Palo Alto; Vasudev
`BhaSkal‘ana Mountain View;
`KOHStantiHUS KOHStalltillideS, Sail
`Jose, all of Calif.
`
[73] Assignee: Hewlett-Packard Co., Palo Alto, Calif.
`
`[21] Appl. No.: 940,695
`
`[22]
`
`Filed:
`
`seP- 30’ 1997
`
`OTHER PUBLICATIONS
G. B. Beretta et al., “Experience with the New Color
Facsimile Standard”, ISCC Annual Meeting, Apr. 23-25,
1995, pp. 1-7.
Albert J. Ahumada, Jr. et al., “Luminance-Model-Based
DCT Quantization for Color Image Compression”, Human
Vision, Visual Processing, and Digital Display III, 1666,
365-374, SPIE, 1992.
`
`(List continued on next page.)
`
Related U.S. Application Data
`
`Primary Examiner—Jose L. Couso
`Assistant Examiner—Matthew C. Bella
`
`[63] Continuation of Ser. No. 411,369, Mar. 27, 1995, aban-
`doned.
`
`[57]
`
`ABSTRACT
`
`The text and image enhancing technique according to the
invention is integrated into the decoding or inverse quantization
step that is necessarily required by the JPEG standard.
`The invention integrates the two by using two different
`quantization tables: a first quantization table (QE) for use in
`quantizing the image data during the compression step and
`a second quantization table used during the decode or
`inverse quantization during the decompression process. The
`second quantization table QD is related to the first quanti—
`zation table according to a predetermined function of the
`energy in a reference image and the energy in a scanned
`image. The energy of the reference image lost during the
`scanning process, as represented by the energy in the
`scanned image, is restored during the decompression pro-
`cess by appropriately scaling the second quantization table
`according to the predetermined function. The difference
`between the two tables, in particular the ratio of the two
`tables, determines the amount of image enhancing that is
done in the two steps. By integrating the image enhancing
and inverse quantization steps, the method does not require
any additional computations beyond those already required for the
compression and decompression processes.
`
`[51]
`
`Int. Cl.6 ....................................................... G06K 9/36
`U.S. Cl.
`........................ ..
`
`358/432; 348/404
[58] Field of Search .......................... 382/298, 233,
251, 244, 232, 253, 250, 274, 252,
238, 236, 166, 280, 270; 358/427, 426,
432, 261.3, 448, 261.1, 433, 261.2, 430,
458; 348/404, 432, 405, 433, 403, 391,
384, 422, 393, 430, 394, 409, 395, 390
`
`[56]
`
`References Cited
U.S. PATENT DOCUMENTS
`
`4,776,030 10/1988 Tzou ..................................... .. 358/432
`477807761
`10/1988 Daly et a1.
`~~ 358/133
`570637608
`11/1991 Slegel ~~~~~~~~~~~~ ~~
`382/56
`5,073,820 12/1991 Nakagawa et a1.
`.. 358/133
`
`
`
`.
`
`5,465,164
`5,488,570
`
`11/1995 Sugiura ................................. .. 358/432
`1/1996 Agarwal
`............................ .. 364/514 R
`
`FOREIGN PATENT DOCUMENTS
0444884A2  2/1991  European Pat. Off.
0513520A2  4/1992  European Pat. Off. ...... H04N 7/133
`
`35 Claims, 7 Drawing Sheets
`
`68
`
`
`
`64
`SELECT
`GENERATE
`SCANNED
`REFERENCE
`
`IMAGE
`IMAGE
` 70
`DETERMINE
`DETERMINE
`AVERAGE
`AVERAGE
`
`ENERGY
`ENERGY
`
`COMPUTE
`SCALING
`MATRIX
`
`
`
`
`
`
`SCALE
`
`Q TABLE
`
`
`
`
`
`
`
`
`
`
`
`
`
`OLYMPUS EX. 1007 - 1/18
`
`
`
`5,850,484
`Page 2
`
`OTHER PUBLICATIONS
`
Kenneth R. Alexander et al., “Spatial-Frequency Characteristics
of Letter Identification”, J. Opt. Soc. Am. A, 11, 9,
2375-2382, 1994.
`Wen—Hsiung Chen et al., “Adaptive Coding of Monochrome
`and Color Images”, IEEE Transactions on Communications,
`COM—25, 1285—1292, 1977.
Bowonkoon Chitprasert et al., “Human Visual Weighted
Progressive Image Transmission”, IEEE Transactions on
Communications, COM-38, 7, 1040-1044, 1990.
R. J. Clarke, “Spectral Responses of the Discrete Cosine and
Walsh-Hadamard Transforms”, IEE Proc., 130, Part F,
309-313, 1983.
`K.K. De Valois et al., Color—Luminance Masking Interac-
`tions, Seeing Contour and Colour, J.J. Kulikowski, C.M.
`Dickinson and I.J. Murray Editors, Pergamon Press, Oxford,
`1989.
`
`J. Raymond Edinger, Jr., “A Measure for Stairstepping in
`Digitized Text that Correlates with the Subjective Impres-
`sion of Quality”, IS&T’s Tenth International Congress on
`Advances in Non—Impact Printing Technologies, 552—558,
`1994.
`
`Yasushi Hoshino et al., Applicability of a Standardized
`Discrete Cosine Transform Coding Method to Character
`Images, J. Electronic Imaging, 1, 3, 322—327, 1992.
Chansik Hwang et al., “Human Visual System Weighted
Progressive Image Transmission Using Lapped Orthogonal
Transform/Classified Vector Quantization”, Optical Engineering,
32, 7, 1524-1530, 1993.
`International Organization for Standardization: Information
`Technology—Digital Compression and Coding of Continu-
`ous—Tone Still Images—Part 1: Requirements and Guide-
`lines, ISO/IEC IS 10918—1, Oct. 20, 1992.
`International Telecommunication Union: Amendments to
`
`ITU—T Rec. T.30 for Enabling Continuous—Tone Colour and
`Gray—Scale Modes for Group 3, COM 8—43—E, Question
`5/8, Mar. 1994.
`International Telecommunication Union: Amendments to
`
ITU-T Rec. T.4 for Enabling Continuous-Tone Colour and
`Gray—Scale Modes for Group 3, COM 8—44—E, Question
`5/8, Mar. 1994.
`Gordon E. Legge, “Reading: Effects of Contrast and Spatial
`Frequency”, Applied Vision, OSA Technical Digest Series,
`16, 90—93, 1989.
`
`Gordon E. Legge et al., Contrast Masking in Human Vision,
`J. Opt. Soc. Am., 70,12,1458—1471, 1980.
`David L. McLaren et al., “Removal of Subjective Redun-
`dancy from DCT—Coded Images”, IEE Proceedings—I, 138,
`5, 345—350, 1991.
I. Miyagawa et al., “Color-Facsimile System for
Mixed-Color Documents”, SID 94 Digest, 887-890, 1994.
`Kathy T. Mullen, “The Contrast Sensitivity of Human
`Colour Vision to Red—Green and Blue—Yellow Chromatic
`Gratings”, J. Physiol., 359, 381—400, 1985.
David H. Parish et al., “Object Spatial Frequencies, Retinal
`Spatial Frequencies, Noise, and the Efficiency of Letter
`Discrimination”, Vision Res., 31, 7/8, 1399—1415, 1991.
Denis G. Pelli et al., “Visual Factors in Letter Identifica-
`tion”, IS&T’s 47th Annual Conference/ICPS, p. 411, 1994.
`Heidi A. Peterson et al., An Improved Detection Model for
`DCT Coefficient Quantization, Human Vision, Visual Pro-
`cessing, and Digital Display IV, 1913, 191—201, SPIE, 1993.
Ricardo L. de Queiroz et al., “Human Visual
Sensitivity-Weighted Progressive Image Transmission
Using the Lapped Orthogonal Transform”, J. Electronic
Imaging, 1, 3, 328-338, 1992.
`Ricardo L. de Queiroz et al., Modulated Lapped Orthogonal
`Transforms in Image Coding, Digital Video Compression on
`Personal Computers: Algorithms and Technologies, 2187,
`80—91, SPIE, 1993.
`
`Robert J. Safranek et al., “A Perceptually Tuned Sub—Band
`Image Coder with Image Dependent Quantization and
`Post—Quantization Data Compression”, Proc. ICASSP 89, 3,
`1945—1948, 1989.
`Robert J. Safranek, JPEG Compliant Encoder Utilizing
Perceptually Based Quantization, Human Vision, Visual
Processing, and Digital Display IV, 1913, 117-126, SPIE,
1993.
`
`Andrew B. Watson, DCT Quantization Matrices Visually
`Optimized for Individual Images, Human Vision, Visual
`Processing, and Digital Display IV, 1913, 202—216, SPIE,
`1993.
`
Andrew B. Watson et al., “Discrete Cosine Transform (DCT)
Basis Function Visibility: Effects of Viewing Distance and
Contrast Masking”, Human Vision, Visual Processing, and
Digital Display V, 2179, 99-108, SPIE, 1994.
`
`
`
`
`
`
`
`
[Sheet 1 of 7, FIG. 1 (PRIOR ART): block diagram of a JPEG compression engine: source image data, raster-to-block translation, FDCT, quantization with Q tables, entropy coding with H tables, compressed image data.]
`
`
`
[Sheet 2 of 7, FIG. 2 (PRIOR ART): format of a JPEG compressed file: JPEG header, Q tables, H tables, and compressed image data.]
`
`
`
`
`
`
[Sheet 3 of 7, FIG. 3 (PRIOR ART): block diagram of a JPEG decompression engine: compressed image data, header extraction, entropy decoding with H tables, inverse quantization with Q tables, block-to-raster translation, source image data.]
`
`
`
`
`
[Sheet 4 of 7, FIG. 4: flow chart for forming a scaled quantization table: select scanned image and generate reference image; determine the average energy (VY, V*) of each; compute the scaling matrix; scale the Q table.]
`
`
`
[Sheet 5 of 7, FIG. 5: format of a JPEG compressed file including a scaled Q table.]
`
`
`
`
`
`
`
`
[Sheet 6 of 7, FIG. 6: block diagram of a JPEG decompression engine according to the invention: compressed image data, header extraction, entropy decoding with H table, inverse quantization using a scaled Q table produced by a Q table scaler, block-to-raster translation, source image data.]
`
`
`
`
`
[Sheet 7 of 7, FIG. 7: block diagram of a color fax machine: scanner, corrections & transformations engine, JPEG compression engine, G3/G4 encapsulation engine, and transmission means on the sending side; receiving means, G3/G4 decoding engine, JPEG decompression engine, corrections & transformations engine, and printer on the receiving side.]
`
`
`
`
`
`1
`TEXT AND IMAGE SHARPENING OF JPEG
`COMPRESSED IMAGES IN THE
`FREQUENCY DOMAIN
`
`CROSS REFERENCE TO RELATED
`APPLICATION
`
`This is a continuation of application Ser. No. 08/411,369
`filed on Mar. 27, 1995, now abandoned.
`
`5,850,484
`
`2
`
basis vectors are unique 2-dimensional (2D) “spatial
waveforms,” which are the fundamental units in the DCT
space. These basis vectors can be intuitively thought to
represent unique images, wherein any source image can be
decomposed into a weighted sum of these unique images.
The discrete cosine transformer uses the forward discrete
cosine transform (FDCT) function as shown below, hence the name.
`
`RELATED APPLICATION DATA
`
`10
`
`k,l=—1 Ck-C
`YI]
`4
`
`E7ng -
`(l)[x0 0
`(0’)
`
`This application incorporates subject matter disclosed in
`commonly-assigned application entitled METHOD FOR
`SELECTING JPEG QUANTIZATION TABLES FOR
`LOW BANDWIDTH APPLICATIONS, Ser. No. 08/935,
`517, filed on even date herewith.
`
`BACKGROUND OF THE INVENTION
`
`This invention relates to data compression using the JPEG
`compression standard for continuous-tone still images, both
`grayscale and color.
`A committee known as “JPEG,” which stands for “Joint
`Photographic Experts Group,” has established a standard for
`compressing continuous-tone still images, both grayscale
`and color. This standard represents a compromise between
`reproducible image quality and compression rate. To achieve
`acceptable compression rates, which refers to the ratio of the
`uncompressed image to the compressed image, the JPEG
`standard adopted a lossy compression technique. The lossy
`compression technique was required given the inordinate
`amount of data needed to represent a color image, on the
`order of 10 megabytes for a 200 dots per inch (DPI)
`8.5"><11"
`image. By carefully implementing the JPEG
`standard, however, the loss in the image can be confined to
imperceptible areas of the image, which produces a perceptually
lossless uncompressed image. The achievable com-
`pression rates using this technique are in the range of 10:1
`to 50:1.
`
`FIG. 1 shows a block diagram of a typical implementation
`of the JPEG compression standard. The block diagram will
`be referred to as a compression engine. The compression
`engine 10 operates on source image data, which represents
`a source image in a given color space such as CIELAB. The
`source image data has a certain resolution, which is deter-
`mined by how the image was captured. Each individual
`datum of the source image data represents an image pixel.
`The pixel further has a depth which is determined by the
`number of bits used to represent the image pixel.
`The source image data is typically formatted as a raster
stream of data. The compression technique, however,
`requires the data to be represented in blocks. These blocks
`represent a two-dimensional portion of the source image
`data. The JPEG standard uses 8x8 blocks of data. Therefore,
`a raster-to-block translation unit 12 translates the raster
`
`source image data into 8x8 blocks of source image data. The
`source image data is also shifted from unsigned integers to
`signed integers to put them into the proper format for the
`next stage in the compression process. These 8x8 blocks are
`then forwarded to a discrete cosine transformer 16 via bus
`14.
`The discrete cosine transformer 16 converts the source
`
`image data into transformed image data using the discrete
cosine transform (DCT). The DCT, as is known in the art of
image processing, decomposes the 8x8 block of source
`image data into 64 DCT elements or coefficients, each of
`which corresponds to a respective DCT basis vector. These
`
`
`15
`
`where:
`
`C(k), C(1)=1/V2 for k,l=0; and
`C(k), C(1)=1 otherwise
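As a sketch only, the FDCT function described above can be transcribed directly; a real engine would use a fast factorization, but the arithmetic is the same. The input is assumed to be an 8x8 block of level-shifted samples.

```python
import math

def fdct(X):
    """Naive forward DCT of an 8x8 block X[x][y] (for illustration)."""
    def C(u):
        return 1 / math.sqrt(2) if u == 0 else 1.0
    Y = [[0.0] * 8 for _ in range(8)]
    for k in range(8):
        for l in range(8):
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += (X[x][y]
                          * math.cos((2 * x + 1) * k * math.pi / 16)
                          * math.cos((2 * y + 1) * l * math.pi / 16))
            Y[k][l] = C(k) * C(l) / 4 * s
    return Y

# A constant block puts all of its energy in the DC coefficient Y[0][0].
flat = [[100] * 8 for _ in range(8)]
coeffs = fdct(flat)
```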
`The output of the transformer 16 is an 8x8 block of DCT
`elements or coefficients, corresponding to the DCT basis
`vectors. This block of transformed image data is then
`forwarded to a quantizer 20 over a bus 18. The quantizer 20
`quantizes the 64 DCT elements using a 64-element quanti-
`zation table 24, which must be specified as an input to the
`compression engine 10. Each element of the quantization
`table is an integer value from one to 255, which specifies the
`stepsize of the quantizer for the corresponding DCT coef-
`ficient. The purpose of quantization is to achieve the maxi-
`mum amount of compression by representing DCT coeffi-
`cients with no greater precision than is necessary to achieve
`the desired image quality. Quantization is a many-to-one
`mapping and, therefore, is fundamentally lossy. As men-
`tioned above, quantization tables have been designed which
`limit the lossiness to imperceptible aspects of the image so
`that the reproduced image is not perceptually different from
`the source image.
`The quantizer 20 performs a simple division operation
`between each DCT coefficient and the corresponding quan-
`tization table element. The lossiness occurs because the
`
`30
`
`35
`
`40
`
`quantizer 20 disregards any fractional remainder. Thus, the
`quantization function can be represented as shown in Equa-
`tion 2 below.
`
`45
`
`50
`
`55
`
`60
`
`65
`
`YQ[k,l] = Integer Round (
`
`)
`
`where Y(k,l) represents the (k,l)-th DCT element and Q(k,l)
`represents the corresponding quantization table element.
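The per-coefficient division and rounding of Equation 2 might be sketched as follows. The uniform 64-entry table here is invented for illustration; it is not one of the JPEG example tables.

```python
def quantize(Y, Q):
    """Divide each DCT coefficient by its table step size and round,
    discarding the fractional remainder (Equation 2)."""
    return [[int(round(Y[k][l] / Q[k][l])) for l in range(8)]
            for k in range(8)]

Y = [[100.0] * 8 for _ in range(8)]   # hypothetical DCT coefficients
Q = [[16] * 8 for _ in range(8)]      # hypothetical uniform step size
YQ = quantize(Y, Q)
```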
To reconstruct the source image, this step is reversed, with
`the quantization table element being multiplied by the
`corresponding quantized DCT coefficient. The inverse quan-
`tization step can be represented by the following expression:
Y′[k,l] = YQ[k,l] × Q[k,l].
`
`As should be apparent, the fractional part discarded during
`the quantization step is not restored. Thus, this information
`is lost forever. Because of the potential impact on the image
`quality of the quantization step, considerable effort has gone
`into designing the quantization tables. These efforts are
`described further below following a discussion of the final
`step in the JPEG compression technique.
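A small numeric sketch of the point above: multiplying the quantized coefficient back by the table entry cannot restore the discarded fraction. The values are illustrative only.

```python
def dequantize(yq, q):
    """Inverse quantization: multiply back by the table step size."""
    return yq * q

step = 16
original = 100.0
quantized = int(round(original / step))  # the fractional part is discarded
restored = dequantize(quantized, step)   # close to, but not equal to, 100
```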
The final step of the JPEG standard is an entropy
`encoding, which is performed by an entropy encoder 28. The
`entropy encoder 28 is coupled to the quantizer 20 via a bus
`22 for receiving the quantized image data therefrom. The
`entropy encoder achieves additional lossless compression by
`encoding the quantized DCT coefficients more compactly
`based on their statistical characteristics. The JPEG standard
`
`
`
`
`
`
`5,850,484
`
`3
`specifies two entropy coding methods: Huffman coding and
`arithmetic coding. The compression engine of FIG. 1
`assumes Huffman coding is used. Huffman encoding, as is
`known in the art, uses one or more sets of Huffman code
`tables 30. These tables may be predefined or computed
`specifically for a given image. Huffman encoding is a well
`known encoding technique that produces high levels of
`lossless compression. Accordingly,
`the operation of the
`entropy encoder 28 is not further described.
`Referring now to FIG. 2, a typical JPEG compressed file
`is shown generally at 34. The compressed file includes a
`JPEG header 36,
`the quantization (Q) tables 38 and the
`Huffman (H) tables 40 used in the compression process, and
`the compressed image data 42 itself. From this compressed
`file 34 a perceptually indistinguishable version of the origi-
`nal source image can be extracted when an appropriate
`Q-table is used. This extraction process is described below
`with reference to FIG. 3.
A JPEG decompression engine 43 is shown in FIG. 3. The
`decompression engine essentially operates in reverse of the
`compression engine 10. The decompression engine receives
`the compressed image data at a header extraction unit 44,
`which extracts the H tables, Q tables, and compressed image
`data according to the information contained in the header.
`The H tables are then stored in H tables 46 while the Q tables
`are stored in Q tables 48. The compressed image data is then
sent to an entropy decoder 50 over a bus 52. The entropy
decoder decodes the Huffman-encoded compressed image
data using the H tables 46. The output of the entropy decoder
50 is the quantized DCT elements.
`The quantized DCT elements are then transmitted to an
`inverse quantizer 54 over a bus 56. The inverse quantizer 54
`multiplies the quantized DCT elements by the corresponding
`quantization table elements found in Q tables 48. As
`described above,
`this inverse quantization step does not
`yield the original source image data because the quantization
`step truncated or discarded the fractional remainder before
`transmission of the compressed image data.
`The inverse quantized DCT elements are then passed to an
`inverse discrete cosine transformer (IDCT) 57 via bus 59,
`which transforms the data back into the time domain using
`the inverse discrete cosine transform (IDCT). The inverse
`transformed data is then transferred to block-to-raster trans-
`lator 58 over a bus 60 where the blocks of DCT elements are
`
`translated into a raster string of decompressed source image
`data. From the decompressed source image data, a facsimile
of the original source image can be reconstructed. The
`reconstructed source image, however, is not an exact repli-
`cation of the original source image. As described above, the
`quantization step produces some lossiness in the process of
`compressing the data. By carefully designing the quantiza-
`tion tables, however, the prior art methods have constrained
`the loss to visually imperceptible portions of the image.
`These methods, and their shortcomings, are described
`below.
`
`The JPEG standard includes two examples of quantization
`tables, one for luminance channels and one for chrominance
channels. See International Organization for Standardization:
“Information Technology—Digital Compression and Coding
of Continuous-Tone Still Images—Part 1: Requirements
and Guidelines,” ISO/IEC IS 10918-1, Oct. 20, 1992. These
tables are known as the K.1 and K.2 tables, respectively.
`These tables have been designed based on the perceptually
`lossless compression of color images represented in the
`YUV color space.
`These tables result in visually pleasing images, but yield
`a rather low compression ratio for certain applications. The
`
`10
`
`15
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`4
`compression ratio can be varied by setting a so-called
`Q-factor or scaling factor, which is essentially a uniform
`multiplicative parameter that
`is applied to each of the
`elements in the quantization tables. The larger the Q-factor
`the larger the achievable compression rate. Even if the
`original tables are carefully designed to be perceptually
`lossless, however, a large Q-factor will introduce artifacts in
`the reconstructed image, such as blockiness in areas of
`constant color or ringing in text-scale characters. Some of
these artifacts can be effectively cancelled by post-
`processing of the reconstructed image by passing it through
`a tone reproduction curve correction stage, or by segmenting
`the image and processing the text separately. However, such
`methods easily introduce new artifacts. Therefore,
`these
`methods are not ideal.
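The uniform Q-factor scaling described above might be sketched as follows. The clamp to the legal 1-255 entry range and the small table fragment are our assumptions for illustration.

```python
def scale_table(Q, q_factor):
    """Multiply every quantization table entry by one scalar Q-factor,
    clamping to the legal 1..255 range (our assumption)."""
    return [[min(255, max(1, int(round(q * q_factor)))) for q in row]
            for row in Q]

base = [[16, 11], [12, 14]]          # invented fragment of a Q table
aggressive = scale_table(base, 4.0)  # larger steps, higher compression
```

A larger Q-factor enlarges every step size uniformly, which is exactly why it cannot spare the perceptually important coefficients while compressing the rest harder.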
`As a result of the inadequacy of the Q-factor approach,
`additional design methods for JPEG discrete quantization
`tables have been proposed. These methods can be catego-
`rized as either perceptual, which means based on the human
`visual system (HVS) or based on information theory criteria.
`These methods are also designated as being based on the
`removal of subjective or statistical redundancy, respectively.
`These methods are discussed in copending application
`entitled “Method for Selecting JPEG Quantization Tables
`for Low Bandwidth Applications,” commonly assigned to
`the present assignee, incorporated herein by reference.
`Quantization is not the only cause of image degradation.
`The color source image data itself might be compromised.
`For scanned colored images, the visual quality of the image
`can be degraded because of the inherent limitations of color
`scanners. These limitations are mainly of two kinds: limited
`modulation transfer function (MTF) and misregistration.
`The modulation transfer function refers to the mathematical
`
`representation or transfer function of the scanning process.
`There are inherent limitations in representing the scanning
`process by the MTF and these limitations are the main cause
`of pixel aliasing, which produces fuzzy black text glyphs of
`grayish appearance. Misregistration, on the other hand,
`refers to the relative misalignment of the scanner sensors for
the various frequency bands. For example, the Hewlett-Packard
ScanJet IIc™ has a color misregistration tolerance
`of +/—0.076 mm for red and blue with respect to green. This
`amount of misregistration is significant considering the size
`of an image pixel (e.g., 0.08 mm at 300 dots per inch (dpi)).
`These limitations significantly degrade text
`in color
`images because sharp edges are very important for reading
`efficiency. The visual quality of text can be improved,
`however, using prior art edge enhancement techniques. Edge
`enhancement can be performed in either the spatial or
`frequency domain. In the spatial domain (i.e., RGB), edge
`crispening can be performed by discrete convolution of the
`scanned image with an edge enhancement kernel. This
`approach is equivalent to filtering the image with a high-pass
`filter. However, this technique is computationally intensive.
An M×N convolution kernel, for example, requires MN
`multiplications and additions per pixel.
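The spatial-domain approach might look like the sketch below; the 3x3 sharpening kernel is a common textbook example, not one from the patent, and each output pixel costs one multiply-add per kernel element, which is the expense noted above.

```python
def convolve(img, kernel):
    """Valid-region discrete convolution: M*N multiply-adds per pixel."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img), len(img[0])
    out = [[0] * (w - kw + 1) for _ in range(h - kh + 1)]
    for y in range(h - kh + 1):
        for x in range(w - kw + 1):
            out[y][x] = sum(kernel[j][i] * img[y + j][x + i]
                            for j in range(kh) for i in range(kw))
    return out

sharpen = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]  # common high-pass kernel
flat = [[10] * 5 for _ in range(5)]
result = convolve(flat, sharpen)  # flat regions pass through unchanged
```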
`For edge sharpening in the frequency domain, the full
`image is first transformed into the frequency domain using
`the Fast Fourier Transform (FFT) or the Discrete Fourier
`Transform (DFT), low frequency components are dropped,
`and then the image is transformed back into the time
`domain. This frequency domain method, as with the spatial
domain method, is also computationally intensive.
Moreover, it uses a transformation different from that
required by the JPEG standard.
`Accordingly,
`the need remains for a computationally
`efficient method for improving the visual quality of images,
`and in particular text, in scanned images.
`
`OLYMPUS EX. 1007 - 11/18
`
`
`
`
`
`5,850,484
`
`5
`SUMMARY OF THE INVENTION
`
`The invention is a method of compressing and decom-
`pressing images which comprises using one quantization
`table (QE) for compressing the image and a second quanti-
`zation table (QD) for decompressing the image. In general,
compression and decompression are performed in conformance
with the JPEG standard. The second quantization table
`QD is related to the first quantization table according to the
`following general expression:
`
QD = S × QE + B,
`
`where S is a scaling matrix having each element S[k,l]
`formed according to the following expression:
`
S[k,l]² = V*[k,l] / VY[k,l]
`
`where V* is a variance matrix of a reference image and VY
`is a variance matrix of a scanned image; and where B is a
`brightness matrix, which can include zero or non-zero
`elements. By using the scaling matrix S, the high-frequency
components of the DCT elements can be “enhanced” without
any additional computational requirements. According
`to the invention, the quantization table QD is transmitted
`with the encoded quantized image data, and is used in
`decompression to recover the image.
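The table relationship just described (QD formed from QE by an element-by-element scaling plus a brightness matrix B) might be sketched as follows; the example values of QE, S, and B are invented for illustration.

```python
def decode_table(QE, S, B):
    """Form the decode-side table QD = S x QE + B, where the
    multiplication is element-by-element, not a matrix product."""
    return [[S[k][l] * QE[k][l] + B[k][l] for l in range(len(QE[0]))]
            for k in range(len(QE))]

QE = [[16, 11], [12, 14]]     # invented fragment of an encoding table
S = [[1.0, 2.0], [2.0, 4.0]]  # boosts the higher-frequency entries
B = [[0, 0], [0, 0]]          # zero brightness matrix, as assumed in the text
QD = decode_table(QE, S, B)
```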
`The reference image is a preselected continuous-tone
`image, either grayscale or color depending on the images to
`be processed. The reference image is rendered into a target
`image file. The target
`image file is not generated by a
`scanner, so the data therein is not compromised by any of the
`inherent limitations of a color scanner. Thus, the variance of
`the target image data, which is a statistical representation of
the energy or frequency content of the image, retains the
`high-frequency components. The reference image can be
`any continuous-tone image, but in the preferred embodiment
`the reference image includes text with a serif font because
the serif font has good visual quality which the method
`preserves.
`The scanned image, although it can be any image, in the
`preferred embodiment is a printed version of the reference
`image. Thus, the variance of the scanned image represents
`the energy or frequency composition of the reference image
`but which is compromised by the inherent limitations of the
`scanner. The scaling matrix, therefore, boosts the frequency
`components that are compromised by the scanning process.
`A preferred embodiment of the invention is described
`herein in the context of a color facsimile (fax) machine. The
`color fax machine includes a scanner for rendering a color
`image into color source image data that represents the color
image, a compression engine that compresses the color
`source image data to compressed image data, a means for
`encapsulating the compressed image data, and a means for
`transmitting the encapsulated data. The compression engine
`includes means for storing two quantization tables. The first
quantization table is used to quantize the image data transformed
using the discrete cosine transform (DCT). The
`second quantization table is encapsulated with the encoded
`quantized image data for use in decompressing the image.
`The second quantization table is related to the first quanti-
`zation table in the manner described above. When used to
`
`transmit and receive color images between two locations,
`the machine transfers the images with higher quality than
`prior systems.
`The second quantization table can be precomputed and
`stored in the compression engine, in which case there are no
`additional computational requirements for the compression
`engine to implement the image enhancing method of the
`
`10
`
`15
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6
`invention. This capability results in a lower cost color
`facsimile product than is possible using the prior art image
`enhancement techniques.
`The foregoing and other objects, features and advantages
`of the invention will become more readily apparent from the
`following detailed description of a preferred embodiment of
`the invention which proceeds with reference to the accom-
`panying drawings.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a block diagram of a prior art JPEG compression
`engine.
`FIG. 2 is a drawing of a typical format of a JPEG
`compressed file.
`FIG. 3 is a block diagram of a prior art JPEG decom-
`pression engine.
`FIG. 4 is a flow chart of a method of forming a scaled
`quantization table according to the invention.
`FIG. 5 is a drawing of a JPEG compressed file including
`a quantization table scaled according to the invention.
`FIG. 6 is a block diagram of a JPEG decompression
`engine according to the invention.
`FIG. 7 is a block diagram of a color fax machine including
`JPEG compression and decompression engines according to
`the invention.
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENT
`
`Overview of the Quantization Process
`
`The text and image enhancing technique according to the
`invention is integrated into the decoding or inverse quanti-
`zation step that is necessarily required by the JPEG standard.
The invention integrates the two by using two different
`quantization tables: a first quantization table (QE) for use in
`quantizing the image data during the compression step and
`a second quantization table (QD) for use during the decode
`or inverse quantization during the decompression process.
`The difference between the two tables, in particular the ratio
`of the two tables, determines the amount of image enhancing
`that is done in the two steps. By integrating the image
`enhancing and inverse quantization steps, the method does
not require any additional computations beyond those already
required for the compression and decompression processes.
`In order to understand the operation of the invention, the
`following mathematical derivation is necessary. Let QD be
`the second quantization table used during the decoding or
`inverse quantization step. Then let QD be related to the first
`quantization table QE, used during the quantization step, by
`the following expression:
`
QD = (S × QE) + B     (1)
`
`where S is a scaling matrix, which scales each element of the
`first quantization table QE to a corresponding element in the
`second quantization table QD. The scaling matrix S is not
`used in a true matrix multiplication; rather, the multiplica-
`tion is an element-by-element multiplication. Each element
`in the first quantization table QE has a corresponding ele-
`ment in the scaling matrix S that when multiplied together
`produce the corresponding element in the second quantiza-
`tion table QD.
`The matrix B is a so-called brightness matrix because it
`can affect the brightness of the image by changing the DC
`level of the DCT elements. The elements of the B matrix can
`
`include zero or non-zero values depending on the desired
`
`
`
`
`
`
`5,850,484
`
`7
`brightness. For purposes of the following discussion and
`derivation, however, it will be assumed that the B matrix
`contains zero elements only to simplify the derivation.
`The text and image enhancing technique of the invention
`uses a variance matrix to represent the statistical properties
of an image. The variance matrix is an M×M matrix, where
`each element in the variance matrix is equal to the variance
`of a corresponding DCT coefficient over the entire image.
`The variance is computed in the traditional manner, as is
`known in the art.
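The per-coefficient variance described above might be computed as in this sketch (our illustration; the block values are invented): for each DCT position (k,l), take the variance of that coefficient across all blocks of the image.

```python
def variance_matrix(blocks):
    """Variance of each DCT coefficient position over all 8x8 blocks."""
    n = len(blocks)
    V = [[0.0] * 8 for _ in range(8)]
    for k in range(8):
        for l in range(8):
            vals = [b[k][l] for b in blocks]
            mean = sum(vals) / n
            V[k][l] = sum((v - mean) ** 2 for v in vals) / n
    return V

# Two blocks whose DC coefficients are 10 and 20 give a variance of 25
# at position (0,0) and zero everywhere else.
b1 = [[10 if (k, l) == (0, 0) else 0 for l in range(8)] for k in range(8)]
b2 = [[20 if (k, l) == (0, 0) else 0 for l in range(8)] for k in range(8)]
V = variance_matrix([b1, b2])
```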
`
`The edge enhancement technique in essence tries to match
`the variance matrix of a decompressed image (VY[k,l]) with
`a variance matrix of a reference image (V*[k,l]). The
`technique tries to match the two by scaling the quantization
`table in the manner described above. In order to do this, the
`method takes advantage of the relationship between the
`uncompressed image and the compressed image. The fol-
`lowing derivation will make this relationship clear.
`Let V*[k,l] denote the variance of the [k,l] frequency
`component of a reference image. Ideally, this image contains
`those critical attributes that the technique seeks to preserve,
`for example, text. This variance matrix is of an ideal or
`reference image in that it is not rendered into color source
`image data by a scanner but, instead, is rendered into its
`ideal form by software, described further below. Thus, the
`color source image data of the reference image does not
`suffer
`from the image degradation due to the inherent
`limitations of the scanner. Therefore, the variance of the
`reference image retains the high-frequency characteristics of
`the original reference image.
`The method produces a resulting decompressed image
`that has approximately the same variance as the variance of
the reference by modifying the quantization table. Thus, the
`method produces the following relationship:
`
VY[k,l] = V*[k,l]     (2)
`
However, the decompressed image (Y′) is related to the
original quantized image (YQ) by the following expression:

Y′[k,l] = YQ[k,l] × QD[k,l]     (3)
`
`10
`
`15
`
`30
`
`35
`
`40
`
`Substituting equation (1) into equation (3) yields the fol-
`lowing equation below:
`
Y′[k,l] = YQ[k,l] × (S[k,l] × QE[k,l])     (4)
`
`The variance of the decompressed image (VY) can then be
`expressed by the following expression:
`
VY[k,l] = Var(Y′[k,l]) = Var(S[k,l] × YQ[k,l] × QE[k,l])
`