`____________________
`
`BEFORE THE PATENT TRIAL AND APPEAL BOARD
`_____________________
`
`LG Electronics, Inc.
`Petitioner,
`
`v.
`FastVDO LLC
`Patent Owner.
`
`
`
`Patent No. 5,850,482
`________________________
`
Inter Partes Review No. ____________
`
`___________________________________________________________
`
`Wallace, “The JPEG Still Picture Compression Standard,” IEEE
`Transactions on Consumer Electronics, Vol. 38, No. 1, Feb. 1992
`
`Exhibit 1007
`
`
`
`
`
`
`IEEE Transactions on Consumer Electronics, Vol. 38, No. 1, FEBRUARY 1992
`
`THE JPEG STILL PICTURE COMPRESSION STANDARD
`Gregory K. Wallace
`Multimedia Engineering
`Digital Equipment Corporation
Maynard, Massachusetts
`
`This paper is a revised version of an article by the same
title and author which appeared in the April 1991 issue
`of Communications of the ACM.
`
`Abstract
`For the past few years, a joint ISO/CCITT committee
`known as JPEG (Joint Photographic Experts Group)
`has been working to establish the first international
`compression standard for continuous-tone still images,
`both grayscale and color. JPEG’s proposed standard
`aims to be generic, to support a wide variety of
`applications for continuous-tone images. To meet the
`differing needs of many applications, the JPEG
`standard includes two basic compression methods, each
`with various modes of operation. A DCT-based method
`is specified for “lossy” compression, and a predictive
`method for “lossless” compression. JPEG features a
`simple lossy technique known as the Baseline method,
`a subset of the other DCT-based modes of operation.
`The Baseline method has been by far the most widely
`implemented JPEG method to date, and is sufficient in
`its own right for a large number of applications. This
`article provides an overview of the JPEG standard, and
`focuses in detail on the Baseline method.
`
`1 Introduction
Advances over the past decade in many aspects of digital technology - especially devices for image acquisition, data storage, and bitmapped printing and display - have brought about many applications of digital imaging. However, these applications tend to be specialized due to their relatively high cost. With the possible exception of facsimile, digital images are not commonplace in general-purpose computing systems the way text and geometric graphics are. The majority of modern business and consumer usage of photographs and other types of images takes place through more traditional analog means.
The key obstacle for many applications is the vast amount of data required to represent a digital image directly. A digitized version of a single, color picture at TV resolution contains on the order of one million bytes; 35 mm resolution requires ten times that amount. Use of digital images often is not viable due to high storage or transmission costs, even when image capture and display devices are quite affordable.
`
Modern image compression technology offers a possible solution. State-of-the-art techniques can compress typical images from 1/10 to 1/50 their uncompressed size without visibly affecting image quality. But compression technology alone is not sufficient. For digital image applications involving storage or transmission to become widespread in today's marketplace, a standard image compression method is needed to enable interoperability of equipment from different manufacturers. The CCITT recommendation for today's ubiquitous Group 3 fax machines [17] is a dramatic example of how a standard compression method can enable an important image application. The Group 3 method, however, deals with bilevel images only and does not address photographic image compression.
`
For the past few years, a standardization effort known by the acronym JPEG, for Joint Photographic Experts Group, has been working toward establishing the first international digital image compression standard for continuous-tone (multilevel) still images, both grayscale and color. The "joint" in JPEG refers to a collaboration between CCITT and ISO. JPEG convenes officially as the ISO committee designated JTC1/SC2/WG10, but operates in close informal collaboration with CCITT SGVIII. JPEG will be both an ISO Standard and a CCITT Recommendation. The text of both will be identical.
`
`Photovideotex, desktop publishing, graphic arts, color
`facsimile, newspaper Wirephoto transmission, medical
`imaging, and many other continuous-tone image
`applications require a compression standard in order to
develop significantly beyond their present state.
`
`Presented at 3rd Annual Digital Video Workshop
`Revised manuscript received December 6, 1991
`
`0098 3063/92 $03.00 CJ 1992 IEEE
`
`-
`
`I~
`
`1
`
`1
`
`
`
`_ _ _ _ _ ~ -
`
`~
`
`~
`
`Apple Inc. Exhibit 1007 Page 1
`
`
`
`
JPEG has undertaken the ambitious task of developing a general-purpose compression standard to meet the needs of almost all continuous-tone still-image applications.
`
If this goal proves attainable, not only will individual applications flourish, but exchange of images across application boundaries will be facilitated. This latter feature will become increasingly important as more image applications are implemented on general-purpose computing systems, which are themselves becoming increasingly interoperable and internetworked. For applications which require specialized VLSI to meet their compression and decompression speed requirements, a common method will provide economies of scale not possible within a single application.
`
`This article gives an overview of JPEG’s proposed
`image-compression standard. Readers without prior
`knowledge of JPEG or compression based on the
`Discrete Cosine Transform (DCT) are encouraged to
`study first the detailed description of the Baseline
`sequential codec, which is the basis for all of the
`DCT-based decoders. While this article provides many
`details, many more are necessarily omitted. The reader
should refer to the ISO draft standard [2] before
`attempting implementation.
`
`Some of the earliest industry attention to the JPEG
`proposal has been focused on the Baseline sequential
`codec as a motion image compression method - of the
`“intraframe” class, where each frame is encoded as a
`separate image. This class of motion image coding,
`while providing less compression than “interframe”
`methods like MPEG, has greater flexibility for video
`editing. While this paper focuses only on JPEG as a
still picture standard (as ISO intended), it is interesting
`to note that JPEG is likely to become a “de facto”
`intraframe motion standard as well.
`
2 Background: Requirements and Selection Process
`JPEG’s goal has been to develop a method for
continuous-tone image compression which meets the
`following requirements:
`
1) be at or near the state of the art with regard to compression rate and accompanying image fidelity, over a wide range of image quality ratings, and especially in the range where visual fidelity to the original is characterized as "very good" to "excellent"; also, the encoder should be parameterizable, so that the application (or user) can set the desired compression/quality tradeoff;
`
2) be applicable to practically any kind of continuous-tone digital source image (i.e., for most practical purposes not be restricted to images of certain dimensions, color spaces, pixel aspect ratios, etc.) and not be limited to classes of imagery with restrictions on scene content, such as complexity, range of colors, or statistical properties;
`
3) have tractable computational complexity, to make feasible software implementations with viable performance on a range of CPUs, as well as hardware implementations with viable cost for applications requiring high performance;
`
`4)
`
`have the following modes of operation:
`
`Sequential encoding: each image component is
`encoded in a single left-to-right, top-to-bottom
`scan;
`
`Progressive encoding: the image is encoded in
`multiple scans for applications in which
`transmission time is long, and the viewer
`prefers to watch the image build up in multiple
`coarse-to-clear passes;
`
`Lossless encoding: the image is encoded to
`guarantee exact recovery of every source
`image sample value (even though the result is
`low compression compared to the
`lossy
`modes):
`
`Hierarchical encoding: the image is encoded at
`multiple resolutions so that lower-resolution
`versions may be accessed without first having
`to decompress the image at its full resolution.
`
`In June 1987, JPEG conducted a selection process
`based on a blind assessment of subjective picture
`quality, and narrowed 12 proposed methods to three.
`Three informal working groups formed to refine them,
`and in January 1988, a second, more rigorous selection
process [19] revealed that the "ADCT" proposal [11],
`based on the 8x8 DCT, had produced the best picture
`quality.
`
`At the time of its selection, the DCT-based method was
`only partially defined for some of the modes of
`operation. From 1988 through 1990, JPEG undertook
`the sizable task of defining, documenting, simulating,
`testing, validating, and simply agreeing on the plethora
`of details necessary for genuine interoperability and
`universality. Further history of the JPEG effort is
contained in [6, 7, 9, 18].
`
`
`3 Architecture of the Proposed Standard
`The proposed standard contains the four “modes of
`operation” identified previously. For each mode, one
`or more distinct codecs are specified. Codecs within a
`mode differ according to the precision of source image
`samples they can handle or the entropy coding method
`they use. Although the word codec (encoder/decoder)
`is used frequently in this article, there is no requirement
`that implementations must include both an encoder and
`a decoder. Many applications will have systems or
`devices which require only one or the other.
`
`The four modes of operation and their various codecs
`have resulted from JPEG’s goal of being generic and
`from the diversity of image formats across applications.
`The multiple pieces can give the impression of
`undesirable complexity, but they should actually be
`regarded as a comprehensive “toolkit” which can span a
`wide range of continuous-tone image applications. It is
`unlikely that many implementations will utilize every
`tool -- indeed, most of the early implementations now
on the market (even before final ISO approval) have
`implemented only the Baseline sequential codec.
`
The Baseline sequential codec is inherently a rich and sophisticated compression method which will be sufficient for many applications. Getting this minimum JPEG capability implemented properly and interoperably will provide the industry with an important initial capability for exchange of images across vendors and applications.
`
`4 Processing Steps for DCT-Based Coding
Figures 1 and 2 show the key processing steps which are the heart of the DCT-based modes of operation. These figures illustrate the special case of single-component (grayscale) image compression. The reader can grasp the essentials of DCT-based compression by thinking of it as essentially compression of a stream of 8x8 blocks of grayscale image samples. Color image compression can then be approximately regarded as compression of multiple grayscale images, which are either compressed entirely one at a time, or are compressed by alternately interleaving 8x8 sample blocks from each in turn.
`
For DCT sequential-mode codecs, which include the Baseline sequential codec, the simplified diagrams indicate how single-component compression works in a fairly complete way. Each 8x8 block is input, makes its way through each processing step, and yields output in compressed form into the data stream. For DCT progressive-mode codecs, an image buffer exists prior to the entropy coding step, so that an image can be stored and then parceled out in multiple scans with successively improving quality. For the hierarchical mode of operation, the steps shown are used as building blocks within a larger framework.
`
`4.1 8x8 FDCT and IDCT
At the input to the encoder, source image samples are grouped into 8x8 blocks, shifted from unsigned integers with range [0, 2^P - 1] to signed integers with range [-2^(P-1), 2^(P-1) - 1], and input to the Forward DCT (FDCT). At the output from the decoder, the Inverse DCT (IDCT) outputs 8x8 sample blocks to form the reconstructed image. The following equations are the idealized mathematical definitions of the 8x8 FDCT and 8x8 IDCT:
`
`7
`
`7
`
`
`
`F O y=o
`
`C(U), C(v) = 1 otherwise.
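As an illustration only (not part of the standard's text), the idealized FDCT and IDCT of Equations 1 and 2, together with the level shift described above, can be coded directly from the definitions; a minimal Python/NumPy sketch, with function names chosen here for clarity:

    import numpy as np

    def fdct_8x8(block):
        """Idealized 8x8 Forward DCT of Equation 1; 'block' holds level-shifted samples."""
        F = np.zeros((8, 8))
        for u in range(8):
            for v in range(8):
                cu = 1 / np.sqrt(2) if u == 0 else 1.0
                cv = 1 / np.sqrt(2) if v == 0 else 1.0
                s = 0.0
                for x in range(8):
                    for y in range(8):
                        s += block[x, y] * np.cos((2 * x + 1) * u * np.pi / 16) \
                                         * np.cos((2 * y + 1) * v * np.pi / 16)
                F[u, v] = 0.25 * cu * cv * s
        return F

    def idct_8x8(F):
        """Idealized 8x8 Inverse DCT of Equation 2."""
        f = np.zeros((8, 8))
        for x in range(8):
            for y in range(8):
                s = 0.0
                for u in range(8):
                    for v in range(8):
                        cu = 1 / np.sqrt(2) if u == 0 else 1.0
                        cv = 1 / np.sqrt(2) if v == 0 else 1.0
                        s += cu * cv * F[u, v] * np.cos((2 * x + 1) * u * np.pi / 16) \
                                               * np.cos((2 * y + 1) * v * np.pi / 16)
                f[x, y] = 0.25 * s
        return f

    # Level shift for 8-bit samples: [0, 255] -> [-128, 127] before the FDCT,
    # and the reverse shift after the IDCT.
    samples = np.random.randint(0, 256, (8, 8))
    coefficients = fdct_8x8(samples - 128)
    restored = np.round(idct_8x8(coefficients)) + 128   # matches 'samples' to floating-point accuracy

In practice a fast DCT algorithm would be used; this direct form simply mirrors the equations.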
`
The DCT is related to the Discrete Fourier Transform (DFT). Some simple intuition for DCT-based compression can be obtained by viewing the FDCT as a harmonic analyzer and the IDCT as a harmonic synthesizer. Each 8x8 block of source image samples is effectively a 64-point discrete signal which is a function of the two spatial dimensions x and y. The FDCT takes such a signal as its input and decomposes it into 64 orthogonal basis signals. Each contains one of the 64 unique two-dimensional (2D) "spatial frequencies" which comprise the input signal's "spectrum." The output of the FDCT is the set of 64 basis-signal amplitudes or "DCT coefficients" whose values are uniquely determined by the particular 64-point input signal.
`
The DCT coefficient values can thus be regarded as the relative amount of the 2D spatial frequencies contained in the 64-point input signal. The coefficient with zero frequency in both dimensions is called the "DC coefficient" and the remaining 63 coefficients are called the "AC coefficients."
`
[Figure 1. DCT-Based Encoder Processing Steps: source image data, grouped into 8x8 blocks, flows through the FDCT, the Quantizer, and the Entropy Encoder, the latter two driven by table specifications, to produce the compressed image data.]

[Figure 2. DCT-Based Decoder Processing Steps: compressed image data is reconstructed by reversing these steps (entropy decoding, dequantization, IDCT), again driven by table specifications, to produce the reconstructed image data.]
`
Because sample values typically vary slowly from point to point across an image, the FDCT processing step lays the foundation for achieving data compression by concentrating most of the signal in the lower spatial frequencies. For a typical 8x8 sample block from a typical source image, most of the spatial frequencies have zero or near-zero amplitude and need not be encoded.
`
At the decoder the IDCT reverses this processing step. It takes the 64 DCT coefficients (which at that point have been quantized) and reconstructs a 64-point output image signal by summing the basis signals. Mathematically, the DCT is a one-to-one mapping for 64-point vectors between the image and the frequency domains. If the FDCT and IDCT could be computed with perfect accuracy and if the DCT coefficients were not quantized as in the following description, the original 64-point signal could be exactly recovered. In principle, the DCT introduces no loss to the source image samples; it merely transforms them to a domain in which they can be more efficiently encoded.
`
Some properties of practical FDCT and IDCT implementations raise the issue of what precisely should be required by the JPEG standard. A fundamental property is that the FDCT and IDCT equations contain transcendental functions. Consequently, no physical implementation can compute them with perfect accuracy. Because of the DCT's application importance and its relationship to the DFT, many different algorithms by which the FDCT and IDCT may be approximately computed have been devised [16]. Indeed, research in fast DCT algorithms is ongoing and no single algorithm is optimal for all implementations. What is optimal in software for a general-purpose CPU is unlikely to be optimal in firmware for a programmable DSP and is certain to be suboptimal for dedicated VLSI.
`
`Even in light of the finite precision of the DCT inputs
`and outputs, independently designed implementations
`of the very same FDCT or IDCT algorithm which differ
`even minutely in the precision by which they represent
`cosine terms or intermediate results, or in the way they
`sum and round fractional values, will eventually
`produce slightly different outputs from identical inputs.
`
To preserve freedom for innovation and customization within implementations, JPEG has chosen to specify neither a unique FDCT algorithm nor a unique IDCT algorithm in its proposed standard. This makes compliance somewhat more difficult to confirm, because two compliant encoders (or decoders) generally will not produce identical outputs given identical inputs. The JPEG standard will address this issue by specifying an accuracy test as part of its compliance tests for all DCT-based encoders and decoders; this is to ensure against crudely inaccurate cosine basis functions which would degrade image quality.
`
`
For each DCT-based mode of operation, the JPEG proposal specifies separate codecs for images with 8-bit and 12-bit (per component) source image samples. The 12-bit codecs, needed to accommodate certain types of medical and other images, require greater computational resources to achieve the required FDCT or IDCT accuracy. Images with other sample precisions can usually be accommodated by either an 8-bit or 12-bit codec, but this must be done outside the JPEG standard. For example, it would be the responsibility of an application to decide how to fit or pad a 6-bit sample into the 8-bit encoder's input interface, how to unpack it at the decoder's output, and how to encode any necessary related information.
`
`4.2 Quantization
`After output from the FDCT, each of the 64 DCT
`coefficients is uniformly quantized in conjunction with
`a 64-element Quantization Table, which must be
`specified by the application (or user) as an input to the
`encoder. Each element can be any integer value from 1
`to 255, which specifies the step size of the quantizer for
`its corresponding DCT coefficient. The purpose of
`quantization is to achieve further compression by
`representing DCT coefficients with no greater precision
`than is necessary to achieve the desired image quality.
`Stated another way, the goal of this processing step is
`to discard information which is not visually significant.
`Quantization is a many-to-one mapping, and therefore
`is fundamentally lossy. It is the principal source of
`lossiness in DCT-based encoders.
`Quantization is defined as division of each DCT
`coefficient by its corresponding quantizer step size,
`followed by rounding to the nearest integer:
`
F^{Q}(u,v) = \mathrm{IntegerRound}\left( \frac{F(u,v)}{Q(u,v)} \right)    (3)
`
This output value is normalized by the quantizer step size. Dequantization is the inverse function, which in this case means simply that the normalization is removed by multiplying by the step size, which returns the result to a representation appropriate for input to the IDCT:

F^{Q'}(u,v) = F^{Q}(u,v) \times Q(u,v)    (4)
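To make the round trip of Equations 3 and 4 concrete, a small illustrative sketch (assuming NumPy as in the earlier sketch and a stand-in uniform table, not one of the informative tables mentioned below):

    import numpy as np

    def quantize(F, Q):
        """Equation 3: divide each coefficient by its step size, round to nearest integer."""
        return np.rint(F / Q).astype(int)

    def dequantize(Fq, Q):
        """Equation 4: multiply back by the step size before input to the IDCT."""
        return Fq * Q

    Q = np.full((8, 8), 16)                    # stand-in uniform table; real tables vary per coefficient
    F = np.random.uniform(-500, 500, (8, 8))   # stand-in DCT coefficients (e.g., from the FDCT above)
    error = dequantize(quantize(F, Q), Q) - F  # bounded by half a step size per coefficient
    assert np.all(np.abs(error) <= Q / 2)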
`
When the aim is to compress the image as much as possible without visible artifacts, each step size ideally should be chosen as the perceptual threshold or "just noticeable difference" for the visual contribution of its corresponding cosine basis function. These thresholds are also functions of the source image characteristics, display characteristics, and viewing distance. For applications in which these variables can be reasonably well defined, psychovisual experiments can be performed to determine the best thresholds. The experiment described in [12] has led to a set of Quantization Tables for CCIR-601 [4] images and displays. These have been used experimentally by JPEG members and will appear in the ISO standard as a matter of information, but not as a requirement.
`
`4.3 DC Coding and Zig-Zag Sequence
`After quantization, the DC coefficient is treated
`separately from the 63 AC coefficients. The DC
`coefficient is a measure of the average value of the 64
`image samples. Because there is usually strong
`correlation between the DC coefficients of adjacent 8x8
`blocks, the quantized DC coefficient is encoded as the
`difference from the DC term of the previous block in
`the encoding order (defined in the following), as shown
`in Figure 3. This special treatment is worthwhile, as
`DC coefficients frequently contain a significant fraction
`of the total image energy.
`
[Figure 3. Preparation of Quantized Coefficients for Entropy Coding: differential DC encoding between adjacent blocks and the zig-zag sequence of the AC coefficients.]
`
`
`Finally, all of the quantized coefficients are ordered
`into the “zig-zag” sequence, also shown in Figure 3.
`This ordering helps to facilitate entropy coding by
`placing low-frequency coefficients (which are more
`likely
`to be nonzero) before high-frequency
`coefficients.
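The two preparations of Figure 3 can be sketched as follows; this is an illustrative rendering only, with the zig-zag order generated from the diagonal rule and helper names that are ours, not the standard's:

    def zigzag_order():
        """The 64 (u, v) index pairs in zig-zag order: diagonals of constant u + v,
        with the traversal direction alternating from one diagonal to the next."""
        order = []
        for s in range(15):
            diagonal = [(u, s - u) for u in range(8) if 0 <= s - u < 8]
            order.extend(diagonal if s % 2 else diagonal[::-1])
        return order

    def prepare_block(quantized, previous_dc):
        """Differential DC coding plus zig-zag ordering of the 63 AC coefficients for
        one quantized 8x8 block; returns (dc_difference, ac_sequence, new_previous_dc)."""
        sequence = [quantized[u][v] for (u, v) in zigzag_order()]
        dc = sequence[0]
        return dc - previous_dc, sequence[1:], dc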
`
`4.4 Entropy Coding
The final DCT-based encoder processing step is entropy coding. This step achieves additional compression losslessly by encoding the quantized DCT coefficients more compactly based on their statistical characteristics. The JPEG proposal specifies two entropy coding methods - Huffman coding [8] and arithmetic coding [15]. The Baseline sequential codec uses Huffman coding, but codecs with both methods are specified for all modes of operation.
`
`It is useful to consider entropy coding as a 2-step
`process. The first step converts the zig-zag sequence of
`quantized coefficients into an intermediate sequence of
`symbols. The second step converts the symbols to a
`data stream in which the symbols no longer have
`externally identifiable boundaries. The form and
`definition of the intermediate symbols is dependent on
`both the DCT-based mode of operation and the entropy
`coding method.
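By way of illustration only - the normative symbol definitions are given in section 7 of the full paper and in the ISO draft [2] - the first step for the Baseline AC coefficients can be sketched as forming (run-of-zeros, size) symbols with the coefficient amplitudes, a zero-run-length extension, and an end-of-block symbol:

    def ac_symbols(ac_zigzag):
        """Step 1 for one block's 63 zig-zag-ordered AC coefficients (plain ints):
        emit (runlength, size) symbols with the nonzero amplitude, where 'size' is
        the number of bits needed for the magnitude; a (15, 0) symbol stands for a
        run of 16 zeros, and (0, 0) marks end-of-block when the rest are zero."""
        symbols, run = [], 0
        for coefficient in ac_zigzag:
            if coefficient == 0:
                run += 1
                continue
            while run > 15:                       # zero-run-length extension
                symbols.append((15, 0, None))
                run -= 16
            size = int(abs(coefficient)).bit_length()
            symbols.append((run, size, coefficient))
            run = 0
        if run:                                   # trailing zeros collapse into end-of-block
            symbols.append((0, 0, None))
        return symbols

    # ac_symbols([5, -3, 0, 0, 0, 2] + [0] * 57) -> [(0, 3, 5), (0, 2, -3), (3, 2, 2), (0, 0, None)]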
`
Huffman coding requires that one or more sets of Huffman code tables be specified by the application. The same tables used to compress an image are needed to decompress it. Huffman tables may be predefined and used within an application as defaults, or computed specifically for a given image in an initial statistics-gathering pass prior to compression. Such choices are the business of the applications which use JPEG; the JPEG proposal specifies no required Huffman tables. Huffman coding for the Baseline sequential encoder is described in detail in section 7.
`
By contrast, the particular arithmetic coding method specified in the JPEG proposal [2] requires no tables to be externally input, because it is able to adapt to the image statistics as it encodes the image. (If desired, statistical conditioning tables can be used as inputs for slightly better efficiency, but this is not required.) Arithmetic coding has produced 5-10% better compression than Huffman for many of the images which JPEG members have tested. However, some feel it is more complex than Huffman coding for certain implementations, for example, the highest-speed hardware implementations. (Throughout JPEG's history, "complexity" has proved to be most elusive as a practical metric for comparing compression methods.)
`
`If the only difference between two JPEG codecs is the
`entropy coding method, transcoding between the two is
`
`possible by simply entropy decoding with one method
`and entropy recoding with the other.
`
`4.5 Compression and Picture Quality
For color images with moderately complex scenes, all DCT-based modes of operation typically produce the following levels of picture quality for the indicated ranges of compression. These levels are only a guideline - quality and compression can vary significantly according to source image characteristics and scene content. (The units "bits/pixel" here mean the total number of bits in the compressed image - including the chrominance components - divided by the number of samples in the luminance component; the short sketch after the list makes this concrete.)

0.25-0.5 bits/pixel: moderate to good quality, sufficient for some applications;

0.5-0.75 bits/pixel: good to very good quality, sufficient for many applications;

0.75-1.5 bits/pixel: excellent quality, sufficient for most applications;

1.5-2.0 bits/pixel: usually indistinguishable from the original, sufficient for the most demanding applications.
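For clarity, the bits/pixel figure used above can be computed as in this short sketch (the function name and example numbers are ours):

    def bits_per_pixel(compressed_bytes, luma_width, luma_height):
        """Total compressed bits (all components) divided by the number of luminance samples."""
        return 8 * compressed_bytes / (luma_width * luma_height)

    # e.g., a 30,000-byte compressed 640x480 image: 8 * 30000 / (640 * 480) ~ 0.78 bits/pixel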
`
`5 Processing Steps for Predictive Lossless
`Coding
`After its selection of a DCT-based method in 1988,
`JPEG discovered that a DCT-based lossless mode was
`difficult to define as a practical standard against which
`encoders and decoders could be
`independently
`implemented, without placing severe constraints on
`both encoder and decoder implementations.
`
JPEG, to meet its requirement for a lossless mode of operation, has chosen a simple predictive method which is wholly independent of the DCT processing described previously. Selection of this method was not the result of rigorous competitive evaluation as was the DCT-based method. Nevertheless, the JPEG lossless method produces results which, in light of its simplicity, are surprisingly close to the state of the art for lossless continuous-tone compression, as indicated by a recent technical report [5].
`
Figure 4 shows the main processing steps for a single-component image. A predictor combines the values of up to three neighboring samples (A, B, and C) to form a prediction of the sample indicated by X in Figure 5.
`
`
[Figure 4. Lossless Mode Encoder Processing Steps: source image data passes through a Predictor and an Entropy Encoder (driven by table specifications) to produce the compressed image data.]
`
This prediction is then subtracted from the actual value of sample X, and the difference is encoded losslessly by either of the entropy coding methods - Huffman or arithmetic. Any one of the eight predictors listed in Table 1 (under "selection-value") can be used.

Selections 1, 2, and 3 are one-dimensional predictors and selections 4, 5, 6 and 7 are two-dimensional predictors. Selection-value 0 can only be used for differential coding in the hierarchical mode of operation. The entropy coding is nearly identical to that used for the DC coefficient as described in section 7.1 (for Huffman coding).
`
[Figure 5. 3-Sample Prediction Neighborhood: the sample X being predicted and its previously coded neighbors A (to the left of X), B (above X), and C (above and to the left of X).]
`
`For the lossless mode of operation, two different codecs
`are specified - one for each entropy coding method.
`The encoders can use any source image precision from
`2 to 16 bits/sample, and can use any of the predictors
`except selection-value 0. The decoders must handle
`any of the sample precisions and any of the predictors.
Lossless codecs typically produce around 2:1 compression for color images with moderately complex scenes.
`
selection-value    prediction
0                  no prediction
1                  A
2                  B
3                  C
4                  A+B-C
5                  A+((B-C)/2)
6                  B+((A-C)/2)
7                  (A+B)/2

Table 1. Predictors for Lossless Coding
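An illustrative rendering of the Table 1 predictors (the function name is ours, and integer division here merely stands in for the standard's exact arithmetic):

    def predict(selection, A, B, C):
        """Table 1 predictors: A is the reconstructed sample to the left of X,
        B the sample above, and C the sample above and to the left (selection 0
        is used only for differential coding in the hierarchical mode)."""
        return {
            0: 0,
            1: A,
            2: B,
            3: C,
            4: A + B - C,
            5: A + (B - C) // 2,
            6: B + (A - C) // 2,
            7: (A + B) // 2,
        }[selection]

    # The encoder codes the difference X - predict(selection, A, B, C) losslessly.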
`
`6 Multiple-Component Images
`The previous sections discussed the key processing
`steps of the DCT-based and predictive lossless codecs
`for the case of single-component source images. These
`steps accomplish the image data compression. But a
`good deal of the JPEG proposal is also concerned with
`the handling and control of color (or other) images with
multiple components. JPEG's aim for a generic compression standard requires its proposal to accommodate a variety of source image formats.
`
`6.1 Source Image Formats
`The source image model used in the JPEG proposal is
`an abstraction from a variety of image types and
`applications and consists of only what is necessary to
`compress and reconstruct digital image data. The
`reader should recognize that the JPEG compressed data
`format does not encode enough information to serve as
`a complete image representation. For example, JPEG
`does not specify or encode any information on pixel
`aspect ratio, color space, or
`image acquisition
`characteristics.
`
`
[Figure 6. JPEG Source Image Model: (a) a source image with multiple components C1 through CNf; (b) characteristics of an image component, a rectangular array of samples oriented from top to bottom.]
`
Figure 6 illustrates the JPEG source image model. A source image contains from 1 to 255 image components, sometimes called color or spectral bands or channels. Each component consists of a rectangular array of samples. A sample is defined to be an unsigned integer with precision P bits, with any value in the range [0, 2^P - 1]. All samples of all components within the same source image must have the same precision P. P can be 8 or 12 for DCT-based codecs, and 2 to 16 for predictive codecs.
`
The ith component has sample dimensions xi by yi. To accommodate formats in which some image components are sampled at different rates than others, components can have different dimensions. The dimensions must have a mutual integral relationship defined by Hi and Vi, the relative horizontal and vertical sampling factors, which must be specified for each component. Overall image dimensions X and Y are defined as the maximum xi and yi for all components in the image, and can be any number up to 2^16. H and V are allowed only the integer values 1 through 4. The encoded parameters are X, Y, and the Hi's and Vi's for each component. The decoder reconstructs the dimensions xi and yi for each component, according to the following relationship shown in Equation 5:

x_i = \left\lceil X \times \frac{H_i}{H_{\max}} \right\rceil, \qquad y_i = \left\lceil Y \times \frac{V_i}{V_{\max}} \right\rceil    (5)

where \lceil\ \rceil is the ceiling function.
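A small sketch of the Equation 5 reconstruction, with illustrative names and example values that are ours:

    import math

    def component_dimensions(X, Y, H, V):
        """Recover each component's sample dimensions (xi, yi) from the encoded
        overall dimensions X, Y and the per-component sampling factors H[i], V[i]."""
        Hmax, Vmax = max(H), max(V)
        return [(math.ceil(X * Hi / Hmax), math.ceil(Y * Vi / Vmax))
                for Hi, Vi in zip(H, V)]

    # e.g., X=640, Y=480 with H=[2, 1, 1], V=[2, 1, 1] gives
    # [(640, 480), (320, 240), (320, 240)].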
`
`6.2 Encoding Order and Interleaving
A practical image compression standard must address how systems will need to handle the data during the process of decompression. Many applications need to pipeline the process of displaying or printing multiple-component images in parallel with the process of decompression. For many systems, this is only feasible if the components are interleaved together within the compressed data stream.
`
`To make the same interleaving machinery applicable to
`both DCT-based and predictive codecs, the JPEG
`proposal has defined the concept of “data unit.” A data
unit is a sample in predictive codecs and an 8x8 block
`of samples in DCT-based codecs.
`
`The order in which compressed data units are placed in
`the compressed data stream is a generalization of
`raster-scan order. Generally, data units are ordered
`from left-to-right and top-to-bottom according to the
`orientation shown in Figure 6. (It is the responsibility
`of applications to define which edges of a source image
`are top, bottom, left and right.) If an image component
`is noninterleaved (i.e., compressed without being
`interleaved with other components), compressed data
`units are ordered in a pure raster scan as shown in
`Figure 7.
`
[Figure 7. Noninterleaved Data Ordering: data units in pure raster order, left-to-right within rows, rows from top to bottom.]
`
`When two or more components are interleaved, each
`component Ci is partitioned into rectangular regions of
`Hi by Vi data units, as shown in the generalized
`example of Figure 8. Regions are ordered within a
`component from left-to-right and top-to-bottom, and
`within a region, data units are ordered from left-to-right
and top-to-bottom.
`
`
[Figure 8. Generalized Interleaved Data Ordering Example: four components with sampling factors Cs1: H1=2, V1=2; Cs2: H2=2, V2=1; Cs3: H3=1, V3=2; Cs4: H4=1, V4=1, each partitioned into regions of Hi by Vi data units whose contents are interleaved into successive MCUs.]
`
The JPEG proposal defines the term Minimum Coded Unit (MCU) to be the smallest group of interleaved data units. For the example shown, MCU1 consists of data units taken first from the top-left-most region of Cs1, followed by data units from the same region of Cs2, and likewise for Cs3 and Cs4. MCU2 continues the pattern as shown.
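An illustrative sketch of this interleaving order (the data-unit containers and names here are ours, not the standard's):

    def mcu_sequence(components):
        """'components' is a list of dicts with keys 'H', 'V', and 'units', the
        component's 2-D list of data units indexed [row][column].  Yields one MCU
        at a time: for each region position (left-to-right, top-to-bottom), each
        component in turn contributes its Hi-by-Vi rectangle of data units."""
        # With consistent sampling factors, every component spans the same number
        # of regions; take the count from the first component.
        region_rows = len(components[0]['units']) // components[0]['V']
        region_cols = len(components[0]['units'][0]) // components[0]['H']
        for r in range(region_rows):
            for c in range(region_cols):
                mcu = []
                for comp in components:
                    for dr in range(comp['V']):
                        for dc in range(comp['H']):
                            mcu.append(comp['units'][r * comp['V'] + dr][c * comp['H'] + dc])
                yield mcu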
`Thus, interleaved data is an ordered sequence of MCUs,
`and the number of data units contained in an MCU is
`determined by the number of components interleaved
`and their relative sampling factors. The maximum
`number of components which can be interleaved is 4
`and the maximum number of data units in an MCU is
10. The latter restriction is expressed as shown in Equation 6, where the summation is