`
PROCEEDINGS OF THE IEEE, VOL. 68, NO. 3, MARCH 1980
`
`Picture Coding: A Review
`
ARUN N. NETRAVALI, SENIOR MEMBER, IEEE, AND JOHN O. LIMB, FELLOW, IEEE
`
`Invited Paper
`
Abstract—This paper presents a review of techniques used for digital encoding of picture material. Statistical models of picture signals and elements of psychophysics relevant to picture coding are covered first, followed by a description of the coding techniques. Detailed examples of three typical systems, which combine some of the coding principles, are given. A bright future for new systems is forecast, based on emerging new concepts, the technology of integrated circuits, and the need to digitize in a variety of contexts.
`
I. INTRODUCTION

BROADCAST television has assumed a dominant role in everyday life to such an extent that today in the U.S. there are more homes that contain a television set than have telephone service. So it is natural that in thinking of television transmission we immediately think of the signal that is broadcast into the home. More efficient encoding of this signal would free valuable spectrum space. A difficulty in modifying the television signal that is broadcast for local distribution is that the television receiver would most likely need to be modified or replaced.¹ The difficulty of achieving this with an invested base of over $10 billion is staggering.

Manuscript received May 11, 1979; revised October 2, 1979.
A. N. Netravali is with Bell Laboratories, Holmdel, NJ 07733.
J. O. Limb is with Bell Laboratories, Murray Hill, NJ 07974.
¹ However, there is the possibility of improving picture quality by modifying the transmitted signal such that it remains compatible with existing television receivers.

There is a large amount of point-to-point transmission of picture material taking place today apart from UHF/VHF broadcasting. For example, each of the four U.S. television networks has a distribution system spanning the whole of the continental United States; international satellite links transmit live programs around the world. Video-conferencing services are receiving increasing attention, and facsimile transmission of newspapers and printed material is becoming more widespread. Satellites are beaming to earth a continuous stream of weather photographs and earth-resource pictures, and there are a number of important military applications such as the control of remotely piloted vehicles. Efficient coding of picture material for these applications provides the opportunity for significantly decreasing transmission costs. These costs can be quite large; in comparison with a digitized speech signal at 64 kb/s, straightforward digitization of a broadcast television signal requires approximately 100 Mb/s. The aim of efficient coding is to reduce the required transmission rate for a given picture quality so as to yield a reduction in transmission costs.

A further area of application of efficient coding is where picture material needs to be stored, for example, in archiving X-ray material and in storing picture databases such as engineering drawings and fingerprints. Efficient representation will permit the storage requirements to be reduced.
Some early efforts in picture coding used analog coding techniques and attempted to reduce the required analog bandwidth, giving rise to the term "bandwidth compression."² Complex manipulations of the signal are today much more easily done by first sampling and digitizing the signal and then processing the signal in the digital domain rather than using analog techniques. The resulting signal may be converted back to analog form for transmission over an analog channel or be retained in digital form for transmission over a digital channel. Almost all coding methods have been oriented toward digital transmission for a number of reasons: it offers greater flexibility, it may be regenerated, it is easily multiplexed and encrypted, and its ubiquity is increasing [1].

² Channel capacity is a function of both bandwidth and signal-to-noise ratio; thus compressing bandwidth may not reduce channel capacity if a lower noise channel is required as a result.

0018-9219/80/0300-0366$00.75 © 1980 IEEE

PMC Exhibit 2025
Apple v. PMC
IPR2016-01520
Page 1

Fig. 1. Block diagram of the encoding process. [Diagram not reproduced; legible block labels: input signal, representation, irreversible operation (quantization), reversible operation (word assignment), coded output.]

Fig. 2. Source and channel encoding. [Diagram not reproduced; legible block labels: input, source coder, channel coder, output.]

Efficient coding is usually achieved in three stages (Fig. 1).
1) An initial stage in which an appropriate representation of the signal is made, for example, a set of transform coefficients for transform encoding. This operation is generally reversible.³ Statistical redundancy may also be reduced.
2) A stage in which the accuracy of representation is reduced while still meeting the required picture quality objectives [2]. For example, dark portions of a picture may be coded more accurately than lighter portions to utilize the fact that the visual system is more sensitive to small signal changes in the darker areas. This operation is irreversible.
3) A stage in which statistical redundancy in the signal is eliminated. For example, a Huffman code [3] may be used to assign shorter code words to signal values that occur more frequently and longer code words to values that occur rarely. This operation is reversible.
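
Stage 3 can be illustrated with a short sketch of Huffman code construction. This is a minimal illustration rather than any encoder described in the paper: the function names and the sample values (a hypothetical run of quantized difference values in which zero dominates) are assumptions made for the example.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from observed symbol counts."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # Degenerate source: a single symbol still needs a one-bit code word.
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tie_breaker, partial_code_table); the unique
    # tie_breaker keeps Python from ever comparing the dictionaries.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)  # two least frequent subtrees
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, code):
    """Concatenate the code words of a symbol sequence."""
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    """Invert encode(); works because no code word is a prefix of another."""
    inverse = {b: s for s, b in code.items()}
    out, word = [], ""
    for bit in bits:
        word += bit
        if word in inverse:
            out.append(inverse[word])
            word = ""
    return out


# Hypothetical quantized values: 0 occurs most often, 2 most rarely.
samples = [0] * 12 + [1] * 4 + [-1] * 3 + [2] * 1
code = huffman_code(samples)
bits = encode(samples, code)
```

With these counts the most frequent value receives a one-bit code word, and decoding the bit string returns the original samples exactly, which is the sense in which this stage is reversible.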
In practice, transmission channels are frequently prone to errors, and a "catch 22" of coding is that when the signal is represented more efficiently the effect of an error becomes far more serious. Consequently, it is frequently necessary to add a controlled form of redundancy back into the signal in the form of channel encoding in order to reduce the impact of transmission errors. The typical configuration, then, is shown in Fig. 2, with the coding broken down into source encoding, in which redundancy is removed from the signal for the purpose of achieving a more efficient representation, and channel coding, where redundancy is reinserted into the signal in order to obtain better channel-error performance. It goes without saying that the increase in bit rate resulting from the channel coding stage should be significantly less than the decrease in bit rate resulting from the source encoding operation in order to realize a saving. In practice the application of picture coding to transmission channels is an economic tradeoff in system design, balancing picture quality, circuit complexity, bit rate, and error performance.
Where coding is used to reduce storage requirements, the tradeoffs are different in that the coding operation usually need not be performed in real time and buffering may not be needed to match the output generation rate of the coder to the transmission rate of the channel. Further, the error rate encountered in the process of storage and retrieval is usually many orders of magnitude lower than the design error rate for a digital channel. As a result, for purposes of storage one can consider more complicated encoding algorithms without concern about the effects of a large error rate.
In this paper, we will be concerned primarily with describing efficient picture coding algorithms. The paper is addressed to the nonspecialist but does assume some background in digital processing techniques. The literature in this area is extensive [4], [5] and we will describe those aspects of the art which we feel are most significant. References [6]-[12] are special issues which give more detail about certain aspects of the subject. The whole topic of the efficient coding of color signals is covered in a recent paper [13] and for this reason color coding will be discussed very cursorily. One specific type of signal is the two-level (black/white) waveform that results from scanning a facsimile image. This special topic is covered in [14] and is not discussed here. A recent book contains reviews of many aspects of picture coding [15].

³ DPCM encoding (see Section IV-B) combines stages 1 and 2.

We start by providing background on the nature and properties of the television signal source in Section II and on the human observer (who is in most applications the ultimate receiver) in Section III. In Section IV, basic waveform coding techniques are first classified and then discussed under the categories pulse-code modulation (PCM), differential PCM (DPCM), transform, hybrid, interpolative, and contour. Section V contains descriptions of state-of-the-art examples of transform encoding, frame-to-frame DPCM, and frame-to-frame interpolative encoding and indicates how the techniques of the previous section have been combined in practical encoders. In Section VI issues such as the direction of new developments and the effect of new technology are discussed.
`
II. SOURCE CODING AND PICTURE STATISTICS

Ideally, one would like to take advantage of any structure (both geometric and statistical) in a picture signal to increase the efficiency of the encoding operation. Also the coding process should take into consideration the resolution (amplitude, spatial, and temporal) requirements of the receiver, i.e., the television display and very often the human viewer.⁴ This problem of encoding can be formulated in the general framework of information theory as a source coding problem. In this section, we describe briefly the source coding problem and point out some of the difficulties in the use of results from information theory. We then present some known statistics of the picture signals and models based on these statistics.
`
⁴ There are many instances where pictures are processed and/or transmitted for interpretation by a machine.

Fig. 3. Scanning process employed in a television signal. [Diagram not reproduced; legible labels: field I, field (I+1), field (I+2), scanning line, picture samples, 1/30 second.]

Fig. 4. Histogram of intensities of a typical image. The two peaks are in the dark and light regions of the image. [Plot not reproduced; axes: intensity versus number of pels.]

A. Source Coding Problem

The source coding problem can be stated mathematically as follows. Given a random source waveform $L(x, y, t)$ representing, for example, the luminance information in the picture, obtain an encoding strategy such that for a given transmission bit rate it minimizes the average distortion $D$ defined as

$$D = E[d(L, \hat{L})] \tag{1}$$

where $d(L, \hat{L})$ is a measure of distortion between two intensity fields, $L$ and $\hat{L}$, $\hat{L}$ being the coded representation, and $E$ denotes the statistical expectation over the ensemble of source waveforms. Design of such an encoding strategy obviously depends on the statistical description of the random source waveform, $L$, and on the characteristics of the distortion function $d$. Shannon's rate distortion theory [16], [17] provides the mathematical framework for analysis of this source coding problem. Let $p_1(L)$ be the probability density function of $L$, and $p_2(\hat{L} \mid L)$ be a conditional density corresponding perhaps to an encoding and decoding operation; then the rate distortion function $R(D)$ is defined as

$$R(D^*) = \min_{p_2(\hat{L} \mid L):\; D \le D^*} \{ I(L, \hat{L}) \} \tag{2}$$

where $I(L, \hat{L})$ is the average mutual information between the two random waveforms, source $L$ and its reconstruction $\hat{L}$, and the minimum is taken over all the encoding strategies which result in average distortion $D$ less than or equal to a given number $D^*$. Average mutual information $I(L, \hat{L})$ is defined by

$$I(L, \hat{L}) = \iint p_1(L)\, p_2(\hat{L} \mid L)\, \log_2 \frac{p_2(\hat{L} \mid L)}{p_3(\hat{L})}\; dL\, d\hat{L} \tag{3}$$

where $p_3(\hat{L})$ is the probability density of $\hat{L}$. Qualitatively, the mutual information represents the average uncertainty in the source output minus the average uncertainty in the source output given the coded output $\hat{L}$. The above definition of the rate distortion function becomes significant in the light of the coding theorem of Shannon, which states that for stationary sources an encoding strategy, however complex, cannot be designed to give an average distortion less than $D$ for an average transmission rate $R(D)$; but it is possible to have an encoding strategy to give an average distortion $D$ at a transmission bit rate arbitrarily close to $R(D)$. Thus the rate distortion function gives the minimum transmission rate to achieve an average distortion $D$ and, therefore, provides a bound on the performance of any given encoding strategy, i.e., we can find out how far from the optimum any given practical encoding strategy is. Also it is possible to construct codes (e.g., block codes, tree codes) whose asymptotic performance in terms of rate will be close to $R(D)$; however, this information does not tell us precisely how to build practical encoders, but it is valuable in calibrating them.
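
To make the idea of calibration concrete, the sketch below evaluates the rate distortion function for one case where it has a simple closed form. It assumes a memoryless Gaussian source with a squared-error distortion measure, for which the standard textbook result is R(D) = max(0, ½ log₂(σ²/D)). This is not one of the picture models discussed in this paper; the function names and the numerical values are illustrative assumptions.

```python
import math


def gaussian_rate(variance, distortion):
    """R(D) in bits/sample for a memoryless Gaussian source (assumed model)
    under mean-square error: max(0, 0.5 * log2(variance / distortion))."""
    if variance <= 0 or distortion <= 0:
        raise ValueError("variance and distortion must be positive")
    return max(0.0, 0.5 * math.log2(variance / distortion))


def gaussian_distortion(variance, rate):
    """Inverse relation D(R) = variance * 2**(-2R), valid for rate >= 0."""
    return variance * 2.0 ** (-2.0 * rate)


def coder_gap_db(variance, rate, measured_mse):
    """Calibration: how many dB the measured mean-square error of a practical
    coder lies above the bound D(R) at the same rate."""
    bound = gaussian_distortion(variance, rate)
    return 10.0 * math.log10(measured_mse / bound)


# Hypothetical coder: unit-variance source, 2 bits/sample, measured MSE 0.1.
gap = coder_gap_db(1.0, 2.0, 0.1)
```

Under this assumed model each additional bit per sample lowers the achievable mean-square error by a factor of four (about 6 dB), and `coder_gap_db` expresses how far a given coder is from that bound.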
In addition to the problem that rate distortion theory does not tell us how to synthesize a practical coder, it has other limitations. It is difficult to compute rate distortion functions for many realistic models of the picture source and distortion criteria. One of the combinations of source distributions and distortion criteria for which the minimization problem of (2) is solved is when the waveform $L(x, y, t)$ is taken to be a sequence of spatial images $L(x, y)$ representing a Gaussian random field, and distortion between $L$ and $\hat{L}$ is measured by a weighted square error [18]. In this case, the optimum encoder first filters the luminance field $L(x, y)$ by the error weighting function and expands the filtered image into its Karhunen-Loeve components (see Section IV-C). The Karhunen-Loeve components are then represented (in binary bits, for example) with equal mean-square error and transmitted. At the receiver, an estimate of the filtered luminance field is reconstructed, and it is inverse filtered to obtain an approximation of the original image. Although the optimal encoder is known explicitly in this case, the assumptions under which it is derived are not entirely appropriate for the problem of picture communication. The luminance of most picture signals does not approximate a Gaussian process, and the weighted square error criterion (see Section III) is not appropriate if the pictures are viewed by human observers. Summarizing, there are four problems in the use of rate distortion theory: 1) lack of good statistical models for picture signals; 2) lack of a distortion criterion consistent with the visual processing of human observers; 3) difficulty of calculating rate distortion functions; and 4) synthesis of an encoder that performs close to $R(D)$.
`
B. Picture Signal Statistics and Models

Perhaps because rate distortion theory presents many problems in its use for picture coding, many ad hoc encoding schemes have been proposed to exploit different types of observed redundancies in the picture signal. We give a brief summary of picture signal statistics that is useful in the discussion of encoding schemes described in Section IV.

We start with the first-order statistics. We employ the conventional scanning and sampling process shown in Fig. 3 to convert the television signal from a scene into a sequence of samples. This is done by first sampling in time to get fields and then periodically sampling a matrix of picture elements (pels) of chosen resolution in the field. We note that two consecutive fields are interlaced vertically in space, i.e., the spatial position of a scanning line in a field is midway between the spatial positions of scanning lines in either of its two adjacent fields. Also note that, due to this interlace, the distance between two horizontally adjacent pels is smaller than the distance between two vertically adjacent pels. The probability density of luminance samples thus generated is highly nonuniform, depends upon the camera settings and scene illumination, and varies widely from picture to picture. A histogram of pel intensities from a typical picture, shown in Fig. 4, demonstrates that, even based on the first-order statistics, the luminance does not approximate a Gaussian process [19].
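
First-order statistics of this kind are straightforward to compute. The sketch below builds an intensity histogram and the corresponding first-order entropy in bits per pel. The pel values are a made-up two-peaked example (dark and light clusters) and the function names are assumptions for illustration; nothing here reproduces the data behind Fig. 4.

```python
import math
from collections import Counter


def histogram(pels):
    """Count how many pels take each intensity value."""
    return Counter(pels)


def first_order_entropy(pels):
    """Entropy in bits/pel of the empirical intensity distribution.

    This ignores all dependence between pels, so it only reflects the
    shape of the histogram."""
    counts = histogram(pels)
    n = len(pels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical picture with one dark and one light cluster of intensities.
pels = [20] * 6 + [21] * 2 + [200] * 6 + [201] * 2
hist = histogram(pels)
bits_per_pel = first_order_entropy(pels)
```

A strongly nonuniform histogram such as this one gives an entropy well below the 8 bits/pel of straightforward PCM, which is the redundancy a first-order variable-length code could remove.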
Measurements of some second-order statistics [20]-[22] show that the autocorrelation function depends upon the detail in the picture. In general, the shape of the autocorrelation function can be qualitatively related to the structure of the picture. Fig. 5 shows two pictures: a head and shoulders view of a person, and a picture containing white letters on a black background. It is easy to see the relationship between these pictures and their autocorrelations shown in Fig. 5(e) and (f). Figs. 5(c) and (d) show that the autocorrelation functions decrease with increasing shift in the pels. The rate of decrease is large for shifts close to zero, but becomes smaller for large shifts. The envelope of the power spectrum shown in Fig. 6 is relatively flat to about twice the line rate (30 kHz for broadcast television), where it begins to drop at about 6 dB/octave, implying that most of the video energy is contained in the low frequencies [23], or equivalently that the neighboring pels are highly correlated. Based on these measurements, autocorrelation functions in two dimensions have been approximated [24], [25] by functions of the form

$$\exp(-k_1 |\Delta x| - k_2 |\Delta y|) \quad \text{and} \quad \exp\left[-(k_1 \Delta x^2 + k_2 \Delta y^2)^{1/2}\right]$$

where $\Delta x$ and $\Delta y$ are spatial displacements and $k_1$ and $k_2$ are positive constants. Either one of these can be the better approximation, depending on the type of picture. In general, however, the second expression appears to be closer to the measured data. Using these expressions, different models have been made and used to synthesize optimal encoders [25], [26]. One of the consequences of such a high degree of correlation is that the histogram of the adjacent element difference signal, $\{L(x_i, y_j) - L(x_{i-1}, y_j)\}$, is highly peaked at zero [27], [28]. Also, as measurements of Schreiber [27] and others [20], [28] indicate, most of the second-order redundancy (i.e., redundancy contained in blocks of two adjacent samples) is removed by coding adjacent element differences. Therefore, use of three previous samples for prediction does not result in significantly lower sample entropies of the prediction error histograms than the use of two previous samples. Due to the highly peaked nature of the histograms of the prediction errors, they have been modeled by the Laplacian density [29], [30]. Very few measurements [31] have been made of statistics of order higher than the second, primarily due to their variability from picture to picture, and due to the fact that a good method of utilizing such statistics for the purpose of coding does not exist.
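
The two autocorrelation models and the effect of coding adjacent element differences can be sketched numerically. The example below is an illustration under stated assumptions, not a reproduction of the cited measurements: it synthesizes one scan line as a first-order Markov (autoregressive) Gaussian sequence whose autocorrelation follows the first model along $x$, and then compares the empirical entropy of the samples with that of their differences. The function names, the value of `k1`, the standard deviation, and the random seed are all hypothetical.

```python
import math
import random
from collections import Counter


def autocorr_separable(dx, dy, k1, k2):
    """First model: exp(-k1*|dx| - k2*|dy|)."""
    return math.exp(-k1 * abs(dx) - k2 * abs(dy))


def autocorr_isotropic(dx, dy, k1, k2):
    """Second model: exp(-sqrt(k1*dx^2 + k2*dy^2))."""
    return math.exp(-math.sqrt(k1 * dx * dx + k2 * dy * dy))


def entropy_bits(values):
    """Empirical first-order entropy in bits per value."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def markov_scan_line(n, k1, sigma=30.0, seed=0):
    """Synthetic scan line (assumed model): first-order Markov sequence with
    adjacent-pel correlation rho = exp(-k1), rounded to integer levels."""
    rng = random.Random(seed)
    rho = autocorr_separable(1, 0, k1, 0.0)
    x, line = 0.0, []
    for _ in range(n):
        x = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, sigma)
        line.append(round(x))
    return line


line = markov_scan_line(5000, k1=0.05)
diffs = [b - a for a, b in zip(line, line[1:])]  # adjacent element differences
sample_entropy = entropy_bits(line)
difference_entropy = entropy_bits(diffs)
```

Because adjacent samples are highly correlated, the differences cluster near zero and their entropy is noticeably lower than that of the samples themselves, which is the qualitative behavior the measurements above describe.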
Just as the statistical measurements and models for still pictures are lacking, there are even fewer measurements on the luminance signal taken as a function both of space and time. Interframe statistics depend very heavily on the type of scene and, therefore, show a wide variation from scene to scene. Some early measurements [32] indicate that since television frames are taken at 30 times a second, there is a high degree of correlation from frame to frame. Thus the histogram of the frame-difference signal is highly peaked at zero. For videotelephone-type scenes, where the camera is stationary and the movement of subjects is rather limited, on the average only about 9 percent of the samples change by a significant amount (i.e., more than about 1.5 percent of the peak intensity) from frame to frame [33]. In broadcast television, where the cameras are not always stationary and there is frequently very large movement in scenes, there would be less frame-to-frame correlation than in videotelephone or videoconference scenes.
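
The "fraction of samples that change significantly" statistic quoted above can be computed as in the following sketch. The 1.5 percent threshold comes from the text; the 8-bit peak value of 255, the function name, and the tiny 4 × 4 frames are assumptions chosen only to keep the example small.

```python
def changed_fraction(prev_frame, curr_frame, peak=255, threshold=0.015):
    """Fraction of pels whose frame-to-frame change exceeds threshold * peak.

    Frames are lists of rows of intensities with identical dimensions."""
    limit = threshold * peak
    total = 0
    changed = 0
    for row_prev, row_curr in zip(prev_frame, curr_frame):
        for a, b in zip(row_prev, row_curr):
            total += 1
            if abs(b - a) > limit:
                changed += 1
    return changed / total if total else 0.0


# Hypothetical 4 x 4 frames: a flat background with three altered pels,
# one of which changes by less than the threshold (2 levels < 3.825).
previous = [[100] * 4 for _ in range(4)]
current = [row[:] for row in previous]
current[0][0] = 110   # significant change
current[2][3] = 90    # significant change
current[3][1] = 102   # below threshold, not counted
fraction = changed_fraction(previous, current)
```

Here two of the sixteen pels exceed the threshold, so `fraction` is 0.125; for the videotelephone scenes cited above the corresponding average is about 0.09.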
More recent measurements [34] on the statistics of frame-difference signals indicate that, for scenes containing objects more or less in rectilinear motion, the power spectrum of the frame-difference signal is essentially flat at low speeds, and that the power of the frame-difference signal at low frequencies increases by about 7 dB for every doubling of the speed. This is seen for a typical scene in Fig. 7. As would be expected, the spectra of frame-difference signals measured in the direction of motion show nulls at appropriate speeds, whereas spectra measured in the direction orthogonal to the direction of motion show no such nulls. Another interesting observation is that as the amount of motion increases, due to integration of the signal in the camera, the spatial correlation of picture elements increases and the temporal correlation decreases (see Fig. 7). Also there is more correlation spatially orthogonal to the direction of motion than spatially parallel or in the temporal direction. It is obvious from these measurements that there are still quite a lot of interframe statistics that are unknown.
We close this section by pointing out some recent models of picture signals which appear to be more realistic and promising. As mentioned before, the picture signal, in general, is highly nonstationary, and the local statistics vary considerably from region to region. Some of this difficulty can be overcome by considering the picture signal as the output of many sources, each tuned to a certain type of statistics [35], [36]. Yan and Sakrison [35], for example, consider a two-component model in which the vertical edges (or the high-frequency components) are treated as one component and the rest (texture details) are treated as the other component. They argue that if the edge information is subtracted from the picture signal, the rest of the signal appears to be close to a Gaussian process and, therefore, the optimal encoder mentioned earlier can be applied. Rate distortion theory of such two-component models may find greater use, and a beginning has already been made [37]. In a different context, Lebedev and Mirkin [38], [39] develop a composite source model and describe experiments in which a picture signal is broken down into many components by using correlations at 0°, 45°, 90°, and 135° to the horizontal. They look at the picture signal as the weighted sum of these five components, weights being given by a random variable. Thus the model can be considered to be locally anisotropic, but on the average isotropic. Impressive results are claimed by Lebedev and Mirkin for image restoration using such a model. Such models have a large potential, if appropriate components could be determined and a suitable method of combining these components to form the composite picture signal could be found. A similar idea has been explored by Maxemchuk and Stuller [36], who model the image as a random field that is partitioned into independent quasi-stationary subfields. Each subfield is the output of one of six possible autoregressive sources, whose selection is governed by a space-varying probability distribution that is unknown a priori to the observer. The model also includes a white subsource that initiates the autoregressive sources at certain boundaries within the picture. Maxemchuk and Stuller apply this model to adaptive DPCM using a mean-square error criterion for each point and claim good results.

Fig. 5. (a) Head and shoulders view of a person. (b) White text on black background. (c) and (d) The autocorrelation function in horizontal and vertical direction for both scenes (a) and (b). These are for a typical videotelephone display, with 208 samples/line and 250 lines/frame and with a picture size of 5.5 in by 5 in. Horizontal sample spacing is then 0.02644 in and vertical line spacing is 0.02000 in (without regard to interlace). (e) and (f) The contours of equal autocorrelations for scenes (a) and (b). HU denotes the horizontal sample distance unit. [Images and plots not reproduced.]

Fig. 6. Envelope of the power spectrum of a typical video signal. Note that the envelope is relatively flat up to about twice the line rate, where it begins to drop at about 6 dB/octave (from Connor et al. [23]). [Plot not reproduced; axes: frequency in MHz versus relative power in dB.]

Fig. 7. (a) Power density spectra of the video signal at speeds of 0.5, 2.0, and 4.0 pels per frame (pef). This is for a videotelephone type of signal containing a head and shoulders view of a person. The attenuation at high frequencies is due to the pre- and postfiltering. The effect of camera integration on the video signal at higher speeds is seen in the reduced power at high frequencies. (b) Power density spectra of the frame-difference signal at speeds of 0.5, 2.0, and 4.0 pef. Note the increase in power density at low frequencies as the speed increases and the small dip at approximately 0.45 MHz in the curve for a speed of 4 pef. (c) Comparison of power density spectra of the element-difference signal and the frame-difference signal, both recorded at a speed of 1 pef. The dashed curve is for the frame-difference signal (from Connor and Limb [34]). [Plots not reproduced; axes: frequency in MHz versus power in dB.]

III. PROPERTIES OF THE RECEIVER

A. Picture Quality

Systematic distortions occur in representing a live scene by a television picture. For example, the contrast ratio in a scene (the ratio of the luminance of the lightest to the darkest parts) can frequently be 200:1 or greater, whereas it is difficult to obtain a contrast ratio much greater than 50:1 under normal television viewing conditions; the color television tube, by mixing three primary colors, reproduces the approximate chromaticities of the original scene, not a scene having the same spectral distribution. The fact that the viewer is usually happy to accept these approximations implies that he is not particularly sensitive to them, even when he can make a direct comparison between the original and the reproduction.

Instead of seeking to make the reproduced picture as similar to the original as possible, consistent with the shortcomings of the system, one can purposely distort the picture to obtain a more pleasing effect. Examples would be filtering the signal (linear or nonlinear) in order to make it appear crisper [40], or altering hue so as to give the appearance of a healthy tan.

The task to be performed will largely determine the criteria that are used to determine picture quality. Thus a photointerpreter would attach great importance to sharpness and probably less to accurate tonal reproduction. We will be mainly concerned with an average television viewer who is performing no specific task related to the image structure in contradistinction to, say, imaging for medical diagnostics. It is convenient to start with the existing analog signal as a reference and measure distortion by the extent to which the distorted picture differs in appearance from the analog signal.

TABLE I

(a) Overall quality scale
5  Excellent
4  Good
3  Fair
2  Poor
1  Bad

(b) Impairment scale
5  Imperceptible
4  Perceptible but not annoying
3  Slightly annoying
2  Annoying
1  Very annoying

(c) Comparison scale
 3  Much better
 2  Better
 1  Slightly better
 0  Same
-1  Slightly worse
-2  Worse
-3  Much worse

`
B. Measurement of Picture Quality

Measurements of picture quality must depend upon subjective evaluations either directly or indirectly [41]. Subjective testing is very time consuming and consequently is avoided where possible. In primary or explicit measurement of picture quality a group of subjects make subjective decisions, while in secondary or implicit measurement, objective characteristics of standardized waveforms are measured and the results are then converted to quality measures through previously established relations. In the digital processing of pictures, distortions are frequently introduced that are complex in nature (e.g., they can be a complex function of the signal) such as edge noise, slope overload, and movement-related distortion [42]. In such instances existing indirect methods are of little use.
Subjective evaluations are of two broad types, rating-scale methods and comparison methods. In the rating-scale method, the subject views a sequence of pictures under comfortable, natural conditions and assigns each picture to one of several given categories. The subject may be assigning an overall quality rating to the picture using categories such as those listed in Table I(a) or he may use an impairment scale as shown in Table I(b). The results of a rating will depend upon many factors: the experience and motivation of the subjects, the range of the picture material used, and the conditions under which the picture is viewed (e.g., ambient illumination, contrast ratio, and viewing distance). These variables have been explored in depth and standardization is taking place at the international level [43]. This enhances the utility of the procedure, making it more feasible to compare results obtained at different times and in different laboratories.
In the comparison method, the subject adds impairment of a standard type (e.g., white noise) to a reference picture until he judges the impaired and reference pictures to be of equal quality. This can be done very accurately where the two types of distortion are similar in appearance, for example, equating additive noise of differing spectral distribution. The distortion can then be assigned a quality by referring to rating-scale tests on the standard impairment. One should not expect that the resulting ratings will necessarily be transitive. In a variation of this method the subject uses a comparison rating scale (Table I(c)) to compare pictures having various levels of a distortion with a reference picture. The resulting data is then processed to obtain the level which produces the "point of subjective equality" between the distorted picture and the reference.
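
One simple way such comparison data might be processed is sketched below. The text does not say how the point of subjective equality is obtained, so the procedure here (averaging the Table I(c) votes at each distortion level and linearly interpolating to the level where the mean crosses zero) is an assumption made for illustration, as are the function names and the vote data.

```python
def mean_rating(votes):
    """Average comparison-scale vote (+3 ... -3) at one distortion level."""
    return sum(votes) / len(votes)


def point_of_subjective_equality(levels, ratings):
    """Distortion level at which the mean comparison rating crosses zero.

    Uses linear interpolation between neighboring levels (an assumed
    procedure). Returns None if the ratings never reach or cross zero."""
    points = sorted(zip(levels, ratings))
    for (x0, r0), (x1, r1) in zip(points, points[1:]):
        if r0 == 0:
            return float(x0)
        if r0 * r1 < 0:  # sign change between x0 and x1
            return x0 + (x1 - x0) * (0 - r0) / (r1 - r0)
    if points and points[-1][1] == 0:
        return float(points[-1][0])
    return None


# Hypothetical votes from three subjects at four distortion levels.
votes_by_level = {
    1: [2, 2, 2],
    2: [1, 1, 1],
    3: [-1, -1, -1],
    4: [-2, -2, -2],
}
levels = sorted(votes_by_level)
ratings = [mean_rating(votes_by_level[lv]) for lv in levels]
pse = point_of_subjective_equality(levels, ratings)
```

For these made-up votes the mean rating changes sign between levels 2 and 3, so the interpolated point of subjective equality is 2.5. A real study would likely use many more subjects and a fitted curve rather than a straight-line interpolation.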
`Secondary measures of quality are more useful in the field
`and are usually developed after primary measurements have
`
Fig. 8. (a) Test signals used for K-rating measurement method. Signal A is a sine-squared pulse of half-amplitude duration 2T, where T equals the sampling period. Pulse B is a bar signal of width approximately half the duration of