`Tn: Em Sm manual!“
`Vol, 56, No. 3, March 1979
`Printed in 0.8.4.
`
`Motion-Commnsated Television Coding:'Psrt I
`
`By A. N. NETHAVALI and J. D. ROBBINS
`
`(Manuscript received August 29. 1918)
`
`We present methods ofestimating displacemenfis ofmoving ohfeets
`from oneframe tothenextinatelevision scemondusmgsuch
`displacements for fiame-to-fmme prediction. Displacement is esti-
`mated by a recursive algorithm which seeks to
`a functional
`of the prediction error. Semi simpiificatimzs of the algoritkm are
`presented which make it alto-active for hardware implementation.
`Performance of the algorithm is evaluated by computer simulations
`on two sequences of moving Wages containing various amounts and
`types of motion. In both cases, the use of displacement-based (or
`motion-compensated) prediction results in bit rates that are 22 to 50
`percent lower than those obtained by simple "fiame-difierence”prev
`diction, which is used commonly in the intevfi'mne coders.
`
`I. INTRODUCTION
`
`Television signals are generated by scanning a scene several times
`a second even though there may not be any change in the scene from
`one frame to the next. This results in a considerabie frame-to-frame
`
`redundancy in the signal. Existence of this redundancy has long been
`recogm'zed. and several measurements have been made to quantify it.
`However, the first real demonstration of a fi-ameato-frame code: which
`used redundancy between successive frames was made in 1969 by
`Mounts.I Since then, several improvements have been made to the
`basic fi‘ame-to-frome encoder resulting in prototypes or real imple-
`mentations of coder-decoder pairs.“ These are the subjects of two
`excellent surveys,” the first one covering material up to 1972 and the
`second one up to 1973. As is evident fi'om these works, most frame-to-
`frame coders are based on the following:
`(i) Seyncnting each television frame into two parts, one port which
`isthe ssmoasthaprcviousflamcandtheotherpm (calledthe moving
`area) which has changed from the previous frame.
`{ii} Transmitting two types of moving area information: (a) ad-
`dresses specifying the location of the picture elements in the moving
`
`631
`
`PMCAPL02444649
`
`PMC Exhibit 2037
`
`Apple v. PMC
`|PR2016-00755
`
`Page 1
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 1
`
`
`
`area, and (1)) information by which the inteneitiee- of the
`area
`elements on be updated.
`'
`-- -
`(an
`the coder bit rate to the cherinel rate. Since the.
`motionin a real televiaion scene occurs
`and in-bmttfithe
`amount of information about the movirq area will
`aa‘a finiction
`of time. To transmit it over a constant bit rate channel; (a) smooth
`out the transmitted huformetion rate
`'m a
`to
`transmission,- and {bl-use the bullet fullness to regulate
`hit rate by varying amplitude, spatial, and temporal resolution of the
`television signal.
`'
`Intensifies of the moving area
`BlmntB-am
`predictive
`which sends frame difference, element dilferenoe, or
`difference {or their combination) : as: the differential:
`' 311t-
`tempte have been made to optimize the-unafity'withintfie
`conehainteoftheebufl'er eize'andthe-channel'rate.
`.
`3
`Simdtaneouabr with theae implementations, computer simulations
`have been used to explore other finprovemente ofthe
`Ithaslmg'beenreoognized” than'il'aneefimateoftho
`mutilation of-en object is obtained, mowefficient predictive
`can be-perfomei-by-taking difierenoee of elements in- the
`mile with'reepect "to elements inithe
`aware-appro-
`printely
`tome]: schemes "quomoammtea
`Coding Schemes." Their succeeeitlepende-obvioiialy on mellow
`(i) the amount of purely translational motion." of objecfi in a real
`television scene,‘ ( ii) the ability of an algorithm to estimate the
`translation with an amacy that is desirable for good prediction of
`intensity, and (iii) robustness of the displacement estimationalgo-
`rithm when amplitude, spatial, and temporal resolution. of the trans-
`
`
`the
`
`a
`of
`:
`"Methods" of pemobyfpeint tier;
`"
`te ' "
`need inn-scene analyeie‘u'“_appear-to be
`rel'atidn9-or hatter-a
`too complex for present-day implementation, especially if-diepleoement
`needs to be defined with resolution finer thanflie sampledng eta
`television frame. Simpler displacement estimation techniqueam“ uti»
`line the relation between the spatial differential'end
`differ-
`
`
`
`
`ential signala'' ' ' ' Thea:I would be easier.
`implement. Another approach
`
`. itifidaptwem"
`prediction using'"
`the previone"_ " _
`field) Which:
`left;
`the-'coefli‘" dents
`up, down) by a certain“ maxim"
`
`to mom' "
`
`anihteneity error fimctionf‘fill'these techniq"' aaeum'' e
`
`"
`
`_
`
`"Sothatthetelevieitincameraaeeatheobiectain
`thatflieflghfingbemfihmmfliemfieidnfww
`
`In
`
`es: THE BELL $YSTEM-"TEm-INIGAL aouamnmaoe 119215
`
`
`
`4
`PMC3685157
`
`
`
`PMCAPL02444650
`
`PMC Exhibit 2037
`
`Apple v. PMC
`|PR2016-00755
`
`Page 2
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 2
`
`
`
`that the displacement is constant within a block of picture elements.
`This assumption presents difficulties in scenes with multiple moving
`objects. occluding objects, as Well as different parts of the same object
`moving with different displacements. Of course, decreasing the size of
`the block makes this assumption more realistic, but then the quality
`of displacement estimate suffers.
`The techniques proposed by HaskellI5 do not require an explicit
`estimate of translational displacement and are applicable even if the
`mason is" not
`However, they'aiso work on
`blocks of picture elements and require- a matrix inversion in addition
`to other complicated operations and, therefore, do not appear- to be
`easily implementable without a significant simplification.
`We present several new techniques for estimating motion. They
`attempt to minimize recursively a measure of the motion-compensated
`prediction error. Thus, given an ith estimate of displacement, we
`obtain the (i + ljth estimate such that, in general, the motion-
`compensated prediction error resulting from (i-I- llth estimate is lower
`than that using the ith estimate. The recursive minimization is per-
`formed by a gradient or stoopest descent algorithm. Considerable
`freedom exists in the specific choice of this algorithm Our choices
`have been guided primarily by a desire to implement this algorithm in
`real time.
`
`In Section II, we present a derivation of several motion-estimation
`algorithms and describe some simulations to evaluate their perform-
`ance. Section III contains modifications of one of the algorithm of
`Section II for use in a fi-ame-to-fi'ame coder. Here we evaluate the
`
`performance of the algorithm in the context of a fi‘sme-to-fiame coder.
`Section IV contains a discusflon and the conclusions of our study.
`Many enhancements of the algorithm are possible, and some of these --
`will be presented in a companion paper."5
`
`II. MOTION ESTINATiON
`
`In this section, we derive some simple algorithms for estimating
`motion. They attempt to
`recursiver a certain quantity
`(function of the motion estimation error). If the changes in successive
`television frames are due to translation of an object, then the algorithm
`iterates in a gradient or steepest descent direction such that the
`consecutive estimates converge to an estimate of translation. A proof
`of convergence of such a scheme under certain assumptions is given in
`the appendix and is supported by a large number of computer simu-
`lations presented at the end of this section.
`
`2.! Motion eaflmafloninahiock ofpals
`
`We mentioned in the introduction that most algorithms in the
`literature for estinmting tmmiation of an object from a television scene
`
`MOTION-COMPENSATED TV CODING—l
`
`633
`
`PMCAPL02444651
`
`PMC Exhibit 2037
`
`Apple v. PMC
`|PR2016-00755
`
`Page 3
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 3
`
`
`
`assume that the translation is constant a block of picture
`elements (pals). We start with one such aigorithmdeveioPed by Limb
`and l‘wiurgzihy13 and Csfforio and Rocco" and show how-it can be
`modifiedtoobtsinabetterestimateoftmmletion.Th’nisdooe
`primarily to define a quantity which is fimdsmental to our recursive
`algofithm intmducedinthenexteecfien
`a 2:1 interlaced raster format, let {(x, t — 1') and I(x; t} be
`the intensity-values of the two successive frames as efini'ction of
`spatial location a: (a tWo-dhnensionai vector) and: time:
`between the two-frame is r. If an object moves in translation. then-in.
`the moving area (disregarding the uncovered background):
`-
`
`Itmti-Ik-DJ-‘ri,
`
`(1}
`
`the time-interval
`where D is the txmslation vector of the object
`[In 1-, t]. The frame difference atspatisl positionx'mgiven by Z __
`
`FDIF(x} - fix, t) -' rec. t - 1r)
`
`- fix, an - fix + I). t).
`
`(2)
`
`which can be written. for small I), by Taylor’s expansion about: as
`EWHfl=-D“Hmn+HmWOMaflm&mD;.m
`
`where V is the gradient with respect to x and superscript T on a vector
`or matrix denotes its transpose. If the translation of the object is
`constant over the entire moving area (except for the uncovered hitch:£
`ground] and if'the higher order terms in D can be neglected, then
`sides of the above'eq'unt'ion can'be' suimnied over the 'entireltiofin’g
`area to obtain a-g'ood estimate for translation. We recognize that: V!
`can be Mean to be a vector of element and'line differences ("EDIE
`LDIF) if the'inteneity is available on a discrete grid as its themes in
`most coding situations. Using linear regression, we get 13, an estimate
`of D as:
`
`o-—[z
`
`VI“: “'VIU, 1),]
`
`-1
`
`[ E
`
`I
`
`I
`
`'
`
`which can be written as
`
`D
`
`[2M%Fu>
`
`E EDlei-Lfllmx) 2 Lame“,
`
`gammedamu14
`[ifimmeemmkj
`
`gammnimwm
`
`534 THE'BELL SYSTEM TECHNICAL JUURNAL. MARCH 19T9
`
`-
`
`.mas.EEEEE'EESEE?!orifice;EL:2;:5!?2':2!?!L?LLL‘::.E::::"L£YJ:: -.-_6.:
`
`'2:'.:--::‘:‘ :°':‘."-- <‘"<."W' 6 ‘6 6“ 6 6 6 6
`
`6
`
`6 6"6
`
`u v «6
`
`PMC368515
`
`PMCAPL02444652
`
`PMC Exhibit 2037
`
`Apple v. PMC
`|PR2016-00755
`
`Page 4
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 4
`
`
`
`where all the summations are over the entire moving area. This can be
`approximated“ by assuming that
`
`2 EDIF (x)-LDIF (x) - 0
`moving
`
`and then
`
`2 FDIF {xi-EDIF (ill/Z EDIF'aix)
`
`1‘) as —
`
`.
`
`(4)
`
`E FDIF (x) -LDIF (21/E LDIle‘K)
`
`This can be further approximated‘ by avoiding the multiplications in
`the sums as:
`
`E FDIF (xlsignlEDIF (xi)
`
`2; |EDIF(x)|
`
`D - _
`
`,
`
`(5)
`
`2 FDIF (x)sign(LDIF (xi)
`
`2 |LDIF (x) |
`
`where
`
`sigma) =
`
`0.
`
`if
`
`z = 0
`
`.
`z
`—: otherwise.
`lzl
`
`(6)
`
`This algorithm is identical to one of the algorithms given by Limb and
`Murphy.” Its accuracy may be improved? by several modifications
`suggested by Limb and llrliurphy‘3 and Caffofio and Rocco.“
`We suggest another modification which improves the above motion
`estimator finisher. First, we note that” the above estimates are good as
`long as D is small. As D increases, the quality of the approximation
`becomes poor. This can be overcome to some extent by linearizing the
`intensity function around an
`estimate of D. This is possible in
`a television situation, where there is an estimate of D for every field.
`Thus, for the with field, displacement estimate 0‘ can be obtained by
`linearizing the intensity function around the displacement estimate for
`the (i -- 11th field. This process results in the following recursion:
`
`D‘ = DH + U‘,
`
`(7)
`
`' We make no attempts to justify these two approximations. They are made merely
`to derive the algorithm given in Ref. 13. Perhaps, iimimld of the least squares, some
`other criterion may lead to the sumo result (i.a., eq. (5}) with fewer assumptions
`1' The improvement mooted by Limb (Ref. 13) was simulated for comparison
`puzp. However. for the type of pictures used in Section 3.3. the performance did not
`change noticeably by incorporating the improvement.
`
`MOTION-COMPENSATED TV CODINGu-l 635
`
`" “PMC3635160
`
`PMCAPL02444653
`
`PMC Exhibit 2037
`
`Apple v. PMC
`|PR2016-00755
`
`Page 5
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 5
`
`
`
`estimate of D‘ and U‘ is the update of D‘" to
`where D“‘ is an
`make it more accurate, i.e., an estimate of D — D‘“.
`We now define the quantity DFD (1:, DH ), called the displaced frame
`difference. which is analogous to FDIF [1) used in (4} and (5).
`
`nmcx, 15H) = m, a) — n: -D*'-', t — 1')
`
`(s)-
`
`nan is defined in terms of two quantities: (i) the spatial location I at
`which it is evaluated and (it) the displacement D“ with which it is
`evaluated. Obviously, in the case of a two-dimensional grid of discrete
`samples, an interpolation process would be used to evaluate 1(1 --
`D"‘, t — 1') for nonintegral values of D”. As defined, on) has the
`property of converging to zero as D‘ converges to the actual displace-
`ment, D, of the image. Following the same steps as were used inthe
`derivation of eq. (5), we get:
`'
`
`nrmx, DH) - I-(x, c) — Ia: +1) —o-‘-1, a)
`=' '-(D - BMW-V161. t) + Higher Order Terms.
`
`(9)
`
`approximations similar to
`Neglecting higher order terms and
`the above results in an estimate of D — D"1 which, when combined
`with eq. (7). yields:
`
`D‘s-DH --
`
`2 DFD(I, Di'lisigIflEDIF (x))
`
`E IEDIF (x)!
`-
`_
`E Dmtx. D"‘)sisn"(LDIF (xi)
`
`2 ILDIthH
`
`.
`
`(10)
`
`where the summations are again carried over the entire moving. area.
`This may be simplified slightly by noting that if- the
`estimate
`of displacement DH has only integral components; than DFD( - , -) can
`be computed without interpolation. Let [D] denote an integerapprox-
`imation- to D. This can he obtained either by truncating or rounding
`both the components of the 1vector I). -
`
`D‘ -' [DH] -
`
`)3 me. [D‘"']_>-'sign(EDIF (x);
`E I'EDIF=(x) |
`_
`2 mm. [15"11}vsisn(LDIF (x)) -
`
`-
`
`.
`
`(11)
`
`E ILDIF (xil
`We describe the performance of both the above algorithms later in
`this section.
`
`2.2 Warsaw esmnab‘on of matron
`We mentioned in the introduction that there is an advantage in
`recursive algorithms which iterate on a pel—by—pel (or on a small block
`
`836 THE BELL SYSTEM TECHNICAL JOURNAL, MARCH 1979
`
`PMCAPL02444654
`
`PMC Exhibit 2037
`
`Apple v. PMC
`|PR2016-00755
`
`Page 6
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 6
`
`
`
`of pole} basis, i.e.. they revise their displacement estimate at every
`moving area pol. Such recursive algorithms overcome. to a large extent,
`the problems of multiple moving objects, as well as different parts of
`an object undergoing different displacements, provided the recursion
`has sufficiently rapid convergence. Since we intend to use the displace-
`ment estimator for predictive coding, our algorithm should in same
`manner attempt to minimize the resulting prediction error. Also, since
`the prediction error is calculated for transmission anyway, its use in
`the recursive estimation of displacement does not result in extra
`computations and is therefore advantageous Thus, if a pet at location
`3.. is predicted with displacement Di" and intensity Ifx, - D“, t -
`o, resulting in prediction error spam, I)“ }, the estimator should try
`to produce a new estimate. 9", such that i oro(x., D‘ 1| as I emu“.
`DH”. To this end, we attempt to recursively minimize [limbo 13)]2 at
`each moving area element using a gradient type of approach. For
`example,
`
`9‘ -= 0‘" - (E/ZW’nEDFD‘xmni‘ll]!
`
`- I)“ - comm... D‘”')Vnorn (xi. D“I ).
`
`where V0 is the gradient with respect to displacement D and e is a
`positive sealer constant. The gradient VD may be evaluated using the
`definition of BF!) and noting that
`
`Voivwixa. DH” =- + Vlixa - D“. t - r}.
`
`(12}
`
`where V is the gradient with respect to x. This gives us
`
`13" a I)“ — comm, D“‘)VI{:.. — D H, r — r),
`
`(13)
`
`where-nrnande'areevaluated by interpolationfor'eonintegral DH;
`A significant reduction in computation of VI is achieved by quantizing
`DH to an integral value. Thus, if [DH] :19th a rounded or
`truncated value of each of the components of D‘- '. then the estimator
`of eq. (13) can be simplified to
`
`' " "
`
`0" =- DH — corms... D‘-')VI(x, - [13”]. t - r).
`
`(14)
`
`It should be pointed out timt VD could have been evaluated using (9}.
`resulting in an estimator in which VI is evaluated at (xa, t) instead of
`(so - DH, 1! — 1-) as above. This second method implies an assumption
`regardingthe expansionofI(-.-), which maynotbevalid ifD -D""
`is large. Also, there is no difference in the computational complexity if
`it is assumed that a linear interpolation of Ex. t — r) is used to
`compute DB1), and the resulting displaced line and element. differences
`are used to define VI of eq, (14).
`It is interesting to observe that at every iteration we add to our old
`estimate a vector quantity parallel to the direction of the spatial
`gradient of image intensity and whose magnitude is proportional to
`
`MOTlON-COMPENSATED TV CODING—l 637
`
`“'p‘mcs'é‘gasz
`
`PMCAPL02444655
`
`PMC Exhibit 2037
`
`Apple v. PMC
`|PR2016-00755
`
`Page 7
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 7
`
`
`
`It may be seen flaming-{9):
`th's motiomtdfllpensatéd predictihn'
`the
`(D —' DH) is orthogonalto
`gradient VA the. displaced frame difi‘srence hm (-- ,- -} is--zero.:-giVing-a
`zero update for recursion of eq. (14). This may-happen even thaugh'
`the object may have actually moved. However, this is not a failure of
`the algorithm, but rather is identical to the simstion in which an
`intensity ramp is txanslatsd and only motion parallel to the ramp
`(VI) ispm'ceived.Motionperpendiculsrtothe-ramp
`is'unobservsble, and as such is arbitrary. It is only -tlmuglii=”the
`occurreiics' ot-‘fiedgm'with
`"Orientationsgini real
`thsthonwrgencsoffi‘itosctualB-is-poseihle.- -
`-
`2.
`The-quantities involved in the alumsE
`An
`estimate of the displacement at pal 1:“, D‘“, is to be
`(14), yielding 9". Using the
`estimate DH “and 3., the
`samples in the previous frame in the neighborhood of spatial
`Ia — D“ are located (for example: samples b, c, d. e, and f). The
`
`.
`.2 :. -
`1:
`
`-.
`
`9 1 lair-3'” Lt—fi
`
`
`
`FlELDi—a
`
`1”
`
`x
`A
`film—13'” .t—rl
`
`11
`
`FIELD}? IL» — —
`f
`I
`I
`I
`
`/
`if.
`
`x
`
`-
`PRESENT /
`
`
`
`I
`
`Hm}
`
`Fig. IMRecursivemotionestimation. Displacement estimate IT” is updated at hsl s.
`Gradient «mm. 971:. - m4}. c— 1"}. isohteinedby st'pels h. c.
`d,e.fmthefieldl1u2}.
`--
`=-
`
`ms- "BELL SYSTEM TEWMGAL-UOURNAL. MARCH 1979
`
`"
`
`-1--'- L-AkliieSEHbL-Iflixwuwuyu-umu-u\Hv-l “u...-
`
`.-
`
`PMC Exhibit 2037
`
`Apple v. PMC
`PMCAPLOZ 4 4 4 6 5 6 'PR2016-00755
`Page 8
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 8
`
`
`
`samples of this neighborhood are then used to compute the update
`term in conjunction with line, t). Thus, in Fig. 1, the components of
`intensity gradient may be approximated by
`
`EDIF - (I, — IJ/2
`
`LDIF - (Is - Irlf2.
`
`(15)
`
`(16)
`
`Similarly, Hm. -- D“, t — 1-) maltr be approximated by a two»
`dimensional linear interpolation using the intensities of the neighbor-
`hood. In the next section, we present several simplifications of this
`basic algorithm and adapt it for frame-to-frame coding. Recursive
`algorithms, using a model of the video process for motion estimation
`in a different context, are dmcribed in Ref. 17.
`
`2.3 notion «reactor perfomam
`
`estimatols, algorithm I (sq. (5)) and algorithm II (eq. (11)). Obviously,
`the performance depends upon the type of scene and, even then, it is
`not clear how to measure the performance. We have need two types of
`scenes. The first is produced synthetically using the following formula:
`
`Ion. t) = [
`
`12?
`
`if “an > 100
`
`127 (1 + e'-°""cos(0.2.w. “RI”, otherwise.
`
`(17)
`
`where R = x - (1., + Dr). 127 is the background intensity (on a scale
`of 0 to 255). x0 is the location of the center of the moving pattern at
`t = 0, D is the displacement of the pattern per frame (i.e., velocity).
`and I- | denotes the Euclidean norm. This formula produces a series of
`alternating light and dark concentric rings with exponentially decreas-
`ing radial intensity variation. 'I‘his pattern wu chosen because it
`contains a distribution of edges with different direction and height.
`le, t) wassampled bothin time tend spacexto produce asetoffom'
`frame sequences with strictly horizontal translation room from 9.5
`to 6.0 pols per frame.
`The second type of scene is also a collection of four frame sequences
`containing an object approxhnataly in horizontal translation. These
`are the same as shown in Fig. 1b of Ref. 15. They contain a mannequin’s
`head which was moved at various nominal Speeds fi-om 0.4 to 4.7 pols
`per frame.
`The first type of scene was chosen so that an exact velocity was
`known and, mcrcfore. the perfume of motion estimators could be
`evaluated rather easily by measuring the deviation from this known
`velocity. A second measure of perfonnence was obtained by computing
`the “match entropy” of the elements in the moving area of each field.
`
`MOTION-COMPENSATED TV CODING-ml 339
`
`"""" “PMC3685164
`
`PMCAPL02444657
`
`PMC Exhibit 2037
`
`Apple v. PMC
`|PR2016-00755
`
`Page 9
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 9
`
`
`
`That E, using the displacement estimate 13‘, obtained fi-om- two con-
`secutive fields, the entropy of 13mm, 0‘) was computed-using a linear
`interpolation over the moving area elements of frame
`Itx, t). This quantity is similar to the entropy ofthe prediction error;
`however, some future information is used in its calculation. Due to the
`presence of shadows and nonuniform illumination, the second type of
`scene does not possess s precisely defined velocity, and therefore
`metalluer was
`to. mute the motion.
`type of seem.
`'
`'
`_
`'
`i
`. In "all the simulations of the results of algorithms
`II of
`section and the next section, the moving ares elements were deter-
`by an algorithm
`to that descn‘hed'in Ref. 13.1:ikewise.
`the definition Of ED135111 and LDIF(1)' used in the simulatiore cf
`algorithm I may be found in Ref. 13. For EDIF (x), this involved
`averagingofthe element diflerences at I(x, t) and HI, t - r). LDIF{X)
`was computed in a. similar manner. To "extend this definition for
`in
`II.'the
`difierence‘s st fix, .1) and I(x .—4[D"‘],-
`t -— 1'] were averaged.
`:
`The performance of algorim l and II is given in Fige 23 and at:
`for the synthetic Scenes. As seen in Fig. 25. the ester in the estimated
`velocity‘ using algorithm II is considerably smaller than that of
`algorithm 1. especially at higher velocitiee The peak ohseaved in the
`estimates of nlgoritlnn I is perhaps due to the insensitivity of the
`algorithm to a per-frame shift equal to. the period {10.0 units) of the
`synthetic moving image. The initial estimate of displacement for
`algorithm [1 was assumed” to be 0.- It is" interesting to note that" the
`curve of algorithm I is approximatelythe some as that of
`II
`after the first iteration; but after only three iterations, "algorithm '11
`converges to its curve in Fig. 23. In
`2b, the match eotropies
`resulting from the estimates of Fig. 2e are shown. Again, the superiority
`of algorithm 11 over algorithm I is clearly seen. Also for comparison,
`we'have included the entropies of FHIF (x) of picture elements” in the
`moving me. As expected, both algorithms 1 and 11 show
`improvementscompared'to frame differences. These conclusions: for
`synthetic pictures remained generally unchanged when the direction
`of motion was changed and when random frame-to-fienie noise of- —40
`dB (sin ratio) was added to the scenes.
`- Performance of algorithms I and I! for the second type of sceneis
`given in Fig. 3. Here, since we do not have aimith an exact velocity;
`we only give the match entropy at the various nominal
`velocities used to create the scenes. While both algorithms result in
`match .ent-mpies considerably smaller than the flame
`as:
`
`‘Aversgeoflsstfmuoutofslxestimstes.
`
`640 THE BELL SYSTEM TECHHCAL JOURNAL, MARCH 1979
`
`PMCAPL02444658
`
`PMC Exhibit 2037
`
`Apple v. PMC
`|PR2016-00755
`
`Page 10
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 10
`
`
`
`ALGOHITHM [I
`~._WI1’H AND
`
`mim___L____1
`a
`5
`4
`1
`2
`3
`o
`ACTUAL HORIZONTAL VELICFCITY 1N PEtS PER FRAME
`
`ALGORITHM I
`wrrnomr NGFSE—— f
`wan NOISE—~-
`
` WIFE-IOU? NOISE ESTIMATEDHORIZONTALVELOCITYCOMPONENTINPELSPERFRAME
`
`
`
`AREAF'EL
`
`
`MATCHENTHOFHYINBITSPEI!MOVING
`
`ACTUAI. HORIZUNTRI. U? LUCI‘I V IN PELS FER FRAME
`
`locement esti-
`Fig. BuPerformam of two motion estimators which obtain one
`improvement
`mate per field on e
`thetic pictures. Algorithm 11 results in outsider
`over algorithm 1.
`entropy indicates that, in addition to a perfect estimate. there
`was no interpolation error in evaluating the prediction error. Estimated vertical velocity
`components (in lines/frame] were i 0.3 for algorithm 1 and a: 0.02 for algorithm II.
`
`'
`
`tropy, we see that the performance of the two appears about the same.
`This result is perhaps due to the averaging of the displacement
`components over a large block 6.9.. the moving area of an entire field).
`It should be mentioned that the perfonnence of both the algorithms
`
`MOTION-COMPENSATED TV CODlNG—wl
`
`641
`
`PMC3685166
`
`PMC Exhibit 2037
`
`Apple v. PMC
`PMCAPLOZ 4 4 4 6 5 9 'PR2016-00755
`Page 11
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 11
`
`
`
`FRAME DIFFERENCE
`
`ALGORITHH II x ‘
`
`
`
` MATCHEMTflDPYINBITSPERMOVINGAREAPEL
`
`
`
`1
`
`a
`
`U
`
`‘-ALGOHETHMI
`
`WWWL____1___
`d
`5
`3
`3
`l
`NOMINAL HORiZONTAL VELDCSTY IN FELS PER FRAME
`
`6
`
`Fig. Swim-format- of two motion estimators which obtain one displacement estiv
`mate per field on moving mannequin.
`
`can be improved by separating the uncovered background, as suggested
`by Cafforio and Rocco."
`
`Ill. CODER PERFORMANCE USING RECURSIVE DISPLAGEEHT
`ESTIHATION
`
`In the previous section, we developed motion estimators and dis-
`cussed the performance of estimators which obtain one velocity esti-
`mate per field. We used scenes which had only one object1 moving with
`a nearly uniform translational displacement. However, such scenesare
`unrealistic and, filerefore, estimators which can dynamically adjust to
`motion of objects are more desirable. In this section. we first descfibe
`our recursive estimator in more detail and then evaluate its perform-
`once on scenes that are much more realistic. We then modify our basic
`estimator so that it can be incorporated in a fiameto-frame encoder
`and evaluate the coder performance, We note that we have paid little
`attention to a very important facet offrmne-to~frame coders: resolution
`controi using the contents of the bufi'er. It is important that motion
`estimation does not suffer immensely in lower resolution modes of the
`coder. Although some of these issues will be considered in part 11,.“
`more realistic performance evaluation can only be done using a
`-
`ware coder wcrkmg on real scenes.
`The first scene, called Judy. consists of 64 frames (2:1 interlaced
`fields) of 256 X .256 samples each, obtained at 30
`a second and
`sampled at Nyquist rate from a video signal-of 1 MHz bandwidth, and
`contains Ironwood—shoulders view of a person ' engaged in a rather
`activo'oonversation. The portion of a frame classified as moving area
`
`642 THE BELL SYSTEM TECHNO”. JOUHNAL. MARCH 1979
`
`PMCAPL02444660
`
`PMC Exhibit 2037
`
`Apple v. PMC
`|PR2016-00755
`
`Page 12
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 12
`
`
`
`varies from 15 to 51 percent. Also, the motion is not translational, and
`different parts of the scene move differently (such as lips. eyes. and
`head). Four frames of this scene are shown in Fig. 4.
`The second scene, called Mike and Nadine, consists of 64 frames
`with the some resolution as the scene Judy and contains a panned full
`body view of two people briskly Walking around each other on a set
`with severe nonuniform illumjrmtion. The percentage of a frame clas-
`sified asmoving areavariedi‘romBZtoBGpercent. Fmfi‘amesfrom
`this sequence are shown in Fig. 5.
`
`3. 1 Basic estimator perfonnance
`
`The recursive estimator consists of the following: the.
`merit estimate is updated at each moving area picture element using
`eq. (14), i.e.,
`
`D‘ - 15“" - inner-om. D‘“‘)-VI[x,= — (D“'], t — 1'),
`1024
`
`‘
`
`{18)
`
`where 1')" is the estimated displacement of moving area pa] in, 13'“ is
`the last estimate formed prior to pol 1:; during the peleby~pel, line-hy—
`liue (interlaced raster scan order) iteration “trough the moving area.
`VI is the spatial gradient of the intensity, approximated by EDIF and
`LDIFdefined in (15) and (16), and [DH] denotes rounded D“. A two-
`dimensional linear interpolation is used to evaluate mm (interpolation
`is discussed in detail in Section 3.3.3). Instead of using the previous
`home for the intensity {(x, t -— 1}. we have used the previous field.
`Relative advantages of this choice are discussed later. Also, both the
`horizontal and vertical components in the displacement error estimate
`tie. the second term of the rightwth side of eq. (18)) are clipped at
`a magnitude of (9323), so that the displacement from pol to pal is not
`allowed to change by more than his pol/field and We lineffield.
`avoids the possibility of rapid oscillations in displacement due to noise.
`The accuracy used in computation of displacement and interpolation
`is Me: pel (or line) per field.
`In all the simulations of this section, with the exception of the results
`of algorithm I appearing in Fig. 7, the moving area picture elements
`were determined by the following rule: pa] 2 (Fig. 6) is classified as a
`moving area pol if either
`
`(i}|FDIF(z)|> T2
`or
`
`(iillFDIFIzll :- T:
`and
`
`|FDIF|ata, b, c, ord > T;
`and
`
`IFDIFIat A, B, 0,01- D > T1,
`
`MOTION-COMPENSATED TV CODING—l B43
`
`PMC3685168
`
`PMCAPL02444661
`
`PMC Exhibit 2037
`
`Apple v. PMC
`|PR2016-00755
`
`Page 13
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 13
`
`
`
`Judy.
`
`
`
`
`
`
`
`Fng-b—PourEmmaofthescone
`
`at” THE HELL SYSTEM TECFfNiCAL JOURNAL. MARCH 1979_:_
`
`PMC Exhibit 2037
`
`Apple v. PMC
`PMCAPLOZ 4 4 4 6 62 'PR2016-00755
`Page 14
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 14
`
`
`
`framesofthesceneMikeandNadine.
`
`
`
`Fig.fi—Faur
`
`MOTiofl-COMPENSATED TV CODING"! 645
`
`PMC3655170
`
`PMC Exhibit 2037
`
`Apple v. PMC
`PMCAPLOZ 4 4 4 6 63 'PR2016-00755
`Page 15
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 15
`
`
`
`I
`
`.
`
`
`
`_..__._..__...___...._~__...___
`B
`,r
`I _____.__x______
`
`f
`
`Piliewous
`LNES
`
`f
`f {—7—“—""—'———'-——-——-—-~
`PRESENT
`renounced,”
`Lmesn —*—X——++"K“’—Lme
`\x
`a
`h
`z
`I:
`i
`\ m_._._......___.~__..___......
`\
`
`...
`
`C
`
`_.
`
`
`
`Fig. Wefiynflion of pole used in the moving area. segmentor.
`
`siegh'ienter is quite similar 'to-the'ona in Refs. 4 and 15. It overcomes.
`molar“ extent,
`mine-Whine noise which
`produce a large number of isolated moving area elements. "Pots-the -
`basic estimator, the moving area was segmented with
`Tl:
`I"
`=4andT2a=255(onaecaleofflt0255).
`The perfonnance of this basic estimator is given in Fig. 7 for the
`sequence Judy. Also for comparison, we have included the entropiee of
`the moving area frame differences and the match entropy of algorithm
`I 3301' Section II. On the average, match. entropy
`:Ithe
`e'etimator. was lower by about 1.4hits!
`
`- --difi‘erencea. while algorithm 1 only'reeiflted ' -
`half of this decrease"
`'
`
`tely’
`
`'
`
`7.
`
`'
`
`_
`-
`-
`3.2 8ammm _
`__
`_ In order to incorporate the estimator as apart of a coder", several
`.-
`other. choices have to be made. In this section. we describe a basic .
`coder and its performance. The next soothe-contains modifieetiofih;
`' end- simplifications of the basic coder and then" '- efi'ects on the
`The'baéic: coder consists of the following:
`
`_
`
`__
`3:2:1- Diapleee’meot- e‘eumator-.
`in foot
`In the-predictive coder, since the dieplacement
`transmitted to the receiver, it. has to be derived from the 5
`
`encoded and transmitted data. We derive a 'dxsplac‘' ement-
`e previOuely transmitted pa] and use it for computation of the predic-
`
`.
`
`:
`
`" Results in Fig. 7 for algorithm I were obtained by using a slightly diflerent moving
`area memento: given in Ref. 13.
`
`THE BELL sve'rEM ' TECHNICAL JOURNAL. MARCH 1979
`
`u u 1-» L-umuxuwuu‘ANWmnm \.
`
`..e
`
`re.
`
`W “' fiMEEEééi i1"
`
`PMC Exhibit 2037
`
`Apple v. PMC
`PMCAPLOZ 4 4 4 6 64 'PR2016-00755
`Page 16
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 16
`
`
`
`gA g
`
`g
`
`m.
`
`3"
`
`.-: E
`
`E?
`'3 3%
`
`F
`
`:
`i
`
`..
`
`3
`
`2%
`_§§
`5%
`. E:
`-
`
`4
`
`E
`22
`
`E E
`
` _
`
`T
`35
`5*?
`
`g
`E
`§
`g
`S
`é
`
`:
`‘
`
`§
`E
`E
`E
`
`m
`
`1:!VSHVQJADWHad2113NI ANZlNQ HJLJVI
`
`a
`
`MOTION-GOMPENSATED TV CODING—l
`
`64?
`
`§
`
`PMC Exhibit 2037
`
`Apple v. PMC
`PMCAPLOZ 4 4 4 6 65 'PR2016-00755
`Page 17
`
`PMC Exhibit 2037
`Apple v. PMC
`IPR2016-00755
`Page 17
`
`
`
`tion of the present pol. The two natural choices for the relative location
`of the displacement estimate are the pol above and the pol to the left
`of the predicted pal. We have chosen the previous line (is, the pol
`aboveinthesamefield),ainceitrelieveshardwaretimingeonatraints
`that would have otherwise resulted. If the previous line pol is not a
`moving maps]. weusethehfitialdisplacememestimate-flzatwwld
`have been" used by the above pel if it were roofing. such use of
`displacement estimates for prediction is shown in Pig. 8. We see that
`several elements (e.g., elements j + 1. j + 2, -- - , j + 4 of the present
`line} may he predicted with the same displacement estimate if the
`corresponding elements of the previous line are not moving area
`elements. Other details of the estimator remain the same as those
`given in'Section 3.1 for the basic estimator.
`:
`'
`
`3.2