THE BELL SYSTEM TECHNICAL JOURNAL
Vol. 58, No. 7, September 1979
Printed in U.S.A.
`
Motion-Compensated Transform Coding
`
`By A. N. NETRAVALI and J. A. STULLER
`
(Manuscript received April 10, 1979)
`
Interframe hybrid transform/DPCM coders encode television signals
by taking a spatial transform of a block of picture elements in a
frame and predictively coding the resulting coefficients using the
corresponding coefficients of the spatial block at the same location in
the previous frame. These coders can be made more efficient for
scenes containing objects in translational motion by first estimating
the translational displacement of objects and then using coefficients
of a spatially displaced block in the previous frame for prediction.
This paper presents simulation results for such motion-compensated
transform coders using two algorithms for estimating displacements.
The first algorithm, which is developed in a companion paper,
recursively estimates the displacements from the previously transmitted
transform coefficients, thereby eliminating the need to transmit the
displacement estimates. The second algorithm, due to Limb and
Murphy, estimates displacements by taking ratios of accumulated
frame difference and spatial difference signals in a block. In this
scheme, the displacement estimates are transmitted to the receiver.
Computer simulations on two typical real-life sequences of frames
show that motion-compensated coefficient prediction results in coder
bit rates that are 20 to 40 percent lower than conventional interframe
transform coders using "frame difference of coefficients."
Comparisons of bit rates for approximately the same picture quality show that
the two methods of displacement estimation are quite similar in
performance with a slight preference for the scheme with recursive
displacement estimation.
`
I. INTRODUCTION
`
Television signals, which are generated by scanning a scene 30 times
a second, contain a significant amount of frame-to-frame redundancy.
A large part of this redundancy can be removed by the technique of
conditional replenishment. In conditional replenishment, each frame
`
`PMC Exhibit 2038
`Apple v. PMC
`IPR2016-00755
`Page 1
`
`
`
`
is segmented into two parts: background, which consists of
elements (pels) having intensities similar to the previous frame pels,
and moving area, which consists of pels that differ significantly from
the previous frame pels. Information is transmitted only about the
moving area in the form of prediction errors and addresses of the
moving area pels. Conditional replenishment schemes can be improved
by estimating the displacement of objects in the scene and using the
displacement estimate to predict elements in the moving area with
respect to appropriately displaced elements in the previous frame. Such
schemes have been referred to as motion-compensated coding schemes.
Transform domain methods have been widely discussed for bandwidth
compression of still images or single frames. They can be
used for coding of sequences of television frames by taking a
two-dimensional spatial transform followed by predictive coding using the
corresponding coefficients from the spatial transform of the
previous frame. This
type of hybrid coding relieves the requirements
associated with the use of three-dimensional transform blocks. Such a
scheme can be made more efficient for scenes containing objects in
motion by using, for prediction, coefficients of blocks from the previous
frame that are spatially displaced from the present frame block by an
amount equal to the displacement of objects. As in the pel domain, the
success of motion compensation in transform coders depends on (i)
the amount of purely translational motion of objects in the scene, (ii)
the ability of the displacement estimation algorithm to estimate
translation with an accuracy necessary for good prediction of the
coefficients, and (iii) the robustness of the displacement estimation
algorithm when the resolution of the transmitted picture is changed to
match the coder bit rate to the channel rate.
In this paper, we use two previously published displacement
estimation algorithms for motion-compensated transform coding. The first
algorithm is an extension of a corresponding method in the pel
domain. It works recursively on the previously transmitted transform
coefficients of the present as well as the previous frame. It therefore
requires no separate transmission of the displacement estimates. This
algorithm is discussed in detail in a companion paper, where its
properties are studied on
certain simple synthetically generated scenes. The other method of
displacement estimation that we use is due to Limb and Murphy. It
estimates displacements in a block of pels using a ratio of accumulated
frame difference and spatial difference signals from future as well
as past data. These displacement estimates are nonrecursive and must
be transmitted separately to the receiver. The present paper
investigates the performance of the two displacement estimation algorithms
`
`
`
`
in the context of interframe coders operating on real-life scenes that
contain fairly complex (nontranslational) motion. Results are given
here on the effects of various coder parameters such as block size,
particular transform (Hadamard, cosine, etc.), and other parameters
of the displacement estimators. The primary result of this paper is that
the application of either recursive or nonrecursive motion estimation
provides a 20 to 40 percent decrease in bit rate, compared to
conventional, uncompensated hybrid transform/DPCM coding. We have found
that the use of large block sizes in motion estimation degrades the
coder performance. This may be a result of spatially nonuniform
displacements being averaged over the transform block by the
displacement estimator. Also, since the motion in real scenes is generally not
uniform in rectangular blocks, as the block size is increased, only a
fraction of elements in a block are compensable with a given
displacement, and therefore transmitting coefficients of a larger block
containing some compensable and some uncompensable pels becomes
inefficient.
`
II. HYBRID TRANSFORM CODING WITHOUT MOTION COMPENSATION
`
In an interframe hybrid transform-DPCM coder, a field of video is
partitioned into blocks having dimensions $N_r$ rows by $N_c$ columns, and
a two-dimensional transform is performed on each block to obtain a
set of coefficients. Transform coefficients of the qth block of the
present frame are predicted by the corresponding coefficients of the
qth block of the previously encoded frame, and, if the prediction error
is above a specified threshold, the quantized prediction errors are
transmitted to the receiver. These quantized errors are added to the
coefficients predicted by the receiver, which inverse transforms the
result to obtain an image for display at the receiver. A block diagram
of an interframe hybrid transform-DPCM transmitter is shown in
Fig. 1. Data compression is achieved both by the redundancy removal
implicit in the prediction process and because some coefficients can be
reproduced with low precision (or totally omitted from transmission)
without visible degradation in the reconstructed picture.
The performance of the interframe hybrid transform-DPCM coder
and the other coders described in later sections of this paper is
evaluated in terms of bit rate for an acceptable subjective picture
quality using two scenes, one called Judy and one Mike and Nadine.
The coding degradation was judged in informal tests by the authors to
be just perceptible from a viewing distance of six times the picture
height. These scenes consist of 64 frames (2:1 interlaced fields) of 256
x 256 samples each, obtained at 30 times a second and sampled at
Nyquist rate from a video signal of 1-MHz bandwidth. The scene Judy
contains head-and-shoulders views of a person engaged in a rather
`
`
`
[Fig. 1. Block diagram of an interframe hybrid transform-DPCM transmitter.]
`
`
`
`
`
`
`
`
active conversation. The portion of a frame classified as moving area
varies from 15 to 51 percent. The motion is not strictly translational,
and there are different parts of the scene moving differently (such as
lips, eyes, and head). Four frames of this scene are shown in Fig. 4 of
Ref. 10. The scene Mike and Nadine contains a panned full-body view
of two people briskly walking around each other on a set with severe
nonuniform and time-varying illumination. The percentage of a frame
classified as moving area varied from 92 to 96 percent. Four frames
from this sequence are shown in Fig. 5 of Ref. 10.
In our simulations of the interframe hybrid transform-DPCM (called
conditional replenishment in the transform domain), the coefficients
of two corresponding spatial blocks of the same field from two
successive frames are compared, and if the difference is more than a
threshold, the coefficient is transmitted. Thus, if
$\{c_k\}_{k=0,\ldots,M-1}$ and $\{\hat{c}_k\}_{k=0,\ldots,M-1}$ are M
selected coefficients (out of N coefficients in a
block) of the present and coded previous frame blocks, respectively,
then the quantized error, $Q_k[c_k - \hat{c}_k]$, is transmitted only if
$|c_k - \hat{c}_k| \ge T_k$, where $Q_k[\cdot]$ is the quantizer for the
kth coefficient, and $T_k$ is the threshold. If $c_k$ is not transmitted,
then its value at the receiver is
assumed to be $\hat{c}_k$. Thus the transmission consists of the quantized
prediction error of the coefficients that were selected for transmission
and the addresses of the coefficients that were dropped from the
transmission. The information necessary to convey addresses of the
coefficients selected for transmission was computed based on the
run-length coding of runs of coefficients within a block and then from block
to block. Parameters of the coder such as the number of coefficients
that were entirely dropped from the transmission, the thresholds $\{T_k\}$
for selecting the transmitted coefficients, and the quantizer scales were
adjusted* to produce pictures in which coding degradations were just
perceptible. The entropies of the prediction errors and the run lengths
specifying addresses of the transmitted coefficients are added to
compute the total bit rate.
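
As a rough illustration of the selection rule described above, the following Python sketch applies the threshold test and a uniform quantizer to one block of coefficients. The step sizes and thresholds are the values reported later in this section for the 2 x 4 cosine transform; the function names, the rounding rule of the quantizer, and the sample coefficient values are assumptions made for this example and are not taken from the simulations.

```python
# Step sizes and predictability thresholds quoted in this section for the
# seven retained coefficients of the 2 x 4 cosine transform.
STEPS = [3, 5, 5, 7, 7, 9, 9]
THRESHOLDS = [1, 2, 2, 3, 3, 4, 4]


def quantize(err, step):
    """Uniform quantizer: nearest multiple of step (rounding rule is assumed)."""
    levels = int(abs(err) / step + 0.5)
    return step * levels if err >= 0 else -step * levels


def replenish_block(c, c_prev, steps=STEPS, thresholds=THRESHOLDS):
    """Conditional replenishment of one block in the transform domain.

    c      : coefficients of the present block
    c_prev : coded coefficients of the same block in the previous frame
    Returns (sent, recon): which coefficients were transmitted, and the
    values the receiver ends up with. Coefficients beyond len(steps) are
    dropped entirely, because zip() stops at the shortest sequence.
    """
    sent, recon = [], []
    for c_k, p_k, step, t_k in zip(c, c_prev, steps, thresholds):
        err = c_k - p_k
        if abs(err) >= t_k:
            # Transmit the quantized prediction error.
            sent.append(True)
            recon.append(p_k + quantize(err, step))
        else:
            # Not transmitted: the receiver keeps the previous-frame value.
            sent.append(False)
            recon.append(p_k)
    return sent, recon


# Hypothetical coefficient values for a single 2 x 4 block.
present = [100, 10, -4, 0, 0, 0, 0, 2]
previous = [90, 9, 6, 0, 0, 0, 0, 0]
sent, recon = replenish_block(present, previous)
```

The sketch leaves out the addressing (run-length) information and the entropy calculation that the bit-rate figures in this paper include.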
The results are shown in Fig. 2, in which the bit rate is plotted as a
function of the frame number for 60 frames. In these simulations and
those of the next section, the coder was initialized so that it used the
unquantized original first frame for prediction of the second frame. For
comparison, the results from Ref. 10 are reproduced for conditional
replenishment in pel domain. The comparison shows that, in the
transform domain, using a cosine transform on a 2 x 4 block, there is
a reduction of about 10 percent in bit rate over that obtained in pel
* We do not claim that these adjustments resulted in an optimum set of parameters.
However, a sufficiently large set of parameters was tried, giving us confidence that our
results are not far from the optimum.
`
`
`
`
`
`
`
`
`
`
[Figure 2: kilobits per frame plotted against frame number. Curves: conditional replenishment in the pel domain and in the transform domain (2 x 4 cosine transform), motion compensation in the transform domain, and motion compensation A and B in the pel domain.]
Fig. 2. Performance of conditional replenishment and motion-compensated
transform coders. Kilobits/frame are plotted as a function of frame number for a typical
sequence containing active motion of a head-and-shoulders view.
domain.† For this particular case, we dropped the eighth coefficient
entirely, and the prediction errors for the other seven coefficients
were quantized with uniform quantization scales with step sizes of 3, 5, 5, 7,
7, 9, 9, respectively. The thresholds $\{T_k\}$ for predictability were
taken to be 1, 2, 2, 3, 3, 4, 4 (out of 255) for the seven coefficients, respectively.
We varied some parameters of the transform to evaluate the
sensitivity of these results to the block size and the type of transform used.
Some of these are shown in Fig. 3. It is seen from this figure that a
one-dimensional cosine transform with four elements did worse than
the conditional replenishment in the pel domain (by up to 18
percent). As the transform size was increased, the bit rate dropped: for
the 2 x 2 block and cosine transform, the results were close to those of
the conditional replenishment in pel domain; the 2 x 8 block cosine
transform, on the other hand, was about 15 percent better than
the conditional replenishment in pel domain. We also tried other
transforms and found that for small block sizes they were equivalent
to the cosine transform but, as the block size was increased, the cosine
† Of course, several other modifications can be made to improve the pel domain
conditional replenishment. Our conclusion is not meant to be a comparison between
pel domain and transform domain coding in general.
`
`
`
`
transform behaved better than the other transforms. The results for
the 2 x 8 block using the Hadamard transform basis were very similar
to those of the 2 x 4 block and cosine transform but were inferior to those
of the 2 x 8 block and cosine transform.
`
Figure 4 shows the distribution of the bits required for addressing
and for the transmission of the first coefficient. As is seen, the
addressing bits are about 50 percent of the total bits. This is a significant
increase in addressing requirement compared to the conditional
replenishment in the pel domain, where the addressing accounts for only
about 20 to 30 percent of the total bits. This may be a result of using
only the prediction error corresponding to the coefficient being coded
for deciding whether that coefficient should be transmitted. This may
have made the decision to transmit a coefficient unnecessarily noisy.
We did, however, try several methods of reducing the addressing bits,
but none of these resulted in an overall bit rate reduction. In the
fraction of the bits that are required to send the prediction errors,
those for the first coefficient account for more than half, as shown in
Fig. 4. Thus the addressing and the first coefficient take up around 80
percent of the total bits generated by the conditional replenishment
coder.
`
[Figure 3: kilobits per frame plotted against frame number for conditional replenishment in the pel domain and in the transform domain with the 1 x 4 cosine, 2 x 2 cosine, 2 x 4 cosine (also 2 x 8 Hadamard), and 2 x 8 cosine transforms.]
Fig. 3. Performance of conditional replenishment in the transform domain with
various transforms.
`
`
`
`
`
`
[Figure 4: kilobits per frame plotted against frame number for conditional replenishment in the transform domain (2 x 4 cosine transform), showing total bits, bits for the first coefficient, and addressing bits.]
Fig. 4. Distribution of total bits/frame into addressing bits and
that required for the transmission of the first coefficient.
`
III. MOTION COMPENSATION WITH RECURSIVE DISPLACEMENT ESTIMATION
`
In the motion-compensated hybrid transform-DPCM coder shown in
Fig. 5, the nth coefficient of the qth present field block is predicted by
the nth coefficient of either the displaced or the nondisplaced block of
the previous frame, depending on which was better for the $(n-1)$th
coefficient, where the displacement is an estimate of the
frame-to-frame translation of a moving object. The displacement estimation
technique used in this section is identical to the one given in our
companion paper. We describe it as follows: Let $x_q = (x_{1q}, x_{2q})^T$
denote the coordinate of the upper left-hand pel of the qth block,
where the blocks in each row of blocks are numbered from left to right
with $q = 0, 1, 2, \ldots$, and superscript T denotes the transpose of a
vector or matrix. The pel intensities of block q, arranged in a fixed
order, are denoted by a column vector $I(x_q, t)$. Let the nth basis
vector of the transform be denoted by $\phi_n$, and, therefore, the nth
coefficient of the qth block of the transform of the present frame can
be written as

$$c_n(q) = I^T(x_q, t)\,\phi_n. \tag{1}$$

The displaced previous frame value of this coefficient is

$$\hat{c}_n(q, D) = I^T(x_q - D,\, t - \tau)\,\phi_n, \tag{2}$$
`
`
`
[Fig. 5. Simplified block diagram of the motion-compensated hybrid transform-DPCM coder.]
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
where $I(x_q - D,\, t - \tau)$ is the column vector of intensities of the
displaced qth block of the previous frame and $D$ is the estimated
displacement of the moving object. Computation of the elements in
$I(x_q - D,\, t - \tau)$ generally requires an interpolation among the given
previous-frame pel intensities. The displacement estimation algorithm
attempts to minimize the prediction error in predicting $c_n(q)$ by
$\hat{c}_n(q, D)$ by the steepest descent iteration of the form

$$D_{n+1}(q) = D_n(q) - \frac{\epsilon}{2}\,\nabla_D\,[e_n(q, D_n(q))]^2
= D_n(q) - \epsilon\, e_n(q, D_n(q))\,\nabla I^T(x_q - D_n(q),\, t - \tau)\,\phi_n \tag{3}$$

for $n = 0, 1, \ldots, M-2$ and $q = 0, 1, 2, \ldots$, with

$$D_0(q) = D_{M-1}(q-1) - \epsilon\, e_{M-1}(q-1, D_{M-1}(q-1))\,\nabla I^T(x_{q-1} - D_{M-1}(q-1),\, t - \tau)\,\phi_{M-1}, \tag{4}$$

where $e_n(q, D_n(q))$ is the error in the prediction of $c_n(q)$ (i.e.,
$c_n(q) - \hat{c}_n(q, D_n(q))$) and M is the number of displacement iterations performed
per block. Thus the iteration proceeds by first assuming the initial
displacement estimate of the qth block, $D_0(q)$, as an update from the
final displacement estimate of the $(q-1)$th block, $D_{M-1}(q-1)$. The next
displacement estimate of the qth block, $D_1(q)$, is formed from eq. (3)
with n = 0. Iteration progresses in the qth block from coefficient to
coefficient, resulting finally in displacement estimate $D_{M-1}(q)$. This
iteration procedure continues along all horizontal blocks of the raster.
The initial displacement of the leftmost block is assumed arbitrarily to
be zero.
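
The recursion of eqs. (3) and (4) can be sketched in a few lines of Python. This is a minimal illustration under several assumptions that are not part of the paper: the previous frame and its spatial gradient are supplied as analytic functions (so no interpolation is needed), the update is applied for every coefficient regardless of any transmission threshold, the block is scanned row by row, and the step size passed in is chosen for the toy example rather than being the value used in the simulations. All function and variable names are hypothetical.

```python
import math


def dct_vector(k, n):
    """k-th orthonormal cosine basis vector of length n."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
    return [scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n)) for j in range(n)]


def block_basis(rows, cols):
    """Separable 2-D cosine basis; entry n is phi_n for a row-major scanned block."""
    basis = []
    for kr in range(rows):
        for kc in range(cols):
            vr, vc = dct_vector(kr, rows), dct_vector(kc, cols)
            basis.append([vr[r] * vc[c] for r in range(rows) for c in range(cols)])
    return basis


def estimate_displacement(present, previous, gradient, corners, rows, cols, M, eps):
    """Run the recursion of eqs. (3) and (4) along one row of blocks.

    present(x, y), previous(x, y): intensity functions of the two frames
    gradient(x, y): spatial gradient (dI/dx, dI/dy) of the previous frame
    corners: upper-left pel (x, y) of each block, numbered left to right
    M: number of coefficients iterated per block; eps: step size
    """
    phi = block_basis(rows, cols)
    dx, dy = 0.0, 0.0  # leftmost block starts from zero displacement
    for (x0, y0) in corners:
        pels = [(x0 + c, y0 + r) for r in range(rows) for c in range(cols)]
        for n in range(M):
            # Eq. (1): coefficient of the present block.
            c_n = sum(f * present(x, y) for f, (x, y) in zip(phi[n], pels))
            # Eq. (2): same coefficient of the displaced previous block.
            c_hat = sum(f * previous(x - dx, y - dy) for f, (x, y) in zip(phi[n], pels))
            e_n = c_n - c_hat
            grads = [gradient(x - dx, y - dy) for (x, y) in pels]
            gx = sum(f * g[0] for f, g in zip(phi[n], grads))
            gy = sum(f * g[1] for f, g in zip(phi[n], grads))
            # Eq. (3); the update with n = M - 1 is the carry-over of eq. (4).
            dx, dy = dx - eps * e_n * gx, dy - eps * e_n * gy
    return dx, dy
```

For example, with a previous frame that is a horizontal intensity ramp and a present frame equal to that ramp shifted one pel to the right, each block moves the horizontal estimate part of the way toward 1, and the vertical estimate stays at zero because the ramp has no vertical gradient.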
`
Such a motion-compensated transform encoder transmits a quantized
version of the coefficient prediction error $e_n(q, D_n(q))$ to the
receiver whenever the magnitude $|e_n(q, D_n(q))|$ exceeds a given
threshold $T_n$, thereby enabling the decoder to update its displacement
estimate $D_n(q)$ as in eqs. (3) and (4), as well as correcting its prediction
of coefficient $c_n(q)$. Both the encoder and decoder use the updated
displacement estimate in predicting the next coefficient, and the
process continues. We note that, since only previously transmitted
information is used for displacement updating, no separate transmission of
displacement is necessary. A simplified block diagram of the hybrid
motion-compensated coder is shown in Fig. 5.
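
The switching between the displaced and the nondisplaced predictor, described at the start of this section, might be sketched as follows. The text does not say which predictor is used for the first coefficient of a block or how ties are resolved, so the choices below (displaced predictor first, ties in favor of the displaced predictor) are assumptions, as is the use of unquantized coefficient values when judging which predictor was better. The function name is hypothetical.

```python
def select_predictions(c, c_disp, c_nondisp):
    """Choose a predictor for each coefficient of one block.

    c         : coefficients of the present block
    c_disp    : coefficients of the displaced previous-frame block
    c_nondisp : coefficients of the nondisplaced previous-frame block
    The predictor for coefficient n is whichever of the two had the
    smaller absolute error on coefficient n - 1.
    """
    predictions = []
    use_displaced = True  # assumed starting choice for n = 0
    for n in range(len(c)):
        if n > 0:
            err_disp = abs(c[n - 1] - c_disp[n - 1])
            err_nondisp = abs(c[n - 1] - c_nondisp[n - 1])
            use_displaced = err_disp <= err_nondisp  # ties go to displaced
        predictions.append(c_disp[n] if use_displaced else c_nondisp[n])
    return predictions


# Hypothetical coefficient values for one block.
preds = select_predictions([10, 20, 30], [9, 21, 25], [12, 20, 31])
```

In an actual coder the comparison would have to be made on values available to both encoder and decoder, so that the decoder can make the same choice without side information.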
The results of motion-compensated coding in the transform domain
for the scene Judy are given in Fig. 2. In this figure, total bits per frame
are plotted against the frame number. For purposes of comparison,
`
* It should be noted that motion compensation in pel domain used intensities of the
previous field rather than frame, whereas motion compensation in transform domain
used intensities of the previous frame. It was found that, for the pel domain case (Ref.
10), previous field intensities give better results.
`
`
`
`
this figure also shows conditional replenishment in the pel domain as
well as in the transform domain and the two motion compensation
techniques of Refs. 10 and 11 in the pel domain. Motion compensation
in the transform domain is about 20 to 40 percent better than
conditional replenishment in the transform domain. Also, motion
compensation in the transform domain is better than one of the pel domain
motion compensation techniques by about 5 to 10 percent. This pel
domain technique is described in Ref. 10. It segments a frame into
three types of areas: background, compensable moving area, and
uncompensable moving area, and, therefore, requires a significant
amount of address transmission. Motion compensation in the
transform domain results in about 20 percent higher bit rates compared to
the second motion compensation technique in pel domain. In this
second technique, which is described in Ref. 11, each frame is divided
only in two parts, predictable and unpredictable, and thus transmission
of moving area address information is eliminated.
Results of motion compensation in the transform domain which uses
different types of transforms and block sizes are given in Fig. 6. This
figure shows that the cosine transform with 2 x 4 block does the best.
Increasing the block size increases the bit rate, perhaps as a result of
the uncompensable area (i.e., the pels for which the prediction error is
larger than threshold $T_n$) being in small isolated fragments. This result
is in contrast with the result obtained with larger block sizes in
conditional replenishment, where a larger block size, such as 2 x 8,
gave better results than a smaller block size, such as 2 x 4. A
one-dimensional transform, for example, the 1 x 4 block cosine transform,
does worse than motion compensation A in the pel domain, whereas a
2 x 2 block using the cosine transform, on the other hand, is equivalent
to motion compensation A in the pel domain.
`
[Figure 6: kilobits per frame plotted against frame number for motion compensation in the transform domain with several cosine-transform block sizes (1 x 4, 2 x 2, 2 x 4, 2 x 8) and for motion compensation in the pel domain.]
Fig. 6. Performance of motion-compensated coder with different transforms.
`
`
Figure 7 is a plot of the portion of the bits required for addressing
and the error transmission for the first coefficient. It is seen that, in
motion compensation in the transform domain, as in the pel domain,
the addressing takes up a significant portion of the total bits. This
portion varies from 40 to 60 percent of the total bits. The first
coefficient transmission requires more than 50 percent of the bits
required for transmission of the coefficients. Although the figure shows
the results for 2 x 4 block and cosine transform, similar results were
obtained for other transforms.
We found that more coefficients could be dropped from
transmission in the motion-compensated transform coder than in the
conditional replenishment transform coder. For example, with a 2 x 4
block and the cosine transform, only five coefficients were needed in
motion compensation, compared to the seven coefficients that were
necessary for conditional replenishment. Unfortunately, however, the
effect of dropping a larger number of coefficients did not result in a
large bit-rate reduction, since the number of bits required for these
coefficients was very small.
The results of Fig. 6 were obtained by adjusting the quantization
scales and the predictability thresholds $\{T_n\}$ in such a way that coding
degradations in pictures were just perceptible in informal viewing by
the authors. The quantization scales that we used were from uniform
quantizers with step sizes of 3, 5, 5, 7, 7 (for the first five coefficients of
the 2 x 4 cosine transform), and the predictability thresholds were 2,
3, 3, 4, 4 (out of 255) for the first five coefficients. Coarser quantization
of the higher order coefficients was possible in motion compensation,
[Figure 7: kilobits per frame plotted against frame number for motion compensation in the transform domain (2 x 4 cosine transform), showing total bits, bits for the first coefficient, and addressing bits.]
Fig. 7. Distribution of total bits/frame into addressing bits and
that required for the transmission of the first coefficient.
`
`
`
compared to conditional replenishment, without significantly
degrading picture quality. An increase of the predictability thresholds of the
first coefficient resulted in rapid degradation of the picture quality. As
the predictability threshold was increased, the block structure of the
transform became clearly evident, and the frame-to-frame effects of
the visibility of the block structure were found rather annoying. For
higher order coefficients, however, the picture quality was not a
sensitive function of the predictability thresholds. It appears that due
to better prediction in motion-compensated transform coding, effects
of coarser quantization and higher predictability thresholds are seen
in smaller disjoint areas of the picture and, therefore, their visibility is
lower.
`
The recursive displacement estimation algorithm of eq. (3) was also
varied by changing $\epsilon$ and by changing the order of iteration of the
coefficients within a block. We found that $\epsilon$ variation did not change
the bit rates significantly as long as $\epsilon$ was within a decade of 0.0001. As
expected (Ref. 13), much larger $\epsilon$ resulted in noisy estimates of displacement,
whereas smaller values of $\epsilon$ took longer to converge. The order of
iteration was varied by estimating displacement starting from the first
coefficient to the fifth (for a 2 x 4 transform block and cosine transform
with transmission of only five coefficients) or starting from the fifth
coefficient to the first. This corresponded to iterating from low
frequency to high frequency or vice versa. We found that going from high
frequency to low frequency resulted in a smaller number of bits for
transmission of the prediction error by about 5 to 10 percent. However,
since amplitude bits account for only about 50 percent of the total bits,
the overall reduction was only about 2 to 5 percent. Another variation
that was tried consisted of iterating only the first (dc) coefficient for
the entire block; that is, iterating the first coefficient five times instead
of iterating all the five coefficients once. This variation resulted in
performance which was very similar to the case in which all the
coefficients were iterated. Iterating some other higher order
coefficients five times (with no iteration of the first coefficient), however,
was found to be quite inferior. Although all the above conclusions are
based on the scene Judy, similar conclusions are true for the scene
Mike and Nadine. In general, as in pel domain (Refs. 10 and 11), the bit rate for
Mike and Nadine was much higher than that for Judy. It varied
between 170 and 200 kilobits per frame for conditional replenishment
in the transform domain, compared to 150 to 175 kilobits per frame for
motion-compensated transform coding.
`
IV. MOTION COMPENSATION WITH TRANSMITTED DISPLACEMENT
`
In this section, we give results of estimating displacement by a
technique proposed by Limb and Murphy and then use it for motion-
`
`
compensated prediction. The displacement computation was done in
"displacement blocks" with varying sizes, such that the transform
block was an exact submultiple of the displacement block in both
dimensions. Also, coded values of intensities of the previous frame
were used to obviate the need of an additional frame store. Having
computed the displacement for the displacement block by the Limb and
Murphy algorithm, each transform coefficient within the transform
block is predicted by using the displaced coefficient from the previous
frame or the nondisplaced coefficient from the previous frame,
depending on which was better for the previous coefficient of the same
block. In this scheme, there is a tradeoff between the displacement
block size and the total number of bits required for a given picture
quality. A large displacement block size tends to average all the local
variations of the displacement and, consequently, may not result in a
good prediction; however, it requires less overhead for transmission of
the displacement estimate. On the other hand, a small displacement
block requires larger overhead but is potentially superior for
displacement estimation in noiseless data. For real scenes, however, the quality
of displacement estimation using small blocks might also suffer.
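
The Limb and Murphy estimator is only summarized in this paper as a ratio of accumulated frame difference and spatial difference signals. The Python sketch below shows one simple form of such a ratio for a single displacement block. The exact accumulation rule, the sign convention (the present frame is taken to be the previous frame shifted by the displacement, as in eq. (2)), and the function names are assumptions made for illustration; the moving-area segmentation and the use of future as well as past data are left out.

```python
def sign(v):
    """Return +1, 0, or -1 according to the sign of v."""
    return (v > 0) - (v < 0)


def limb_murphy_estimate(present, previous):
    """Estimate (dx, dy) for one displacement block.

    present, previous: lists of rows of pel intensities, same shape.
    Each component is minus the accumulated frame difference, signed by
    the local spatial difference, divided by the accumulated magnitude
    of that spatial difference. A component is reported as 0.0 when the
    block has no intensity variation in that direction.
    """
    rows, cols = len(present), len(present[0])
    num_x = den_x = num_y = den_y = 0.0
    for r in range(rows):
        for c in range(cols):
            frame_diff = present[r][c] - previous[r][c]
            if c > 0:
                # Element (horizontal) difference within the present frame.
                elem_diff = present[r][c] - present[r][c - 1]
                num_x += frame_diff * sign(elem_diff)
                den_x += abs(elem_diff)
            if r > 0:
                # Line (vertical) difference within the present frame.
                line_diff = present[r][c] - present[r - 1][c]
                num_y += frame_diff * sign(line_diff)
                den_y += abs(line_diff)
    dx = -num_x / den_x if den_x else 0.0
    dy = -num_y / den_y if den_y else 0.0
    return dx, dy
```

Because each component is estimated separately from a single frame difference, this form is exact only in simple cases such as a pure horizontal ramp moving horizontally; for general picture content it is an approximation, which is in keeping with the tradeoff between block size and estimate quality discussed above.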
Our simulations used three sizes for the displacement blocks: 16 x 32
(i.e., 16 lines x 32 elements in the same field), 8 x 16, and 4 x 8.
These blocks were approximately square, considering the interlace.
Only a 2 x 4 transform block with the cosine transform was used. All
the rest of the coder parameters were adjusted to generate pictures of
approximately the same quality as before. For both the scenes, without
accounting for bits required for transmission of displacement
information, a displacement block size of 8 x 16 did the best in terms of
bits per frame. For the scene Judy, displacement blocks of 16 x 32 and
4 x 8 resulted in bit rates that were higher by approximately 1000 bits
per frame and 2000 bits per frame, respectively. For the scene Mike
and Nadine, similar comparisons resulted in about 3000 bits per frame
and 5000 bits per frame. Also without accounting for those bits
necessary for transmission of displacement information, the 8 x 16
displacement block resulted in bit rates comparable to those of previous
sections with recursive displacement estimation for Judy, but about
5000 to 1000 bits per frame higher for the scene Mike and Nadine.
This, however, is a small percentage of the total bits transmitted per
frame. As mentioned earlier, the schemes of this section require
transmission of displacement information. We did not study any
schemes to optimize transmission of this information. Assuming that
each of the two displacement components can be specified by 8 bits, we
would need 2024, 8096,
and 32,384 bits per frame for 16 x 32, 8 x 16, and 4 x 8 displacement
blocks, respectively. Clearly, considering the overall bit rate,
displacement blocks of 8 x 16 and 16 x 32 are similar in performance, with a