throbber

`
`’|
`
`Block Matching
`
`As mentioned in the previous chapter. displacement vector measurement and its usage in motion
`compensation in intei li'ai‘i‘ic. coding ['or it TV signal can be traced back to the 19703. Netravali and
`Robbins ([9791 developed a pet-recursive technique. which estimates the displacement vector for
`each pixel recursively Ironi its neighboring pixels using an optimization method. Limb and Murphy
`(1975). Rocca and 7.;tnoletti Il'El‘i'E). Cal‘l‘orio and Rocea (1W6). and Broilerio and Rocco (1977)
`developed techniques for the estimation of diSplaccntent vectors of a block of pixels.
`[11 the latter
`approach. an image Is
`lirst segmented into areas with each having an approximately uniform
`translation. Then the motion vector is estimated for each area. The segmentation and motion
`estimation asst-mated with these arbitrarily shaped blocks are very dilliculi. When there are multiple
`moving areas in images. the situation becomes more challenging. In addition to motion vecmt‘s.
`the shape iiili‘ii'iiiatioii ot‘ these areas needs to be coded. Hence. when moving areas have various
`complicated shapes. hotli computational complexity and coding load will increase remarkably.
`In contrast,
`the block matching technique. Which is
`the locus of this chapter.
`is simple,
`straightforward. and yet very cllictent.
`It has been by far the most popularly utilized motion
`estimation technique in video coding In fact. it has been adopted by all the international video
`coding standards: lSO. MPEG—1 and MPEG-2. and [TU H.261. and H.263. These standards will
`be introduced in detail in Chapters lo.
`l'i'. and IQ. respectively.
`it is interesting to note that even nowadays, with the tremendous advancements in multimedia
`engineering. object-based andlor content-based manipulation ofiiudiovisual information is still very
`demanding. particularly in audiovisual data storage. retrieval. and distribution. The applications
`include digital library. video on demand. audiovisual databases. and so on. Therefore. the coding
`of arbitrarily shaped objects has attracted great research attention these days. It has been included
`in the MPEG—4 activities (Brailean. 199?}. and will he discussed in Chapter 18.
`In this chapter various aspects of block matching are addressed. They include the concept and
`algorithm. matching criteria. searching strategies. limitations. and new improvements.
`
`11.1 NONOVERLAPPED. EQUALLY SPACED. FIXED SIZE,
`SMALL RECTANGULAR BLOCK MATCHING
`
`To avoid the kind of difficulties encountered ll'l motion estimatEOn and motion compensation with
`arbitrarily shaped blocks. the block matching technique was prop05cd by Jain and Jain (198 I) based
`on the following simple motion model.
`_
`’
`An image is partitioned into a set of nonoverlapped. equally spaced. llxed 812C. Emil” rectangular
`blocks: and the translation motion within each block is assumed to be uniform. Although this Simple
`model considers translation motion only. other types of motions, such as rotation and zooming of
`large Obit-1018, may be closely approximated by the piecewise translation of these small blocks
`provided that these blocks are small enough. This observation. originally made by Jain and Jain.
`has been confirmed again and again since then.
`.
`‘
`Di5placement vectors for these blocks are estimated by finding their best matched counterparts
`in the previous frame.
`In this manner. motion estimation is significantly easier than that for
`arbitrarily shaped blocks. Since the motion of each block is described by only one displacement
`“301m.
`the side information on motion vectors decreases. Furthermore.
`the rectangular shape
`
`221
`
`|PR2018—01413
`
`Sony EX1008 Page 247
`
`IPR2018-01413
`Sony EX1008 Page 247
`
`

`

`222
`
`image and Video Compression for Multimedia Engineering
`
`
`the best block matching
`
`correlation window
`
`_|III displacement vector
`
`[Illa-III.-
`IIIIIIIIIII
`
`
`
`An original block
`
`(a) t. frame
`
`[13) [M frame
`
`FIGURE 11.1 Block matching.
`
`information is known to both the encoder and the decoder. and hence does not need to be chOdEd.
`
`which saves both computation load and side information.
`the smaller the block size. the more
`The block size needs to be chosen properly. In general.
`accurate is the approximatiOn. It is apparent. however. that the smaller block size leads to more
`motion vectors being estimated and encoded. which means an increase in both computation and
`side information. As a compromise. a size of 16 x 16 is considered to he a good choicc. (This “35
`been specified in international video coding standards such as H.26I, H.2o3. and MPEG-1 and
`MPEG-2.) Note that for finer estimation :1 block size of 8 X S is sometimes used.
`Figure 11.] is utilized to illustrate the block stretching technique.
`in Figure ll.l(a) an image
`Frame at moment in is segmented into nonoverlapped p x q rectangular blocks. As mentioned above.
`in common practice. square blocks ofp = q = 16 are used most ol'ten. Consider one of the blocks
`centered at (x. 3;). It is assumed that the block is translated as a whole. Consequently. only 0113
`displacement vector needs to be estimated for this block. Figure l l.l(b) shows the previous frame:
`the frame at moment t“. In order to estimate the displacement vector. a rectangular search window
`is opened in the frame rm. and centered at the pixel (it. y). Consider a pixel in the search window.
`a rectangular correlation window of the same size p x r; is opened with the pixel
`located in its
`center. A certain type of similarity measure (correlation) is calculated. After this matching process
`has been completed for all candidate pixels in the search window, the correlation window corre—
`spending 10 the 131363! Similarity becomes the best match of the block under consideration in frame
`r... The relative position between these two blocks (the biock and its best match) gives the displace—
`ment vector. This is shown in Figure 1 1.1(b).
`The size of the search window is determined by the size of the correlation window and the
`maximum POSSibIB diSplacement along four directions: upward, downward. rightward. and leftward.
`1“ Figure l 1-2 these four quantities are assumed to be the same and are denoted by d. Note that r!
`is estimated from apr‘r’orr' knowledge about the translation motion. which includes the largest
`PDSSible motion Speed and the temporal interval between two consecutive trainer». i.e.. 1.. ‘ r"...
`
`1 1 .2 MATCHING CRITERIA
`
`Block matching belongs to image matching and can be viewed from a wider perspective. In mil"),
`image processing tasks. we need to examine two images or two portions of images on a PiXEI‘bY'P'xel
`
`|PR2018—01413
`
`Sony EX1008 Page 248
`
`IPR2018-01413
`Sony EX1008 Page 248
`
`

`

`Block Matching
`
`223
`
`q+2d
`
`l
`
`FIGURE 11.2
`
`Search window and correlation window.
`
`basis. These two images or two image regions can be selected From a spatial image sequence. i.e.,
`From two ['rames taken at the same time with two different sensors aiming at the same object. 01'
`from a temporal image sequence. i.e.. from two frames taken at two different moments by the same
`sensor. The purpose at the examination is to determine the similarity between the two images or
`two portions of images. Examples of this type of application include image registration (Pratt,
`1974) and template matching (Jain,
`l989). The former deals with spatial registration of images.
`while the latter extracts andi’or i'cCOgnixes an object in an image by matching the object template
`and a certain area of the image.
`The similarity measure. or correlation measure. is 21 key element in ”“3 matching process. The
`basic correlation measure between two images I" and r,,_,. C (s. t). is defined as follows (Anuta. 1969).
`
`I):
`
`2.” z:tf“(U‘k)ffl_l[j+s,k+t)
`:41“..kaMix: Hff—I(j+5.k+!)2
`
`.
`
`J2;
`
`(11.1)
`
`This is also referred to as a normalized two-dimensional cross-correlation function (Musmann et at.
`1985)
`Instead ot finding the maximum similarity or correlation an equivalent but yet mote compu-
`tationally efficient way of block matching is to find the minimum dissimilarity. or matching error.
`The dissimilarity (sometimes relerred to as the cum distortion. or distance) between two images
`I" and t“. D (s, t) is defined as tollows.
`
`D('§‘(!)=—EZMUnflj’klfJt-f_|(j+5,.k+f))
`
`j==lkl
`
`(11.2)
`
`where M(u,v) is a metric that measures the dissimilarity between the two arguments u and v. The
`D (S. t) is also referred to as the matching criterion or the D values.
`In the literature there are several types of matching criteria. among which the mean square
`error (MSE) (Jain and Jain. 1981) and mean absolute dill'erence (MAD) (Koga et al., I981) are
`used most often. It is noted that the sum of the squared difference (SSD) (Anandan, 1987) or the
`Sum of the squared error (SSE) (Chan et al.. 1990) IS essentially the same as MSE. The mean
`
`|PR2018—01413
`
`Sony EX1008 Page 249
`
`IPR2018-01413
`Sony EX1008 Page 249
`
`

`

`224
`
`Image and Video Compression for Multimedia Engineering
`
`absolute difference is sometimes referred to as the mean absolute error (MAE) in the literature
`
`(Nogaki and Ohta. 1972).
`In the MSE matching criterion. the dissimilarity metric M (L1. v) is defined as
`
`In the MAD.
`
`’1
`
`M(tt.v)=(rr—v)‘.
`
`M(n.v):|tr—vl.
`
`[l l.3)
`
`{11.4}
`
`Obviously, both criteria are simpler than the normalized two-dimensional cross-correlation measure
`defined in Equation 11.].
`Before proceeding to the next section. a comment on the selection ot' the dissimilarity measure
`is due. A studyr based on experimental works reported that the matching criterion does not signif-
`icantly affect the search {Srinivasam 1984). Hence. the MAD is preferred due to its simplicity in
`implementation (Musmann ct al.. 1985).
`
`11.3
`
`SEARCHING PROCEDURES
`
`'I‘he searching strategy is another important issue to deal with in block matching. Several searching
`strategies are discuused below.
`
`11.3.1
`
`FULL SEARCH
`
`Figure 1 1.2 shows a search window, a correlation window, and their sizes. In searching for the best
`match. the correlation window is moved to each candidate position within the search window. That
`is. there are a total (2 d+l) x (2 d+l) positions that need to be examined. The minimum dissimilaril)‘
`gives the best match. Apparently, this full search procedure is brute force in nature. While the full
`search delivers good accuracy in searching for the best match (thus, good accuracy in motion
`estimation). a large amount of computation is involved.
`In order to lower computational complexity. several
`developed. They are introduced below.
`
`fast searching procedures have been
`
`11.3.2
`
`2-D [OCARITHMIC SEARCH
`
`Jain and Jain (1931) developed a 2-D logarithmic searching procedure. Based on a 1—D logarithmic
`search procedure (Kmflh. 1973). the 2-D procedure successively reduces the search area.
`thus
`reducing the computational burden. The first steps computes the matching criteria for five 1301““;
`i“ the search window. Thesc five Points are as follows: the central point of the search window and
`the four points surrounding it. with each being a midpoint between the central point and one of
`the four boundaries of the window. Among these five points. the one corresponding to the minimum
`dissimilarity is picked as the winner. In the next step. surrounding this winner. another set of five
`points are selected in a similar fashion to that in the first step. with the distances between the five
`points remaining unchanged. The exception takes place when either a central point of a set of five
`points 01' 3 boundary POW 0m“? SCEIICh Window gives a minimum D value. In these circumstanceS.
`the distances between the five points need to be reduced. The procedure continues until the final
`step. in which a set of candidate points are located in a 3 x 3 2-D grid. Figure l 1.3 demonstrates
`No cases of the procedure. Figure 11.3(a) shows that the minimum D value takes place on a
`boundary. while Figure 11.3(b) shows the minimum D value in the central position.
`
`|PR2018—01413
`
`Sony EX1008 Page 250
`
`IPR2018-01413
`Sony EX1008 Page 250
`
`

`

`Block Matching
`
`225
`
`
`
`01}
`
`(a) A 2-D logarithmic search procedure. Points at (i. k+2L 0+2: k‘l'2)- 0+2: “4}: ancl 0+}:
`FIGURE 11.3
`k+4J are found to give the minimum dissimilarity in steps 1. 3. 3. and 4. IE§P€CIIV€ly:
`(‘b} A 2? loganl-hmic
`. search procedure. Points at U. k,2)‘ 0+2, 192). and 0+2. k.]) are found to give the minimum diSSlmiIarity In
`sleps ]. 2. 3, and 4, respectively.
`
`A convergence proofofthe procedure is presented by Jain and Jain (1931). under the assumption
`that the dissimilarity monotonically increases as the search pom: moves away from the point
`correSponding to the minimum dissimilarity.
`
`|PR2018—01413
`
`Sony EX1008 Page 251
`
`IPR2018-01413
`Sony EX1008 Page 251
`
`

`

`226
`
`if on
`
`W
`
`Image and Video Compre55ion for Mu|timedia Engineering
`
`
`3% ‘b”#1gush-l
`
`'0'ta0
`
`I:0Go,a
`
`Three—step search procedure. Points (j+~'l. k—4). U+4. k—o), and tj+5. k-7) give the minimum
`FIGURE 11.4
`dissimilarity in steps 1, 2. and 3. respectively.
`
`11.3.3
`
`COARSE—FINE THREE-STEP SEARCH
`
`Another important work on the block matching technique was completed at almost the same time
`by Koga et a1. (1981). A coarse-fine three—step procedure was developed for fast searching.
`The three-step search is very similar to the 2—D logarithm search. There are, however, three
`main differences between the two procedures. First, each step in the three-step search COITIpfll’BS‘fl
`set of nine points that form a 3 x 3 2—D grid structure. Second, the distances between the points tn
`the 3 X 3 2—D grid structure in the three-step search decrease monotonically in steps 2 and 3. Third,
`a total of Only three steps are carried out, Obviously. these three items are different from the 2—D
`logarithmic search described in Section 11.3.2. An illustrative example of the three-step search is
`shown in Figure 11.4.
`
`11.3.4 CONJUGATE DIRECTION SEARCH
`
`The conjugate direction search is another fast search algorithm that was developed by Srinivasan
`and R30 (1934)- In PfinCiPle. the procedure consists of two parts. In the first part,
`it findsthe
`minimum dissimilarity along the horizontal direction with the vertical coordinate fixed at an’initial
`position. In the second part, it finds the minimum D value along the vertical direction With the
`horizontal coordinate fixed in the position determined in the first part. Starting with the vertical
`direction followed by the horizontal direction is, of course, functionally equivalent. It was reported
`that this search procedure works quite efficiently (Srinivasan and Rao, 1984).
`Figure 11.5 illustrates the principle of the conjugate direction search. In this example, each
`step involves a comparison between three testing points. If a point assumes ihE-mlmm-um D V3126
`compared with both of its two immediate neighbors (in one direction), then it ts consrdereld to“ 6
`the best match along this direction, and the search along another direction is started. Spwffi‘: I);
`the procedure starts to compare the D values tor three points (i, k—-l), (i, k), and (i, k+l)- 1_ :1)
`value of point (j, k—l) appears l0 be the minimum among the three, then points (1. k-Z). (l.
`.
`
`|PR2018—01413
`
`Sony EX1008 Page 252
`
`IPR2018-01413
`Sony EX1008 Page 252
`
`

`

`Block Matching
`
`227
`
`
`
`FIGURE 11.5 Conjugate direction search.
`
`and (j. k) are examined. The procedure continues. finding point (j, k—3) as the best match along
`the horizontal direction since its D value is smaller than that of points (j, k—4} and (j, k—2). The
`procedure is then conducted along the vertical direction. In this example the best matching is finally
`found at point lj+2t k—3).
`
`11.3.5
`
`SUBSAMPLING IN THE CORRELATION WINDOW
`
`In the evaluation of the matching criterion. either MAD or MSE, all pixels within a correlation
`window at the {M frame and an original block at the I” frame are involved in the computation. Note
`that the correlation window and the original block are the same size (refer to Figure l 1.1). In order
`to further reduce the computational effort. a subsantpling inside the window and the block is
`performed (Bierling, 1988). Aliasing effects can be avoided by using low-pass filtering. For instance.
`GHIY every second pixel, both horizontally and vertically inside the window and the block, is taken
`into account for the evaluation of the matching criterion. Obviously, by using this subsampling
`technique, the computational burden is reduced by a factor 0M. Since 3.34 of the pixels within the
`window and the block are not involved in the matching computation. however, the use of such a
`suhsampling procedure may affect the accuracy of the estimated motion vectors. especially in the
`case of small-size blocks. Therefore.
`the subsampting technique is recommended only for those
`cases with a large enough block size so that the matching accuracy will not be seriously affected.
`Figure l 1.6 shows an example of 2 x 2 subsampling applied to both an original block of 16 x 16
`at the I" frame and a correlation window of the same size at the t,” frame.
`
`11.3.6 MULTIRESOLUTION BLOCK MATCHING
`
`is a very
`is well known that a muttiresolution structure. also known as a pyramid structure.
`It
`Powerful computational configuration for various image processing tasks. To save computation in
`block matching, it is natural to resort to the pyramid structure. In fact, the multiresolution technique
`has been regarded as one of the most efficient methods in block matching (Tzovaras et al., 1994).
`In a named top-down multiresolution technique, a typical Gaussian pyramid is formed first.
`
`|PR2018—01413
`
`Sony EX1008 Page 253
`
`IPR2018-01413
`Sony EX1008 Page 253
`
`

`

`223
`
`Image and Video Compression for Multimedia Engineering
`
`an original block
`
`a correlation windmi-
`
`(a) An original block of l6>¢15 in fi'ame at I .
`
`(b) A correlation mndnw of 15): I 5 Ln Emma atl
`
`FIGURE 11.6 An example of 2 x 2 subsarnpling in the original block and correlation window for a faSI
`search.
`
`Before diving into further description. let us pause here to give those readers who have not been
`exposed to the Gaussian pyramid a short introduction to the concept. For those who know the
`coacept. this paragraph can be skipped. Briclly speaking, a Gaussian pyramid can he understood
`as a set of images with different resolutions related to an original image in a certain way. The
`original image has the highest resolution and is considered as the lowest level. sometimes called
`the bottom level.
`in the set. From the bonorn level
`to the top level.
`the resolution decreases
`monotonically. Specifically, between two consecutive levels, the upper level is half as large as the
`lower level in both horizontal and vertical directions. The upper level is generated by applying a
`low-pass filter {which has a group of weights) to the lower level. followed by a 2 >< 2 subsampllng‘
`That is. each pixel in the upper level is a weighted average of some pixels tn the lower level
`In
`general, this iterative procedure of generating a level in the set is equivalent to convolving a specific
`weight function With the original image at the bottom level followed by an appropriate subsantpling'
`Under certain conditions. these weight functions can closely approximate the Gaussian probabilit)’
`density fUflCtion. which is why the pyramid is named after Gauss. (For a detailed discussion. readers
`are referred to Burt and Adelson [1933. 1984].)l A Gaussian pyramid structure is depicted tn
`Figure It]. Note that the Gaussian pyramid depicted in Figure l 1.? resembles a so-called quad—
`trce structure in which each node has four children nodes. In the simplest quad-tree pyramid, eaCh
`pixel in an upper level is assigned an average value of its correSponding four pixels in the DES“
`lower level.
`
`Now let‘s return to our discussion on the top~down multiresolution technique After a Gaussran
`Piramid has been constructed. motion search ranges are allocated among the different Pyram'd
`levels. Block matching is initiated at the lowest resolution level to obtain an initial estimation of
`motion vectors. These computed motion vectors are then propagated to the next higher resolution
`level. where they are corrected and then propagated to the next level. This procedure continues
`until the highest resolution level is reached. As a result. a large amount of computation can be
`saved. Tzovaras et a1. (1994) showed that a two~level Gaussian pyramid outperforms a three-level
`pyramid. Compared with full search block. matching. the top-down multiresolution block searClt
`saves up to 67% ofcomputations withoutseriously affecting the quality of the reconstructed images-
`In conclusion. it has been demonstrated that multiresolution is indeed an efficient compuml‘onal
`structure in block matching. This once again confines the high computational efficiency of ”“3
`multiresolution structure.
`
`|PR2018—01413
`
`Sony EX1008 Page 254
`
`IPR2018-01413
`Sony EX1008 Page 254
`
`

`

`Block Matching
`
`229
`
`level 4
`
`level
`increasing
`resolution
`
`de
`
`tag
`
`level I
`
`
`level 3
`
`level 2
`
`FIGURE 11.7 Gaussian pyramid structure.
`
`11.3.7 THRESHOLDING MULTIRESOLUTIDN Btocx MATCHING
`
`With the multiresolution technique disenssed above. the computed motion vectors at any interrac—
`diate pyramid level are projected to the next higher resolution level. In reality. some computed
`motion vectors at
`the lower resolution levels may be inaccurate and have to be further refined.
`while others may be relatively accurate and able to provide satisfactory motion compensation for
`the corresponding block. From a computation-saving point of view. for the latter class it may not
`be worth propagating the motion vectors to the next higher resolution level For further processing.
`Motivated by the above observation,
`at new inultircsolution block matching method with a
`thresholding technique was developed by Shi and Xia (1997). The thresholding technique prevents
`those blocks, whose estimated motion vectors provide satisfactory motion compensation. from
`further processing. thus saving a lot of computation. In what follows. this technique is presented
`in detail so as to proVide readers with an insight to both multiresolution block matching and
`thresholding multiresolution block matching techniques.
`
`Algerithm —— Let fact, y) be the frame of an image sequence at current moment it. First. two
`Gaussian pyramids are formed, pyramids u and n — I. from image frames fits, y) and fi,_,(x.y).
`respectively. Let the levels of the pyramids be denoted by l. l: 0.
`l.
`L, where 0 is the lowest
`resolution level (rep level). L is the full resolution level (bortom level), and L+l is the total number
`0f layers in the pyramids. If (Lj) are the coordinates of the upper-left corner of a block at level l
`01" pyramid n. the block is referred to as block (i.j),§. The horizontal and vertical dimensions of a
`block at level I are denoted by b; and b'y. respectively. Like the variable block size method (refer
`to Method 1
`in Tzovaras et al. {1994]}, the size of the block in this work varies with the pyramid
`levels. That is. ifthe size ofa block at level i is bi, then the size of the block at level H» 1 becomes
`2b; 3': 2b;. The variable block size method is used because it gives more efficient metion estimation
`than the fixed block size method. Here. the matching criterion used for motion estimation is the
`MAD because it does not require ntultiplication and performs similar to the MSE. The MAD
`between block (i._i)'b,'1 of the current frame and block (i + v‘, j + v,)'b:._l of the previous frame at
`level l can be calculated as
`
`|PR2018—01413
`
`Sony EX1008 Page 255
`
`IPR2018-01413
`Sony EX1008 Page 255
`
`

`

`230
`
`Image and Video Compression for Multimedia Engineering
`
`MADW],[V_:.1’:_)=fiE Elfif(t+k,j+ m)—fif_,(r+ k + vfi.j+ m + vi)!
`
`3' k1“ mzfl
`
`{115)
`
`bl-t bi-l
`
`where V1 = (vi. vy'.) is one of the candidates of the motion vector of block (id-)1.- u". it: are the two
`components of the motion vector along the it and y directions, respectively.
`A block diagram of the algorithm is shown in Figure 11.8. The threshold in terms of MAD
`needs to be determined in advance according to the accuracy requirement ol‘ the motion estimation
`Determining the threshold is discussed below in Part B of this subsection. Gaussian pyramids are
`formed for two consecutive frames of an image sequence from which motion estimation is desired
`Block matching is then performed at
`the top level with the full-search scheme. The estimated
`motion vectors are checked to see if they provide satisfactory motion compensation. it the accuracy
`requirement is met. then the motion vectors will be directly transformed to the. bottom level of the
`pyramid. Otherwise, the motion vectors will be propagated to the next higher resolution level for
`further refinement. This thresholding process is discussed below in Part C of this Subsection. The
`algorithm continues in this fashion until either the threshold has been satisfied or the bottom level
`has been reached. The skipping of some intermediate~level calculations provides for computational
`saving. Experimental work with quite different motion complexities demonstrates that the proposed
`algorithm reduces the processing time from 14 to 20%. while maintaining almost the same quality
`in the reconstructed image compared with the fastest existing mnltiresolution block matching
`algorithm (Tzovaras et al.. t994).
`
`Block matching
`
`Low pass filtering
`and subsampling
`
`Block matching
`
`Satisfying
`thrflhold
`
`Low pass filtering
`
`
`
`and substimpling
`
`
`
`and suhsarnpling
`
`
`Low pass filtering
`
` Satisfying
`
`
`Low pass filtering
`and subsampling
`
`threshold
`
`Motion field
`
`
`FIGURE 11.8 Block diagram for a three-level threshold multiresolution block matching-
`
`|PR2018—01413
`
`Sony EX1008 Page 256
`
`IPR2018-01413
`Sony EX1008 Page 256
`
`

`

`Block Matching
`
`231
`
`——_._._—_____—_____—
`
`TABLE 11.1
`
`Parameters Used in the Esperiments
`
`Parameters at Level
`
`Low Resolution Level
`
`Full Resolution level
`
`Search range
`Block size
`
`Thresholding value
`
`Search range
`Block size
`
`Tltrcsholding value
`
`MiSS America
`3 x 3
`4 X 4
`
`| x l
`3 x 8
`
`2
`
`Train
`
`4 a 4
`4x4
`
`3
`
`Football
`
`None {not applicable}
`
`Ix I
`8x8
`
`None (not applicable)
`
`Search range
`4 x 4
`l x |
`Block size
`4 a: 4
`Bx 8
`4Thresholding value None (not applicable)
`
`
`
`
`
`Threshold Determination — The MAD accuracy criterion is used in this work for the sake of
`saving computations. The threshold value has a direct impact on the performance of the proposed
`algorithm. A small threshold value can improve the reconstructed image quality at the expense of
`increased computational effort. On the other hand. a large threshold value can reduce the compu-
`tational complexity. but the quality of the reconstructed image may be degraded. One possible way
`to determine a threshold value. which was used in many experiments by Shi and Xia U997), is as
`follows.
`
`The peak signal-to-noise ratio (PSNR) is commonly used as a measure of the quality of the
`reconstructed image. As introduced in Chapter 1. it is defined as
`
`PSNR = 10 logm
`
`
`2552
`MSE
`
`(t 1.6)
`
`From the given required PSNR, one can find the necessary MSE value. A square root of this
`MSE value can be chosen as a threshold value, which is applied to the first two images from the
`sequence. If the resulting PSNR and required processing time are satisfactory. it is then used for
`the rest of the sequence. Otherwise. the threshold can be slightly adjusted accordingly and applied
`to the second and third images to check the PSNR and processing time. it was reported in numerous
`experiments that this adjusted threshold value was accurate enough. and that there was no need for
`further adjustment. As shown in Table NJ, the threshold values used for the “Miss America.“
`“Train." and “Football" sequences (three sequences having quite different motion complexities) are
`2. 3, and 4, respectively. They are all determined in this fashion and give satisfactory performance.
`as shown in the three rows marked “New Method (TH=2}," ”New Method (TI—1:3)" and “New
`Method (TH=4),“ respectively, in Table l 1.2. That is, the PSNR experiences only about 0-1 dB 1055
`and the processing time decreases drastically. In the experiments. the threshold value of 3. Le, the
`average value of2, 3. and 4. was also tried. Refer to the three rows marked "New Method {TH=3}
`in Table 11.2. It is noted that this average. threshold value 3 has already given satisfactory perfor-
`mance for all three sequences. Specifically, for the "Miss America“ sequence, since the criterion
`increases from 2 to 3, the PSNR loss increases from 0.12 to 0.48 dB. and the reduction in processing
`time increases from 20 to 38%. For the ”Football“ sequence, since the criterion decreases from
`4 10 3. the PSNR loss decreases from 0.08 lo 0.05 dB, and the reduction in PFOCBSSIUS time decreases
`
`|PR2018—01413
`
`Sony EX1008 Page 257
`
`IPR2018-01413
`Sony EX1008 Page 257
`
`

`

`232
`
`image and Video Compression for Multimedia Engineering
`
`from 14 to 9%. Obviously. for the "Train“ sequence. the criterion, as well as the performance,
`remains the same. One can therefore conclude that the threshold determination may not require
`much computation at all.
`
`Thresholding — Motion vectors estimated at each pyramid level will be checked to see it they
`provide satisfactory motion compensation. Assume V’ (Lj) = (vi. vi) is the estimated motion vector
`for block (i.j)in at level i of pyramid H. For thresliolding. V’ {Lj} should be directly projected to
`the bottom level L. The corresponding motion vector for the same block at the bottom level of
`pyramid it will be VL (2“‘43 r'.2“-'“j), and is given as
`
`VL(2“-'”t. THU): Z‘L'i'V’lLJ)
`
`(11.7}
`
`The MAD between the block at the bottom pyramid level of the current frame and its counterpart
`in the previous frame can be determined according to Equation 1 IS. where the motion vector is
`Vb: VL (Eur-ll LEW-”J0. This computed MAD value can be compared with the predefined threshold.
`If this MAD value is less than the threshold. the computed motion vector l/L (Z‘H’ i.2""“j) will
`be assigned to block (2"~"" i.2"-‘”j){; at level L in the current frame and motion estimation for this
`block will be stopped. If not. the estimated motion vector V“ (i._l) at
`level
`1’ will be propagated to
`level l+ l for further refinement. Figure 11.9 gives an illustration of the above thresholding process.
`
`Experiments — To verify the effeCIiveness of the proposed algorithm. extensive experiments have.
`been conducted. The performance of the new algorithm is evaluated and compared with that of
`Method 1. one of the most efficient multiresolution block matching methods (Trovaras et at. 1994}
`in terms of PSNR, error image entropy, motion vector entropy, the number of blocks stopped at
`the top level vs. the total number of blocks, and processing time. The number of blocks stopped
`at the top level is the number of blocks withheld from further processing. whilc the total number
`of blocks is the number of blocks existing at
`the top level. It
`is noted that the total number of
`blocks is the same for each level in the pyramid. The processing time is the sum ot’ the total number
`of additions involved in the evaluation of the MAD and the thresholding operation.
`In the experiments. two-level pyramids are used since they give better performance for motion
`estimation purposes [Tzovaras etal.. 1994). The algorithms are tested on three video sequences
`with different motion complexities. i.e.. the “Miss America." “Train." and “Football." The “Miss
`America" sequence has a Speaker imposed on a Static background and contains less motion. The
`“Train" sequence has more detail and contains a fast-moving object (train). The 20th frame of the
`sequence is shown in Figure l 1.10. The “Football“ sequence contains the most complicated motion
`
`Prrmfid
`n-l
`
`fl
`
`Estimation of motion vector
`ofa block at level I
`
`
`Pyramid
`:1
`
`f
`
`Pyramid
`level
`
`I
`
`Projection of
`the block and
`its esnma'ted
`motion vector
`at level I.
`
`Calculation of the MAD of
`the block at level I.
`
`
`
`FIGURE 11.9 The thresholding proceSs.
`
`|PR2018—01413
`
`Sony EX1008 Page 258
`
`IPR2018-01413
`Sony EX1008 Page 258
`
`

`

`Block Matching
`
`233
`
`
`
`FIGURE 11.111 The ltttlt frame of the “Train" sequence.
`
`‘ F
`
`IGURE [1.1] The 20th frame in the “Football" sequence.
`
`.‘h >"I—""":"'""_-_a'9' _...
`
`mmPared with the other two sequences. The 20th frame is shown in Figure 11.11. Table 11.1 is
`the list of implementing parameters used in the experiments. Tables 1 1.2 and l [.3 give the perfor~
`mance ofthe proposed algorithm compared with Method 1. In all three cases, the motion estimation
`has a half-pixel accuracy, the meaning of which will be explained in the next section. All perfor—
`mance measures listed there are averaged for the first 25 frames of the testing sequences.
`Each frame of the ”Miss America" sequence is of 360 x 288 pixels. For convenience, only the
`central

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket