throbber
-~----'---------------------------~·--··-·----·------- ·· -·
`
`1 Iref• Icum aBM• M
`
`Quasi-Newton minimization with
`translational motion model and
`4-smoothed images
`
`1 amin• MSEm;n
`
`Quasi-Newtein minimi:z:ation with
`affine motion model and
`2-smoothed images
`
`1 3min• MSEmin
`
`Quasi-Newton minimization with
`quadratic motion model and
`2-smoothed images
`
`1 Rmin• MSEmin
`
`Quasi-Newton minimization with
`aquadratic motion model and
`non-smoothed images
`
`lllQN· MSEQN
`
`Figure 19: Block diagram of Quasi-Newton minimization of the error function
`
`After each minimization, error function value (evaluated in non-smoothed image domain)
`is compared to the lowest value obtained so far. If the value has decreased the motion
`parameters are stored and the next minimization begins at this point in the parameter
`space. Otherwise the motion parameters of the last minimization are discarded and the
`next minimization is initialized by the parameters yielding the smallest value of the error
`function. In the first three minimizations bilinear interpolation of the subpixel values is
`applied to speed up the minimization. Cubic interpolation is then used only in the last
`minimization to guarantee maximum performance.
`
`After the termination of the Quasi-Newton minimizations error function value MSEQN is
`compared to the one after the minimizations of the Taylor expanded error functions
`MSETaylor and the motion parameters yielding smaller value are chosen for the motion
`parameters for this block.
`
`4.3 Merging
`
`Splitting and motion estimation results in a large number of blocks (usually) with good
`predictions. However many of these blocks could have been predicted with the same set
`of motion parameters without affecting the quality of the prediction too much_ To remove
`this redundancy, blocks with similar motion characteristics are merged together to form
`
`Motion Estimation and Representation for Video Coding Applications
`
`34
`
`Vedanti Systems Limited - Ex. 2007
`Page 41
`
`

`
`larger segments. Following subsections describe how to decide which blocks are to be
`merged and how to obtain motion parameters for the merged segment knowing the
`parameters for the original ones.
`
`4.3.1 Finding the final segmentation
`
`To find the optimal segmentation, Lagrangian cost
`
`L=D+AA
`p
`
`(50)
`
`of the image is to be minimized [Kar96]. In (50), the square error between the original and
`the decoded frame is denoted by D, the number of bits needed to code the frame is
`denoted by R and the number of pixels in the frame is denoted by P. A. is the Lagrangian
`multiplier specifying the cost of one bit with respect to the distortion measure. In order to
`find the minimum in L, all the possible combinations of segments should be checked. To
`avoid the computational complexity an approximative method is presented in this section.
`
`Let M...kl be the reduction in Lagrangian cost when two segments sk and sl are merged to
`form Sid. M...ld can be calculated by
`
`(51)
`
`The segment Sk can be coded in three different modes (i.e. m INTER, INTRA or
`UNCHANGED mode). The cost of the segment is assumed to be the minimum of the
`costs of these three modes, since it minimizes (50). Thus the Lagrangian costs for all the
`modes have to be evaluated
`
`In calculation of the INTER mode cost the number of bits for the motion information is
`needed. However this information will be available only after the merging and motion
`model adaptation. To overcome the problem a constant number of bits per motion
`parameter is assumed. By simulations a value of three bits per parameter is found to be a
`good approximation. Prediction error of the segment is coded with 8x8 DCTs similarly as
`in ITU's H.263 recommendation [ITU95]. Bits for the prediction error coding are added
`to the bits for motion information to yield R(Sk). D(Sk) is the squared error between the
`coded segment and the corresponding segment in the current image. These values are then
`placed to the equation (46) and LINlER(Sk) is evaluated.
`
`Calculations of the Lagrangian costs for INTRA and UNCHANGED mode are simpler.
`INTRA mode cost LINTRA(Sk) is obtained by coding the segment with the codec's intra
`coding method
`(in
`this
`implementation
`that
`is
`the one specified
`in H.263
`recommendation) and the cost for the UNCHANGED mode LuNCHANGEo(SJ is equal to
`
`Motion Estimation and Representation for Video Coding Applications
`
`35
`
`Vedanti Systems Limited - Ex. 2007
`Page 42
`
`

`
`I
`
`!
`
`i·
`
`I!I!IC="---------------~~~~------------~-~-~
`
`the· mean squared difference between the reference and the current image in the segment
`area. Lagrangian cost for the segment Sk is then
`
`(52)
`
`INTRA and UNCHANGED mode costs for combined segment Skl are calculated simply
`by adding costs for the same mode of the segments Sk and S1• To obtain Lagrangian cost
`for the INTER mode, joint motion parameters must be found, motion estimated prediction
`must be build and the prediction error must be coded.
`
`Merging procedure is initialized by building an adjacency graph of the segments. Each
`node in the graph is associated with the Lagrangian cost of the segment and each arc with
`the reduction in cost when adjacent segments are merged. Figure 20 shows an example of
`such a graph before and after segments Sk and S1 have been merged together.
`
`Adjacency graphs
`
`Segmentations
`
`S;
`
`I
`
`Figure 20: Adjacency graphs and segmentations before (up) and after (down) merging Sk and S1
`
`After building the adjacency graph and calculation of the initial L(Sk)s and M.kls, arc with
`the largest reduction in Lagrangian cost is chosen and the segments associated with it are
`merged. Adjacency graph is updated and the algorithm proceeds by choosing again the arc
`which yields the largest reduction in cost. This is iterated until there exists no longer a
`pair of segments which yields reduction in cost when merged. Algorithm is summarized
`in figure 21 and in the figure 22 there is shown resulting segmentation for the first frames
`of both Akiyo and earphone test sequences.
`
`Motion Estimation and Representation for Video Coding Applications
`
`36
`
`Vedanti Systems Limited - Ex. 2007
`Page 43
`
`

`
`r--------·---------.. -------···--·-
`
`--·' ~·
`
`f
`
`I,.r, Icum ak for each segment
`
`Building of the adjacency
`graph and calculation of the
`L(SJs and 8Lkls
`
`,.......-~ Find the arc with largest reduction
`in cost and merge segments
`
`Update the adjacency graph and
`find motion parameters and costs
`for the new segment pairs
`
`'-Y-e-s--1
`
`Check if there is segments which
`yield reduction in cost
`
`No
`
`Finished merging
`
`Final segmentation and ak for each segment
`
`Figure 21: Block diagram ofthe merging
`
`Aklyo
`
`Oarph<=~ne
`
`Figure 22: Resulting segmentations after merging for the first frames of Akiyo and Carp hone
`sequences
`
`The main complexity of the algorithm presented here comes from finding the motion
`parameters for segment pairs which are candidates for merging and from the calculation
`of the lNTER mode Lagrangian costs for these segment pairs. In the next section a low
`complexity method for finding the joint motion parameters for two segments is shown. To
`decrease the burden caused by the calculation of INTER mode costs, following
`assumptions were made. First it is assumed that if adjacent segments Sk and S1 are both in
`INTRA or in UNCHANCED mode (i.e. min{LINTER(S), LINTRA(S), LuNCHANGEn(S)} is
`either L!NTRA(S) or LuNCHANGEo(S)) the resulting segment Ski will be in the same mode and
`
`Motion Estimation and Representation for Video Coding Applications
`
`37
`
`Vedanti Systems Limited - Ex. 2007
`Page 44
`
`

`
`--------------------------·-e--·-•·•·--···-•·•--
`
`!·
`
`thus there is no need to calculate LINrnR(S~ci). It is also assumed that the lower the increase
`in prediction error due to merging the higher the reduction in Lagrangian cost will be. An
`estimate for the increase in prediction error is obtained from the calculation of the joint
`motion parameters as will be seen in the following section. This information is used to
`choose whether or not the Lagrangian cost for a segment pair Ski is to be calculated.
`
`Assume that S~ci is a segment which was formed by merging two segments S~c and S1. To
`update the adjacency graph, joint motion parameters for Ski and all its neighbours have to
`be calculated. Also the approximated increases in prediction error have to be obtained for
`all the segment pairs. Letting .6.Gmin be the smallest increase in prediction error, the
`INTER costs are calculated only for those segment pairs for which .ilG < T-AGmin·
`Otherwise the INTER mode Lagrangian cost is set to infinity. In the implementation a
`value of 3.0 is used for the multiplier T. With this value, on average less than a half of the
`segment pairs need to be examined and the best candidate for merging is found in over
`95% of the cases [Kar96].
`
`4.3.2 Finding the joint set of motion parameters for two segments
`
`Given two segments Sk and Sh the problem is to find a set of motion parameters
`minimizing the error function for the merged segment Ski = Sk u St. The approach is very
`similar to the one in Taylor expanded minimization of the error function. Again Taylor
`expansion of the reference image Iref is employed. The expansion is done around the
`points
`
`x' = x+dx(ak,x,y), y' = y+d/ak,x,y), (x,y)ESk
`x'=x+dx(a1,x,y), y'=y+dy(a,,x,y), (x,y)ES1
`
`{
`
`in image coordinate space. Using (35) and (36), the problem can be written as
`
`(53)
`
`(54)
`
`where ak1 is a set of joint motion parameters and the matrices Ek and E1 and vectors Yk and
`y1 are given by (38) and (39), respectively. Substituting both Ek and Et by their QR
`decompositions (54) becomes
`
`(55)
`
`Multiplying both sides of the equations above by the transposes (inverses) of the
`orthogonal Q matrices yields
`
`Motion Estimation and Representation for Video Coding Applications
`
`38
`
`.
`
`.
`
`'
`
`:1
`'
`
`,..,--.. -__ ·
`'ft
`·~ -~
`
`i I
`
`~
`
`Vedanti Systems Limited - Ex. 2007
`Page 45
`
`

`
`(56)
`
`Rk and R1 are both upper triangular matrices of size Pi*l2 (Pi being the number of pixels
`in segment Si) and thus having non-zero elements only in the first twelve rows. Letting
`Rkl, Rn, Zkt and Z11 be the first twelve rows of~' R1. Qk TYk and QtTYI respectively, (56)
`can be rewritten as a set of 24 equations and 12 unknowns (ald)
`
`or combining Rkl with R11 and ZkJ with z11
`
`(57)
`
`(58)
`
`Givens rotations can be applied to both [Rkt Rn]T and [Zkt zn]T to triangularize the first
`one [Pre92]. Solution of (58) becomes then a solution of a triangular system:
`
`(59)
`
`from where the joint motion parameters akl are obtained by simple backsubstitution. It
`should be noted that the only information needed from the original segments Sk and St is
`the matrices Rkl and Rn of size 12x12 and the vectors zkl and zn of length 12. For the
`initial segments of quadtree segmentation these matrices and vectors have to be build by
`the QR decomposition of the corresponding E matrices. For a merged segment these are
`obtained as a 'side product' (Rkn and zkll in equation 59) when finding the joint set of
`motion parameters. Conveniently Zk12 in (59) offers an estimate of the increase in
`prediction error when segments Sk and S1 are merged. Increase in squared error can be
`calculated as follows
`
`11G kl o= liz mll2
`
`4.3.3 Implementation issues
`
`ll
`
`o= L ( z kl2 ):
`
`i=O
`
`(60)
`
`Finding the joint motion parameters and Rktl and Zkll for each segment pair is an
`approximative operation. Every time segments are merged certain amount of interference
`is introduced. Thus it is important not to allow this intetference accumulate. To avoid the
`accumulation, after merging of two segments the prediction errors of the original
`
`Motion Estimation and Representation for Video Coding Applications
`
`39
`
`.
`
`. ~.'
`
`.,
`
`u
`
`Vedanti Systems Limited - Ex. 2007
`Page 46
`
`

`
`I I
`
`i j
`l ·
`
`!
`
`,;
`
`segments are compared to the one of the combined segment. If the increase in MSE is
`larger than a threshold (experimentally set to 7.0) addition motion estimation takes place.
`
`Additional motion estimation is similar to the one presented in section 4.2.2 with some
`simplifications. Initial motion parameters are set to the motion parameters obtained by
`solving (59). In Minimizations of Taylor expanded error functions -stage only the steps
`with non-smoothed images are performed. Quasi-Newton minimization of the error
`function takes place if the MSE of the segment is larger than 100.0 and only the last
`minimization described in section 4.2.2.4 is carried out. These simplifications are due to
`the fact that ' a fairly good estimate of motion parameters is already available. After
`additional motion estimation Rkn and ZJdt are no longer valid for the segment and they are
`recalculated by QR decomposition.
`
`Like the validity of the motion parameters also that of the QR decomposition is checked
`after merging two segments. If the actual increase in prediction MSE differs too much
`from the one estimated by the QR decomposition as given in (60), the QR decomposition
`is rebuilt. As a threshold a MSE difference of 5.0 is used in the implementation.
`
`4.4 Motion model adaptation and mode selection
`
`In the last stage of motion vector field coding the motion model is adapted to match the
`motion in each segment separately. Motion parameters are removed from the model one
`by one and the parameters yielding the minimum in the INTER mode Lagrangian cost are
`set
`to
`the motion parameters of
`the
`segment. Finally
`the coding mode
`(INTER/INTRA/UNCHANGED) of the segment is chosen according to the Lagrangian
`costs of the modes.
`
`4.4.1 Orthonormalization
`
`Before the actual motion model adaptation, motion parameters ak of each segment are
`transformed from quadratic motion model to the orthonormalized ones iik. Not only the
`motion parameters but also the Rk1 and zkl matrices obtained from the merging procedure
`can be transformed as follows
`
`Ru =RkiT
`zkl = zkl
`
`(61)
`
`where - is used to denote matrices and vectors in the orthonormal basis. Transform matrix
`T is calculated by
`
`Motion Estimation and Representation for Video Coding Applications
`
`40
`
`.
`
`~
`
`Vedanti Systems Limited - Ex. 2007
`Page 47
`
`

`
`T=[~ ~J
`
`(62)
`
`and
`
`Tl == •
`
`(63)
`
`aJ,ol3o.o
`ao.oP1.o
`CXo,oPo,o
`ao.ofiz.o
`CXl,Ofii.O
`az.of3o.o
`ai.JJo,o
`iiufi1.o az.J3o.o
`0
`0
`0
`ao,oPJ.l
`0
`0
`iio,o/32,1
`0
`al.ofil.l
`0
`0
`0
`0
`0
`al.lfil.l
`0
`0
`0
`0
`0
`0:2,2/30,0
`0
`0
`0
`0
`ao.oPz.z
`0
`where a .. 's and jJ .. 's are given by ( 17). The characteristic equation of segment Sk with
`p=I2 basis functions becomes
`
`l,)
`
`1,)
`
`(64)
`
`4.4.2 Removal of a basis function from the motion model
`
`Removing a basis function from the model is equivalent to removing a column from Rk1
`and an element from at. Let there be initially p basis functions, out of which the i'th one
`is to be removed. ( 64) can be rewritten for p~ 1 basis functions as follows
`
`where Rf;1 is equal to Rf1 with i'th column removed and at1 is the new set of p-1
`motion parameters (unknown at this point). System in (65) can be triangularized by e.g.
`Givens rotations to yield
`
`(65)
`
`(66)
`
`which can be solved by backsubstitution (Rr1 being a triangular matrix). The increase in
`square prediction error when i'th basis function is removed is available without even
`solving (66) by just calculating i.
`
`4.4.3 Finding the best motion model and selecting the coding mode
`
`It is assumed that the basis function which yields the least increase in prediction error
`when removed, is the least significant one and thus the first one to be removed from the
`model. This increase can be easily calculated for all the basis functions as seen in the
`
`Motion Estimation and Representation for Video Coding Applications
`
`41
`
`Vedanti Systems Limited - Ex. 2007
`Page 48
`
`

`
`r
`
`I ,,
`
`previous section. Starting from the least important one, basis functions are removed one
`by one from the model. After each removal Lagrangian cost for the INTER mode is
`evaluated similarly as in the case of merging (section 4.3.1). Only difference being that
`now the number of bits for motion parameters does not need to be approximated, but they
`can be calculated by coding the parameters. After every basis function has been removed
`the set of basis functions yielding the minimum in Lagrangian cost forms the motion
`model of the s~gment. Finally Lagrangian costs for INTER, INTRA and UNCHANGED
`modes are compared and the coding mode is chosen to be the one of smaJlest cost. The
`algorithm is shown in figure 23.
`
`1 Iref> lcom Rkt, Zkl
`Transformation to orthogonalized I
`motion model 1 jill -12
`
`.tl, zkt
`
`Calculating the INTER Lagrangian
`cost for the full motion model
`
`1 jil2
`
`-12
`J:l' Z.c;J
`
`Removing the least significant
`remaining basis function
`
`1 jip-1
`
`kl
`
`-p-1
`., Z.u
`
`Iterate until p = 0
`
`Evaluating the INTER mode cost
`and saving the result if it is the
`smallest so far
`_ _L minimum LINTER· afm;D
`
`Selection of the coding mode
`
`1 selection information, ak
`
`Finished
`
`Ftgure 23: Block dtagram of fmdmg the mot10n model for a segment
`
`4.4.4 Coding of the motion information
`
`In the last stage of coding the motion parameters of the segments in INTER mode are
`quantized. A uniform scalar quantization is performed for all the remaining motion
`parameters. Before the quantization motion parameters are normalized with respect to the
`ratio of pixels in the segment and in the bounding box of the segment. This is done to
`prevent a small segment with a large bounding box to take as many bits as a large segment
`with similar motion and the same size of the bounding box. The quantized parameters are
`then obtained from the non-quantized ones by
`
`Motion Estimation and Representation for Video Coding Applications
`
`42
`
`1
`~
`
`• •
`•
`
`-.-
`•
`
`, •• 4 .
`'
`
`I
`
`Vedanti Systems Limited - Ex. 2007
`Page 49
`
`

`
`(67)
`
`where the number of pixels in the segment Sk is denoted by Pk. number of pixels in the
`bounding box, by pkbox and the uniform quantizer step by Qstep· In the implementation a
`step size of 3.0 is used.
`
`To inform the decoder which basis functions have been removed from the motion model,
`two six bit headers are constructed. Each bit in the first header indicates whether or not a
`basis function j(x,y), i = 0, ... , 5 is removed and the bits in the second header is used to
`indicate whether or not ~(x,y), j = 6, ... , 11 is removed. These headers are separately
`VLC coded with a Huffman code. Finally also the quantized motion parameters. having
`nonzero values are Huffman coded and put to the bitstream.
`
`4.5 Coding results
`
`The motion estimation scheme is combined with a Multi-Transform Prediction Error
`Coding technique as described in [Kar96J. and the coding results are compared against
`those of the state-of-the-art low bitrate coding standard H.263 with 'Advanced Prediction
`Mode' and 'Unrestricted Motion Vectors'. The results for both schemes are shown in the
`table 1 for a number of test sequences. Both QCIF (176x144 pixels) and CIF (352x288
`pixels) sequences are included in the test material.
`
`Table 1; Comparison of Nokia codec and H.263 codec with equal bitrates
`
`Picture
`size
`
`Nokia codec
`PSNR(Y/C)
`Bitrate
`[dB]
`[kbps]
`
`H.263
`Bitrate
`PSNR(Y/C)
`[dB]
`[kbps]
`
`Improvement in
`PSNR(Y/C)
`[dB]
`
`QCIF
`
`QCIF
`
`QCIF
`
`QCIF
`
`QCIF
`
`QCIF
`
`QCIF
`
`QCIF
`
`CIF
`
`QCIF
`
`QCIF
`
`CIF
`
`CIF
`
`10.17
`9.98
`
`9.77
`
`9.93
`
`21.36
`
`23.97
`
`23.39
`
`24.39
`
`47.91
`46.49
`
`49.76
`
`111.96
`
`107.10
`
`33.9/40.3
`31.7 I 38.1
`
`32.0/37.6
`
`37.4/40.8
`
`36.5/42.2
`
`32.7/36.9
`
`35.3/40.5
`
`32.6/37.1
`
`33.0/37.4
`
`33.9/39.0
`
`30.2/41.3
`
`35.9/39.2
`
`31.5/37.5
`
`10.31
`10.27
`
`10.12
`
`10.31
`
`21.28
`
`24.32
`
`23.66
`
`24.84
`
`48.08
`
`47.16
`
`49.70
`
`108.35
`
`110.18
`
`33.3/39.2
`31.2/38.0
`
`30.3/36.8
`
`35.6/39.8
`
`35.4140.9
`
`32.1136.8
`
`33.6/39.2
`
`31.2136.1
`
`31.9/36.9
`
`32.3/38.0
`
`29.4/40.5
`
`34.1/38A
`
`30.5/36.8
`
`0.6/1.1
`0.5 I 0.1
`
`1.7/0.9
`
`1.8/1.0
`
`1.1/1.3
`
`0.6/0.1
`
`1.7/1.3
`
`1.4/1.0
`
`1.1/0.5
`
`1.6/1.0
`
`0.8/0.8
`
`1.8/0.8
`
`1.0/0.7
`
`43
`
`Sequence
`
`Mother&Dtr
`
`Hall Objects
`
`Container
`
`Akiyo
`
`Mother&Dtr
`
`Silent Voice
`
`Container
`
`News
`
`News
`
`Foreman
`
`Coastguard
`
`News
`
`Foreman
`
`Motion Estimation and Representation for Video Coding Applications
`
`Vedanti Systems Limited - Ex. 2007
`Page 50
`
`

`
`----~--------------~---------~--···--
`
`To compare the results, bitrates are made equal by assigning the quantizer parameter for
`the error coding schemes. The quality of the coded video signal is measured with the
`average peak signal to noise ratio (PSNR) calculated as
`
`2552
`PSNR=lOlog-(cid:173)
`MSE
`
`(68)
`
`where MSE is the mean square difference between the original and the coded frame. The
`chrominance PSNR values in the table 1 are averages of the PSNRs . of both the
`chrominance components. The frame rates in simulations with bitrates from 10 kbps to 48
`kpbs vary from 7.5 fps to 10 fps and the simulations with the bitrate of 112 kbps were run
`on 15 fps.
`
`As seen from the table 1, the Nokia codec reaches PSNRs from 0.5 to 1.8 dB higher than
`the H.263 codec. Most of the improvements are due to the motion modelling and
`estimation scheme presented in this thesis, since with the same DCT based prediction
`error coding scheme used in the H.263 the Nokia codec achieves improvements between
`0.3 and 1.5 dB over the H.263. To obtain the corresponding reduction in bitrate another
`set of results are shown in the table 2 below.
`Table 2: Comparison between Nokia codec and H.263 with nearly equal PSNRs
`
`Sequ~nce Picture
`size
`
`Noki~ codec
`Bitrate
`PSNR(Y/C)
`[dB]
`[kbps]
`
`H.263
`PSNR(Y/C)
`[dB]
`
`Bitrate
`[kbps]
`
`Bitrate
`reduction
`[%]
`
`PSNR (Y/C)
`difference
`[dB]
`
`Akiyo
`
`Container
`
`Mthr&Dtr
`
`Mthr&Dtr
`
`Foreman
`
`News
`
`QCIF
`
`QCIF
`
`QCIF
`
`QCIF
`
`QCIF
`
`CIF
`
`9.93
`9.77
`
`10.17
`
`21.36
`
`46.49
`
`37.4/40.8
`32.0/37.6
`
`33.9/40.3
`
`36.5/42.2
`
`33.9/39.0
`
`18.06
`17.46
`
`11.96
`
`26.96
`
`61.76
`
`111.96
`
`35.9/39.2
`
`156.66
`
`37.4 I 40.4
`32.0/37.6
`
`33.8/39.5
`
`36.1/41.3
`
`33.5/38.8
`
`35.5/39.1
`
`45
`44
`IS
`21
`
`25
`
`29
`
`0.0/0.4
`0.0/0.0
`
`0.1/0.8
`
`0.4/0.9
`
`0.4/0.2
`
`0.410.1
`
`Now the quantizer parameters are chosen in a way that Nokia codec meets certain bitrate
`requirements (first three results are with a bitrate of approximately 10 kbps, then 24 kbps,
`48 kbps and finally 112 kbps). H.263 codec is set to yield PSNRs which does not exceed
`the PSNRs of the Nokia codec. Comparing the bitrates it can be noticed that with the
`motion estimation algorithm presented in this chapter and a novel prediction error coding
`scheme reductions of 15-45% in bitrate can be obtained while keeping the objective
`quality the same or better. In appendix A there can be found examples of images coded
`with both schemes together with prediction images resulted from motion estimations with
`both translational and quadratic polynomial motion models.
`
`Motion Estimation and Representation for Video Coding Applications
`
`44
`
`Vedanti Systems Limited - Ex. 2007
`Page 51
`
`

`
`---- - - - - - - - -
`
`..
`(;
`
`5. Conclusions
`
`In this thesis a very sophisticated algorithm for motion estimation was introduced. The
`full quadratic motion model was chosen as the basis of the motion model. This allows the
`system to compensate a number of different types of movements in the image plane. To
`make the motion parameters more robust to the quantization the basis functions of the
`motion model are orthonormalized with respect to the bounding boxes of the segments.
`The problem with the larger number of motion parameters compared to the state-of-the-art
`solutions is handled by keeping only the parameters yielding the optimal performance in
`rate-distortion sense.
`
`Not only the motion model but also the segmentation of the images differs from the ones
`used in the current video coding standards. The image to be coded is not segmented in
`square blocks of constant size, but the segments consists of smaller blocks which are
`merged together to form larger segments with different motion characteristics. Also the
`segmentation is done in the rate-distortion sense to achieve the maximum in performance.
`
`The motion estimation algorithm developed for this scheme is very robust to local minima
`in the error function . It can also adapt its behaviour to the complexity of the function to be
`minimized. It observes the convergence rate and according to this makes decisions how to
`proceed with the minimization.
`
`Motion Estimation and Representation for Video Coding Applications
`
`45
`
`Vedanti Systems Limited - Ex. 2007
`Page 52
`
`

`
`The experiments confirm that the motion modelling and estimation scheme described in
`this thesis yields significant improvements in the image quality or corresponding
`reductions in the bandwidth compared to the current standards. However the complexity
`of the encoding process limits the possible applications of this scheme with today' s
`technology. This is evident especially in high bitrate applications where the images are
`large and the frame rate is high. As such the algorithm is most. suitable for low bitrate
`applications. They are characterized by smaller picture size and lower frame rate. Smaller
`picture size makes the algorithm faster and in the lower frame rate full potential of the
`more complex motion model can be utilized (since movements need to be modelled over
`longer periods of time).
`
`Unlike the encoder, the decoder can be implemented with low cost making the scheme
`very attractive when it comes to asymmetric applications with not so tight real-time
`requirements. Such applications
`include electronic mail (video as attachments),
`broadcasting, different retrieval applications (e.g. world wide web) and video archives.
`'
`.
`
`Motion Estimation and Representation for Video Coding Applications
`
`46
`
`--
`
`~-
`
`. .,..._
`
`-
`
`Vedanti Systems Limited - Ex. 2007
`Page 53
`
`

`
`r---------
`
`,
`t _ Bibliography
`
`[IS093] ISOIIEC 11172-2, "Coding of Moving Pictures and Associated Audio for Digital
`Storage Media up to about 1.5 Mbits/s - Part 2: Video", International Organization for
`Standardization (ISO), 1993.
`
`[Sun96] World Wide Web site of the Sun Microsystems Computer Corporation, URL:
`http://www.sun.com/sparc/whitepapers/UltraSPARCtechnology/, October 1996
`
`[ITU95] ITU-T Draft Recommendation H.263, "Video Coding for Low Bitrate
`Communication", December 1995.
`
`[IS096] ISOIIEC 13818-2, "Generic Coding of Moving Pictures Associated Audio
`Information:'Video", International Organization for Standardization (ISO), 1996.
`
`[Dug95] Jean-Luc Dugelay, Henri Sanson, "Differential methods for the identification of
`2D and 3D motion models
`in
`image sequences", Signal Processing: hnage
`Communication, Vol.7, No. 1, March 1995, pp 105-127.
`
`[Wol90] G. Wolberg, "Digital Image Warping", New York, IEEE Computer Society,
`1990, pp. 67-70.
`
`[Ger95] A. Gersho, R. Gray, "Vector Quantization and Signal Compression", Kluwer
`Academic Publishers, Norwell (MA), 1995, pp 235-240.
`
`[Abr72] M. Abramowitz, I. A. Stegun, "Handbood of Mathematical Functions with
`Formulas, Graphs and Mathematical Tables", New York, NY, Dover 1972
`
`[Cic94] P. Cicconi, H. Nicolas, "Efficient Region-based Motion Estimation and
`Symmetry Oriented Segmentation for Image Sequence Coding", IEEE Transactions on
`Circuits and Systems for Video Technology, Vol. 4, No.3, June 1994, pp. 357-364.
`
`[Kar95] M. Karczewicz, J. Nieweglowski, P. Haavisto, "Motion Estimation and
`Representation of Arbitrarily Shaped Image Regions", The Proceedings of IEEE
`International Conference ofimage Processing, Vol. 2, October 1995, pp. 197-201.
`
`Motion Estimation and R~presentation for Video Coding Applications
`
`47
`
`·..
`
`.
`
`J
`
`Vedanti Systems Limited - Ex. 2007
`Page 54
`
`

`
`[Die91] N. Diehl, "Object-oriented Motion Estimation and Segmentation of Image
`Sequences", Signal Processing: Image Comminication, Vol. 3, No 1, February 1991, pp.
`23-56.
`
`[Pre92] W. H. Press, S. A. Teukolsky, W. T. Vetterling, B. P. Flannery, "Numerical
`Recipes in C", Second Edition, Cambridge University Press, Cambridge UK, 1992
`
`[Fie95] R. Fletcher, "Practical Methods of Optimization", Second Edition, John Wiley &
`Sons, Chichester UK, 1995
`
`[Wu90] S. F. Wu, J. Kittler, "A Differential Method for Simultaneous Estimation of
`Rotation, Change of Scale and Translation", Signal Processing: Image Communication,
`Vol. 2, No. 1, May 1990, pp. 69-80.
`
`[Kar96] M. Karczewicz, J. Nieweglowski, J. Lainema, 0. Kaleva, "Video Coding Using
`Motion Compensation with Polynomial Motion Vector Fields", The Proceedings of First
`International Workshop on Wireless ImageNideo Communications, Loughborough
`University, Loughborough, UK, September 1996, pp. 26-31.
`
`Motion Estimation and Representation for Video Coding Applications
`
`48
`
`Vedanti Systems Limited - Ex. 2007
`Page 55
`
`

`
`--------------------------------~_.... -··
`
`l I,
`
`<
`
`(
`I
`
`'
`
`Appendix A
`
`Examples of coded images
`
`Motion Estimation and Representation for Video Coding Applications
`
`I
`
`I.
`i
`
`. . tl
`
`Vedanti Systems Limited - Ex. 2007
`Page 56
`
`

`
`p---------------------------~...._:am=F""""'"'"·------- -~...___..,"""'-'~
`
`1 :·
`
`I I I I
`
`a)
`
`c)
`
`d)
`
`e)
`
`f)
`
`FigureA1:
`a) Reference image (coded with H.263 intra coder)
`b) Current image (not coded)
`c) Prediction image with translational motion model
`d) Coded image with H.263 (2752 bits Ri 22.93 kbps)
`c) Prediction image with quadratic polynomial motion model
`d) Coded image with Nokia codec (2214 bits R~ 18.44 kbps)
`
`Motion Estimation and Representation for Video Coding Applications
`
`II
`
`Vedanti Systems Limited - Ex. 2007
`Page 57
`
`

`
`~------------------... -..... -Iii ........ "".,..,, •. "".-.... ·'""---···-----=.--· -~- --·
`
`a)
`
`c)
`
`e)
`
`l
`I
`
`b)
`
`d)
`
`f)
`
`Figure A2:
`a) Reference image (coded with H.263 intra coder)
`b) Current image (not coded)
`c) Prediction image with translational motion model
`d) Coded image with H.263 (2384 bits jO:i 23.84 kbps)
`c) Prediction image with quadratic polynomial motion model
`d) Coded image with Nokia codec (1764 bits jO:i 17.64 kbps)
`
`Motion Estimation and Representation for Video Coding Applications
`
`ill
`
`-
`
`-
`
`_;j~_. '
`
`.: ..... . .
`
`. ... •
`
`...
`
`• -
`
`.
`
`Vedanti Systems Limited - Ex. 2007
`Page 58
`
`

`
`.~
`
`a)
`
`c)
`
`e)
`
`b)
`
`d)
`
`t)
`
`Figure A3 :
`a) Reference image (coded with H.263 intra coder)
`b) Current image (not coded)
`c) Prediction image with translational motion model
`d) Coded image with H.263 (3032 bits~ 25.27 kbps)
`c) Prediction image with quadratic polynomial motion model
`d) Coded image with Nokia codec (25J9 bits~ 20.98 kbps)
`
`Motion Estimation and Representation for Video Coding Applications
`
`IV
`

`
`-
`

`
`-~
`

`
`It
`
`Vedanti Systems Limited - Ex. 2007
`Page 59

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket