US007532808B2

(12) United States Patent                         (10) Patent No.:     US 7,532,808 B2
     Lainema                                      (45) Date of Patent:      May 12, 2009

(54) METHOD FOR CODING MOTION IN A VIDEO SEQUENCE

(75) Inventor:  Jani Lainema, Irving, TX (US)

(73) Assignee:  Nokia Corporation, Espoo (FI)

( * ) Notice:   Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1003 days.
OTHER PUBLICATIONS

Shijun Sun et al.; "Global Motion Vector Coding (GMVC)"; ITU—Telecommunications Standardization Sector, Video Coding Experts Group (VCEG); Meeting: Pattaya, Thailand, Dec. 4-7, 2001; pp. 1-6.

"Joint Model Number 1 (JM-1)"; Doc. JVT-A003; Joint Video Team of ISO/IEC and ITU-T VCEG; Jan. 2002; pp. 1-79.

L. Hongmei et al.; "An Improved Multiresolution Motion Estimation Algorithm"; Acta of Zhongshan University, vol. 40, No. 2; pp. 34-37; Mar. 2001.

S. Sun et al.; "Motion Vector Coding with Global Motion Parameters"; ITU Telecommunications Standardization Sector, Doc. VCEG-N77; pp. 1-11; Fourteenth Meeting: Santa Barbara, CA, USA, Sep. 24-28, 2001.

(Continued)
Primary Examiner—Huy T. Nguyen
(74) Attorney, Agent, or Firm—Ware, Fressola, Van Der Sluys & Adolphson, LLP

(57)                         ABSTRACT
A method of motion-compensated video encoding that enables a video sequence with a global motion component to be encoded in an efficient manner. A video encoder is arranged to assign macroblocks to be coded to specific coding modes, including a skip mode, which is used to indicate one of two possible types of macroblock motion: a) zero motion, or b) global or regional motion. As each macroblock is encoded, a previously encoded region surrounding the macroblock is examined and the characteristics of motion in that region determined. With the skip mode, a motion vector describing the global or regional motion is associated with the macroblock to be coded if the motion in the region is characteristic of global or regional motion. If the region exhibits an insignificant level of motion, a zero-valued motion vector is associated with the macroblock.

65 Claims, 10 Drawing Sheets
(21) Appl. No.: 10/390,549

(22) Filed:     Mar. 14, 2003

(65)                 Prior Publication Data
     US 2003/0202594 A1          Oct. 30, 2003

          Related U.S. Application Data

(60) Provisional application No. 60/365,072, filed on Mar. 15, 2002.
(51) Int. Cl.
     H04N 5/91            (2006.01)

(52) U.S. Cl. ......................... 386/111; 386/112

(58) Field of Classification Search .......... 386/68, 386/111, 112, 95; 348/466, 699; 375/240.15
     See application file for complete search history.

(56)                 References Cited

     U.S. PATENT DOCUMENTS
5,148,272 A      9/1992   Acampora et al. .......... 358/133
5,191,436 A      3/1993   Yonemitsu ................ 358/335
5,442,400 A      8/1995   Sun et al. ............... 348/402
5,701,164 A     12/1997   Kato ..................... 348/699
6,683,987 B1*    1/2004   Sugahara ................. 382/235
7,200,275 B2*    4/2007   Srinivasan et al. ........ 382/239
[Representative drawing (front page): flow blocks "Analyze Surrounding Motion", "Generate Active Motion Parameters", "Generate Zero-Motion Parameters" and "Motion Compensation" (reference numerals 630, 640, 650; input 128)]
OTHER PUBLICATIONS (Continued)

S. Sun et al.; "Core Experiment Description: Motion Vector Coding with Global Motion Parameters"; ITU Telecommunications Standardization Sector, Doc. VCEG-N16; pp. 1-6; Fourteenth Meeting: Santa Barbara, CA, USA, Sep. 24-28, 2001.

Joint Photography Expert Group Conference, Crowborough JPEG Forum Ltd, GB, Specialists Group on Coding for Visual Telephony; "Description of Ref. Model 8 (RM8)"; pp. 1-72; Jun. 9, 1989.

* cited by examiner
[Drawing Sheets 1-10 (figures not reproduced; recoverable captions listed below)]

Sheet 1 of 10:  FIG. 1 (PRIOR ART) — block diagram of a generic video encoder 100 (coding control, transform, quantizer, inverse quantizer, inverse transform, frame store, motion estimation, motion field coding, motion-compensated prediction, switches #1 and #2, INTRA/INTER trigger).
Sheet 2 of 10:  FIG. 2 (PRIOR ART) — block diagram of a corresponding video decoder 200 (inverse quantizer, inverse transform, motion-compensated prediction, coding control; video out).
Sheet 3 of 10:  FIG. 3 (PRIOR ART) — formation of macroblocks; chrominance components sub-sampled by a factor of two.
Sheet 4 of 10:  FIG. 4 — block sizes: 16x16, 16x8, 8x8, 4x8, 4x4.
Sheet 5 of 10:  FIG. 5.
Sheet 6 of 10:  FIG. 6 — block diagram of video encoder 600 according to the invention.
Sheet 7 of 10:  FIG. 7 — block diagram of video decoder 700 according to the invention.
Sheet 8 of 10:  FIG. 8 — motion information memory (630): analyze surrounding motion; generate active motion parameters (640, 650) or zero-motion (non-active) parameters; motion compensation.
Sheet 9 of 10:  FIG. 9.
Sheet 10 of 10: FIG. 10 — multimedia terminal with codec, audio equipment, telematic equipment, system control and PSTN connection.
METHOD FOR CODING MOTION IN A VIDEO SEQUENCE

This application claims the benefit of U.S. Provisional Application No. 60/365,072 filed Mar. 15, 2002.
FIELD OF THE INVENTION

The invention relates generally to communication systems and more particularly to motion compensation in video coding.

BACKGROUND OF THE INVENTION
A digital video sequence, like an ordinary motion picture recorded on film, comprises a sequence of still images, the illusion of motion being created by displaying consecutive images of the sequence one after the other at a relatively fast rate, typically 15 to 30 frames per second. Because of the relatively fast frame display rate, images in consecutive frames tend to be quite similar and thus contain a considerable amount of redundant information. For example, a typical scene may comprise some stationary elements, such as background scenery, and some moving areas, which may take many different forms, for example the face of a newsreader, moving traffic and so on. Alternatively, or additionally, so-called "global motion" may be present in the video sequence, for example due to translation, panning or zooming of the camera recording the scene. However, in many cases, the overall change between one video frame and the next is rather small.
Each frame of an uncompressed digital video sequence comprises an array of image pixels. For example, in a commonly used digital video format, known as the Quarter Common Interchange Format (QCIF), a frame comprises an array of 176x144 pixels, in which case each frame has 25,344 pixels. In turn, each pixel is represented by a certain number of bits, which carry information about the luminance and/or color content of the region of the image corresponding to the pixel. Commonly, a so-called YUV color model is used to represent the luminance and chrominance content of the image. The luminance, or Y, component represents the intensity (brightness) of the image, while the color content of the image is represented by two chrominance or color difference components, labelled U and V.
Color models based on a luminance/chrominance representation of image content provide certain advantages compared with color models that are based on a representation involving primary colors (that is Red, Green and Blue, RGB). The human visual system is more sensitive to intensity variations than it is to color variations, and YUV color models exploit this property by using a lower spatial resolution for the chrominance components (U, V) than for the luminance component (Y). In this way, the amount of information needed to code the color information in an image can be reduced with an acceptable reduction in image quality.
The lower spatial resolution of the chrominance components is usually attained by spatial sub-sampling. Typically, each frame of a video sequence is divided into so-called "macroblocks", which comprise luminance (Y) information and associated (spatially sub-sampled) chrominance (U, V) information. FIG. 3 illustrates one way in which macroblocks can be formed. FIG. 3a shows a frame of a video sequence represented using a YUV color model, each component having the same spatial resolution. Macroblocks are formed by representing a region of 16x16 image pixels in the original image (FIG. 3b) as four blocks of luminance information, each luminance block comprising an 8x8 array of luminance (Y) values, and two spatially corresponding chrominance components (U and V) which are sub-sampled by a factor of two in the horizontal and vertical directions to yield corresponding arrays of 8x8 chrominance (U, V) values (see FIG. 3c).
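As an illustration of the macroblock structure just described, the following minimal Python sketch (not part of the patent text) extracts four 8x8 luminance blocks and 2:1 sub-sampled chrominance blocks from full-resolution Y, U and V arrays. The function name and the 2x2 averaging filter are assumptions; the text does not prescribe a particular sub-sampling filter.

```python
import numpy as np

def form_macroblock(y, u, v, top, left):
    """Extract one macroblock: four 8x8 luminance blocks plus
    2:1-subsampled 8x8 U and V blocks (cf. FIG. 3).

    y, u, v: full-resolution component arrays of equal size.
    top, left: coordinates of the macroblock's upper-left corner.
    """
    y_region = y[top:top + 16, left:left + 16]
    # Four 8x8 luminance blocks in raster order.
    y_blocks = [y_region[r:r + 8, c:c + 8]
                for r in (0, 8) for c in (0, 8)]

    def subsample(comp):
        # Sub-sample by 2 in each direction; here by averaging
        # each 2x2 neighbourhood (one possible filter).
        region = comp[top:top + 16, left:left + 16].astype(np.float64)
        return (region[0::2, 0::2] + region[1::2, 0::2] +
                region[0::2, 1::2] + region[1::2, 1::2]) / 4.0

    return y_blocks, subsample(u), subsample(v)
```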
A QCIF image comprises 11x9 macroblocks. If the luminance blocks and chrominance blocks are represented with 8 bit resolution (that is, by numbers in the range 0 to 255), the total number of bits required per macroblock is (16x16x8)+2x(8x8x8)=3072 bits. The number of bits needed to represent a video frame in QCIF format is thus 99x3072=304,128 bits. This means that the amount of data required to transmit/record/display an uncompressed video sequence in QCIF format, represented using a YUV color model, at a rate of 30 frames per second, is more than 9 Mbps (million bits per second). This is an extremely high data rate and is impractical for use in video recording, transmission and display applications because of the very large storage capacity, transmission channel capacity and hardware performance required.
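The arithmetic above can be checked directly; this short Python fragment simply reproduces the figures quoted in the text.

```python
# Bit-count arithmetic for uncompressed QCIF video (as quoted above).
Y_BITS = 16 * 16 * 8          # one 16x16 luminance array at 8 bits/sample
C_BITS = 2 * (8 * 8 * 8)      # two sub-sampled 8x8 chrominance arrays
bits_per_macroblock = Y_BITS + C_BITS          # 3072
bits_per_frame = 11 * 9 * bits_per_macroblock  # 99 macroblocks -> 304,128
bits_per_second = 30 * bits_per_frame          # 9,123,840 (> 9 Mbps)
print(bits_per_macroblock, bits_per_frame, bits_per_second)
```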
If video data is to be transmitted in real-time over a fixed-line network such as an ISDN (Integrated Services Digital Network) or a conventional PSTN (Public Switched Telephone Network), the available data transmission bandwidth is typically of the order of 64 kbits/s. In mobile videotelephony, where transmission takes place at least in part over a radio communications link, the available bandwidth can be as low as 20 kbits/s. This means that a significant reduction in the amount of information used to represent video data must be achieved in order to enable transmission of digital video sequences over low bandwidth communication networks. For this reason, video compression techniques have been developed which reduce the amount of information transmitted while retaining an acceptable image quality.
Video compression methods are based on reducing the redundant and perceptually irrelevant parts of video sequences. The redundancy in video sequences can be categorised into spatial, temporal and spectral redundancy. "Spatial redundancy" is the term used to describe the correlation (similarity) between neighbouring pixels within a frame. The term "temporal redundancy" expresses the fact that objects appearing in one frame of a sequence are likely to appear in subsequent frames, while "spectral redundancy" refers to the correlation between different color components of the same image.

Sufficiently efficient compression cannot usually be achieved by simply reducing the various forms of redundancy in a given sequence of images. Thus, most current video encoders also reduce the quality of those parts of the video sequence which are subjectively the least important. In addition, the redundancy of the compressed video bit-stream itself is reduced by means of efficient loss-less encoding. Generally, this is achieved using a technique known as entropy coding.
There is often a significant amount of spatial redundancy between the pixels that make up each frame of a digital video sequence. In other words, the value of any pixel within a frame of the sequence is substantially the same as the value of other pixels in its immediate vicinity. Typically, video coding systems reduce spatial redundancy using a technique known as "block-based transform coding", in which a mathematical transformation, such as a two-dimensional Discrete Cosine Transform (DCT), is applied to blocks of image pixels. This transforms the image data from a representation comprising pixel values to a form comprising a set of coefficient values representative of spatial frequency components, significantly reducing spatial redundancy and thereby producing a more compact representation of the image data.
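To make the transform step concrete, here is a self-contained Python sketch of an orthonormal 8x8 two-dimensional DCT of the kind referred to above. The matrix formulation is a standard construction, not code from the patent.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix C, so that Y = C @ X @ C.T
    is the 2-D DCT of an n x n block X."""
    k = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(
        np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)   # DC row gets the smaller scale factor
    return c

C = dct_matrix()
block = np.random.randint(0, 256, (8, 8)).astype(np.float64)
coeffs = C @ block @ C.T    # forward 2-D DCT: energy compacts into few coefficients
recon = C.T @ coeffs @ C    # inverse transform recovers the block exactly
assert np.allclose(block, recon)
```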
Frames of a video sequence which are compressed using block-based transform coding, without reference to any other frame within the sequence, are referred to as INTRA-coded or I-frames. Additionally, and where possible, blocks of INTRA-coded frames are predicted from previously coded blocks within the same frame. This technique, known as INTRA-prediction, has the effect of further reducing the amount of data required to represent an INTRA-coded frame.
Generally, video coding systems not only reduce the spatial redundancy within individual frames of a video sequence, but also make use of a technique known as "motion-compensated prediction" to reduce the temporal redundancy in the sequence. Using motion-compensated prediction, the image content of some (often many) frames in a digital video sequence is "predicted" from one or more other frames in the sequence, known as "reference" frames. Prediction of image content is achieved by tracking the motion of objects or regions of an image between a frame to be coded (compressed) and the reference frame(s) using "motion vectors". In general, the reference frame(s) may precede the frame to be coded or may follow it in the video sequence. As in the case of INTRA-coding, motion compensated prediction of a video frame is typically performed macroblock-by-macroblock.
Frames of a video sequence which are compressed using motion-compensated prediction are generally referred to as INTER-coded or P-frames. Motion-compensated prediction alone rarely provides a sufficiently precise representation of the image content of a video frame and therefore it is typically necessary to provide a so-called "prediction error" (PE) frame with each INTER-coded frame. The prediction error frame represents the difference between a decoded version of the INTER-coded frame and the image content of the frame to be coded. More specifically, the prediction error frame comprises values that represent the difference between pixel values in the frame to be coded and corresponding reconstructed pixel values formed on the basis of a predicted version of the frame in question. Consequently, the prediction error frame has characteristics similar to a still image and block-based transform coding can be applied in order to reduce its spatial redundancy and hence the amount of data (number of bits) required to represent it.
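The relationship between the frame to be coded, its motion-compensated prediction and the prediction error frame can be summarised in a few lines of Python. This is an illustrative sketch only; clipping the reconstruction to the 8-bit range is an assumption about the reconstruction stage.

```python
import numpy as np

def prediction_error(current, prediction):
    # Differences are signed, so widen from uint8 before subtracting.
    return current.astype(np.int16) - prediction.astype(np.int16)

def reconstruct(prediction, decoded_error):
    # The decoder adds the (transform-coded, hence approximate) error
    # back onto its own motion-compensated prediction.
    return np.clip(prediction.astype(np.int16) + decoded_error,
                   0, 255).astype(np.uint8)
```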
In order to illustrate the operation of a generic video coding system in greater detail, reference will now be made to the exemplary video encoder and video decoder illustrated in FIGS. 1 and 2 of the accompanying drawings. The video encoder 100 of FIG. 1 employs a combination of INTRA- and INTER-coding to produce a compressed (encoded) video bit-stream and decoder 200 of FIG. 2 is arranged to receive and decode the video bit-stream produced by encoder 100 in order to produce a reconstructed video sequence. Throughout the following description it will be assumed that the luminance component of a macroblock comprises 16x16 pixels arranged as an array of four 8x8 blocks, and that the associated chrominance components are spatially sub-sampled by a factor of two in the horizontal and vertical directions to form 8x8 blocks, as depicted in FIG. 3. Extension of the description to other block sizes and other sub-sampling schemes will be apparent to those of ordinary skill in the art.
The video encoder 100 comprises an input 101 for receiving a digital video signal from a camera or other video source (not shown). It also comprises a transformation unit 104 which is arranged to perform a block-based discrete cosine transform (DCT), a quantizer 106, an inverse quantizer 108, an inverse transformation unit 110, arranged to perform an inverse block-based discrete cosine transform (IDCT), combiners 112 and 116, and a frame store 120. The encoder further comprises a motion estimator 130, a motion field coder 140 and a motion compensated predictor 150. Switches 102 and 114 are operated co-operatively by control manager 160 to switch the encoder between an INTRA-mode of video encoding and an INTER-mode of video encoding. The encoder 100 also comprises a video multiplex coder 170 which forms a single bit-stream from the various types of information produced by the encoder 100 for further transmission to a remote receiving terminal or, for example, for storage on a mass storage medium, such as a computer hard drive (not shown).
Encoder 100 operates as follows. Each frame of uncompressed video provided from the video source to input 101 is received and processed macroblock by macroblock, preferably in raster-scan order. When the encoding of a new video sequence starts, the first frame to be encoded is encoded as an INTRA-coded frame. Subsequently, the encoder is programmed to code each frame in INTER-coded format, unless one of the following conditions is met: 1) it is judged that the current macroblock of the frame being coded is so dissimilar from the pixel values in the reference frame used in its prediction that excessive prediction error information is produced, in which case the current macroblock is coded in INTRA-coded format; 2) a predefined INTRA frame repetition interval has expired; or 3) feedback is received from a receiving terminal indicating a request for a frame to be provided in INTRA-coded format.
The occurrence of condition 1) is detected by monitoring the output of the combiner 116. The combiner 116 forms a difference between the current macroblock of the frame being coded and its prediction, produced in the motion compensated prediction block 150. If a measure of this difference (for example a sum of absolute differences of pixel values) exceeds a predetermined threshold, the combiner 116 informs the control manager 160 via a control line 119 and the control manager 160 operates the switches 102 and 114 via control line 113 so as to switch the encoder 100 into INTRA-coding mode. In this way, a frame which is otherwise encoded in INTER-coded format may comprise INTRA-coded macroblocks. Occurrence of condition 2) is monitored by means of a timer or frame counter implemented in the control manager 160, in such a way that if the timer expires, or the frame counter reaches a predetermined number of frames, the control manager 160 operates the switches 102 and 114 via control line 113 to switch the encoder into INTRA-coding mode. Condition 3) is triggered if the control manager 160 receives a feedback signal from, for example, a receiving terminal, via control line 121 indicating that an INTRA frame refresh is required by the receiving terminal. Such a condition may arise, for example, if a previously transmitted frame is badly corrupted by interference during its transmission, rendering it impossible to decode at the receiver. In this situation, the receiving decoder issues a request for the next frame to be encoded in INTRA-coded format, thus re-initialising the coding sequence.
Operation of the encoder 100 in INTRA-coding mode will now be described. In INTRA-coding mode, the control manager 160 operates the switch 102 to accept video input from input line 118. The video signal input is received macroblock by macroblock from input 101 via the input line 118. As they are received, the blocks of luminance and chrominance values which make up the macroblock are passed to the DCT transformation block 104, which performs a 2-dimensional discrete cosine transform on each block of values, producing a 2-dimensional array of DCT coefficients for each block. DCT transformation block 104 produces an array of coefficient values for each block, the number of coefficient values corresponding to the dimensions of the blocks which make up the macroblock (in this case 8x8). The DCT coefficients for each block are passed to the quantizer 106, where they are quantized using a quantization parameter QP. Selection of the quantization parameter QP is controlled by the control manager 160 via control line 115.
The array of quantized DCT coefficients for each block is then passed from the quantizer 106 to the video multiplex coder 170, as indicated by line 125 in FIG. 1. The video multiplex coder 170 orders the quantized transform coefficients for each block using a zigzag scanning procedure, thereby converting the two-dimensional array of quantized transform coefficients into a one-dimensional array. Each non-zero valued quantized coefficient in the one-dimensional array is then represented as a pair of values, referred to as level and run, where level is the value of the quantized coefficient and run is the number of consecutive zero-valued coefficients preceding the coefficient in question. The run and level values are further compressed in the video multiplex coder 170 using entropy coding, for example, variable length coding (VLC), or arithmetic coding.
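The zigzag scan and run-level representation described above can be illustrated as follows. This is an illustrative Python sketch; the scan order shown is the conventional 8x8 zigzag, which the text does not spell out explicitly.

```python
import numpy as np

def zigzag_indices(n=8):
    """Visit the anti-diagonals of an n x n block in turn, alternating
    direction, as used when serialising quantized DCT blocks."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(diag if s % 2 else diag[::-1])
    return order

def run_level_pairs(block):
    """Serialise a quantized block into (run, level) pairs, where run is
    the number of zeros preceding each non-zero coefficient."""
    pairs, run = [], 0
    for i, j in zigzag_indices(block.shape[0]):
        v = int(block[i, j])
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    return pairs
```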
Once the run and level values have been entropy coded using an appropriate method, the video multiplex coder 170 further combines them with control information, also entropy coded using a method appropriate for the kind of information in question, to form a single compressed bit-stream of coded image information 135. It should be noted that while entropy coding has been described in connection with operations performed by the video multiplex coder 170, in alternative implementations a separate entropy coding unit may be provided.
A locally decoded version of the macroblock is also formed in the encoder 100. This is done by passing the quantized transform coefficients for each block, output by quantizer 106, through inverse quantizer 108 and applying an inverse DCT transform in inverse transformation block 110. In this way a reconstructed array of pixel values is constructed for each block of the macroblock. The resulting decoded image data is input to combiner 112. In INTRA-coding mode, switch 114 is set so that the input to the combiner 112 via switch 114 is zero. In this way, the operation performed by combiner 112 is equivalent to passing the decoded image data unaltered.
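The round trip through quantizer 106 and inverse quantizer 108 is the lossy step of this local decoding loop. The sketch below uses a plain uniform quantizer with step size 2·QP as a stand-in; the actual mapping between QP and step size is codec-specific and not given in the text.

```python
import numpy as np

QP = 8   # quantization parameter; the step-size rule below is assumed

def quantize(coeffs, qp=QP):
    return np.round(coeffs / (2 * qp)).astype(np.int32)

def dequantize(levels, qp=QP):
    return levels * (2 * qp)

# Local decoding mirrors what the remote decoder will see: the encoder
# must predict from reconstructed data, not from its pristine input.
coeffs = np.random.randn(8, 8) * 100
recon_coeffs = dequantize(quantize(coeffs))
# Reconstruction differs from the original by at most half the step size.
assert np.max(np.abs(recon_coeffs - coeffs)) <= QP
```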
As subsequent macroblocks of the current frame are received and undergo the previously described encoding and local decoding steps in blocks 104, 106, 108, 110 and 112, a decoded version of the INTRA-coded frame is built up in frame store 120. When the last macroblock of the current frame has been INTRA-coded and subsequently decoded, the frame store 120 contains a completely decoded frame, available for use as a motion prediction reference frame in coding a subsequently received video frame in INTER-coded format.
Operation of the encoder 100 in INTER-coding mode will now be described. In INTER-coding mode, the control manager 160 operates switch 102 to receive its input from line 117, which comprises the output of combiner 116. The combiner 116 receives the video input signal macroblock by macroblock from input 101. As combiner 116 receives the blocks of luminance and chrominance values which make up the macroblock, it forms corresponding blocks of prediction error information. The prediction error information represents the difference between the block in question and its prediction, produced in motion compensated prediction block 150. More specifically, the prediction error information for each block of the macroblock comprises a two-dimensional array of values, each of which represents the difference between a pixel value in the block of luminance or chrominance information being coded and a decoded pixel value obtained by forming a motion-compensated prediction for the block, according to the procedure to be described below. Thus, in the exemplary video coding system considered here where each macroblock comprises, for example, an assembly of 8x8 blocks comprising luminance and chrominance values, the prediction error information for each block of the macroblock similarly comprises an 8x8 array of prediction error values.
The prediction error information for each block of the macroblock is passed to DCT transformation block 104, which performs a two-dimensional discrete cosine transform on each block of prediction error values to produce a two-dimensional array of DCT transform coefficients for each block. DCT transformation block 104 produces an array of coefficient values for each prediction error block, the number of coefficient values corresponding to the dimensions of the blocks which make up the macroblock (in this case 8x8). The transform coefficients derived from each prediction error block are passed to quantizer 106 where they are quantized using a quantization parameter QP, in a manner analogous to that described above in connection with operation of the encoder in INTRA-coding mode. As before, selection of the quantization parameter QP is controlled by the control manager 160 via control line 115.
The quantized DCT coefficients representing the prediction error information for each block of the macroblock are passed from quantizer 106 to video multiplex coder 170, as indicated by line 125 in FIG. 1. As in INTRA-coding mode, the video multiplex coder 170 orders the transform coefficients for each prediction error block using a certain zigzag scanning procedure and then represents each non-zero valued quantized coefficient as a run-level pair. It further compresses the run-level pairs using entropy coding, in a manner analogous to that described above in connection with INTRA-coding mode. Video multiplex coder 170 also receives motion vector information (described in the following) from motion field coding block 140 via line 126 and control information from control manager 160. It entropy codes the motion vector information and control information and forms a single bit-stream of coded image information 135, comprising the entropy coded motion vector, prediction error and control information.
The quantized DCT coefficients representing the prediction error information for each block of the macroblock are also passed from quantizer 106 to inverse quantizer 108. Here they are inverse quantized and the resulting blocks of inverse quantized DCT coefficients are applied to inverse DCT transform block 110, where they undergo inverse DCT transformation to produce locally decoded blocks of prediction error values. The locally decoded blocks of prediction error values are then input to combiner 112. In INTER-coding mode, switch 114 is set so that the combiner 112 also receives predicted pixel values for each block of the macroblock, generated by motion-compensated prediction block 150. The combiner 112 combines each of the locally decoded blocks of prediction error values with a corresponding block of predicted pixel values to produce reconstructed image blocks and stores them in frame store 120.
As subsequent macroblocks of the video signal are received from the video source and undergo the previously described encoding and decoding steps in blocks 104, 106, 108, 110, 112, a decoded version of the frame is built up in frame store 120. When the last macroblock of the frame has been processed, the frame store 120 contains a completely decoded frame, available for use as a motion prediction reference frame in encoding a subsequently received video frame in INTER-coded format.
The details of the motion-compensated prediction performed by video encoder 100 will now be considered.

Any frame encoded in INTER-coded format requires a reference frame for motion-compensated prediction. This means, necessarily, that when encoding a video sequence, the first frame to be encoded, whether it is the first frame in the sequence, or some other frame, must be encoded in INTRA-coded format. This, in turn, means that when the video encoder 100 is switched into INTER-coding mode by control manager 160, a complete reference frame, formed by locally decoding a previously encoded frame, is already available in the frame store 120 of the encoder. In general, the reference frame is formed by locally decoding either an INTRA-coded frame or an INTER-coded frame.
In the following description it will be assumed that the encoder performs motion compensated prediction on a macroblock basis, i.e. a macroblock is the smallest element of a video frame that can be associated with motion information. It will further be assumed that a prediction for a given macroblock is formed by identifying a region of 16x16 values in the luminance component of the reference frame that shows best correspondence with the 16x16 luminance values of the macroblock in question. Motion-compensated prediction in a video coding system where motion information may be associated with elements smaller than a macroblock will be considered later in the text.
The first step in forming a prediction for a macroblock of the current frame is performed by motion estimation block 130. The motion estimation block 130 receives the blocks of luminance and chrominance values which make up the current macroblock of the frame to be coded via line 128. It then performs a block matching operation in order to identify a region in the reference frame that corresponds best with the current macroblock. In order to perform the block matching operation, motion estimation block 130 accesses reference frame data stored in frame store 120 via line 127. More specifically, motion estimation block 130 performs block-matching by calculating difference values (e.g. sums of absolute differences) representing the difference in pixel values between the macroblock under examination and candidate best-matching regions of pixels from a reference frame stored in the frame store 120. A difference value is produced for candidate regions at all possible offsets within a predefined search region of the reference frame and motion estimation block 130 determines the smallest calculated difference value. The candidate region that yields the smallest difference value is selected as the best-matching region. The offset between the current macroblock and the best-matching region identified in the reference frame defines a "motion vector" for the macroblock in question. The motion vector typically comprises a pair of numbers, one describing the horizontal displacement (Δx) between the current macroblock and the best-matching region of the reference frame, the other representing the vertical displacement (Δy).
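A direct transcription of this full-search block-matching procedure into Python follows. The ±15-pixel search range, the skipping of candidates that fall outside the reference frame, and the function name are assumptions; the text speaks only of a predefined search region.

```python
import numpy as np

def full_search(current, reference, top, left, block=16, search=15):
    """Exhaustive block matching: return the motion vector (dx, dy)
    whose block-sized region in the reference frame minimises the sum
    of absolute differences (SAD) against the current macroblock."""
    h, w = reference.shape
    mb = current[top:top + block, left:left + block].astype(np.int32)
    best, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + block > h or c + block > w:
                continue   # candidate falls outside the reference frame
            cand = reference[r:r + block, c:c + block].astype(np.int32)
            sad = np.abs(mb - cand).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best = sad, (dx, dy)
    return best, best_sad
```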
Once the motion estimation block 130 has produced a motion vector for the macroblock, it outputs the motion vector to the motion field coding block 140. The motion field coding block 140 approximates the motion vector received from motion estimation block 130 using a motion model comprising a set of basis functions and motion coefficients. More specifically, the motion field coding block 140 represents the motion vector as a set of motion coefficient values which, when multiplied by the basis functions, form an approximation of the motion vector. Typically, a translational motion model having only two motion coefficients and basis functions is used, but motion models of greater complexity may also
