`
`US007292636B2
`
`(12)
`
`United States Patent
`Haskell et al.
`
`(10) Patent No.:
`
`(45) Date of Patent:
`
`US 7,292,636 B2
`Nov. 6, 2007
`
`(54)
`
`USING ORDER VALUE FOR PROCESSING A
`VIDEO PICTURE
`
`(75)
`
`Inventors:
`
`Barin Geoffry Haskell, Mountain
`View, CA (US); David William Singer,
`San Francisco, CA (US); Adriana
`Dumitras, Sunnyvale, CA (US); Atul
`Puri, Cupertino, CA (US)
`
`(73)
`
`Assignee: Apple Inc., Cupertino, CA (US)
`
`(*)
`
`Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 757 days.
`
`JP
`JP
`
`5,117,023 A
`6,022,834 A
`6,083,485 A
`6,108,047 A
`6,295,3211 B1
`6,297,852 B1 *
`
`4/1998 timer
`6/2000 Kim et al.
`7/2000 Katlono
`8/2000 Chen
`9/2001 Dufaux et al.
`10/2001 Laksono et al.
`
`............ 348/534
`
`(Continued)
`FOREIGN PATENT DOCUMENTS
`
`10174035
`10124065 A
`
`6/1998
`6/1998
`
`(21)
`
`Appl. No.: 10/792,669
`
`OTHER PUBLICATIONS
`
`(22)
`
`Filed:
`
`Mar. 2, 2004
`
`U.S. Appl. No. 10/291,320, filed Nov. 8, 2002, Haskell.
`
`(65)
`
`(63)
`
`(60)
`
`(51)
`
`(52)
`
`(58)
`
`(56)
`
`Prior Publication Data
`
`US 2004/0240557 A1
`
`Dec. 2, 2004
`
`Related U.S. Application Data
`
`Continuation of application No. 10/291,320, filed on
`Nov. 8, 2002, now Pat. No. 7,088,776.
`
`Provisional application No. 60/396,363, filed on Jul.
`15, 2002.
`
`Int. Cl.
`H04B 1/66 (2006.01)
`U.S. Cl. 375/240.23; 375/240.16;
`375/240.12; 375/240.26; 375/240.01; 382/238;
`382/246; 382/235; 382/245
`Field of Classification Search 375/240.23,
`375/240.16, 240.12, 240.26, 240.01; 382/238,
`382/246, 235, 245
`See application file for complete search history.
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`(Continued)
`
`Primary Examiner—Shawn S. An
`(74) Attorney, Agent, or Firm—Adeli Law Group PLC
`
`(57)
`
`ABSTRACT
`
`A method and apparatus for variable accuracy inter-picture
`timing specification for digital video encoding is disclosed.
`Specifically, the present invention discloses a system that
`allows the relative timing of nearby video pictures to be
`encoded in a very efficient manner. In one embodiment, the
`display time difference between a current video picture and
`a nearby video picture is determined. The display time
`difference is then encoded into a digital representation of the
`video picture. In a preferred embodiment, the nearby video
`picture is the most recently transmitted stored picture. For
`coding efficiency, the display time difference may be
`encoded using a variable length coding system or arithmetic
`coding. In an alternate embodiment, the display time
`difference is encoded as a power of two to reduce the number of
`bits transmitted.
`
`5,436,664 A
`
`7/1995 Henry
`
`18 Claims, 4 Drawing Sheets
`
`
`
`UNIFIED 1009
`
`
`
`
`US 7,292,636 B2
`Page 2
`
`U.S. PATENT DOCUMENTS
`
`6,400,768 B1      6/2002  Nagumo et al.
`6,728,315 B2 *    4/2004  Haskell et al. ............ 375/240.16
`6,859,609 B1      2/2005  Watkins
`7,088,776 B2      8/2006  Haskell et al.
`2004/0008776 A1   1/2004  Haskell
`2004/0184543 A1   9/2004  Haskell
`
`OTHER PUBLICATIONS
`
`International Search Report, Nov. 14, 2003, Apple Computer, Inc.
`International Search Report, Jan. 27, 2004.
`
`* cited by examiner
`
`
`
`U.S. Patent
`
`Nov. 6, 2007
`
`Sheet 1 of 4
`
`US 7,292,636 B2
`
`[Figure: drawing sheet 1 of 4]
`
`
`
`
`U.S. Patent
`
`Nov. 6, 2007
`
`Sheet 2 of 4
`
`US 7,292,636 B2
`
`
`
`
`
`U.S. Patent
`
`Nov. 6, 2007
`
`Sheet 3 of 4
`
`US 7,292,636 B2
`
`Figure 3
`
`
`
`U.S. Patent
`
`Nov. 6, 2007
`
`Sheet 4 of 4
`
`US 7,292,636 B2
`
`
`
`
`
`US 7,292,636 B2
`
`USING ORDER VALUE FOR PROCESSING A
`VIDEO PICTURE
`
`The present patent application is a continuation of the
`U.S. patent application Ser. No. 10/291,320, entitled
`“Method and Apparatus for Variable Accuracy Inter-Picture
`Timing Specification for Digital Video Encoding”, filed on
`Nov. 8, 2002, now U.S. Pat. No. 7,088,776, which claims the
`benefit of U.S. Provisional Patent Application No.
`60/396,363 entitled “Method and Apparatus for Variable Accuracy
`Inter-Picture Timing Specification for Digital Video
`Encoding”, filed on Jul. 15, 2002.
`
`CROSS REFERENCE TO RELATED
`APPLICATIONS
`
`This Application is related to U.S. patent application Ser.
`No. 10/291,320, filed Nov. 8, 2002; U.S. patent application
`Ser. No. 11/621,969, filed Jan. 10, 2007; U.S. patent
`application Ser. No. 11/621,971, filed Jan. 10, 2007; U.S. patent
`application Ser. No. 11/621,974, filed Jan. 10, 2007; U.S.
`patent application 10/313,773, filed Dec. 6, 2002; U.S.
`patent application 10/792,514, filed Mar. 2, 2004; U.S.
`patent application Ser. No. 11/621,977, filed Jan. 10, 2007;
`U.S. patent application 11/621,979, filed Jan. 10, 2007; and
`U.S. patent application Ser. No. 11/621,980, filed Jan. 10,
`2007.
`
`FIELD OF THE INVENTION
`
`The present invention relates to the field of multimedia
`compression systems. In particular the present invention
`discloses methods and systems for specifying variable
`accuracy inter-picture timing.
`
`BACKGROUND OF THE INVENTION
`
`Digital based electronic media formats are finally on the
`cusp of largely replacing analog electronic media formats.
`Digital compact discs (CDs) replaced analog vinyl records
`long ago. Analog magnetic cassette tapes are becoming
`increasingly rare. Second and third generation digital audio
`systems such as Mini-discs and MP3 (MPEG Audio layer 3)
`are now taking market share from the first generation
`digital audio format of compact discs.
`The video media has been slower to move to digital
`storage and transmission formats than audio. This has been
`largely due to the massive amounts of digital information
`required to accurately represent video in digital form. The
`massive amounts of digital information needed to accurately
`represent video require very high-capacity digital storage
`systems and high-bandwidth transmission systems.
`However, video is now rapidly moving to digital storage
`and transmission formats. Faster computer processors,
`high-density storage systems, and new efficient compression and
`encoding algorithms have finally made digital video
`practical at consumer price points. The DVD (Digital Versatile
`Disc), a digital video system, has been one of the fastest
`selling consumer electronic products in years. DVDs have
`been rapidly supplanting Video-Cassette Recorders (VCRs)
`as the pre-recorded video playback system of choice due to
`their high video quality, very high audio quality,
`convenience, and extra features. The antiquated analog NTSC
`(National Television Standards Committee) video
`transmission system is currently in the process of being replaced with
`the digital ATSC (Advanced Television Standards
`Committee) video transmission system.
`
`Computer systems have been using various different
`digital video encoding formats for a number of years.
`Among the best digital video compression and encoding
`systems used by computer systems have been the digital
`video systems backed by the Motion Pictures Expert Group
`commonly known by the acronym MPEG. The three most
`well known and highly used digital video formats from
`MPEG are known simply as MPEG-1, MPEG-2, and
`MPEG-4. VideoCDs (VCDs) and early consumer-grade
`digital video editing systems use the early MPEG-1 digital
`video encoding format. Digital Versatile Discs (DVDs) and
`the Dish Network brand Direct Broadcast Satellite (DBS)
`television broadcast system use the higher quality MPEG-2
`digital video compression and encoding system. The
`MPEG-4 encoding system is rapidly being adapted by the
`latest computer based digital video encoders and associated
`digital video players.
`The MPEG-2 and MPEG-4 standards compress a series of
`video frames or video fields and then encode the compressed
`frames or fields into a digital bitstream. When encoding a
`video frame or field with the MPEG-2 and MPEG-4
`systems, the video frame or field is divided into a rectangular
`grid of macroblocks. Each macroblock is independently
`compressed and encoded.
`When compressing a video frame or field, the MPEG-4
`standard may compress the frame or field into one of three
`types of compressed frames or fields: Intra-frames
`(I-frames), Unidirectional Predicted frames (P-frames), or
`Bi-Directional Predicted frames (B-frames). Intra-frames
`completely independently encode an independent video
`frame with no reference to other video frames. P-frames
`define a video frame with reference to a single previously
`displayed video frame. B-frames define a video frame with
`reference to both a video frame displayed before the current
`frame and a video frame to be displayed after the current
`frame. Due to their efficient usage of redundant video
`information, P-frames and B-frames generally provide the
`best compression.
`
`SUMMARY OF THE INVENTION
`
`A method and apparatus for variable accuracy
`inter-picture timing specification for digital video encoding is
`disclosed. Specifically, the present invention discloses a
`system that allows the relative timing of nearby video
`pictures to be encoded in a very efficient manner. In one
`embodiment, the display time difference between a current
`video picture and a nearby video picture is determined. The
`display time difference is then encoded into a digital
`representation of the video picture. In a preferred embodiment,
`the nearby video picture is the most recently transmitted
`stored picture.
`For coding efficiency, the display time difference may be
`encoded using a variable length coding system or arithmetic
`coding. In an alternate embodiment, the display time
`difference is encoded as a power of two to reduce the number of
`bits transmitted.
`
`Other objects, features, and advantages of the present
`invention will be apparent from the accompanying drawings and from
`the following detailed description.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`The objects, features, and advantages of the present
`invention will be apparent to one skilled in the art, in view
`of the following detailed description in which:
`FIG. 1 illustrates a high-level block diagram of one
`possible digital video encoder system.
`FIG. 2 illustrates a series of video pictures in the order
`that the pictures should be displayed wherein the arrows
`connecting different pictures indicate inter-picture
`dependency created using motion compensation.
`FIG. 3 illustrates the video pictures from FIG. 2 listed in
`a preferred transmission order of pictures wherein the
`arrows connecting different pictures indicate inter-picture
`dependency created using motion compensation.
`FIG. 4 graphically illustrates a series of video pictures
`wherein the distances between video pictures that reference
`each other are chosen to be powers of two.
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENT
`
`A method and system for specifying Variable Accuracy
`Inter-Picture Timing in a multimedia compression and
`encoding system is disclosed. In the following description,
`for purposes of explanation, specific nomenclature is set
`forth to provide a thorough understanding of the present
`invention. However, it will be apparent to one skilled in the
`art that these specific details are not required in order to
`practice the present invention. For example, the present
`invention has been described with reference to the MPEG-4
`multimedia compression and encoding system. However,
`the same techniques can easily be applied to other types of
`compression and encoding systems.
`
`Multimedia Compression and Encoding Overview
`
`FIG. 1 illustrates a high-level block diagram of a typical
`digital video encoder 100 as is well known in the art. The
`digital video encoder 100 receives an incoming video stream
`of video frames 105 at the left of the block diagram. Each
`video frame is processed by a Discrete Cosine
`Transformation (DCT) unit 110. The frame may be processed
`independently (an intra-frame) or with reference to information from
`other frames received from the motion compensation unit
`(an inter-frame). Next, a Quantizer (Q) unit 120 quantizes
`the information from the Discrete Cosine Transformation
`unit 110. Finally, the quantized video frame is then encoded
`with an entropy encoder (H) unit 180 to produce an encoded
`bitstream. The entropy encoder (H) unit 180 may use a
`variable length coding (VLC) system.
`Since an inter-frame encoded video frame is defined with
`reference to other nearby video frames, the digital video
`encoder 100 needs to create a copy of how each decoded
`frame will appear within a digital video decoder such that
`inter-frames may be encoded. Thus, the lower portion of the
`digital video encoder 100 is actually a digital video decoder
`system. Specifically, an inverse quantizer ($Q^{-1}$) unit 130
`reverses the quantization of the video frame information and
`an inverse Discrete Cosine Transformation ($DCT^{-1}$) unit
`140 reverses the Discrete Cosine Transformation of the
`video frame information. After all the DCT coefficients are
`reconstructed from iDCT, the motion compensation unit will
`use the information, along with the motion vectors, to
`reconstruct the encoded frame which is then used as the
`reference frame for the motion estimation of the next frame.
`The decoded video frame may then be used to encode
`inter-frames (P-frames or B-frames) that are defined relative
`to information in the decoded video frame. Specifically, a
`motion compensation (MC) unit 150 and a motion
`estimation (ME) unit 160 are used to determine motion vectors and
`generate differential values used to encode inter-frames.
`
`A rate controller 190 receives information from many
`different components in a digital video encoder 100 and uses
`the information to allocate a bit budget for each video frame.
`The rate controller 190 should allocate the bit budget in a
`manner that will generate the highest quality digital video bit
`stream that complies with a specified set of restrictions.
`Specifically, the rate controller 190 attempts to generate the
`highest quality compressed video stream without
`overflowing buffers (exceeding the amount of available memory in a
`decoder by sending more information than can be stored) or
`underflowing buffers (not sending video frames fast enough
`such that a decoder runs out of video frames to display).
`
`Multimedia Compression and Encoding Overview
`
`In some video signals the time between successive video
`pictures (frames or fields) may not be constant. (Note: This
`document will use the term video pictures to generically
`refer to video frames or video fields.) For example, some
`video pictures may be dropped because of transmission
`bandwidth constraints. Furthermore, the video timing may
`also vary due to camera irregularity or special effects such
`as slow motion or fast motion. In some video streams, the
`original video source may simply have non-uniform
`inter-picture times by design. For example, synthesized video
`such as computer graphic animations may have non-uniform
`timing since no arbitrary video timing is created by a
`uniform video capture system such as a video camera
`system. A flexible digital video encoding system should be
`able to handle non-uniform timing.
`Many digital video encoding systems divide video
`pictures into a rectangular grid of macroblocks. Each individual
`macroblock from the video picture is independently
`compressed and encoded. In some embodiments, sub-blocks of
`macroblocks known as ‘pixelblocks’ are used. Such
`pixelblocks may have their own motion vectors that may be
`interpolated. This document will refer to macroblocks
`although the teachings of the present invention may be
`applied equally to both macroblocks and pixelblocks.
`Some video coding standards, e.g., ISO MPEG standards
`or the ITU H.264 standard, use different types of predicted
`macroblocks to encode video pictures. In one scenario, a
`macroblock may be one of three types:
`1. I-macroblock—An Intra (I) macroblock uses no
`information from any other video pictures in its coding (it is
`completely self-defined);
`2. P-macroblock—A unidirectionally predicted (P)
`macroblock refers to picture information from one preceding
`video picture; or
`3. B-macroblock—A bi-directional predicted (B)
`macroblock uses information from one preceding picture and one
`future video picture.
`If all the macroblocks in a video picture are
`Intra-macroblocks, then the video picture is an Intra-frame. If a
`video picture only includes unidirectional predicted
`macroblocks or intra-macroblocks, then the video picture is known
`as a P-frame. If the video picture contains any bi-directional
`predicted macroblocks, then the video picture is known as a
`B-frame. For simplicity, this document will consider the
`case where all macroblocks within a given picture are of the
`same type.
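The three naming rules above can be sketched in a few lines of Python. This is only an illustration, assuming each macroblock is tagged with one of the letters 'I', 'P', or 'B'; the function name `classify_picture` is hypothetical and does not come from the patent.

```python
def classify_picture(macroblock_types):
    """Name a picture 'I', 'P', or 'B' from the types of its macroblocks.

    macroblock_types: iterable of 'I', 'P', 'B' tags (a hypothetical
    representation chosen for this sketch).
    """
    types = set(macroblock_types)
    if not types or not types <= {"I", "P", "B"}:
        raise ValueError("expected a non-empty collection of 'I', 'P', 'B'")
    if "B" in types:
        return "B"  # any bi-directional macroblock makes it a B-frame
    if "P" in types:
        return "P"  # only P- and/or I-macroblocks
    return "I"      # every macroblock is intra-coded
```

The checks run from the most permissive type (B) to the most restrictive (I), which mirrors the order in which the rules exclude each other.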
`An example sequence of video pictures to be encoded
`might be represented as
`I1 B2 B3 B4 P5 B6 B7 B8 B9 P10 B11 P12 B13 I14 ...
`
`where the letter (I, P, or B) represents if the video picture is
`an I-frame, P-frame, or B-frame and the number represents
`the camera order of the video picture in the sequence of
`video pictures. The camera order is the order in which a
`camera recorded the video pictures and thus is also the order
`in which the video pictures should be displayed (the display
`order).
`The previous example series of video pictures is
`graphically illustrated in FIG. 2. Referring to FIG. 2, the arrows
`indicate that macroblocks from a stored picture (I-frame or
`P-frame in this case) are used in the motion compensated
`prediction of other pictures.
`In the scenario of FIG. 2, no information from other
`pictures is used in the encoding of the intra-frame video
`picture I1. Video picture P5 is a P-frame that uses video
`information from previous video picture I1 in its coding such
`that an arrow is drawn from video picture I1 to video picture
`P5. Video picture B2, video picture B3, and video picture B4 all
`use information from both video picture I1 and video picture
`P5 in their coding such that arrows are drawn from video
`picture I1 and video picture P5 to video picture B2, video
`picture B3, and video picture B4. As stated above the
`inter-picture times are, in general, not the same.
`Since B-pictures use information from future pictures
`(pictures that will be displayed later), the transmission order
`is usually different than the display order. Specifically, video
`pictures that are needed to construct other video pictures
`should be transmitted first. For the above sequence, the
`transmission order might be
`I1 P5 B2 B3 B4 P10 B6 B7 B8 B9 P12 B11 I14 B13 ...
`FIG. 3 graphically illustrates the above transmission order
`of the video pictures from FIG. 2. Again, the arrows in the
`figure indicate that macroblocks from a stored video picture
`(I or P in this case) are used in the motion compensated
`prediction of other video pictures.
`Referring to FIG. 3, the system first transmits I-frame I1,
`which does not depend on any other frame. Next, the system
`transmits P-frame video picture P5 that depends upon video
`picture I1. Next, the system transmits B-frame video picture
`B2 after video picture P5 even though video picture B2 will
`be displayed before video picture P5. The reason for this is
`that when it comes time to decode B2, the decoder will have
`already received and stored the information in video pictures
`I1 and P5 necessary to decode video picture B2. Similarly,
`video pictures I1 and P5 are ready to be used to decode
`subsequent video picture B3 and video picture B4. The
`receiver/decoder reorders the video picture sequence for
`proper display. In this operation I and P pictures are often
`referred to as stored pictures.
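The reordering from display order to transmission order can be illustrated with a short Python sketch. It assumes the simple case described here, where every B-picture depends only on the nearest stored (I or P) pictures on either side; the function name and the (type, number) tuple representation are hypothetical.

```python
def transmission_order(display_order):
    """Reorder pictures so each stored picture precedes the B-pictures
    that are displayed just before it.

    display_order: list of (kind, number) tuples, kind in {'I', 'P', 'B'}.
    """
    out = []
    pending_b = []  # B-pictures waiting for their future reference
    for picture in display_order:
        kind, _ = picture
        if kind == "B":
            pending_b.append(picture)
        else:
            out.append(picture)    # stored picture goes out first
            out.extend(pending_b)  # then the B-pictures that needed it
            pending_b = []
    # Assumption of this sketch: trailing B-pictures with no future
    # reference yet are simply flushed at the end.
    out.extend(pending_b)
    return out

display = [("I", 1), ("B", 2), ("B", 3), ("B", 4), ("P", 5),
           ("B", 6), ("B", 7), ("B", 8), ("B", 9), ("P", 10),
           ("B", 11), ("P", 12), ("B", 13), ("I", 14)]
sent = transmission_order(display)
```

For the example sequence above, `sent` reproduces the transmission order listed in the text.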
`The coding of the P-frame pictures typically utilizes
`Motion Compensation, wherein a Motion Vector is
`computed for each macroblock in the picture. Using the
`computed motion vector, a prediction macroblock
`(P-macroblock) can be formed by translation of pixels in the
`aforementioned previous picture. The difference between
`the actual macroblock in the P-frame picture and the
`prediction macroblock is then coded for transmission.
`Each motion vector may also be transmitted via predictive
`coding. For example, a motion vector prediction may be
`formed using nearby motion vectors. In such a case, then the
`difference between the actual motion vector and the motion
`vector prediction is coded for transmission.
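A minimal Python sketch of predictive motion vector coding follows. The text only says that the prediction is "formed using nearby motion vectors"; the component-wise median used here is an assumption made for illustration, and all function names are hypothetical.

```python
from statistics import median_low

def predict_mv(neighbours):
    """Predict a motion vector from nearby ones.

    Assumption: component-wise median of the neighbours. The source does
    not specify the predictor.
    """
    xs = [mv[0] for mv in neighbours]
    ys = [mv[1] for mv in neighbours]
    return (median_low(xs), median_low(ys))

def encode_mv(actual, neighbours):
    """Return the difference that would be coded for transmission."""
    px, py = predict_mv(neighbours)
    return (actual[0] - px, actual[1] - py)

def decode_mv(difference, neighbours):
    """Rebuild the motion vector from the same prediction plus the difference."""
    px, py = predict_mv(neighbours)
    return (difference[0] + px, difference[1] + py)
```

Because the encoder and decoder form the same prediction from already-known vectors, only the (usually small) difference has to be sent.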
`Each B-macroblock uses two motion vectors: a first
`motion vector referencing the aforementioned previous
`video picture and a second motion vector referencing the
`future video picture. From these two motion vectors, two
`prediction macroblocks are computed. The two predicted
`macroblocks are then combined together, using some
`function, to form a final predicted macroblock. As above, the
`difference between the actual macroblock in the B-frame
`picture and the final predicted macroblock is then encoded
`for transmission.
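The combination step can be sketched as follows. The text leaves the combining function open ("some function"); a rounded average of the two predictions is assumed here purely for illustration. Macroblocks are represented as small lists of rows of integer pixel values, and both function names are hypothetical.

```python
def combine_predictions(forward, backward):
    """Combine two predicted macroblocks into a final prediction.

    Assumption: rounded average of corresponding pixels. The source
    only says "some function".
    """
    return [[(a + b + 1) // 2 for a, b in zip(row_f, row_b)]
            for row_f, row_b in zip(forward, backward)]

def residual(actual, predicted):
    """Difference between the actual macroblock and its prediction;
    this is what would then be encoded for transmission."""
    return [[a - p for a, p in zip(row_a, row_p)]
            for row_a, row_p in zip(actual, predicted)]
```

A tiny 2x2 "macroblock" is enough to see the arithmetic: averaging `[[10, 20], [30, 40]]` with `[[12, 20], [31, 44]]` yields the final prediction, and the residual is whatever the actual block adds on top of it.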
`
`As with P-macroblocks, each motion vector (MV) of a
`B-macroblock may be transmitted via predictive coding.
`Specifically, a predicted motion vector is formed using
`nearby motion vectors. Then, the difference between the
`actual motion vector and the predicted is coded for
`transmission.
`
`However, with B-macroblocks the opportunity exists for
`interpolating motion vectors from motion vectors in the
`nearest stored picture macroblock. Such interpolation is
`carried out both in the digital video encoder and the digital
`video decoder.
`
`This motion vector interpolation works particularly well
`on video pictures from a video sequence where a camera is
`slowly panning across a stationary background. In fact, such
`motion vector interpolation may be good enough to be used
`alone. Specifically, this means that no differential
`information needs be calculated or transmitted for these
`B-macroblock motion vectors encoded using interpolation.
`To illustrate further, in the above scenario let us represent
`the inter-picture display time between pictures i and j as $D_{i,j}$,
`i.e., if the display times of the pictures are $T_i$ and $T_j$,
`respectively, then
`
`$D_{i,j} = T_i - T_j$
`
`from which it follows that
`
`$D_{i,k} = D_{i,j} + D_{j,k}$
`
`$D_{i,k} = -D_{k,i}$
`
`Note that $D_{i,j}$ may be negative in some cases.
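Both identities follow directly from the definition of the inter-picture display time. As a short worked derivation, using only the symbols defined above:

```latex
\begin{aligned}
D_{i,k} &= T_i - T_k = (T_i - T_j) + (T_j - T_k) = D_{i,j} + D_{j,k},\\
D_{i,k} &= T_i - T_k = -(T_k - T_i) = -D_{k,i}.
\end{aligned}
```

As a purely illustrative example with hypothetical display times $T_1 = 0$, $T_2 = 3$, and $T_5 = 10$: $D_{5,1} = 10 = D_{5,2} + D_{2,1} = 7 + 3$, while $D_{2,5} = -7$ is negative because picture 2 is displayed before picture 5.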
`Thus, if $MV_{5,1}$ is a motion vector for a P5 macroblock as
`referenced to I1, then for the corresponding macroblocks in
`B2, B3 and B4 the motion vectors as referenced to I1 and P5,
`respectively, would be interpolated by
`
`$MV_{2,1} = MV_{5,1} \cdot D_{2,1} / D_{5,1}$
`
`$MV_{5,2} = MV_{5,1} \cdot D_{5,2} / D_{5,1}$
`
`$MV_{3,1} = MV_{5,1} \cdot D_{3,1} / D_{5,1}$
`
`$MV_{5,3} = MV_{5,1} \cdot D_{5,3} / D_{5,1}$
`
`$MV_{4,1} = MV_{5,1} \cdot D_{4,1} / D_{5,1}$
`
`$MV_{5,4} = MV_{5,1} \cdot D_{5,4} / D_{5,1}$
`
`Note that since ratios of display times are used for motion
`vector prediction, absolute display times are not needed.
`Thus, relative display times may be used for $D_{i,j}$ display time
`values.
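The interpolation formulas, and the observation that only ratios matter, can be checked with a small Python sketch. It assumes integer display times and uses exact fractions to avoid rounding questions that the text does not address; the function names and the dictionary of display times are hypothetical.

```python
from fractions import Fraction

def inter_picture_time(display_time, i, j):
    """D_{i,j} = T_i - T_j for integer display times."""
    return display_time[i] - display_time[j]

def interpolate_b_vectors(mv_fp, display_time, past, future, b):
    """Scale the stored-picture vector MV_{future,past} for B-picture b.

    Returns (MV_{b,past}, MV_{future,b}), following
    MV_{b,past}   = MV_{future,past} * D_{b,past}   / D_{future,past}
    MV_{future,b} = MV_{future,past} * D_{future,b} / D_{future,past}
    """
    d_fp = inter_picture_time(display_time, future, past)
    d_bp = inter_picture_time(display_time, b, past)
    d_fb = inter_picture_time(display_time, future, b)

    def scale(numerator):
        return tuple(c * Fraction(numerator, d_fp) for c in mv_fp)

    return scale(d_bp), scale(d_fb)

# Hypothetical display times for pictures 1..5 and a vector MV_{5,1}.
times = {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}
mv_21, mv_52 = interpolate_b_vectors((8, -4), times, past=1, future=5, b=2)

# Multiplying every display time by a constant leaves the result unchanged,
# which is why relative display times are sufficient.
scaled = {k: 10 * v for k, v in times.items()}
```

Running the same call with `scaled` instead of `times` gives identical vectors, since the common factor cancels in each ratio.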
`
`This scenario may be generalized, as for example in the
`H.264 standard. In the generalization, a P or B picture may
`use any previously transmitted picture for its motion vector
`prediction. Thus, in the above case picture B3 may use
`picture I1 and picture B2 in its prediction. Moreover, motion
`vectors may be extrapolated, not just interpolated. Thus, in
`this case we would have:
`
`$MV_{3,2} = MV_{2,1} \cdot D_{3,2} / D_{2,1}$
`
`Such motion vector extrapolation (or interpolation) may also
`be used in the prediction process for predictive coding of
`motion vectors.
`In any event, the problem in the case of non-uniform
`inter-picture times is to transmit the relative display time
`values of $D_{i,j}$ to the receiver, and that is the subject of the
`present invention. In one embodiment of the present
`invention, for each picture after the first picture we transmit the
`display time difference between the current picture and the
`most recently transmitted stored picture. For error resilience,
`the transmission could be repeated several times within the
`picture, e.g., in the so-called slice headers of the MPEG or
`H.264 standards. If all slice headers are lost, then
`presumably other pictures that rely on the lost picture for decoding
`information cannot be decoded either.
`Thus, in the above scenario we would transmit the
`following:
`
`$D_{5,1}\ D_{2,5}\ D_{3,5}\ D_{4,5}\ D_{10,5}\ D_{6,10}\ D_{7,10}\ D_{8,10}\ D_{9,10}\ D_{12,10}\ D_{11,12}\ D_{14,12}\ D_{13,14}\ \ldots$
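The list of transmitted values can be derived mechanically from the transmission order. The following Python sketch assumes that I and P pictures are the stored pictures and that display times are available per picture number; the function name and the uniform display times used in the example are hypothetical.

```python
def display_time_differences(transmission_order, display_time):
    """For each picture after the first, return (current, reference, D),
    where reference is the most recently transmitted stored picture and
    D = T_current - T_reference.
    """
    out = []
    last_stored = None
    for kind, number in transmission_order:
        if last_stored is not None:
            d = display_time[number] - display_time[last_stored]
            out.append((number, last_stored, d))
        if kind in ("I", "P"):  # only I and P pictures are stored
            last_stored = number
    return out

order = [("I", 1), ("P", 5), ("B", 2), ("B", 3), ("B", 4), ("P", 10),
         ("B", 6), ("B", 7), ("B", 8), ("B", 9), ("P", 12), ("B", 11),
         ("I", 14), ("B", 13)]
# Hypothetical uniform timing: picture n is displayed at time n.
uniform = {n: n for _, n in order}
differences = display_time_differences(order, uniform)
```

The (current, reference) index pairs in `differences` match the subscripts of the transmitted values listed above, and the B-picture entries are negative because those pictures are displayed before their reference.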
`
`For the purpose of motion vector estimation, the accuracy
`requirements for $D_{i,j}$ may vary from picture to picture. For
`example, if there is only a single B-frame picture B6 halfway
`between two P-frame pictures P5 and P7, then it suffices to
`send only:
`
`$D_{7,5} = 2$ and $D_{6,7} = -1$
`
`where the $D_{i,j}$ display time values are relative time values.
`If, instead, video picture B6 is only one quarter the distance
`between video picture P5 and video picture P7 then the
`appropriate $D_{i,j}$ display time values to send would be:
`
`$D_{7,5} = 4$ and $D_{6,7} = -1$
`
`Note that in both of the two preceding examples, the display
`time between the video picture B6 and video picture P7 is
`being used as the display time “unit”; in the latter example, the
`display time difference between video picture P5 and
`video picture P7 is four display time “units”.
`In general, motion vector estimation is less complex if
`divisors are powers of two. This is easily achieved in our
`embodiment if $D_{i,j}$ (the inter-picture time) between two
`stored pictures is chosen to be a power of two as graphically
`illustrated in FIG. 4. Alternatively, the estimation procedure
`could be defined to truncate or round all divisors to a power
`of two.
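One way to see why power-of-two divisors reduce complexity is that the division in the interpolation formula can be replaced by a bit shift. The sketch below is illustrative only: it assumes integer motion vector components, and the rounding behaviour (Python's right shift rounds toward negative infinity) is a property of this sketch, not something the text specifies. The function name is hypothetical.

```python
def scale_by_power_of_two(mv_component, numerator, exponent):
    """Compute mv_component * numerator / 2**exponent with a shift.

    Assumption: integer inputs; the result is rounded toward negative
    infinity, which is how Python's >> behaves for negative values.
    """
    return (mv_component * numerator) >> exponent
```

For example, with a divisor of 4 (exponent 2), a component of 8 and a numerator of 3 give 6 without any division instruction.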
`
`In the case where an inter-picture time is to be a power of
`two, the number of data bits can be reduced if only the
`integer power (of two) is transmitted instead of the full value
`of the inter-picture time. FIG. 4 graphically illustrates a case
`wherein the distances between pictures are chosen to be
`powers of two. In such a case, the $D_{3,1}$ display time value of
`2 between video picture P1 and video picture P3 is
`transmitted as 1 (since $2^1 = 2$) and the $D_{7,3}$ display time value
`of 4 between video picture P7 and video picture P3
`can be transmitted as 2 (since $2^2 = 4$).
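Transmitting only the exponent can be sketched as below. The helper names are hypothetical, and the sketch assumes positive inter-picture times that are exact powers of two, as in the FIG. 4 case.

```python
def encode_power_of_two(d):
    """Return the exponent e such that 2**e == d.

    Assumption: d is a positive exact power of two.
    """
    if d <= 0 or d & (d - 1):
        raise ValueError("inter-picture time must be a positive power of two")
    return d.bit_length() - 1

def decode_power_of_two(exponent):
    """Recover the inter-picture time from the transmitted exponent."""
    return 1 << exponent
```

With these helpers, a display time value of 2 is sent as 1 and a value of 4 is sent as 2, matching the example in the text; the saving grows with larger values, since the exponent needs far fewer bits than the value itself.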
`In some cases, motion vector interpolation may not be
`used. However, it is still necessary to transmit the display
`order of the video pictures to the receiver/player system such
`that the receiver/player system will display the video
`pictures in the proper order. In this case, simple signed integer
`values for $D_{i,j}$ suffice irrespective of the actual display times.
`In some applications only the sign may be needed.
`
`The inter-picture times $D_{i,j}$ may simply be transmitted as
`simple signed integer values. However, many methods may
`be used for encoding the $D_{i,j}$ values to achieve additional
`compression. For example, a sign bit followed by a variable
`length coded magnitude is relatively easy to implement and
`provides coding efficiency.
`One such variable length coding system that may be used
`is known as UVLC (Universal Variable Length Code). The
`UVLC variable length coding system is given by the code
`words:
`
`1 = 1
`2 = 0 1 0
`3 = 0 1 1
`4 = 0 0 1 0 0
`5 = 0 0 1 0 1
`6 = 0 0 1 1 0
`7 = 0 0 1 1 1
`8 = 0 0 0 1 0 0 0 ...
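The code words above follow a regular pattern: the binary form of the value, preceded by one zero for every bit after the first. A minimal Python sketch of an encoder and decoder for this pattern is given below, together with the "sign bit followed by a variable length coded magnitude" scheme mentioned earlier. Bit strings are represented as Python strings of '0' and '1' for readability; the function names, the choice of '1' as the negative sign bit, and the rejection of zero are assumptions of this sketch.

```python
def uvlc_encode(n):
    """Code word for a positive integer: (len - 1) zeros, then n in binary."""
    if n < 1:
        raise ValueError("UVLC code words start at 1")
    bits = bin(n)[2:]
    return "0" * (len(bits) - 1) + bits

def uvlc_decode(stream, pos=0):
    """Decode one code word starting at pos; return (value, next position)."""
    zeros = 0
    while stream[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    if end > len(stream):
        raise ValueError("truncated code word")
    return int(stream[pos + zeros:end], 2), end

def encode_signed(d):
    """Sign bit followed by a UVLC-coded magnitude.

    Assumptions: '1' marks a negative value, and d is non-zero.
    """
    if d == 0:
        raise ValueError("zero has no code word in this sketch")
    return ("1" if d < 0 else "0") + uvlc_encode(abs(d))

def decode_signed(stream, pos=0):
    """Inverse of encode_signed; return (value, next position)."""
    sign = -1 if stream[pos] == "1" else 1
    magnitude, end = uvlc_decode(stream, pos + 1)
    return sign * magnitude, end
```

Small magnitudes, which are the common case for inter-picture times, get the shortest code words, which is where the coding efficiency comes from.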
`
`Another method of encoding the inter-picture times may
`be to use arithmetic coding. Typically, arithmetic coding
`utilizes conditional probabilities to effect a very high
`compression of the data bits.
`Thus, the present invention introduces a simple but
`powerful method of encoding and transmitting inter-picture
`display times. The encoding of inter-picture display times
`can be made very efficient by using variable length coding
`or arithmetic coding. Furthermore, a desired accuracy can be
`chosen to meet the needs of the video decoder, but no more.
`The foregoing has described a system for specifying
`variable accuracy inter-picture timing in a multimedia
`compression and encoding system. It is contemplated that
`changes and modifications may be made by one of ordinary
`skill in the art, to the materials and arrangements of elements
`of the present invention without departing from the scope of
`the invention.
`We claim:
`
`1. For a bitstream comprising a first video picture, a
`second video picture, and a third video picture, a method of
`decoding comprising:
`computing a particular value that is based on (i) a first
`order difference value between an order value for the
`third video picture and an order value for the first video
`picture, and (ii) a second order difference value
`between an order value for the second video picture and
`the order value for the first video picture;
`computing a particular motion vector for the second video
`picture based on the particular value and a motion
`vector for the third video picture; and
`decoding at least one video picture by using the computed
`motion vector.
`2. The method of claim 1, wherein an order value for a
`particular video picture is for specifying a position for the
`particular video picture in a sequence of video pictures.
`3. The method of claim 2, wherein the sequence is a
`sequence for displaying the video pictures.
`4. The method of claim 1, wherein an order value for a
`particular video picture is representative of a positional
`relationship of the particular video picture with respect to
`another video picture.
`5. The method of claim 1, wherein the particular value is
`proportional to the second order difference value.
`6. The method of claim 1, wherein the particular value is
`inversely proportional to the first order difference value.
`7. The method of claim 1, wherein the order value for the
`second video picture is derived from a value stored in a slice
`header that is associated with the second video picture.
`8. The method of claim 1, wherein the second video
`picture is decoded by using the computed motion vector.
`9. The method of claim 8, wherein the first and third video
`pictures are decoded before the second video picture.
`10. A computer readable medium storing a computer
`program for decoding at least one video picture from a set
`comprising a first video picture, a second video picture, and
`a third video picture, the computer program executable by at
`least one processor, the computer program comprising sets
`of instructions for:
`computing a particular value that is based on (i) a first
`order difference value between an order value for the
`third video picture and an order value for the first video
`picture, and (ii) a second order difference value
`between an order value for the second video picture and
`the order value for the first video picture;
`computing a particular motion vector for the second video
`picture based on the particular value and a motion
`vector for the third video picture; and
`decoding at least one video picture by using the computed
`motion vector.
`11. The computer readable medium of claim 10, wherein
`an order value for a particular video picture is for specifying
`a position for the particular video picture in a sequence of
`video pictures.
`12. The computer readable medium of claim 11, wherein
`the sequence is a sequence for displaying the video pictures.
`13. The computer readable medium of claim 10, wherein
`an order value for a particular video picture is representative
`of a positional relationship of the particular video picture
`with respect to another video picture.
`14. The computer readable medium of claim 10, wherein
`the particular value is proportional to the second order
`difference value.
`15. The computer readable medium of claim 10, wherein
`the particular value is inversely proportional to the first order
`difference value.
`16. The computer readable medium of claim 10, wherein
`the order value for the second video picture is derived from
`a value stored in a slice header that is associated with the
`second video picture.
`17. The computer readable medium of claim 10, wherein
`the second video picture is decoded by using the computed
`motion vector.
`18. The computer readable medium of claim 17, wherein
`the first and third video pictures are decoded before the
`second video picture.
`