`Lee et al.
`
`[54] INTERFRAME VIDEO ENCODING AND
`DECODING SYSTEM
`
[75] Inventors: Chong U. Lee; Donald Pian, both of
San Diego, Calif.
`
`[73] Assignee: Qualcomm Incorporated, San Diego,
`Calif.
`
`[21] Appl. No.: 532,042
`
[22] Filed: Sep. 21, 1995
`
Related U.S. Application Data

[63] Continuation of Ser. No. 407,427, Mar. 17, 1995, abandoned,
which is a continuation of Ser. No. 12,814, Feb. 3, 1993,
abandoned.

[51] Int. Cl.6 ............ H04N 7/32
[52] U.S. Cl. ............. 348/413; 348/416
[58] Field of Search ...... 348/402, 409, 412, 413, 415, 416, 699, 407

[56] References Cited
`
`U.S. PATENT DOCUMENTS
`
4,179,709   12/1979   Workman .............. 358/133
4,293,920   10/1981   Merola ............... 364/725
4,504,860    3/1985   Nicol et al. ......... 358/133
4,774,574    9/1988   Daly et al. .......... 358/133
4,776,030   10/1988   Tzou ................. 382/56
4,796,087    1/1989   Guichard et al. ...... 348/402
4,807,042    2/1989   Tanaka ............... 358/260
4,816,906    3/1989   Kummerfeldt et al. ... 348/402
4,821,119    4/1989   Gharavi .............. 348/402
4,922,341    5/1990   Strobach ............. 358/136
4,924,309    5/1990   Hartnack et al. ...... 358/133
4,984,076    1/1991   Watanabe et al. ...... 358/133
5,021,891    6/1991   Lee .................. 358/432
5,045,938    9/1991   Sugiyama ............. 358/133
5,073,820   12/1991   Nakagawa et al. ...... 358/133
5,107,345    4/1992   Lee .................. 358/432
5,126,857    6/1992   Watanabe et al. ...... 358/433
5,151,784    9/1992   LaVagetto et al. ..... 348/416
5,235,419    8/1993   Krause ............... 348/416
`
US005576767A
[11] Patent Number: 5,576,767
[45] Date of Patent: Nov. 19, 1996
`
`OTHER PUBLICATIONS
`
`Dinstein et al. "Variable Block-Size Transform Image
`Coder," IEEE Transactions on Comm., pp. 2073-2078, Nov.
`1990.
`Reininger et al. "Distributions of the Two-Dimensional
`DCT Coefficients for Images," IEEE Transactions on
`Comm., pp. 835-839, Jun. 1983.
Ahmed et al. "Discrete Cosine Transform," IEEE Transactions
on Computers, pp. 90-93, Jan. 1973.
Comstock et al. "Hamming Coding of DCT-Compressed
`Images Over Noisy Channels," IEEE Transactions on
`Comm., pp. 856-861, Jul. 1984.
`
`(List continued on next page.)
`
`Primary Examiner-Tommy P. Chin
`Assistant Examiner-A. Au
Attorney, Agent, or Firm-Russell B. Miller; Sean English
`
`[57]
`
`ABSTRACT
`
`A video compression system and method for compressing
`video data for transmission or storage by reducing the
`temporal redundancy in the video data is described. A frame
`of video data is divided into a variable number of blocks of
`pixel data of varying size, and each block of data is
`compared to a window of pixel data in a reference frame of
`pixel data, typically the previous frame. A best matched
`block of pixel data is selected from the window of pixel data
`in the reference frame, and a displacement vector is assigned
`to describe the selected block location in the reference frame
`relative to the current block of pixel data. The number and
`size of the blocks of pixel data are permitted to vary, in order
`to adapt to motion discontinuities in the sequential frames of
`pixel data. This is to allow prediction blocks of pixel data in
`the current frame to be smaller in areas of high activity,
`while maintaining high levels of compression, achieved by
`using larger prediction blocks, in areas of the frame with low
`levels of activity. A frame of predicted pixel data is
`assembled from variable size blocks of prediction data and
subtracted from the current frame of pixel data. Only the
residual difference, the displacement vectors and an indication
of the block sizes used in the prediction are needed for
transmission or storage.
`
`24 Claims, 6 Drawing Sheets
`
`Vedanti Systems Limited - Ex. 2005
`Page 1
`
`
`
`
`OTHER PUBLICATIONS
`
`Roese et al. "Interframe Cosine Transform Image Coding,"
`IEEE Transactions on Comm., pp. 1329-1339, Nov. 1977.
`Chen et al. "Adaptive Coding of Monochrome and Color
`Images," IEEE Transactions on Comm., pp. 1285-1292,
`Nov. 1977.
`
`Chen et al. "Scene Adaptive Coder," IEEE Transactions on
`Comm., pp. 225-232, Mar. 1984.
`
`Strobach, Peter. "Quadtree-Structured Recursive Plane
`Decomposition Coding of Images," IEEE Transactions on
`Comm., pp. 1380-1397, Jun. 1991.
`
[Drawing sheet: FIG. 1, block diagram of the interframe encoder
(intraframe encoder 10, channel encoder 12, intraframe decoder 14,
frame buffer 16, motion-compensated predictor 18, motion estimator
20, summers 22 and 24); FIG. 2, block diagram of the interframe
decoder (channel decoder 30, intraframe decoder 32,
motion-compensated predictor 34, summer 36, frame buffer 38).]
`
`
`
[Drawing sheet 2 of 6: FIG. 3a, current pixel frame 40 with pixel
block 42 and pixel 46; FIG. 3b, reference frame with pixel block 52.]
`
`
`
[Drawing sheet 3 of 6: FIG. 4a, one 32x32 block with distortion
D32(k,l); FIG. 4b, four 16x16 sub-blocks with distortions D16;
FIG. 4c, sixteen 8x8 sub-blocks with distortions D8; FIG. 4d,
sixty-four 4x4 sub-blocks with distortions D4; FIG. 5, composite
32x32 block following the boundary of a motion discontinuity.]
`
`
`
[Drawing sheet: FIG. 9, motion compensated prediction element
(block selector, block combiner and pixel block buffer; inputs are
the motion vector, the motion block size assignment and data from
the frame buffer; the composite output goes to summer 22); FIG. 6,
block size assignment element, in which
D2N(k,l) = DN(k,l) + DN(k+N,l) + DN(k,l+N) + DN(k+N,l+N)
is formed and passed to a compare stage.]
`
`
`
[Drawing sheet: FIG. 7, adaptive block size motion estimation
element. A 32x32 pixel block from the frame buffer feeds motion
vector calculators (MVC32, MVC16, MVC8, MVC4) and distortion
calculators (DC32, DC16, DC8, DC4); distortion summers (DS16, DS8,
DS4), weighting factor multipliers (WFM16, WFM8, WFM4) and
comparators produce the outputs P, Q and R for registers PREG,
QREG and RREG.]
`
`
`
[Drawing sheet: FIG. 8, motion vector selectors controlled by the
P, Q and R registers (PREG, QREG, RREG), selecting among motion
vectors MV32, MV16, MV8 and MV4 for output to predictor 18.]
`
`
`
INTERFRAME VIDEO ENCODING AND
DECODING SYSTEM

This is a continuation of application Ser. No. 08/407,427,
filed Mar. 17, 1995, abandoned, which is a continuation of
application Ser. No. 08/012,814, filed Feb. 3, 1993, now
abandoned.

BACKGROUND OF THE INVENTION

I. Field of the Invention
The present invention relates to image processing. More
particularly, the present invention relates to a novel and
improved system and method for interframe video coding
based upon motion compensated predictive discrete cosine
transform coding techniques.
II. Description of the Related Art
In the field of transmission and reception of television
signals, various improvements are being made to the NTSC
(National Television Systems Committee) System. Developments
in the field of television are commonly directed
towards a high definition television (HDTV) system. In the
early development of HDTV, system developers have
merely applied the Nyquist sampling theorem and low-pass
filter design with varying degrees of success. Modulation in
these systems amounts to nothing more than a simple
mapping of an analog quantity to a value of signal amplitude
or frequency.
It has most recently been recognized that it is possible to
achieve further improvements in HDTV systems by using
digital techniques. Many of the early HDTV transmission
proposals share common factors. These systems all involve
digital processing of the video signal, which necessitates
analog-to-digital (A/D) conversion of the video signal. An
analog transmission format is then used, thereby necessitating
conversion of the digitally processed picture back to
analog form for transmission.
The receiver/processor must then reverse the process in
order to provide image display. The received analog signal
is therefore digitized, stored, processed and reconstructed
into a signal according to the interface format used between
the receiver/processor and the HDTV display. Furthermore,
the signal is most likely converted back to analog form once
more for display.
Many of the conversion operations mentioned above,
however, may be avoided using a digital transmission format
which transmits the processed picture, along with control,
audio and authorization signals, using digital modulation
techniques. The receiver may then be configured as a digital
modem with digital outputs to the video processor function.
Of course, the modem requires an A/D function as part of
operation, but this implementation may only require a 4-bit
resolution device rather than the 8-bit resolution device
required by analog format receivers.
Digital transmission is superior to analog transmission in
many ways. Digital transmissions provide efficient use of
power, which is particularly important to satellite transmission
and military applications. Digital transmissions also
provide a robustness of the communications link to impairments
such as multipath and jamming. Furthermore, digital
transmission facilitates ease in signal encryption, necessary
for many military and broadcast applications.
Digital transmission formats have been avoided in previous
HDTV system proposals primarily because of the incorrect
belief that they inherently require excessive bandwidth.
Therefore, in order to realize the benefits of digital transmission,
it is necessary to substantially compress the HDTV
signal. HDTV signal compression must therefore be
achieved to a level that enables transmission at bandwidths
comparable to that required by analog transmission formats.
Such levels of signal compression coupled with digital
transmission of the signal will enable an HDTV system to
operate on less power with greater immunity to channel
impairments.
Motion estimation/motion compensation techniques can
be used to compress the amount of data required in the
transmission of interframe coded motion video. Motion
estimation/motion compensation processes exploit the high
level of temporal redundancy in video data. This high level
of temporal redundancy is a necessary attribute of video data
for the picture sequences to appear continuous. The process
of estimating the motion of objects within a video sequence
is known as motion estimation. The processing of images by
compensating for the presence of motion in a scene or image
is motion compensation. The combined processes of motion
estimation and motion compensation produce a prediction of
the current frame of data. The error in this prediction, known
as the residual, can be further compressed and transmitted.
It is therefore an object of the present invention to provide
a novel and improved method and system for compressing
HDTV signals that will enable digital transmission at bandwidths
comparable to that of analog transmission of conventional
TV signals.

SUMMARY OF THE INVENTION

The present invention is a novel and improved system and
method for compressing image data for transmission and for
reconstruction of the image data upon reception. The image
compression system includes a subsystem for generating a
prediction of a block of input pixel data in the current frame
of pixel data from a corresponding composite block of pixel
data from a reference frame of data optimized for encoding
a high precision reproduction of said pixel data at a minimum
transmission data rate.
In the present invention, novel techniques of motion
estimation and motion compensation are employed. In standard
motion estimation and motion compensation applications,
a frame of data is divided into a fixed number of
blocks of pixel data, and a prediction for each of the blocks
of pixel data is calculated. The problem with using a fixed
number of blocks is that using blocks that are too small
results in a low level of compression and requires unacceptably
high bandwidth for transmission, and using blocks that
are too large results in an unacceptably high level of picture
degradation. In the present invention, however, the number
and size of the blocks of pixel data vary in response to the
amount of information in frame sequences. By employing
this technique a high level of data compression can be
achieved without picture degradation.
The sub-system of the present invention comprises first
selection means for providing a first data prediction by
selecting a most similar block of pixel data from a first
reference block of data wherein the first reference block of
data is a block of data from a previous frame or a combination
of blocks of data from previous frames of data, and
at least one additional selection means for providing additional
predictions of the block of pixel data as combinations
of predictions of smaller sub-blocks of pixel data by selecting
sets of most similar sub-blocks of pixel data from
additional reference blocks of data and wherein the additional
reference blocks of data can be data from a previous
frame or a combination of blocks of data from previous
frames of data.
A decision means is included in the sub-system for
selecting, from the first data prediction and the additional
data predictions, an efficient prediction according to its
similarity to the current block of pixel data and the number of
bits required to describe the efficient prediction, and for
providing the efficient prediction as an output together with a
selection signal indicating the block size selection used in
describing the efficient prediction.
`The present invention also provides for a novel and
`improved method for reconstructing from a received
`encoded motion information value a corresponding block of
`pixel data. The present invention further envisions a novel
`and improved method for compressing an image signal as
`represented by a block of pixel data and for reconstructing
`the image signal from the compressed image signal.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`The features, objects, and advantages of the present
`invention will become more apparent from the detailed
`description set forth below when taken in conjunction with
`the drawings in which like reference characters identify
`correspondingly throughout and wherein:
`FIG. 1 is a block diagram of an exemplary interframe
`encoder;
`FIG. 2 is a block diagram of an exemplary interframe
`decoder;
`FIGS. 3a and 3b are illustrations of pixel block space with
`a matching of pixel blocks between frames for the purpose
`of motion prediction;
`FIGS. 4a, 4b, 4c and 4d are exemplary illustrations of
`alternative motion prediction block sizes for a 32x32 block
`of pixel data;
`FIG. 5 is an exemplary illustration of a composite 32x32
`block of predicted pixel data;
`FIG. 6 is a simplified block diagram illustrating the
`processing elements in the block size assignment portion of
`an adaptive block size motion estimation element;
FIGS. 7 and 8 are block diagrams illustrating the processing
elements in an exemplary adaptive block size motion
estimation element; and
FIG. 9 is a block diagram illustrating the processing
elements in an exemplary adaptive block size motion compensated
prediction element.
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENTS
`
`Turning now to the drawings, FIG. 1 illustrates in block
`diagram form an exemplary implementation of a general
`motion-compensated predictive coding (MPC) system. The
`coding system of FIG. 1 is comprised of intraframe encoder
`10, channel encoder 12, intraframe decoder 14, frame buffer
`16, motion-compensated prediction processor or predictor
18, motion estimation processor or estimator 20, and summers
22 and 24. It should be understood that the diagram of
FIG. 1 is common to both conventional motion-compensated
predictive coding systems and the present invention.
However, in the present invention novel motion estimation
and motion compensated prediction techniques are
employed.
`
A sequence of input pictures comprised of pixel data is
input to the coding system of FIG. 1. The pixel data, which
may be provided as serial pixel data or in blocks, typically a
32x32 pixel block or smaller such as 16x16, 8x8, etc., is input
to summer 22 and to motion estimator 20. Summer 22 subtracts
from the input pixel data a corresponding prediction picture
generated by motion-compensated predictor 18. The output
of summer 22 is displaced frame difference (DFD) data
that is provided to intraframe encoder 10.
Intraframe encoder 10 encodes the DFD data using one of
many possible intraframe compression techniques.
Intraframe encoder 10 by way of example may be a block
encoder which codes blocks of the DFD data. Should the
DFD data be in serial form it is converted to block data for
coding. An example of one type of intraframe encoder is the
well known fixed block size discrete cosine transform
(FBSDCT) coder. Another example of an intraframe coder which
has improved performance over the FBSDCT coder is one
using the adaptive block size discrete cosine transform
(ABSDCT) techniques disclosed in U.S. Pat. Nos. 5,021,891
and 5,107,345, and even further using the ABSDCT technique
in combination with a Discrete Quadtree Transform
technique disclosed in copending U.S. patent application
Ser. No. 07/710,216, filed Jun. 4, 1991, each assigned to the
assignee of the present invention and the disclosure of which
is incorporated by reference. Although ABSDCT coders are
preferred, other types of intraframe encoders well known in
the art may be readily substituted therefor. The intraframe
coded data generated by intraframe encoder 10 is provided
both to channel encoder 12 and intraframe decoder 14.
Channel encoder 12 is preferably used for encoding the
intraframe coded data prior to transmission to protect the
data from transmission channel induced errors. Channel
encoder 12 typically uses conventional error correction
encoding techniques. As such, encoder 12 may be a block,
convolutional or trellis coder as is well known in the art. The
encoded intraframe data along with motion vector data are
encoded and provided for transmission by a transmitter (not
shown).
The intraframe coded data provided to intraframe decoder
14 is decoded to produce a reconstructed DFD block.
Intraframe decoder 14 is by its nature of a type which
complements intraframe encoder 10. When intraframe
encoder 10 is of the type mentioned in the above patents and
patent application, the decoder is typically of the corresponding
type as also described therein.
The reconstructed DFD block is provided from intraframe
decoder 14 to summer 24 where it is added with a prediction
picture generated by motion-compensated predictor 18
based upon a reference frame, typically the previous frame,
of the input picture. The output of summer 24, which is a
reconstructed picture, i.e. pixel block, is provided to frame
buffer 16. Frame buffer 16 is thus updated by a reconstructed
frame created by adding the reconstructed DFD that is
produced by intraframe decoder 14 to the prediction picture.
The reason for using the reconstructed DFD rather than the
DFD is to keep frame buffer 16 and the frame buffer in the
decoding system (FIG. 2) synchronized. In other words, by
using the reconstructed DFD the same picture is in both
frame buffer 16 and the frame buffer in the decoding system
(FIG. 2), since the DFD is not available to the decoding
system.
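The synchronization argument above can be illustrated with a minimal sketch. It is not the patent's implementation: the intraframe codec is replaced by a uniform quantizer, the prediction is simply the previous reconstructed frame (zero motion), frames are flat lists of integers, and the function names `quantize`, `encode_sequence` and `decode_sequence` are hypothetical.

```python
def quantize(residual, step):
    """Stand-in for intraframe encoder 10 followed by decoder 14 (assumption):
    a lossy round trip that returns the reconstructed DFD."""
    return [step * round(r / step) for r in residual]


def encode_sequence(frames, step):
    """Closed-loop encoder: the frame buffer holds reconstructed pictures."""
    buffer = [0] * len(frames[0])          # plays the role of frame buffer 16
    coded = []
    for frame in frames:
        prediction = buffer                # zero-motion stand-in for predictor 18
        dfd = [x - p for x, p in zip(frame, prediction)]      # summer 22
        recon_dfd = quantize(dfd, step)    # what the decoder will also obtain
        coded.append(recon_dfd)
        # Summer 24: update the buffer from the reconstructed DFD, not the DFD.
        buffer = [p + d for p, d in zip(prediction, recon_dfd)]
    return coded, buffer


def decode_sequence(coded):
    """Decoder loop: the same additions, without any access to the true DFD."""
    buffer = [0] * len(coded[0])           # plays the role of frame buffer 38
    for recon_dfd in coded:
        buffer = [p + d for p, d in zip(buffer, recon_dfd)]   # summer 36
    return buffer
```

Because both loops add the same reconstructed DFD to the same prediction, the two buffers hold identical pictures after every frame, and the quantization error does not accumulate from frame to frame.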
The reconstructed picture stored in frame buffer 16 is
provided to motion-compensated predictor 18 and motion
estimator 20. Estimator 20 uses the reconstructed picture
stored in frame buffer 16 and the input picture of the next
frame to generate a set of motion vectors for the next input
picture frame. Predictor 18 uses the reconstructed picture
from the current frame stored in frame buffer 16 and the
motion vectors generated in estimator 20 to produce a
prediction for the next frame. The motion vectors from
estimator 20 are also provided to channel encoder 12 for
encoding and transmission along with the intraframe data for
the next frame.
FIG. 2 illustrates in block diagram form an exemplary
implementation of a motion-compensated predictive decoding
system for decoding data encoded according to the
encoding system of FIG. 1. The decoding system of FIG. 2
is comprised of channel decoder 30, intraframe decoder 32,
motion-compensated predictor 34, summer 36 and frame
buffer 38. A motion estimator is not needed in the decoding
system since the motion vectors, along with the intraframe
data, are received from the encoding system. Thus the operation
of the decoding system, except for the channel decoder,
is a subset of the operations of the coding system of FIG. 1.
The signal received from the transmission channel
receiver (not shown) is provided to channel decoder 30.
Channel decoder 30 is of a type capable of decoding the data
as encoded by channel encoder 12 of FIG. 1. Channel
decoder 30 typically uses conventional decoding techniques
and may be configured as a block, Viterbi or convolutional
decoder as is well known in the art. The decoded data
includes both intraframe data and motion vector data. The
intraframe data is provided to intraframe decoder 32 while
the motion vector data is provided to motion compensated
predictor 34.
Intraframe decoder 32 produces from the intraframe data
the reconstructed DFD which is provided to summer 36
along with a prediction picture generated by predictor 34.
Intraframe decoder 32 is preferably of the same type as
intraframe decoder 14 of FIG. 1.
Summer 36 adds the reconstructed DFD to the prediction
picture to produce a reconstructed picture. The current frame
of the reconstructed picture is provided to frame buffer 38
where it is stored. The stored frame is provided to predictor
34 for use in generating the prediction picture for the next
frame.
Returning to FIG. 1, in order to facilitate an understanding
of the generation of the motion vectors of the present
invention a description of conventional motion vector generation
techniques is provided. The input picture is an image
sequence, e.g., motion video, which can be represented by
an intensity function I(x, y, t) where x and y are the spatial
coordinates of the pixel and t is the frame number. Using this
nomenclature the current frame to be encoded may be
identified as I(x, y, t) and a reference frame may be identified
as I_R(x, y, t).
A conventional motion estimator used in a conventional
motion-compensated predictive coding system is based
upon block matching. In the conventional motion estimator
an NxM block of pixels for the current frame is used to
search for a matching block in the reference frame, typically
the previous frame (i.e. I_R(x, y, t) = I(x, y, t-1)). N and M may
be of the same integer value to provide a square pixel block
or may be of different values to provide a rectangular pixel
block.
FIG. 3a illustrates for current pixel frame 40 a selected
NxM pixel block 42. For reference purposes the (x, y)
coordinates of the first pixel in the upper left hand corner of
frame 40, pixel 44, is (0, 0). The (x, y) coordinate of the first
pixel in the upper left hand corner of pixel block 42, pixel
46, is (k1, l1) with this pixel having an intensity function of
I(k1, l1, t). The vector associated with the location of pixel
block 42 within frame 40 is (k1, l1).
A search for the best matching block in the reference
frame is conducted over a search window within the reference
frame. The "goodness" of the match is often measured
by a block matching distortion function F_BMD(x, y) which
for example may be a magnitude difference or squared error,
e.g. F_BMD(x, y) = |x - y| or F_BMD(x, y) = (x - y)^2.
FIG. 3b illustrates the reference pixel frame 48 and a
search window 50 over which the best match for pixel block
42' is to be searched. In FIG. 3b the position in reference
frame 48 which the pixel block 42' occupied in the current
frame is indicated in dotted lines by pixel block 52. The (x,
y) coordinate of the first pixel in the upper left hand corner
of pixel block 52, pixel 54, is (k1, l1) having an intensity
function of I_R(k1, l1, t).
A full-search block-matching motion estimator computes
the distortion figure D_{N,M}:

$$D_{N,M}(k,l,i,j)=\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}F_{BMD}\big(I(k+n,\,l+m,\,t),\;I_R(k+n+i,\,l+m+j,\,t)\big) \qquad (1)$$

If F_BMD(x, y) = |x - y|, Equation 1 simplifies to the following:

$$D_{N,M}(k,l,i,j)=\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}\big|I(k+n,\,l+m,\,t)-I_R(k+n+i,\,l+m+j,\,t)\big| \qquad (2)$$
`
The motion estimator further searches for the motion
displacement (i, j) that minimizes D_{N,M}(k, l, i, j) to produce
the minimum block distortion:

$$D_{N,M}(k,l)=\min_{(i,j)\in S} D_{N,M}(k,l,i,j) \qquad (3)$$

where S is the search window.
From Equation 3 the displacement (i, j) that produces the
minimum motion block distortion is called the motion vector
(mx, my). The motion vector search range is often restricted
to a maximum offset, typically -32 ≤ mx, my ≤ 32. It should
be noted that mx and my need not be integer values. These
values can be fractional numbers if a sub-pixel accuracy is
used in motion estimation and compensation. It should be
noted that extra processing, including interpolation, is
required to produce non-integer pixels.
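As a rough illustration of the full-search procedure, the sketch below evaluates the absolute-difference distortion over every integer displacement in a square window and keeps the minimum. It assumes frames stored as lists of rows indexed `frame[y][x]`, integer displacements only, and candidates that fall outside the reference frame being skipped; the names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, k, l, i, j, n, m):
    """Sum of absolute differences between the n x m block at (k, l) in the
    current frame and the block displaced by (i, j) in the reference frame."""
    return sum(
        abs(cur[l + b][k + a] - ref[l + b + j][k + a + i])
        for a in range(n)
        for b in range(m)
    )


def full_search(cur, ref, k, l, n, m, max_offset):
    """Minimize the distortion over all (i, j) with |i|, |j| <= max_offset.
    Returns (minimum distortion, (i, j)), or None if no candidate fits."""
    height, width = len(ref), len(ref[0])
    best = None
    for j in range(-max_offset, max_offset + 1):
        for i in range(-max_offset, max_offset + 1):
            # Skip displaced blocks that would leave the reference frame
            # (a boundary policy assumed here, not taken from the text).
            if k + i < 0 or l + j < 0 or k + i + n > width or l + j + m > height:
                continue
            d = sad(cur, ref, k, l, i, j, n, m)
            if best is None or d < best[0]:
                best = (d, (i, j))
    return best
```

The displacement returned by `full_search` plays the role of the motion vector (mx, my). Sub-pixel accuracy would additionally require interpolating the reference frame, which this sketch omits.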
Using the above equations for the example illustrated in
FIG. 3b, a pixel block 42' in the reference frame is identified
as having the best match. The (x, y) coordinate of the first
pixel in the upper left hand corner of pixel block 42', pixel
46', is (k1+i1, l1+j1) having an intensity function of
I_R(k1+i1, l1+j1, t). The motion vector associated with the
location of pixel block 42' with respect to pixel block 52 in
the reference frame 48 is (i1, j1).
Referring back to FIG. 1, motion estimator 20 differs from
a conventional motion estimation processor in that more
than one block size is used in the motion estimation. For
convenience, let the motion block be an NxN square (i.e.
M=N so that D_{N,M}(k, l) = D_N(k, l)). In an exemplary
configuration four motion block sizes are used, for example
N=4, 8, 16 and 32. In the exemplary configuration, there will be
one motion block distortion D_32(k, l) for the 32x32 base block,
four motion block distortions, one for each of the four 16x16
blocks within that 32x32 block, sixteen motion block
distortions, one for each of the sixteen 8x8 blocks within the
32x32 block, and sixty-four motion block distortions, one for
each of the sixty-four 4x4 blocks within the 32x32 block.
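The block counts for the four sizes follow from tiling the 32x32 base block, which a short sketch can confirm. The helper name `sub_block_origins` and the row-by-row ordering of the corners are assumptions made for illustration.

```python
def sub_block_origins(k, l, base, n):
    """Top-left (x, y) corners of the n x n blocks that tile the
    base x base block whose own top-left corner is (k, l)."""
    return [
        (k + dx, l + dy)
        for dy in range(0, base, n)
        for dx in range(0, base, n)
    ]


# One motion block distortion is computed per corner, for each block size.
counts = {n: len(sub_block_origins(0, 0, 32, n)) for n in (32, 16, 8, 4)}
```

Here `counts` comes out as one 32x32 block, four 16x16 blocks, sixteen 8x8 blocks and sixty-four 4x4 blocks, for 85 distortions per base block in total.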
`
FIGS. 4a-4d illustrate, in an exemplary configuration,
alternative block sizes used for motion estimation of the
32x32 block of pixel data. The block of pixel data in the
current frame can be predicted by the selection of a single
best match of a 32x32 block of pixel data from the reference
frame as shown in FIG. 4a or alternatively can be predicted
by selection from a reference frame of best match blocks in
the form of four 16x16 sub-blocks of pixel data in FIG. 4b,
sixteen 8x8 sub-blocks of pixel data in FIG. 4c, or sixty-four
4x4 sub-blocks of pixel data in FIG. 4d. Alternatively the block
of pixel data can be predicted by a composite of 16x16, 8x8,
and 4x4 blocks of pixel data from the reference frame as
illustrated in FIG. 5.
In motion compensated prediction, predictions from the
composite of smaller sub-blocks of pixel data will, in
general, result in a distortion level lower than or equal to
the distortion level resulting from predictions
from a larger block. However, motion compensated prediction
from smaller sub-blocks of pixel data requires the
transmission of more information, i.e. more motion vectors.
The goal of adaptive motion estimator 20 of FIG. 1 is to
select smaller sub-block predictions only when doing so results
in a significant improvement in local pixel reproduction quality.
In a pixel area where a great amount of activity is present
the predictions using the larger block sizes will result in
unacceptably high levels of distortion. On the other hand, in
areas where little or no activity is taking place the larger
block sizes are acceptable and result in greater compression
at little cost to picture quality.
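The trade-off described above can be sketched as a single split decision. The rule used here is an assumption for illustration only and is not taken from this passage: the four sub-blocks replace the larger block only when their summed distortion, scaled by a hypothetical penalty `weight` greater than 1 that stands in for the cost of the extra motion vectors, is still below the distortion of the larger block.

```python
def choose_split(parent_distortion, child_distortions, weight):
    """Return True when four sub-blocks should replace the larger block.

    weight > 1 is a hypothetical penalty for sending four motion vectors
    instead of one; a larger weight demands a larger improvement.
    """
    return weight * sum(child_distortions) < parent_distortion


# High-activity area: the sub-blocks cut distortion substantially, so split.
busy = choose_split(1000, [200, 210, 190, 220], 1.1)

# Low-activity area: the gain is marginal, so keep the larger block.
quiet = choose_split(1000, [240, 250, 230, 245], 1.1)
```

With these numbers `busy` is true and `quiet` is false. Applying the same comparison from the smallest block size upward would yield a composite of mixed block sizes of the kind shown in FIG. 5.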
The adaptive block size motion estimation processor
produces the composite motion block by choosing an optimum
combination of large and small motion blocks that best
describe the motion within the image block. This is to
produce an improved motion compensated prediction such
that the residual D