`
`US006633611B2
`
`(12) United States Patent
`Sekiguchi et al.
`
`(10) Patent N0.:
`(45) Date of Patent:
`
`US 6,633,611 B2
`*Oct. 14, 2003
`
`METHOD AND APPARATUS FOR
`REGION-BASED MOVING IMAGE
`ENCODING AND DECODING
`
`Inventors: Sliunichi Snkiguchi, Tokyo (JP);
`Ynshtmt Isu, Tokyo (JP); Kulitam
`Asai, Tokyo (JP)
`-
`
`Assignee: Mitsubishi Denki Kabushiki Kaisha,
`'1hkyo (JP)
`
`Notice:
`
`This patent issued on a continued pros-
`ecution application filed under 37 CFR
`l.53(zl), and is subject to the twenty year
`patent
`term provisions of 35 U.S.C'.
`tS4(a)(2).
`
`Subject to any disclaimer. the term of this
`patent is extended or adjusted under 35
`U.S.C. 154{b) hy 146 days.
`
`(21) App}, No.: l}8f956,1t'tfi
`
`(23) Fiied:
`
`Oct. 24, 1997
`
`(65)
`
`Prior Ptibltcntion Data
`US 2t'.l03ftJtB5=t77 A1 Feb. 20, 2003
`
`5,572,253
`5,596,659
`5,751,353
`5,852,469
`5,290,175
`5,917,949
`5,974,137
`5,091,668
`G_.U38,3‘)7
`6,205,260 B1 '—'
`
`bIP>>>>>3-35
`
`1.l)'t996 Yokoytima
`H1997 Nomiile et til.
`Sm.-as
`|‘ttiya.muto
`"“ 131993 Nagai et at.
`"-'
`4.319519 Dag; et at.
`.
`‘“
`6/1999 Chun et al.
`“ 1911999‘
`like ............
`ll/1990 Corset er al.
`?I2(|0t} Jearmin
`3;'2tI[l1
`'
`
`FOREIGN PATENT DOCUMENTS
`
`30630 A1
`061.955 2 A1
`
`3/1989
`10/1994
`
`(List continued on next page.)
`OTHER PUBLICATIONS
`
`IEEE Transactions on Circuits and Systems for Video Tech-
`nology, "The MPEG-4 Video Standard Verification Model”.
`Thomas Siltora, vol. 7. No. 1, Feb. 1997, S. 19 bis 31.
`
`(List continued on next page.)
`
`Prt'marj= Exantinsr—Chris Kelley
`Assistant Exrmtirter—Tung Vb
`
`(57)
`
`ABSTRACT
`
`Foreign Application Priority Data
`(30)
`Apr. 24, 1997
`(JP)
`Sep. 25, 1997
`(JP)
`
`9-1070-:2
`9261420
`
`H0411 IJ66
`Int. Cl.’
`(51)
`3-75,240.16
`(52) U.S. Cl.
`3751240, 249.15,
`(53) Field of Search.
`375240.17, 240.03, 240.05, 240.11-, 348/384.1,
`416.t.466.1; 382/232, 233; H643 use
`
`(56)
`
`References Cited
`U.S. PATENT DOCUMENTS
`
`5,021,379 A
`5,453,787 A *
`5,555_.0:t3 A
`
`611991 Vogel
`SU1995 Hancock et al.
`9:'1996 Bazzaz
`
`............ 348339]
`
`In partitioning and encoding an image into multiple regions,
`the degree of freedom of the region shape has generally been
`low and setting regions based on image features was ditfi-
`cult. A moving image encoding apparatus includes a region
`partitioning section, an encoder, and a memory for motion-
`compensated prediction. The region partitioning section
`includes a partitioning processing section and 3 integration
`processing section. The partitioning processing section pat-
`litions the input image based on a criterion relating to the
`state of partition. The integration processing section into-
`grates rntitttally close regions based on it criterion relating to
`the state of integration. Thereafter, each region is encoded.
`A large variety of region shapes can be produced by the
`integmtion processing section.
`'
`
`18 Claims, '21 Drawing Sheets
`
`
`
`US 6,633,611 B2
`Page 2
`
`FOREIGN PATENT DOCUMENTS
`
`OTHER PUBLICATIONS
`
`511995
`“W445 3'9
`H1997
`753970 A1
`2/1997
`757334 A2
`zfiszgfisigfg A2’ 153:;
`6_16y449 A
`fifiggi‘
`3.45953 A
`2,1996
`9-507347 A
`771997
`1CID8621 A
`4.-‘I998
`W0 G5/1-13-19 A
`“#1994
`9717797
`571997
`
`“The Principle and Application of Data Compression”,
`published by Zhu Lin PubJ1'shings,May 1996.
`International Tclccommunjcation Union; "Linc 'I\'ansmi5-
`sion of Nonaclcphone Signals"; DRAI-'-"I'—ITU—-'I' Recom-
`l'I1Cl1(1'dllDl1
`5,
`L.C. Real at al; “A Very Low Bit Rate Video Code: Based
`on Wcwr Quantization"; IEEE Transactions on Image Pm-
`cessing; vol. 5; No. 2; (Feb. 2, 1996); pp. 263 through 273.
`
`* cited by examiner
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`12fl.014l8n...hS
`
`US 6,633,611 B2
`
`Eosms
`
`E<moanT9".
`
`
`
`._0m_._.ZOOezaoozm
`
`zoEE,_§
`
`Em>_.__
`m2._=_§[E.mEa.
`22%,_.o:E,_m
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`1.2f02.la._HS
`
`US 6,633,611 B2
`
`E
`
`E<moan.N.9”.
`
`zo_._.om_m
`
`.EoEs_
`
`.mos:
`
`
`.<z_...EEzo:
`
`E_,_wE"__n_em_m%%M¢mm_ao_om%o_homm
`
`zoamm¢_aE
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`Sheet 3 of 21
`
`Us 6,633,611 B2
`
`3
`
`PARTL
`
`REGION '-
`TIONING “ ENCODEH
`SECTION -I
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`Sheet 4 of 21
`
`US 6,633,611 B2
`
`ENCODE REGION
`.Sn
`
`S5
`
`FINAL
`REGION?
`
`
`
`U.S. Patent
`
`Oct. 14,2003
`
`Sheet 5 of 21
`
`US 6,633,611 B2
`
`INTEGRATION
`
`PROCESSING
`SECTION
`
`PARTITIONING
`PROCESSING
`
`SECTION
`
`UNIFORM
`PARTIT|0N-
`
`ING
`
`ACTIVITY
`CALCULAT-
`
`ING
`
`SECTION
`
`SECTION
`
`PARTITION-
`ING
`
`JUDGMENT
`
`SECTION
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`Sheet 6 of 21
`
`US 6,633,611 B2
`
`ACTIVI
`F BLOCK a°,,>TH
`:2
`
`ACTWIT
`F BLOCK a1,,>TH1
`?
`
`OD
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`Sheet 7 of 21
`
`Us 6,633,611 B2
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`12«.10on.HehS
`
`US 6,633,611 B2
`
`8:33:amzo_m_..._6
`
`
`
`
`_,_o_§_._§m_3_o_m_§_._;_§3<
`
`zopam2::
`
`.Eoouma
`
`_,_o_E$=__
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`Sheet?! of 21
`
`Us 6,633,611 B2
`
`
`
`Oct. 14, 2003
`
`Sheet 10 of 21
`
`US 6,633,611 B2
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`Sheet 11 of 21
`
`US 6,633,611 B2
`
`INTEGRATION
`PROCESSING
`sEc'n°N
`
`UNIFORM
`PAFITITIONING
`SECTION
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`.12on.021tBE_hS
`
`US 6,633,611 B2
`
`
`
`.zo_._._._....._<n_>._._>_._.o<
`
`
`
`oz..._.<.=._O._<o
`
`=,_m_s_oS_.62.
`
`
`
`zozommzocomm
`
`m.....<._o
`
`uz_tE_me
`
`zofimm
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`Sheet 13 of 21
`
`US 6,633,611 B2
`
`START
`
`UNIFORM
`PARTITIONING
`
`S26
`
`CLASS
`DETERMINATION 527
`
`28S
`
`cm
`I=aLocIga°,.>TH
`
`N
`
`'0 TO PAHTITIO
`
`PAFITIO
`BLOCK B
`3‘
`
`CTIVI
`F BL0cI§?B',.>TH
`
`938
`IE5]
`
`FEATURES
`MEMORY
`
`nscaee or
`
`FEATURE
`COINCIDENCE
`CALCULATING
`SECTION
`
`CLASS
`DETERMI-
`
`5'E;Tf'%N4
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`12........04ItEB‘HS
`
`US 6,633,611 B2
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`r.0.314|.6BhS
`
`7..
`
`4|.
`
`US 6,633,611 B2
`
`..zm:S=_.
`
`zozammmooomn
`
`_s_§$z=s
`
`
`o_,___,_o=zo_E$z_:._
`
`.E_s_4.33.358%
`zofiasm§o_m_§__
`
`
`
`U.S. Patent
`
`041.14, 2003
`
`II-BehS
`
`1..2f061.
`
`US 6,633,611 B2
`
`new_s=s3._.38
`3:523:33
`___m3.83:238
`
`38zo_=33_._.m_.1..3=52:
`E:_3._3
`
`2:3._8§sm.3...
`
`.383E3.__¢Es33m UE53
`_......_3333”.38
`zoE_2m3
`
`
`
`EEE3.383=__s_..§53=o
`
`
`
`3.._8s.§.3".
`
`
`
`am:v.._.,.__3:w
`
`_m393
`
`8.33.
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`.tEE.n..S
`
`.1
`
`.12cl07.
`
`US 6,633,611 B2
`
`;_m_=8a.
`
`
`
`zo_§_B=.___s_§_m25
`
`zo_._.omm322,____§m
`
`_,_o_Em3:5
`
`
`
`535_,_o__=_2mmaooma
`
`
`
`4__.w.4_%_.w__Qm__,E:_,_o_m_>2._
`
`_s_E_._§_c
`
`55:52..
`
`zozsm25¢
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`Sheet 13 of 21
`
`Us 6,633,611 B2
`
`UPDATE QUANTIZATION
`PARAMETER
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`Sheet 19 of 21
`
`US 6,633,611 B2
`
`zo_._.<m6m._.z_
`
`_s_:3.E
`
`533ms;
`
`E8zo_s_.E.__
`
`
`
`zofimm¢=_=s§.o
`
`
`
`U.S. Patent
`
`Oct. 14, 2003
`
`I2£1002teehS
`
`US 6,633,611 B2
`
`mgenfi
`
`
`
`34:..:m
`
`
`
`uz_mo5m_m__§_Em
`
`
`
`zozommEu.._s._<
`
`298:
`
`_§§_,E,_8
`
`283“
`
`
`
`U. S. Patent
`
`Oct. 14, 2003
`
`..HS
`
`12I.E...c
`
`..U
`
`1291
`
`US 6,633,611 B2
`
`
`
`Becomemmamzma
`
`m.___.6._§
`
`__m.
`
`
`
`5.5....308..
`
`
`
`=9.zoE:.E9_z_
`
`gmzo_om_m
`
`
`
`mug.maoomo
`
`:8Ea
`
`smzoamm
`
`
`
`Em__EE_.:_mmE.a_.=
`
`«mmm_._§...zo_§_m_%5w_
`
`E:25.5ma8m_.._
`
`
`
`mo".zo=<==9_z_
`
`._m.._o_om_¢
`
`
`
`US 6,633,611 B2
`
`1
`METHOD AND APPARATUS FOR
`REGION-BASED MOVING IMAGE
`ENCODING AND DECODING
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`
`This invention relates to a method and apparatus for
`inputting and encoding a moving image and to an apparatus
`for decoding the encoded moving image. This invention
`particularly relates to a technique for encoding an image
`frame by first partitioning it into multiple regions and to a
`technique for decoding the encoded image frame.
`2. Description of the Related A11
`FIG. 1 is a block diagram of a first prior art showing the
`configuration of a moving image encoder based on l'l'U-T
`recommendation H.263, wherein numeral 1 indicates an
`input digital image signal (hereinafter referred to simply as
`an input
`image), numeral 101 indicates a diiferentiator,
`numeral 102 indicates a prediction signal, numeral 103
`indicates a prediction error signal, numeral 104 indicates an
`encoder, numeral 105 indicates encoded data, numeral 106
`indicates a decoder, numeral 10’? indicates a decoded pre-
`diction error signal, numeral 108 indicates an adder, numeral
`109 indicates a local decoded image signal, numeral 1.10
`indicates a memory, numeral 111 indicates a prediction
`section, and numeral 112 indicates a motion vector
`The input image 1 to be encoded is lirst input to differ-
`entiator 101. Dillcrentiator 101 takes the diflierence between
`input
`image 1 and prediction signal 102 for output as
`prediction error signal 103. Encoder 104 encodes input
`image 1, which is an original signal, or prediction error
`signal 103. and outputs encoded data 105. The encoding
`method in encoder 104 employs a technique in the above-
`mentioned recommendation where prediction error signal
`183 is transformed from a space region to a frequency region
`using Discrete Cosine Transformation (DCT), a type of
`orthogonal transfortzualion, and the obtained transformation
`coelficient is linearly quantized.
`Encoded data 105 is branched into two directions, where
`one is transmitted to a receiver, or an image decoding
`apparatus (not shown) and the other is input to decoder 106
`within the present apparatus. Decoder 105 performs an
`operation which is the opposite of encoder 104, and gener-
`ates and outputs decoded prediction error signal 107 from
`encoded data 105. Adder 103 adds prediction signal 102
`with decoded prediction error signal 107 and outputs the
`result as decoded image signal 109. Prediction section 11.1.
`performs n:totion—comp-ensated prediction using input image
`1 and decoded image signal 109 of the previous frame stored
`in memory 110, and outputs prediction signal 102 and
`motion vector 112. At this time, motion compensation is
`performed in block units of a fixed size called a macro block
`comprising 16x16 pixels. As an optional function for a block
`within a
`region having large movements, motion-
`comperusated prediction can be performed with the macro
`block partitioned into four sub—block units of 8x3 pixels.
`The obtained motion vector 112. is transmitted toward the
`image decoding apparatus, and prediction signal 102 is sent
`to diflferentiator 102 and adder 108. According to this
`apparatus, the amount of data of the moving image can be
`compressed while maintaining image quality through the use
`of motion-compensated prediction.
`In this prior art, the shape of the encoding unit region is
`limited to two types. Moreover, both shapes are rectangular.
`Therefore, there is naturally a limit in the encoding which
`
`2
`can be adapted to the scene structure or features of an image.
`For example, if it is desired to increase the amount of code
`only for an object having large movements, it is preferable.
`although difiicult in this prior art, to define a region having
`a shape identical to that of the object.
`FIG. 2 is a biock diagram of an image encoding apparatus
`concerning a second prior art. This apparatus is based on an
`encoding method that was proposed in “A Very Low Bit
`Rate Video Coder Based on Vector Quantization" by I.. C.
`Real at al (IEEE Transactions on Image Processing, Vb}. 5,
`No. 2, liebruargr 1996). In the same figure, numeral 113
`indicates a region partitioning section, numeral 114 indicates
`a prediction section, numeral ]15 indicates a region deter-
`mination section, numeral 116 indicates encoding mode
`information including inter-frame encoding and intra-frame
`encoding information. numeral 117 indicates a motion
`vector, numeral 118 indicates an encoder, and numeral 119
`indicates encoded data.
`
`In this apparatus, input image 1 is first partitioned into
`multiple regions by region partitioning section 113. Region
`partitioning section 113 determines the
`of regions in
`accordance with the motion-compensated prediction error.
`Region partitioning section 113 performs judgment using a
`threshold with regard to dispersion of the inter-frame sigttal
`and assigns small blocks to regions having large movement
`and large blocks to regions, such as backgrounds, having
`small movement from among ten types of block sizes of
`4x4, 4x3, 8x4, 8x8, 8x16, 16x8, 16x16, 16x32, 32x16, and
`32x32 prepared in advance. In concrete terms, a dispersion
`value is calculated by region determination section 115 for
`the prediction error signal obtained by prediction wlion
`1.14, and based on it the block size is determined. Attribute
`information 116, such as region shape information and
`encoding mode information, as well as motion vector 11'?
`are determined at this time, and the prediction error signal or
`the original signal is encoded by encoder 115 in accordance
`with the encoding mode information to yield encoded data
`119. Subsequent processes are the same as those of the first
`prior art.
`This prior art is richer in processing flexibility than the
`first prior art from the viewpoint of preparing multiple sized
`blocks. However, this apparatus also limits each region to a
`rectangular shape. Therefore, even with rectangular shapes
`in ten sizes, there is room for improvement in adaptability
`with respect to arbitrarily shaped image regions.
`SUMMARY OF THE INVENTION
`
`,
`
`The present invenlion takes into consideration these prob-
`lems with the object of providing a moving image encoding
`technique for performing more flexible processing accord-
`ing to the conditions of the image to be processed. The
`object of this invention, in more concrete terms, is to provide
`a moving image encoding technique using region partition-
`ing techniques that can accurately handle various image
`structures. Another object of this invention is to provide at
`partitioning criterion based on various points of view when
`partitioning regions for encoding. Still another object of this
`invention is to provide a technique for correctly decoding
`the encoded data of regions that have been partitioned into
`various shapes.
`The moving image encoding method of this invention
`includes two steps. A first step partitions an input image into
`multiple regions based on at predetermined partitioning
`judgrnent criterion. Until this point, the encoding process is
`the same as the general conventional region-based encoding.
`However, in a second step, this invention integrates each of
`
`
`
`US 6,633,611 B2
`
`3
`partitioned multiple regions with adjacent regions based on
`a predetermined integration jt.1dgt'l]et:l1 criterion. Thereafter,
`in it third step, the image signal is encoded for each of the
`regions remaining after integration. According to this
`method, the integration process allows regions to take on
`various shapes. Tlitus, a region having a shape closely
`matching the structure of an image or outline of an object
`can be generated.
`The moving image encoding apparatus of this invention
`includes a region partitioning section and an encoder. The
`region partitioning section includes a partitioning processing
`section for partitioning the input image into multiple regions
`based on a predetermined partitioning judgment criterion,
`and a integration processing section for integrating each of
`multiple regions partitioned by the partitioning processing
`section with adjacent regions based on a predetermined
`integration judgment criterion. The encoder encodes the
`image signal for each of the rt-goals remaining after inte-
`gration by the integration processing section. According to
`this apparatus, a comparatively high image quality can be '
`achieved at comparatively high data compression ratios
`while flexibly supporting the strucntres of images.
`The above-mentioned integration processing section per-
`forms preliminary encoding and decoding of images for
`each region, and may examine the amount of code and the
`encoding distortion. In such a case, the encoding distortion
`can be minimized under the constraint of a predetermined
`amount of code.
`
`'-
`
`The above-mentioned partitioning processing section
`includes a class identifying section for classifying the impor-
`tance of regions into classes, and may judge whether or not
`to partition each region based on an activity to be described
`later and the class. If the class identifying section references
`feature parameters in images,
`the recognition of objects
`becomes possible thus facilitating more accurate region
`partitioning.
`On the other hand, the moving image decoding apparatus
`of this invention inputs and decodes the encoded data of the
`image that was encoded after being partitioned into multiple
`regions. This apparatus includes a region shape restoring
`section and an image data decoder. The region shape restor-
`ing section restores, based on region shape information
`included in the encoded data, the shape of each region that
`was partitioned during encoding. The image data decoder,
`after specifying the sequence in which regions were encoded
`based on the shapes of the restored regions, decodes the
`image for each region from the encoded data. According to
`this apparatus, accurate decoding is achieved even if regions
`having various shapes are generated in the encoding stage.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 shows a moving image encoding apparatus relating
`to it ftrst prior art.
`FIG. 2 shows a moving image encoding apparatus relating -
`to a second prior art.
`FIG. 3 is a block diagram common to general moving
`image encoding apparatus relating to an embodiment.
`FIG. 4 is a flowchart showing an operation of the encod-
`ing apparatus of FIG. 3.
`FIG. 5 is an internal block diagram of the region parti-
`tioning section of FIG. 3.
`FIG. 6 is an internai block diagram of the partitioning
`processing section of FIG. 5.
`FIG. '7 is a flowchart showing an operation of the parti-
`tioning processing section of FIG. 6.
`
`4
`FIG. 3 shows an example of a uniform partitioning result
`in the partitioning processing section of FIG. 6.
`FIG. 9 shows a result of It ftrst. initial partitioning in the
`partitioning processing section of FIG. 6.
`FIG. 10 shows a final result of initial partitioning in the
`partitioning processing section of FIG. 6.
`FIG. ]_l
`is an internal block diagram of the integration
`processing section of FIG. 5.
`FIG. 12 is a flowchart showing an operation of the
`integration processing section of FIG. 11.
`FIG. 13 shows an example of labeling a region in the
`integration processing section of FIG. 11.
`FIG. 14 shows an example of setting adjacent regions in
`the integration processing section of FIG. 11.
`FIG. 15 is a flowchart showing the procedure of S19 of
`FIG. 12.
`
`FIG. 16 is an internal block diagram of another embodi-
`ment of the partitioning processing section of FIG. 5.
`FIG. 17 shows a final result of initial partitioning in the
`partitioning processing section of FIG. 16.
`FIG. 18 is an internal block diagram of another embodi-
`ment of the partitioning processing section of FIG. 5.
`FIG. 19 is t flowchart showing an operation of the
`partitioning processing section of FIG. 18.
`FIG. 20 shows another embodiment of the class identi-
`fying section of FIG. 18.
`FIG. 21 shows motion-compensated prediction based on
`biock matching.
`FIG. 22 is an internal block diagram of another embodi-
`ment of the partitioning processing section of FIG. 5.
`FIG. 23 is a flowchart showing an operation of the
`partitioning processing section of FIG. 22.
`FIG. 24 is an internal block diagram of another embodi-
`ment of the integration processing section of FIG. 5.
`FIG. 25 is a fiowchan showing an operation of the
`integration processing section of FIG. 24.
`FIG. 26 is an internal block diagram of another embodi-
`ment of the integration processing section of FIG. 5.
`FIG. 27 is an internal block diagram of a moving image
`decoding apparatus relating to the embodiment.
`FIG. 28 is a flowchart showing an operation of the
`decoding apparatus of FIG. 24.
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBOIDIMENTS
`
`First Embodiment
`
`FIG. 3 is a block diagram showing a configuration ofa
`moving image encoding apparatus related to this embodi-
`mettt. This apparatus can be used in portable or stationary
`equipment for image communications, such as TV tele-
`phones and TVt:onferent‘.ing. It can also be used as a moving
`image encoding apparatus in image storage and recording
`apparatus such as digital VCRS and video servers.
`Furthermore, the procemes in this apparatus can also be used
`as a moving image encoding program to be installed in the
`form of software or DSP firmware.
`
`In FIG. 3, numeral 1 indicates the input image, numeral
`2 indicates a region partitioning section, numeral 3 indicates
`region shape information, numeral 4 indicates a region
`image signal, numeral 5 indicates region motion
`information, numeral
`6 indicates region attribute
`information, numeral 7 indicates an encoder, numeral 8
`
`
`
`US 6,633,611 B2
`
`5
`indicates a local decoded image, numeral 9 indicates a
`memory, numeral 1|} indicates a reference image, and
`numeral 11 indicates an encoded bit stream. FIG. 4 is a
`flowchart showing an operation of the apparatus. The overall
`operation of the apparatus is first described with reference to _
`FIGS. 3 and 4.
`Input image 1 is input to region partitioning section 2 (S1)
`where it is partitioned into multiple regions. Region parti-
`tioning section 2 performs initial partitioning (S2) and
`adjacent region integrating (53), as willto be described later.
`Region partitioning section 2 passes shape information 3,
`image signal 4, attribute information 6 such as encoding
`modes of the regions and, motion information 5 for each
`region obtained as a rcsult of partitioning to encoder 7.
`Encoder 7 transforms and multiplexes these information
`items into a hit pattern based on a predetermined encoding
`method for output" as encoded bit stream 11 (S4, S5). In order
`to perform region partitioning and encoding based on
`motion-compensated prediction, encoder 7 generates local
`decoded image 3 for catch region and stores it into memory
`9. Region partitioning section 2 and encoder 7 fetches the
`local decoded image stored in memory 9 as reference image
`10 to perform motion-compensated prediction.
`FIG. 5 is a detailed block diagram of region partitioning
`section 2 wherein numeral 12 indicates a partitioning pro-
`cessing section, numeral 13 indicates initial partition shape
`information, and numeral 14 indicates :1 integration process-
`ing section
`(1) Initial Partitioning
`The initial panitionirtg corresponding to S2 of FIG. 4 is
`performed at partitioning processing section 12. Initial par-
`titioning refers to the partitioning which is performed before
`proceeding to integration, and the total partitioning count is
`dependent on the state of the image, namely, the features or
`characteristics of the image.
`FIG. 6 shows an internal configuration of partitioning |
`processing section 12 wherein numeral 15 indicates a uni-
`form partitioning section, numeral 16 indicates an activity
`calculating section, numeral 17 indicates an activity,
`numeral 18 indicates a partitioning judgment section, and
`numeral 19 indicates ll partition state instruction signal. The
`activity refers to an evaluated value for judging the features
`or characteristics of the image regarding a predetermined
`property. A prediction error power accompanying motion-
`cornpcnsated prediction for a region is employed as the
`activity in this embodiment.
`FIG. 21 shows a method of motion-compensated predic-
`tion based on a block matching method.
`In the block
`matching method, vector V given in the following formula is
`found as the motion vector of the region S to be predicted.
`
`6
`encoding for portions having large movements and rough
`encoding for portions having small movements. Afline
`motion compensation for obtaining alline motion parameters
`and perspective motion compensation for detecting three-
`dirnensional motion may he used.
`FIG. 7 is a flowchart showing an operation ofpartitioning
`processing section 12 wherein unconditional uniform block
`partitioning is first performed (53) by uniform partitioning
`section 15. At
`this time, one frame is partitioned,
`for
`example, into blocks of 32x32 pixels as shown in FIG. 8.
`This partitioning process is called a 0"'ipartitioning stage.
`The number of blocks generated in the t)" partitioning stage
`is denoted by No and each block by B,,° (1§n:—‘ND)'.
`Next, a judgment is made individually as to whether or
`not to perform further block partitioning for each E," (S9).
`For this purpose, activity 17 for each 2B,," is calculated in
`activity calculating section 16. Partitioning judgment section
`18 compares threshold THO that was set in advance with the
`activity of each block, and if activity 17 is larger than THO,
`the corresponding B,_." is further partitioned into four blocks
`(SIB). This is called a 1” partitioning stage.
`FIG. 9 illustrates the partitioned image at the 1*" parti-
`tioning stage. The number of newly generated 1(1)-<16 pixel
`blocks is denoted by NJ and each block by B,,1 (1 ‘—En§N,).
`Hereafter, the activity of each B,’ is calculated and a 2'”
`partitioning stage is performed using threshold Tl-ll.
`Thereafter, threshold Tl-lj is applied to block Bf generated
`in a j"’ partitioning stage and the j+]"' partitioning stage is
`executed (S13 to S16). The initial partitioning is terminated
`when j reaches a predetennined upper limit value. It is
`assumed here for the purpose of description that the process
`is terminated at the end of the 2"“ partitioning stage. In this
`case, blocks as shown in FIG. 10 are generated. Block sizes
`range from 8x8 pixels to 32x32 pixels. The number of
`. blocks at the end of initial partitioning is denoted by Mo and
`the initial region of catch block by S,,°. The shape informa-
`tion for S,,° is passed to integration processing section 14 as
`initial partition shape information 13.
`(2) integrating Adjacent Regions
`Integration processing section 14 performs integration
`with adjacent regions for each S,_". The internal configura-
`tion of integration processing section 14 is shown in FIG. 11
`wherein numeral 20 indicates a labeling section, numeral 21
`indicates an adjacent region setting section, numeral 22
`indicates a provisional encoder, numeral 23 indicates a
`decoder, numeral 24 indicates an encoding distortion calcu-
`lating section, numeral 25 indicates an evaluation value
`calculating section, numeral 26 indicates a constant for
`evaluation value calculation, numeral 27 indicates a inte-
`gration judgment mtion, and numeral 28 indicates a inte-
`gration process iteration instruction signal.
`FIG. 12 is a flowchart showing an operation of integration
`processing section 14. As shown in the flowchart, numbers
`or labels, are first assigned to initial regions 5,," by labeling
`section 20 in accordance to a predetermined rule (517). For
`example, numbers are assigned in sequence to regions while
`the image frame is scanned horizontally in pixel units from
`the top left corner to the bottom right corner. A simple
`example of labeling is shown in FIG. 13 wherein labels “1”,
`"_”, andso forth are assigned to the regions in thcirseqnence
`of appearance on the scanning line. At this time, region size
`is ignored. Hereinafter, the label value of region S: is
`denoted by l(S,f). The 1: here corresponds to a 16* partition-
`ing stage to be described later. where the initial state is it-0.
`Next, the “adjacent regions” of each region are defined by
`5 adjacent region setting section 21 (S18) using labels. FIG.
`14 is an example of adjacent regions wherein the adjacent
`
`55
`
`Dm-n
`
`|'_f.rt.t:+i',, _1-'-i—l",..I—l}-fitx. 31.1)]
`
`The term fs(x, y, t) is the pixel value on (X, y) at time t of
`the predicted region S, fs{x, y, I-1) is the pixel value on (x,
`y) at time t—l, and fs('x+v_‘, y-r-vi, t—1) is the pixel value of‘
`the position that is displaced from position (x, y. t-1) by the
`amount of vector v. R represents the motion vector search
`range.
`the prediction image is
`From the obtained vector v,
`obtained by t"s(x+vI, y+v_,,, 1-1), and the prediction error
`power, or activity, becomes D,,,,,,. Defining the activity with
`this method enables region partitioning to be performed
`according to the complexity of the local motion of the
`image. Control becomes possible, such as for detailed
`
`
`
`US 6,633,611 B2
`
`7
`regions of region SH" are based on the labels of FIG. 13.
`Regions B, C, and D, which are adjacent to the edges of
`region A and have label values larger than that of region A.
`are defined as adjacent regions.
`Next, a judgment is made for each region as to whether or
`not the region can be integrated with its adjacent regions.
`For this reason, an evaluation value for integration is cai-
`culated (S19) by provisional encoder 22, decoder 23, encod-
`ing distortion calculating section 24, and evaluation value
`calculating section 25. The evaluation value is amount of
`eode—distortion cost
`l.(S,,") expressed in the following
`formula.
`
`r_(s,,“t-n(5,*)+:tre(s,,":
`
`Formula 1
`
`Here, o(s,,t) is the encoding distortion of 5,", namely, the
`square error summation, R(S,,‘) is the amount of code of Sn",
`and l. is the constant 26. The integration proceeds in the
`direction of decreasing L(S,,‘). Decreasing rrsf) is equiva-
`lent to decreasing the encoding distortion within the range of
`the predetermined amount of code based on the given
`constant 1.. Decreasing the summation of I_(S,,“) enables the
`encoding distortion to be reduced when the same amount of
`code is used.
`
`.15 is a detailed flowchart of $19. First. S," is
`FIG.
`preliminarily encoded (S22) at provisional encoder 22. The
`purpose of this encoding is to prepare for the calculation of
`the amount of code R(S,,k) and the derivation of encoding
`distortion D(S,,"). In this embodiment, provisional encoder
`22 performs motion compensation using reference image 10.
`The data to be encoded includes image data, namely, the
`prediction error signal or original signal, motion information
`to specify the prediction image, and attribute information
`such as of the encoding mode, where the summation of the
`amounts of these codes is R(S,,"'). The prediction error signal
`is obtained as the diiference of the original signal of SJ‘ and
`the prediction image.
`Decoder 23 generates the local decoded image for 5;“
`(523) using the encoded data obtained by provisional
`encoder 22. Next, distortion D(S,_") of the local decoded
`image and original image is calculated (S24) by encoding
`distortion calculating section 24. Evaluation value calculat-
`ing section 25 calculates S25) amount of code—distortion
`cost L(S,,") from Rt'S,,") and D(S,,").
`Step 19 performs the preceding evaluation vahte calcu-
`lation for all regions for the three types of
`1. Each region Sn" itself: L(S,,“']
`2. Adjacent regions N,{S,,"‘] of Sn’: L(N, S,,"])
`3. Regiolp temporarily integrating SJ‘ and N,[ti,,"]: L{SN*+
`N S
`Here,
`denotes an adjacent region of S,,''', and i is a
`number for distinguishing the multiple adjacent regions.
`Next, in integration judgment section 27, a location within
`the image frame where
`
`‘ ‘
`
`8
`tion judgment section 27 halts the instructions to labeling
`section 20 when there are no further combinations yielding
`posit)ive values of DL and terminates the integration process
`(S21 .
`This terminates the processing for partitioning and
`integrating. and infonnation 3 expressing the region pani-
`tioucd state of input image 1, image data 4 For each region,
`motion information 5, and attribute information 6 is output
`to encoder 7. Hereafter, encoding is performed according to
`a predetermined encoding method.
`In this embodiment, integrating was performed as well as
`partitioning and each region can be expressed as a set of
`rectangular blocks of various sizes. For example, an object
`within an image having large movements can be integrated
`into a single region having a shape similar to the outline of
`the object. As a result, the amount of code is controlled by
`changing the quantization parameter for each object so as to
`enable flexible handling of images based on their actual
`structures. Furthermore, optimum region partitioning which
`minimizes encoding distortion is achieved under a fixed
`amount of code. Thus, compared to the conventional moving
`image encoding apparatus, higher image quality can be
`achieved with a smaller amount of code.
`Although the initial partitioning in this embodiment was
`terminated at the end of the 2”’ partitioning stage, it may of
`course be terminated at another stage. For example, if the.
`overall movement of the image is small, the initial parti-
`tioning may be terminated at the 1*“ stage and, if not, the
`number of stages may be increased. Furthermore, although
`image frames were encoded in this embodiment, it is also
`possible to apply this encoding in a similar manner to a.
`rectangular image area including an object of arbitrary shape
`in the image frame.
`For described encoder 7 and provisional encoder 22, the
`encoding of SN“ was performed ‘through a combination of
`DCT and linear quantization. However, other encoding
`methods, such as vector quantization, sub-band encoding, or
`wavelet encoding, may be used. Multiple encoding methods
`may be prepared and a configuration selectively using the
`method having the best encoding efficiency may be
`employed.
`Although prediction error power was adopted for the
`activity in this embodiment, other examples given below
`may be considered.
`A first