`
`(19) World Intellectual Property Organization
`International Bureau
`
`1111111111111111 IIIIII IIIII IIII I II Ill 111111111111111 lllll 11111111111111111111111
`
`(43) International Publication Date
`26 July 2001 (26.07.2001)
`
`PCT
`
`(10) International Publication Number
`WO 01/54416 Al
`
`(51) International Patent Classification 7: H04N 7/34, 7/50
`
(21) International Application Number: PCT/FI01/00050

(22) International Filing Date: 22 January 2001 (22.01.2001)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data: 20000131, 21 January 2000 (21.01.2000), FI

(71) Applicant (for all designated States except US): NOKIA
MOBILE PHONES LTD [FI/FI]; Keilalahdentie 4, FIN-02150 Espoo (FI).

(72) Inventors; and
(75) Inventors/Applicants (for US only): KALEVO, Ossi [FI/FI];
Ketunhanta 1, FIN-37800 Toijala (FI). VAHTERI, Joni [FI/FI];
Ojavainionkatu 18 B 10, FIN-33710 Tampere (FI). DOBRIN,
Bogdan-Paul [RO/FI]; Iso-Roobertinkatu 8 B 12, FIN-00120
Helsinki (FI). KARCZEWICZ, Marta [PL/US]; 1224 Hidden Ridge,
#3097, Irving, TX 75038 (US).

(74) Agent: TAMPEREEN PATENTTITOIMISTO OY;
Hermiankatu 6, FIN-33720 Tampere (FI).
`
`(81) Designated States (national): AE, AG, AL, AM, AT, AT
`(utility model), AU, AZ, BA, BB, BG, BR, BY, BZ, CA,
`CH, CN, CR, CU, CZ, CZ (utility model), DE, DE (utility
`model), DK, DK (utility model), DM, DZ, EE, EE (utility
`model), ES, FI, FI (utility model), GB, GD, GE, GH, GM,
`HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK,
`LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX,
`MZ, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SJ, SK, SK
`(utility model), SL, TJ, TM, TR, TT, TZ, UA, UG, US, UZ,
`VN, YU, ZA, ZW.
`
`-----------------------------------------
`
`(54) Title: A METHOD FOR ENCODING IMAGES, AND AN IMAGE CODER
`
[Continued on next page]
`
`(57) Abstract: The invention relates to a method for encoding a digital image,
`in which method the digital image is divided into blocks (C, L, U, UL, UR).
`In the method a spatial prediction for a block (C) is performed to reduce the
`amount of information to be transmitted, wherein at least one prediction method
(P1-P13) is defined. In the method a classification is determined for at least one
neighbouring block (L, U) of said block (C) to be predicted according to the
contents of said neighbouring block (L, U), and a prediction method (P1-P13) is
selected for the current block (C) on the basis of at least one said classification.
`
`----,
`I
`I
`I
`I
`I
`120
`I
`
`Directionality
`Classification
`
`Mapping of
`Classes
`
`iiiiiiiiiiii
`iiiiiiiiiiii
`
`---iiiiiiiiiiii
`iiiiiiiiiiii ----
`
`iiiiiiiiiiii
`
`iiiiiiiiiiii
`iiiiiiiiiiii
`
`19
`
`I f:
`
`17 I
`I
`I
`I
`I
`I
`I
`I
`Selection of
`I
`I
`Prediction
`I
`Method Set
`- - - - ____ J
`,---- ----,23
`r:
`I
`
`Spatial
`Prediction
`
`)
`18 I
`I
`I
`I
`Signaling of
`I
`Prediction
`I
`I
`Method
`L __ ___ __ J
`
`Unified Patents, LLC v. Elects. & Telecomm. Res. Inst., et al.
`
`Ex. 1011, p. 1
`
`
`
`
`(84) Designated States (regional): ARIPO patent (GH, GM,
`KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW), Eurasian
`patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European
patent (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE,
`IT, LU, MC, NL, PT, SE, TR), OAPI patent (BF, BJ, CF,
`CG, CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, TG).
`
`Published:
`-
`with international search report
`
For two-letter codes and other abbreviations, refer to the "Guidance
Notes on Codes and Abbreviations" appearing at the beginning
of each regular issue of the PCT Gazette.
`
`
`
`
`
`A method for encoding images, and an image coder
`
`The present invention relates to a method for encoding images
`according to the preamble of claim 1. The present invention also relates
`to a device for encoding images according to the preamble of claim 12.
`Furthermore, the present invention relates to an encoder according to
`the preamble of claim 23, to a decoder according to the preamble of
`claim 24, to a codec according to the preamble of claim 25, to a mobile
`terminal according to the preamble of claim 26, and a storage medium
`for storing a software program according to the preamble of claim 27.
`
`The image can be any digital image, a video image, a TV image, an
`image generated by a video recorder, a computer animation, a still
image, etc. In general, a digital image consists of pixels arranged
`in horizontal and vertical lines, the number of which in a single image is
`typically tens of thousands. In addition, the information generated for
`each pixel contains, for instance, luminance information relating to the
pixel, typically with a resolution of eight bits, and in colour applications
also chrominance information, e.g. a chrominance signal. This
chrominance signal generally consists of two components, Cb and Cr,
`which are both typically transmitted with a resolution of eight bits. On
the basis of these luminance and chrominance values, it is possible to
`form information corresponding to the original pixel on the display
`device of a receiving video terminal. In this example, the quantity of
`data to be transmitted for each pixel is 24 bits uncompressed. Thus, the
`total amount of information for one image amounts to several megabits.
`In the transmission of a moving image, several images are transmitted
`per second. For instance in a TV image, 25 images are transmitted per
second. Without compression, the quantity of information to be
`transmitted would amount to tens of megabits per second. However, for
`example in the Internet data network, the data transmission rate can be
`in the order of 64 kbits per second, which makes uncompressed real
`time image transmission via this network practically impossible.
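The uncompressed data rates quoted above can be reproduced with a short calculation. This is only a sketch; the 352x288 CIF frame size is an assumed example resolution, not taken from the text.

```python
# Uncompressed data rates for the example in the text:
# 24 bits per pixel, 25 frames per second.
# The 352x288 (CIF) frame size is an assumed example resolution.

def frame_bits(width: int, height: int, bits_per_pixel: int = 24) -> int:
    """Bits needed for one uncompressed frame."""
    return width * height * bits_per_pixel

bits = frame_bits(352, 288)           # one frame
megabits_per_frame = bits / 1e6       # "several megabits" per image
rate_mbps = bits * 25 / 1e6           # "tens of megabits per second"

print(f"{megabits_per_frame:.2f} Mbit per frame, {rate_mbps:.1f} Mbit/s at 25 fps")
```

At roughly 60 Mbit/s, a 64 kbit/s Internet connection is indeed about a thousand times too slow for real-time transmission without compression.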
`
`5
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`To reduce the amount of information to be transmitted, a number of
`different compression methods have been developed, such as the
`
`
`
`
`
JPEG, MPEG and H.263 standards. In the transmission of video, image
`compression can be performed either as inter-frame compression,
intra-frame compression, or a combination of these. In inter-frame
compression, the aim is to eliminate redundant information in
successive image frames. Typically, images contain a large amount of
non-varying information, for example a motionless background, or
slowly changing information, for example when the subject moves
slowly. In inter-frame compression, it is also possible to utilize motion
`compensated prediction, wherein the aim is to detect elements in the
`image which are moving, wherein motion vector and prediction error
`information are transmitted instead of transmitting the pixel values.
`
`To enable the use of image compression techniques in real time, the
`transmitting and receiving video terminal should have a sufficiently high
`processing speed that it is possible to perform compression and
`decompression in real time.
`
In several image compression techniques, an image signal in digital
format is subjected to a discrete cosine transform (DCT) before the
image signal is transmitted to a transmission path or stored in a storage
means. Using a DCT, it is possible to calculate the frequency spectrum
`of a periodic signal, i.e. to perform a transformation from the time
`domain to the frequency domain. In this context, the word discrete
`indicates that separate pixels instead of continuous functions are
`processed in the transformation. In a digital image signal, neighbouring
`pixels typically have a substantial spatial correlation. One feature of the
DCT is that the coefficients established as a result of the DCT are
practically uncorrelated; hence, the DCT conducts the transformation of
`the image signal from the time domain to the (spatial) frequency
`domain in an efficient manner, reducing the redundancy of the image
`data. As such, use of transform coding is an effective way of reducing
`redundancy in both inter-frame and intra-frame coding.
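The energy-packing behaviour described above can be illustrated with a minimal orthonormal DCT-II implementation. This is a sketch in NumPy; the 8x8 block size and the flat test block are illustrative assumptions, and `dct_matrix`/`dct2` are helper names invented here.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis matrix (rows are basis vectors)."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)   # DC row has its own normalisation
    return c

def dct2(block: np.ndarray) -> np.ndarray:
    """2-D DCT of a square pixel block: C @ block @ C.T."""
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T

# A flat 8x8 block: essentially all signal energy lands in the single
# DC coefficient, illustrating the energy-packing property the text
# describes for highly correlated data.
flat = np.full((8, 8), 128.0)
coeffs = dct2(flat)
print(coeffs[0, 0])                    # large DC coefficient
print(np.abs(coeffs[1:, 1:]).max())    # AC coefficients near zero
```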
`
`Current block-based coding methods used in still image coding and
`video coding for independently coded key frames (intra-frames) use a
`block-based approach. In general, an image is divided into NxM blocks
`
`5
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`
`
`
`
`that are coded independently using some kind of transform coding.
`Pure block-based coding only reduces the inter-pixel correlation within
`a particular block, without considering the inter-block correlation of
`pixels. Therefore, pure block-based coding produces rather high bit
rates even when using transform-based coding, such as DCT coding,
`which has very efficient energy packing properties for highly correlated
`data. Therefore, current digital image coding standards exploit certain
`methods that also reduce the correlation of pixel values between
`blocks.
`
Current digital image coding methods perform prediction in the
transform domain, i.e. they try to predict the DCT coefficients of a block
currently being coded using the previously coded blocks and are thus
coupled with the compression method. Typically a DCT coefficient that
corresponds to the average pixel value within an image block is
predicted using the same DCT coefficient from the previously coded
block. The difference between the actual and predicted coefficient is
sent to the decoder. However, this scheme can predict only the average
`pixel value, and it is not very efficient.
`
Prediction of DCT coefficients can also be performed using spatially
neighbouring blocks. For example, a DCT coefficient that corresponds
to the average pixel value within a block is predicted using the DCT
coefficient(s) from a block to the left of or above the current block being
coded. DCT coefficients that correspond to horizontal frequencies (i.e.
`vertical edges) can be predicted from the block above the current block
`and coefficients that correspond to vertical frequencies (i.e. horizontal
`edges) can be predicted from the block situated to the left. Similar to
`the previous method, differences between the actual and predicted
`coefficients are coded and sent to the decoder. This approach allows
`prediction of horizontal and vertical edges that run through several
`blocks.
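A sketch of the coefficient prediction just described follows. The `predict_coeffs` helper and the choice of taking the DC coefficient from the left neighbour are illustrative assumptions; the text allows either the left or the above block for the DC term.

```python
import numpy as np

def predict_coeffs(shape, left, above):
    """Illustrative sketch of DCT-coefficient prediction from spatial
    neighbours, as described in the text:
    - first row (horizontal frequencies / vertical edges) from the
      block above,
    - first column (vertical frequencies / horizontal edges) from the
      block to the left,
    - DC coefficient (0,0) from the left neighbour (arbitrary choice).
    `left`/`above` may be None at frame borders."""
    pred = np.zeros(shape)
    if above is not None:
        pred[0, 1:] = above[0, 1:]
    if left is not None:
        pred[1:, 0] = left[1:, 0]
        pred[0, 0] = left[0, 0]
    elif above is not None:
        pred[0, 0] = above[0, 0]
    return pred

# Only the difference between actual and predicted coefficients is
# coded and sent to the decoder.
actual = np.arange(16.0).reshape(4, 4)
left = np.ones((4, 4))
above = 2 * np.ones((4, 4))
residual = actual - predict_coeffs(actual.shape, left, above)
```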
`
In MPEG-2 compression, the DCT is performed in blocks using a block
`size of 8 x 8 pixels. The luminance level is transformed using full spatial
`resolution, while both chrominance signals are subsampled. For
`
`5
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`
`
`
`
`example, a field of 16 x 16 pixels is subsampled into a field of 8 x 8
`pixels. The differences in the block sizes are primarily due to the fact
`that the eye does not discern changes in chrominance equally well as
`changes in luminance, wherein a field of 2 x 2 pixels is encoded with
`the same chrominance value.
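The 16x16-to-8x8 chrominance subsampling described above can be sketched as follows. Simple 2x2 averaging is an assumed subsampling filter here; the exact filter is left to the implementation.

```python
import numpy as np

def subsample_2x2(chroma: np.ndarray) -> np.ndarray:
    """Subsample a chrominance field by 2 in each direction, so a
    16x16 field becomes 8x8 and each 2x2 pixel area shares one
    chrominance value. Averaging is an assumed filter choice."""
    h, w = chroma.shape
    return chroma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

field = np.arange(256.0).reshape(16, 16)
print(subsample_2x2(field).shape)   # (8, 8)
```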
`
`The MPEG-2 standard defines three frame types: an I-frame (Intra), a
`P-frame (Predicted), and a B-frame (Bi-directional). An I-frame is
`generated solely on the basis of information contained in the image
`itself, wherein at the receiving end, an I-frame can be used to form the
`entire image. A P-frame is typically formed on the basis of the closest
`preceding I-frame or P-frame, wherein at the receiving stage the
`preceding I-frame or P-frame is correspondingly used together with the
`received P-frame. In the composition of P-frames, for instance motion
compensation is used to compress the quantity of information. B-
`frames are formed on the basis of a preceding I-frame and a following
`P- or I-frame. Correspondingly, at the receiving stage it is not possible
to compose the B-frame until the preceding and following frames have
`been received. Furthermore, at the transmission stage the order of the
`P- and B-frames is changed, wherein the P-frame following the B-frame
`is received first. This tends to accelerate reconstruction of the image in
`the receiver.
`
`Intra-frame coding schemes used in prior art solutions are inefficient,
wherein transmission of intra-coded frames requires excessive bandwidth.
`This limits the usage of independently coded key frames in low bit rate
`digital image coding applications.
`
`The present invention addresses the problem of how to further reduce
`redundant information in image data and to produce more efficient
`coding of image data, by introducing a spatial prediction scheme
`involving the prediction of pixel values, that offers a possibility for
`prediction from several directions. This allows efficient prediction of
`edges with different orientations, resulting in considerable savings in bit
`rate. The method according to the invention also uses context-
`
`5
`
`1 0
`
`15
`
`20
`
`25
`
`30
`
`35
`
`
`
`
`
`dependent selection of suitable prediction methods, which provides
`further savings in bit rate.
`
`5
`
`10
`
`15
`
`The invention introduces a method for performing spatial prediction of
`pixel values within an image. The technical description of this document
`introduces a method and system for spatial prediction that can be used
for block-based still image coding and for intra-frame coding in block-
based video coders. Key elements of the invention are the use of
`multiple prediction methods and the context-dependent selection and
`signalling of the selected prediction method. The use of multiple
`prediction methods and the context-dependent selection and signalling
`of the prediction methods allow substantial savings in bit rate to be
`achieved compared with prior art solutions.
`
`It is an object of the present invention to improve encoding and
`decoding of digital images such that higher encoding efficiency can be
`achieved and the bit rate of the encoded digital image can be further
`reduced.
`
`20
`
`According to the present invention, this object is achieved by an
`encoder for performing spatially predicted encoding of image data.
`
`According to a first aspect of the invention there is provided a method
`for encoding a digital image, in which method the digital image is
`divided into blocks, characterized in that in the method a spatial
`prediction for a block is performed to reduce the amount of information
`to be transmitted, wherein at least one prediction method is defined, a
`classification is determined for at least one neighbouring block of said
`block to be predicted according to the contents of said neighbouring
`block, and a prediction method is selected for the current block on the
`basis of at least one said classification.
`
`According to a second aspect of the invention there is provided a
`device for encoding a digital image, which is divided into blocks,
`characterized in that the device comprises means for performing spatial
`prediction for a block to reduce the amount of information to be
`
`25
`
`30
`
`35
`
`
`
`
`
`transmitted, wherein at least one prediction method has been defined,
`that the device further comprises means for determining a classification
`for at least one neighbouring block of said block to be predicted
`according to the contents of said neighbouring block, and means for
`selecting a prediction method for the current block on the basis of at
`least one said classification.
`
`According to a third aspect of the invention there is provided an
`encoder comprising means for encoding a digital image, and means for
`dividing the digital image into blocks, characterized in that the encoder
`comprises means for performing spatial prediction for a block to reduce
`the amount of information to be transmitted, wherein at least one
`prediction method has been defined, that the encoder further comprises
`means for determining a classification for at least one neighbouring
`block of said block to be predicted according to the contents of said
`neighbouring block, and means for selecting a prediction method for
`the current block on the basis of at least one said classification.
`
`According to a fourth aspect of the invention there is provided a
`decoder comprising means for decoding a digital image, which is
`divided into blocks, characterized in that the decoder comprises means
`for performing spatial prediction for a block to reduce the amount of
`information to be transmitted, wherein at least one prediction method
`has been defined, that the decoder further comprises means for
`determining a classification for at least one neighbouring block of said
`block to be predicted according to the contents of said neighbouring
`block, and means for selecting a prediction method for the current block
`on the basis of at least one said classification.
`
`According to a fifth aspect of the invention there is provided a codec
`comprising means for encoding a digital image, means for dividing the
`digital image into blocks, and means for decoding a digital image,
`characterized in that the codec comprises means for performing spatial
`prediction for a block to reduce the amount of information to be
`transmitted, wherein at least one prediction method has been defined,
`that the codec further comprises means for determining a classification
`
`5
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`
`
`
`
`for at least one neighbouring block of said block to be predicted
`according to the contents of said neighbouring block, and means for
`selecting a prediction method for the current block on the basis of at
`least one said classification.
`
`According to a sixth aspect of the invention there is provided a mobile
`terminal comprising means for encoding a digital image, means for
`dividing the digital image into blocks, and means for decoding a digital
`image, characterized in that the mobile terminal comprises means for
`performing spatial prediction for a block to reduce the amount of
`information to be transmitted, wherein at least one prediction method
`has been defined, that the mobile terminal further comprises means for
`determining a classification for at least one neighbouring block of said
`block to be predicted according to the contents of said neighbouring
`block, and means for selecting a prediction method for the current block
`on the basis of at least one said classification.
`
`According to a seventh aspect of the invention there is provided a
`storage medium for storing a software program comprising machine
`executable steps for encoding a digital image, and for dividing the
`digital image into blocks, characterized in that the software program
`further comprises machine executable steps for performing spatial
`prediction for a block to reduce the amount of information to be
`transmitted, wherein at least one prediction method has been defined,
`steps for determining a classification for at least one neighbouring block
`of said block to be predicted according to the contents of said
`neighbouring block, and steps for selecting a prediction method for the
`current block on the basis of at least one said classification.
`
`The invention is based on the idea that to perform spatial prediction of
`pixel values for a block to be coded, adjacent decoded blocks are
`examined to determine if there exists some directionality in the contents
`of the adjacent blocks. This directionality information is then used to
`classify the blocks. Based on the combination of the classes of the
`adjacent blocks, the contents (pixel values) of the current block are
`then predicted using a suitable prediction method. The prediction
`
`5
`
`1 0
`
`15
`
`20
`
`25
`
`30
`
`35
`
`
`
`
`
`method is signalled to the decoder. Prediction error information is also
`sent if it is efficient to do that in a distortion vs. bit-rate sense.
`
`5
`
`Considerable advantages are achieved with the present invention when
`compared with solutions of prior art. Using a method according to the
`invention, it is possible to reduce the amount of information needed
`when transmitting images in digital format.
`
`
`In general, the method according to the invention can be applied to
`block-based still image coding as well as to intra-frame coding in a
`block-based digital image coder.
`
`15
`
`20
`
`In the following, the invention will be described in more detail with
`reference to the appended figures, in which
`
Fig. 1 shows the structure of a digital image transmission system,

Fig. 2 illustrates the spatial prediction method of the present
invention in the form of a block diagram,

Figs. 3a-3c show an illustration of blocks that are used for
prediction according to an advantageous embodiment of the
present invention,

Fig. 4 shows the mapping of directionality classes to context
classes according to an advantageous embodiment of the
present invention,

Figs. 5a-5p show an illustration of pixels that are used for
prediction according to an advantageous embodiment of the
present invention,

Fig. 6 shows an advantageous bit-stream syntax used in the
transmission of displacement information, and
`
`
`
`
`
Fig. 7 is a schematic representation of a portable communications
device implementing a method according to the invention.
`
`The intra-frame prediction method described in this invention operates
`in a block-based manner and can be applied to image frames that
`comprise NxM blocks scanned e.g. row by row from left to right and
`from top to bottom. It is obvious that other scanning directions can also
`be used in connection with the present invention. Spatial prediction is
`performed for each intra-coded block using previously reconstructed
`blocks in the same frame. The residual error can be compressed using
any suitable method, e.g. using DCT, as in current standards. It should
`also be appreciated that the method according to the invention may be
`applied equally well to both monochrome and colour images.
`
`The system according to the invention consists of two main parts, as
`illustrated in Figure 2. Firstly, context-dependent selection 17 of a
`suitable subset of prediction methods is performed by classifying
`neighbouring reconstructed blocks. Secondly, a prediction block is
`constructed 18 using one of the prediction methods in the selected
subset and the prediction method is signalled to the decoder.
`
`Context-dependent selection of a prediction method subset comprises
`directionality classification of possible neighbouring blocks, mapping of
`directionality classes
`to context classes and context-dependent
`selection of an appropriate prediction method subset.
`
`In the following, the transmission and reception of digital image frames
`in a transmission system is described with reference to the digital
`image transfer arrangement presented in Figure 1. The current frame
`arrives at the transmission system 1 as input data 2 provided, for
`example, as the output of a digital video camera. The current frame
`may be provided in its entirety (i.e. a complete frame comprising NxM
`image blocks), in which case the frame is stored, or the transmission
`system 1 may receive the input data block by block. The blocks of the
`frame are directed one by one to a summer 4, where prediction error of
`a block is calculated e.g. by subtracting a block of the frame from a
`
`5
`
`1 O
`
`15
`
`20
`
`25
`
`30
`
`35
`
`
`
`
`
`predicted block. The prediction error is coded in a coder 5 and decoded
`in a decoder 6. In summer 7 the decoded prediction error is summed
`with predicted blocks and the result is saved in a frame memory 8. The
`prediction estimator 3, where spatial prediction is performed according
`to the method of the invention, receives blocks to be used with
`prediction from the frame memory 8.
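The loop around summers 4 and 7 can be sketched as follows. Uniform scalar quantisation is an assumed stand-in for the unspecified coder 5 and decoder 6; `encode_block` and the quantisation step are illustrative, not the patent's method.

```python
import numpy as np

def encode_block(block, predicted, quant_step=8.0):
    """One pass of the prediction loop of Figure 1 (a sketch).
    Returns the data to transmit and the reconstructed block that
    goes to the frame memory (8)."""
    error = block - predicted               # summer 4: prediction error
    coded = np.round(error / quant_step)    # coder 5 (assumed quantiser)
    decoded = coded * quant_step            # decoder 6 (assumed)
    reconstructed = predicted + decoded     # summer 7
    return coded, reconstructed

block = np.array([[120.0, 124.0], [118.0, 126.0]])
predicted = np.full((2, 2), 121.0)
coded, recon = encode_block(block, predicted)
# The decoder, holding the same `predicted` block, reproduces `recon`
# exactly, which is why prediction uses reconstructed (not original)
# pixel values.
```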
`
`In order to form a new prediction block, the prediction estimator 3
examines if there exists some directionality in possible neighbouring
`blocks of the current block. This scheme is illustrated in Figure 3a. The
`reference C denotes the current block, the reference L denotes a first
`neighbouring block of the current block and the reference U denotes a
`second neighbouring block of the current block. In this advantageous
`embodiment of the invention, the first neighbouring block is to the left of
`the current block C and the second neighbouring block is above the
`current block C. If the scanning order is different from left to right and
`from top to bottom, the first neighbouring block L and the second
`neighbouring block U are not necessarily to the left of and above the
`current block C, respectively. The neighbouring blocks L, U are blocks
`adjacent to the current block C which have already been reconstructed.
`In some embodiments of the invention more than two blocks can be
`classified and used to select the prediction method for the current block
`C. However, in the following description of a preferred embodiment of
`the invention, a maximum of two neighbouring blocks L, U are classified
`for each block C under examination. Furthermore, the classification is
`performed only if a neighbouring block L or U exists. If a current block
`does not have any neighbouring blocks, it is treated as "Non-Intra"
`during context-dependent selection of prediction methods, as will be
`explained further later in the text.
`
`Prediction can also be implemented in such a way that it is performed
`using only already reconstructed intra-coded blocks. In this case, all
`blocks other than intra-coded blocks are treated as "Non-Intra".
`
`5
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`The first neighbouring block L and the second neighbouring block U are
`classified according to the directionality of image details inside the
`
`
`
`
`
`block. As illustrated in Figure 2, directionality classifier 19 analyses the
`directionality of the neighbouring blocks using pixel value gradients. As
`a result, each neighbouring block is mapped 20 into an output class. In
`an advantageous embodiment of the invention there are 11 such output
`classes, but it is obvious that the number of output classes may vary.
`Advantageously, the output classes consist of 8 directionality classes
D0 - D7 corresponding to edge orientations k·22.5°, k = 0, 1, ..., 7 and
`3 non-directional classes D8 - D10 corresponding to flat, smooth
`texture and coarse texture blocks. In alternative embodiments of the
`invention, the number of directionality classes and the way in which
`they are defined may vary.
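The eleven output classes listed above can be tabulated directly. This is a sketch; the class names and angles come from the text, while the dictionary representation is an illustrative choice.

```python
# Output classes of the directionality classifier (19):
# eight directional classes D0-D7 at orientations k * 22.5 degrees,
# plus three non-directional classes D8-D10.
DIRECTIONAL = {f"D{k}": k * 22.5 for k in range(8)}
NON_DIRECTIONAL = {"D8": "flat",
                   "D9": "smooth texture",
                   "D10": "coarse texture"}

print(DIRECTIONAL["D2"])   # 45.0 degrees (diagonal edges)
```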
`
In the system of Figure 1, the prediction estimator 3 first examines if the
`first neighbouring block L and/or the second neighbouring block U exist.
`If either one of these blocks does not exist, that neighbouring block is
defined as a C0 block ("Non-Intra"), i.e. the current block C is on the
`edge or in a corner of the frame, or on the edge or in a corner of an
`area consisting of Intra blocks. Then, the prediction estimator 3 selects
`a suitable prediction method for the current block C, as described later
`in this description. Otherwise, the prediction estimator 3 calculates
`gradient information relating to the block or blocks L, U.
`
There are many suitable methods for calculating the gradient
information. In the following, one advantageous method is described.
First, average absolute directional gradients gk, k = 0, 1, ..., 7 of a
block L, U are defined as
`
`5
`
`10
`
`15
`
`20
`
`25
`
`
`
`
`
$$
\begin{aligned}
g_0 &= \frac{1}{N(N-1)} \max\!\left(1,\ \sum_{y=0}^{N-1}\sum_{x=0}^{N-2} \bigl|I(x,y)-I(x+1,y)\bigr|\right) \\
g_1 &= \frac{1}{(N-1)^2} \max\!\left(1,\ \sum_{y=0}^{N-2}\sum_{x=1}^{N-1} \Bigl|I(x,y)-\tfrac{1}{2}\bigl(I(x-1,y)+I(x-1,y+1)\bigr)\Bigr|\right) \\
g_2 &= \frac{1}{(N-1)^2} \max\!\left(1,\ \sum_{y=0}^{N-2}\sum_{x=1}^{N-1} \bigl|I(x,y)-I(x-1,y+1)\bigr|\right) \\
g_3 &= \frac{1}{(N-1)^2} \max\!\left(1,\ \sum_{y=0}^{N-2}\sum_{x=1}^{N-1} \Bigl|I(x,y)-\tfrac{1}{2}\bigl(I(x-1,y+1)+I(x,y+1)\bigr)\Bigr|\right) \\
g_4 &= \frac{1}{N(N-1)} \max\!\left(1,\ \sum_{y=0}^{N-2}\sum_{x=0}^{N-1} \bigl|I(x,y)-I(x,y+1)\bigr|\right) \\
g_5 &= \frac{1}{(N-1)^2} \max\!\left(1,\ \sum_{y=0}^{N-2}\sum_{x=0}^{N-2} \Bigl|I(x,y)-\tfrac{1}{2}\bigl(I(x,y+1)+I(x+1,y+1)\bigr)\Bigr|\right) \\
g_6 &= \frac{1}{(N-1)^2} \max\!\left(1,\ \sum_{y=0}^{N-2}\sum_{x=0}^{N-2} \bigl|I(x,y)-I(x+1,y+1)\bigr|\right) \\
g_7 &= \frac{1}{(N-1)^2} \max\!\left(1,\ \sum_{y=0}^{N-2}\sum_{x=0}^{N-2} \Bigl|I(x,y)-\tfrac{1}{2}\bigl(I(x+1,y)+I(x+1,y+1)\bigr)\Bigr|\right)
\end{aligned} \tag{1}
$$
`
where N is the size of the block and I(x,y) represents the pixel intensity
values. Indices x and y refer to the coordinates of a pixel inside the
block and k represents edge orientations. The prediction estimator 3
calculates the gradient values gk according to the formulae above.
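Two of the eight gradients, the horizontal (g0) and vertical (g4) ones, can be sketched directly from formula (1). The NumPy implementation is illustrative; only these two gradients are shown, and the striped test block is an invented example.

```python
import numpy as np

def g0(block: np.ndarray) -> float:
    """Average absolute horizontal (0 degree) gradient per (1):
    differences between horizontally adjacent pixels, the sum floored
    at 1 and normalised by N(N-1)."""
    n = block.shape[0]
    diff = np.abs(block[:, :-1] - block[:, 1:]).sum()
    return max(1.0, diff) / (n * (n - 1))

def g4(block: np.ndarray) -> float:
    """Average absolute vertical (90 degree) gradient per (1)."""
    n = block.shape[0]
    diff = np.abs(block[:-1, :] - block[1:, :]).sum()
    return max(1.0, diff) / (n * (n - 1))

# Vertical stripes: large horizontal differences (g0), minimal
# vertical differences (g4), so the ratio g4/g0 is small and the
# block would be flagged as having 90-degree directionality.
stripes = np.tile([0.0, 100.0], (4, 2))   # 4x4 block of columns 0,100,0,100
print(g0(stripes), g4(stripes))
```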
`
`5
`
Using the gradient values gk, gradient ratios rk, k = 0, 1, ..., 7 are
`defined as the ratio between the gradient value in a certain direction
`and gradient value in the orthogonal direction:
`
`
$$
r_0 = \frac{g_0}{g_4},\quad
r_1 = \frac{g_1}{g_5},\quad
r_2 = \frac{g_2}{g_6},\quad
r_3 = \frac{g_3}{g_7},\quad
r_4 = \frac{1}{r_0},\quad
r_5 = \frac{1}{r_1},\quad
r_6 = \frac{1}{r_2},\quad
r_7 = \frac{1}{r_3} \tag{2}
$$
`
`
`
`
`
`Based on the absolute gradient values gk and gradient ratios rk defined
`in (1) and (2), classification of the block is performed, advantageously
`according to the following classification steps 1-12 using some
`numerical values as thresholds. This classification process classifies
`each of the neighbouring blocks into one of a first set of block types
`D0-D10. The present invention is not limited to the values used in the
`algorithm, but the values used in the algorithm in the following steps are
`preferred. The method can also be applied to any block size.
`
In this advantageous embodiment of the invention the classification
phase comprises 13 steps, but it is obvious that the classification may
also comprise a different number of steps.
`
`Step 1
`In this step the flatness of the block is checked. Prediction estimator 3
`calculates gradient values g0 and g4. These correspond to gradient
`values for horizontal (0°) and vertical (90°) image details. If both
g0 ≤ 2.0 and g4 ≤ 2.0, the block is classified as class D8 and the initial
`classification process terminates. Otherwise, classification step 2 is
`performed.
`
`Step 2
`In this step a further check for flatness of the block is performed. The
rest of the gradient values gk are calculated, and the maximum gradient
value gmax = max{gk} is determined. The maximum gradient value gmax
is compared with 2.5. If gmax ≤ 2.5 the block is classified as class D8
`and the initial classification process terminates. Otherwise, the method
`continues from step 3.
`
`Step 3
In step 3 a check for clear directionality is performed. The gradient
ratios rk are calculated and the minimum gradient ratio rmin = min{rk} is
determined. When the minimum gradient ratio is found, the
corresponding index kmin is defined. If rmin ≤ 0.15 the block is classified
to the corresponding class Dkmin and the method continues from step 12,
otherwise the method continues from step 4.
`
`5
`
`1 O
`
`15
`
`20
`
`25
`
`30
`
`35
`
`
`
`
`
`Step 4
`In step 4 a check for texture is performed. The minimum gradient ratio
rmin is compared with 0.6. If rmin ≥ 0.6 the method continues from step
`13, otherwise the method continues from the next step.
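Classification steps 1 to 4 can be condensed into a small decision function. This is a sketch using the thresholds 2.0, 2.5, 0.15 and 0.6 stated in the text; the refinements in steps 5 onwards are omitted, and the "texture" return value standing in for the step-13 branch is an illustrative placeholder.

```python
def classify(g, r):
    """Sketch of classification steps 1-4, given the gradient values
    g[0..7] and gradient ratios r[0..7] of formulas (1) and (2).
    Returns a class name, "texture" for the step-13 branch, or None
    when the later steps (5 onwards, not shown) would be needed."""
    if g[0] <= 2.0 and g[4] <= 2.0:   # step 1: flat block
        return "D8"
    if max(g) <= 2.5:                 # step 2: still flat
        return "D8"
    r_min = min(r)                    # step 3: clear directionality
    if r_min <= 0.15:
        return f"D{r.index(r_min)}"   # class D(kmin)
    if r_min >= 0.6:                  # step 4: texture branch (step 13)
        return "texture"
    return None                       # continue from step 5

# A strongly horizontal block: tiny ratio r0 gives class D0 (0 degrees).
g = [0.5, 3.0, 5.0, 6.0, 8.0, 6.0, 5.0, 3.0]
r = [0.0625, 0.5, 1.0, 2.0, 16.0, 2.0, 1.0, 0.5]
print(classify(g, r))
```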
`
`Step 5
In step 5 the two smallest gradient ratios are checked to determine if
they are clearly distinct. The gradient ratios rk are sorted in increasing
order r(0) ≤ r(1) ≤ r(2) ≤ ... ≤ r(7). Also the gradient ratio indices are
reordered according to the sorted order k(0), k(1), k(2), ..., k(7). If

$$\frac{r_{(0)}}{r_{(1)}} < \frac{1}{2}\left(\frac{r_{(2)}}{r_{(1)}}\right),$$

the sixth classification step is performed next, otherwise the method
continues from the 10th classification step.
`
`Step 6
`In step 6 the smallest gradient r