IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 12, NO. 9, SEPTEMBER 2002
Transactions Letters

A Unified Approach to Restoration, Deinterlacing and Resolution Enhancement in Decoding MPEG-2 Video

Bo Martins and Søren Forchhammer
Abstract—The quality and spatial resolution of video can be improved by combining multiple pictures to form a single superresolution picture. We address the special problems associated with pictures of variable but somehow parameterized quality, such as MPEG-decoded video. Our algorithm provides a unified approach to restoration, chrominance upsampling, deinterlacing, and resolution enhancement. A decoded MPEG-2 sequence for interlaced standard-definition television (SDTV) in 4:2:0 is converted to: 1) improved-quality interlaced SDTV in 4:2:0; 2) interlaced SDTV in 4:4:4; 3) progressive SDTV in 4:4:4; 4) interlaced high-definition TV (HDTV) in 4:2:0; and 5) progressive HDTV in 4:2:0. These conversions also provide features such as freeze frame and zoom. The algorithm is mainly targeted at bit rates of 4–8 Mb/s. It is based on motion-compensated spatial upsampling from multiple images and decimation to the desired format. The processing involves an estimated quality of individual pixels based on MPEG image type and local quantization value. The mean-squared error (MSE) is reduced compared to the directly decoded sequence, and annoying ringing artifacts, including mosquito noise, are effectively suppressed. The superresolution pictures obtained by the algorithm are of much higher visual quality and have lower MSE than superresolution pictures obtained by simple spatial interpolation.

Index Terms—Deinterlacing, enhanced decoding, motion-compensated processing, MPEG-2, SDTV to HDTV conversion, video decoding.
I. INTRODUCTION

MPEG-2 [1] is currently the most popular method for compressing digital video. It is used for storing video on digital versatile disks (DVDs) and in the contribution and distribution of video for TV. We base this paper on the MPEG reference software encoder [2], for which a bit rate of 5–7 Mb/s yields a quality equivalent to (analog) distribution phase-alternating-line (PAL) TV quality. Lower bit rates are also used in TV distribution to save bandwidth and because professional encoders may provide better quality than the reference software.
Manuscript received December 1, 1999; revised May 2, 2002. This work was supported in part by The Danish National Centre for IT Research. This paper was recommended by Associate Editor A. Tabatabai.
B. Martins was with the Department of Telecommunication, Technical University of Denmark, DK-2800 Lyngby, Denmark. He is now with Scientific-Atlanta Denmark A/S, DK-2860 Søborg, Denmark (e-mail: bo.martins@sciatl.com).
S. Forchhammer is with Research Center COM, 371, Technical University of Denmark, DK-2800 Lyngby, Denmark (e-mail: sf@com.dtu.dk).
Publisher Item Identifier 10.1109/TCSVT.2002.803227.
At these bit rates, a sequence decoded from an MPEG-2 bitstream is of lower quality than the original digital sequence in terms of sharpness and color resolution, but still acceptable (except for very demanding material). This overall reduction of quality is less annoying to a human observer than the artifacts typically found in compressed video. The most annoying artifacts are ringing artifacts¹ and in particular mosquito noise, which occurs when the appearance of the ringing changes from picture to picture.

The primary goal of this paper is to improve MPEG-2 decoding, or rather to postprocess the decoded sequence, re-using information in the MPEG-2 bitstream to obtain a sequence of higher fidelity, especially with regard to the artifacts. The resulting output is a sequence in the same format as the directly decoded one, which in our case is interlaced standard TV in 4:2:0. In addition, we demonstrate how the approach can be used to obtain progressive (deinterlaced) or high-definition TV (HDTV) from the same bitstream. This also facilitates features such as frame freeze and zoom.

Previous work on postprocessing includes projections onto convex sets (POCS) [3] and regularization [4]. For low-bit-rate (high-compression) JPEG-compressed still images and MPEG-1-coded moving pictures, the main artifact is blocking, i.e., visible discontinuities at coding block boundaries. This artifact can be dealt with efficiently using the POCS framework [5], as well as by other methods [6]. By regularization, POCS constraints can be combined with "soft" assumptions about the sequence. Thus, Choi et al. [4] restored very-low-bit-rate video encoded by H.261 and H.263 according to the following desired (soft) properties: 1) smoothness across block boundaries; 2) small distance between the directly decoded sequence and the reconstructed sequence; and 3) smoothness along motion trajectories. Elad and Feuer [7] presented a unified methodology for superresolution restoration requiring explicit knowledge of parameters such as warping and blurring. As this knowledge is not available in our case, we do not take the risk of processing based on estimating such parameters. Patti et al. [8] also addressed the superresolution problem in a general setting modeling the system components. They applied POCS, performing projections for each pixel of each reference image in each iteration. Recently [9], this approach was modified to obtain superresolution from images of an MPEG-1 sequence captured by a specific video camera. Projections were carried

¹Ringing artifacts are caused by the quantization error of high-frequency content, e.g., at edges. They appear as ringing adjacent to the edge.

Authorized licensed use limited to: Cliff Reader. Downloaded on November 14,2023 at 06:50:12 UTC from IEEE Xplore. Restrictions apply.

1051-8215/02$17.00 © 2002 IEEE
out in the transform domain. Our goal is to develop simpler techniques (which could be combined with POCS).

The starting point of our work is the sequence decoded by an ordinary MPEG-2 decoder [2]. The material to be processed in this paper is of higher quality than MPEG-1 material or the low-bit-rate material of [4]. Consequently, there is a higher risk of degrading the material. Enforcing assumptions of smoothness on the material will almost surely lead to a decrease of sharpness. The basic idea of our restoration scheme is to apply a conservative form of filtering along motion trajectories, utilizing the assumed quality of the pixels on each trajectory. The assumed quality of each pixel in the decoded sequence is given by the MPEG picture structure (i.e., what type of motion compensation is applied) and the quantization step size for the corresponding macroblock.

The algorithm has two steps. In the first step, a superresolution version (default is quadruple resolution) of each directly decoded picture² is constructed. In the second step, the superresolution picture is decimated to the desired format. Depending on the degree of decimation of the chrominance and luminance in the second step, the problem addressed is one of restoration, chrominance upsampling, deinterlacing, or resolution enhancement, e.g., conversion to HDTV. The aim in restoration is to enhance the decoding quality. For the other applications, the resolution is also enhanced.

In the first part of the upsampling, directly decoded pixels are placed very accurately in a superresolution picture before further processing. This approach is motivated by the fact that the individual pictures of the original sequence are undersampled [9], [10]. We do not want to trade resolution for improved peak signal-to-noise ratio (PSNR) by spatial filtering at this stage, so the noise-reducing filtering is deferred to the decimation step.

The paper is organized as follows. In Section II, a quality value is assigned to each pixel in the decoded sequence. Part one (upsampling) of our enhancement algorithm is described in Section III. The second part (decimation) is described in Section IV. Results on a number of test sequences are presented in Section V.
II. PROCESSING BASED ON MPEG QUALITY

MPEG-2 [1] partitions a picture into 16 × 16 blocks of picture material (macroblocks). A macroblock is usually predicted from one or more reference pictures. The different types of pictures are referred to as I, P, and B pictures. I pictures are intracoded, i.e., they use no temporal prediction. Macroblocks in P pictures may be unidirectionally predicted, and macroblocks in B pictures may be uni- or bidirectionally predicted. (Macroblocks in B and P pictures may also be intracoded, as macroblocks in I pictures are.) The error block, resulting from the prediction, is partitioned into four luminance and two, four, or eight chrominance blocks of 8 × 8 pixels, depending on the format. For the 4:2:0 format, each macroblock has two chrominance blocks. The discrete cosine transform (DCT) is applied to each 8 × 8 block. The DCT coefficients are subjected to scalar quantization before being coded to form the bitstream.
A. Quality Measure for Pixels in an MPEG Sequence

From the MPEG code stream, the type (I, P, or B) and the quantization step size are extracted for each macroblock. Based on this information, we shall estimate a quality parameter for each pixel, which is used in a motion-compensated (MC) filtering. MPEG specifies the code-stream syntax but not the encoder itself. Our work is based on the reference MPEG-2 software encoder [2], for which the quantizers may be characterized as follows. The nonintra quantizer used for a DCT coefficient is (very close to) a uniform quantizer with quantization step q and a deadzone around zero. The intra quantizer used for a DCT coefficient has a deadzone of 5/4 q around zero. For larger values, it is a uniform quantizer with quantization step q, and the dequantizer reconstruction point has a bias of 1/8 q toward zero. In [2], as is usually the case, all DCT coefficients in all blocks are quantized independently as scalars.

The mean-squared error (MSE) caused by the quantization depends on the distribution of the DCT coefficients c(u, v). This distribution varies with the image content and is hard to estimate accurately. We may approximate the expected error by the expression for a uniform distribution of errors, within each quantization interval, resulting from a uniform quantizer with quantization step q(u, v) applied to c(u, v):

E{e²(u, v)} = q²(u, v)/12.    (1)

This expression may underestimate the error, as it neglects the influence of the deadzone, and it may overestimate the error, as the distribution of c(u, v) is usually quite peaked around zero, especially for the high frequencies.

The DCT is unitary (when appropriate scaling is applied). Thus, the sum of squares over a block is the same in the DCT and spatial domains. Applying this to the quantization errors and introducing the expected values gives the following relationship for each DCT block:

Σ_{u,v} E{e²(u, v)} = Σ_{x,y} E{e²(x, y)}    (2)

where the DCT coefficients c(u, v) are scaled as specified in [1] (Annex A) and (x, y) denotes the pixel value variables. As an approximation, we assume that the expected squared quantization errors are the same for all the pixel positions (x, y) within the DCT block. Based on this assumption, the expected value of the squared error for pixel (x, y) is given by

E{e²(x, y)} = (1/64) Σ_{u,v} q²(u, v)/12    (3)

for all (x, y) within the DCT block having coefficients c(u, v).

²In this paper, all pictures are field pictures.
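The error model of (1)–(3) can be sketched in a few lines. This is an illustrative sketch, not code from the paper; the function name and the use of NumPy are assumptions:

```python
import numpy as np

def expected_pixel_mse(qstep):
    """Expected squared quantization error per pixel for one 8x8 DCT block.

    qstep: 8x8 array of quantization step sizes q(u, v), one per DCT
    coefficient. Eq. (1) models each coefficient's error as uniform with
    variance q^2/12; because the DCT is unitary, the block's total squared
    error is the same in both domains (Eq. (2)), so Eq. (3) spreads it
    evenly over the 64 pixels.
    """
    qstep = np.asarray(qstep, dtype=float)
    total = np.sum(qstep ** 2 / 12.0)  # expected total squared error, DCT domain
    return total / qstep.size          # same expected error assigned to every pixel
```

With a flat quantization matrix q(u, v) = q, every pixel is simply assigned q²/12, which is the per-coefficient expression (1) again.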
Fig. 1. MSE measured for sequence table as a function of the quantization step size q (depicted using natural logarithms). For intra pictures, q is defined as the quantization step size for the DCT coefficient at (1, 1).

Fig. 1 depicts the logarithm of the MSE as a function of the logarithm of q for the luminance component of I, B, and P pictures. The figure reflects the fact that bidirectional prediction is better than unidirectional prediction, and that intra pictures and nonintra pictures are different. It is noted that we can use the expression (1) as a general approximation for the MSE of picture type t as long as we replace q with k_t q, where k_t is a constant which depends on the picture type, i.e.,

MSE_t ≈ (k_t q)²/12.    (4)

From the data in Fig. 1, we measure k_I, k_P, and k_B; these values are used in all the experiments reported. The intra and nonintra quantization matrices used [2] are different. This is, in part, addressed by the values of k_t. [The value of k_t was measured with q defined as the quantization step size for the coefficient at (1, 1).] The normalized quantization parameters k_t q in (4) are used as the quality value we assign to each pixel within the block. This measure is only used for relative comparisons and not as an absolute measure. It could be improved by taking the specific frequency content into account, as well as the precise quantization for each coefficient.

In general, pixels in the interior of an 8 × 8 DCT block have a smaller MSE than pixels on the border. We could assign a different quality value to interior pixels and pixels on the border. Experiments led to our decision of ignoring the small difference at our (high) bit rates and, as an approximation, using the same quality value (4) for all pixels in a block.
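The normalized quantization parameter of (4) reduces to a one-line mapping. The constants below are illustrative placeholders only; the paper measures k_I, k_P, and k_B from Fig. 1, and those measured values are not reproduced here:

```python
# Hypothetical picture-type constants k_t; the paper measures these from
# Fig. 1, but the actual values are not reproduced in this sketch.
K_T = {"I": 1.0, "P": 0.9, "B": 0.8}  # illustrative only

def normalized_q(q, picture_type):
    """Eq. (4): per-pixel quality value k_t * q. Used only for relative
    comparisons between pixels, never as an absolute MSE estimate."""
    return K_T[picture_type] * q
```

The ordering k_B < k_P < k_I in the placeholder table mirrors Fig. 1's observation that bidirectional prediction outperforms unidirectional prediction at the same quantization step.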
III. UPSAMPLING TO SUPERRESOLUTION USING MOTION COMPENSATION

To process a given (directly decoded) picture, we combine the information from the current frame, the N previous frames, and the N subsequent frames, where N is a parameter and each frame consists of two field pictures. We first describe how to align pixels of the current picture at time t with pixels of one of the reference pictures using motion estimation. Section III-A then describes how to combine the information from all the reference pictures to form a single superresolution picture at t.

Fig. 2. Overview block diagram. MC upsampling alternates between doubling the resolution vertically and horizontally. The final step is decimation to the desired format. Equation numbers are given in (). The dashed line marks control flow.

The term superresolution picture is used to refer to the initial MC upsampled high-resolution image. An overview of the algorithm is given in Fig. 2.
The motion field, relative to one of the reference pictures, is determined on the directly decoded sequence by block-based motion estimation using blocks of size 8 × 8. This block size is our compromise between larger blocks for robustness and smaller blocks for accuracy, e.g., at object boundaries. A motion vector is calculated at subpixel accuracy for each pixel s of the current picture relative to the reference field picture considered. Based on the position of s and the associated motion vector, one pixel r shall be chosen in the reference picture.

The motion vector is found by searching the reference picture for the best match of the 8 × 8 block which has s positioned as the lower-right of the four center pixels. The displacements are denoted by (D_y + d_y, D_x + d_x), where D is the integer and d the (positive) fractional part of the displacement relative to the position in the current picture; the y component is the vertical displacement. For a given candidate vector, each pixel s(y, x) of the 8 × 8 block is matched against an estimated value formed by bilinear interpolation of four neighboring pixels in the reference picture:

ŝ(y, x) = (1 − d_y)(1 − d_x) r(ȳ, x̄) + (1 − d_y) d_x r(ȳ, x̄ + 1) + d_y (1 − d_x) r(ȳ + 1, x̄) + d_y d_x r(ȳ + 1, x̄ + 1)    (5)

where r(ȳ, x̄) = r(y + D_y, x + D_x) is the pixel in the reference picture displaced (D_y, D_x) relative to the pixel s(y, x) in the current picture. (The coordinate systems of the two pictures are aligned such that the positions of the pixels coincide with the lattice given by the integer coordinates.) The subpixel resolution of the motion field, specified vertically by V and horizontally by H, determines the allowed values of d_y and d_x:
d_y ∈ {0, 1/V, …, (V − 1)/V} and d_x ∈ {0, 1/H, …, (H − 1)/H}. The best motion vector is defined as the candidate vector that minimizes the sum of the absolute differences (SAD) taken over the 64 pixels of the block. (How the set of candidate vectors is determined is described in Section III-C.) Let (y_r, x_r) be the absolute coordinate of pixel r in the reference picture obtained by displacing the position of the current pixel s by the integer part (D_y, D_x) of the best motion vector. The pixel value of r is now perceived as a (quantized) sample value of a pixel at position (V(y − d_y), H(x − d_x)) in a superresolution picture at time t which has V times the number of pixels vertically and H times the number horizontally relative to the directly decoded picture.
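The bilinear match of (5) and the SAD criterion can be sketched as follows. This is an illustrative implementation under the notational assumptions above (row-major (y, x) indexing, integer displacement already folded into the coordinates); function names are not from the paper:

```python
import numpy as np

def bilinear(ref, y, x, dy, dx):
    """Eq. (5): bilinear interpolation of the four neighbours of the
    displaced position (y + dy, x + dx) in the reference picture.
    (y, x) already include the integer part of the displacement;
    0 <= dy < 1 and 0 <= dx < 1 are the fractional parts."""
    return ((1 - dy) * (1 - dx) * ref[y, x]
            + (1 - dy) * dx * ref[y, x + 1]
            + dy * (1 - dx) * ref[y + 1, x]
            + dy * dx * ref[y + 1, x + 1])

def block_sad(cur_block, ref, top, left, dy, dx):
    """Sum of absolute differences between an 8x8 block of the current
    picture and its bilinearly interpolated match in the reference;
    the best candidate vector minimizes this over the 64 pixels."""
    sad = 0.0
    for i in range(8):
        for j in range(8):
            sad += abs(cur_block[i, j] - bilinear(ref, top + i, left + j, dy, dx))
    return sad
```

A motion-vector search would evaluate `block_sad` for every candidate (D_y, D_x) combined with every allowed fractional offset (d_y, d_x) and keep the minimizer.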
It is not sufficient, though, to find the best motion vector according to the matching criterion, as there is no guarantee that this is a good match. The following criterion is used to decide, for each r, whether it shall actually be placed in the superresolution picture. We may look at the problem as a lossless data compression problem (inspired by the minimum description length principle [11]). Let there be two alternative predictive descriptions of the pixels of the current 8 × 8 block, one utilizing a block of the reference picture and one which does not. If the best compression method that utilizes the reference block is better than the best method which does not, then we rely on the match. In practice, we do not know the best data compression scheme, but instead some of the best compression schemes in the literature may be used. For lossless still-image coding, we use JPEG-LS [12]. For lossless compression utilizing motion compensation, we chose the technique in [13], which may be characterized as JPEG-LS with motion compensation. For simplicity, the comparison is based on the sum of absolute differences. The JPEG-LS predictor [12] is given by

x̂ = min(a, b)   if c ≥ max(a, b)
    max(a, b)   if c ≤ min(a, b)
    a + b − c   otherwise    (6)

where a denotes the pixel to the left of x, b denotes the pixel on top of x, and c the top-left pixel.

We compare the (intra-picture) JPEG-LS predictor and the best MC bilinear predictor (5). If the former yields a better prediction of the pixels of the surrounding 8 × 8 block, we leave the superresolution pixel undefined (or unchanged) by not inserting (or modifying) an MC pixel at the corresponding superresolution position.

Checking the match reduces the risk of errors in the motion compensation process, e.g., at occlusions. Occlusions are also handled by performing the motion compensation in both directions time-wise, and by performing motion compensation at pixel level. This leads to a fairly robust handling of occlusions to within 3–4 pixels of the edge.
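The median predictor of (6) and the block-level comparison can be sketched as below. The predictor itself is the standard JPEG-LS (LOCO-I) median edge detector; the `accept_match` helper and its interface are assumptions illustrating the comparison the text describes:

```python
def jpegls_predict(a, b, c):
    """Eq. (6): the JPEG-LS median predictor. a is the pixel to the left,
    b the pixel above, and c the top-left pixel."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c

def accept_match(cur, mc_pred, intra_sad):
    """Hypothetical helper: keep the motion-compensated pixels only if
    the MC prediction of the surrounding 8x8 block beats the intra
    (JPEG-LS) prediction, both scored by sum of absolute differences.

    cur, mc_pred: flattened pixel values of the block and its MC
    prediction (5); intra_sad: SAD of the JPEG-LS prediction (6)."""
    mc_sad = sum(abs(s - p) for s, p in zip(cur, mc_pred))
    return mc_sad < intra_sad
```

The predictor clamps to min(a, b) or max(a, b) near edges and falls back to the planar estimate a + b − c in smooth regions, which is why it makes a strong intra baseline for the match check.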
A. Forming the Superresolution Picture

The superresolution picture is initially formed by mapping pixels from each of the reference pictures as described above. The implemented block-based motion-compensation scheme is described in Section III-C. If more than one reference pixel maps to the same superresolution pixel, the superresolution pixel is assigned the value of the reference pixel having the smallest value of the normalized quantization parameter obtained from q and the picture type (4). If the pixels are of equal quality, the superresolution pixel is set equal to their average value. We do not define an MC superresolution pixel if the best (i.e., smallest) normalized quantization parameter is significantly larger than the normalized quantization value of the current macroblock in the directly decoded picture.

Pixels of the current directly decoded picture a priori have a higher validity than the reference pixels because the exact location in the current picture is known. Let p be a pixel of the directly decoded picture at time t and r a pixel from a reference picture aligned with p within the uncertainty of the motion compensation. To estimate a new (superresolution) pixel value at the original sample position of p, we calculate a weighted value v of p and r by
v = h₁ p + h₂ r.    (7)

The filter coefficients in (7) may be estimated in a training session using original data. The (MSE) optimal linear filter is given by solving the Wiener–Hopf equations

E{S P} = h₁ E{P²} + h₂ E{P R}
E{S R} = h₁ E{P R} + h₂ E{R²}    (8)

where P and R are the stochastic variables of the pixels p and r in (7). The variables P and R represent quantized pixel values, whereas S represents a (superresolution) pixel at a sample position in the picture with the original resolution. The Wiener filter coefficients could, alternatively, be computed under the constraint that h₁ + h₂ = 1 in order to preserve the mean value. In our experiments on actual data applying (8), h₁ + h₂ was fairly close to 1, so we just proceeded with these estimates. Given enough training data, the second-order mean values in (8) could be conditioned on the quality of p and r, and on the types of the pictures of p and r, as well as other MPEG parameters. In this paper, the picture type is reflected by (4), and the number of free parameters is reduced by fitting a smooth function to the samples h₂(q_r/q_t). We choose the function below as it is monotonically increasing from 0 to 1 and as its behavior can be adjusted by just two parameters:

(9)
(10)

The parameter a specifies the a priori weight that r should carry. The parameter b specifies how much the difference in the qualities of p and r should influence the weight. The filter (9) has the property that, for pixels of equal quality, the weights reduce to the a priori values.
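Since the exact functional form of (9)–(10) is not reproduced here, the following sketch substitutes an illustrative two-parameter rational function with the stated properties (bounded between 0 and 1, equal to the a priori weight when the qualities match, decreasing as the reference quality worsens). The names `mc_weight`, `a`, and `b` are assumptions, not the paper's notation:

```python
def mc_weight(q_r, q_t, a=0.5, b=1.0):
    """Illustrative stand-in for Eq. (9): weight h2 of the reference
    pixel r as a function of the quality ratio q_r/q_t. Equals the a
    priori weight a when q_r == q_t, tends to 1 as the reference pixel
    becomes much better (q_r/q_t -> 0), and tends to 0 as it becomes
    much worse; b controls how sharply the ratio matters."""
    return a / (a + (1.0 - a) * (q_r / q_t) ** b)

def weighted_value(p, r, q_r, q_t):
    """Eq. (7) with h1 + h2 constrained to 1: blend the directly decoded
    pixel p with the motion-aligned reference pixel r."""
    h2 = mc_weight(q_r, q_t)
    return (1.0 - h2) * p + h2 * r
```

Constraining h₁ + h₂ = 1 here simply hard-codes the mean-preservation property that the paper observed to hold approximately for the unconstrained Wiener solution.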
The MC superresolution pixels, which do not coincide with the sample positions in the current image, maintain the quality value they were assigned in the reference picture. Pixels in the original sample positions, determined by (7), are assigned the quality value

(11)
filter used in the software coder [2] for 4:2:2-to-4:4:4 conversion:

(12)

Each pixel of the resulting superresolution picture is assigned the attribute of whether it was determined by motion compensation (MC) or interpolation. The MC pixels also maintain their quality value determined by (4) [and possibly modified by (11)] as an attribute.

Fig. 3. Block diagram of MC upsampling doubling the vertical or horizontal resolution. Equation numbers are given in (). Averages as expressed by (8)–(10) may also be used.
C. Speedup of Motion Compensation

The following scheme is applied to speed up the estimation of the high-resolution motion fields that are required for the reference pictures relative to the current picture. The very first motion field (estimating the displacement of pixels of the other field of the current frame relative to the current field picture) is found by an exhaustive search within a small rectangular window (±3 vertically and ±7 horizontally). For each of the remaining reference pictures, we initially predict the motion field before actually estimating the field by a search over a reduced set of candidate motion vectors. The motion field is initially predicted from the previously estimated motion fields using linear prediction, simply extrapolating the motion based on two motion vectors taken from two previous fields. (The offset in relative pixel positions between fields of different parity is taken into account in the extrapolation. After this, the motion vectors implicitly take care of the parity issue.) Having the predicted motion field (truncated to integer precision), we collect a list of the most common motion vectors appearing in the predicted motion field. Thereafter, the search for the integer part (D_y, D_x) of the motion vector in (5) is restricted to the small set on this list. All V·H fractional values of a motion vector are combined with the integer vectors on the list. Consequently, the final motion vector search consists of trying out a fixed number of vectors. This way, we hope to track the motion vectors at picture level without requiring the tracking locally. Thus, even with a small initial search area between the two fields of a frame, the magnitude of the motion vectors on the list may increase considerably, with no explicit limit to the magnitude. Very fast motion, exceeding the initial search area between two fields of the same frame, is not captured though. In the experiments, we use a fixed-size candidate list. The size of the list can be adjusted according to different criteria. As an example, including all motion vectors on the list with an occurrence count greater than some threshold T in the predicted motion field reduces the risk of overlooking the motion vector of an object composed of more than T pixels, as a motion vector is estimated for each pixel. An additional increase in speed for higher-resolution motion fields is obtained by letting them be simple subpixel refinements of the motion field found at the lowest subpixel resolution; the processing time for creating the high-resolution motion field is then reduced by approximately a factor of four for the usual resolution. As the size of the list with the updated vectors is fixed, the complexity is also proportional to the number of pictures specified by N.
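The candidate-list construction described above can be sketched in a few lines. The function names and the simplified extrapolation (which ignores the field-parity offset the paper accounts for) are assumptions for illustration:

```python
from collections import Counter

def extrapolate(v_prev, v_prev2):
    """Linear prediction of a pixel's next motion vector from the two
    previously estimated fields (field-parity offsets ignored in this
    sketch, although the paper accounts for them)."""
    return (2 * v_prev[0] - v_prev2[0], 2 * v_prev[1] - v_prev2[1])

def candidate_vectors(predicted_field, list_size):
    """Collect the fixed-size list of the most common integer motion
    vectors in the predicted (extrapolated) motion field. The subsequent
    search tries only these integer vectors, each combined with all V*H
    fractional offsets.

    predicted_field: iterable of (Dy, Dx) pairs, already truncated to
    integer precision."""
    counts = Counter(predicted_field)
    return [v for v, _ in counts.most_common(list_size)]
```

Because only vector *frequencies* matter, large coherent motions survive onto the list even when they far exceed the initial ±3/±7 search window, which is exactly the picture-level tracking effect the text describes.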
B. Completing the Superresolution Picture by Interpolation

A block diagram of the MC upsampling is given in Fig. 3. Consider a superresolution picture created by MC upsampling as described previously; V and H specify the resolution of the motion compensation (5). Usually, some of its pixels are undefined because there was no accurate match (of adequate quality) in any of the reference pictures. These pixels are assigned values from an interpolated superresolution picture having the same resolution. The interpolated picture is created by a 2:1 spatial interpolation of the high-resolution picture at the previous resolution level. This upsampling alternates between horizontal and vertical 2:1 upsampling.

The upsampling process is first initialized by setting the lowest-level picture equal to the directly decoded picture, which has the original resolution. Thereafter, the initialization is completed by spatial interpolation of this picture. Hereafter, the pictures at the higher resolutions may be created in turn, building up the resolution, alternating between horizontal and vertical 2:1 upsampling.³

The odd samples being interpolated in the upsampled picture are obtained with a symmetric finite-impulse response (FIR)

³The block-based motion-estimation method applied does not warrant higher precision of the motion field.
In order to keep the algorithmic complexity down, we base the decisions in the enhancement algorithm on analysis of the luminance component only, always performing the same operations on a chrominance pixel as on the corresponding luminance pixel. Additionally, no special action is taken at the picture boundaries apart from zero padding. The original motion vectors coming with the bitstream are disregarded, as a higher resolution is desired. They could be used, though, e.g., by including them on the list of predicted motion vectors.
IV. DECIMATION

The upsampling procedure only performed quality-based filtering for pixels located on the same motion trajectory (within our accuracy). In this section, we state a downsampling scheme applying quality-based spatial filtering of the superresolution pictures. The filter coefficient for each pixel should reflect the quality and the spatial distance of the pixel. The quality attributes are dependent on the MPEG quantization (4) and on whether the pixel is MC or interpolated. For all possible combinations of quality attributes within the filter window, the optimal filter could be determined given enough training data. Instead, we take the simpler approach of first assigning individual weights to each pixel depending on its attributes relative to the current pixel and then normalizing the filter coefficients.

A two-dimensional linear filter is applied to the samples of the superresolution picture in the vicinity of each sample position in the resulting output image of lower resolution. The filter is a product of a symmetric vertical filter, a symmetric horizontal filter, and a function reflecting the quality. The weight of the superresolution pixel at position (i, j) relative to the output sample is

(13)

In this expression, the weight g is a function of the quality attributes of the pixel, and C is a normalizing factor. The 1-D filters f_v and f_h, reflecting the spatial distance, are defined as follows:

(14)
(15)
(16)

It is noticed that the support of the low-pass filter is only a few superresolution pixels, or approximately the area of one low-resolution pixel. This very small region of support is chosen to reduce the risk of blurring across edges in the decimation process. Furthermore, the value of the filter parameter should be quite small because very often the individual pictures are undersampled. In the experiments, we use a fixed parameter value.
The function g in (13), reflecting the quality, depends on whether the pixels involved are MC superresolution pixels or whether they were found through interpolation. When both the superresolution pixel and the pixel at the output sample position are MC, their relative quality parameters are used to determine the weight. If one of the pixels is obtained by interpolation, a constant is used for the weight:

(17)

One of the parameters specifies the a priori worth of an MC pixel compared to an interpolated pixel. The last case in (17), where there is no MC superresolution pixel at the output sample position, may occur in conversion to HDTV and in chrominance upsampling. When restoring SDTV, there will always be the directly decoded pixel at the output sample position, ensuring a defined pixel there.

One of the remaining parameters is a global parameter (set to 0.5), whereas another is inversely proportional to a local estimate (within a small region) of the variance of the superresolution picture at the output position; a third is set to 6. The structurally simple downsampling filter specified by (13)–(17) has only four parameters. The downsampling filter also attenuates noise, e.g., from (small) inaccuracies in the motion compensation. (Larger inaccuracies in the motion compensation are largely avoided by checking the matches and only operating on a reduced list of candidate motion vectors.)
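The structure of (13)–(17) can be sketched as a normalized product of spatial and quality factors. The triangular 1-D filters and the quality factors below are illustrative stand-ins; the paper's exact filters (14)–(16) and parameters are not reproduced:

```python
import numpy as np

def decimation_weights(dist_v, dist_h, quality_factor):
    """Sketch of Eqs. (13)-(17): the weight of each superresolution pixel
    is a product of a symmetric vertical filter, a symmetric horizontal
    filter, and a quality factor g, normalized to sum to 1.

    dist_v, dist_h: vertical/horizontal distances (in low-resolution
    pixel units) from the output sample position.
    quality_factor: per-pixel factor g from Eq. (17), e.g. derived from
    relative normalized quantization for MC pixels, or a constant for
    interpolated pixels (illustrative; not the paper's exact g)."""
    # Triangular filters with small support (~ one low-resolution pixel),
    # chosen here to mimic the paper's deliberately narrow window.
    f_v = np.maximum(1.0 - np.abs(np.asarray(dist_v, dtype=float)), 0.0)
    f_h = np.maximum(1.0 - np.abs(np.asarray(dist_h, dtype=float)), 0.0)
    w = f_v * f_h * np.asarray(quality_factor, dtype=float)
    return w / w.sum()  # the normalizing factor C

# Output pixel = sum over the window of weight * superresolution pixel value.
```

Keeping the support narrow, as the paper argues, trades noise suppression for a lower risk of blurring across edges during decimation.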
V. RESULTS

Four sequences were encoded: table, mobcal, tambour-sdtv, and tambour-hdtv. The extremely complex tambour sequence has been used both in interlaced SDTV and in HDTV format. For SDTV, the format is 4:2:0 PAL TV, i.e., the luminance frame size is 720 × 576 and the frame rate is 25 frames/s. For HDTV, the resolution is doubled horizontally and vertically.

The parameters of the filter expression (9) are estimated using a small number of frames of the sequence mobcal. Calculating the Wiener filter (8), we assume implicitly that the "original" pixels of the superresolution picture taken at the sample positions are equal to the original (low-resolution) pixels of the SDTV test sequence. This yields the curve h₂(q_r/q_t) depicted in Fig. 4. Fitting the filter parameters of (9) to this curve yields the two parameter values used in the processing of all the test sequences. Besides the curve based on average values over all q_t, curves of h₂ were recorded for different fixed values of q_t. These curves differ from the average in shape, as well as in level, e.g., expressed by the value for q_r = q_t, i.e., h₂(1). For most of the occurrences, h₁ + h₂ was close to 1. The irregular shape of the curves for larger values of q_r/q_t reflects the sparse statistics and, due to this, the dependency on the specific data that was used for estimating the Wiener filter. The overall level of h₂ was observed to increase with increasing q_t, reflecting the fact that the motion estimation inaccuracy becomes less important when the quantization error is large.
Fig. 4. Wiener filter coefficient h₂ as a function of q_r/q_t for a piece of mobcal. The smooth function is the filter expression obtained by fitting (9). The other curves are h₂(q_r/q_t) for all q_t and for a small fixed value of q_t (=12).

Fig. 6. Average improvement in PSNR for luminance and chrominance for all sequences (res) using parameters H = V = 4 and N = 5. The result of increasing the bit rate by 1 Mb/s (+1) is given for comparison.
Fig. 5. PSNR of directly decoded sequences
