`
`IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 3, NO. 5, SEPTEMBER 1994
`
`
`
`Spatially Adaptive Subsampling of Image Sequences
`
`
`
`Ricardo A. F. Belfor, Member, IEEE, Marc P. A. Hesp,
`
`Reginald L. Lagendijk, Member, IEEE, and Jan Biemond, Fellow, IEEE
`
`Abstract- In a spatially adaptive subsampling
`scheme, the
`
`
`subsampling lattice is adapted to the local spatial frequency
`In this paper, we use rate-distortion
`content of an image sequence.
`
`theory to show that spatially adaptive subsampling gives a better
`A
`
`
`performance than subsampling with a fixed sampling lattice.
`Original
`
`new algorithm that optimally assigns sampling lattices to dift'erent
`parts of the image is presented. The proposed spatially adaptive
`
`
`subsampling method can be applied within a motion-compensated
`actions in the spectral domain.
`coding scheme as welL Experiments show an increased perfor
`mance over fixed lattice subsampling.
`
`Prefilter
`
`IIl
`LIJ
`
`Interpolation
`
`Fig. I. Basic subsampling scheme. The figures in the boxes illustrate the
`
`I. INTRODUCTION
`
`S ?BSAMP�ING is a
`
`different spatial sampling lattices to the blocks is described
`
`
`based on rate-distortion theory. This theory has been used
`
`successfully in the past for solving the problem of optimally
`basic
`data compression method in
`
`
`dividing bits among different channels [6]. The advantage
`.
`.
`tmage codmg. By dtscardmg a part of the pixels, the
`
`of the algorithm is that the set of possible lattices is not
`
`
`
`image can be transmitted more efficiently. An application of
`
`
`limited and that, under certain conditions, optimality can be
`
`
`
`fixed lattice subsampling is the coding of the color information
`guaranteed.
`
`in image sequences [I]. Because of the oversampling of the
`A further compression can be achieved by exploiting the
`
`
`
`
`
`color information in current video standards, substantial data
`
`temporal correlation in an image sequence. In a subsam
`
`
`
`compression can be obtained. Fixed subsampling, however,
`pling scheme, this is normally done by using sub-Nyquist
`
`discards a part of the spectrum without any consideration as
`
`subsampling techniques [7], [8]. The approach taken in our
`
`
`to the actual content of the image. Such an approach causes
`
`
`scheme is to use motion-compensated prediction as used in
`
`an unacceptable loss of resolution when applied to luminance
`
`
`
`hybrid coders. If the prediction error is small, no additional
`
`information. Therefore, an extension of basic subsampling
`
`
`information has to be transmitted. Thus, the spatially adaptive
`
`should be made in order to account for the nonstationary nature
`
`subsampling analogy is extended by decreasing the temporal
`of image sequences.
`
`sampling rate if the temporal activity is low.
`In a spatially adaptive subsampling scheme, the image is
`
`
`In Sections II and III some theoretical background is pre
`
`subdivided into square blocks, and each block is represented
`
`
`sented about the basics of subsampling and spatially adaptive
`
`by a specific spatial sampling lattice. In detailed regions,
`
`subsampling. The purpose of the theory is to provide a
`a dense sampling lattice is used, and in regions with little
`
`
`mathematical framework for analyzing the performance of
`detail, a sampling lattice with only a few pixels is used.
`
`
`adaptive subsampling system. We use the rate-distortion theory
`
`The choice of which lattice to use is determined by a rate
`
`
`to prove that spatially adaptive subsampling works better than
`
`
`and quality controller. In [2], the time axis transform (TAT)
`
`
`
`fixed lattice subsampling. The practical implementation of a
`
`
`system that optimally assigns a sampling lattice to each block
`
`spatially adaptive coding system is described in Sections III
`
`was described. This algorithm allows only for two different
`
`and IV, and is finally evaluated in Section V.
`
`sampling lattices. A nonoptimal solution for the assignment
`problem in a system with three different sampling lattices
`
`was presented in [3]. This method was not optimal because
`no attempt was made to search for the global minimum
`Subsampling can be defined as representing an image se
`
`
`
`distortion, and the algorithm stops in what may be a local
`quence on a new sampling lattice with a lower sampling
`
`minimum. A variation of this algorithm was presented in [4].
`
`
`density than the original lattice. A simple subsampling scheme
`
`The number of different sampling lattices can be extended by
`
`is shown in Fig. 1. To prevent aliasing caused by subsampling,
`
`using an hierarchical approach [5]. However, this introduces
`
`the spatial frequencies of the image sequence should be
`
`the problem of distributing the bit rate over the different levels
`confined to a unity cell [9]. This is a region in the frequency
`
`
`in the hierarchy. In this paper, a new algorithm that assigns
`
`
`domain associated with a given sampling lattice and defined in
`Manuscript received April 1, 1993; January 24, 1994. This work was sup
`
`such a way that by tiling this region, a complete coverage of
`
`ported by NATO grant 5-2-05/CRG 900834. The associate editor coordinating
`
`the review of this paper and approving it for publication was Prof. Dr.-Ing.
`
`the frequency plane can be obtained without any overlap. This
`Bernd Girod.
`
`can be seen as an extension of the Nyquist sampling theorem
`
`
`
`The authors are with the Department of Electrical Engineering, Information
`
`
`to multiple dimensions on an arbitrary sampling lattice. A
`
`Theory Group, Delft University of Technology, Delft, The Netherlands.
`IEEE Log Number 9402257.
`
`prefilter can be used to confine the original image spectrum to
`
`II. FIXED LATIICE SUBSAMPLING
`
`© 1994 IEEE
`1057-7149/94$04.00
`
`Google Inc.
`GOOG 1007
`IPR of US Pat. No. 7,974,339
`
`1
`
`
`
`BELFOR eta/.: SPATIALLY ADAPTIVE SUBSAMPLING OF IMAGE SEQUENCES
`
`493
`
`S(ro)
`
`Fig. 2. Power density function. The shaded areas are the components that
`contribute to the distortion. The dark shade represents the aliasing component
`and the light shade the loss of resolution.
`
`a prefilter H ( w) is used
`If we assume that prior to subsampling
`with a cut-off frequency at W88, then the corresponding
`mean
`
`square error distortion function is given by
`
`(1- IH(w)I2)S(w)dw +-
`D = -
`1171"
`1171"
`
`7r 0
`
`7r w "
`
`IH(w)I2S(w)dw
`(3)
`where we have taken into account the symmetry of S(w)
`and H ( w) around the origin. The first part of the equation
`
`
`
`
`represents the distortion introduced by the prefilter, and the
`
`second part is the aliasing error caused by an imperfect
`
`
`
`prefilter. The two areas that contribute to the distortion are
`the shaded regions in Fig. 2. If an ideal prefilter is used with
`transfer function
`
`H(w) = g:
`
`lwl < Wss
`Wss < lwl < 7r
`
`(4)
`
`the unity cell. If the original spectrum is already confined to
`
`the unity cell, prefiltering is optional.
`If the new sampling lattice is a subset of the original
`
`
`
`lattice, the actual subsampling can be implemented by simply
`
`
`discarding those pixels not present in the new lattice. If
`
`
`this is not the case, an intermediate sampling structure that
`
`bears a relation to both the original and the new lattice must
`be used [10]. In a subsampling data compression system,
`(2) in (5) and using (1) gives the distortion-rate
`Substituting
`function D (R):
`
`the remaining pixels after subsampling are transmitted or
`
`forwarded to subsequent coding or processing stages.
`
`At the receiver, the image sequence has to be reconstructed
`
`to the original sampling lattice. This is done with an inter
`polation filter. This filter has to be designed in such a way
`
`that the replicas introduced by the subsamp1ing process are
`cancelled out. Further, it should not remove the frequency
`
`Let us consider the properties of this distortion-rate function.
`of D (R) is given by The first derivative
`components within the unity cell because this would cause a
`
`loss of resolution.
`
`We investigate several coding aspects of fixed lattice sub
`
`sampling from a rate-distortion point of view [11]. Properties
`derived here are used in the next section when discussing
`
`
`
`spatially adaptive subsampling. The rate-distortion function
`The first derivative of the rate-distortion function is always
`
`
`R(D) provides a lower bound for the bit rate R necessary to
`because S(w) is always greater or
`
`monotonically decreasing
`
`transmit a source with an average distortion D. We derive the
`
`equal to zero. Thus, (7) is the trivial result wherein for every
`
`
`rate-distortion function for subsampling as a data reduction
`
`
`coding scheme, the distortion increases if the rate decreases.
`
`
`method to show the conditions under which subsampling is
`
`
`Another property of rate-distortion curves is convexity. This
`
`appropriate and to study the source of the different errors.
`is a required property because it implies that the parts of
`
`
`We assume a band-limited spatially discrete source that is
`the spectrum with the least relevance to the entire signal are
`density function S(w).
`PCM encoded and has a power spectral
`discarded first. A function is convex if the second derivative
`An example of such a function is shown in Fig. 2. According
`
`
`is monotonically decreasing. From (7), we obtain
`theorem, the variance a2 of this source is
`to Parseval's
`
`then the aliasing component is zero, and (3) reduces to
`
`1171"
`
`D =-
`7r Wss
`
`S(w)dw.
`
`(5)
`
`D (R) = a2- -
`7r 0
`
`11:,71"
`
`S(w)dw.
`
`aD =- _!__s(.!!__1r) .
`
`Ro Ro
`
`8R
`
`(6)
`
`(7)
`
`a2 = - S(w)dw.
`1 !7r
`
`27r -71"
`In a fixed lattice subsampling scheme, the spectrum of
`We see that if the power spectral density function S(w) is
`the input source is low-pass filtered and then subsampled
`w (i.e., S' (w) < 0),
`
`
`monotonically decreasing with increasing
`according to the Nyquist theorem. If the rate required to
`
`then the rate-distortion function is convex. Thus, in subsam
`
`transmit the original source using PCM coding is R0, then
`
`pling schemes, the convexity of the rate-distortion curve is
`
`
`the new rate after subsampling is
`of S(w). If the
`
`directly coupled to the decreasing character
`
`power spectrum density is nondecreasing, the rate-distortion
`7r
`curve may become nonconvex.
`where Wss is the bandwidth after prefiltering. Hence, the new
`
`If the argument presented in this section is extended to two
`
`
`dimensions, then after pre filtering and subsampling,
`
`bit rate is reduced proportionally with the bandwidth reduction.
`
`(1)
`
`(2)
`
`(8)
`
`S ( Wx, Wy)
`
`R = Wss · Ro(bits/pixel)
`
`T -r -
`
`2
`
`
`
`494
`
`IEEE TR ANSACTIONS ON IMAGE PROCESSING, VOL. 3, NO. 5, SEPTEMBER I994
`
`Subsampling
`Input
`, ... -- -- -- -----'
`'
`'
`' ·�
`S,(ro)'
`� i------7---+ _ill_ i R,
`' I
`_cb_ �
`�
`R2
`'
`'
`
`
`
`the following relation holds for the bit rates:
`
`
`
`R, = Rz = Rt.
`
`D[ ( Rt) can be computed by inserting
`The total distortion
`
`this relation into (10):
`
`(11)
`
`(12)
`
`S,(ro)
`
`If spatially adaptive subsampling is used, the bits can
`
`
`be divided over the two regions in an uneven manner. In
`
`an optimal bit allocation where the mean square error is
`
`
`
`minimized, the resulting bit rate for each region is controlled
`
`by the following two relations:
`
`Fig. 3. Model for spatially adaptive subsampling.
`
`(13)
`
`and
`
`
`
`does not extend over a specific bandwidth but covers a region
`
`U L in the 2-D space. This region is the unity cell prescribed
`_ Area(UL)
`- (27r)2 Ro
`(2�)2 J J S(wx, wy)dwxdwy.
`(wx,wy)E UL
`
`
`
`
`
`
`
`by the subsampling lattice. Equations (2) and (5) now become
`where Df(Rt) is the distortion
`in a spatially adaptive coding
`
`scheme. We now show in the following that if an optimal
`is used, Df is always less than or equal to
`bit allocation
`
`D[. Using the transformation RI/2 - R2/2 = �r, where
`I� r I represents
`
`the difference between the allocated bit rate
`assigned to each region and the desired bit rate, (13) can be
`
`inserted into (14 ):
`
`R
`
`(9)
`
`D = u2-
`
`minDf(Rt) = minD,(Rt + �r) + Dz(Rt- �r).
`
`�r
`
`�r
`
`(15)
`
`The optimal bit allocation can now be solved by setting the
`
`Without any specific knowledge about the 2-D subsampling
`to �r to zero:
`first derivative with respect
`
`
`
`lattice, it is not possible to obtain a close form relation for the
`
`rate-distortion function.
`
`8D1(Rt + �r)
`
`8Dz(Rt- �r) _ 0
`
`.
`
`+
`
`(16)
`
`-
`a�r
`a�r
`yields the following expresInserting (7) into this expression
`
`
`sion for the optimal bit allocation:
`
`III. SPATIALLY ADAPTIVE SUBSAMPLING
`
`A. A Model of a Spatially Adaptive Subsampling Scheme
`
`exist and are monotonically
`
`meaning that
`In order to examine the advantage of spatially adaptive
`
`more bits are assigned to the first region. If the power spectral
`
`
`subsampling over fixed subsampling, a simple model is in
`
`density function of the first region is greater, then (7) implies
`
`troduced. A block diagram of the model is given in Fig. 3.
`
`that the slope of the rate-distortion curve of the first region is
`
`To account for the nonstationary nature of the input signal,
`greater than that of the second region. This leads to
`
`we consider a signal that consists of two distinct regions
`
`(e.g., blocks in an image) with different statistics so that
`S 1 ( w) and 82 ( w) differ
`the power spectral density functions
`where we have used the convexity of the rate-distortion
`
`from each other. We assume that the power spectral density
`
`
`function. This relation indicates that the gain in assigning
`functions
`
`decreasing; therefore,
`more bits to the first region is larger than the loss caused by
`
`
`
`the corresponding rate-distortion functions are convex. Both
`
`
`
`
`assigning fewer bits to the second region. Rewriting (18) gives
`regions have the same number of samples, and all samples are
`bit rate Rt is
`fed into a single coding scheme. The resulting
`given by the average of the two bit rates R1 and R2 used to
`Dt(Rt) after coding
`
`code each region. The resulting distortion
`is given by
`The above argument also holds if the characteristics of the
`
`
`first and second regions are exchanged. If the power densities
`
`differ, then the performance of spatially adaptive subsam
`
`pling is always better than the performance of nonadaptive
`where D, ( R I ) and D2 ( R2) are the rate-distortion
`functions
`subsampling.
`S,(1rRt) is equal to Sz(1rRt). In this case, the performance
`of each region separately.
`
`Let us first consider the case of fixed subsampling. Because
`
`
`
`of fixed subsampling is obviously identical to the performance
`both regions are encoded with the same sampling frequency,
`
`of spatially adaptive subsampling.
`
`If S1(1rRt) > S2(1rRt), then �r is positive,
`
`(17)
`
`D,(Rt) + Dz(Rt) > D,(Rt + �r) + Dz(Rt- �r)
`{::} D[(Rt) > Df(Rt)·
`
`(19)
`
`(10)
`
`If the solution to (17) yields �r = 0, then
`
`3
`
`
`
`BELFOR et al.: SPATIALLY ADAP TIVE SUBSAMPLING OF IMAGE SEQUENCES
`
`495
`
`••••••••
`••••••••
`••••••••
`••••••••
`••••••••
`••••••••
`••••••••
`••••••••
`
`Mode 1
`
`eoeoeoeo
`00000000
`eoeoeoeo
`00000000
`eoeoeoeo
`00000000
`eoeoeoeo
`00000000
`
`Mode 2
`
`eoooeooo
`00000000
`00000000
`00000000
`eoooeooo
`00000000
`00000000
`00000000
`
`Mode 3
`
`Fig. 4. Examples of different modes. The solid dots are the pixels that are
`transmitted.
`
`B. Spatial Adaptive Subsampling in Practice
`
`Now that we have shown that spatially adaptive subsam
`pling works better that fixed lattice subsampling from a
`theoretical point of view, we consider the practical imple
`mentation. The ideal case would be to segment the image
`into regions that require the same spatial sampling frequency
`and sample each region according to this frequency. Such
`a solution would require a detailed analysis of the image,
`and a large amount of side information would be needed to
`transmit the shape of the regions. Therefore, we subdivide the
`image into square blocks, and within each block, one specific
`sampling lattice is used. The size of the blocks is an important
`system parameter. If large blocks are chosen, the amount of
`side information is low, but the ability to adapt to the local
`spatial frequency contents would be lost. Small blocks cause
`a large overhead but warrant a better adaptation.
`Another consideration in a practical system is the sampling
`lattice to be used for each block. Ideally, each block should
`be sampled with a sampling lattice optimally suited for that
`particular block. Again, this implies a large amount of side
`information. Therefore, only a limited set of possible sampling
`lattices is used. This set is designed in such a way that it
`gives a good coverage of the range of all the necessary spatial
`frequencies. In the sequel, each specific sampling lattice is
`called a mode. In Fig. 4, some examples of modes are given
`with different data reduction factors. For instance, in mode
`
`3, only four pixels are kept out of 64 pixels, giving a data
`
`reduction factor of 16. Mode 1 can be used for highly detailed
`regions, whereas mode 3 can be used for areas with a slowly
`varying luminance. The number of possible modes is affected
`by the block size because for decreasing block size, the number
`of possible sampling lattices within the block decreases as
`well.
`In a constant bit rate application, the output of the spatially
`adaptive subsampling scheme should be a fixed number of
`samples. Thus, the available modes should be distributed over
`the different blocks in such a way that the weighted sum
`of all the modes is equal to the desired bit rate. Of course,
`the image quality should be the best possible. Therefore, a
`criterion function that reflects the quality of the block for
`
`a particular mode needs to be used. In the next section,
`
`an allocation algorithm is discussed for this optimal mode
`distribution problem.
`The resulting overall coding scheme is shown in Fig. 5.
`The input image is first prefiltered and subsampled for each
`
`Fig. 5. General spatially adaptive coding scheme.
`
`mode. The subsampled images are fed into an interpolation
`module that also evaluates the quality criterion. The quality
`of each mode is used in the mode allocation that assigns a
`particular mode to each block. Finally, this information is
`transmitted to the receiver together with the pixels remaining
`after subsampling all blocks.
`If, at the receiver, each block is interpolated using a tech
`nique that involves only the pixels within the block, the
`interpolation of each block is straightforward. However, if
`a more sophisticated interpolation technique is used, such as
`a filtering, information from neighboring blocks is required.
`This poses a problem if a neighboring block is sampled
`with a different mode because the pixels necessary for the
`interpolation may not be available. To avoid such problems, a
`hierarchical set of modes should be used. In a hierarchical set
`
`mode, n+ 1 is always a subset of mode n:
`{x I x E moden+d C {x I x E moden } , Vn
`
`(20)
`
`where the vector x represents a pixel location. For instance,
`the modes given in Fig. 4 form an hierarchical set. The mode
`with the smallest sampling density can be now be interpolated
`because the required boundary pixels are always present in
`the neighboring blocks. After interpolating all blocks with
`this mode, the same argument holds for the next mode in
`the hierarchy. It is obvious that this interpolation scheme will
`result in a worse interpolation result as compared with the
`interpolation made at the transmitter where all the required
`pixels are always present in the neighboring blocks. The
`severity of this loss of performance is examined in Section
`V. If a nonhierarchical set is used, a common intermediate
`sampling lattice must be introduced. From this lattice, the
`boundary pixels for the different modes can be deduced, and
`interpolation is still possible.
`
`C. The Mode Allocation Problem
`
`The mode allocation is of great significance as it influences
`the output quality considerably. A brute force search tries all
`the modes on all the blocks and selects the combination that
`gives the smallest distortion. In a practical situation, this is not
`
`feasible because in a system with N different modes and M
`blocks, the number of possible allocations is equal to N M. As
`
`the number of blocks grows linearly, the number of allocations
`increases exponentially.
`In a system where only two modes are used, the mode
`allocation problem can be solved analytically [2]. If the
`fraction of blocks sampled with mode 1 at a rate of R1 is
`equal to o:1 and the fraction of blocks sampled with mode 2 at
`
`4
`
`
`
`4%
`
`IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 3, NO. 5, SEPTEMBER 1994
`
`
`relations hold:
`500
`
`D (MSE)
`
`(21)
`
`400
`
`300
`
`200
`
`100
`
`a rate of Rz is az (R1 > Rz) then the following
`0:1 + 0:2 1
`a1R1 + a2R2 = Rt
`where Rt is the desired total bit rate. From these equations,
`a1 and az can be solved. To minimize
`the distortion, all the
`
`a1 of
`blocks are first assigned to mode 2. Next, a fraction
`
`the blocks with the highest distortion are assigned to mode l.
`
`This will lead to an optimal mode assignment in the sense of
`
`minimizing the overall distortion.
`If the number of modes is greater than two, (21) generalizes
`
`to a pair of equations with more than two unknowns, which
`
`
`cannot be solved uniquely. In [3], a heuristic algorithm is
`
`presented that does not guarantee optimality. Further, this
`
`
`
`algorithm is only suitable for three modes and is based on the
`
`2-D histogram of the error differences between the different
`Fig. 6. Evaluated points in a mode allocation with four blocks and three
`
`modes. A variation of this algorithm was described in [4].
`different modes.
`
`
`
`
`
`Here, we describe a mode allocation algorithm that allows for
`
`
`an arbitrary number of modes and is under some conditions
`optimal.
`
`
`The algorithm is based on the convex hull bit allocation
`
`
`scheme used for assigning quantizers to the subbands in a
`
`subband coding scheme [6]. Two modifications have to be
`made to this algorithm:
`of assigning mode j(j E {1 ... N}) to that source is com
`
`• The sources that have to be coded correspond to the
`
`
`
`puted. The source with the smallest relative distortion gain
`(min;,j 6.D;i/ 6.R;j) is assigned a new mode. In [6], it is
`blocks into which the image is subdivided.
`
`
`• The quantizers correspond to the different mode struc
`proved that due to the convexity of the rate-distortion curve,
`
`tures.
`
`
`this new mode allocation again lies on the rate-distortion curve.
`
`Two assumptions made in the original algorithm require
`
`
`Starting from the new allocation, the described procedure is
`
`
`special attention. First, it is necessary that the rate-distortion
`
`
`
`repeated. The algorithm terminates when the desired bit rate
`curve of each source is convex. In Section II, we saw that this
`
`
`is reached. Hence, instead of evaluating all the possible mode
`
`is not always the case and depends on the shape of the power
`
`
`
`allocations, only those lying on the rate-distortion function are
`
`spectral density function of each block. For a whole image,
`examined. This causes a drastic decrease of the number of
`
`this may be a valid assumption, but for a small block in an
`
`
`
`iterations and guarantees optimality. An example of the points
`
`evaluated is given in Fig. 6 for a system with M 4 different
`
`image, this will no longer hold in general. A solution to this
`problem is to remove the modes that cause the rate-distortion
`blocks and N
`
`3 different modes different modes. In this
`
`
`
`case, a brute force allocation requires 34 = 81 iterations. The
`
`
`curve to be nonconvex from the allocation process. The impact
`
`
`of this is that some allocations are no longer possible. If the
`
`fundamental benefit of the proposed search algorithm is to
`
`
`optimal solution is contained in this set of excluded allocations,
`
`
`use an iterative method, which automatically follows the nine
`
`then the mode allocation in no longer optimal. The second
`
`
`combinations being optimal from the rate-distortion point of
`
`view (solid triangles in Fig. 6).
`
`
`requirement is that the total distortion is equal to the sum of
`
`the distortion for each source:
`M
`
`0.25
`
`0.5
`
`R/Ru
`
`0.75
`
`x Brute Force Allocation ..... Convex Hull Allocation
`
`and the distortion difference
`
`6.D;j
`
`(24)
`
`Dt LD;(Ri)·
`i=l
`
`(22)
`
`IV. MOTION-COMPENSATED
`SPATIALLY ADAPTIVE SUBSAMPLING
`
`As we saw in the previous section, this condition cannot
`
`
`Motion compensation has been used in many coding
`
`
`
`be satisfied because of the nonavailability of boundary pixels
`
`schemes to exploit the temporal correlation in image
`
`
`at the receiver. Hence, the mode allocation is performed on
`
`sequences. If a correct motion estimate is made, then no
`
`an estimate of the distortion. The effect of this problem is
`
`
`
`additional information has to be transmitted except for the
`
`discussed in Section V.
`
`
`
`motion vectors. A spatially adaptive subsampling scheme can
`The mode allocation algorithm starts by assigning each
`
`
`
`benefit from this property if, for a particular region, the spatial
`block i( i E { 1 ... M}) the mode with the lowest possible
`bit rate Ro;. This point gives the highest distortion,
`
`
`correlation is low but the temporal correlation is high. In this
`and it
`
`
`case, spatially adaptive subsampling would require a lot of
`
`is guaranteed that this point lies on the rate-distortion curve.
`
`
`samples, whereas motion compensation requires none. The
`Starting from this point, for each block i, the rate difference
`overall system is shown in Fig. 7. The shaded areas contain
`the components that were also present in the basic spatially
`(23)
`
`adaptive subsampling scheme as shown in Fig. 5.
`
`6./lti equal to
`
`5
`
`
`
`BELFOR et al.: SPATIALLY ADAPTIVE SUBSAMPLING OF IMAGE SEQUENCES
`
`497
`
`An advantage of this scheme is that it implicitly adjusts the
`threshold for the decision between intra and interframe coding.
`If the desired bit rate is high, then the algorithm is biased in
`the direction of intraframe coding, and if the bit rate is low,
`then interframe coding is preferred. Another advantage of this
`scheme is that now, there is a mode that requires no additional
`pixels; therefore, the maximal compression factor is no longer
`bounded by the block size, as is the case when only spatially
`adaptive subsampling is used.
`
`V. EXPERIMENT RESULTS
`
`A. The Effect of the System Parameters
`
`Fig. 7. Motion-compensated spatially adaptive coding scheme: (PF + SS
`
`is the prefilter and subsampling. MS is the mode selection, EC is the error
`computation, MA is the mode allocation, INT is the interpolation, FM is the
`frame memory, and MC is the motion compensation).
`
`A motion-compensated prediction (MC) of the actual image
`is made using the previous image stored in the frame memory
`(FM). The prediction error is determined by subtracting the
`original image from the motion-compensated predicted image.
`In many coding schemes, the prediction error is coded and
`transmitted over the channel. Generally, this is not a good
`solution. The reason for this is that the computation of the
`prediction error acts as a high-pass filter and introduces extra
`high-frequency components. If an incorrect motion estimate
`is made and the spatial correlation is high (e.g., flat areas
`in the image), then the prediction error is still low. The
`area pointed to by the bad motion estimate can still be
`a reasonable prediction of the actual block. However, if a
`bad motion estimate is made and the spatial correlation is
`high (e.g., an area containing edges), extra high frequency
`components are introduced. This is because besides the edges
`in the original image, the edges of the previous image are
`also present in the prediction error. For a coding scheme, this
`property is not advantageous because of the decrease in spatial
`correlation. Thus, motion-compensated prediction can both
`reduce and increase the spatial correlation of the prediction
`error.
`Therefore, a hybrid solution is chosen. Both the predic
`tion error and the original image are used in the spatially
`adaptive coding scheme. The original image is subsampled
`(PF+SS) and the interpolation error is computed (EC) for the
`different modes. The prediction error is fed together with the
`interpolation errors into the mode allocation (MA). The mode
`allocation now starts by assigning to each block a mode with
`zero pixels. The interpolation for this mode is based on the
`predicted image. If the prediction error is high, then the modes
`used for spatially subsampling are assigned to the blocks in a
`similar manner as pure spatially adaptive subsampling. Hence,
`both the spatial and temporal correlation are exploited.
`
`I -� --- -
`
`In this section, the effect of the different system parameters
`is investigated for spatially adaptive subsampling. This is
`illustrated by a practical example using the Lena test image.
`As a reference in all experiments, unless otherwise stated, a
`system with the mode structure as given in Fig. 4, a block
`size of 8 by 8 and a 3-tap prefilter is used. Mode information
`is accounted for in all experiments by reserving log2 ( N) bits
`per block if N different modes are used. The transmission
`of the motion vectors is not accounted for. The algorithm is
`evaluated using the signal-to-noise ratio (SNR) defined by
`
`SNR = 10
`
`10
`
`( u2 )
`log MSE
`
`(25)
`
`where MSE is the mean square error.
`First, spatially adaptive subsampling is compared with fixed
`lattice subsampling in Fig. 8(a). For the fixed lattice subsam
`pling, horizontal, vertical, and quincunx subsampling was used
`and each time the scheme that gave the best result was selected.
`We see that spatially adaptive subsampling gives a significant
`improvement over fixed subsampling. The difference decreases
`with decreasing rate because eventually, for zero rate, the mean
`square error is equal to the image variance in both coding
`schemes.
`In Fig. 8(b ), the effect of different block sizes is shown.
`Small blocks give a better performance than large blocks
`because of a better adaptation to the local spatial frequency
`contents. For low bit rates, the advantage is smaller because
`there is a relative increase in the amount of side information
`compared to large blocks. In Fig. 8(c), the effect of different
`mode structures are shown. The first scheme is the mode
`structure as shown in Fig. 4. The second scheme starts from
`a block containing all samples and each time quincunx sub
`sampling is used on the previous block. The third scheme
`is a modification of Fig. 4. In between the existing modes,
`two extra modes are introduced. One mode is a horizontal
`subsampling of the previous mode, and the second extra mode
`is a vertical subsampling of the previous mode. Note that this
`is not a hierarchical set. If the number of different modes
`increases, the performance increases as well. This is because a
`better adaptation is possible and a closer match can be found
`between the necessary Nyquist frequency and the available
`sampling frequencies.
`Two different mode allocation schemes were compared,
`namely, the histogram method as described in [3] and the pro
`posed algorithm. The performance of the histogram allocation
`
`6
`
`
`
`498
`
`IEEE TRANSACTIONS ON IMAGE PROCESSING. VOL. 3, NO. 5, SEPTEMBER 1994
`
`SNR (dB)
`
`27
`
`25
`
`SNR (dB)
`
`. +
`....
`
`• +"
`.....
`. +.
`
`23
`21
`19
`17
`16.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5
`1tu o.ts 0.2 o.25 o.3 o.35 o.4 o.45 o.s
`
`... Fixed l...altice
`+Adaptive
`
`R/R.
`
`(a)
`
`R/R.
`
`(b)
`
`SNR(dB)
`
`27
`
`25
`
`23
`
`21
`
`*:ooc::"""'".
`
`1b.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5
`R/R,
`
`(c)
`
`
`
`
`
`
`
`
`Fig. 8. Simulation result of Lena: (a) Comparison with fixed subsampling; (b) different block sizes; (c) different mode schemes.
`
`
`
`did not differ much from the optimal allocation. However,
`
`
`the proposed allocation scheme can benefit from the fact that
`
`
`increasing the number of modes has a positive influence on the
`
`
`
`performance. In addition, the difference between the expected
`
`
`error given by the mode allocation and the actual error after
`
`
`
`interpolation was investigated. This difference was small, and
`
`therefore, it may be concluded that the impact of the errors
`
`
`introduced by different sampling structures at block boundaries
`can be neglected.
`
`B. System Comparison
`
`
`In this section, the following coding schemes are compared
`with each other:
`(SA-SS): This scheme
`
`• Spatially adaptive subsampling
`uses no motion information.
`
`coding 2 (MCINSS): In this scheme,
`• Motion-compensated
`
`there are only two modes: If a good prediction is made,
`
`then noting is transmitted, and the original image is
`The second frame of the sequence is shown in Fig. 9. Th