`
`IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 6, NO. 11, NOVEMBER 1997
`
`II. REVIEW OF LOCAL COSINE BASES
`By local cosine bases, or modulated lapped transforms, we will
`denote a class of critically sampled perfect reconstruction filterbanks
`[9], [18] that uses a single prototype filter—window—w[n] of length
`2N (where N is the number of channels and is even) to construct
`all of the filters h0; ; h N 1 as follows:
`2k + 1
`w[n]
`hk[n] =
`(2n N + 1)
`pN cos
`4N
`with k = 0; ; N 1; n = 0; ; 2N 1; and where the prototype
`lowpass filter w[n] is symmetric, and satisfies the following condition
`[2], which, if we would arrange the first N coefficients of w[n] along
`the diagonal of the matrix WWW ; could be expressed as
`
`(1)
`
`WWW
`
`2 + JJJWWW
`
`2
`
`JJJ = 2III:
`
`(2)
`
`Here, JJJ denotes the antidiagonal matrix. This last condition,
`imposed on the window, ensures that the resulting local cosine basis
`is orthogonal. The two symmetric halves of the window are called
`“tails.”
`A convenient way of analyzing filterbanks in time domain uses
`infinite matrices, which describe the action of the filters on the input
`signal. For local cosine bases, such an infinite matrix TTT is “doubly
`diagonal” with blocks AAA0; AAA1; where blocks AAA0; AAA1 are of sizes
`N N; and contain the impulse responses of the filters. For example,
`the jth row of AAAi is (hj [2N 1 iN ] h j[N iN ]) for i = 0; 1:
`Note that the filter length is twice the number of channels. For an
`orthogonal, perfect reconstruction solution, the matrix TTT has to be
`unitary, which is equivalent to the following [19]:
`
`
`0 AAA1 = 0:1 AAA0 = AAATAAAT0 AAA0 + AAAT1 AAA1 = III; AAAT
`
`
`
`
`The second set of conditions above are called the “orthogonality of
`tails” conditions [19]. One more fact will be of use later. Call BBBi the
`blocks when no windowing is used, or w0[n] = 1; n = 0; ; 2N 1:
`That is, these blocks will just contain the cosines. Then
`
`AAA0 = BBB 0
`
`WWW ; AAA1 = BBB 1
`
`JJJWWW JJJ
`
`(3)
`
`
`
`where WWW is a diagonal matrix with window coefficients on the
`1]: Blocks BBBi satisfy the following:
`diagonal w[0];
`; w [N
`
`BBBT
`
`0 BBB 0 =
`
`(III + JJJ ):
`
`1 2
`
`JJJ ); BBBT1 BBB 1 =
`
`
`
`
`(III
`
`1 2
`
`III. 2-D LOCAL COSINE BASES
`We will now turn our attention to the 2-D case, with both
`rectangular and nonrectangular sampling lattices (see Fig. 1). For the
`current state of applications (such as image compression), rectangular
`sampling is of more interest; thus, we will examine it in detail and
`illustrate it with a design example.
`
`A. Rectangular Sampling
`We assume that we have rectangular sampling, N1 in horizontal
`dimension, and N2 in the vertical one [see Fig. 1(a)]. We will
`construct the filters as follows:
`w[n1; n2]
`pN1N2
`2j + 1
`4N2
`
`hij [n1; n2] =
`
`cos
`
`2i + 1
`4N1
`
`(2n1
`
`
`
`N1 + 1)
`
`cos
`
`
`
`(2n2
`
`
`
`N2 + 1)
`
`(4)
`
`Correspondence
`
`Local Cosine Bases in Two Dimensions
`
`Jelena Kovaˇcevi´c
`
`Abstract—We construct two-dimensional (2-D) local cosine bases in dis-
`crete time. Solutions are offered both for rectangular and nonrectangular
`lattices. In the case of nonrectangular lattices, the problem is solved by
`mapping it into a one-dimensiona (1-D) equivalent problem.
`
`Index Terms—Filterbanks, local bases, wavelets.
`
`I. INTRODUCTION
`Discrete-time cosine modulated filterbanks1 have been in use for
`some time [1]–[14]. Due to a few of their properties, they have
`become quite popular; For example, all filters (basis functions) of
`a filterbank are obtained by appropriate modulation of a single
`prototype filter. Then, fast algorithms exist, making them very
`attractive for implementation. Finally, they have been recently used
`to achieve time-varying tilings of the time-frequency plane [5]. Their
`continuous-time counterpart, termed Malvar’s wavelets, has found use
`in decomposing a signal into a linear combination of time-frequency
`atoms [6].
`Local cosine bases have been used extensively in audio coding
`[7]–[9]. They have also found use in image coding, due to the
`reduction of blocking effects [10] when compared to the discrete
`cosine transform (DCT). Some video works contain local cosine
`bases as well [11]. However, in all image and video applications,
`one-dimensional (1-D) cosine bases are used separately. We develop
`here true two-dimensional (2-D) nonseparable cosine bases, which
`offer more degrees of freedom. We consider both rectangular and
`nonrectangular sampling structures, and offer solutions for both.
`Although in continuous time a similar analysis could be performed,
`here we consider the case of more interest for applications in
`discrete time. Note that the general solution to this problem exists
`(valid for any number of dimensions, arbitrary sampling lattices
`and both for continuous and discrete time) and is given in [12]
`and [13]. However, since this work was performed before [12] and
`[13] and also because it explicitly states the conditions for the 2-D
`case (especially for separable sampling which is of most interest
`in applications), these results should be of use to researchers and
`practitioners.
`While revising the article, the reviewers pointed out some recent
`work on 2-D local cosine bases. For example, Chan in [14] considers
`a similar problem for separable sampling by using separable modula-
`tions and allowing the window to be nonseparable and constraining
`the prototype filter to have various symmetries. Work in [15] offers a
`general solution to the problem. Other works include [16] and [17].
`
`Manuscript received May 27, 1995; revised October 17, 1996. Part of
`this work was presented at ICASSP’95, May 1995. The associate editor
`coordinating the review of this manuscript and approving it for publication
`was Prof. C.-C. Jay Kuo.
`The author is with Bell Laboratories, Innovations for Lucent Technologies,
`Murray Hill, NJ 07974 USA (e-mail: jelena@bell-labs.com).
`Publisher Item Identifier S 1057-7149(97)07122-4.
`1 These are known under various names, such as modulated lapped trans-
`forms, Princen–Bradley filterbanks, time domain aliasing cancelation filter-
`banks (TDAC), modified DCT (MDCT) with 50% overlap.
`
`1057–7149/97$10.00 ª
`
`1997 IEEE
`
`Micro Motion 1048
`
`1
`
`
`
`IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 6, NO. 11, NOVEMBER 1997
`
`1581
`
`where we use the same scanning order as before. Note that since
`we have imposed on the window to be centrally symmetric (persym-
`metric), DDD2 and DDD3 have factors, JJJWWW 0JJJ and JJJWWW 1JJJ, respectively.
`Blocks CCC i are given by
`
`CCC 0 = BBB 0
`
`
`
`
`BBB 0 ; CCC 1 = BBB 0
`
`BBB 1
`
`
`
`
`(a)
`
`(b)
`
`Fig. 1. Sampling lattices. The support of the filters is given by the shaded
`area. (a) Rectangular case. (b) Nonrectangular case.
`
`1; and n1 = 0;
`with i = 0;
`; 2N1
`; N 2
`1; j = 0;
`; N 1
`
`
`
`
`
`
`1: This produces filters of sizes 2N1
`2N2:
`;2 N2
`1; n2 = 0;
`
`
`
`Note that the window will be assumed to be persymmetric, that is
`WWW = JJJWWW JJJ:
`Let us now try to find conditions for the filters to form a basis,
`that is, for them to be perfect reconstruction. The counterpart of one
`block-row of the matrix TTT is
`
`CCC 2 = BBB 1
`
`BBB 0 ; CCC 3 = BBB 1
`
`BBB 1
`
`
`
`
`
`is the ith block in dimension j as in (3). To obtain
`where block BBBi
`the conditions for perfect reconstruction, we have to make sure that
`all overlapping tails give zero, and that TTT 1 is unitary, that is
`
`
`
`
`DDDT0 DDD0 + DDDT1 DDD1 + DDDT2 DDD2 + DDDT3 DDD3 = III;
`
`
`DDDT1 DDD0 + DDDT3 DDD2 = 0;
`
`
`DDDT3 DDD1 + DDDT2 DDD0 = 0;
`
`
`DDDT3 DDD0 = 0;
`
`
`
`DDDT2 DDD1 = 0:
`
`(5)
`(6)
`(7)
`(8)
`(9)
`
`Conditions (6)–(9) can be easily verified while (5) will lead to the
`conditions on the window, the first one being the 2-D counterpart
`of (2)
`
`JJJ = III;
`
`2 1
`
`+ JJJWWW
`
`2 1
`
`JJJ + WWW
`
`2 0
`
`+ JJJWWW
`
`2 0
`
`WWW
`
`TTT 1 = (DDD0 DDD1 DDD2 DDD3)
`
`WWW 0(III
`
`JJJ )WWW 1
`
`
`
`
`
`JJJWWW 1JJJ (III
`
`
`
`
`+WWW 1(III
`
`
`
`
`JJJ )WWW 0 + JJJWWW 0JJJ(III
`
`JJJ )JJJWWW 0JJJ
`
`
`
`JJJ)JJJWWW 1JJJ = 0;
`
`WWW 0JJJWWW 0
`
`WWW 1JJJWWW 1 = 0;
`
`N1N2: Since TTT 1 represents
`where each block DDDi is of size N1N2
`
`convolution, let us look at the shift-reversed version of one filter,
`h00. For example, see the matrix at the bottom of the page, where
`dimension n1 goes from left to right and dimension n2 goes from
`bottom to top. We can arbitrarily choose a scanning order, which we
`define to be from left to right, and bottom to top. This means that the
`action of the first filter will go into TTT 1 as the following row-vector:
`
`h00[2N1
`
`1; 2N2
`
`
`h00[2N1
`
`
`1; 2N2
`
`1]
`
`
`2]
`
`
`
`h00[0; 2N2
`
`
`h00[2N1
`
`
`
`1]
`
`
`
`1; 0]
`
`
`h00[0; 0]:
`
`
`Then the first row of DDD0; for example, will be
`
`h00[2N1
`
`1; 2N2
`
`1]
`
`h00[0; 2N2
`
`1]
`
`
`
`
`
`Expressing hij [n1; n2] as in (4), we get that DDDi are given by
`
`DDD0 = CCC 0
`
`
`
`WWW 0; DDD2 = CCC 2
`
`JJJWWW 0JJJ;
`
`
`
`DDD1 = CCC 1
`
`WWW 1; DDD3 = CCC 3
`
`JJJWWW 1JJJ:
`
`
`III)WWW 0 + JJJWWW 0JJJ(JJJ
`
`WWW 0(JJJ
`
`
`
`III)WWW 1 + JJJWWW 1JJJ (JJJ
`
`
`
`III)JJJWWW 1JJJ = 0:
`
`III)JJJWWW 0JJJ
`
`WWW 1(JJJ
`
`
`
`
`
`To summarize, in the 2-D case with rectangular sampling, we will
`have up to (N1N2)=2 free variables. Compare that to (N1 + N2)=2
`free variables in the 2-D case with separable sampling.
`
`B. Nonrectangular Sampling
`The nonrectangular case is a more difficult one. We will describe a
`solution that will use a particular mapping from one dimension into
`two dimensions. Note that this solution would mean that the filters
`are obtained by shifting the prototype filter along a line, and that it
`is very similar to what was done in [20]. It will hold for an even
`sampling density N: First, we find the upper-triangular form of the
`sampling matrix
`
`DDD =
`
`a b
`0 c
`
`with N = det(DDD) and we assume that b and c do not have common
`factors [see Fig. 1(b)]. For the support of our filters, we will take two
`unit cells, the ones located at points [0, 0] and [a; 0]: Then, define
`the filters as follows:
`w[n1; n2]
`pN
`
`hk[n1; n2] =
`
`cos
`
`
`
`2k + 1
`4N
`
`(2(cn1
`
`bn2)
`
`
`
`
`
`N + 1)
`
`(10)
`
`
`
`Here, CCC i contain the modulating cosines, while diagonal matrices
`WWW i contain appropriately placed coefficients of the 2-D, window
`function w[n1; n2] as
`
`w[0; 0]
`
`w[N1; 0]
`
`.. .
`
`.. .
`
`WWW 0 =
`
`WWW 1 =
`
`;
`
`w[N1
`
`1; N2
`
`1]
`
`
`
`
`
`w[2N1
`
`1; N2
`
`1]
`
`
`
`
`
`h00[2N1
`
`...
`
`h00[2N1
`h00[2N1
`
`...
`
`h00[2N1
`
`
`
`
`
`
`
`
`1; 0]
`
`1; N2
`
`1; N2]
`
`1]
`
`1; 2N2
`
`1]
`
`
`
`
`...
`
`
`...
`
`
`h00[N1; 0]
`
`...
`
`1]
`
`h00[N1; N2
`
`h00[N1; N2]
`
`...
`
`h00[N1
`
`...
`
`h00[N1
`h00[N1
`
`...
`
`h00[N1; 2N2
`
`1] h00[N1
`
`
`
`
`
`
`
`
`
`
`1; 0]
`
`1; N2
`
`1; N2]
`
`1]
`
`1; 2N2
`
`1]
`
`
`
`
`...
`
`
`...
`
`
`h00[0; 0]
`
`...
`
`1]
`
`h00[0; N2
`
`h00[0; N2]
`
`...
`
`h00[0; 2N2
`
`1]
`
`
`
`2
`
`
`
`1582
`
`IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 6, NO. 11, NOVEMBER 1997
`
`Fig. 2. Examples of the separable window (left column) versus the nonseparable window (right column) for the rectangular sampling by eight in both
`horizontal and vertical dimensions. The first row gives contour plots of the two windows, while the second and the third rows give contour and log
`plots of the magnitude of the frequency response.
`
`with k = 0; ; N 1; and n1; n2 belonging to the unit cell as
`explained above. Now define the scanning order to be along the
`second axis first, that is
`
`or equivalently
`
`2
`
`WWW
`
`+ JJJWWW
`
`2
`
`JJJ = 2III
`
`[0; 0]
`
`b + 1
`c
`
`; 1
`
`
`
`c 1
`c
`
`(b + 1); c 1
`
`[1; 0]
`
` :
`
`(11)
`
`By doing this, we have mapped the problem into a 1-D problem,
`that is
`
`hk[0; 0] = hk[0]
`
`hk
`
`b + 1
`c
`
`; 1 = hk[1]
`
`...
`
`hk 2a 1 +
`
`(c 1)(b + 1)
`c
`
`; c 1 = hk[2N 1]:
`
`Thus, if we take blocks AAAi to be the formulation shown at the
`top of the next page, then all the relations without windowing hold
`and all the proofs are equivalent. We only need to take care of the
`window. It has to be persymmetric and
`
`2
`
`w
`
`[n1; n2] + w
`
`2 (c 1)(b + 1)
`c
`
`+ a 1 n; c 1 n2 = 2
`
`since we have used the same scanning order as in (11) to put the win-
`dow coefficients into WWW : Note that this condition is exactly the same
`as in the 1-D case given in (2). In the quincunx case, for example, this
`scheme would lead to 1-D filters. However, if we replace cn1 bn2
`with n1 + n2 where [n1; n2] 2 f[0; 0]; [1; 0]; [1; 1]; [2; 1]g; the whole
`problem is again mapped into a 1-D problem and thus easily solved.
`Note though, that the quincunx case is not of much interest, since it
`has only two channels and filters have only four taps.
`
`C. Design Example
`To illustrate constructions we presented in the previous section,
`we will choose rectangular subsampling and compare the separable
`to the nonseparable window. The sampling lattice is rectangular with
`sampling by eight in each dimension, thus the filters are of size 16
`16 (as well as the window). The design procedure for the window can
`be found in [13]; the window was obtained by minimizing the energy
`in the region outside of [ =2; =2][ =2; =2]: The result of the
`design procedure is given in Fig. 2. The left side shows rectangular
`sampling by eight in both horizontal and vertical directions with a
`separable window, while the right side shows the same sampling with
`
`3
`
`
`
`IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 6, NO. 11, NOVEMBER 1997
`
`1583
`
`h0 2a 1 +
`
`(b + 1); c 1
`
`h0 2a 1 +
`
`c 1
`c
`
`...
`
`c 2
`c
`
`...
`
`(b + 1); c 2
`
` h0[a; 0]
`
`...
`
`...
`
`(b + 1); c 2 hN 1[a; 0]
`
`hN 1 2a 1 +
`
`(b + 1); c 1 hN 1 2a 1 +
`
`h0 a 1 +
`
`(b + 1); c 1
`
`h0 a 1 +
`
`(b + 1); c 2
`
`c 1
`c
`c 1
`c
`
`AAA0 =
`
`AAA1 =
`
`c 2
`c
`c 2
`c
`
`
`
`h0[0; 0]
`
`...
`
`...
`
`...
`
`c 2
`c
`
`(b + 1); c 1
`
`hN 1 a 1 +
`
`(b + 1); c 2
`
` hN 1[0; 0]
`
`hN 1 a 1 +
`
`...
`
`c 1
`c
`
`fast tiling algorithms,” IEEE Trans. Signal Processing, Special Issue on
`Wavelets and Signal Processing, vol. 41, pp. 3341–3359, Dec. 1993.
`[6] Y. Meyer, Wavelets: Algorithms and Applications, trans. R. D. Ryan.
`Philadelphia, PA: SIAM, 1993.
`[7] C. C. Todd et al., “AC-3: Flexible preceptual coding for audio transmis-
`sion and storage,” in Proc. Conv. AES, Amsterdam, The Netherlands,
`Feb. 1994.
`[8] J. D. Johnston and A. J. Ferreira, “Sum-difference stereo transform cod-
`ing,” in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing,
`San Francisco, CA, Mar. 1992, pp. II:569–572.
`[9] M. Vetterli and J. Kovaˇcevi´c, Wavelets and Subband Coding, Signal
`Processing. Englewood Cliffs, NJ: Prentice-Hall, 1995.
`[10] J. Kovaˇcevi´c, D. J. LeGall, and M. Vetterli, “Image coding with
`windowed modulated filter banks,” in Proc. IEEE Int. Conf. Acous-
`tics, Speech, and Signal Processing, Glasgow, U.K., May 1989, pp.
`1949–1952.
`[11] A. W. Johnson, J. Princen, and M. H. Chan, “Frequency scalable video
`coding using the MDCT,” in Proc. IEEE Int. Conf. Acoustics, Speech,
`and Signal Processing, Adelaide, Australia, 1994.
`[12] R. Bernardini and J. Kovaˇcevi´c, “Local orthogonal bases I: Construc-
`tion,” Multidimen. Syst. Signal Process., Special Issue Wavelets Multires.
`Signal Process., vol. 7, no. 4, July 1996, invited paper.
`[13] R. Bernardini and J. Kovaˇcevi´c, “Local orthogonal bases II: Window
`design,” Multidimen. Syst. Signal Process., Special Issue Wavelets Mul-
`tires. Signal Process., vol. 7, no. 4, July 1996.
`[14] S. C. Chan, “Two dimensional nonseparable modulated filter banks,”
`in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing,
`Adelaide, Australia, 1994.
`[15] Y.-P. Lin and P. P. Vaidyanathan, “Two-dimensional paraunitary cosine
`modulated perfect reconstruction filter banks,” in Proc. IEEE Int. Symp.
`Circuits and Systems, Seattle, WA, 1995.
`[16] M. Ikehara, “Cosine-modulated 2-dimensional filter banks satisfying
`perfect reconstruction,” in Proc. IEEE Int. Conf. Acoustics, Speech, and
`Signal Processing, Adelaide, Australia, 1994.
`[17] H. Yan and M. Ikehara, “Modulated 2-dimensional perfect reconstruc-
`tion FIR filter banks with permissible subbands,” in Proc. IEEE Int.
`Symp. Circuits and Systems, Seattle, WA, 1995.
`[18] P. P. Vaidyanathan, Multirate Systems and Fitler Banks. Englewood
`Cliffs, NJ: Prentice-Hall, 1992.
`[19] M. Vetterli and D. J. LeGall, “Perfect reconstruction FIR filter banks:
`Some properties and factorizations,” IEEE Trans. Acoust., Speech, Signal
`Processing, vol. 37, pp. 1057–1071, July 1989.
`[20] A. Tabatabai, “Some results on two-dimensional pseudoquadrature mir-
`ror filters,” IEEE Trans. Circuits Syst., vol. CAS-34, pp. 988–992, Aug.
`1987.
`
`the nonseparable window. The first row gives contour plots of the
`two windows, while the second and the third rows give contour and
`log plots of the magnitude of the frequency response. The separable
`nature of the window in the left side is clearly seen in contour and
`log plots. Note that the aim in this example was not necessarily to
`show that one or the other filter has a superior frequency isolation.
`Rather, this example demonstrates that, under the same constraints,
`using a nonseparable window gives more freedom in design. It is clear
`that the separable window, being a special case of the nonseparable
`one, can at best equal the performance of the nonseparable window.
`Moreover, note how the nonseparable window demonstrates less
`“preference” for horizontal and vertical directions, that it is more
`“isotropic” as expected from a nonseparable window.
`
`IV. CONCLUSIONS
`Two-dimensional local cosine bases were presented in discrete
`time. We examined both rectangular and nonrectangular lattices.
`Although solutions for the rectangular lattices are more important in
`practice, those for the nonrectangular ones are a difficult challenge. In
`that case, the problem is solved by mapping it into an equivalent 1-D
`problem. As a result, solutions easily follow, however, the resulting
`filters will be obtained by modulation along a single line. More
`general modulation structures and a more general framework for local
`bases can solve this problem. For details, see [12] and [13].
`
`ACKNOWLEDGMENT
`
`The author thanks R. Bernardini of the University of Padua, Italy,
`for providing the design example and collaboration on the topic of
`local orthogonal bases, and T. Chen of AT&T Research for his useful
`comments. The author also acknowledges the anonymous reviewers
`who pointed out several references on 2-D local cosine bases.
`
`REFERENCES
`
`[1] H. S. Malvar, Signal Processing with Lapped Transforms. Norwood,
`MA: Artech House, 1992.
`[2] J. Princen, A. Johnson, and A. Bradley, “Subband transform coding
`using filter bank designs based on time domain aliasing cancellation,” in
`Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Dallas,
`TX, Apr. 1987, pp. 2161–2164.
`[3] T. A. Ramstad and J. P. Tanem, “Cosine-modulated analysis-synthesis
`filterbank with critical sampling and perfect reconstruction,” in Proc.
`IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Toronto, Ont.,
`Canada, May 1991, pp. 1789–1792.
`[4] R. D. Koilpillai and P. P. Vaidyanathan, “New results on cosine-
`modulated FIR filter banks satisfying perfect reconstruction,” in Proc.
`IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Toronto,
`Canada, May 1991, pp. 1793–1796.
`[5] C. Herley, J. Kovaˇcevi´c, K. Ramchandran, and M. Vetterli, “Tilings of
`the time-frequency plane: Construction of arbitrary orthogonal bases and
`
`4
`
`