`Image Reconstruction:
`A Technical Overview
`
`In most electronic imaging applica-
`
`tions, images with high resolution
`(HR) are desired and often re-
`quired. HR means that pixel den-
`sity within an image is high, and
`therefore an HR image can offer more
`details that may be critical in various ap-
`plications. For example, HR medical images are very
`helpful for a doctor to make a correct diagnosis. It may be
`easy to distinguish an object from similar ones using HR
`satellite images, and the performance of pattern recogni-
`tion in computer vision can be improved if an HR image
`is provided. Since the 1970s, charge-coupled device
`(CCD) and CMOS image sensors have been widely used
`to capture digital images. Although these sensors are suit-
`able for most imaging applications, the current resolution
`level and consumer price will not satisfy the future de-
`mand. For example, people want an inexpensive HR digi-
`tal camera/camcorder or see the price gradually reduce,
`and scientists often need a very HR level close to that of
`an analog 35 mm film that has no visible artifacts when an
`image is magnified. Thus, finding a way to increase the
`current resolution level is needed.
`The most direct solution to increase spatial resolution
`is to reduce the pixel size (i.e., increase the number of pix-
`els per unit area) by sensor manufacturing techniques. As
`the pixel size decreases, however, the amount of light
`available also decreases. It generates shot noise that de-
`
`Sung Cheol Park, Min Kyu Park,
`and Moon Gi Kang
`
`©DIGITAL VISION, LTD.
`
`grades the image quality severely. To reduce the pixel size
`without suffering the effects of shot noise, therefore,
`there exists the limitation of the pixel size reduction, and
`the optimally limited pixel size is estimated at about 40
`µm2 for a 0.35 µm CMOS process. The current image
`sensor technology has almost reached this level.
`Another approach for enhancing the spatial resolution
`is to increase the chip size, which leads to an increase in ca-
`pacitance [1]. Since large capacitance makes it difficult to
`speed up a charge transfer rate, this approach is not con-
`sidered effective. The high cost for high precision optics
`and image sensors is also an important concern in many
`commercial applications regarding HR imaging. There-
`fore, a new approach toward increasing spatial resolution
`is required to overcome these limitations of the sensors
`and optics manufacturing technology.
`One promising approach is to use signal processing
`techniques to obtain an HR image (or sequence) from
`observed multiple low-resolution (LR) images. Recently,
`such a resolution enhancement approach has been one of
`the most active research areas, and it is called super resolu-
`tion (SR) (or HR) image reconstruction or simply reso-
`lution enhancement in the literature [1]-[61]. In this
`article, we use the term “SR image reconstruction” to re-
`fer to a signal processing approach toward resolution en-
`hancement because the term “super” in “super
`
`MAY 2003
`
`IEEE SIGNAL PROCESSING MAGAZINE
`1053-5888/03/$17.00©2003IEEE
`
`21
`
`1
`
`SAMSUNG-1012
`
`
`
`resolution” represents very well the characteristics of the
`technique overcoming the inherent resolution limitation
`of LR imaging systems. The major advantage of the sig-
`nal processing approach is that it may cost less and the ex-
`isting LR imaging systems can be still utilized. The SR
`image reconstruction is proved to be useful in many prac-
`tical cases where multiple frames of the same scene can be
`obtained, including medical imaging, satellite imaging,
`and video applications. One application is to reconstruct
`a higher quality digital image from LR images obtained
`with an inexpensive LR camera/camcorder for printing
`or frame freeze purposes. Typically, with a camcorder, it is
`also possible to display enlarged frames successively. Syn-
`thetic zooming of region of interest (ROI) is another im-
`portant application in surveillance, forensic, scientific,
`medical, and satellite imaging. For surveillance or foren-
`sic purposes, a digital video recorder (DVR) is currently
`replacing the CCTV system, and it is often needed to
`magnify objects in the scene such as the face of a criminal
`or the licence plate of a car. The SR technique is also use-
`ful in medical imaging such as computed tomography
`(CT) and magnetic resonance imaging (MRI) since the
`acquisition of multiple images is possible while the reso-
`lution quality is limited. In satellite imaging applications
`such as remote sensing and LANDSAT, several images of
`the same area are usually provided, and the SR technique
`to improve the resolution of target can be considered. An-
`other application is conversion from an NTSC video sig-
`nal to an HDTV signal since there is a clear and present
`need to display a SDTV signal on the HDTV without vi-
`sual artifacts.
`How can we obtain an HR image from multiple LR
`images? The basic premise for increasing the spatial reso-
`lution in SR techniques is the availability of multiple LR
`images captured from the same scene (see [4, chap. 4] for
`details). In SR, typically, the LR images represent differ-
`
`Subpixel
`Shift
`
`Scene
`
`Scene
`
`...
`
`Scene
`
`Camera
`
`Scene
`
`Camera
`
`Camera Camera
`
`Scene
`
`Video Sequence
`
`s 1. Basic premise for super resolution.
`
`ent “looks” at the same scene. That is, LR images are
`subsampled (aliased) as well as shifted with subpixel pre-
`cision. If the LR images are shifted by integer units, then
`each image contains the same information, and thus there
`is no new information that can be used to reconstruct an
`HR image. If the LR images have different subpixel shifts
`from each other and if aliasing is present, however, then
`each image cannot be obtained from the others. In this
`case, the new information contained in each LR image
`can be exploited to obtain an HR image. To obtain differ-
`ent looks at the same scene, some relative scene motions
`must exist from frame to frame via multiple scenes or
`video sequences. Multiple scenes can be obtained from
`one camera with several captures or from multiple cam-
`eras located in different positions. These scene motions
`can occur due to the controlled motions in imaging sys-
`tems, e.g., images acquired from orbiting satellites. The
`same is true of uncontrolled motions, e.g., movement of
`local objects or vibrating imaging systems. If these scene
`motions are known or can be estimated within subpixel
`accuracy and if we combine these LR images, SR image
`reconstuction is possible as illustrated in Figure 1.
`In the process of recording a digital image, there is a
`natural loss of spatial resolution caused by the optical dis-
`tortions (out of focus, diffraction limit, etc.), motion blur
`due to limited shutter speed, noise that occurs within the
`sensor or during transmission, and insufficient sensor
`density as shown in Figure 2. Thus, the recorded image
`usually suffers from blur, noise, and aliasing effects. Al-
`though the main concern of an SR algorithm is to recon-
`struct HR images from undersampled LR images, it
`covers image restoration techniques that produce high
`quality images from noisy, blurred images. Therefore, the
`goal of SR techniques is to restore an HR image from sev-
`eral degraded and aliased LR images.
`A related problem to SR techniques is image restora-
`tion, which is a well-established area
`in image processing applications
`[62]-[63]. The goal of image restora-
`tion is to recover a degraded (e.g.,
`blurred, noisy) image, but it does not
`change the size of image. In fact, res-
`toration and SR reconstruction are
`closely related theoretically, and SR
`reconstruction can be considered as a
`second-generation problem of image
`restoration.
`Another problem related to SR re-
`construction is image interpolation
`that has been used to increase the size
`of a single image. Although this field
`has been extensively studied
`[64]-[66], the quality of an image
`magnified from an aliased LR image
`is inherently limited even though the
`ideal sinc basis function is employed.
`That is, single image interpolation
`
`Integer Pixel
`Shift
`
`: Reference LR Image
`If There Exist Subpixel Shifts
`Between LR Images,
`SR Reconstruction Is Possible
`
`22
`
`IEEE SIGNAL PROCESSING MAGAZINE
`
`MAY 2003
`
`2
`
`
`
`Common Imaging System
`
`CCD Sensor Preprocessor
`
`OTA
`
`Environment
`
`Blurred, Noisy,
`Aliased LR Image
`
`Optical
`Distortion
`
`Aliasing Motion Blur
`
`Noise
`
`s 2. Common image acquisition system.
`
`Original Scene
`
`cannot recover the high-frequency
`components lost or degraded during
`the LR sampling process. For this
`reason, image interpolation methods
`are not considered as SR techniques.
`To achieve further improvements in
`this field, the next step requires the
`utilization of multiple data sets in
`which additional data constraints
`from several observations of the same
`scene can be used. The fusion of in-
`formation from various observations
`of the same scene allows us SR recon-
`struction of the scene.
`The goal of this article is to intro-
`duce the concept of SR algorithms
`to readers who are unfamiliar with
`this area and to provide a review for
`experts. To this purpose, we present
`the technical review of various exist-
`ing SR methodologies which are of-
`ten employed. Before presenting the review of existing
`SR algorithms, we first model the LR image acquisi-
`tion process.
`
`Observation Model
`The first step to comprehensively analyze the SR image
`reconstruction problem is to formulate an observation
`model that relates the original HR image to the observed
`LR images. Several observation models have been pro-
`posed in the literature, and they can be broadly divided
`into the models for still images and for video sequence. To
`present a basic concept of SR reconstruction techniques,
`we employ the observation model for still images in this
`article, since it is rather straightforward to extend the still
`image model to the video sequence model.
`×
`Consider the desired HR image of size L N L N
`1
`1
`2
`2
`written in lexicographical notation as the vector
`x = [
`x
`x
`xN
`
`, where N L N L N= ×
`
`2 . Namely,
`,
`,....,
`]
`T
`1
`2
`1
`1
`2
`x is the ideal undegraded image that is sampled at or
`above the Nyquist rate from a continuous scene which is
`assumed to be bandlimited. Now, let the parameters L1
`and L2 represent the down-sampling factors in the obser-
`vation model for the horizontal and vertical directions, re-
`
`2
`
`k
`
`
`
`,1
`
`k
`
`,
`
`2
`
`
`
`k M,
`
`spectively. Thus, each observed LR image is of size
`×
`. Let the kth LR image be denoted in lexico-
`N N1
`= [
`y
`y
`y
`graphic notation as y k
`,
`for
`,
`,....,
`]
`T
`k
`= 1 2,
`
`
`p,..., and M N N= ×1
`
`2 . Now, it is assumed that x
`remains constant during the acquisition of the multiple
`LR images, except for any motion and degradation al-
`lowed by the model. Therefore, the observed LR images
`result from warping, blurring, and subsampling opera-
`tors performed on the HR image x. Assuming that each
`LR image is corrupted by additive noise, we can then rep-
`resent the observation model as [30], [48]
`
`=
`+
`y DB M x n
`k
`k
`k
`
`k
`
`
`
`≤ ≤for 1 k
`
`p
`
`(1)
`
`2
`
`2
`
`2
`
`where M k is a warp matrix of size L1N1L2N2 ⫻ L1N1L2N2,
`×
`L N L N
`B k represents a L N L N
`blur matrix,
`1
`1
`2
`1
`1
`2
`2
`2
`×
`N N
`L N L N
`subsampling matrix, and
`D is a (
`)
`1
`1
`1
`2
`n k represents a lexicographically ordered noise vector. A
`block diagram for the observation model is illustrated in
`Figure 3.
`Let us consider the system matrix involved in (1). The
`motion that occurs during the image acquisition is repre-
`sented by warp matrix M k . It may contain global or local
`translation, rotation, and so on. Since this information is
`
`Desired HR Image x
`
`kth Warped HR Image xk
`
`kth Observed
`LR Image yk
`
`Warping
`
`Blur
`
`Down Sampling
`
`Undersampling
`(
`,
`)
`L L1 2
`
`
`Optical Blur
`Motion Blur
`Sensor PSF, Etc.
`
`−−−
`
`Translation
`Rotation, Etc.
`
`−−
`
`Continuous
`Scene
`
`Sampling
`
`Continuous to
`Discrete Without
`Aliasing
`
`Noise (
`
`)nk
`
`s 3. Observation model relating LR images to HR images.
`
`MAY 2003
`
`IEEE SIGNAL PROCESSING MAGAZINE
`
`23
`
`3
`
`
`
`The SR image reconstruction is
`proved to be useful in many
`practical cases where multiple
`frames of the same scene can
`be obtained, including medical
`imaging, satellite imaging, and
`video applications.
`
`generally unknown, we need to estimate the scene mo-
`tion for each frame with reference to one particular frame.
`The warping process performed on HR image x is actu-
`ally defined in terms of LR pixel spacing when we esti-
`mate it. Thus, this step requires interpolation when the
`fractional unit of motion is not equal to the HR sensor
`grid. An example for global translation is shown in Figure
`4. Here, a circle (䊊) represents the original (reference)
`HR image x, and a triangle (䉭) and a diamond (䉫) are
`globally shifted versions of x. If the down-sampling factor
`is two, a diamond (䉫) has (0.5, 0.5) subpixel shift for the
`horizontal and vertical directions and a triangle (䉭) has a
`shift which is less than (0.5,0.5). As shown in Figure 4, a
`diamond (䉫) does not need interpolation, but a triangle
`(䉭) should be interpolated from x since it is not located
`on the HR grid. Although one could use ideal interpola-
`tion theoretically, in practice, simple methods such as
`
`: Original HR Grid
`
`: Original HR Pixels
`
`,
`
`: Shifted HR Pixels
`
`s 4. The necessity of interpolation in HR sensor grid.
`
`HR Grid
`
`LR Grid
`
`HR Pixel
`
`a0
`a1
`a2 a3
`
`HR Image
`
`s 5. Low-resolution sensor PSF.
`
`LR Pixel =(
`
`Σ ai
`4
`
`)
`
`zero-order hold or bilinear interpolation methods have
`been used in many literatures.
`Blurring may be caused by an optical system (e.g., out
`of focus, diffraction limit, aberration, etc.), relative motion
`between the imaging system and the original scene, and
`the point spread function (PSF) of the LR sensor. It can be
`modeled as linear space invariant (LSI) or linear space vari-
`ant (LSV), and its effects on HR images are represented by
`the matrix B k . In single image restoration applications,
`the optical or motion blur is usually considered. In the SR
`image reconstruction, however, the finiteness of a physical
`dimension in LR sensors is an important factor of blur.
`This LR sensor PSF is usually modeled as a spatial averag-
`ing operator as shown in Figure 5. In the use of SR recon-
`struction methods, the characteristics of the blur are
`assumed to be known. However, if it is difficult to obtain
`this information, blur identification should be incorpo-
`rated into the reconstruction procedure.
`The subsampling matrix D generates aliased LR im-
`ages from the warped and blurred HR image. Although
`the size of LR images is the same here, in more general
`cases, we can address the different size of LR images by
`using a different subsampling matrix (e.g., Dk ). Al-
`though the blurring acts more or less as an anti-aliasing
`filter, in SR image reconstruction, it is assumed that
`aliasing is always present in LR images.
`A slightly different LR image acquisition model can be
`derived by discretizing a continuous warped, blurred
`scene [24]-[28]. In this case, the observation model must
`include the fractional pixels at the border of the blur sup-
`port. Although there are some different considerations
`between this model and the one in (1), these models can
`be unified in a simple matirx-vector form since the LR
`pixels are defined as a weighted sum of the related HR
`pixels with additive noise [18]. Therefore, we can express
`these models without loss of generality as follows:
`
`=
`+
`y W x n
`k
`k
`
`k
`
`,
`
`for
`
`k
`
`=
`
`
`
`,...,1
`
`p
`,
`
`(2)
`
`×
`L N L N
`N N
`repre-
`where matrix Wk of size (
`)
`2
`1
`1
`2
`1
`2
`2
`sents, via blurring, motion, and subsampling, the contri-
`bution of HR pixels in x to the LR pixels in y k . Based on
`the observation model in (2), the aim of the SR image re-
`construction is to estimate the HR image x from the LR
`= 1,...,
`p
`images y k for k
`.
`Most of the SR image reconstruc-
`tion methods proposed in the litera-
`ture consist of the three stages
`illustrated in Figure 6: registration,
`interpolation, and restoration (i.e.,
`inverse procedure). These steps can
`be implemented separately or simul-
`taneously according to the recon-
`struction methods adopted. The
`estimation of motion information is
`referred to as registration, and it is ex-
`tensively studied in various fields of
`image processing [67]-[70]. In the
`
`LR Image
`
`24
`
`IEEE SIGNAL PROCESSING MAGAZINE
`
`MAY 2003
`
`4
`
`
`
`registration stage, the relative shifts between LR images
`compared to the reference LR image are estimated with
`fractional pixel accuracy. Obviously, accurate subpixel
`motion estimation is a very important factor in the suc-
`cess of the SR image reconstruction algorithm. Since the
`shifts between LR images are arbitrary, the registered HR
`image will not always match up to a uniformly spaced
`HR grid. Thus, nonuniform interpolation is necessary to
`obtain a uniformly spaced HR image from a
`nonuniformly spaced composite of LR images. Finally,
`image restoration is applied to the upsampled image to
`remove blurring and noise.
`The differences among the several proposed works are
`subject to what type of reconstruction method is em-
`ployed, which observation model is assumed, in which
`particular domain (spatial or frequency) the algorithm is
`applied, what kind of methods is used to capture LR im-
`ages, and so on. The technical report by Borman and
`Stevenson [2] provides a comprehensive and complete
`overview on the SR image reconstruction algorithms un-
`til around 1998, and a brief overview of the SR tech-
`niques appears in [3] and [4].
`Based on the observation model in (2), existing SR
`algorithms are reviewed in the following sections. We
`first present a nonuniform interpolation approach that
`conveys an intuitive comprehension of the SR image re-
`construction. Then, we explain a frequency domain ap-
`proach that is helpful to see how to exploit the aliasing
`relationship between LR images. Next, we present de-
`terministic and stochastic regularization approaches, the
`projection onto convex sets (POCS) approach, as well as
`other approaches. Finally, we discuss advanced issues to
`improve the performance of the SR algorithm.
`
`deconvolution method that considers the presence of
`noise.
`The reconstruction results of this approach appear in
`Figure 8. In this simulation, four LR images are gener-
`ated by a decimation factor of two in both the horizontal
`
`and vertical directions from the 256 256×
`HR image.
`Only sensor blur is considered here, and a 20-dB Gaussi-
`an noise is added to these LR images. In Figure 8, part (a)
`shows the image interpolated by the nearest neighbor-
`hood method from one LR observation, and part (b) is
`the image produced by bilinear interpolation; a
`nonuniformly interpolated image from four LR images
`appears in part (c), and a deblurred image using the
`Wiener restoration filter from part (c) is shown in part
`(d). As shown in Figure 8, significant improvement is ob-
`served in parts (c) and (d) when viewed in comparison
`with parts (a) and (b).
`Ur and Gross [5] performed a nonuniform interpola-
`tion of an ensemble of spatially shifted LR images by uti-
`lizing the generalized multichannel sampling theorem of
`Papoulis [73] and Brown [74]. The interpolation is fol-
`lowed by a deblurring process, and the relative shifts are
`assumed to be known precisely here. Komatsu et al. [1]
`presented a scheme to acquire an improved resolution im-
`age by applying the Landweber algorithm [75] from
`multiple images taken simultaneously with multiple cam-
`eras. They employ the block-matching technique to mea-
`sure relative shifts. If the cameras have the same aperture,
`however, it imposes severe limitations both in their ar-
`rangement and in the configuration of the scene. This dif-
`ficulty was overcome by using multiple cameras with
`different apertures [6]. Hardie et al. developed a tech-
`nique for real-time infrared image registration and SR re-
`construction [7]. They utilized a gradient-based
`
`Restoration
`for Blur
`and
`Noise
`Removal
`
`x
`
`Interpolation
`onto an
`HR Grid
`
`•••
`
`Registration
`
`or
`
`Motion
`Estimation
`
`y1
`y2
`
`•••
`
`yp−1
`yp
`
`s 6. Scheme for super resolution.
`
`SR Image Reconstruction Algorithms
`Nonuniform Interpolation Approach
`This approach is the most intuitive method for SR im-
`age reconstruction. The three stages presented in Figure
`6 are performed successively in this approach: i) estima-
`tion of relative motion, i.e., registration (if the motion
`information is not known), ii) nonuniform interpola-
`tion to produce an improved resolution image, and iii)
`deblurring process (depending on the observation
`model). The pictorial example is shown in Figure 7.
`With the relative motion information
`estimated,
`the HR image on
`nonuniformly spaced sampling
`points is obtained. Then, the direct or
`iterative reconstruction procedure is
`followed to produce uniformly
`spaced sampling points [71]-[74].
`Once an HR image is obtained by
`nonuniform interpolation, we ad-
`dress the restoration problem to re-
`move blurring and noise. Restoration
`can be performed by applying any
`
`LR
`Images
`
`Direct
`Reconstruction
`
`Iterative
`Reconstruction
`
`Deblurring
`
`HR
`Images
`
`Mapping to HR Grid
`
`Registration
`
`MAY 2003
`
`IEEE SIGNAL PROCESSING MAGAZINE
`
`25
`
`s 7. Registration-interpolation-based reconstruction.
`
`5
`
`
`
`registration algorithm for estimating the shifts between
`the acquired frames and presented a weighted nearest
`neighbor interpolation approach. Finally, Wiener filter-
`ing is applied to reduce effects of blurring and noise
`caused by the system. Shah and Zakhor proposed an SR
`color video enhancement algorithm using the Landweber
`algorithm [8]. They also consider the inaccuracy of the
`registration algorithm by finding a set of candidate mo-
`
`tion estimates instead of a single motion vector for each
`pixel. They use both luminance and chrominance infor-
`mation to estimate the motion field. Nguyen and
`Milanfar [9] proposed an efficient wavelet-based SR re-
`construction algorithm. They exploit the interlacing
`structure of the sampling grid in SR and derive a
`computationally efficient wavelet interpolation for inter-
`laced two-dimensional (2-D) data.
`The advantage of the nonuniform interpolation ap-
`proach is that it takes relatively low computational load
`and makes real-time applications possible. However, in
`this approach, degradation models are limited (they are
`only applicable when the blur and the noise characteris-
`tics are the same for all LR images). Additionally, the
`optimality of the whole reconstruction algorithm is not
`guaranteed, since the restoration step ignores the errors
`that occur in the interpolation stage.
`
`(a)
`
`(b)
`
`(c)
`
`(d)
`
`s 8. Nonuniform interpolation SR reconstruction results by (a)
`nearest neighbor interpolation, (b) bilinear interpolation, (c)
`nonuniform interpolation using four LR images, and (d)
`debluring part (c).
`
`Aliasing
`
`Aliased LR Signal (CFT)
`
`Dealiased HR Signal (DFT)
`
`Decompose Aliased Signal into Dealiased Signal
`
`s 9. Aliasing relationship between LR image and HR image.
`
`
`
`1
`
`2
`
`Frequency Domain Approach
`The frequency domain approach makes explicit use of the
`aliasing that exists in each LR image to reconstruct an
`HR image. Tsai and Huang [10] first derived a system
`equation that describes the relationship between LR im-
`ages and a desired HR image by using the relative motion
`between LR images. The frequency domain approach is
`based on the following three principles: i) the shifting
`property of the Fourier transform, ii) the aliasing rela-
`tionship between the continuous Fourier transform
`(CFT) of an original HR image and the discrete Fourier
`transform (DFT) of observed LR images, iii) and the as-
`sumption that an original HR image is bandlimited.
`These properties make it possible to formulate the system
`equation relating the aliased DFT coefficients of the ob-
`served LR images to a sample of the CFT of an unknown
`image. For example, let us assume that there are two 1-D
`LR signals sampled below the Nyquist sampling rate.
`From the above three principles, the aliased LR signals
`can be decomposed into the unaliased HR signal as
`shown in Figure 9.
`
`Let x t(
`t
`2 denote a continuous HR image and
`,
`)
`1
`
`X w w( ,
`2 be its CFT. The global translations, which
`
`)
`1
`are the only motion considered in the frequency do-
`main approach, yield the kth shifted image of
`=
`+
`δ
`+
`δ
`, where δ k1 andδ k2 are ar-
`x t
`t
`x t
`t
`(
`,
`)
`(
`,
`)
`k
`
`k1
`k
`1
`2
`1
`2
`2
`bitrary but known values, and k
`= 1 2,
`p
`. By the shifting
`,..,
`property of the CFT, the CFT of the shifted image,
`X w w
`2 , can be written as
`,
`)
`k(
`1
`[
`]
`(
`)
`
`X w w( ,
`
`) exp=
`j
`π δ
`w
`+
`
`2
`
`k1
`k
`k
`1
`1
`2
`2
`2
`1
`2
`t
`The shifted image x t
`is sampled with the sampling
`)
`,
`k(
`period T1 and T2 to generate the observed LR image
`y n n
`2 . From the aliasing relationship and the assump-
`,
`]
`k[
`1
`
`tion of bandlimitedness of X w w( ,
`
`)| 0=
`X w w1
`
`)
`(| (
`,
`2
`1
`2
`≥
`π
`w
`
`| (≥
`L
`π
`T
`
`L
`T
`for | | (w
`), the relationship
`, |
`/
`)
`/
`)
`1
`1
`1
`2
`2
`2
`between the CFT of the HR image and the DFT of the kth
`observed LR image can be written as [76]
`
`δ
`
`w
`
`X w w( ,
`
`
`
`. (3)
`)
`
`26
`
`IEEE SIGNAL PROCESSING MAGAZINE
`
`MAY 2003
`
`6
`
`
`
`nL
`
`1
`
`1
`
`1 2
`
`T T
`1
`
`
`
`×
`
`Y
`k
`
`
`
`[Ω Ω,
`1
`
`
`
`]
`
`=
`
`2
`
`2
`Ω
`π
`T N
`1
`
`01
`=−
`
`X
`
`nL
`
`2
`
`2
`
`01
`=−
`
`∑ ∑
`
`
`
`+
`
`n
`1
`
`1 1
`
`
`
`The basic premise for increasing
`the spatial resolution in SR
`techniques is the availability
`of multiple LR images captured
`from the same scene.
`
`.
`
`
`
`
`+
`
`n
`
`2
`
`2 2
`
`
`
`Ω
`π
`2
`T N
`2
`
`k
`
`,
`
`(4)
`By using lexicographic ordering for the indices n1 , n 2 on
`the right-hand side and k on the left-hand side, a matrix
`vector form is obtained as:
`
`Y
`
`X= Φ ,
`
`(5)
`
`squares (CLS) and maximum a posteriori (MAP) SR im-
`age reconstruction methods are introduced.
`
`1
`
`Deterministic Approach
`With estimates of the registration parameters, the obser-
`vation model in (2) can be completely specified. The de-
`terministic regularized SR approach solves the inverse
`problem in (2) by using the prior information about the
`solution which can be used to make the problem well
`posed. For example, CLS can be formulated by choosing
`an x to minimize the Lagrangian [63]
`
`y W x−
`
`
`k
`k
`
`2
`
`+
`
`α
`
`Cx
`
`2
`
`,
`
`p
`
`=∑
`
`where Y is a p × 1 column vector with the kth element of
`the DFT coefficients of y n n
`2 , X is a L L1
`1× column
`,
`]
`k[
`1
`2
`
`vector with the samples of the unknown CFT of x t(
`t
`2 ,
`,
`)
`
`andΦis a p L L× 1 2 matrix which relates the DFT of the ob-
`
`served LR images to samples of the continuous HR image.
`Therefore, the reconstruction of a desired HR image re-
`quires us to determine Φ and solve this inverse problem.
`An extension of this approach for a blurred and noisy
`image was provided by Kim et al. [11], resulting in a
`weighted least squares formulation. In their approach, it is
`assumed that all LR images have the same blur and the
`same noise characteristics. This method was further re-
`fined by Kim and Su [12] to consider different blurs for
`each LR image. Here, the Tikhonov regularization
`method is adopted to overcome the ill-posed problem re-
`sulting from blur operator. Bose et al. [13] proposed the
`recursive total least squares method for SR reconstruction
`to reduce effects of registration errors (errors in Φ). A
`discrete cosine transform (DCT)-based method was pro-
`posed by Rhee and Kang [14]. They reduce memory re-
`quirements and computational costs by using DCT
`instead of DFT. They also apply multichannel adaptive
`regularization parameters to overcome ill-posedness such
`as underdetermined cases or insufficient motion informa-
`tion cases.
`Theoretical simplicity is a major advantage of the fre-
`quency domain approach. That is, the relationship be-
`tween LR images and the HR image is clearly
`demonstrated in the frequency domain. The frequency
`method is also convenient for parallel implementation ca-
`pable of reducing hardware complexity. However, the ob-
`servation model is restricted to only global translational
`motion and LSI blur. Due to the lack of data correlation
`in the frequency domain, it is also difficult to apply the
`spatial domain a priori knowledge for regularization.
`
`Regularized SR Reconstruction Approach
`Generally, the SR image reconstruction approach is an
`ill-posed problem because of an insufficient number of
`LR images and ill-conditioned blur operators. Proce-
`dures adopted to stabilize the inversion of ill-posed prob-
`lem are called regularization. In this section, we present
`deterministic and stochastic regularization approaches
`for SR image reconstruction. Typically, constrained least
`
`(7)
`
`(8)
`
`27
`
`and this leads to the following iteration for $x:
`
`
`
`,
`
`−
`
`α
`C Cx
`T
`$
`
`n
`
`
`
`−
`(W y W x$ )
`T
`k
`k
`k
`
`
`n
`
`p
`
`∑
`
`k
`
`=
`1
`
`
`
`n
`
`+
`
`1
`
`x
`$
`
`=
`
`x
`$
`
`n
`
`+
`
`β
`
`MAY 2003
`
`IEEE SIGNAL PROCESSING MAGAZINE
`
`(6)
`
`
`
`k
`
`1
`
`
`
`where the operatorC is generally a high-pass filter, and||||⋅
`represents a l2 -norm. In (6), a priori knowledge concern-
`ing a desirable solution is represented by a smoothness
`constraint, suggesting that most images are naturally
`smooth with limited high-frequency activity, and there-
`fore it is appropriate to minimize the amount of high-pass
`energy in the restored image. In (6), α represents the
`Lagrange multiplier, commonly referred to as the regular-
`ization parameter, that controls the tradeoff between fi-
`2
`delity to the data (as expressed by ∑
`−y W x
`
`) and
`p
`
`=k 1
`k
`k
`smoothness of the solution (as expressed by Cx 2 ). The
`Larger values of α will generally lead to a smoother solu-
`tion. This is useful when only a small number of LR im-
`ages are available (the problem is underdetermined) or
`the fidelity of the observed data is low due to registration
`error and noise. On the other hand, if a large number of
`LR images are available and the amount of noise is small,
`smallα will lead to a good solution. The cost functional in
`(6) is convex and differentiable with the use of a quadratic
`regularization term. Therefore, we can find a unique esti-
`mate image $x which minimizes the cost functional in (6).
`One of the most basic deterministic iterative techniques
`considers solving
`
`=
`
`p
`
`∑
`
`k
`
`
`
`=1
`
`W y
`T
`k
`
`k
`
`,
`
`
`
`+W W C C xα
`
`
`
`
`T
`T
`$
`k
`k
`
`p
`
`∑
`
`k
`
`
`
`=1
`
`
`
`7
`
`
`
`where β represents the convergence parameter and Wk
`T
`contains an upsampling operator and a type of blur and
`warping operator.
`Katsaggelos et al. [15], [16] proposed a multichan-
`nel regularized SR approach in which regularization
`functional is used to calculate the regularization param-
`eter without any prior knowledge at each iteration step.
`Later, Kang formulated the generalized multichannel
`deconvoultion method including the multichannel reg-
`ularized SR approach [17]. The SR reconstruction
`method obtained by minimizing a regularized cost
`functional was proposed by Hardie et al. [18]. They de-
`fine an observation model that incorporates knowledge
`of the optical system and the detector array (sensor
`PSF). They used an iterative gradient-based registra-
`tion algorithm and considered both gradient descent
`and conjugate-gradient optimization procedures to
`minimize the cost functional. Bose et al. [19] pointed
`to the important role of the regularization parameter
`and a proposed CLS SR reconstruction which gener-
`ates the optimum value of the regularization parameter,
`using the L-curve method [77].
`
`Stochastic Approach
`Stochastic SR image reconstruction, typically a Bayesian
`approach, provides a flexible and convenient way to
`model a priori knowledge concerning the solution.
`Bayesian estimation methods are used when the a pos-
`teriori probability density function (PDF) of the original
`
`(a)
`
`(b)
`
`where Z is simply a normalizing constant, U( )x is called an
`energy function, ϕ c (x) is a potential function that depends
`only on the pixel values located within clique c, andSdenotes
`the set of cliques. By defining ϕ c ( )x as a function of the de-
`rivative of the image, U( )x measures the cost caused by the
`irregularities of the solution. Commonly, an image is as-
`sumed to be globally smooth, which is incorporated into the
`estimation problem through a Gaussian prior.
`A major advantage of the Bayesian framework is the
`use of an edge-preserving image prior model. With the
`Gaussian prior, the potential function takes the quadratic
`=
` nD(
`
`2 where D n(
`) is an nth order differ-
`form ϕ c
`
`( )x
`x
`)
`(
`)
`ence. Though the quadratic potential function makes the
`algorithm linear, it penalizes the high-frequency compo-
`nents severely. As a result, the solution becomes
`oversmoothed. However, if we model a potential func-
`tion which less penalizes the large difference in x, we can
`obtain an edge-preserving HR image.
`If the error between frames is assumed to be inde-
`pendent and noise is assumed to be an independent iden-
`tically distributed (i.i.d) zero mean Gaussian distribu-
`tion, the optimization problem can be expressed more
`compactly as
`
`(12)
`
`
`
`
`
`( )x
`
`,
`
`−
`y W x
`$
`k
`
`2
`
`+
`
`∑
`α ϕ
`c S
`∈
`
`c
`
`p
`
`∑
`
`1