`VALEO v. MAGNA
`IPRZO15-01410
`
`P'agé
`
`Page 1
`
`
`
`DIGITAL
`IMAGE
`PROCESSING
`
`Rafael C .. Gonzalez
`University of Ten nessee
`Perceptics Corporation
`
`Richard E. Woods
`
`Perceptics Corporation
`
`A
`Y'Y ADDISON-WESLEY PUBLISHING COMPANY
`Reading, Massachusetts· Menlo Park, Ca lifornia· New York
`Don Mills, O ntario· Wokingham, England ·Amsterdam · Bonn
`Sydney · Singapore · Tokyo · Madrid · San juan · Milan · Paris
`
`Page 2
`
`
`
`Library of Congress Cataloging-in-Publication Data
`Gonzalez, Rafael C.
`Digital image processing I by Rafael C. Gonzalez. Richard C.
`Woods.
`em.
`p.
`Includes bibliographical references and index.
`ISBN 0-201 -50803-6
`I. Image processing- Digital techniques.
`ll. Title.
`TAlo32.G60
`621.36 '7 - clc20
`
`19CJ2
`
`I. Woods. Richard C.
`
`91 -30866
`C IP
`
`Copyright © 1992 by Addison-Wesley Publishing Company, Inc.
`All rights reserved. No part of this publication may be reproduced , stored in a retrieval system. or
`transmitted, in any form or by any means, electronic, mechan ical, photocopying. recording, or other(cid:173)
`wise, without the prior written permis~ion of the publisher. Printed in the United States of A merica.
`
`1 2 3 4 5 6 7 8 9 10-MA-95949392
`
`Page 3
`
`
`
`CONTENTS
`
`INTRODUCTION
`Chapter 1
`1.1 Background
`1.2 Digital Image Representation
`1.3 Fundamental Steps in Image Processing
`1.4 Elements of Digital Image Processing System11
`1.4.1
`Image Acquisition
`1.4.2 Storage
`1.4.3 Processing
`1.4.4 Communication
`1.4.5 D isplay
`1.5 Organization of the Book
`References
`
`Chapter 2 DIGITAL IMAGE FUNDAMENTALS
`2.1 Elements of Visual Perception
`2.1.1 Structure o f the Human Eye
`2.1.2
`Image Formation in the Eye
`2.1.3 Brightness Adaptation and Discriminatio n
`2.2 A Simple Image Model
`2.3 Sampling and Quantization
`2.3.1 U niform Sampling and Quantization
`2.3.2 Nonuniform Sampling and Quantization
`2.4 Some Basic Relationships Between Pixels
`2.4.1 Neighbors of a Pixel
`2.4.2 Connectivity
`2.4.3 Labeling of Connected Components
`2.4.4 Relations, Equivalence, and Transitive C losure
`2.4.5 D istance Measures
`2.4.6 Ari thmetic/Lugil: Opt:rations
`Imaging Geometry
`2.5.1 Some Basic Transformations
`2.5.2 Perspective Transformations
`2.5.3 Camera Model
`2.5.4 Camera Calibration
`2.5.5 Stereo Imaging
`2.6 Photographic Film
`2 .6.1 Film Structure and Exposure
`
`2.5
`
`1
`1
`6
`7
`10
`10
`14
`15
`16
`16
`17
`18
`
`21
`21
`22
`24
`25
`28
`31
`31
`38
`40
`40
`41
`42
`43
`45
`47
`51
`52
`56
`61
`67
`6S
`71
`71
`
`xi
`
`Page 4
`
`
`
`xii
`
`Contents
`
`2.6.2 Film Characteristics
`2.6.3 Diaphragm and Shutte r Settings
`2.7 Concluding Remarks
`References
`Problems
`
`IMAGE TRANSFORMS
`Chapte r 3
`lntroduction to the Fourier Trunsform
`3.1
`3.2 The Discrete Fourier Transform
`3.3 Some Prope rties of the Two-Dimensional Fourier Transform
`3.3. l Sepa rability
`3.3.2 Translation
`3.3.3 Periodicity and Conjugate Symmetry
`3.3.4 Rotation
`3.3.5 Distributivity and Scaling
`3.3.6 Average Value
`3.3.7 Laplacian
`3.3.8 Convolution and Corre latio n
`3.3.9 Sampling
`3.4 The Fast Fourier Transform
`3.4.1 FFT Algorithm
`3.4.2 Number of Operations
`3.4.3 The Inverse FFT
`3.4.4
`Implementation
`3.5 Other Separable Image Transforms
`3.5 . .I Wa lsh Transform
`3.5.2 Hadamard Transform
`3.5.3 Discrete Cosine Transform
`3.5.4 The Haar Transfo rm
`3.5.5 The Slant T ransform
`3.6 The Hotelling Transform
`3.7 Concluding Remarks
`References
`Problems
`
`IMAGE ENHANCEMENT
`Chapter 4
`4.1 Background
`4. l. L Spatial Domain Methods
`4.1.2 Freque ncy Domain Me thods
`
`Page 5
`
`
`
`4.2 Enhancement by Point Processing
`4.2.1 Some Simple Intensity Transformations
`4.2.2 Histogram Processing
`4.2.3
`Image Subtraction
`4.2.4
`Image Averaging
`4.3 Spatial Filtering
`4.3.1 Background
`4.3.2 Smoothing Filters
`4.3.3 Sharpening Filters
`4.4 Enhancement in the Frequency Domain
`4.4.1 Lowpass Filtering
`4.4.2 Highpass Filtering
`4.4.3 Homomorphic Filtering
`4.5 Generation of Spatial Masks from Frequency Domain Specifications
`4.6 Color Image Processing
`4.6.1 Color Fundamentals
`4.6.2 Color Models
`4.6.3 Pseudo-color Image Processing
`4.6.4 Full-color Image Processing
`4.7 Concluding Remarks
`References
`Problems
`
`:._:
`
`:I
`
`IMAGE RESTORATION
`Chapter 5
`5.1 Degradation Model
`5 .1.1 Some Definitions
`5.1.2 Degradation Model for Continuous Functions
`5.1.3 Discrete Formulation
`5.2 Diagonalization of Circulant and Block-Circulant Matrices
`5.2.1 Circulant Matrices
`5.2.2 Block-Circulan t Matrices
`5.2.3 Effects of Diagonalization on the Degradation Model
`5.3 Algebraic Approach to Restoration
`5.3.1 Unconstrained Restoration
`5.3.2 Constrained Restoration
`Inverse Filtering
`5.4.1 Formulation
`5.4.2 Removal of Blur Caused by Uniform Linear Motion
`5.5 Least Mean Square (Wiener) Filter
`5.6 Constrained Least Squares Restoration
`5.7
`Interactive Restoration
`
`5.4
`
`Contents
`
`xiii
`
`166
`166
`171
`185
`187
`189
`189
`191
`195
`201
`202
`209
`213
`218
`221
`222
`225
`237
`245
`248
`248
`249
`
`253
`254
`254
`255
`257
`261
`261
`263
`264
`268
`268
`269
`270
`270
`272
`279
`282
`289
`
`Page 6
`
`
`
`xiv
`
`Contents
`
`5.8 Restoration in the Spatial Domain
`5.9 Geometric Transformations
`5. 9.1 Spatial Transformations
`5.9.2 Gray-Levellntcrpolation
`5.10 Concluding Remarks
`References
`Problems
`
`6.2
`
`IMAGE COMPRESSION
`Chapter 6
`6.1 Fundamentals
`6.1.1 Coding Redundancy
`6.1.2
`Inlerpixel Redundancy
`6.1.3 Psychovisual Redundancy
`6.1.4 Fidelity Criteria
`Image Compression Models
`6.2.1 The Source Encoder and Decoder
`6.2.2 The Channel Encoder a nd Decoder
`6.3 Elements of Information Theory
`6.3.1 Measuring 1 nforrnation
`6.3.2 The Information Chan nel
`6.3.3 Fundamental Codi ng Theorems
`6.3.4 Using Information Theory
`6.4 Error-Free Compression
`6.'1.1 Variable-Le ngth Coding
`6.4.2 Bit-Plane Coding
`6.4.3 Lossless Predictive Coding
`6.5 Lossy Compression
`6.5.1 Lossy Predictive Coding
`6.5.2 Transform Coding
`6.6 Image Compression Standards
`6.6.1 Bilevel (Binary) Image Compression Standards
`6.6.2 Continuous Tone Image Compression Standards
`6.7 Concluding Remarks
`References
`Problems
`
`Chapter 7
`7.1
`
`IMAGE SEGMENT A liON
`I )etection of Discontinuities
`7 .1.1 Point Detection
`
`296
`296
`298
`300
`302
`303
`304
`
`307
`309
`310
`312
`315
`318
`320
`321
`322
`324
`324
`325
`331
`339
`343
`343
`349
`358
`362
`362
`374
`389
`389
`394
`405
`405
`407
`
`413
`414
`415
`
`Page 7
`
`
`
`7.1.2 Line Detection
`7.1.3 Edge Detection
`7.1.4 Combined Detection
`7.2 Edge Linking and Boundary Detection
`7.2.1 Local Processing
`7.2.2 Global Processing via the Hough Transform
`7.2.3 Global Processing via Graph-Theoretic Techniques
`7.3 Thresholding
`7.3.1 Foundation
`7.3.2 The Role of Illumination
`7.3.3 Simple G lobal Thresholding
`7.3.4 Optimal Thresholding
`7.3.5 Threshold Selection Dased on Boundary Characteristics
`7.3.6 Thresholds Dased on Several Variables
`7.4 Region-Oriented Segmentation
`7.4.1 Dasic Formulation
`7.4.2 Region Growing by Pixel Aggregation
`7.4.3 Region Splitting and Merging
`7.5 The Usc of Motion in Segmentation
`7.5.1 Spatial Techniques
`7.5.2 Frequency Domain Techniques
`7.6 Concluding Remarks
`References
`Problems
`
`Chapter 8 REPRESENTATION AND DESCRIPTION
`8.1 Representation Schemes
`8.1. 1 Chain Codes
`8.1.2 Polygonal Approximations
`8.1.3 Signatures
`8.1.4 Boundary Segme nts
`8.1.5 The Skeleton of a Region
`8.2 Boundary Descriptors
`8.2.1 Some Simple Descriptors
`8.2.2 Shape Numbers
`8.2.3 Fourier Descriptors
`8.2.4 Moments
`8.3 Regional Descriptors
`8.3. 1 Some Simple Descriptors
`8.3.2 Topological Descriptors
`
`Contents
`
`xv
`
`415
`416
`423
`429
`429
`432
`438
`443
`443
`444
`447
`447
`452
`455
`458
`458
`458
`461
`465
`465
`470
`476
`477
`478
`
`483
`
`4~3
`484
`486
`. 48~
`490
`491
`494
`494
`496
`497
`502
`503
`503
`504
`
`Page 8
`
`
`
`xvi
`
`Contents
`
`8.3.3 Texture
`8.3.4 Moments
`8.4 Morphology
`8.4. J Dilation and Erosion
`8.4.2 Opening and Closing
`8.4.3 Hit-or-Miss Transform
`8.4.4 Some Basic Morphological Algorithms
`8.4.5 Extensions to Gray-Scale Images
`8.5 Relntionol Descriptors
`8.6 Concluding Remarks
`References
`Problems
`
`Chapter 9 RECOGNITION AND INTERPRETATION
`9.1 Elements of Image Analysis
`9.2 Patterns and Pattern Classes
`9.3 Decision-Theoretic Methods
`9.3.1 Matching
`9.3.2 Optimum Statistical Classifiers
`9.3.3 Neural Networks
`9.4 Structural Methods
`9.4.1 Matching Shape Numbers
`9.4.2 String Match ing
`9.4.3 Syntactic Methods
`Interpretation
`9.5.1 Background
`9.5.2 Types of Knowledge
`9.5.3 Logical Systems (Predicate Calculus)
`9.5.4 Semantic Networks
`9.5.5 Production (Expert) Systems
`9.6 Concluding Remarks
`References
`Problems
`
`9.5
`
`Appendix A GENERATION OF HALFTONE IMAGES
`
`Appendix .B CODED IMAGES
`
`BIBLIOGRAPHY
`
`INDEX
`
`506
`514
`518
`518
`524
`528
`530
`548
`560
`564
`565
`566
`
`571
`572
`574
`579
`580
`586
`595
`619
`619
`621
`623
`639
`639
`640
`641
`650
`652
`657
`657
`658
`
`663
`
`669
`
`683
`
`705
`
`Page 9
`
`
`
`2.5
`
`Imaging Geometry
`
`51
`
`Table 2.3 ALU Operations
`
`Operations on A
`
`Operations on B
`
`Multiply by W 5
`
`Add w4*A
`
`Add w1*A
`
`Add w2*A
`
`Add w/A
`
`Add wfi*A
`
`Add w9*A
`
`Add wN*A
`
`Add w1*A
`
`Shift right
`
`Shift down
`
`Shift left
`
`Shift left
`
`Shift up
`
`Shift up
`
`Shift right
`
`Shift right
`
`Shift left
`Shift down
`
`happens in a single pixel of B by considering how a mask would have to be
`shifted in order to produce the result of Eq. (2.4-5) in that location. The first
`operation on B produces w5 multiplied by the pixel value at that location. Since
`that value is z5 , we have w5 z 5 after this operation. The first shift to the right
`brings the neighbor with value z4 (see Fig. 2.15a) over that location. The next
`operation multiplies Z4 by W4 and adds the result to the location of the first
`step. So at this point the result is w4z4 + w5z5 at the location in question. The
`next shift on A and ALU operation on B produce Wt z 1 + W4 z 4 + w5 z5 at that
`location , and so on. The operations are done in parallel for all locations in B,
`so this procedure takes place simultaneously at the other locations in that frame
`buffer. In most ALUs, the operation of multiplying an image by a constant
`(say, Wi'~A) followed by an ADD is done in one frame time. Thus the ALU
`implementation of Eq. (2.4-5) for an entire image takes on the order of nine
`frame times (9/30 sec). For ann x m mask it would take on the order of nm
`frame times.
`
`IMAGING GEOMETRY - - - - - - - - - - - -- - - -
`
`2.5
`In the following discussion we present several important transformations used
`in imaging, derive a camera model , and treat the stereo imaging problem in
`some detail.
`
`Page 10
`
`
`
`52
`
`Digital Image Fundamentals
`
`2.5.1 Some Basic Transformations
`The material in this section deals with development of a unified representation
`for problems such as image rotation, scaling, and translation. All transfor(cid:173)
`mations are expressed in a three-dimensional (3-D) Cartesian coordinate system -
`in which a point has coordinates denoted (X, Y, Z). In cases involving 2-D
`images, we adhere to our previous convention of lowercase representation (x ,
`y) to denote the coordinates of a pixel. Referring to (X, Y, Z) as the world
`coordinates of a point is common terminology.
`
`Translation
`Suppose that the task is to translate a point with coordinates (X, Y, Z) to a
`new location by using displacements (X0 , Y0 , Zo). The translation is easily
`accomplished by using the equations:
`
`X*= X + X o
`Y* = Y + Yo
`Z* = Z + Zo
`
`(2.5-1)
`
`where (X*, Y*, Z*) are the coordinates of the new point. Equation (2.5-1)
`may be expressed in matrix form by writing
`
`[1 0 0 X0
`] X y
`X*]
`Y* = 0 1 0 Yo
`0 0 1 Z 0 z
`Z*
`1
`
`(2.5-2)
`
`It is often useful to concatenate several transformations to produce a com(cid:173)
`posite result, such as translation, followed by scaling and then rotation. The
`use of square matrices simplifies the notational representation of this process
`considerably. With this in mind, Eq. (2.5-2) can be written as follows:
`
`X*
`
`Y*
`
`Z*
`
`1
`
`=
`
`1 0 0 Xo X
`0 1 0 Yo y
`0 0 1 Zo z
`0 0 0
`1
`]
`
`(2.5-3)
`
`In terms of the values of X*, Y*, and Z*, Eqs. (2.5-2) and (2.5-3) are equivalent.
`Throughout this section, we use the unified matrix representation
`v* = Av
`(2.5-4)
`where A is a 4 x 4 transformation matrix, v is the column vector containing
`
`Page 11
`
`
`
`the original coordinates,
`
`2.5 Imaging Geometry
`
`53
`
`(2.5-5)
`
`v =
`
`X
`
`y
`z
`1
`
`and v* is a column vector whose components are the transformed coordinates
`
`v* =
`
`X*
`
`Y*
`
`Z*
`
`1
`
`With this notation, the matrix used for translation is
`
`T =
`
`1 0 0 X o
`
`0 1 0 Yo
`
`0 0 1 Zo
`
`0 0 0
`
`(2.5-6)
`
`(2.5-7)
`
`and the translation process is accomplished by using Eq. (2.5-4), so that v* -
`Tv.
`
`Scaling
`Scaling by factors S.., Sn and S2 along the X, Y, and Z axes is given by the
`transformation matrix
`
`S =
`
`s.
`0
`
`0
`
`0
`
`0
`
`Sy
`
`0
`
`0
`
`0 0
`
`0 0
`s, 0
`0 1
`
`(2.5-8)
`
`Rotation
`The transformations used for 3-D rotation are inherently more complex thlln
`the transformations discussed thus far. The simplest form of these transfor(cid:173)
`mations is for rotation of a point about the coordinate axes. To rotate a point
`about another arbitrary point in space requires three transformations: the first
`
`Page 12
`
`
`
`54
`
`Digital Image Fundamentals
`
`z
`
`f3
`
`X
`
`Figure 2.16 Rotation of a point about each of the coordinate axes. Angles are measured
`clockwise when looking along the rotation axis toward the origin.
`
`translates the arbitrary point to the origin , the second performs the rotation,
`and the third translates the point back to its original position .
`With reference to Fig. 2. 16, rotation of a point about the i coordinate axis
`by an angle f) is achieved by using the transformation
`
`cos (}
`
`sin 8 0 0
`
`- sin () cos 0 0 0
`
`0
`
`0
`
`0
`
`0
`
`1 0
`0 1
`
`(2.5-9)
`
`The ro tatio n angle ()is measured clockwise when looking at the origin from a
`point on the + Z axis. This transformation affects only the values of X and Y
`coordinates.
`R otation of a point about the X axis by an angle a is performed by using
`the transformation
`
`R =
`
`IX
`
`1
`
`0
`
`0
`
`0
`
`0
`
`0
`
`0
`
`cos a
`
`sin a 0
`
`- sin a cos a 0
`
`0
`
`0
`
`1
`
`(2.5-10) .
`
`Finally, rotation of a point about the Y axis by an angle f3 is achieved by
`
`Page 13
`
`
`
`2.5 Imaging Geometry
`
`55
`
`using the transformation
`
`R.o =
`
`cos {3 0
`
`- sin {3 0
`
`0
`
`1
`
`sin {3 0
`
`0
`
`0
`
`0
`cos /3
`0
`
`0
`
`0
`
`1
`
`(2.5-11)
`
`(2.5-12)
`
`Concatenation and inverse transformations
`The application of several transformations can be represented by a single 4 x
`4 transformation matrix. For example, translation , scaling, and rotation about
`the Z axis of a point v is given by
`v"' = R9(S(Tv))
`= Av
`where A is the 4 x 4 matrix A = R 9ST. These matrices generally do not
`commute, so the order of application is important.
`Although the discussion thus far has been limited to transformations of a
`single point, the same ideas extend to transforming a set of m points simul(cid:173)
`taneously by using a single transformatio n. With reference to Eq. (2.5-5), let
`v1, v2 , • • • , v"' represent the coordinates of m points. For a 4 x m matrix V
`whose columns are these column vectors, the simultaneous transformation of
`all these points by a 4 x 4 transformation matrix A is given by
`V * = AV.
`(2.5-13)
`T he resulting matrix V* is 4 x m. Its ith column, v;*, contains the coordinates
`of the transformed point corresponding to v;.
`Many of the transformations discussed above have inverse matrices that
`perform the opposite transformation and can be obtained by inspection. For
`example, the inverse translation matrix is
`
`T - ' =
`
`1 0 0
`0 1 0
`0 0 1
`0 0 0
`
`- X u
`- Yu
`- Zu
`1
`
`Similarly, the inverse rotation matrix R 9- r is
`
`Ru' =
`
`cos (- 8)
`
`sin ( - 8) 0 0
`
`- sin ( - 0) cos ( - e) 0 0
`
`0
`
`0
`
`0
`
`0
`
`1 0
`
`0 1
`
`(2.5-14)
`
`(2.5-15)
`
`Page 14
`
`
`
`56
`
`Digital Image Fundamentals
`
`The inverses of more complex transformation matrices are usually obtained by
`numerical techniques.
`
`2.5.2 Perspective Transformations
`A perspective transformation (also called an imaging transformation) projects
`3-D points onto a plane. Perspective transformations play a central role in
`image processing because they provide an approximation to the manner in
`which an image is formed by viewing a 3-D world. These transformations arc
`fundamentally different from those discussed in Section 2.5.1 because they are
`nonlinear in that they involve division by coordinate values.
`Figure 2.17 shows a model of the image formation process. The camera
`coordinate system (x , y, z) has the image plane coincident with the xy plane
`and the o ptical axis (established by the center of the lens) along the ~ axis.
`Thus the center of the image plane is at the origin, and the center of the lens
`is at coordinates (0, 0, A). lf the camera is in focus for distant objects, A is the
`focal length of the lens. Here the assumption is that the camera coordinate
`system is aligned with the world coordinate system (X, Y, Z). We remove this
`restriction in Section 2.5.3.
`Let (X, Y, Z) be the world coordinates of any point in a 3-D scene, as
`shown in Fig. 2.17. We assume throughout the following discussion that Z >
`A; that is, all points of interest lie in front of the lens. The first step is to obtain
`a relationship that gives the coordinates (x, y) of the projection of the point
`(X, Y, Z) onto the image plane. This is easily accomplished by the use of similar
`triangles. With reference to Fig. 2.17,
`
`y, y
`
`Figure 2.17 Basic model of the imaging process. The camera coordinate system (x, y, z) is
`aligned with the world coordinate system (X, Y, Z).
`
`Page 15
`
`
`
`of similar triangles. With reference to Fig. 2.17,
`X
`- = A
`Z - A
`X
`= A - z
`
`X
`
`2.5 Imaging Geometry
`
`57
`
`(2·.5-16)
`
`and
`
`y
`~
`= A
`Z - A
`y
`= A -z
`where the negative signs in front of X and Y indicate. that image points are
`actually inverted, as the geometry of Fig. 2.17 shows.
`The image-plane coordinates of the projected 3-D point follow directly
`from Eqs. (2.5-16) and (2.5-17):
`
`(2.5-17)
`
`and
`
`(2.5-18)
`
`(2.5-19)
`
`These equations are nonlinear because they involve division by the variable Z.
`Although we could use them directly as shown , it is often convenient to express
`them in linear matrix form, as in Section 2.5.1 for rotation, translation, and
`scaling. This is easily accomplished by using homogeneous coordinates.
`The homogeneous coordinates of a point with Cartesian coordinates (X,
`Y, Z) are defined as (kX, kY, kZ, k), where k is an arbitrary, nonzero constant.
`Clearly, conversion of homogeneous coordinates back to Cartesian coordinates
`is accomplished by dividing the first three homogeneous coordinates by the
`fourth. A point in the Cartesian world coordinate system may be expressed in
`vector form as
`
`and its homogeneous counterpart is given by
`
`w, =
`
`kX
`
`kY
`
`kZ
`
`k
`
`(2.5-20)
`
`(2.5-21)
`
`Page 16
`
`
`
`58
`
`Digital Image Fundamentals
`
`If we define the perspective transformation matrix as
`
`p =
`
`1 0
`
`0 1
`0 0
`
`0 0
`
`0 0
`
`1 0
`
`0 0 -1 1
`A
`
`the product Pw" yields a vector denoted c11:
`
`c11 = Pw,
`1 0
`
`0 1
`
`0 0
`
`0 0
`
`0 0
`
`k
`
`kY
`
`kZ
`
`k
`
`0 0
`
`1 0
`
`-1 1
`A
`
`kX
`
`kY
`kZ
`-kZ --+ k
`A
`
`(2.5-22)
`
`(2.5-23)
`
`The elements of c11 are the camera coordinates in homogeneous form. As in(cid:173)
`dicated, these coordinates can be converted to Cartesian form by dividing each
`of the first three components of c" by the fourth. Thus the Cartesian coordinates
`of any point in the camera coordinate system are given in vector form by
`
`X
`
`c = y =
`
`z
`
`AX
`A - Z
`
`AY
`A - Z
`
`AZ
`A - Z
`
`(2.5~24)
`
`The first two components of care the (x, y) coordinates in the image plane
`of a projected 3-D point (X, Y, Z), as shown earlier in Eqs. (2.5-18) and (2.5-
`19). The third component is of no interest in terms of the model in Fig. 2.17.
`As shown next, this component acts as a free variable in the inverse perspective
`transformation.
`
`Page 17
`
`
`
`2.5
`
`Imaging Geometry
`
`59
`
`The inverse perspective transformation maps an image point back into 3-
`D. Thus from Eq. (2.5-23),
`
`w,, = p - 1c,
`
`(2 .5-2~)
`
`where p - I is
`
`(2.5-26)
`
`1 0 0 0
`
`0 1 0 0
`
`0 0 1 0
`
`l
`
`0 0 1
`
`
`-A
`
`p 1 =
`
`Suppose that an image point has coordinates (x0 , y 0, 0), where the 0 in the
`z location simply indicates that the image plane is located at z = 0. This point
`may be expressed in homogeneous vector form as
`
`,, -
`c -
`
`kx0
`
`kyo
`
`0
`
`k
`
`(2.5-27)
`
`Application of Eq. (2.5-25) then yields the homogeneous world coordinate
`vector
`
`\ VII =
`
`kxo
`
`kyo
`
`0
`
`k
`
`or, in Cartesian coordinates,
`
`(2.5-28)
`
`(2.5-29)
`
`This result obviously is unexpected because it gives Z = 0 for any 3-D point.
`The problem here is caused by mapping a 3-D scene onto the image plane,
`which is a many-to-one transformation. The image point (xo, y0) corresponds
`to the set of collinear 3-D points that lie on the line passing through (x0 , y0 , 0)
`
`Page 18
`
`
`
`60
`
`Digital Image Fundamentals
`
`and (0, 0, A). The equations of this line in the world coordinate system come
`from Eqs. (2.5-18) and (2.5-19); that is,
`
`and
`
`X= Xn (A - Z)
`A
`
`y = Yo (A - Z).
`A
`
`(2.5-30)
`
`(2.5-31)
`
`Equations (2.5-30) and (2.5-31) show that unless something is known about
`the 3-D point that generated an image point (for example, its Z coordinate),
`it is not possible to completely recover the 3-D point from its image. This
`observation, which certainly is not unexpected , can be used to formulate the
`inverse perspective transformation by using the z component of c, as a free
`variable, instead of 0. Thus, by letting
`
`c, =
`
`kxo
`
`kyo
`
`kz
`
`k
`
`(2.5-32)
`
`it follows from Eq. (2.5-25) that
`
`w, =
`
`(2.5-33)
`
`k
`kz
`-+
`A
`
`which, upon conversion to Cartesian coordinates gives
`
`Axo
`A + z
`
`w= y
`
`= ~
`A+ z
`
`z
`
`Az
`- -
`A + z
`
`(2.5-34)
`
`Page 19
`
`
`
`2.5
`
`Imaging Geometry
`
`61
`
`In other words, treating z as a free variable yields the equations
`
`X = ~ ,\ + z
`Y = ~ ,\ + z
`Z=-~ ,\ + z
`Solving for z in terms of Z in the last equation and substituting in the first two
`expressions yields
`
`(2.5-35)
`
`and
`
`Xo
`X = -
`,\
`
`(.A - Z)
`
`y =Yo (A - Z)
`A
`
`(2.5-36)
`
`(2.5-37)
`
`which agrees with the observation that recovering a 3-D point from its image
`by means of the inverse perspective transformation requires knowledge of at
`least one of the world coordinates of the point. We address this problem again
`in Section 2.5.5.
`
`2.5.3 Camera Model
`Equations (2 .5-23) and (2.5-24) characterize the formation of an image by
`projection of 3-D points onto an image plane. These two equations thus con(cid:173)
`stitute a basic mathematical model of an imaging camera. This model is based
`on the assumption that the camera and world coordinate systems are coincident.
`In this section we consider a more general problem in which the two coordinate
`systems are allowed to be separate. However, the basic objective of obtaining
`the image-pla ne coordinates of any particular world po int remains the same.
`Figure 2.18 shows a world coordinate system (X, Y, Z) used to locate both
`the camera and 3-D points (denoted by w). Figure 2.18 also shows the camera
`coordinate system (x, y, z) and image points (denoted by c). The assumption
`is that the camera is mounted on a gimbal, which allows pan through an angle
`()and tilt through an angle a. Here, pan is the angle between the x and X axes,
`and tilt is the angle between the z and Z axes. The offset of the center of the
`gimbal from the origin of the world coordinate system is denoted by Wo, and
`the offset of the center of the imaging plane with respect to the gimbal center
`is denoted by vector r, with components (r1, r2, rJ).
`The concepts developed in Sections 2.5.1 and 2.5.2 provide all the necessary
`tools to derive a camera model based on the geometric arrangement of Fig.
`2. 18. The approach is to bring the camera and world coordinate systems into
`
`Page 20
`
`
`
`62
`
`Digital Image Fundamentals
`
`z
`
`w
`
`z
`
`r--------------r----~r-----------------Y
`
`Xo
`
`Yo
`
`X
`
`Figure 2.18
`[1987].)
`
`Imaging geometry with two coordinate systems. (From Fu, Gonzalez, and Lee
`
`alignment by applying a set of transformations. After doing so, we simply apply
`the perspective transformation of Eq. (2.5-22) to obtain the image-plane co(cid:173)
`ordinates for any world point. In other words, we first reduce the problem to
`the geometric arrangement shown in Fig. 2.17 before applying the perspective
`transformation.
`Suppose that, initially, the camera was in normal position, in the sense that
`the gimbal center and origin of the image plane were at the origin of the world
`coordinate system , and all axes were aligned. The geometric arrangement of
`Fig. 2.18 may then be achieved in several ways . Let us assume the following
`sequence of steps: (1) displacement of the gimbal center from the origin, (2)
`pan of the x axis, (3) tilt of the z axis, and (4) displacement of the image plane
`with respect to the gimbal center.
`
`Page 21
`
`
`
`2.5
`
`Imaging Geometry
`
`63
`
`Obviously, the sequence of these mechanical steps does not affect the world
`points because the set of points seen by the camera after it was moved from
`normal position is quite different. However, applying exactly the same sequence
`of steps to all world points can achieve normal position again. A camera in
`normal position satisfies the arrangement of Fig. 2.17 for application of the
`perspective transformation. Thus the problem is reduced to applying to every
`world point a set of transformations that correspond to the steps listed earlier.
`Translation of the origin of the world coordinate system to the location of
`the gimbal center is accomplished by using the transformation matrix
`
`G =
`
`1 0 0
`
`-Xo
`
`0 1 0
`
`- Yo
`
`0 0 1
`
`- Zo
`
`0 0 0
`
`1
`
`(2.5-38)
`
`In other words, a homogeneous world point w11 that was at coordinates (X0 ,
`Yo, Zo) is at the origin of the new coordinate system after the transformation
`Gw,.
`As indicated earlier, the pan angle is measured between the x and X axes.
`In normal position, these two axes are aligned. In order to pan the x axis
`through the desired angle, we simply rotate it by 8. The rotation is with respect
`to the z axis and is accomplished by using the transformation matrix R 0 of Eq.
`(2.5-9). In other words, application of this matrix to aU points (including the
`point Gw,,) effectively rotates the x axis to the desired location. When using
`Eq. (2.5-9) it is important to keep clearly in mind the convention established
`in Fi.g. 2.16. That is, angles are considered positive when points are rotated
`clockwise, which implies a counterclockwise rotation of the camera about the
`z axis. The unrotated (0°) position corresponds to the case when the x and X
`axes are aligned.
`At this point the z and Z axes are still aligned. Since tilt is the angle between
`these two axes, we tilt the camera an angle a by rotating the z axis by a. The
`rotation is with respect to the x axis and is accomplished by applying the
`transformation matrix Ra of Eq. (2.5-10) to all points (including the point
`RoGw,) . Again, a counterclockwise rotation of the camera implies positive
`angles, and the 0° mark is when the z and Z axes are aligned.t
`According to the discussion in Section 2.5.4, the two rotation matrices can
`be concatenated into a single matrix, R = Ra R 0• Then , from Eqs. (2.5-9) and
`
`t A useful way to visualize these transformations is to construct an axis system (for exampie,
`with pipe cleaners), label the axes x, y, and z, and perform the rotations manually, o ne axis
`at a time.
`
`Page 22
`
`
`
`64
`
`Digital Image Fundamentals
`
`(2.5-10),
`
`cos 0
`
`sin 0
`
`0
`
`0
`
`R =
`
`-sin 0 cos a
`
`cos 0 cos a
`
`sin a 0
`
`sin (}sin a
`
`-cos 0 sin a cos a 0
`
`0
`
`0
`
`0
`
`1
`
`(2.5-39)
`
`Finally, displacement of the origin of the image plane by vector r is achieved
`by the transformation matrix
`
`C=
`
`1 0 0
`
`- r ,
`
`0 1 0
`
`-
`
`r 2
`
`0 0 1
`
`- rJ
`
`0 0 0
`
`1
`
`(2.5-40)
`
`Thus applying to w, the series of transformations CRGw,. brings the world and
`camera coordinate systems into coincidence. T he image-plane coordinates of
`a point w, arc finally obtained by using Eq . (2.5-23). In other words, a ho(cid:173)
`mogeneous world point that is being viewed by a camera satisfying the geometric
`arrangement shown in Fig. 2. 18 has the following homogeneous representation
`in the camera coordinate system:
`
`c11 = PCRGw11
`
`(2.5-41)
`
`Equation (2.5-41) represents a perspective transformation involving two co(cid:173)
`ordinate systems.
`As indicated in Section 2.5 .2, we obtain the Cartesian coordinates (x, y)
`of the imaged point by dividing the first and second components of c11 by the
`fourth. Expanding Eq . (2.5-41) and converting to Cartesian coordinates yields
`(X - X 0)cos e + ( Y - Y0)sin 0 -
`r,
`x = A - (X- X0)sin 0 sin a+ (Y - Y0)cos 0 sin a - (Z- Zo)cos a + rJ +A
`(2.5-42)
`
`and
`
`- (X - X0)sin 0 cos a + (Y - Y0)cos e cos a + (Z - Zo)sin a -
`r2
`(Z - Zo)cos a+ rJ + A
`y = A - (X- X 0)sin e sin a + (Y- Y0)cos 0 sin a -
`(2.5-43)
`
`which are the image coordinates of a point w whose world coordinates are (X ,
`
`Page 23
`
`
`
`2.5 Imaging Geometry
`
`65
`
`Y, Z). These equations reduce to Eqs. (2.5-18) and (2.5-19) when Xo = Yo =
`Zo = 0, r1 = r2 = r3 = 0, and a =
`() = 0°.
`Example: As an illustration of the concepts just discussed, suppose that we
`want to find the image coordinates of the corner of the block shown in Fig-.
`2.19. The camera is offset from the origin and is viewing the scene with a pan
`of 135o and a tilt of 13SO. We will follow the convention that transformation
`angles are positive when the camera rotates counterclockwise, viewing the
`origin along the axis of rotation.
`Let us examine in detail the steps required to move the camera from normal
`position to the geometry shown in Fig. 2.19. The camera is in normal position
`in Fig. 2.20(a) and displaced from the origin in Fig. 2.20(b). Note that, after
`this step, the world coordinate axes are used only to establish angle references.
`That is, after displacement of the world coordinate origin, all rotations take
`place about the new (camera) axes. Figure 2.20(c) shows a view along the z
`axis of the camera to establish pan. In this case the rotation of the camera
`about the z axis is counterclockwise, so world points are rotated about this axis
`in the opposite direction, which makes ()a positive angle. Figure 2.20( d) shows
`a view, after pan, along the x axis of the camera to establish tilt. The rotation
`about this axis is counterclockwise, which makes a a positive angle. The world
`coordinate axes are shown as dashed lines in the latter two figures to emphasize
`that their only use is to establish the zero reference for the pan and tilt angles.
`We do not show the final step of displacing the image plane from the center
`of the gimbal.
`
`X
`
`Figure 2.19 Camera viewing a 3-D scene. (From Fu, Gonzalez, and Lee [1987]. )
`
`Page 24
`
`
`
`66
`
`Digital Image Fundamentals
`
`z
`
`z
`
`X
`
`y
`
`.)----------¥
`
`X
`
`(a)
`
`X
`
`(b)
`
`y
`
`x axis is -
`1 to paper
`
`+--..c::
`
`I
`I
`I
`I
`r
`I
`XY plane
`I
`_ ________ ____ _j
`
`z
`
`(c)
`
`(d)
`
`(a) Camera in normal position; (b) gimbal center displaced from origin; (c)
`Figure 2.20
`observer view of rotation about z axis to determine pan angle; (d) observer view of rotation
`about x axis for tilt. (From Fu, Gonzalez, and Lee {1987].)
`
`The followjng parameter values apply lo Lhis problem:
`
`X0 = 0 m
`a = 135°
`r, = 0.03 m
`
`Yo= 0 m
`e = 135°;
`r2 = r3 = 0.02 m
`
`Zo = 1m;
`
`A = 35 mm = 0.035 m
`
`The corner in question is at coordinates (X, Y, Z) = (1, 1, 0.2).
`To compute the image coordinates of the block corner, we simply substitute
`
`Page 25
`
`
`
`2.5
`
`Imaging Geometry
`
`67
`
`the parameter values into Eqs. (2.5-42) and (2.5-43); that is,
`
`Similarly,
`
`-0.03
`x=A - - - -
`-1 .53 + A
`
`-0.42
`y - A - - - (cid:173)
`-1.53 + A
`
`Substituting A
`
`0.035 yields the image coordinates
`
`x = 0.0007 m and y = 0.009 m.
`
`Note that these coordinates are well within a 1 x 1 in. (0.025 x 0.025 m)
`imaging plane. It is easily verified that use of a lens with a 200-mm focal length ,
`for example , would have imaged the corner of the block outside the boundary
`of a plane with these dimensions (that is, outside the effective field of view of
`the camera) .
`Finally, note that all coordinates obtained with Eqs. (2.5-42) and (2.5-43)
`are with respect to the center of the image plane . A change of coordinates is
`required to use the convention established earlier that the origin of an image
`0
`is at its top left corner.
`
`2.5.4 Camera Calibration
`In Section 2.5.3 we developed explicit equations for the image coordinates (x,
`y), of a world point w. As shown