throbber
APPL-1012 / Page 1 of 211
`Apple Inc. v. Corephotonics
`
`

`

`Texts in Computer Science
`
`Editors
`David Gries
`Fred B. Schneider
`
`For further volumes:
`www.springer.com/series/3 191]
`
`APPL-1012 / Page 2 of 211
`APPL-1012 / Page 2 of 211
`
`

`

`
`
`Richard Szeliski
`
`Computer Vision
`
`Algorithms and Applications
`
`G) Springer
`
`APPL-1012 / Page 3 of 211
`APPL-1012 / Page 3 of 211
`
`

`

`
`
`Dr. Richard Szeliski
`Microsoft Research
`One Microsoft Way
`98052-6399 Redmond
`Washington
`USA
`szeliski @microsoft.com
`
`Series Editors
`David Gres
`Department of Computer Science
`Upson Hall
`Comell University
`Ithaca, NY 14853-7501, USA
`
`Fred B. Schneider
`Department of Computer Science
`Upson Hall
`Comell University
`lihaca, NY 14853-7501, USA
`
`ISSN 1868-0941
`ISBN 978-1-84882-934-3
`DOT 10, 1007/978-1-84882-935-0
`Springer London Dordrecht Heidelberg New York
`
`e-ISSN 1868-095X
`e-ISBN 978-1-84882-935-0
`
`British Library Cataloguing in Publication Data
`A catalogue record for this book is available fromthe British Library
`
`Library of Congress Control Number: 20108368 17
`
`© Springer-Verlag London Limited 2011
`Apart from any fair dealing for the purposes of research or private study, of criticism of review, as
`permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced,
`stored or
`transmitted,
`in any form or by any means, with the prior permission in writing of
`the
`publishers, or in the case of reprographic reproduction in accordance withthe terms of licenses issued by
`the Copynght Licensing Agency. Enquiries conceming reproduction outside those terms should be sent
`to the publishers.
`The use of registered names, trademarks, etc., in this publication does not imply, even in the absence of a
`specific statement, that such names are exempt fromthe relevant laws and regulations and therefore free
`for general use.
`The publisher makes no representation, express or implied, with regard to the accuracy ofthe information
`contained in this book and cannot accept any legal responsibility or liability for any errors or omissions
`that may be made.
`
`Printed on acid-free paper
`
`Springer is part of Springer Science+Business Media (www.springer.com)
`
`APPL-1012 / Page 4 of 211
`APPL-1012 / Page 4 of 211
`
`

`

`Chapter 2
`
`Image formation
`
`2.1 Geometric primitives and transformations .. 2.1.6. ee a as cigs ganna
`2.1.1 Geometric primitives... 2. ee es Manone RN
`21.2
`2D transformations... 2. ae ee
`21S SD tranaiirmatigng; 32 eee eee aaa ee a
`21.4
`SD rotations. eee ce ee ee TELE
`2.15.
`3Dto2D projections . 2... ee ee ees ease ees ayaa
`2G:
`Tans distortions aceasta ta eee ele ee eas A aaa
`22 Photormetricimage formation.
`. 6... ee
`22.1 Lighting... aces eee ee ee Fee wee
`2.2.2 Reflectance and shading ... 2... 66. ees te eee
`Dei:
`ROHER cay sag git ela ergata etieee £2 SS Ree eral ee
`23 The digital camera... 0.004. 2g Re ae ea ta
`2.3.1
`Sampling and aliasing. 2.0.6 ee ee ke es
`Se EUG cnt
`na:
`cersses ese caccy!
`tes
`sisecien
`erpernecee erage nes
`ice See ere
`23.3. Compression ... 2.6 ce ees CREST TRL RAR ee
`24 Additional reading ........ Sgt erty SEER og BERT ECE eae
`25:
`tbnsinee: 5 re Sse pa add a Se ata es at a WEA
`
`23
`29
`33
`36
`
`aT
`42
`52
`54
`a4
`
`61
`
`KR. Szeliski, Computer Vision: Algorithms andApplications, Texts in Computer Science,
`DOT 10,1007/978-1-84882-945-0_2, © Springer-Verlag London Limited 2011
`
`iT
`
`APPL-1012 / Page 5 of 211
`APPL-1012 / Page 5 of 211
`
`

`

`28
`
`2 Image formation
`
`
`
`Figure 2.1 A few components of the image formation process: (a) perspective projection; (b)light scattering
`when hitting a surface; (c) lens optics; (d) Bayer color filter array.
`
`APPL-1012 / Page 6 of 211
`APPL-1012 / Page 6 of 211
`
`

`

`2.1 Geometric primitives and transformations
`
`29
`
`Before we can intelligently analyze and manipulate images, we need to establish a vocabulary
`for describing the geometry of a scene. We also need to understand the image formation
`process that produced a particular image given a set of lighting conditions, scene geometry,
`surface properties, and camera optics, In this chapter, we present a simplified model of such
`an image formation process.
`.
`Section 2.1 introduces the basic geometric primitives used throughout the book (points,
`lines, and planes) and the geometric transformations that project these 3D quantities into 2D
`image features (Figure 2.1a). Section 2.2 describes how lighting, surface properties (Fig-
`ure 2.1b), and camera optics (Figure 2.1c) interact in order to produce the color values that
`fall onto the image sensor, Section 2.3 describes how continuous color images are turned into
`discrete digital samples inside the image sensor (Figure 2.1d) and how to avoid (or at least
`characterize) sampling deficiencies, such as aliasing.
`The material covered in this chapter is but a brief summary of a very rich and deepset of
`topics, traditionally covered in a number of separate fields. A more thorough introduction to
`the geometry of points, lines, planes, and projections can be found in textbooks on multi-view
`geometry (Hartley and Zisserman 2004; Faugeras and Luong 2001) and computer graphics
`(Foley, van Dam, Feiner et al. 1995). The image formation (synthesis) process is traditionally
`taught as part of a computer graphics curriculum (Foley, van Dam, Feiner et al. 1995; Glass-
`ner 1995; Watt 1995; Shirley 2005) but it is also studied in physics-based computer vision
`(Wolff, Shafer, and Healey 1992a), The behavior of camera lens systems is studied in optics
`(Miler 1988; Hecht 2001; Ray 2002), Two good books on color theory are (Wyszecki and
`Stiles 2000; Healey and Shafer 1992), with (Livingstone 2008) providing a more fun and in-
`formal introduction to the topic of color perception. Topics relating to sampling and aliasing
`are covered in textbooks on signal and image processing (Crane 1997; Jéhne 1997; Oppen-
`heim and Schafer 1996; Oppenheim, Schafer, and Buck 1999; Pratt 2007; Russ 2007; Burger
`and Burge 2008; Gonzales and Woods 2008).
`A note to students: If you have already studied computer graphics, you may wantto
`skim the material in Section 2.1, although the sections on projective depth and object-centered
`projection near the end of Section 2.1.5 may be new to you, Similarly, physics students (as
`well as computer graphics students) will mostly be familiar with Section 2.2. Finally, students
`with a good background in image processing will already be familiar with sampling issues
`(Section 2.3) as well as some of the material in Chapter3.
`
`2.1 Geometric primitivesand transformations
`
`In this section, we introduce the basic 2D and 3D primitives used in this textbook, namely
`points, lines, and planes. We also describe how 3D features are projected into 2D features,
`More detailed descriptions of these topics (along with a gentler and more intuitive introduc-
`tion) can be found in textbooks on multiple-view geometry (Hartley and Zisserman 2004;
`Faugeras and Luong 2001),
`
`2.1.1 Geometric primitives
`
`Geometric primitives form the basic building blocks used to describe three-dimensional shapes.
`In this section, we introduce points, lines, and planes. Later sections of the book discuss
`
`APPL-1012 / Page 7 of 211
`APPL-1012 / Page 7 of 211
`
`

`

`30
`
`a)
`
`2 Image formation
`
` (b)
`
`Figure 2.2 (a) 2D line equation and (b) 3D plane equation, expressed in terms of the normal # and distance to
`the origin a.
`
`curves (Sections 5.1 and 11.2), surfaces (Section 12.3), and volumes (Section 12.5).
`
`2D points. 2D points (pixel coordinates in an image) can be denoted using a pair of values,
`x = (x,y) € R?, or alternatively,
`
`= | *
`
`y
`
`:
`
`(2.1)
`
`(As stated in the introduction, we use the (2:1, a2, ...) notation to denote column vectors.)
`2D points can also be represented using homogeneous coordinates, % = (#, ij, 1b) © P?,
`where vectors that differ only by scale are considered to be equivalent, P* = ‘R* — (0,0, 0)
`is called the 2D projective space.
`A homogeneous vector @ can be converted back into an inhomogeneous vector 2 by
`dividing through by the last element t, i.c.,
`
`z= (2,9, 0) = w(z,y, 1) = we,
`
`(2.2)
`
`where & = (xr, y, 1) is the augmented vector. Homogeneous points whoselast element is w =
`O are called ideal points or points at infinity and do not have an equivalent inhomogeneous
`representation.
`
`2Dlines can also be represented using homogeneous coordinates f = (a, b,c).
`2D lines.
`The corresponding line equation is
`
`#-l=ar+by+ec=0.
`
`(2.3)
`
`Wecan normalize the line equation vector so that ! = (fi,, iy, d) = (2, d) with ||7|) = 1. In
`this case, ft is the normal vector perpendicular to the line and ¢ is its distance to the origin
`(Figure 2.2),
`(The one exception to this normalization is the line at infinity i= (0,0, 1),
`which includes all (ideal) points at infinity.)
`We can also express fi as a function of rotation angle 6, A = (Az, iy) = (cosé,sin#)
`(Figure 2.2a). This representation is commonly used in the Hough transform line-finding
`algorithm, which is discussed in Section 4.3.2. The combination (@,d) is also known as
`polar coontinates.
`When using homogeneous coordinates, we can compute the intersection of two lines as
`
`#=1, xh,
`
`(2.4)
`
`APPL-1012 / Page 8 of 211
`APPL-1012 / Page 8 of 211
`
`

`

`2.1 Geometric primitives and transformations
`
`31
`
`where x is the cross product operator. Similarly, the line joining two points can be written as
`
`i= & x Bo.
`
`(2.5)
`
`When trying to fit an intersection point to multiple lines or, conversely, a line to multiple
`points, least squares techniques (Section 6.1.1 and Appendix A.2) can be used, as discusstd
`in Exercise 2.1,
`
`2D conics. There are other algebraic curves that can be expressed with simple polynomial
`homogeneous equations. For example, the conic sections (so called because they arise as the
`intersection of a plane and a 3D cone) can be written using a quadric equation
`
`£' Qz =0.
`
`(2.6)
`
`Quadric equations play useful roles in the study of multi-view geometry and camera calibra-
`tion (Hartley and Zisserman 2004; Faugeras and Luong 2001) but are not used extensively in
`this book,
`
`3D points. Point coordinates in three dimensions can be written using inhomogeneous co-
`ordinates x = (x,y,z) € R* or homogeneous coordinates = (Z,j, 2,1) € P®. As before,
`itis sometimes useful to denote a 3D point using the augmented vector & = (2, y, 2,1) with
`= we.
`
`3D planes can also be represented as homogencous coordinates 7m = (a, b,c, d)
`3D planes.
`with a corresponding plane equation
`.
`
`m= ar+ by -+cor+d—0,
`
`(2.7)
`
`We can also normalize the plane equation as m = (fiz, iy, iz, d) = (vt, d) with ||f|] = 1.
`Tn this case, 71 1s the normal vector perpendicular to the plane and d is its distance to the
`origin (Figure 2.2b). As with the case of 2D lines, the plane atinfinity ym = (0,0,0,1),
`which contains all the points at infinity, cannot be normalized (i.c., it does not have a unique
`normal or a finite distance),
`Wecan express 7 as a function of two angles (0, 4),
`
`ft = (cos é cos @, sin d cos @, sin d),
`
`(2.8)
`
`Le. using spherical coordinates, but these are less commonly used than polar coordinates
`since they do not uniformly sample the space of possible normal vectors,
`
`3D lines. Lines in 3D are less elegant than either lines in 2D or planes in 3D. One possible
`representation is to use two points on the line, (p,q). Any other point on the line can be
`expressed as a linear combination of these two points
`
`-
`
`r= (1—A)p+Aq,
`
`(2.9)
`
`as shown in Figure 2.3. If we restrict 0 < A < 1, we get the line segment joining p and q.
`
`APPL-1012 / Page 9 of 211
`APPL-1012 / Page 9 of 211
`
`

`

`32
`
`2 Image formation
`
`
`
`Figure 2.3 3D line equation, r = (1 — A)p+ Ag.
`
`If we use homogeneous coordinates, we can write the line as
`
`r=pp+ Ag.
`
`(2.10)
`
`A special case of this is whenthe secondpointis at infinity, i.e. @ = (dz, dy,dz,0) = (d,0).
`Here, we see that d is the direction of the line, We can then re-write the inhomogeneous 3D
`line equation as
`
`r=p+daAd.
`
`(2.11)
`
`A disadvantage of the endpoint representation for 3D lines is that it has too many degrees
`of freedom,i.e., six (three for each endpoint) instead of the four degrees that a 3D line truly
`has. However, if we fix the two points on the line to lie in specific planes, we obtain a rep-
`resentation with four degrees of freedom. For example,if we are representing nearly vertical
`lines, then z = 0 and z = 1 form two suitable planes, ie., the (2, y) coordinates in both
`planes provide the four coordinates describing the line. ‘This kind of two-plane parameteri-
`zation is used in the lightfield and Lumigraph image-based rendering systems described in
`Chapter 13 to represent the collection of rays seen by a camera as it moves in front of an
`object. The two-endpoint representation is also useful for representing line segments, even
`when their exact endpoints cannot be seen (only guessed at).
`If we wish to representall possible lines without bias towards any particular orientation,
`we can use Picker coordinates (Hartley and Zisserman 2004, Chapter 2; Faugeras and Luong
`2001, Chapter 3). These coordinates are the six independent non-zero entries in the 4x4 skew
`symmetric matrix
`
`L=pq' —@p’,
`
`(2.12)
`
`where p and g are any two (non-identical) points on the line. This representation has only
`four degrees of freedom, since L is homogeneousand also satisfies det(.L) = 0, which results
`in a quadratic constraint on the Pliicker coordinates.
`Inpractice, the minimal representation is not essential for most applications. An ade-
`quate model of 3D lines can be obtained by estimating their direction (which may be known
`ahead of time, e.g., for architecture) and some point within the visible portion of the line
`(see Section 7.5.1) or by using the two endpoints, since lines are most often visible as finite
`line segments. However, if you are interested in more details about the topic of minimal
`line parameterizations, Férstner (2005) discusses various ways to infer and model 3D lines in
`projective geometry, as well as how to estimate the uncertainty in such fitted models.
`
`APPL-1012 / Page 10 of 211
`APPL-1012 / Page 10 of 211
`
`

`

`2.1 Geometric primitives and transformations
`
`a3
`
`
`elONTU
`
`
`
`Figure 2.4 Basic set of 2D planar transformations.
`
`3D quadrics.
`
`‘The 3D analog of a conic section is a quadric surface
`
`#’Qz=0
`
`(2.13)
`
`(Hartley and Zisserman 2004, Chapter 2). Again, while quadric surfaces are useful in the
`study of multi-view geometry and can also serve as useful modeling primitives (spheres,
`ellipsoids, cylinders), we do not study them in great detail in this book.
`
`2.1.2 2D transformations
`
`Having defined our basic primitives, we can now tum our attention to how they can be trans-
`formed. The simplest transformations occur in the 2D plane andare illustrated in Figure 2.4,
`
`Translation.
`
`2D translations can be written as 2" = a2 + ¢ or
`
`where J is the (2 x 2) identity matrix or
`
`wea[ lr ele
`
`Ioet
`
`v=| or |?
`
`(2.14)
`
`(2.15)
`
`where 0 is the zero vector. Using a2 x 3 matrix results in a more compact notation, whereas
`using a full-rank 3 x 3 matrix (which can be obtained from the 2 x 3 matrix by appending a
`(07 1] row) makes it possible to chain transformations using matrix multiplication. Note that
`in any equation where an augmented vector such as & appears on both sides, it can always be
`replaced with a full homogeneous vector #.
`
`Rotation + translation. This transformation is also known as 2D rigid body motion or
`the 29 Euclidean transformation (since Euclidean distances are preserved). It can be written
`asa’ = ARax+tor
`
`where
`
`=[Rt]a@
`a
`‘i
`R=|
`cos0
`.
`is an orthonormal rotation matrix with RR™ = I and |A| = 1,
`
`cos@ —sin
`
`2.16)
`
`217)
`
`APPL-1012 / Page 11 of 211
`APPL-1012 / Page 11 of 211
`
`

`

`34
`
`2 Image formation
`
`Scaled rotation. Also known as the similarity transform, this transformation can be ex-
`pressed as a’ = sft + ¢ where s is an arbitrary scale factor. It can also be written as
`
`
`
`a —b #t,|_
`e'=(sR t]#=
`
`ba «la,
`
`(2.18)
`
`where we no longer require that a? + 6° = 1. Thesimilarity transform preserves angles
`between lines.
`
`Affine. The affine transformation is written as 2’ = AZ, where A is an arbitrary 2 x 3
`matrix, 1.€.,
`
`a
`
`Oo Jz.
`
`(2.19)
`
`ait | Goo
`
`fo1
`
`Mo Gi 42
`
`Parallel lines remain parallel under affine transformations.
`
`Projective. This transformation, also known as a perspective transform or homography,
`operates on homogeneous coordinates,
`
`# = Ha,
`
`(2.20)
`
`where AT is an arbitrary 3 x 3 matrix. Note that Ai is homogeneous,i.c., it is only defined
`up to a scale, and that two AY matrices that differ only by scaleare equivalent, The resulting
`homogeneous coordinate ' must be normalized in order to obtain an inhomogeneous result
`am, Le.,
`
`
`Ayoa + ary + fiz
`ints hoot + hoy + hoa
`hagt + hay + hao
`hagt + hary + hag
`Perspective transformations preserve straight lines (i.e., they remain straight after the trans-
`formation).
`
`and y=
`
`(2.21)
`
`Hierarchy of 2D transformations. The preceding set of transformationsare illustrated
`in Figure 2.4 and summarized in Table 2.1. The easiest way to think of them is as a set
`of (potentially restricted) 3 x 3 matrices operating on 2D homogeneous coordinate vectors.
`Hartley and Zisserman (2004) contains a more detailed description of the hierarchy of 2D
`planar transformations.
`The above transformations form a nested set of growps, Le., they are closed under com-
`position and have an inverse that is a member of the same group. (This will be important
`later when applying these transformations to images in Section 3,6.) Each (simpler) group is
`a subset of the more complex group below it.
`
`Co-vectors. While the above transformations can be used to transform points in a 2D
`plane, can they also be used directly to transform a line equation? Consider the homogeneous
`equation | -& = 0. If we transform «' = Ala, we obtain
`(2.22)
`i .2@=fwe=(a'Tya=i-z=0,
`i.c., (=HaE: Thus, the action of a projective transformation on a co-vector such as a 2D
`line or 3D normal can berepresented by the transposed inverse of the matrix, which ts equiv-
`alent to the adjoint of HY, since projective transformation matrices are homogeneous. Jim
`
`APPL-1012 / Page 12 of 211
`APPL-1012 / Page 12 of 211
`
`

`

`2.1 Geometric primitives and transformations
`
`Transformation
`
`Matrix
`
`#DoF Preserves
`
`Icon
`
`translation
`rigid (Euclidean)
`similarity
`affine
`projective
`
`[ETP bess
`[R[t],.,
`[sR|t],.,
`[Adie
`[ H lve
`
`2
`3
`4
`6
`8
`
`orientation
`lengths
`angles
`parallelism
`straight lines
`
`[|
`oO
`Oo
`i!
`|
`
`Table 2.1 Hierarchy of 2D coordinate transformations. Each transformation also preserves the properties listed
`in the rows below it, i.c., similarity preserves not only angles but also parallelism and straight lines, The 2 x 3
`matrices are extended with a third [07 1] row to form a full 3 x 3 matrix for homogeneous coordinate transforma-
`tions,
`
`Blinn (1998) describes (in Chapters 9 and 10) the ins and outs of notating and manipulating
`co-vectors.
`
`While the above transformations are the ones we use most extensively, a number of addi-
`tional transformations are sometimes used,
`
`Stretch/squash. This transformation changes the aspectratio of an image,
`#
`=
`=
`
`s,o +t,
`Syy + ty,
`
`y/
`
`and is a restricted form of an affine transformation. Unfortunately, it does not nest cleanly
`with the groups listed in Table 2.1.
`
`Planar surface flow. This cight-parameter transformation (Horn 1986; Bergen, Anan-
`dan, Hanna et al. 1992; Girod, Greiner, and Niemann 2000),
`
`a = ag+a,r+ agy + asx” + apry
`yi = a3+agr + agy + azz” + agzy,
`
`It can thus be thought of as a
`arises when a planar surface undergoes a small 31D motion.
`small motion approximation to a full homography. Its main attraction is that itis [imear in the
`motion parameters, a;, which are often the quantities being estimated.
`
`Bilinear interpolant. This cight-parameter transform (Wolberg 1990),
`
`a =
`/
`uv =
`
`ap + a2 + doy + agry
`
`ag + Ogu + ogy + O7ry,
`
`can be used to interpolate the deformation due to the motion of the four comer points of
`a square,
`(In fact, it can interpolate the motion of any four non-collinear points.) While
`
`APPL-1012 / Page 13 of 211
`APPL-1012 / Page 13 of 211
`
`

`

`36
`
`2 Image formation
`
`
`
`
`
` Transformation Matrix #DoF Preserves Icon
`
`
`
`
`
`translation
`rigid (Euclidean)
`similarity
`affine
`projective
`
`[1|t],.,
`[Rt],,
`[sR|t],.,
`L Ades
`[ Fi lise
`
`3
`6
`7
`12
`15
`
`L]
`orientation
`Oo
`lengths

`angles
`a}
`parallelism
`straight lines C|
`
`Table 2.2 Hierarchy of 3D coordinate transformations. Each transformation also preserves the properties listed
`in the rows below it, i.c., similarity preserves not only angles but also parallelism and straight lines. The 3 x 4
`matrices are extended with a fourth [0”1] row to form a full 4 x 4 matrix for homogeneous coordinate transfor-
`mations. The mnemonic icons are drawn in 2D but are meant to suggest transformations occurring in a full 3D
`cube,
`
`the deformation is linear in the motion parameters, it does not generally preserve straight
`lines (only lines parallel to the square axes), However, it is often quite useful, e.g., in the
`interpolation of sparse grids using splines (Section 8.3).
`
`2.1.3 3D transformations
`
`The set of three-dimensional coordinate transformations is very similar to that available for
`9D transformations and is summarized in Table 2.2. As in 2D, these transformations form a
`nested set of groups. Hartley and Zisserman (2004, Section 2.4) give a more detailed deserip-
`tion of this hierarchy.
`
`Translation.
`
`3D translations can be written as 2" = a: + ¢ or
`
`v=[i t]z2
`
`(2.23)
`
`where J is the (3 x 3) identity matrix and 0 is the zero vector.
`
`Rotation + translation. Also known as 3D rigid body motion or the 3D Euclidean trans-
`formation,it can be written as a’ = Aa: + ¢ or
`
`(2.24)
`o=[(R tle
`where R is a 3 x 3 orthonormal rotation matrix with RR?’ = I and |R| = 1. Note that
`sometimes it is more convenient to describe a rigid motion using
`
`«' = Riz —c) = Rx — Re,
`
`(2.25)
`
`where ¢ is the center of rotation (often the camera center).
`Compactly parameterizing a 3D rotation is a non-trivial task, which we describe in more
`detail below.
`
`APPL-1012 / Page 14 of 211
`APPL-1012 / Page 14 of 211
`
`

`

`2.1 Geometric primitives and transformations
`
`a7
`
`Scaled rotation. The 3D similarity transform can be expressed as a’ = sFta: + ¢ where
`313 an arbitrary scale factor. It can also be written as
`
`This transformation preserves angles between lines and planes.
`
`a =[sR t ]%.
`
`(2.26)
`
`‘
`
`Affine.
`Le,
`
`‘The affine transform is written as 2’ = AZ, where A is an arbitrary 3 = 4 matrix,
`
`J
`© =]
`
`491 42 a
`
`dio G1 Gin Gis
`Oop 91
`Gaz
`fgg
`
`@o0
`
`#
`
`(2.27)
`
`Parallel lines and planes remain parallel under affine transformations.
`
`Projective. This transformation, variously known as a 3D perspective transform, homog-
`raphy, or collineation, operates on homogeneous coordinates,
`
`&' = HG,
`
`(2.28)
`
`where f is an arbitrary 4 x 4 homogeneous matrix, As in 2D, the resulting homogeneous
`coordinate &' must be normalized in order to obtain an inhomogencousresult a. Perspective
`transformations preserve straight lines (i.c., they remain straight after the transformation).
`
`2.1.4 3D rotations
`
`The biggest difference between 2D and 3D coordinate transformations is that the parameter-
`ization of the 3D rotation matrix Ft is not as straightforward but several possibilities exist.
`
`Euler angles
`
`A rotation matrix can be formed as the product of three rotations around three cardinal axes,
`eg. 0, ¥, and z, ora, y, and «, This is generally a bad idea, as the result depends on the
`order in which the transforms are applied. What is worse, it is not always possible to move
`smoothly in the parameter space, i.c., sometimes one or more of the Euler angles change
`dramatically in response to a small changein rotation.' For these reasons, we do not even
`give the formulafor Euler angles in this book—interested readers can look in other textbooks
`or technical reports (Faugeras 1993; Diebel 2006). Note that,
`in some applications, if the
`rotations are known to be a set of uni-axial transforms, they can always be represented using
`an explicit set of rigid transformations,
`
`Axis/angle (exponential twist)
`
`A rotation can be represented by a rotation axis 7 and an angle @, or equivalently by a 3D
`vector w = Of, Figure 2.5 shows how we can compute the equivalent rotation. First, we
`project the vector v onto the axis 7 to obtain
`vy = Aa(A-v) = (AAT)y,
`
`(2.29)
`
`! In robotics, this is sometiones referred to as gimbal lock,
`
`APPL-1012 / Page 15 of 211
`APPL-1012 / Page 15 of 211
`
`

`

`38
`
`:
`
`2 Image formation
`
`
`
`Figure 2.5 Rotation around an axis 7 by an angle @.
`
`which is the component of v that is not affected by the rotation. Next, we compute the
`perpendicular residual of » from 7,
`
`vp =v—vy =(1—An")u.
`
`We can rotate this vector by 90° using the cross product,
`
`vy, =A xX v= [fi],0,
`
`(2.30)
`
`(2.31)
`
`where [72], is the matrix form ofthe cross product operator with the vector 7 = (fix, fiys fiz),
`
`Ty
`fly
`O
`
`[fi]x=|fiz 0 —ne, |. (2.32)
`
`a a
`
`
`
`
`
`Note that rotating this vector by another 90° is equivalent to taking the cross product again,
`
`Use = Ti ¥ ty = jfij2v =—1,
`
`and hence
`
`vy SUL =v7+t+uyy = (I+ [fAl* jw.
`We can now compute the in-plane componentof the rotated vector w as
`
`i) =cosdv, +sindv, = (sin é[n], — cos O[fi]®, )v.
`
`Putting all these terms together, we obtain the final rotated vector as
`
`ut + oy = (2 + sin O[fi], + (1 — cos 6) {va)%, )v.
`
`(2.33)
`
`We can therefore write the rotation matrix corresponding to a rotation by @ around an axis
`as
`
`R(h,0) = 1 +sin off], + (1 — cos @)[njz,,
`
`(2.34)
`
`which is known as Rodriguez'sformula (Ayache 1989).
`The product of the axis 71 and angle #, w = 07 = (w,,wy,wz), is a minimal represen-
`tation for a 3D rotation. Rotations through common angles such as multiples of 90° can be
`represented exactly (and converted to exact matrices) if # is stored in degrees, Unfortunately,
`
`APPL-1012 / Page 16 of 211
`APPL-1012 / Page 16 of 211
`
`

`

`2.1 Geometric primitives and transformations
`
`39
`
`this representation is not unique, since we can always add a multiple of 360° (27 radians) to
`(and getthe same rotation matrix. As well, (7, @) and (—#, —@) represent the same rotation.
`However, for small rotations (e.g., corrections to rotations), this is an excellent choice,
`In particular, for small (infinitesimal or instantaneous) rotations and #? expressed in radians,
`&
`Rodriguez's formula simplifies to
`
`R(w) = I+ sin bff], = I+ [6A], =
`
`]
`
`ws
`
`—tily
`
`til,
`
`ly
`
`i =, | j
`
`Wy
`
`1
`
`(2.35)
`
`which gives a nice linearized relationship between the rotation parameters w and #2, We can
`also write R(w)v =v + w x v, which is handy when we want to compute the derivative of
`Ae with respect to uw,
`
`Oo
`
`y
`
`ss —-
`
`-r
`
`O
`
`Another way to derive a rotation through a finite angle is called the exponential twist
`(Murray, Li, and Sastry 1994), A rotation by an angle ( is equivalent to k rotations through
`6/k. In the limit as k — co, we obtain
`
`(2.37)
`R(f, 6) = lim(1 + 7lof)" = exp [w].
`If we expand the matrix exponential as a Taylor series (using the identity [fi]£*? = —[”i]*,
`k > 0, and again assuming @ is in radians), -
`
`BP
`&
`exp [w], = T+ o[f]x + >lAly + alls +°
`gs
`.
`_
`02 #
`= 14+(00-F + Nile t+ (> - At Ax
`= I+sin6[A], +(1—cosé)[fj2,
`
`(2.38)
`
`which yields the familiar Rodriguez's formula.
`
`Unit quaternions
`
`The unit quaternion representation is closely related to the angle/axis representation. A unit
`quaternion is a unit length 4-vector whose components can be written as g = (qx, Gy; Gz, Iw)
`or gq = (2, y, 2, w) for short. Unit quaternions live on the unit sphere ||q|| = 1 and antipodal
`(opposite sign) quaternions, q and —q,represent the same rotation (Figure 2.6), Other than
`this ambiguity (dual covering), the unit quaternion representation of a rotation is unique.
`Furthermore, the representation is continuous, i.e., as rotation matrices vary continuously,
`one can find a continuous quaternion representation, although the path on the quaternion
`sphere may wrap all the way around before returning to the “origin” g, = (0,0,0,1). For
`these and other réasons given below, quaternions are a very popular representation for pose
`and for pose interpolation in computer graphics (Shoemake 1985).
`
`APPL-1012 / Page 17 of 211
`APPL-1012 / Page 17 of 211
`
`

`

`40
`
`2 Image formation
`
`
`
`Figure 2.6 Unit quaternions live on the unit sphere ||q{| = 1. This figure shows a smooth trajectory through the
`three quaternions gp, q,, and q5. The antipodal point to qj, namely —qg, represents the same rotation as qo.
`
`Quaternions can be derived from the axis/angle representation through the formula
`
` #,
`a
`(2.39)
`q = (v,w) = (sin qthons 3)
`where 7 and @ are the rotation axis and angle, Using the trigonometric identities sing =
`2sin £ cos § and (1 — cos@) = 2sin’$, Rodriguez's formula can be convertedto
`
`R(A,0)=I+sin6[A], + (1 —cosé)[Al?
`T+ 2Qwie], + 2[v]2.
`(2.40)
`
`This suggests a quick way to rotate a vector v by a quaternion using a series of cross products,
`scalings, and additions. To obtain a formula for R(q) as.a function of (x,y, z, w), recall that
`
`rz
`zy
`—y—z7
`y
`-z
`0
`
`
`[v];=| O a|and [v]} =2 ry —x* — 24 yz
`
`
`-y x
`0
`zz
`yz
`—e? —y"
`
`
`
`We thus obtain
`
`R(qg)=|
`
`1-2Ay?+2*)
`Aeyt+zw)
`Oaz—yu)
`
`a(zz + yw)
`2(ry—zw)
`BAye—-cw)
`1-2(27427)
`—Ayz+aw) 1-20 + y2)
`
`|.
`
`(2.41)
`
`The diagonal terms can be made more symmetrical by replacing 1 — 2(y* + 2*) with (2? +
`wu? —y* — 2°), ete.
`The nicest aspect of unit quaternions is that there is a simple algebra for composing rota-
`tions expressed as unit quaternions. Given two quaternions gy = (vp, wo) and q, = (v1, Ww),
`the quaternion multiply operator is defined as
`
`Go = Ipod, = (vo X Vp + wWov1 + WV, WOU— Vo + V1),
`
`(2.42)
`
`with the property that R(q,) = R(g,)A(q,). Note that quaternion multiplication is not
`commutative, just as 3D rotations and matrix multiplications are not.
`
`APPL-1012 / Page 18 of 211
`APPL-1012 / Page 18 of 211
`
`

`

`2.1 Geometric primitives and transformations
`
`41
`
`procedure slerp(qp,q 1, @):
`
`return q2 = 440
`
`» Fe = 4/9 = (Vp, Wr)
`
`. ifw, < O then g, + —q,
`
`. 6, = 2tan™(\Ivp|/w,)
`
`~ fp = N(uy) = v-/||v5|]
`
`.= od,
`
`She (sin 22-72,., cos fa.)
`
`Algorithm 2.1 Spherical linear interpolation (slerp). ‘The axis and total angle are first computed from the quater-
`nion ratio. (This computation can be lifted outside an inner loop that generates a set of interpolated position for
`animation.) An incremental quaternion is then computed and multiplied by the starting rotation quaternion.
`
`Taking the inverse of a quaternion is easy: Just flip the sign of v or w (but not both!),
`(You can verify this has the desired effect of transposing the Ri matrix in (2.41).) Thus, we
`can also define guaternion division as
`
`92 = 40/41 = Gon: = (U9 Xv, + wots — wg, —wyuh — Uo- 4).
`
`(2.43)
`
`This is useful when the incremental rotation between tworotations is desired,
`In particular, if we want to determine a rotation that is partway between two given rola-
`tions, we can compute the incremental rotation, take a fraction of the angle, and compute the
`new rotation. This procedure is called spherical linear interpolation or slerp for short (Shoe-
`make 1985) and is given in Algorithm 2.1. Note that Shoemake presents two formulas other
`than the one given here. The first exponentiates q,. by alpha before multiplying the original
`quaternion,
`
`while the second treats the quaternions as 4-vectors on a sphere and uses
`
`d2 = 4r os
`
`2
`
`2
`
`sin(1 — a)é
`~
`sin@
`
`0
`
`
`sin ad}
`sin @
`
`li
`
`(2.44)
`
`(2
`
`45
`
`}
`
`where # = cos~'(qg, « q,) and the dot productis directly between the quaternion 4-vectors,
`AL of these formulas give comparable results, although care should be taken when g, and q,
`are close together, which is why I prefer to use an arctangent to establish the rotation angle.
`
`Which rotation representation is better?
`
`The choice of representation for 3D rotations depends partly on the application.
`The axis/angle representation is minimal, and hence does not require any additional con-
`straints on the patameters (no need to re-normalize after each update).
`If the angle is ex-
`pressed in degrees, it is easier to understand the pose (say, 90° twist around x-axis), and also
`
`APPL-1012 / Page 19 of 211
`APPL-1012 / Page 19 of 211
`
`

`

`42
`
`2 Image formation
`
`easier to express exact rotations. When the angle is in radians, the derivatives of A. with
`respect to w can easily be computed (2.36).
`Quaternions,on the other hand, are better if you wantto keep track of a smoothly moving
`camera, since there are no discontinuities in the representation.It is also easier to interpolate
`between rotations and to chain rigid transformations (Murray, Li, and Sastry 1994; Bregler
`and Malik 1998).
`Myusual preference is to use quaternions, but to update their estimates using an incre-
`mental rotation, as described in Section 6.2.2,
`
`2.1.5 3D to 2D projections
`
`Now that we know how to represent 2D and 3D geometric primitives and how to transform
`them spati

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket