`
`
`
`
`
`
`DECLARATION OF GORDON MACPHERSON
`
`I, Gordon MacPherson, am over twenty-one (21) years of age. I have never been
`convicted of a felony, and I am fully competent to make this declaration. I declare the following
`to be true to the best of my knowledge, information and belief:
`
`1. I am Director Board Governance & IP Operations of The Institute of Electrical and
`Electronics Engineers, Incorporated (“IEEE”).
`
`2. IEEE is a neutral third party in this dispute.
`
`3. I am not being compensated for this declaration and IEEE is only being reimbursed
`for the cost of the article I am certifying.
`
`4. Among my responsibilities as Director Board Governance & IP Operations, I act as a
`custodian of certain records for IEEE.
`
`5. I make this declaration based on my personal knowledge and information contained
`in the business records of IEEE.
`
`6. As part of its ordinary course of business, IEEE publishes and makes available
`technical articles and standards. These publications are made available for public
`download through the IEEE digital library, IEEE Xplore.
`
`7. It is the regular practice of IEEE to publish articles and other writings including
`article abstracts and make them available to the public through IEEE Xplore. IEEE
`maintains copies of publications in the ordinary course of its regularly conducted
`activities.
`
`8. The article below has been attached as Exhibit A to this declaration:
`
`
A. W. Hoff, et al., “Analysis of head pose accuracy in augmented reality,”
IEEE Transactions on Visualization and Computer Graphics, Vol. 6, Issue
4, October – December 2000.
`
`
`
`9. I obtained a copy of Exhibit A through IEEE Xplore, where it is maintained in the
`ordinary course of IEEE’s business. Exhibit A is a true and correct copy of the
`Exhibit, as it existed on or about December 29, 2021.
`
`10. The article and abstract from IEEE Xplore show the date of publication. IEEE
`Xplore populates this information using the metadata associated with the publication.
`
`DocuSign Envelope ID: 7FCDEB04-9D8A-4D7A-9401-7811BCC4CFCA
`
` 1
`
`META 1028
`META V. THALES
`
`
`
`
11. W. Hoff, et al., “Analysis of head pose accuracy in augmented reality” was published
in IEEE Transactions on Visualization and Computer Graphics, Vol. 6, Issue 4. IEEE
Transactions on Visualization and Computer Graphics, Vol. 6, Issue 4 was published
in October – December 2000. Copies of this publication were made available no later
than the last day of the last publication month. The article is currently available for
public download from the IEEE digital library, IEEE Xplore.
`
`12. I hereby declare that all statements made herein of my own knowledge are true and
`that all statements made on information and belief are believed to be true, and further
`that these statements were made with the knowledge that willful false statements and
`the like are punishable by fine or imprisonment, or both, under 18 U.S.C. § 1001.
`
`I declare under penalty of perjury that the foregoing statements are true and correct.
`
`
`
`
Executed on: 1/6/2022
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`EXHIBIT A
`
`
`
`
`
`Analysis of head pose accuracy in augmented reality
`Publisher: IEEE
`
W. Hoff; T. Vincent
`
`
Abstract:
A method is developed to analyze the accuracy of the relative head-to-object
position and orientation (pose) in augmented reality systems with head-
mounted displays. From probabilistic estimates of the errors in optical tracking
sensors, the uncertainty in head-to-object pose can be computed in the form of
a covariance matrix. The positional uncertainty can be visualized as a 3D
ellipsoid. One useful benefit of having an explicit representation of uncertainty is
that we can fuse sensor data from a combination of fixed and head-mounted
sensors in order to improve the overall registration accuracy. The method was
applied to the analysis of an experimental augmented reality system,
incorporating an optical see-through head-mounted display, a head-mounted
CCD camera, and a fixed optical tracking sensor. The uncertainty of the pose of
a movable object with respect to the head-mounted display was analyzed. By
using both fixed and head mounted sensors, we produced a pose estimate that
is significantly more accurate than that produced by either sensor acting alone.
`
Published in: IEEE Transactions on Visualization and Computer Graphics (Volume: 6, Issue: 4, Oct-Dec 2000)

Page(s): 319 - 334

INSPEC Accession Number: 6814274

Date of Publication: Oct-Dec 2000

DOI: 10.1109/2945.895877

Publisher: IEEE
`
`
`
`
`
`
`
`
`
`
`
`
`IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 6, NO. 4, OCTOBER-DECEMBER 2000
`
`319
`
`Analysis of Head Pose Accuracy
`in Augmented Reality
`
`William Hoff, Member, IEEE, and Tyrone Vincent, Member, IEEE
`
`Abstract—A method is developed to analyze the accuracy of the relative head-to-object position and orientation (pose) in augmented
`reality systems with head-mounted displays. From probabilistic estimates of the errors in optical tracking sensors, the uncertainty in
`head-to-object pose can be computed in the form of a covariance matrix. The positional uncertainty can be visualized as a 3D ellipsoid.
`One useful benefit of having an explicit representation of uncertainty is that we can fuse sensor data from a combination of fixed and
`head-mounted sensors in order to improve the overall registration accuracy. The method was applied to the analysis of an
`experimental augmented reality system, incorporating an optical see-through head-mounted display, a head-mounted CCD camera,
`and a fixed optical tracking sensor. The uncertainty of the pose of a movable object with respect to the head-mounted display was
`analyzed. By using both fixed and head mounted sensors, we produced a pose estimate that is significantly more accurate than that
`produced by either sensor acting alone.
`
`Index Terms—Augmented reality, pose estimation, registration, uncertainty analysis, error propagation, calibration.
`
`
`1 INTRODUCTION
`
AUGMENTED reality is a term used to describe systems in
which computer-generated information is superimposed
on top of the real world [1]. One form of enhancement
is to use computer-generated graphics to add virtual
`objects (such as labels or wire-frame models) to the existing
`real world scene. Typically, the user views the graphics
`with a head-mounted display (HMD), although some
`systems have been developed that use a fixed monitor
`(e.g., [2], [3], [4], [5]). The combining of computer-generated
`graphics with real-world images may be accomplished with
`either optical [6], [7], [8] or video technologies [9], [10].
`A basic requirement for an AR system is to accurately
`align virtual and real-world objects so that they appear to
`coexist in the same space and merge together seamlessly.
`This requires that the system accurately sense the position
`and orientation (pose) of the real world object with respect
`to the user’s head. If the estimated pose of the object is
`inaccurate, the real and virtual objects may not be registered
`correctly. For example, a virtual wire-frame model could
`appear to float some distance away from the real object.
`This is clearly unacceptable in applications where the user
`is trying to understand the relationship between real and
`virtual objects. Registration inaccuracy is one of the most
`important problems limiting augmented reality applica-
`tions today [11].
`This paper shows how one can estimate the registration
`accuracy in an augmented reality system, based on the
`characteristics of the sensors used in the system. Only
quasi-static registration is considered in this paper; that is,
objects are stationary when viewed, but can freely be moved.
We develop an analytical model and show how the model can
be used to properly combine data from multiple sensors to
improve registration accuracy and gain insight into the effects
of object and sensor geometry and configuration. A preliminary
version of this paper was presented at the First International
Workshop on Augmented Reality [12].

The authors are with the Engineering Division, Colorado School of Mines,
1500 Illinois St., Golden, CO 80401. E-mail: {whoff, tvincent}@mines.edu.

Manuscript received 1 Feb. 1999; revised 6 July 2000; accepted 10 July 2000.
For information on obtaining reprints of this article, please send e-mail to:
tvcg@computer.org, and reference IEEECS Log Number 109094.
`
`1.1 Registration Techniques in Augmented Reality
`To determine the pose of an object with respect to the user’s
`head, tracking sensors are necessary. Sensor technologies
`that have been used in the past
`include mechanical,
`magnetic, acoustic, and optical [13]. We concentrate on
`optical sensors (such as cameras and photo-effect sensors)
`since they have the best overall combination of speed,
`accuracy, and range [7], [14], [15].
`There has been much work in the past in the photo-
`grammetry and computer vision fields on methods for
`object recognition and pose estimation from images. Some
`difficult problems (which are not addressed here) include
`how to extract features from the images and determine the
`correspondence between extracted image features and
`features on the object. In many practical applications, these
`problems can be alleviated by preplacing distinctive optical
`targets, such as light emitting diodes (LEDs) or passive
`fiducial markings, in known positions on the object. The 3D
`locations of the target points on the object must be carefully
`measured, in some coordinate frame attached to the object.
`In this paper, we will assume that point features have been
`extracted and the correspondences known so that the only
`remaining problem is to determine the pose of the object
`with respect to the HMD.
`One issue is whether the measured points are two-
`dimensional (2D) or three-dimensional (3D). Simple passive
`optical sensors, such as video cameras and photo-effect
`sensors, can only sense the direction to a target point and
`not its range. The measured data points are 2D, i.e., they
`
1077-2626/00/$10.00 © 2000 IEEE
`Authorized licensed use limited to: Everything Demo User. Downloaded on December 29,2021 at 14:43:37 UTC from IEEE Xplore. Restrictions apply.
`
`
`
`
`
`represent the locations of the target points projected onto
`the image plane. On the other hand, active sensors, such as
`laser range finders, can directly measure direction and
`range, yielding fully 3D target points. Another way to
`obtain 3D data is to use triangulation; for example, by using
`two or more passive sensors (stereo vision). The accuracy of
`locating the point is improved by increasing the separation
`(baseline) between the sensors.
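The depth-accuracy benefit of a wider baseline can be illustrated with first-order error propagation for a rectified stereo pair, where depth is Z = fB/d for focal length f, baseline B, and disparity d. The numbers below (focal length, noise level, distances) are illustrative assumptions, not values from the paper:

```python
import numpy as np

def depth_std(Z, baseline, focal_px, disparity_std_px):
    """First-order propagation of disparity noise into depth noise for a
    rectified stereo pair: Z = f*B/d, so |dZ/dd| = Z**2 / (f*B)."""
    return (Z**2 / (focal_px * baseline)) * disparity_std_px

Z = 1.0        # target distance in meters (illustrative)
f = 800.0      # focal length in pixels (illustrative)
sigma_d = 0.1  # disparity measurement noise in pixels (illustrative)

narrow = depth_std(Z, 0.10, f, sigma_d)  # 10 cm baseline
wide = depth_std(Z, 1.00, f, sigma_d)    # ~1 m baseline, Optotrak-like
print(f"sigma_Z at 10 cm baseline: {narrow * 1000:.3f} mm")
print(f"sigma_Z at  1 m baseline: {wide * 1000:.3f} mm")
```

With these assumed numbers, widening the baseline from 10 cm to 1 m shrinks the depth uncertainty by the same factor of ten, consistent with the text's observation.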
`Once the locations of
`the target points have been
`determined (either 2D or 3D), the next step is to determine
`the full six degree-of-freedom (DOF) pose of the object with
`respect to the sensor. Again, we assume that we know the
`correspondence of the measured points to the known 3D
`points on the object model. If one has 3D point data, this
`procedure is known as the “absolute orientation” problem
`in the photogrammetry literature. If one has 2D target
`points, this procedure is known as the “exterior orientation”
`problem [16].
`Another issue is where to locate the sensor and target.
`One possibility is to mount the sensor at a fixed known
`location in the environment and put targets on both the
`HMD and on the object of interest (a configuration called
`“outside-in” [14]). We measure the pose of the HMD with
`respect to the sensor, and the pose of the object with respect
`to the sensor, and derive the relative pose of the object with
`respect to the HMD. Another possibility is to mount the
`sensor on the HMD and the target on the object of interest (a
`configuration called “inside-out”). We measure the pose of
`the object with respect to the sensor and use the known
`sensor-to-HMD pose to derive the relative pose of the object
`with respect to the HMD. Both approaches have been tried
`in the past and each has advantages and disadvantages.
`With a fixed sensor (outside-in approach), there is no
`limitation on size and weight of
`the sensor. Multiple
`cameras can be used, with a large baseline, to achieve
`highly accurate 3D measurements via triangulation. For
`example, commercial optical measurement systems, such as
`Northern Digital’s Optotrak, have baselines of approxi-
`mately 1 meter and are able to measure the 3D positions of
`LED markers to an accuracy of approximately 0.15 mm. The
`orientation and position of a target pattern is then derived
`from the individual point positions. A disadvantage with
`this approach is that head orientation must be inferred
`indirectly from the point positions.
`The inside-out approach has good registration accuracy
`because a slight rotation of a head-mounted camera causes
`a large shift of a fixed target in the image. However, a
`disadvantage of this approach is that large translation
`errors occur along the line of sight of the camera. To avoid
`this, additional cameras could be added with lines of sight
`orthogonal to each other.
`
`1.2 Need for Accuracy Analysis and Fusion
`In order to design an augmented reality system that meets
`the registration requirements for a given application, we
`would like to be able to estimate the registration accuracy
`for a given sensor configuration. For example, we would
`like to estimate the probability distribution of the 3D error
`distance between a generated virtual point and a corre-
`sponding real object point. Another measure of interest is
`the overlay error; that is, the 2D distance between the
`
`projected virtual point and the projected real point on the
`HMD image plane, which is similar to the image alignment
`error metrics that appear in other work [7], [9], [17].
`Another reason to have an analytical representation of
`uncertainty is for fusing data from multiple sensors. For
`example, data from head-mounted and fixed sensors might
`be combined to derive a more accurate estimate of object-to-
`HMD pose. The uncertainties of these two sensors might be
`complementary so that, by combining them, we can derive a
`pose that is much more accurate than that from each sensor
`used alone. In order to do this, a mathematical analysis is
`required of uncertainties associated with the measurements
`and derived poses. Effectively, we can create a hybrid
`system that combines the “inside-out” and “outside-in”
`approaches.
`
`1.3 Relationship to Past Work and Specific
`Contributions
`Augmented reality is a relatively new field, but the problem
`of registration has received ample attention, with a number
`of authors taking an optical approach. Some researchers
`have used photocells or photo-effect sensors which track
`light-emitting diodes (LEDs) placed on the head, object of
`interest, or both [7], [14], [15]. Other researchers have used
`cameras and computer vision techniques to detect LEDs or
`passive fiducial markings [5], [8], [18], [19], [20], [21]. The
`resulting detected features, however they are obtained, are
`used to determine the relative pose of the object to the
`HMD. A number of researchers have evaluated their
`registration accuracy experimentally [17], [7], with Monte-
`Carlo simulations [19], or both [18]. However, no one has
`studied the effect of sensor-to-target configuration on
`registration accuracy.
`In this paper, we develop an
`analytical model to show how sensor errors propagate
`through to registration errors, given a statistical distribution
`of the sensor errors and the sensor-to-target configuration.
`Some researchers avoid the problem of determining pose
`altogether and instead concentrate on aligning the 2D image
`points using affine projections [22], [23]. Although this
`approach works well for video-based augmented reality
`systems, in optical see-through HMD systems, it would not
`work as well because the image as seen by the head-
`mounted camera may be different than the image seen by
`the user directly through the optical combiner.
`A number of researchers have developed error models
`for HMD-based augmented reality systems. Some research-
`ers have looked at the optical characteristics of HMDs in
`order to calculate viewing transformations and calibration
`techniques [24], [25]. Holloway [17] analyzed the causes of
`registration error in a see-through HMD system, due to the
`effects of misalignment, delay, and tracker error. However,
`he did not analyze the causes of tracker error, merely its
`effect on the overall registration accuracy. This work, on the
`other hand, focuses specifically on the tracker error and
`does not look at the errors in other parts of the system, or
`attempt to derive an overall end-to-end error model.
`In the computer vision field, the problem of determining
`the position and orientation from a set of given point or line
`correspondences has been well-studied. Some researchers
`have developed analytical expressions for the uncertainty of
`a 3D feature position as derived from image data [26]. Other
`
`
`
`
`
researchers have evaluated the accuracy of pose estimation
algorithms using Monte Carlo simulations [27], [28], [29],
[30]. Few researchers have addressed the issue of error
propagation in pose estimation. We follow the method
suggested by Haralick and Shapiro [16], who outline how to
derive the uncertainty of an estimated quantity (such as a
pose) from the given uncertainties in the measured data.

Kalman filtering [31] is a standard technique for optimal
estimation. It has been used to estimate head pose in
augmented and virtual reality applications [7], [32], [33].
From a sequence of sensor measurements, these techniques
also estimate the uncertainty of the head pose. This is
similar to the work described in this paper in the sense that
a Kalman filter can be interpreted as a method for obtaining
a maximum likelihood estimate of the state in a dynamic
system, given input-output data [34]. Our system is static
and so we do not have a model of the state dynamics. We
fuse data from two measurements, rather than data from a
measurement and a prediction from past data.

In this work, a method is developed to explicitly
compute uncertainties of pose estimates, propagate these
uncertainties from one coordinate system to another, and
fuse pose estimates from multiple sensors. The contribution
of this work is the application of this method to the
registration problem in augmented reality. Specifically:

. The method shows how to estimate the uncertainty
of object-to-HMD pose from the geometric config-
uration of the optical sensors and the pose estima-
tion algorithms used. To help illustrate the method,
we describe its application to a specific augmented
reality system.
. We show how data from multiple different sensors
can be fused, taking into account the uncertainties
associated with each, to yield an improved object-to-
HMD pose. In particular, it is shown that a hybrid
sensing system combining both head-mounted and
fixed sensors can improve registration accuracy over
that from either sensor used alone.
. We demonstrate mathematically some insights re-
garding the characteristics of registration sensors. In
particular, we show that the directions of greatest
uncertainty for a head-mounted and fixed sensor are
nearly orthogonal and that these can be fused in a
simple way to improve the overall accuracy.

The remainder of this paper is organized as follows:
Section 2 provides a background on pose estimation, with a
description of the terminology used in the paper. Section 3
develops the method for estimating the uncertainty of a
pose, transforming it from one coordinate frame to another,
and fusing two pose estimates. Section 4 describes the
particular experimental augmented reality system that was
used to test the registration method—that of a surgical aid.
Section 5 illustrates the application of the method to the
surgical aid system. A typical configuration is analyzed and
the predicted accuracy of the combined (hybrid) pose
estimate is found to be much improved over that obtained
by either sensor alone. Finally, Section 6 provides a
discussion.

2 BACKGROUND ON POSE ESTIMATION

2.1 Representation of Pose
The pose of a rigid body {A} with respect to another
coordinate system {B} can be represented by a six-element
vector ${}^B_A\mathbf{x} = ({}^Bx_{Aorg}, {}^By_{Aorg}, {}^Bz_{Aorg}, \alpha, \beta, \gamma)^T$, where
${}^B\mathbf{p}_{Aorg} = ({}^Bx_{Aorg}, {}^By_{Aorg}, {}^Bz_{Aorg})^T$ is the origin of
frame {A} expressed in frame {B}, and $(\alpha, \beta, \gamma)$ are the angles of
rotation of {A} about the (z, y, x) axes of {B}. An alternative
representation of orientation is to use three elements of a
quaternion; the conversion between Euler angles and
quaternions is straightforward [35].

Equivalently, pose can be represented by a 4 × 4
homogeneous transformation matrix [35]:

$${}^B_A H = \begin{pmatrix} {}^B_A R & {}^B\mathbf{p}_{Aorg} \\ 0 & 1 \end{pmatrix}, \tag{1}$$

where ${}^B_A R$ is the 3 × 3 rotation matrix corresponding to the
angles $(\alpha, \beta, \gamma)$. In this paper, we shall use the letter x to
designate a six-element pose vector and the letter H to
designate the equivalent 4 × 4 homogeneous transforma-
tion matrix.

Homogeneous transformations are a convenient and
elegant representation. Given a homogeneous point
${}^A p = ({}^Ax_P, {}^Ay_P, {}^Az_P, 1)^T$, represented in coordinate system
{A}, it may be transformed to coordinate system {B} with a
simple matrix multiplication ${}^B p = {}^B_A H \, {}^A p$. The homoge-
neous matrix representing the pose of frame {B} with
respect to frame {A} is just the inverse of the pose of {A}
with respect to {B}, i.e., ${}^A_B H = ({}^B_A H)^{-1}$. Finally, if we know the
pose of {A} with respect to {B} and the pose of {B} with
respect to {C}, then the pose of {A} with respect to {C} is
easily given by the matrix multiplication ${}^C_A H = {}^C_B H \, {}^B_A H$.
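These homogeneous-matrix operations (building the 4 × 4 matrix from a six-element pose, composing poses, and inverting them) can be sketched in a few lines of NumPy. The z-y-x Euler-angle convention follows the paper's description; the specific poses and the point are made-up examples:

```python
import numpy as np

def pose_to_H(pose):
    """Build the 4x4 homogeneous matrix of eq. (1) from a six-element pose
    (x, y, z, alpha, beta, gamma): rotations about the z, y, x axes."""
    x, y, z, a, b, g = pose
    Rz = np.array([[np.cos(a), -np.sin(a), 0],
                   [np.sin(a),  np.cos(a), 0],
                   [0, 0, 1]])
    Ry = np.array([[ np.cos(b), 0, np.sin(b)],
                   [0, 1, 0],
                   [-np.sin(b), 0, np.cos(b)]])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(g), -np.sin(g)],
                   [0, np.sin(g),  np.cos(g)]])
    H = np.eye(4)
    H[:3, :3] = Rz @ Ry @ Rx
    H[:3, 3] = [x, y, z]
    return H

# Made-up poses: {A} in {B}, and {B} in {C}.
H_BA = pose_to_H([0.1, 0.0, 0.5, 0.2, 0.0, 0.0])
H_CB = pose_to_H([0.0, 0.3, 0.0, 0.0, 0.1, 0.0])

# Composition: pose of {A} with respect to {C}.
H_CA = H_CB @ H_BA

# Inversion: pose of {B} with respect to {A}.
H_AB = np.linalg.inv(H_BA)

# A homogeneous point round-trips between frames.
p_A = np.array([1.0, 2.0, 3.0, 1.0])
p_B = H_BA @ p_A
assert np.allclose(H_AB @ p_B, p_A)
```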
`
2.2 Pose Estimation Algorithms
The 2D-to-3D pose estimation problem is to determine the
pose of a rigid body, given an image from a single camera
(this is also called the “exterior orientation” problem in
photogrammetry). Specifically, we are given a set of 3D
known points on the object (in the coordinate frame of the
object) and the corresponding set of 2D measured image
points from the camera, which are the perspective projec-
tions of the 3D points. The internal parameters of the
camera (focal length, principal point, etc.) are known. The
goal is to find the pose of the object with respect to the
camera, ${}^{cam}_{obj}\mathbf{x}$. There are many solutions to the problem; in
this work, we used the algorithm described by Haralick and
Shapiro [16], which uses an iterative nonlinear least squares
method. The algorithm effectively minimizes the squared
error between the measured 2D point locations and the
predicted 2D point locations.
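The paper uses Haralick and Shapiro's iterative nonlinear least squares algorithm; as a rough sketch of the same idea (not the authors' implementation), a Gauss-Newton loop on the 2D reprojection error with a finite-difference Jacobian might look like the following. The pinhole camera with focal length in pixels, the z-y-x Euler convention, and the test geometry are all assumptions for illustration:

```python
import numpy as np

def project(points_obj, pose, f=800.0):
    """g(q, x): rotate/translate object points by the pose (z-y-x Euler
    angles plus translation), then apply a perspective projection with
    focal length f in pixels."""
    x, y, z, a, b, g = pose
    Rz = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
    Ry = np.array([[np.cos(b), 0, np.sin(b)], [0, 1, 0], [-np.sin(b), 0, np.cos(b)]])
    Rx = np.array([[1, 0, 0], [0, np.cos(g), -np.sin(g)], [0, np.sin(g), np.cos(g)]])
    p_cam = points_obj @ (Rz @ Ry @ Rx).T + np.array([x, y, z])
    return f * p_cam[:, :2] / p_cam[:, 2:3]

def estimate_pose(points_obj, points_img, pose0, iters=50):
    """Gauss-Newton minimization of the squared reprojection error; the
    Jacobian M is built column-by-column with central differences."""
    pose = np.asarray(pose0, dtype=float)
    for _ in range(iters):
        residual = (points_img - project(points_obj, pose)).ravel()
        M = np.zeros((residual.size, 6))
        for j in range(6):
            step = np.zeros(6)
            step[j] = 1e-6
            M[:, j] = (project(points_obj, pose + step)
                       - project(points_obj, pose - step)).ravel() / 2e-6
        pose = pose + np.linalg.lstsq(M, residual, rcond=None)[0]
    return pose

# Made-up target: five 3D points on an object about 1 m from the camera.
points_obj = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0],
                       [0.0, 0.0, 0.1], [0.1, 0.1, 0.05]])
true_pose = np.array([0.02, -0.01, 1.0, 0.1, -0.05, 0.2])
points_img = project(points_obj, true_pose)

# Recover the pose from a deliberately wrong initial guess.
est = estimate_pose(points_obj, points_img, [0.0, 0.0, 0.8, 0.0, 0.0, 0.0])
```

On noiseless synthetic data the loop drives the reprojection residual to essentially zero; a production implementation would add damping (Levenberg-Marquardt) and a convergence test instead of a fixed iteration count.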
The 3D-to-3D pose estimation problem is to determine
the pose of a rigid body, given a set of 3D point
measurements¹ (this is also called the “absolute orientation”
problem in photogrammetry). Specifically, we are given a
set of 3D known points on the object $\{{}^{obj}p_i\}$ and the

1. These 3D point measurements may have been obtained from a
previous triangulation process using a stereo vision sensor.
`
`
`
`
corresponding set of 3D measured points from the sensor
$\{{}^{sen}p_i\}$. The goal is to find the pose of the object with respect
to the sensor, ${}^{sen}_{obj}\mathbf{x}$. There are many solutions to the problem;
in this work we used the solution by Horn [36], which uses
a quaternion-based method.² The algorithm effectively
minimizes the squared error between the measured 3D
point locations and the predicted 3D point locations.

3 DETERMINATION AND MANIPULATION OF POSE
UNCERTAINTY

Given that we have estimated the pose of an object using
one of the methods above, what is the uncertainty of the
pose estimate? We can represent the uncertainty of a six-
element pose vector x by a 6 × 6 covariance matrix
$C_x = E(\Delta x\,\Delta x^T)$, which is the expectation of the square of the
difference between the estimate and the true vector.

This section describes methods to estimate the covar-
iance matrix of a pose, given the estimated uncertainties in
the measurements, transform the covariance matrix from
one coordinate frame to another, and combine two pose
estimates.

3.1 Computation of Covariance
Assume that we have n measured data points from the
sensor $\{p_i\}$ and the corresponding points on the object
$\{q_i\}$. The object points $q_i$ are 3D; the data points $p_i$ are
either 3D (in the case of 3D-to-3D pose estimation) or 2D
(in the case of 2D-to-3D pose estimation). We assume that
the noise in each measured data point is independent and
that the noise distribution of each point is given by a
covariance matrix $C_p$.

Let $p_i = g(q_i, x)$ be the function which transforms object
points into measured data points for a hypothesized pose x.
In the case of 3D-to-3D pose estimation, this is just a
multiplication of $q_i$ by the corresponding homogeneous
transformation matrix. In the case of 2D-to-3D pose
estimation, the function is composed of a transformation
followed by a perspective projection. The pose estimation
algorithms described above solve for $x_{est}$ by minimizing the
sum of the squared errors. Assume that we have solved for
$x_{est}$ using the appropriate algorithm (i.e., 2D-to-3D or 3D-to-
3D). We then linearize the equation about the estimated
solution $x_{est}$:

$$p_i + \Delta p_i = g(q_i, x_{est} + \Delta x) \approx g(q_i, x_{est}) + \left[\frac{\partial g}{\partial x}\right]^T_{q_i, x_{est}} \Delta x. \tag{2}$$

Since $p_i \approx g(q_i, x_{est})$, the equation reduces to

$$\Delta p_i = \left[\frac{\partial g}{\partial x}\right]^T_{q_i, x_{est}} \Delta x = M_i\,\Delta x, \tag{3}$$

where $M_i$ is the Jacobian of g, evaluated at $(q_i, x_{est})$.
Combining all the measurement equations:

$$\begin{pmatrix} \Delta p_1 \\ \vdots \\ \Delta p_n \end{pmatrix} = \begin{pmatrix} M_1 \\ \vdots \\ M_n \end{pmatrix} \Delta x \;\Longrightarrow\; \Delta P = M\,\Delta x. \tag{4}$$

Solving for $\Delta x$ in a least squares sense, we get
$\Delta x = (M^T M)^{-1} M^T \Delta P$. The covariance matrix of x is given
by the expectation of the outer product:

$$\begin{aligned} C_x &= E(\Delta x\,\Delta x^T) = E\left[(M^T M)^{-1} M^T \Delta P \left((M^T M)^{-1} M^T \Delta P\right)^T\right] \\ &= (M^T M)^{-1} M^T\, E(\Delta P\,\Delta P^T)\, M (M^T M)^{-1} \\ &= (M^T M)^{-1} M^T \begin{pmatrix} C_p & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & C_p \end{pmatrix} M (M^T M)^{-1}. \end{aligned} \tag{5}$$

Note that we have assumed that the errors in the data
points are independent, i.e., $E(\Delta p_i\,\Delta p_j^T) = 0$ for $i \neq j$. If the
errors in different data points are actually correlated, our
simplified assumption could result in an underestimate of
the actual covariance matrix. Also, the above analysis was
derived assuming that the noise is small. However, we
computed the covariance matrices for the configuration
described in Section 4, using both (5) and a Monte
Carlo simulation, and found that (5) is fairly accurate even for
noise levels much larger than in our application. For
example, using input noise with variance 225 mm²
(compared to the actual 0.0225 mm² in our application),
the largest deviation between the variances of the transla-
tional dimensions was 5.5 mm² (out of 83 mm²).

3.2 Transformation of Covariance
We can transform a covariance matrix from one coordinate
frame to another. Assume that we have a six-element pose
vector x and its associated covariance matrix $C_x$. Assume
that we apply a transformation, represented by a six-
element vector w, to x to create a new pose y. Denote
$y = g(x, w)$. A Taylor series expansion yields $\Delta y = J\,\Delta x$, where
$J = (\partial g/\partial x)$. The covariance matrix $C_y$ is found by:

$$C_y = E(\Delta y\,\Delta y^T) = E\left[(J \Delta x)(J \Delta x)^T\right] = J\,E(\Delta x\,\Delta x^T)\,J^T = J C_x J^T. \tag{6}$$

A variation on this method is to assume that the
transformation w also has an associated covariance matrix
$C_w$. In this case, the covariance matrix $C_y$ is:

$$C_y = J_x C_x J_x^T + J_w C_w J_w^T, \tag{7}$$

where $J_x = (\partial g/\partial x)$ and $J_w = (\partial g/\partial w)$. The above analysis
was verified with Monte Carlo simulations, using both the
3D-to-3D algorithm and the 2D-to-3D algorithm.
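The first-order covariance propagation $C_y = J C_x J^T$ can be checked against a Monte Carlo simulation, much as the authors describe. The function g (a polar-to-Cartesian map standing in for a pose transformation) and the noise levels below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def g(x):
    """A simple nonlinear transformation standing in for a pose change:
    polar-to-Cartesian, y = (r*cos(theta), r*sin(theta))."""
    x = np.asarray(x)
    return np.stack([x[..., 0] * np.cos(x[..., 1]),
                     x[..., 0] * np.sin(x[..., 1])], axis=-1)

x0 = np.array([2.0, 0.5])    # nominal input
Cx = np.diag([1e-4, 1e-4])   # small input covariance (assumed)

# Jacobian J = dg/dx at x0, by central differences.
eps = 1e-6
J = np.column_stack([(g(x0 + eps * e) - g(x0 - eps * e)) / (2 * eps)
                     for e in np.eye(2)])

# First-order propagation, as in eq. (6): Cy = J Cx J^T.
Cy_linear = J @ Cx @ J.T

# Monte Carlo estimate of the same output covariance.
samples = rng.multivariate_normal(x0, Cx, size=200_000)
Cy_mc = np.cov(g(samples).T)

# For small input noise the two estimates agree closely.
print(np.max(np.abs(Cy_linear - Cy_mc)))
```

As the paper notes for eq. (5), the agreement degrades gracefully as the input noise grows and the linearization becomes less accurate.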
`
3.3 Interpretation of Covariance
A useful interpretation of the covariance matrix is
obtained by assuming that the errors are jointly Gaussian.
The joint probability density for the n-dimensional error
vector $\Delta x$ is [37]:

2. This is the algorithm used in the Northern Digital Optotrak sensor,
described in Section 4.
`
`
`
`
$$p(\Delta x) = \left((2\pi)^{N/2}\,|C_x|^{1/2}\right)^{-1} \exp\left(-\tfrac{1}{2}\,\Delta x^T C_x^{-1}\,\Delta x\right)$$
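The paper visualizes positional uncertainty as a 3D ellipsoid; the axes of such an ellipsoid come from the eigendecomposition of the 3 × 3 positional block of the covariance matrix. A sketch with a made-up covariance (not values from the paper's experiment):

```python
import numpy as np

# Made-up 3x3 positional covariance in mm^2 (illustrative only).
C_pos = np.array([[4.0, 1.0, 0.0],
                  [1.0, 9.0, 0.0],
                  [0.0, 0.0, 1.0]])

# Eigendecomposition: eigenvectors give the ellipsoid axis directions,
# eigenvalues give the variances along those directions.
eigvals, eigvecs = np.linalg.eigh(C_pos)

# Semi-axis lengths of the 95% confidence ellipsoid: scale each standard
# deviation by sqrt of the chi-square 0.95 quantile for 3 degrees of freedom.
k = np.sqrt(7.815)
semi_axes = k * np.sqrt(eigvals)

print("axis directions (columns):\n", eigvecs)
print("semi-axis lengths (mm):", semi_axes)
```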