`The HiBall Tracking System
`
` February, 2001
`Presence: Teleoperators and Virtual Environments (10:1),
`
`High-Performance Wide-Area Optical Tracking
`The HiBall Tracking System
`
`Greg Welch, Gary Bishop, Leandra Vicci, Stephen Brumback, and Kurtis Keller:
`
`University of North Carolina at Chapel Hill
`Department of Computer Science, CB# 3175
`Chapel Hill, NC 27599-3175 USA
`01-919-962-1700
`{welch, gb, vicci, brumback, keller}@cs.unc.edu
`
`D’nardo Colucci:
`
`Alternate Realities Corporation
`27 Maple Place
`Minneapolis, MN 55401 USA
`01-612-616-9721
`colucci@virtual-reality.com
`
`
`ABSTRACT
`Since the early 1980’s the Tracker Project at the University of North Carolina at Chapel Hill has
`been working on wide-area head tracking for Virtual and Augmented Environments. Our long-
`term goal has been to achieve the high performance required for accurate visual simulation
`throughout our entire laboratory, beyond into the hallways, and eventually even outdoors.
In this article we present results and a complete description of our most recent electro-optical
system, the HiBall Tracking System. In particular we discuss motivation for the geometric
configuration, and describe the novel optical, mechanical, electronic, and algorithmic aspects that
enable unprecedented speed, resolution, accuracy, robustness, and flexibility.
`
`Figure 1
`
`1. INTRODUCTION
Systems for head tracking for interactive computer graphics have been explored for over 30 years
(Sutherland, 1968). As illustrated in Figure 1, the authors have been working on the problem for
over twenty years (Azuma, 1993, 1995; Azuma & Bishop, 1994a, 1994b; Azuma & Ward, 1991;
Bishop, 1984; Gottschalk & Hughes, 1993; UNC Tracker Project, 2000; Wang, 1990; J.-F. Wang
et al., 1990; Ward, Azuma, Bennett, Gottschalk, & Fuchs, 1992; Welch, 1995, 1996; Welch &
Bishop, 1997; Welch et al., 1999). From the beginning our efforts have been targeted at wide-area
applications in particular. This focus was originally motivated by applications for which we
`believed that actually walking around the environment would be superior to virtually “flying.” For
`example, we wanted to interact with room-filling virtual molecular models, and to naturally
`explore life-sized virtual architectural models. Today we believe that a wide-area system with
`high performance everywhere in our laboratory provides increased flexibility for all of our
`graphics, vision, and interaction research.
`
`1.1 Previous Work
`In the early 1960’s Ivan Sutherland implemented both mechanical and ultrasonic (carrier phase)
`head tracking systems as part of his pioneering work in virtual environments. He describes these
`systems in his seminal paper “A Head-Mounted Three Dimensional Display” (Sutherland, 1968).
`In the ensuing years, commercial and research teams have explored mechanical, magnetic,
`acoustic, inertial, and optical technologies. Complete surveys include (Bhatnagar, 1993; Burdea &
`Coiffet, 1994; Meyer, Applewhite, & Biocca, 1992; Mulder, 1994a, 1994b, 1998). Commercial
`magnetic tracking systems for example (Ascension, 2000; Polhemus, 2000) have enjoyed
`popularity as a result of a small user-worn component and relative ease of use. Recently inertial
`hybrid systems (Foxlin, Harrington, & Pfeifer, 1998; Intersense, 2000) have been gaining
`popularity for similar reasons, with the added benefit of reduced high-frequency noise and direct
`measurements of derivatives.
An early example of an optical system for tracking or motion capture is the Twinkle Box by
Burton (Burton, 1973; Burton & Sutherland, 1974). This system measured the positions of user-worn
flashing lights with optical sensors mounted in the environment behind rotating slotted
disks. The Selspot system (Woltring, 1974) used fixed camera-like photo-diode sensors and
target-mounted infrared light-emitting diodes that could be tracked in a one-cubic-meter volume.
Beyond the HiBall Tracking System, examples of current optical tracking and motion capture
systems include the FlashPoint© and Pixsys™ systems by Image Guided Technologies (IGT,
2000), the laserBIRD™ system by Ascension Technology (Ascension, 2000), and the CODA
Motion Capture System by B & L Engineering (BL, 2000). These systems employ analog optical
sensor systems to achieve relatively high sample rates for a moderate number of targets. Digital
cameras (two-dimensional image-forming optical devices) are used in motion capture systems
such as the HiRes 3D Motion Capture System by the Motion Analysis Corporation (Kadaba &
Stine, 2000; MAC, 2000) to track a relatively large number of targets, albeit at a relatively low
rate because of the need for 2D image processing.
`
`1.2 Previous Work at UNC-Chapel Hill
As part of his 1984 dissertation on Self-Tracker, Bishop put forward the idea of outward-looking
tracking systems based on user-mounted sensors that estimate user pose¹ by observing landmarks
in the environment (Bishop, 1984). He described two kinds of landmarks:
`high signal-to-noise-ratio beacons such as LEDs (light emitting
`diodes) and low signal-to-noise-ratio landmarks such as naturally
`occurring features. Bishop designed and demonstrated custom VLSI
`chips (Figure 2) that combined image sensing and processing on a
`single chip (Bishop & Fuchs, 1984). The idea was to combine multiple
`instances of these chips into an outward-looking cluster that estimated
`cluster motion by observing natural features in the un-modified
`environment. Integrating the resulting motion to estimate pose is prone to accumulating error, so
`further development required a complementary system based on easily detectable landmarks
`(LEDs) at known locations. This LED-based system was the subject of a 1990 dissertation by Jih-
`Fang Wang (Wang, 1990).
`
`Figure 2
`
¹ We use the word pose to indicate both position and orientation (six degrees of freedom).
`
In 1991 we demonstrated a working scalable electro-optical head-tracking system in the
Tomorrow's Realities gallery at that year’s ACM SIGGRAPH conference (J.-F. Wang et al., 1990;
Wang, Chi, & Fuchs, 1990; Ward et al., 1992). The system (Figure 3) used four head-worn
lateral effect photo-diodes (LEPDs) that looked upward at a regular array of
`infrared LEDs installed in precisely machined ceiling panels. A user-
`worn backpack contained electronics that digitized and communicated
`the photo-coordinates of the sighted LEDs. Photogrammetric techniques
`were used to compute a user’s head pose using the known LED positions
`and the corresponding measured photo-coordinates from each LEPD
`sensor (Azuma & Ward, 1991). The system was ground-breaking in that
`it was unaffected by ferromagnetic and conductive materials in the
`environment, and the working volume of the system was determined
`solely by the number of ceiling panels. (See Figure 3, top.)
`
`Figure 3
`
`1.3 The HiBall Tracking System
In this article we describe a new and vastly improved version of the 1991 system. We call the
new system the HiBall Tracking System. Thanks to significant improvements in hardware and
software this HiBall system offers unprecedented speed, resolution, accuracy, robustness, and
flexibility. The bulky and heavy sensors and backpack of the previous system have been replaced
by a small HiBall unit (Figure 4, bottom). In addition, the precisely machined LED ceiling panels
of the previous system have been replaced by looser-tolerance panels that are relatively
inexpensive to make and simple to install (Figure 4, top; Figure 10). Finally, we are
`using an unusual Kalman-filter-based algorithm that generates very accurate pose estimates at a
`high rate with low latency, and simultaneously self-calibrates the system.
As a result of these improvements the HiBall Tracking System can generate over 2000 pose
estimates per second, with less than one millisecond of latency, better than 0.5 millimeters and
0.03 degrees of absolute error and noise, everywhere in a 4.5 by 8.5 meter room (with over two
meters of height variation). The area can be expanded by adding more panels, or by using
checkerboard configurations which spread panels over a larger area. The weight of the user-worn
HiBall is about 300 grams, making it lighter than one optical sensor in the 1991 system. Multiple
HiBall units can be daisy-chained together for head or hand tracking, pose-aware input devices,
or precise 3D point digitization throughout the entire working volume.
`
`Figure 4
`
`2. DESIGN CONSIDERATIONS
In all of the optical systems we have developed (see Section 1.2) we have chosen what we call an
inside-looking-out configuration, where the optical sensors are on the (moving) user and the
landmarks (e.g., LEDs) are fixed in the laboratory. The corresponding outside-looking-in
`alternative would be to place the landmarks on the user, and to fix the optical sensors in the
`laboratory. (One can think about similar outside-in and inside-out distinctions for acoustic and
`magnetic technologies.) The two configurations are depicted in Figure 5.
`
Figure 5: Outside-Looking-In (lab-mounted fixed optical sensor, head-mounted landmarks) versus
Inside-Looking-Out (lab-mounted fixed landmarks, head-mounted sensor)
`
`There are some disadvantages to the inside-looking-out approach. For small or medium-sized
`working volumes, mounting the sensors on the user is more challenging than mounting them in
`the environment. It is difficult to make user-worn sensor packaging small, and communication
`from the moving sensors to the rest of the system is more complex. In contrast, there are fewer
mechanical considerations when mounting sensors in the environment for an outside-looking-in
configuration. Because landmarks can be relatively simple, small, and cheap, they can often be
`located in numerous places on the user, and communication from the user to the rest of the system
`can be relatively simple or even unnecessary. This is particularly attractive for full-body motion
`capture (BL, 2000; MAC, 2000).
`However there are some significant advantages to the inside-looking-out approach for head
`tracking. By operating with sensors on the user rather than in the environment, the system can be
`scaled indefinitely. The system can evolve from using dense active landmarks to fewer, lower
`signal-to-noise ratio, passive, and some day natural features for a Self-Tracker that operates
`entirely without landmark infrastructure (Bishop, 1984; Bishop & Fuchs, 1984; Welch, 1995).
`The inside-looking-out configuration is also motivated by a desire to maximize sensitivity to
`changes in user pose. In particular, a significant problem with an outside-looking-in configuration
`is that only position estimates can be made directly, and so orientation must be inferred from
`position estimates of multiple fixed landmarks. The result is that orientation sensitivity is a
function of both the distance to the landmarks from the sensor and the baseline between the
landmarks on the user. In particular, as the distance to the user increases or the baseline between
the landmarks decreases, the sensitivity goes down. For sufficient orientation sensitivity one would
`likely need a baseline that is considerably larger than the user’s head. This would be undesirable
`from an ergonomic standpoint and could actually restrict the user’s motion.
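
The dependence on distance and baseline can be illustrated with a first-order, small-angle sketch. This is not taken from the article: the geometry (two landmarks placed symmetrically about the rotation axis, a roll about the line of sight) and the function names `outside_in_shift` and `inside_out_shift` are assumptions chosen only to make the scaling visible.

```python
import math

def outside_in_shift(theta_rad, baseline_m, distance_m):
    """Apparent angular shift (radians) of one head-mounted landmark as seen by a
    fixed sensor when the head rolls by theta about the line of sight.
    Assumes landmarks at +/- baseline/2 from the axis and small angles."""
    lateral_motion = (baseline_m / 2.0) * theta_rad   # arc length moved by the landmark
    return lateral_motion / distance_m                # angle subtended at the sensor

def inside_out_shift(theta_rad):
    """Apparent angular shift of a fixed landmark as seen by a head-mounted sensor
    that pans or tilts by theta: the whole view rotates by theta."""
    return theta_rad

theta = math.radians(1.0)                 # a one-degree head rotation
ratio = outside_in_shift(theta, 0.2, 3.0) / inside_out_shift(theta)
print(f"outside-in sees {ratio:.3f} of the angular change seen inside-out")
```

Under these assumed numbers (a 0.2 m baseline viewed from 3 m) the fixed sensor sees about one thirtieth of the angular change that a head-mounted sensor would see, and the fraction shrinks further as the distance grows or the baseline shrinks.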
`With respect to translation, the change in measured photo-coordinates is the same for an
`environment-mounted (fixed) sensor and user-mounted (moving) landmark as it is for a user-
`mounted sensor and an environment-mounted landmark. In other words, the translation and
`corresponding sensitivity are the same for either case.
`
`3. SYSTEM OVERVIEW
The HiBall Tracking System consists of three main components (Figure 6). An outward-looking
sensing unit we call the HiBall is fixed to each user to be tracked. The HiBall unit observes a
subsystem of fixed-location infrared LEDs we call the Ceiling¹. Communication and
synchronization between the host computer and these subsystems is coordinated by the
Ceiling-HiBall Interface Board (CIB). In Section 4 we describe these components in more detail.
Each HiBall observes LEDs through multiple sensor-lens views that are distributed over a large
solid angle. LEDs are sequentially flashed (one at a time) such that they are seen via a diverse
set of views for each HiBall. Initial acquisition is performed using a brute force search through
LED space, but once initial lock is made, the selection of LEDs to flash is tailored to the views
of the active HiBall units. Pose estimates are maintained using a Kalman-filter-based
prediction-correction approach known as single-constraint-at-a-time or SCAAT tracking. This
technique has been extended to provide self-calibration of the Ceiling, concurrent with HiBall
tracking. In Section 5 we describe the methods we employ, including the initial acquisition
process and the SCAAT approach to pose estimation, with the autocalibration extension.
`
`Figure 6
`
`4. SYSTEM COMPONENTS
`
`4.1 The HiBall
The original electro-optical tracker (Figure 3, bottom) used independently-housed lateral effect
photo-diode units (LEPDs) attached to a light-weight tubular framework. As it turns out, the
mechanical framework would flex (distort) during use, contributing to estimation errors. In part
to address this problem the HiBall sensor unit was designed as a single rigid hollow ball having
dodecahedral symmetry, with lenses in the upper six faces and LEPDs on the insides of the
opposing six lower faces (Figure 7). This immediately gives six primary “camera” views
uniformly spaced by 57 degrees. The views efficiently share the same internal air space, and are
rigid with respect to each other. In addition, light entering any lens sufficiently off axis can be
`Figure 7
`
¹ At the present time, the LEDs are in fact entirely located in the ceiling of our laboratory, hence
the sub-system name Ceiling, but LEDs could as well be located on walls or other fixed locations.
`
`seen by a neighboring LEPD, giving rise to five secondary views through the top or central lens,
`and three secondary views through the five other lenses. Overall, this provides 26 fields of view
`which are used to sense widely separated groups of LEDs in the environment. While the extra
`views complicate the initialization of the Kalman filter as described in Section 5.5, they turn out
`to be of great benefit during steady-state tracking by effectively increasing the overall HiBall field
`of view without sacrificing optical sensor resolution.
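
The total of 26 views follows directly from the counts given above, written out here as a quick check (the grouping labels are only descriptive):

```latex
\underbrace{6}_{\text{primary views}}
\;+\; \underbrace{5}_{\text{secondary, top lens}}
\;+\; \underbrace{5 \times 3}_{\text{secondary, five other lenses}}
\;=\; 26
```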
`The lenses are simple plano-convex fixed-focus lenses. Infrared (IR) filtering is provided by
`fabricating the lenses themselves from RG-780 Schott glass filter material which is opaque to
`better than 0.001% for all visible wavelengths, and transmissive to better than 99% for IR
`wavelengths longer than 830 nm. The longwave filtering limit is provided by the DLS-4 LEPD
`silicon photodetector (UDT Sensors, Inc.) with peak responsivity at 950 nm but essentially blind
`above 1150 nm.
The LEPDs themselves are not imaging devices; rather they detect the centroid of the luminous
flux incident on the detector. The x-position of the centroid determines the ratio of two output
currents, while the y-position determines the ratio of two other output currents. The total output
currents of each pair are commensurate, and proportional to the total incident flux. Consequently,
focus is not an issue, so the simple fixed-focus lenses work well over a range of LED distances
from about half a meter to infinity. The LEPDs and associated electronic components are mounted
on a custom rigid-flex printed circuit board (Figure 8). This arrangement makes efficient use of
the internal HiBall volume while maintaining isolation between analog and digital circuitry, and
increasing reliability by alleviating the need for inter-component mechanical connectors.
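
As a rough illustration of how a centroid can be recovered from the four currents, the sketch below uses the normalized-difference relation commonly quoted for lateral effect photodiodes. The article only states that each coordinate determines a current ratio, so the exact formula, the function name `lepd_centroid`, and the `half_width` scale factor are assumptions.

```python
def lepd_centroid(ix1, ix2, iy1, iy2, half_width=1.0):
    """Estimate the light centroid on a lateral effect photodiode.

    ix1, ix2: the two currents of the x electrode pair (arbitrary units)
    iy1, iy2: the two currents of the y electrode pair
    half_width: half the active sensor width (assumed scale; 1.0 gives
                normalized coordinates in [-1, 1])
    Returns (x, y, flux), where flux is proportional to the total incident light.
    """
    sum_x = ix1 + ix2
    sum_y = iy1 + iy2
    if sum_x <= 0 or sum_y <= 0:
        raise ValueError("no usable signal on the detector")
    x = half_width * (ix2 - ix1) / sum_x      # normalized difference of the x pair
    y = half_width * (iy2 - iy1) / sum_y      # normalized difference of the y pair
    flux = 0.5 * (sum_x + sum_y)              # both pair sums track total flux
    return x, y, flux

print(lepd_centroid(1.0, 3.0, 2.0, 2.0))
```

Because only the ratio within each pair matters, a defocused spot and a sharp spot with the same centroid give the same coordinates, which is consistent with the remark that focus is not an issue.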
`Figure 9 shows the physical arrangement
`of the folded electronics in the HiBall. Each
`LEPD has four transimpedance amplifiers
`(shown together as one “Amp” in Figure 9),
`the analog outputs of which are multiplexed
`with those of the other LEPDs, then
`sampled, held, and converted by four 16-bit
`Delta-Sigma analog-to-digital (A/D)
`converters. Multiple samples are integrated
`via an accumulator. The digitized LEPD data
`are organized into packets for
`communication back to the CIB. The
`packets also contain information to assist in
`error-detection. The communication
`protocol is simple, and while presently
`implemented by wire, the modulation scheme is amenable to a wireless implementation. The
`present wired implementation allows multiple HiBall units to be daisy-chained so a single cable
`can support a user with multiple HiBall units.
`
`Figure 9
`
`Figure 8
`
`4.2 The Ceiling
As presently implemented, the infrared LEDs are packaged in 61 centimeter square panels, to fit a
standard false ceiling grid (Figure 10, top). Each panel uses five printed circuit boards: a main
controller board and four identical transverse-mounted strips (bottom). Each strip is populated
with eight LEDs for a total of 32 LEDs per panel. We mount the assembly on top of a metal panel
such that the LEDs protrude through 32 corresponding holes. The design results in a Ceiling with
a rectangular LED pattern with periods of 7.6 and 15.2 centimeters. This spacing is used for the
initial estimates of the LED positions in the lab, then during normal operation the SCAAT
algorithm continually refines the LED position estimates (Section 5.4). The SCAAT autocalibration
not only relaxes design and installation constraints, but provides greater precision in the face
of initial and ongoing uncertainty in the Ceiling structure.
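
A minimal sketch of how nominal LED positions for one panel might be generated from the stated spacing is given here. The function name `panel_led_positions`, the choice of panel origin, and the assignment of the 7.6 cm period to the direction along a strip are assumptions (the latter inferred from eight LEDs spanning roughly one 61 cm panel).

```python
def panel_led_positions(origin_x=0.0, origin_y=0.0,
                        leds_per_strip=8, strips=4,
                        led_pitch=0.076, strip_pitch=0.152):
    """Nominal (x, y) positions in meters of the 32 LEDs on one panel.

    Assumes LEDs are 7.6 cm apart along each strip and strips are 15.2 cm apart.
    These serve only as initial estimates; refinement would happen on-line.
    """
    positions = []
    for s in range(strips):
        for k in range(leds_per_strip):
            positions.append((origin_x + k * led_pitch,
                              origin_y + s * strip_pitch))
    return positions

leds = panel_led_positions()
print(len(leds), leds[0], leds[-1])
```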
We currently have enough panels to cover an area approximately 5.5 by 8.5 meters with a total of
approximately 3,000 LEDs.¹ The panels are daisy-chained to each other, and panel selection
encoding is position (rather than device) dependent. Operational commands are presented to the
first panel of the daisy chain. At each panel, if the panel select code is zero the controller
decodes and executes the operation; else it decrements the panel select code and passes it along
to the next panel (controller). Upon decoding, a particular LED is selected and the LED is
energized. The LED brightness (power) is selectable for automatic gain control as described in
Section 5.2.
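
The position-dependent panel selection can be sketched as a simple loop over the chain. The function name `route_command` and the idea of returning `None` when the code exceeds the chain length are illustrative assumptions, not details from the article.

```python
def route_command(select_code, num_panels):
    """Walk a command down the daisy chain and return the index of the panel
    that executes it, or None if the code runs past the last panel.

    Each panel executes when it sees a select code of zero; otherwise it
    decrements the code and forwards the command to the next panel.
    """
    for panel_index in range(num_panels):
        if select_code == 0:
            return panel_index          # this controller decodes and executes
        select_code -= 1                # decrement and pass along
    return None

print(route_command(3, 4))
```

Because a panel is addressed by its position in the chain, panels need no individually programmed addresses and can be swapped freely.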
We currently use Siemens SFH-487P GaAs LEDs which provide both a wide angle radiation
pattern and high peak power, emitting at a center wavelength of 880 nm in the near IR. These
devices can be pulsed up to 2.0 Amps for a maximum duration of 200 µs with a 1:50 (on:off)
duty cycle. While the current Ceiling architecture allows flashing of only one LED at a time,
LEDs may be flashed in any sequence. Because of these pulse limits, no single LED may be flashed
for too long or too frequently. We include both hardware and software protection to prevent this.
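
One way a software guard of this kind could look is sketched here. It is not the authors' implementation: the class `LedPulseGuard`, its bookkeeping, and the reading of the 1:50 figure as "at least 50 units of off time per unit of on time" are assumptions.

```python
class LedPulseGuard:
    """Reject flash requests that would exceed the assumed per-LED pulse limits."""

    MAX_PULSE_S = 200e-6      # maximum pulse duration stated in the text
    OFF_PER_ON = 50           # assumed reading of the 1:50 (on:off) duty cycle

    def __init__(self):
        self._next_allowed = {}   # led_id -> earliest time the LED may flash again

    def request(self, led_id, now_s, pulse_s):
        """Return True and record the pulse if it is allowed, else False."""
        if pulse_s <= 0 or pulse_s > self.MAX_PULSE_S:
            return False
        if now_s < self._next_allowed.get(led_id, float("-inf")):
            return False
        # The LED must rest for OFF_PER_ON times the pulse length after it ends.
        self._next_allowed[led_id] = now_s + pulse_s + self.OFF_PER_ON * pulse_s
        return True

guard = LedPulseGuard()
print(guard.request(7, 0.0, 200e-6), guard.request(7, 0.005, 200e-6))
```

Under this reading, a full 200 µs pulse keeps the same LED unavailable for a further 10 ms, while other LEDs remain free to flash.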
`
`Figure 10
`
`4.3 The Ceiling-HiBall Interface Board
`The Ceiling-HiBall Interface Board or CIB
`(Figure 11) provides communication and
`synchronization between a host personal
`computer, the HiBall (Section 4.1), and the
`Ceiling (Section 4.2). The CIB has four Ceiling
`ports allowing interleaving of ceiling panels for up
`to four simultaneous LED flashes and/or higher
`Ceiling bandwidth. (The Ceiling bandwidth is
`inherently limited by LED power restrictions as
`described in Section 4.2, but this can be increased by spatially multiplexing the Ceiling panels.)
`The CIB has two tether interfaces that can communicate with up to four daisy-chained HiBall
`units. The full-duplex communication with the HiBall units uses a modulation scheme (BPSK)
`
`Figure 11
`
¹ The area is actually L-shaped; a small storage room occupies one corner.
`
allowing future wireless operation. The interface from the CIB to the host PC is the stable
IEEE1284C enhanced parallel port (EPP) standard.
`The CIB comprises analog drive and receive components as well as digital logic components.
`The digital components implement store and forward in both directions and synchronize the
`timing of the LED “on” interval within the HiBall dark-light-dark intervals (Section 5.2). The
`protocol supports full-duplex flow control. The data are arranged into packets that incorporate
`error detection.
`
`5. METHODS
`
`5.1 Bench-Top (Off-Line) HiBall Calibration
`After each HiBall is assembled we perform an off-line calibration procedure to determine
`the correspondence between image-plane coordinates and rays in space. This involves more than
`just determining the view transform for each of the 26 views. Non-linearities in the silicon sensor
`and distortions in the lens (e.g., spherical aberration) cause significant deviations from a simple
`pin-hole camera model. We dealt with all of these issues through the use of a two-part camera
`model. The first part is a standard pin-hole camera represented by a 3 x 4 matrix. The second part
`is a table mapping real image-plane coordinates to ideal image-plane coordinates.
`Both parts of the camera model are determined using a calibration procedure that relies on a
`goniometer (an angular positioning system) of our own design. This device consists of two servo
`motors mounted together such that one motor provides rotation about the vertical axis while the
`second motor provides rotation about an axis orthogonal to vertical. An important characteristic of
`the goniometer is that the rotational axes of the two motors intersect at a point at the center of the
`HiBall optical sphere; this point is defined as the origin of the HiBall. (It is this origin that
`provides the reference for the HiBall state during run time as described in Section 5.3.) The
`rotational positioning motors were rated to provide 20 arc-second precision; we further calibrated
`them to 6 arc seconds using a laboratory grade theodolite—an angle measuring system.
`In order to determine the mapping between sensor image-plane coordinates and three-space
`rays, we use a single LED mounted at a fixed location in the laboratory such that it is centered in
`the view directly out of the top lens of the HiBall. This ray defines the Z or up axis for the HiBall
`coordinate system. We sample other rays by rotating the goniometer motors under computer
`control. We sample each view with rays spaced about every 6 minutes of arc throughout the field
`of view. We repeat each measurement 100 times in order to reduce the effects of noise on the
`individual measurements and to estimate the standard deviation of the measurements.
`Given the tables of approximately 2500 measurements for each of the 26 views, we first
`determine a 3 by 4 view matrix using standard linear least-squares techniques. Then we determine
`the deviation of each measured point from that predicted by the ideal linear model. These
`deviations are re-sampled into a 25 by 25 grid indexed by sensor-plane coordinates using a simple
`scan conversion procedure and averaging. Given a measurement from a sensor at run time
`(Section 5.2) we convert it to an “ideal” measurement by subtracting a deviation bilinearly
`interpolated from the nearest 4 entries in the table.
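
The run-time table correction can be sketched as follows. The article gives no data layout, so the representation of the table as a list of rows of `(du, dv)` pairs, the function names `bilinear_deviation` and `to_ideal`, and the assumption that sensor coordinates are already scaled to grid-index units are all hypothetical.

```python
def bilinear_deviation(table, u, v):
    """Bilinearly interpolate a (du, dv) deviation from a square grid.

    table[row][col] holds a (du, dv) pair; u indexes columns and v indexes rows,
    both in grid units (0 .. n-1). Requires a grid of at least 2 x 2 entries.
    """
    n = len(table)
    u = min(max(u, 0.0), n - 1.0)          # clamp to the table extent
    v = min(max(v, 0.0), n - 1.0)
    i0 = min(int(u), n - 2)                # lower-left cell corner
    j0 = min(int(v), n - 2)
    fu, fv = u - i0, v - j0                # fractional position inside the cell

    def lerp(a, b, t):
        return a + (b - a) * t

    result = []
    for c in range(2):                     # interpolate du, then dv
        low = lerp(table[j0][i0][c], table[j0][i0 + 1][c], fu)
        high = lerp(table[j0 + 1][i0][c], table[j0 + 1][i0 + 1][c], fu)
        result.append(lerp(low, high, fv))
    return tuple(result)

def to_ideal(table, u, v):
    """Convert a measured coordinate to an ideal one by subtracting the deviation."""
    du, dv = bilinear_deviation(table, u, v)
    return u - du, v - dv

# Synthetic 25 x 25 table whose deviation varies linearly across the sensor.
table = [[(0.01 * i, 0.02 * j) for i in range(25)] for j in range(25)]
print(to_ideal(table, 3.5, 10.25))
```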
`
`5.2 On-Line HiBall Measurements
`Upon receiving a command from the CIB (Section 4.3), which is synchronized with a CIB
`command to the ceiling, the HiBall selects the specified LEPD and performs three measurements,
`one before the LED flashes, one during the LED flash, and one after the LED flash. Known as
`“dark-light-dark”, this technique is used to subtract out DC bias, low frequency noise, and
`background light from the LED signal. We then convert the measured sensor coordinates to
`“ideal” coordinates using the calibration tables described in Section 5.1.
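
A minimal sketch of the dark-light-dark idea is shown here for a single channel. The article does not say how the two dark readings are combined; averaging them, and the function name `dark_light_dark`, are assumptions.

```python
def dark_light_dark(dark_before, light, dark_after):
    """Estimate the LED-only signal from three evenly spaced readings.

    Averaging the two dark readings (an assumed choice) removes a constant bias
    and also a background that drifts linearly across the three samples.
    """
    background = 0.5 * (dark_before + dark_after)
    return light - background

# Background drifting from 10 to 12 units while the LED adds 5 units.
print(dark_light_dark(10.0, 11.0 + 5.0, 12.0))
```

In practice the same subtraction would be applied to each of the output currents before the centroid is computed.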
`In addition, during run time we attempt to maximize the signal-to-noise ratio of the
`measurement with an automatic gain control scheme. For each LED we store a target signal
`strength factor. We compute the LED current and number of integrations (of successive
`accumulated A/D samples) by dividing this strength factor by the square of the distance to the
LED, estimated from the current position estimate. After a reading we look at the strength of the
actual measurement. If it is larger than expected we reduce the gain; if it is less than expected
we increase the gain. The increase and decrease are implemented as on-line averages with scaling
such that the gain factor decreases rapidly (to avoid overflow) and increases slowly. Finally we
`use the measured signal strength to estimate the noise on the signal using (Chi, 1995), and then
`use this as the measurement noise estimate for the Kalman filter (Section 5.3).
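
The asymmetric gain adjustment can be sketched as an on-line average with two different blending rates. The function name `update_gain` and the rates `down_rate` and `up_rate` are hypothetical values chosen for illustration; the article does not give the actual scaling.

```python
def update_gain(gain, measured, expected, down_rate=0.5, up_rate=0.05):
    """Move the gain toward the value that would have produced the expected
    signal strength, quickly when decreasing and slowly when increasing."""
    if measured <= 0:
        raise ValueError("measured signal strength must be positive")
    desired = gain * expected / measured          # gain that would hit the target
    rate = down_rate if desired < gain else up_rate
    return gain + rate * (desired - gain)         # exponential (on-line) average

print(update_gain(1.0, 2.0, 1.0))   # signal too strong: gain drops quickly
print(update_gain(1.0, 0.5, 1.0))   # signal too weak: gain rises slowly
```

The fast downward rate reflects the stated goal of avoiding overflow, while the slow upward rate keeps a single weak reading from inflating the gain.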
`
`5.3 Recursive Pose Estimation (SCAAT)
The on-line measurements (Section 5.2) are used to estimate the pose of the HiBall during
operation. The 1991 system collected a group of diverse measurements for a variety of LEDs and
sensors, and then used a method of simultaneous non-linear equations called Collinearity (Azuma
& Ward, 1991) to estimate the pose of the sensor fixture shown in Figure 3 (bottom). There was
one equation for each measurement, expressing the constraint that a ray from the front principal
point of the sensor lens to the LED must be collinear with a ray from the rear principal point to
the intersection with the sensor. Each estimate made use of a group of measurements (typically 20
or more) that together over-constrained the solution.
This multiple constraint method had several drawbacks. First, it had a significantly lower
estimate rate due to the need to collect multiple measurements per estimate. Second, the system of
non-linear equations did not account for the fact that the sensor fixture continued to move
throughout the collection of the sequence of measurements. Instead the method effectively
assumed that the measurements were taken simultaneously. The violation of this simultaneity
assumption could introduce significant error during even moderate motion. Finally, the method
provided no means to identify or handle unusually noisy individual measurements. Thus, a single
erroneous measurement could cause an estimate to jump away from an otherwise smooth track.
In contrast, the approach we use with the new HiBall system produces tracker reports as each
new measurement is made, rather than waiting to form a complete collection of observations.
Because single measurements under-constrain the mathematical solution, we refer to the approach
as single-constraint-at-a-time or SCAAT tracking (Welch, 1996; Welch & Bishop, 1997). The key
is that the single measurements provide some information about the HiBall state, and thus can be
used to incrementally improve a previous estimate. We intentionally fuse each individual
“insufficient” measurement immediately as it is obtained. With this approach we are able to
generate estimates more frequently, with less latency, with improved accuracy, and we are able to
estimate the LED positions on-line concurrently while tracking the HiBall (Section 5.4).
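
The idea of fusing individually insufficient measurements can be illustrated with a toy linear example. This is a simplified sketch, not the HiBall filter: it uses a static two-element state, linear scalar measurements, and no time update, whereas the real system involves a non-linear measurement model and a dynamic model. The function name `scalar_update` and all numbers are assumptions.

```python
def scalar_update(x, P, h, z, r):
    """One Kalman correction using a single scalar measurement z = h . x + noise.

    x: state estimate (list), P: symmetric covariance (list of lists),
    h: measurement row vector, r: measurement noise variance.
    """
    n = len(x)
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]   # P h^T
    s = sum(h[i] * Ph[i] for i in range(n)) + r                      # innovation variance
    k = [Ph[i] / s for i in range(n)]                                # Kalman gain
    innovation = z - sum(h[i] * x[i] for i in range(n))
    x_new = [x[i] + k[i] * innovation for i in range(n)]
    P_new = [[P[i][j] - k[i] * Ph[j] for j in range(n)] for i in range(n)]
    return x_new, P_new

# True state is (2, -1); each observation constrains only one direction.
x = [0.0, 0.0]
P = [[100.0, 0.0], [0.0, 100.0]]
observations = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
history = []
for h, z in observations:
    x, P = scalar_update(x, P, h, z, r=0.01)
    history.append(list(x))
print(history)
```

After the first observation only the first state element has moved; the second element is corrected when a measurement that is sensitive to it arrives. No single observation determines the state, yet each one improves the running estimate.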
We use a Kalman filter (Kalman, 1960) to fuse the measurements into an estimate of the
HiBall state x (the pose of the HiBall). We use the Kalman filter, a minimum variance stochastic
estimator, both because the sensor measurement noise and the typical user motion dynamics can
be modeled as normally-distributed random processes, and because we want an efficient on-line
method of estimation. A basic introduction to the Kalman filter can be found in Chapter 1 of
(Maybeck, 1979), while a more complete introductory discussion can be found in (Sorenson,
1970), which also contains some interesting historical narrative. More extensive references can be
found in (Brown & Hwang, 1992; Gelb, 1974; Jacobs, 1993; Lewis, 1986; Maybeck, 1979; Welch
& Bishop, 1995). Finally, we maintain a Kalman filter web page (Welch & Bishop, 2000) with
introductory, reference, and research material.

Figure 12

The Kalman filter has been used previously to address similar or related problems. See for
example (Azarbayejani & Pentland, 1995; Azuma, 1995; Emura & Tachi, 1994; Fuchs (Foxlin),
1993; Mazuryk & Gervautz, 1995; Van Pabst & Krekel, 1993). A relevant example of a Kalman
filter used for sensor fusion in a wide-area tracking system is given by (Foxlin et al., 1998), which
describes a hybrid inertial-acoustic system that is commercially available today (Intersense,
2000).
The SCAAT approach is described in detail in (Welch, 1996; Welch & Bishop, 1997). Included
there is discussion of the benefits of using the approach, as opposed to a multiple-constraint
approach such as (Azuma & Ward, 1991). However one key benefit warrants discussion here. There
is a direct relationship between the complexity of the estimation algorithm, the corresponding
speed (execution time per estimation cycle), and the change in HiBall pose between estimation
cycles (Figure 12). As the algorithmic complexity increases, the execution time increases, which
allows for significant non-linear HiBall motion between estimation cycles, which in turn implies
the need for a more complex estimation algorithm.
`The SCAAT approach on the other hand is an attempt to reverse
`this cycle. Because we intentionally use a single constraint per estimate, the algorithmic
`complexity is drastically reduced, which reduces the execution time, and hence the amount of
`motion between estimation cycles. Because the amount of motion is limited we are able to use a
`simple dynamic (process) model in the Kalman filter, which further simplifies the computations.
`In short, the simplicity of the approach means it can run very fast, which means it can produce
`estimates very rapidly, with low noise.
The Kalman filter requires both a model of the process dynamics, and a model of the
relationship between the process state and the available measurements. In part due to the
simplicity of the SCAAT approach we are able to use a simple PV (position-velocity) process
model (Brown & Hwang, 1992). Consider the simple example state vector
$x(t) = [x_p(t), x_v(t)]^T$, where the first element $x_p(t)$ is the pose (position or
orientation) and the second element $x_v(t)$ is the corresponding velocity, i.e.
$x_v(t) = \frac{d}{dt} x_p(t)$. We model the continuous change in the HiBall state with the
simple differential equation

$$\frac{d}{dt} x(t) = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_p(t) \\ x_v(t) \end{bmatrix} + \begin{bmatrix} 0 \\ \mu \end{bmatrix} u(t), \qquad (1)$$

where $u(t)$ is a normally-distributed white (in the frequency spectrum) scalar noise process, and
the scalar $\mu$ represents the magnitude or spectral density of the noise. We use a similar model
`with a distinct noise process for each of the six pose elements. We determine the individual noise
`magnitudes using an off-line simulation of the system and a non-linear optimization strategy that
`seeks to minimize the variance between the estimated pose and a known motion path. (See
`Section 6.2.2