1995 Symposium on Interactive 3D Graphics

ACM PRESS

Sponsored by the Association for Computing Machinery's Special Interest Group on Computer Graphics

A publication of ACM SIGGRAPH
`
`
`
`Proceedings
`1995 Symposium on
`Interactive 3D Graphics
`
`Monterey, California
`April 9 — 12, 1995
`
`Symposium Chair
`
`Michael Zyda, Naval Postgraduate School
`
`Program Co-Chairs
`
`Pat Hanrahan, Stanford University
`Jim Winget, Silicon Graphics, Inc.
`
`Program Committee
`
`Frank Crow, Apple Computer
`Andy van Dam, Brown University
Michael Deering, Sun Microsystems
`Steven Feiner, Columbia University
`Henry Fuchs, UNC - Chapel Hill
`Thomas Funkhouser, Bell Labs
`Fred Kitson, Hewlett-Packard
`
Randy Pausch, University of Virginia
`Paul Strauss, Silicon Graphics, Inc.
`Andy Witkin, Carnegie-Mellon University
`David Zeltzer, Massachusetts Institute of Technology
`
`Financial support provided by the following organizations:
`
`Office of Naval Research, Advanced Research Projects Agency
`US Army Research Laboratory
`Apple Computer
`AT&T Bell Laboratories
`Cyberware
`Hewlett-Packard
`
`Microsoft Corporation
`Silicon Graphics, Inc.
`Sun Microsystems
`
`Production Editor
`
`Stephen Spencer, The Ohio State University
`
`
`
`
`
`
`The Association for Computing Machinery, Inc.
`1515 Broadway, 17th Floor
`New York, NY 10036
`
Copyright © 1995 by the Association for Computing Machinery, Inc. Copying without fee is permitted provided that the copies are not made or distributed for direct commercial advantage and credit to the source is given. Abstracting with credit is permitted. For other copying of articles that carry a code at the bottom of the first page, copying is permitted provided that the per-copy fee is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923. For permission to republish write to Director of Publications, Association for Computing Machinery. To copy otherwise, or republish, requires a fee and/or specific permission.
`
Orders from ACM Members:

A limited number of copies are available at the ACM member discount. Send order with payment in U.S. dollars to:

ACM Order Department
P.O. Box 12114
Church Street Station
New York, NY 10257
`
OR, for information on accepted European currencies and exchange rates, contact:

ACM European Service Center
Avenue Marcel Thiry 204
1200 Brussels
Belgium
Tel: +32 2 774 9602
Fax: +32 2 774 9690
Email: acm_europe@acm.org
`
Credit card orders from U.S. and Canada:
1-800-342-6626
Credit card orders may also be placed by mail.

Credit card orders from the New York metropolitan area and outside the U.S.:
+1 212-626-0500

Single-copy orders placed by fax:
+1 212-944-1318

Electronic mail inquiries may be directed to acmhelp@acm.org.

Please include your ACM member number and the ACM order number with your order.

ACM Order Number: 429953
ACM ISBN: 0-89791-736-7
`
`
`
`
`
`
Table of Contents and Symposium Program

Preface ........................................................................................................................................................... 4

Monday, April 10, 1995

8:00 - 8:15     Welcome

8:15 - 10:15    Session 1: Virtual Reality
                Chair: Henry Fuchs - University of North Carolina, Chapel Hill

Resolving Occlusion in Augmented Reality ................................................................................................. 5
    Matthias M. Wloka and Brian G. Anderson
Surface Modification Tools in a Virtual Environment Interface to a Scanning Probe Microscope ........... 13
    M. Finch, M. Falvo, V. L. Chi, S. Washburn, R. M. Taylor, and R. Superfine
    Color Plates .......................................................................................................................................... 203
Combatting Rendering Latency ................................................................................................................. 19
    Marc Olano, Jon Cohen, Mark Mine and Gary Bishop
    Color Plates .......................................................................................................................................... 204
Underwater Vehicle Control from a Virtual Environment Interface ......................................................... 25
    Stephen D. Fleischer, Stephen M. Rock and Michael J. Lee
    Color Plates .......................................................................................................................................... 204

11:00 - 12:05   Session 2: Geometric Modeling
                Chair: Paul Strauss - Silicon Graphics, Inc.

Interactive Design, Analysis, and Illustration of Assemblies .................................................................... 27
    Elana Driskill and Elaine Cohen
Hierarchical and Variational Geometric Modeling with Wavelets ............................................................ 35
    Steven J. Gortler and Michael F. Cohen
    Color Plates .......................................................................................................................................... 205
Interactive Shape Metamorphosis .............................................................................................................. 43
    David T. Chen, Andrei State and David Banks
    Color Plates .......................................................................................................................................... 209

12:05 - 1:30    Lunch

1:30 - 3:10     Session 3: Rendering Systems
                Chair: Michael Deering - Sun Microsystems

Shadow Volume BSP Trees for Computation of Shadows in Dynamic Scenes ........................................ 45
    Yiorgos Chrysanthou and Mel Slater
Interactive Display of Large-Scale NURBS Models ................................................................................. 51
    Subodh Kumar, Dinesh Manocha and Anselmo Lastra
    Color Plates .......................................................................................................................................... 206
Real-Time Programmable Shading ............................................................................................................ 59
    Anselmo Lastra, Steven Molnar, Marc Olano and Yulan Wang
    Color Plates .......................................................................................................................................... 207
Interactive Full Spectral Rendering ........................................................................................................... 67
    Mark S. Peercy, Benjamin M. Zhu and Daniel R. Baum
    Color Plates .......................................................................................................................................... 207

4:00 - 5:00     Session 4: Benefits of Exchange Between Computer Scientists and Perceptual Scientists
                Chair: Randy Pausch - University of Virginia
                Panel: Robert Eggleston - Wright-Patterson AFB, Steve Ellis - NASA Ames, Mary Kaiser - NASA Ames, Jack Loomis - UCSB, Dennis Proffitt - University of Virginia

8:00 - 9:30     Session 5: Government Programs on Virtual Environments & Real-Time Interactive 3D
                Chair: Michael Zyda, Naval Postgraduate School
                Panel: Rick Satava - ARPA, Craig Wier - ARPA, Ralph Wachter - ONR, Paul Stay - ARL
`
`
`
`
Tuesday, April 11, 1995

8:30 - 10:10    Session 6: Parallel and Distributed Algorithms
                Chair: Frank Crow - Apple Computer

Interactive Volume Visualization on a Heterogeneous Message-Passing Multicomputer ........................ 69
    A. State, J. McAllister, U. Neumann, H. Chen, T. J. Cullip, D. T. Chen and H. Fuchs
    Color Plates .......................................................................................................................................... 208
The Sort-First Rendering Architecture for High-Performance Graphics .................................................. 75
    Carl Mueller
    Color Plates .......................................................................................................................................... 209
RING: A Client-Server System for Multi-User Virtual Environments ..................................................... 85
    Thomas A. Funkhouser
    Color Plates .......................................................................................................................................... 209
NPSNET: A Multi-Player 3D Virtual Environment over the Internet ....................................................... 93
    M. R. Macedonia, D. P. Brutzman, M. J. Zyda, D. R. Pratt, P. T. Barham, J. Falby and J. Locke
    Color Plates .......................................................................................................................................... 210

11:00 - 12:10   Session 7: Virtual Environments
                Chair: Thomas Funkhouser - AT&T Bell Laboratories

Visual Navigation of Large Environments Using Textured Clusters ........................................................ 95
    Paulo W. C. Maciel and Peter Shirley
    Color Plates .......................................................................................................................................... 211
Guided Navigation of Virtual Environments ............................................................................................. 103
    Tinsley A. Galyean
    Color Plates .......................................................................................................................................... 210
Portals and Mirrors: Simple, Fast Evaluation of Potentially Visible Sets ................................................. 105
    David Luebke and Chris Georges
    Color Plates .......................................................................................................................................... 212
Interactive Playing with Large Synthetic Environments ........................................................................... 107
    Bruce F. Naylor
    Color Plates .......................................................................................................................................... 212

12:10 - 1:30    Lunch

1:30 - 2:50     Session 8: Input and Output Techniques
                Chair: Randy Pausch - University of Virginia

Of Mice and Monkeys: A Specialized Input Device for Virtual Body Animation ................................... 109
    Chris Esposito, W. Bradford Paley and JueyChong Ong
    Color Plates .......................................................................................................................................... 213
A Virtual Space Teleconferencing System that Supports Intuitive Interaction
for Creative and Cooperative Work .......................................................................................................... 115
    M. Yoshida, Y. Tijerino, S. Abe and F. Kishino
Haptic Rendering: Programming Touch Interaction with Virtual Objects ................................................ 123
    K. Salisbury, D. Brock, T. Massie, N. Swarup and C. Zilles

4:00 - 5:00     Session 9: Invited Speaker
`
`
`
Wednesday, April 12, 1995

8:30 - 10:10    Session 10: Interactive Manipulation
                Chair: David Zeltzer - MIT Research Laboratory of Electronics

Object Associations: A Simple and Practical Approach to Virtual 3D Manipulation ............................... 131
    Richard W. Bukowski and Carlo H. Séquin
    Color Plates .......................................................................................................................................... 214
CamDroid: A System for Implementing Intelligent Camera Control ....................................................... 139
    Steven M. Drucker and David Zeltzer
3D Painting on Scanned Surfaces .............................................................................................................. 145
    Maneesh Agrawala, Andrew C. Beers and Marc Levoy
    Color Plates .......................................................................................................................................... 215
Volume Sculpting ....................................................................................................................................... 151
    Sidney W. Wang and Arie E. Kaufman
    Color Plates .......................................................................................................................................... 214

11:00 - 12:10   Session 11: Applications
                Chair: Steven Feiner - Columbia University

The Tecate Data Space Exploration Utility ............................................................................................... 157
    Peter Kochevar and Len Wanger
An Environment for Real-time Urban Simulation .................................................................................... 165
    William Jepson, Robin Liggett and Scott Friedman
    Color Plates .......................................................................................................................................... 216
Mathenautics: Using Virtual Reality to Visit 3-D Manifolds .................................................................... 167
    R. Hudson, C. Gunn, G. K. Francis, D. J. Sandin and T. A. DeFanti
Tracking a Turbulent Spot in an Immersive Environment ........................................................................ 171
    David C. Banks and Michael Kelley
    Color Plates .......................................................................................................................................... 216

12:10 - 1:30    Lunch

1:30 - 2:45     Session 12: Physical and Behavioral Simulation
                Chair: Fred Kitson - Hewlett-Packard Labs

Behavioral Control for Real-Time Simulated Human Agents .................................................................. 173
    John P. Granieri, Welton Becket, Barry D. Reich, Jonathan Crabtree and Norman I. Badler
Impulse-based Simulation of Rigid Bodies ............................................................................................... 181
    Brian Mirtich and John Canny
    Color Plates .......................................................................................................................................... 217
I-COLLIDE: An Interactive and Exact Collision Detection System for Large-Scale Environments ....... 189
    J. D. Cohen, M. C. Lin, D. Manocha and M. K. Ponamgi
    Color Plates .......................................................................................................................................... 218

3:30 - 4:30     Keynote Address: Interactive 3D Graphics: Challenges and Opportunities
                Henry Fuchs - University of North Carolina, Chapel Hill
                Chair: Andy van Dam - Brown University

                Closing Remarks
                Conference Chairs: Pat Hanrahan, Jim Winget and Michael Zyda

Author Index ............................................................................................................................................... 197
Cover Image Credits
Color Plate Section ..................................................................................................................................... 203
`
`
`
`
`Preface
`
This proceedings represents the technical papers and program for the 1995 Symposium on Interactive 3D Graphics. This symposium is the fourth in a series of what is hopefully a permanent conference whose focus is on the topic: Where is the frontier today in real-time interactive 3D graphics? 3D graphics is becoming ever more prevalent as our workstations and personal computers speed up. We have in-home users of 3D today with 486 PCs and Doom. By the end of 1995, we will be seeing $250 home 3D gaming machines running 100,000 textured polygons per second. Because of this impending widespread usage of 3D graphics, we need a conference dedicated to real-time, interactive 3D graphics and interactive techniques.

We received 96 paper submissions for this symposium. This is a record for the symposium series and is particularly notable in that we did not even have a conference chair for the symposium until the last day of SIGGRAPH '94. We accepted 22 full length papers and 11 short papers. The reasoning behind the short papers category was that there were some interesting submissions that would provide a great live demo but for which there was not sufficient material for a full length technical paper. We have such flexibility in this smaller conference.
`
One of the major changes for the 1995 symposium is our status with respect to ACM SIGGRAPH. ACM SIGGRAPH has always provided "in cooperation" status in the past, but none of the past symposium chairs has wanted to fill out the "daunting" paperwork required for "sponsored by" status. Donna Baglio of ACM convinced the symposium chair that it wasn't so difficult, and it wasn't. "Sponsored by" means ACM SIGGRAPH guaranteed that all bills are paid. Such guarantees allow the symposium's chair to sleep easier. The "sponsored by" status was facilitated by supporters on the SIGGRAPH Executive Committee. In particular, Steve Cunningham and Mary Whitton helped out enormously, getting the TMRF signed off and approved rapidly! Steve also pointed us in the right direction for getting ACM SIGGRAPH to include the proceedings in the SIGGRAPH Member Plus program, which means distribution of the proceedings to more than 4,000 individuals.
`
When we started circulating the call for participation for the symposium, we had a major coup. Robert McDermott called and volunteered to be Media Coordinator for the symposium. He had helped us with the AV for the 1990 symposium at Snowbird. He volunteered to edit and produce a videotape of the accepted symposium papers, another symposium first. He also volunteered to plan the AV and computer setup for the conference. We have plans for a significant AV setup for the symposium, with live demos and Internet at the podium. It is very nice to have someone who has done this before.
`
`We had a smaller, more compact program committee than
`in the past. Our program committee contains many of the
`world’s outstanding leaders in the field of computer graphics:
`
`Frank Crow, Apple Computer
`Michael Deering, Sun Microsystems
`Steven Feiner, Columbia University
`Henry Fuchs, UNC - Chapel Hill
`Thomas Funkhouser, AT&T Bell Laboratories
`Fred Kitson, Hewlett-Packard Labs
`Randy Pausch, University of Virginia
`Paul Strauss, Silicon Graphics, Inc.
`Andy van Dam, Brown University
`Andy Witkin, Carnegie Mellon University
`David Zeltzer, Massachusetts Institute of Technology
`
We take this opportunity to thank our supporters who helped us with equipment loans or with financial contributions to assure that this symposium would indeed happen. We would like to recognize for their generous support: Office of Naval Research (Ralph Wachter), Advanced Research Projects Agency (Rick Satava and Craig Wier), U.S. Army Research Laboratory (Paul Stay), Apple Computer (Frank Crow), AT&T Bell Laboratories (S. Kicha Ganapathy), Cyberware (David Addleman and George Dabrowski), Hewlett-Packard (Fred Kitson and Phil Ebersole), Microsoft Corporation (Dan Ling), Silicon Graphics, Inc. (Forrest Baskett), Sun Microsystems (Patrick Barrett), Jim Rose for his assistance with the symposium video review, and the NSF Science and Technology Center for Computer Graphics and Scientific Visualization. Without the support of these individuals and organizations, we could not hold this conference. Many of these individuals have provided financial support every year the symposium has been offered. Thank you very much!
`
`The Symposium is still four months away, and close to
`fully sold out! We expect that we will have to disappoint many
`dozens of people whom we simply cannot accommodate. This
`enthusiastic response attests to the wide interest that the field
`of 3D interactive graphics has garnered. We can only hope and
`recommend that we will not have to wait again so long to en-
`joy the next Symposium on 3D Interactive Graphics.
`
Pat Hanrahan, Jim Winget & Michael Zyda
`January 1995
`
`
`
`
`
`
`Resolving Occlusion in Augmented Reality
`
Matthias M. Wloka* and Brian G. Anderson*
`
`Science and Technology Center for Computer Graphics and Scientific Visualization,
`Brown University Site
`
`Abstract
`
Current state-of-the-art augmented reality systems simply overlay computer-generated visuals on the real-world imagery, for example via video or optical see-through displays. However, overlays are not effective when displaying data in three dimensions, since occlusion between the real and computer-generated objects is not addressed.

We present a video see-through augmented reality system capable of resolving occlusion between real and computer-generated objects. The heart of our system is a new algorithm that assigns depth values to each pixel in a pair of stereo video images in near-real-time. The algorithm belongs to the class of stereo matching algorithms and thus works in fully dynamic environments. We describe our system in general and the stereo matching algorithm in particular.

Keywords: real-time, stereo matching, occlusion, augmented reality, interaction, approximation, dynamic environments
`
1 Introduction
`
Augmented reality systems enhance the user's vision with computer-generated imagery. To make such systems and their applications effective, the synthetic or virtual imagery needs to blend convincingly with the real images. Towards this goal, researchers study such areas as minimizing object registration errors [2] and overall system lag [2] [17] so as to increase the "realness" of virtual objects.

Since occlusion provides a significant visual cue to the human perceptual system when displaying data in three dimensions, proper occlusion resolution between real and virtual objects is highly desirable in augmented reality systems. However, solving the occlusion problem for augmented reality is challenging: little is known about the real world we wish to augment. For example, in an optical see-through head-mounted display (HMD), no information at all is available about the surrounding real world. In a video see-through HMD, however, at least a pair of 2D intensity bitmaps is available in digital memory.
`
*Box 1910, Department of Computer Science, Brown University, Providence, RI 02912. Phone: (401) 863 7600, email: {mmw|bga}@cs.brown.edu.
`
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association of Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
1995 Symposium on Interactive 3D Graphics, Monterey CA USA
© 1995 ACM 089791-736-7/95/0004...$3.50
`
Typical augmented reality scenarios further complicate the problem. Because they do not restrict the real environment to be static, precomputing depth maps to resolve occlusion is impossible. As a result, occlusion between virtual and real objects needs to be determined for every frame generated, i.e., at real-time rates.

We introduce here a video see-through system capable of resolving occlusion between real and virtual objects at close to real-time rates. We achieve these near-real-time rates by computing depth maps for the left and right views at half the original video image resolution (depth maps are thus 320 by 240 pixels). Few additional assumptions are made on the real and virtual environments; in particular, both can be fully dynamic.

The heart of our video see-through system is a new stereo matching algorithm that infers dense depth maps from a stereo pair of intensity bitmaps. This new algorithm trades accuracy for speed and thus outperforms known stereo matching algorithms except for those that run on custom-built hardware. Furthermore, our algorithm is robust: it does not require a fully calibrated pair of stereo cameras.
`
1.1 Overview

We briefly review related work in Section 2. In Section 3 we outline the architecture of our video see-through augmented reality system. The basic algorithm for stereo matching video images in close to real-time is explained in Section 4. Several extensions, described in Section 5, make the basic algorithm faster and more robust. Finally, in Section 6 we discuss drawbacks of the algorithm and propose possible future work.
`
2 Related Work

While several other augmented reality systems are described in the literature [3] [7] [9], none of these systems addresses the occlusion problem. We know of only one augmented reality system other than our own that attempts to correct this deficiency [10]. The envisioned application of the competing project [10], i.e. virtual teleconferencing, is more ambitious than our own, i.e. augmented interior design, but the basis of both systems is to compute dense depth maps for the surrounding real world. Preliminary results in [10] indicate process times of several minutes per depth map on a high-end workstation [1]. In contrast, we claim sub-second performance for depth maps of similar resolution, although our depth maps are not as accurate.

The work by Koch [12] also applies computer vision techniques to infer dense, accurate depth maps from image pairs, and uses this information to construct 3D graphical representations of the surveyed world. Unfortunately, his methods are far from real-time, restricted to static environments, and thus not suitable for augmented reality applications.
`
`
`
`
Like Koch, we use a computer vision technique known as stereo matching to infer depth from stereo image pairs. Stereo matching is a well-established research area in the computer vision literature [8] [5]. Nonetheless, real-time algorithms for stereo matching are only a recent development.

We believe that our near-real-time stereo matching algorithm is new. It is faster than other published near-real-time algorithms [13] [15], and is excelled only by algorithms running on custom-built hardware [15] [14] [11] (see Table 1). However, since our algorithm runs on general-purpose workstations, it is more affordable (no expensive, single-use, custom-built hardware is required) and more flexible (none of the parameters are hard-wired) than those.

Even though our algorithm is faster than some of those previously published efforts, our resulting depth maps are also less accurate.
`
Method           Hardware                     Resolution   Time
ours             1-proc. SGI Onyx             320x240x30   620ms
ours             2-proc. SGI Onyx             320x240x30   370ms
ours             2-proc. SGI Onyx             160x120x15   100ms
Matthias [12]    68020 w/ image proc. cards   64x60x6      1000ms
Ross [15]        Sun Sparc II                 256x240x16   2460ms
Ross [15]        64 Cell iWarp                256x240x16   150ms
Ross [15]        64 Cell iWarp                512x480x88   2180ms
Nishihara [13]   custom-design                512x512x?    33ms
Kanade [10]      custom-design                256x240x30   33ms

Table 1: Running times of our stereo matching algorithm compared with previous real-time or near-real-time stereo matching algorithms. Resolution is the resolution of the generated depth map in x, y, and depth, i.e., the range of possible disparity values for a matched point.
`
3 System Description

Figure 1 outlines the architecture of our augmented reality system. Two black and white video cameras are mounted on top of a Fakespace Boom. The cameras need to be aligned so that their epi-polar lines¹ roughly coincide with their horizontal scan-lines. Unfortunately, simply aligning the cameras' outer housings is insufficient due to large manufacturing tolerances in internal image sensor-array alignments (for example, our cameras had a pitch difference of several degrees). While we achieved alignment of ±3 pixels manually by trial and error, less time-consuming options are available; for example, the images could be aligned in software [4] or by using calibrated off-the-shelf hardware [16].

The cameras continuously transmit gen-locked left/right video image pairs, such as shown in Figures 2 and 4, to the red and green inputs of a Sirius video card. The Sirius video card digitizes the analogue video signal and transfers the bitmaps into the main memory of an SGI Onyx.

We then apply the stereo matching algorithm described in Section 4 to the image pair. Figures 3 and 5 show the resulting depth maps. We copy the z-values for the left image into the z-buffer and transfer the left video image to the red frame-buffer. Since every pixel of the video image now has an associated z-value, we simply render all computer graphics objects for the left view; z-buffering resolves all occlusion relations.

The procedure is repeated for the right view: we clear the z-buffer, copy the generated z-values for the right view into it, transfer

¹An epi-polar line is the intersection of the image plane with the plane defined by the projection centers of the two cameras and an arbitrary point in the 3D world space.
`
[Figure 1 diagram: stereo cameras mounted on a Fakespace Boom send video (R: left view, G: right view) to a Sirius video frame grabber; the head position and orientation, the stereo-matched depth values, and the video images feed the z-buffer and frame-buffer of an SGI Onyx with two 150MHz R4400 processors and a RealityEngine II.]
Figure 1: Schematic of our video see-through augmented reality system. The inputs and outputs R, G, and B correspond to the red, green, and blue channels, respectively. Zr and Zg are the depth- or z-values for the red and green channels, respectively. Since only one z-buffer is available in double-buffering mode, we first render the left view (red channel) completely and then the right view (green channel).
`
Part                           Resolution   Time
Stereo matching algorithm      320x240
z-value transfer per frame     320x240
RGB-value transfer per frame   640x240
Video capture                  640x240
Rendering per frame            1280x1024
Total for stereo image pair    1280x1024

Table 2: Results of timing tests for the various parts of our system, running on a 150MHz two-processor SGI Onyx with a RealityEngine II. Capturing images from video does not use the CPU, but does introduce an additional lag of at least 100ms. Rendering is highly scene-dependent; our extremely simple test scenes take less than 10ms to render.
`
the right video image into the green frame-buffer, and finally render all computer graphics objects for the right view.

The Fakespace Boom then displays the red/green augmented reality image pairs so generated as a stereo image pair. Figures 6 and 8 and Figures 7 and 9 show two examples. The Boom also allows us to couple the virtual camera pair position and orientation
`
`
`
`
`
`
`Figure 2: Left video camera image. Using this and the right video
`image, we infer depth for every pixel in the video image in near-
`real-time.
`
`Figure 4: Right video camera image.
`
directly with those of the video camera pair. Therefore, the computer graphics objects are rendered with the same perspective as the video images.

While our system is video see-through, the same setup is used for optical see-through systems. Instead of a Fakespace Boom, the user wears a head-mounted optical see-through display and a pair of head-mounted video cameras. The video signal is processed as before, except that the video images are never transferred to the frame-buffer. Therefore, only the computer graphics objects, properly clipped by the generated z-values to occlude only more distant objects, are displayed on the optical see-through display.

The various parts of our system require varying amounts of compute time. Table 2 shows the results of our timing tests. While the stereo-frame rate of roughly two updates per second is still an order of magnitude too slow for practical augmented reality systems, our work may guide hardware architects to address the needs for faster and more affordable video processing hardware. Alternatively, resolution of the depth maps or video images may be reduced further.
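The z-buffer compositing step described above (seed the frame-buffer with the video image and the z-buffer with the inferred depth map, then depth-test the virtual objects against it) can be sketched in software. This is an illustrative emulation, not the system's actual SGI frame-buffer calls; the flat-depth "virtual object" in the example is likewise hypothetical.

```python
import numpy as np

def composite(video_rgb, video_depth, virtual_rgb, virtual_depth):
    """Emulate z-buffer compositing: the video image seeds the frame-buffer,
    the inferred depth map seeds the z-buffer, and virtual pixels win
    wherever they are nearer (smaller depth) than the real scene."""
    frame = video_rgb.copy()       # frame-buffer starts as the video image
    zbuf = video_depth.copy()      # z-buffer starts as the inferred depth map
    nearer = virtual_depth < zbuf  # the standard per-pixel z-test
    frame[nearer] = virtual_rgb[nearer]
    zbuf[nearer] = virtual_depth[nearer]
    return frame, zbuf

# Tiny example: a virtual object at depth 2 over real pixels at depths 1 and 3.
video = np.zeros((2, 2, 3), dtype=np.uint8)
real_depth = np.array([[1.0, 3.0], [1.0, 3.0]])
virt = np.full((2, 2, 3), 255, dtype=np.uint8)
virt_depth = np.full((2, 2), 2.0)
frame, _ = composite(video, real_depth, virt, virt_depth)
# The virtual object shows only where the real scene is farther away.
```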
`
4 Basic Algorithm

The new stereo matching algorithm we use to infer depth for the video images is central to our occlusion-resolving augmented reality system. Like all other stereo matching algorithms, it works by matching points in the left image to points in the right image and vice versa. Once the relative image positions of a pair of matched points are established, triangulation is used to infer the distance of the matched points to the cameras [8].

Our algorithm is area-based, i.e., it attempts to match image areas to one another. It works in five phases.
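The triangulation step is standard for rectified stereo rigs and is not spelled out in the paper: with focal length f, camera baseline b, and disparity d in consistent units, depth is z = f·b/d. A minimal sketch under those assumptions (the numeric values are illustrative):

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole-camera triangulation for a rectified stereo pair.
    disparity_px: horizontal offset in pixels between the matched points;
    focal_px: focal length expressed in pixels;
    baseline_m: separation between the two camera centers."""
    if disparity_px <= 0:
        raise ValueError("matched points must have positive disparity")
    return focal_px * baseline_m / disparity_px

# A point matched with 16 pixels of disparity, f = 800 px, 10 cm baseline:
z = depth_from_disparity(16, 800.0, 0.10)  # 5.0 metres
```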
`
`4.1 Phase One
`
`In the first phase, we subsample the original video image. Currently,
`we operate at half the resolution of the video images. Higher (lower)
`resolution gives more (less) accurate results while slowing down
`
`
`
`Figure 3: Our new stereo matching algorithm produces a half-
`resolution, approximate depth map for the left and right camera-
`view in near-real—time. The depth map for Figure 2 is shown here.
`
`Figure 5: The computed depth map for Figure 4.
`
`
`
`10
`
`10
`
`
`
`Figure 6: Real and virtual imagery are combined via the standard
`z-buffer algorithm. Here, a virtual sphere occludes and is occluded
`by real-world objects in the left camera view.
`
`Figure 8: The right view of the scene in Figure 6.
`
`(speeding up) the algorithm.
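The halving step can be sketched as follows. The paper only states that it subsamples to half resolution; averaging each 2x2 pixel block is one plausible filter, assumed here for illustration.

```python
import numpy as np

def subsample_half(image):
    """Halve a grayscale image's resolution by averaging each 2x2 block
    (an assumed filter; the paper does not specify which one it uses)."""
    h, w = image.shape
    img = image[:h - h % 2, :w - w % 2].astype(np.float32)  # trim odd edges
    return (img[0::2, 0::2] + img[0::2, 1::2] +
            img[1::2, 0::2] + img[1::2, 1::2]) / 4.0

frame = np.arange(16, dtype=np.uint8).reshape(4, 4)
small = subsample_half(frame)   # shape (2, 2)
```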
`
4.2 Phase Two

The second phase analyzes the vertical pixel spans of the subsampled video images for sudden changes in intensities; the vertical pixel span is split at the points at which such a change occurs. Figures 10 and 12 illustrate the result of this operation.
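For one image column, the splitting rule can be sketched as below; the intensity-change threshold is an illustrative parameter, not a value given in the paper.

```python
def split_column(column, threshold=20):
    """Split one vertical pixel span at sudden intensity changes.
    Returns (start, end) index pairs, end exclusive; the threshold
    is an assumed tuning value."""
    cuts = [0]
    for y in range(1, len(column)):
        if abs(int(column[y]) - int(column[y - 1])) > threshold:
            cuts.append(y)          # a new span segment starts at the jump
    cuts.append(len(column))
    return [(a, b) for a, b in zip(cuts, cuts[1:])]

# A column with one hard edge between dark (~10) and bright (~200) pixels:
spans = split_column([10, 12, 11, 200, 202, 201])
# → [(0, 3), (3, 6)]
```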
`
4.3 Phase Three

The third phase generates the individual areas or blocks. A block is part of a vertical pixel span whose length is delimited by the splits introduced in the second phase of the algorithm. Therefore, all the pixels belonging to a particular block vary little in intensity (otherwise the second phase would have generated a split). Accordingly, only a few parameters suffice to describe a block: its x and y position, its length, the average intensity of all its pixels, and the standard deviation of the intensities. To ensure that average intensity and standard deviation properly characterize a block, we impose a minimum length of 3 pixels.
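Building a block descriptor from one span segment can be sketched as follows; the dictionary layout is an illustrative choice, but the five parameters and the minimum length of 3 pixels are those stated above.

```python
import statistics

def make_block(x, start, pixels):
    """Describe a block (a piece of a vertical pixel span) by the few
    parameters the algorithm compares: x/y position, length, average
    intensity, and standard deviation of intensity. Blocks shorter
    than 3 pixels are rejected, per Section 4.3."""
    if len(pixels) < 3:
        return None
    return {
        "x": x,
        "y": start,
        "length": len(pixels),
        "mean": statistics.fmean(pixels),
        "std": statistics.pstdev(pixels),
    }

b = make_block(x=7, start=4, pixels=[100, 102, 101, 99])
# b["length"] == 4, b["mean"] == 100.5
```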
`
4.4 Phase Four

The fourth phase of our algorithm matches blocks in the left image to blocks in the right image and vice versa. We compare every given block with all blocks in the other image that share the same horizontal scan-lines to find the best match. (This is less work than a full search because the range of possible disparity values restricts the number of blocks we must examine; see Figure 11 and also Section 5.2.) Two blocks match if the differences in their y-position, their length, their average intensity, and their standard deviation are below preset tolerances. The resulting depth estimates for the left and right block are entered into the depth map for the left and right image, respectively. The differences in the matching blocks' parameters are also recorded and used to weight the depth
`
`
`
`
`
`
`
Figure 7: To visualize depth we let a virtual screen-aligned disk occlude and be occluded by real-world objects. Due to errors in the computed depth map for the video image, occlusion is not always resolved properly.

Figure 9: The virtual, screen-aligned disk in the right view.
`
`
`
`
estimate.

At the end of the fourth phase we have two depth maps, one for the left and one for the right image. Since not every block is guaranteed to match and since some blocks might match several times, each depth map has between zero and several depth entries for each pixel.
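The tolerance test and best-match selection of phase four can be sketched as below. The tolerance values and the combined closeness score are illustrative assumptions; the paper states only that all four differences must lie below preset tolerances and that the closest candidate wins.

```python
def match_block(block, candidates, tol=None):
    """Find the best-matching block among candidates on the block's
    scan-lines. Two blocks match when the differences in y-position,
    length, average intensity, and standard deviation are all below
    preset tolerances (values here are illustrative)."""
    tol = tol or {"y": 3, "length": 2, "mean": 10.0, "std": 5.0}
    best, best_score = None, float("inf")
    for cand in candidates:
        diffs = {k: abs(block[k] - cand[k]) for k in ("y", "length", "mean", "std")}
        if any(diffs[k] > tol[k] for k in diffs):
            continue                                   # fails a tolerance
        score = sum(diffs[k] / tol[k] for k in diffs)  # assumed closeness measure
        if score < best_score:
            best, best_score = cand, score
    return best

left = {"y": 10, "length": 5, "mean": 100.0, "std": 2.0}
good = {"y": 11, "length": 5, "mean": 104.0, "std": 2.5}
far  = {"y": 40, "length": 5, "mean": 100.0, "std": 2.0}
best = match_block(left, [far, good])   # 'far' fails the y tolerance
```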
`
4.5 Phase Five

In the fifth and final phase, we average multiple depth entries for each pixel in each depth map (using the earlier recorded weights). Pixels with no depth entries are interpolated from the neighboring entries in the same horizontal scan-line of the depth map.
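For a single scan-line of the depth map, phase five can be sketched as follows. Linear interpolation between the nearest filled neighbours is one plausible reading of "interpolated from the neighboring entries"; the paper does not specify the interpolation scheme.

```python
def resolve_depth_row(entries, width):
    """Phase five for one scan-line. entries[x] is a list of
    (depth, weight) pairs for pixel x; multiple entries are merged by
    weighted average, and empty pixels are linearly interpolated from
    their nearest filled neighbours in the row (assumed scheme)."""
    row = [None] * width
    for x, pairs in entries.items():
        total_w = sum(w for _, w in pairs)
        row[x] = sum(d * w for d, w in pairs) / total_w
    filled = [x for x in range(width) if row[x] is not None]
    for x in range(width):
        if row[x] is None:
            left = max((f for f in filled if f < x), default=None)
            right = min((f for f in filled if f > x), default=None)
            if left is None:
                row[x] = row[right]          # extend the nearest entry
            elif right is None:
                row[x] = row[left]
            else:
                t = (x - left) / (right - left)
                row[x] = row[left] * (1 - t) + row[right] * t
    return row

row = resolve_depth_row({0: [(2.0, 1.0), (4.0, 1.0)], 3: [(6.0, 2.0)]}, 4)
# pixel 0 averages its two entries to 3.0; pixels 1 and 2 are interpolated
```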
`
4.6 Critical Features

Several features of our algorithm are critical. First, blocks are part of vertical pixel scans. Compared to horizontal pixel scans, vertical pixel scans are less distorted by perspective foreshortening and occlusion differences in the left and right views. Therefore, matching is less error-prone.

Second, blocks are only one pixel wide. The same advantages as above apply.

Third, we search for matching blocks only along the same horizontal scan-line. If we assume that the camera's epi-polar lines roughly align with the horizontal scan-lines, then these blocks are the only possible matching candidates, even if we tilt the head (i.e., both cameras).

Fourth and last, the exceptional speed of our algorithm results from its ability to group pixels into blocks and then match those blocks by comparing only a few characterizing values (i.e., x- and y-position, length, average intensity, and standard deviation). On the other hand, these few values do not always characterize a block distinctively enough; hence matching is subject to error, and thus our algorithm does not always estimate depth correctly.
`
5 Extensions

Several techniques exist to increase accuracy and speed of the above basic algorithm. We describe these techniques here.
`
[Figure 11 diagram: for a given block to match, the search is limited to the range of possible disparity values and to blocks of roughly equal y-position and height; all other blocks are not even examined.]
`Figure 11: A given block matches only blocks that have roughly the
`same y-position, length, average intensity, and standard deviation.
`The block that matches most closely is selected for computation of
`the depth estimate.
`
5.1 Allowing for Inaccurate Alignment

Since our stereo camera pair is not fully calibrated (in particular, the epi-polar lines correspond only to a band of horizontal scan-lines), we adjust the matching algorithm to take inaccurate alignment into account. To match a block we therefore first find the scan-line that crosses its middle. We then consider all blocks that cross that scan-line (not necessarily in the middle) as possible candidates. A block matches the original block only if the difference in the vertical placement of their start- and end-points is within the alignment error, e.g., in our case within ±3 pixels.
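The candidate test just described can be sketched as below; representing a block by its (start, end) rows is an illustrative simplification.

```python
def is_alignment_candidate(block, other, align_err=3):
    """Candidate test from Section 5.1: 'other' must cross the scan-line
    through the middle of 'block', and its start and end points must sit
    within the vertical alignment error (±3 pixels for our cameras).
    Blocks are (y_start, y_end) row pairs, end exclusive."""
    mid = (block[0] + block[1]) // 2
    if not (other[0] <= mid < other[1]):
        return False                      # does not cross the middle scan-line
    return (abs(other[0] - block[0]) <= align_err and
            abs(other[1] - block[1]) <= align_err)

# A block spanning rows 10..20 tolerates a candidate shifted by 2 rows,
# but not one shifted by 5:
ok = is_alignment_candidate((10, 20), (12, 22))    # True
off = is_alignment_candidate((10, 20), (15, 25))   # False
```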
`
`5.2 Horizontal Depth Coherency
`When matching a block in the left image to blocks in the right
`image, it is unnecess