`
`DANA H. BALLARD CHRISTOPHER M. BROWN
`
`IPR2021-00921
`Apple EX1015 Page 1
`
`
`
`COMPUTER
`VISION
`
Dana H. Ballard
`Christopher M. Brown
`
`Department of Computer Science
`University of Rochester
`Rochester, New York
`
PRENTICE-HALL, INC., Englewood Cliffs, New Jersey 07632
`
`
`
`
Library of Congress Cataloging in Publication Data

Ballard, Dana Harry.
Computer vision.
Bibliography: p.
Includes index.
1. Image processing. I. Brown, Christopher M. II. Title.
TA1632.B34    621.38'0414    81-20974
ISBN 0-13-165316-4    AACR2
`
`Cover design by Robin Breite
`
© 1982 by Prentice-Hall, Inc.
Englewood Cliffs, New Jersey 07632
`
`All rights reserved. No part of this book
`may be reproduced in any form or by any means
`without permission in writing from the publisher.
`
`Printed in the United States of America
`
`10 9 8 7 6 5 4 3 2
`
ISBN 0-13-165316-4
`
PRENTICE-HALL INTERNATIONAL, INC., London
PRENTICE-HALL OF AUSTRALIA PTY. LIMITED, Sydney
PRENTICE-HALL OF CANADA, LTD., Toronto
PRENTICE-HALL OF INDIA PRIVATE LIMITED, New Delhi
PRENTICE-HALL OF JAPAN, INC., Tokyo
PRENTICE-HALL OF SOUTHEAST ASIA PTE. LTD., Singapore
WHITEHALL BOOKS LIMITED, Wellington, New Zealand
`
`
`
`
Preface xiii

Acknowledgments xv

Mnemonics for Proceedings and Special Collections Cited in References xix
`
1 COMPUTER VISION 1
1.1 Achieving Simple Vision Goals 2
1.2 High-Level and Low-Level Capabilities
1.3 A Range of Representations 6
1.4 The Role of Computers 9
1.5 Computer Vision Research and Applications 12
`
`
`
`
Part I
GENERALIZED IMAGES 13

2 IMAGE FORMATION 17
2.1 Images 18
2.2 Image Model
2.2.1 Image Functions, 18
2.2.2 Imaging Geometry, 19
2.2.3 Reflectance, 22
2.2.4 Spatial Properties, 24
2.2.5 Color, 31
2.2.6 Digital Images, 35
2.3 Imaging Devices for Computer Vision 42
2.3.1 Photographic Imaging, 44
2.3.2 Sensing Range, 52
2.3.3 Reconstruction Imaging, 56

3 EARLY PROCESSING 63
3.1 Recovering Intrinsic Structure
3.2 Filtering the Image 65
3.2.1 Template Matching, 65
3.2.2 Histogram Transformations, 70
3.2.3 Background Subtraction, 72
3.2.4 Filtering and Reflectance Models, 73
3.3 Finding Local Edges 75
3.3.1 Types of Edge Operators, 76
3.3.2 Edge Thresholding Strategies, 80
3.3.3 Three-Dimensional Edge Operators, 81
3.3.4 How Good Are Edge Operators? 83
3.3.5 Edge Relaxation, 85
3.4 Range Information from Geometry 88
3.4.1 Stereo Vision and Triangulation, 88
3.4.2 A Relaxation Algorithm for Stereo, 89
3.5 Surface Orientation from Reflectance Models 93
3.5.1 Reflectivity Functions, 93
3.5.2 Surface Gradient, 95
3.5.3 Photometric Stereo, 98
3.5.4 Shape from Shading by Relaxation, 99
3.6 Optical Flow 102
3.6.1 The Fundamental Flow Constraint, 102
3.6.2 Calculating Optical Flow by Relaxation, 103
3.7 Resolution Pyramids 106
3.7.1 Gray-Level Consolidation, 106
3.7.2 Pyramidal Structures in Correlation, 107
3.7.3 Pyramidal Structures in Edge Detection, 109
`
`
`
`
Part II
SEGMENTED IMAGES 115

4 BOUNDARY DETECTION 119
4.1 On Associating Edge Elements
4.2 Searching Near an Approximate Location 121
4.2.1 Adjusting A Priori Boundaries, 121
4.2.2 Non-Linear Correlation in Edge Space, 121
4.2.3 Divide-and-Conquer Boundary Detection, 122
4.3 The Hough Method for Curve Detection 123
4.3.1 Use of the Gradient, 124
4.3.2 Some Examples, 125
4.3.3 Trading Off Work in Parameter Space for Work in Image Space, 126
4.3.4 Generalizing the Hough Transform, 128
4.4 Edge Following as Graph Searching 131
4.4.1 Good Evaluation Functions, 133
4.4.2 Finding All the Boundaries, 133
4.4.3 Alternatives to the A Algorithm, 136
4.5 Edge Following as Dynamic Programming 137
4.5.1 Dynamic Programming, 137
4.5.2 Dynamic Programming for Images, 139
4.5.3 Lower Resolution Evaluation Functions, 141
4.5.4 Theoretical Questions about Dynamic Programming, 143
4.6 Contour Following 143
4.6.1 Extension to Gray-Level Images, 144
4.6.2 Generalization to Higher-Dimensional Image Data, 146

5 REGION GROWING 149
5.1 Regions 151
5.2 A Local Technique: Blob Coloring
5.3 Global Techniques: Region Growing via Thresholding
5.3.1 Thresholding in Multidimensional Space, 153
5.3.2 Hierarchical Refinement, 155
5.4 Splitting and Merging 155
5.4.1 State-Space Approach to Region Growing, 157
5.4.2 Low-Level Boundary Data Structures, 158
5.4.3 Graph-Oriented Region Structures, 159
5.5 Incorporation of Semantics 160
`
6 TEXTURE
6.1 What Is Texture? 166
6.2 Texture Primitives 169
6.3 Structural Models of Texel Placement 170
6.3.1 Grammatical Models, 172
6.3.2 Shape Grammars, 173
6.3.3 Tree Grammars, 175
6.3.4 Array Grammars, 178
6.4 Texture as a Pattern Recognition Problem 181
6.4.1 Texture Energy, 184
6.4.2 Spatial Gray-Level Dependence, 186
6.4.3 Region Texels, 188
6.5 The Texture Gradient 189
`
7 MOTION 195
7.1 Motion Understanding
7.1.1 Domain-Independent Understanding, 196
7.1.2 Domain-Dependent Understanding, 196
7.2 Understanding Optical Flow 199
7.2.1 Focus of Expansion, 199
7.2.2 Adjacency, Depth, and Collision, 201
7.2.3 Surface Orientation and Edge Detection, 202
7.2.4 Egomotion, 206
7.3 Understanding Image Sequences 207
7.3.1 Calculating Flow from Discrete Images, 207
7.3.2 Rigid Bodies from Motion, 210
7.3.3 Interpretation of Moving Light Displays—A Domain-Independent Approach, 214
7.3.4 Human Motion Understanding—A Model-Directed Approach, 217
7.3.5 Segmented Images, 220

Part III
GEOMETRICAL STRUCTURES 227

8 REPRESENTATION OF TWO-DIMENSIONAL GEOMETRIC STRUCTURES 231
8.1 Two-Dimensional Geometric Structures
8.2 Boundary Representations 232
8.2.1 Polylines, 232
8.2.2 Chain Codes, 235
8.2.3 The ψ-s Curve, 237
8.2.4 Fourier Descriptors, 238
8.2.5 Conic Sections, 239
8.2.6 B-Splines, 239
8.2.7 Strip Trees, 244
8.3 Region Representations 247
8.3.1 Spatial Occupancy Array, 247
8.3.2 y-Axis, 248
8.3.3 Quad Trees, 249
8.3.4 Medial Axis Transform, 252
8.3.5 Decomposing Complex Areas, 253
8.4 Simple Shape Properties 254
8.4.1 Area, 254
8.4.2 Eccentricity, 255
8.4.3 Euler Number, 255
8.4.4 Compactness, 256
8.4.5 Slope Density Function, 256
8.4.6 Signatures, 257
8.4.7 Concavity Tree, 258
8.4.8 Shape Numbers, 258
`
9 REPRESENTATION OF THREE-DIMENSIONAL STRUCTURES 264
9.1 Solids and Their Representation
9.2 Surface Representations 265
9.2.1 Surfaces with Faces, 265
9.2.2 Surfaces Based on Splines, 268
9.2.3 Surfaces That Are Functions on the Sphere, 270
9.3 Generalized Cylinder Representations 274
9.3.1 Generalized Cylinder Coordinate Systems and Properties, 275
9.3.2 Extracting Generalized Cylinders, 278
9.3.3 A Discrete Volumetric Version of the Skeleton, 279
9.4 Volumetric Representations 280
9.4.1 Spatial Occupancy, 280
9.4.2 Cell Decomposition, 281
9.4.3 Constructive Solid Geometry, 282
9.4.4 Algorithms for Solid Representations, 284
9.5 Understanding Line Drawings 291
9.5.1 Matching Line Drawings to Three-Dimensional Primitives, 293
9.5.2 Grouping Regions Into Bodies, 294
9.5.3 Labeling Lines, 296
9.5.4 Reasoning About Planes, 301

Part IV
RELATIONAL STRUCTURES 313

10 KNOWLEDGE REPRESENTATION AND USE
10.1 Representations 317
10.1.1 The Knowledge Base—Models and Processes, 318
10.1.2 Analogical and Propositional Representations, 319
10.1.3 Procedural Knowledge, 321
10.1.4 Computer Implementations, 322
10.2 Semantic Nets 323
10.2.1 Semantic Net Basics, 323
10.2.2 Semantic Nets for Inference, 327
10.3 Semantic Net Examples 334
10.3.1 Frame Implementations, 334
10.3.2 Location Networks, 335
10.4 Control Issues in Complex Vision Systems 340
10.4.1 Parallel and Serial Computation, 341
10.4.2 Hierarchical and Heterarchical Control, 341
10.4.3 Belief Maintenance and Goal Achievement, 346
`
11 MATCHING 352
11.1 Aspects of Matching 352
11.1.1 Interpretation: Construction, Matching, and Labeling, 352
11.1.2 Matching Iconic, Geometric, and Relational Structures, 353
11.2 Graph-Theoretic Algorithms 355
11.2.1 The Algorithms, 357
11.2.2 Complexity, 359
11.3 Implementing Graph-Theoretic Algorithms 360
11.3.1 Matching Metrics, 360
11.3.2 Backtrack Search, 363
11.3.3 Association Graph Techniques, 365
11.4 Matching in Practice 369
11.4.1 Decision Trees, 370
11.4.2 Decision Tree and Subgraph Isomorphism, 375
11.4.3 Informal Feature Classification, 376
11.4.4 A Complex Matcher, 378
`
12 INFERENCE 383
12.1 First-Order Predicate Calculus 384
12.1.1 Clause-Form Syntax (Informal), 384
12.1.2 Nonclausal Syntax and Logic Semantics (Informal), 385
12.1.3 Converting Nonclausal Form to Clauses, 387
12.1.4 Theorem Proving, 388
12.1.5 Predicate Calculus and Semantic Networks, 390
12.1.6 Predicate Calculus and Knowledge Representation, 392
12.2 Computer Reasoning 395
12.3 Production Systems 396
12.3.1 Production System Details, 398
12.3.2 Pattern Matching, 399
12.3.3 An Example, 401
12.3.4 Production System Pros and Cons, 406
12.4 Scene Labeling and Constraint Relaxation 408
12.4.1 Consistent and Optimal Labelings, 408
12.4.2 Discrete Labeling Algorithms, 410
12.4.3 A Linear Relaxation Operator and a Line Labeling Example, 415
12.4.4 A Nonlinear Operator, 419
12.4.5 Relaxation as Linear Programming, 420
12.5 Active Knowledge 430
12.5.1 Hypotheses, 431
12.5.2 HOW-TO and SO-WHAT Processes, 431
12.5.3 Control Primitives, 431
12.5.4 Aspects of Active Knowledge, 433

13 GOAL ACHIEVEMENT 438
13.1 Symbolic Planning 439
13.1.1 Representing the World, 439
13.1.2 Representing Actions, 441
13.1.3 Stacking Blocks, 442
13.1.4 The Frame Problem, 444
13.2 Planning with Costs 445
13.2.1 Planning, Scoring, and Their Interaction, 446
13.2.2 Scoring Simple Plans, 446
13.2.3 Scoring Enhanced Plans, 451
13.2.4 Practical Simplifications, 452
13.2.5 A Vision System Based on Planning, 453

APPENDICES 465
`
A1 SOME MATHEMATICAL TOOLS 465
A1.1 Coordinate Systems 465
A1.1.1 Cartesian, 465
A1.1.2 Polar and Polar Space, 465
A1.1.3 Spherical and Cylindrical, 466
A1.1.4 Homogeneous Coordinates, 467
A1.2 Trigonometry 468
A1.2.1 Plane Trigonometry, 468
A1.2.2 Spherical Trigonometry, 469
A1.3 Vectors 469
A1.4 Matrices 471
A1.5 Lines 474
A1.5.1 Two Points, 474
A1.5.2 Point and Direction, 474
A1.5.3 Slope and Intercept, 474
A1.5.4 Ratios, 474
A1.5.5 Normal and Distance from Origin (Line Equation), 475
A1.5.6 Parametric, 476
A1.6 Planes 476
A1.7 Geometric Transformations 477
A1.7.1 Rotation, 477
A1.7.2 Scaling, 478
A1.7.3 Skewing, 479
A1.7.4 Translation, 479
A1.7.5 Perspective, 479
A1.7.6 Transforming Lines and Planes, 480
A1.7.7 Summary, 480
A1.8 Camera Calibration and Inverse Perspective 481
A1.8.1 Camera Calibration, 482
A1.8.2 Inverse Perspective, 483
A1.9 Least-Squared-Error Fitting 484
A1.9.1 Pseudo-Inverse Method, 485
A1.9.2 Principal Axis Method, 486
A1.9.3 Fitting Curves by the Pseudo-Inverse Method, 487
A1.10 Conics 488
A1.11 Interpolation 489
A1.11.1 One-Dimensional, 489
A1.11.2 Two-Dimensional, 490
A1.12 The Fast Fourier Transform 490
A1.13 The Icosahedron 492
A1.14 Root Finding 493

A2 ADVANCED CONTROL MECHANISMS 497
A2.1 Standard Control Structures
A2.1.1 Recursion, 498
A2.1.2 Co-Routining, 498
A2.2 Inherently Sequential Mechanisms 499
A2.2.1 Automatic Backtracking, 499
A2.2.2 Context Switching, 500
A2.3 Sequential or Parallel Mechanisms 500
A2.3.1 Modules and Messages, 500
A2.3.2 Priority Job Queue, 502
A2.3.3 Pattern-Directed Invocation, 504
A2.3.4 Blackboard Systems, 505
`
`AUTHOR INDEX
`
`SUBJECT INDEX
`
`
`
`
`Preface
`
`The dream of intelligent automata goes back to antiquity; its first major articulation
`in the context of digital computers was by Turing around 1950. Since then, this
dream has been pursued primarily by workers in the field of artificial intelligence,
`whose goal is to endow computers with informationprocessing capabilities
`comparable to those of biological organisms. From the outset, one of the goals of
`artificial intelligence has been to equip machines with the capability of dealing with
`sensory inputs.
`Computer vision is the construction of explicit, meaningful descriptions of
`physical objects from images. Image understanding is very different from image
`processing, which studies imagetoimage transformations, not explicit description
`building. Descriptions are a prerequisite for recognizing, manipulating, and
`thinking about objects.
`We perceive a world of coherent threedimensional objects with many
`invariant properties. Objectively, the incoming visual data do not exhibit
`corresponding coherence or invariance; they contain much irrelevant or even
`misleading variation. Somehow our visual system, from the retinal to cognitive
`levels, understands, or imposes order on, chaotic visual input. It does so by using
`intrinsic information that may reliably be extracted from the input, and also through
`assumptions and knowledge that are applied at various levels in visual processing.
`The challenge of computer vision is one of explicitness. Exactly what
`information about scenes can be extracted from an image using only very basic
`assumptions about physics and optics? Explicitly, what computations must be
`performed? Then, at what stage must domaindependent, prior knowledge about
`the world be incorporated into the understanding process? How are world models
`and knowledge represented and used? This book is about the representations and
`mechanisms that allow image information and prior knowledge to interact in image
`understanding.
Computer vision is a relatively new and fast-growing field. The first
`experiments were conducted in the late 1950s, and many of the essential concepts
`
`
`
`
have been developed during the last five years. With this rapid growth, crucial ideas
have arisen in disparate areas such as artificial intelligence, psychology, computer
graphics, and image processing. Our intent is to assemble a selection of this material
in a form that will serve both as a senior/graduate-level academic text and as a
`useful reference to those building vision systems. This book has a strong artificial
`intelligence flavor, and we hope this will provoke thought. We believe that both the
`intrinsic image information and the internal model of the world are important in
`successful vision systems.
`The book is organized into four parts, based on descriptions of objects at four
`different levels of abstraction.
`
`1. Generalized images—images and imagelike entities.
`2. Segmented images—images organized into subimages that are likely to
`correspond to "interesting objects."
`3. Geometric structures—quantitative models of image and world structures.
`4. Relational structures—complex symbolic descriptions of image and world
`structures.
`The parts follow a progression of increasing abstractness. Although the four
`parts are most naturally studied in succession, they are not tightly interdependent. Part
`I is a prerequisite for Part II, but Parts III and IV can be read independently.
`Parts of the book assume some mathematical and computing background
`(calculus, linear algebra, data structures, numerical methods). However, throughout
the book mathematical rigor takes a backseat to concepts. Our intent is to transmit a
set of ideas about a new field to the widest possible audience.
`In one book it is impossible to do justice to the scope and depth of prior work in
computer vision. Further, we realize that in a fast-developing field, the rapid influx of
new ideas will continue. We hope that our readers will be challenged to think, criticize,
read further, and quickly go beyond the confines of this volume.
`
`
`
`
`Acknowledgments
`
`Jerry Feldman and Herb Voelcker (and through them the University of Rochester)
`provided many resources for this work. One of the most important was a capable
`and forgiving staff (secretarial, technical, and administrative). For massive text
`editing, valuable advice, and good humor we are especially grateful to Rose Peet.
`Peggy Meeker, Jill Orioli, and Beth Zimmerman all helped at various stages.
`Several colleagues made suggestions on early drafts: thanks to James Allen,
Norm Badler, Larry Davis, Takeo Kanade, John Kender, Daryl Lawton, Joseph
`O'Rourke, Ari Requicha, Ed Riseman, Azriel Rosenfeld, Mike Schneier, Ken
`Sloan, Steve Tanimoto, Marty Tenenbaum, and Steve Zucker.
`Graduate students helped in many different ways: thanks especially to Michel
`Denber, Alan Frisch, Lydia Hrechanyk, Mark Kahrs, Keith Lantz, Joe Maleson,
`Lee Moore, Mark Peairs, Don Perlis, Rick Rashid, Dan Russell, Dan Sabbah, Bob
`Schudy, Peter Selfridge, Uri Shani, and Bob Tilove. Bernhard Stuth deserves special
`mention for much careful and critical reading.
`Finally, thanks go to Jane Ballard, mostly for standing steadfast through the
cycles of elation and depression and for numerous engineering-to-English translations.
`As Pat Winston put it: "A willingness to help is not an implied endorsement."
`The aid of others was invaluable, but we alone are responsible for the opinions,
`technical details, and faults of this book.
Funding assistance was provided by the Sloan Foundation under Grant 78-4-15, by the
National Institutes of Health under Grant HL21253, and by the Defense Advanced
Research Projects Agency under Grant N00014-78-C-0164.
`The authors wish to credit the following sources for figures and tables. For
`complete citations given here in abbreviated form (as "from . . ." or "after . . ."),
`refer to the appropriate chapterend references.
`
`Fig. 1.2 from Shani, U., "A 3D modeldriven system for the recognition of abdominal
`anatomy from CT scans," TR77, Dept. of Computer Science, University of Rochester, May
`1980.
`
`
`
`
`Fig. 1.4 courtesy of Allen Hanson and Ed Riseman, COINS Research Project, University of
`Massachusetts, Amherst, MA.
`Fig. 2.4 after Horn and Sjoberg, 1978.
`Figs. 2.5, 2.9, 2.10, 3.2, 3.6, and 3.7 courtesy of Bill Lampeter.
`Fig. 2.7a painting by Louis Condax; courtesy of Eastman Kodak Company and the Optical
`Society of America.
`Fig. 2.8a courtesy of D. Greenberg and G. Joblove, Cornell Program of Computer Graphics.
`Fig. 2.8b courtesy of Tom Check.
`Table 2.3 after Gonzalez and Wintz, 1977.
`Fig. 2.18 courtesy of EROS Data Center, Sioux Falls, SD.
`Figs. 2.19 and 2.20 from Herrick, C.N., Television Theory and Servicing: Black/White and
`Color, 2nd Ed. Reston, VA: Reston, 1976.
`Figs. 2.21, 2.22, 2.23, and 2.24 courtesy of Michel Denber.
`Fig. 2.25 from Popplestone et al., 1975.
`Fig. 2.26 courtesy of Production Automation Project, University of Rochester.
`Fig. 2.27 from Waag and Gramiak, 1976.
`Fig. 3.1 courtesy of Marty Tenenbaum.
`Fig. 3.8 after Horn, 1974.
`Figs. 3.14 and 3.15 after Frei and Chen, 1977.
Figs. 3.17 and 3.18 from Zucker, S.W. and R.A. Hummel, "An optimal 3D edge operator,"
IEEE Trans. PAMI-3, May 1981, pp. 324-331.
`Fig. 3.19 curves are based on data in Abdou, 1978.
Figs. 3.20, 3.21, and 3.22 from Prager, J.M., "Extracting and labeling boundary segments in
natural scenes," IEEE Trans. PAMI-2, 1, January 1980. © 1980 IEEE.
`Figs. 3.23, 3.28, 3.29, and 3.30 courtesy of Berthold Horn.
Figs. 3.24 and 3.26 from Marr, D. and T. Poggio, "Cooperative computation of stereo
disparity," Science, Vol. 194, 1976, pp. 283-287. © 1976 by the American Association for the
Advancement of Science.
Fig. 3.31 from Woodham, R.J., "Photometric stereo: A reflectance map technique for
determining surface orientation from image intensity," Proc. SPIE, Vol. 155, August 1978.
`Figs. 3.33 and 3.34 after Horn and Schunck, 1980.
Fig. 3.37 from Tanimoto, S. and T. Pavlidis, "A hierarchical data structure for picture
processing," CGIP 4, 2, June 1975, pp. 104-119.
`Fig. 4.6 from Kimme et al., 1975.
`Figs. 4.7 and 4.16 from Ballard and Sklansky, 1976.
`Fig. 4.9 courtesy of Dana Ballard and Ken Sloan.
Figs. 4.12 and 4.13 from Ramer, U., "Extraction of line structures from photographs of curved
objects," CGIP 4, 2, June 1975, pp. 81-103.
`Fig. 4.14 courtesy of Jim Lester, Tufts/New England Medical Center.
Fig. 4.17 from Chien, Y.P. and K.S. Fu, "A decision function method for boundary
detection," CGIP 3, 2, June 1974, pp. 125-140.
Fig. 5.3 from Ohlander, R., K. Price, and D.R. Reddy, "Picture segmentation using a
recursive region splitting method," CGIP 8, 3, December 1979.
`Fig. 5.4 courtesy of Sam Kapilivsky.
Figs. 6.1, 11.16, and A1.13 courtesy of Chris Brown.
`Fig. 6.3 courtesy of Joe Maleson and John Kender.
`Fig. 6.4 from Connors, 1979. Texture images by Phil Brodatz, in Brodatz, Textures. New
`York: Dover, 1966.
`Fig. 6.9 texture image by Phil Brodatz, in Brodatz, Textures. New York: Dover, 1966.
Figs. 6.11, 6.12, and 6.13 from Lu, S.Y. and K.S. Fu, "A syntactic approach to texture
analysis," CGIP 7, 3, June 1978, pp. 303-330.
`
`
`
`
Fig. 6.14 from Jayaramamurthy, S.N., "Multilevel array grammars for generating texture
scenes," Proc. PRIP, August 1979, pp. 391-398. © 1979 IEEE.
`Fig. 6.20 from Laws, 1980.
`Figs. 6.21 and 6.22 from Maleson et al., 1977.
`Fig. 6.23 courtesy of Joe Maleson.
`Figs. 7.1 and 7.3 courtesy of Daryl Lawton.
`Fig. 7.2 after Prager, 1979.
Figs. 7.4 and 7.5 from Clocksin, W.F., "Computer prediction of visual thresholds for surface
slant and edge detection from optical flow fields," Ph.D. dissertation, University of
Edinburgh, 1980.
`Fig. 7.7 courtesy of Steve Barnard and Bill Thompson.
`Figs. 7.8 and 7.9 from Rashid, 1980.
`Fig. 7.10 courtesy of Joseph O'Rourke.
`Figs. 7.11 and 7.12 after Aggarwal and Duda, 1975.
`Fig. 7.13 courtesy of HansHellmut Nagel.
`Fig. 8.Id after Requicha, 1977.
Figs. 8.2, 8.3, 8.21a, 8.22, and 8.26 after Pavlidis, 1977.
`Figs. 8.10, 8.11, 9.6, and 9.16 courtesy of Uri Shani.
`Figs. 8.12, 8.13, 8.14, 8.15, and 8.16 from Ballard, 1981.
Fig. 8.21b from Preston, K., Jr., M.J.B. Duff, S. Levialdi, P.E. Norgren, and J.-I. Toriwaki,
"Basics of cellular logic with some applications in medical image processing," Proc. IEEE,
Vol. 67, No. 5, May 1979, pp. 826-856.
`Figs. 8.25, 9.8, 9.9, 9.10, and 11.3 courtesy of Robert Schudy.
`Fig. 8.29 after Bribiesca and Guzman, 1979.
`Figs. 9.1, 9.18, 9.19, and 9.27 courtesy of Ari Requicha.
`Fig. 9.2 from Requicha, A.A.G., "Representations for rigid solids: theory, methods,
`systems," Computer Surveys 12, 4, December 1980.
`Fig. 9.3 courtesy of Lydia Hrechanyk.
`Figs. 9.4 and 9.5 after Baumgart, 1972.
`Fig. 9.7 courtesy of Peter Selfridge.
`Fig. 9.11 after Requicha, 1980.
Figs. 9.14 and 9.15b from Agin, G.J. and T.O. Binford, "Computer description of curved
objects," IEEE Trans. on Computers 25, 1, April 1976.
`Fig. 9.15a courtesy of Gerald Agin.
`Fig. 9.17 courtesy of A. Christensen; published as frontispiece of ACM SIGGRAPH 80
`Proceedings.
`Fig. 9.20 from Marr and Nishihara, 1978.
`Fig. 9.21 after Tilove, 1980.
`Fig. 9.22b courtesy of Gene Hartquist.
`Figs. 9.24, 9.25, and 9.26 from Lee and Requicha, 1980.
Figs. 9.28a, 9.29, 9.30, 9.31, 9.32, 9.35, and 9.37 and Table 9.1 from Brown, C. and R.
Popplestone, "Cases in scene analysis," in Pattern Recognition, ed. B.G. Batchelor. New York:
Plenum, 1978.
Fig. 9.28b from Guzman, A., "Decomposition of a visual scene into three-dimensional bodies,"
in Automatic Interpretation and Classification of Images, A. Grasseli, ed., New York:
Academic Press, 1969.
Fig. 9.28c from Waltz, D., "Understanding line drawings of scenes with shadows," in The
Psychology of Computer Vision, ed. P.H. Winston. New York: McGraw-Hill, 1975.
`Fig. 9.28d after Turner, 1974.
`Figs. 9.33, 9.38, 9.40, 9.42, 9.43, and 9.44 after Mackworth, 1973.
`
`
`
`
`Figs. 9.39, 9.45, 9.46, and 9.47 and Table 9.2 after Kanade, 1978.
`Figs. 10.2 and A2.1 courtesy of Dana Ballard.
`Figs. 10.16, 10.17, and 10.18 after Russell, 1979.
`Fig. 11.5 after Fischler and Elschlager, 1973.
`Fig. 11.8 after Ambler et al., 1975.
Fig. 11.10 from Winston, P.H., "Learning structural descriptions from examples," in The
Psychology of Computer Vision, ed. P.H. Winston. New York: McGraw-Hill, 1975.
`Fig. 11.11 from Nevatia, 1974.
`Fig. 11.12 after Nevatia, 1974.
`Fig. 11.17 after Barrow and Popplestone, 1971.
Fig. 11.18 from Davis, L.S., "Shape matching using relaxation techniques," IEEE Trans.
PAMI-1, 4, January 1979, pp. 60-72.
`Figs. 12.4 and 12.5 from Sloan and Bajcsy, 1979.
`Fig. 12.6 after Barrow and Tenenbaum, 1976.
Fig. 12.8 after Freuder, 1978.
Fig. 12.10 from Rosenfeld, A., R.A. Hummel, and S.W. Zucker, "Scene labeling by relaxation
operations," IEEE Trans. SMC 6, 6, June 1976, p. 420.
`Figs. 12.11, 12.12, 12.13, 12.14, and 12.15 after Hinton, 1979.
`Fig. 13.3 courtesy of Aaron Sloman.
`Figs. 13.6, 13.7, and 13.8 from Garvey, 1976.
Fig. A1.11 after Duda and Hart, 1973.
`Figs. A2.2 and A2.3 from Hanson, A.R. and E.M. Riseman, "VISIONS: A computer system
`for interpreting scenes," in Computer Vision Systems, ed. A.R. Hanson and E.M. Riseman.
`New York: Academic Press, 1978.
`
`
`
`
`Mnemonics
`for Proceedings and Special Collections
`Cited in the References
`
CGIP
Computer Graphics and Image Processing

COMPSAC
IEEE Computer Society's 3rd International Computer Software and Applications Conference, Chicago, November 1979.

CVS
Hanson, A. R. and E. M. Riseman (Eds.). Computer Vision Systems. New York: Academic Press, 1978.

DARPA IU
Defense Advanced Research Projects Agency Image Understanding Workshop, Minneapolis, MN, April 1977.
Defense Advanced Research Projects Agency Image Understanding Workshop, Palo Alto, CA, October 1977.
Defense Advanced Research Projects Agency Image Understanding Workshop, Cambridge, MA, May 1978.
Defense Advanced Research Projects Agency Image Understanding Workshop, Carnegie-Mellon University, Pittsburgh, PA, November 1978.
Defense Advanced Research Projects Agency Image Understanding Workshop, University of Maryland, College Park, MD, April 1980.

IJCAI
2nd International Joint Conference on Artificial Intelligence, Imperial College, London, September 1971.
4th International Joint Conference on Artificial Intelligence, Tbilisi, Georgia, USSR, September 1975.
5th International Joint Conference on Artificial Intelligence, MIT, Cambridge, MA, August 1977.
6th International Joint Conference on Artificial Intelligence, Tokyo, August 1979.

IJCPR
2nd International Joint Conference on Pattern Recognition, Copenhagen, August 1974.
3rd International Joint Conference on Pattern Recognition, Coronado, CA, November 1976.
4th International Joint Conference on Pattern Recognition, Kyoto, November 1978.
5th International Joint Conference on Pattern Recognition, Miami Beach, FL, December 1980.

MI4
Meltzer, B. and D. Michie (Eds.). Machine Intelligence 4. Edinburgh: Edinburgh University Press, 1969.

MI5
Meltzer, B. and D. Michie (Eds.). Machine Intelligence 5. Edinburgh: Edinburgh University Press, 1970.

MI6
Meltzer, B. and D. Michie (Eds.). Machine Intelligence 6. Edinburgh: Edinburgh University Press, 1971.

MI7
Meltzer, B. and D. Michie (Eds.). Machine Intelligence 7. Edinburgh: Edinburgh University Press, 1972.

PCV
Winston, P. H. (Ed.). The Psychology of Computer Vision. New York: McGraw-Hill, 1975.

PRIP
IEEE Computer Society Conference on Pattern Recognition and Image Processing, Chicago, August 1979.
`
`
`
`
1

Computer Vision Issues
`
`1.1 ACHIEVING SIMPLE VISION GOALS
`
Suppose that you are given an aerial photo such as that of Fig. 1.1a and asked to
locate ships in it. You may never have seen a naval vessel in an aerial photograph
before, but you will have no trouble predicting generally how ships will appear. You
might reason that you will find no ships inland, and so turn your attention to ocean
areas. You might be momentarily distracted by the glare on the water, but realizing
that it comes from reflected sunlight, you perceive the ocean as continuous and
flat. Ships on the open ocean stand out easily (if you have seen ships from the air,
you know to look for their wakes). Near the shore the image is more confusing, but
you know that ships close to shore are either moored or docked. If you have a map
(Fig. 1.1b), it can help locate the docks (Fig. 1.1c); in a low-quality photograph it
can help you identify the shoreline. Thus it might be a good investment of your
time to establish the correspondence between the map and the image. A search
parallel to the shore in the dock areas reveals several ships (Fig. 1.1d).
Again, suppose that you are presented with a set of computer-aided tomographic
(CAT) scans showing "slices" of the human abdomen (Fig. 1.2a). These
images are products of high technology, and give us views not normally available
even with x-rays. Your job is to reconstruct from these cross sections the three-dimensional
shape of the kidneys. This job may well seem harder than finding
ships. You first need to know what to look for (Fig. 1.2b), where to find it in CAT
scans,