[Front cover: PATTERN RECOGNITION: Statistical, Structural and Neural Approaches. Robert Schalkoff.]

ABOUT THE TEXT

This self-contained text is structured in four main sections.

• Part 1 (Chapter 1) introduces the overall subject by asking 'What is pattern recognition?' and provides an overview of pattern recognition concerns which are common to all approaches. The three approaches are then briefly introduced and compared.

• Part 2 (Chapters 2-5) concerns the statistical approach, beginning with a simple example and ending with unsupervised learning through clustering.

• Part 3 (Chapters 6-9) develops the syntactic or structural approach, beginning with a look at the representational capability of string grammars and corresponding recognition approaches (parsing), continuing with an exploration of higher-dimensional representations and graphical approaches, and ending with an introduction to supervised learning (grammatical inference).

• Part 4 (Chapters 10-13) details the emerging neural approach, beginning with an overview of neural computing basics, an examination of pattern associators and feedforward nets and associated training algorithms, and concluding with unsupervised learning implementations.

In each case, following development of the theory, examples are shown to illustrate the concepts. Each chapter contains a concluding section which cites relevant literature for more in-depth study of specific topics; approximately 250 references are cited.

Diagram of a Neural Net (experimental), 1989. Collection, The Museum of Modern Art, New York. Gift of Synaptics, Incorporated. Photograph © 1992 The Museum of Modern Art, New York.
Pattern Recognition: Statistical, Structural and Neural Approaches

Robert J. Schalkoff
Clemson University

John Wiley & Sons, Inc.
New York • Chichester • Brisbane • Toronto • Singapore
Steven Elliott, Acquisitions Editor
Genevieve Scandone, Copy Editor
Linda Muriello, Production Manager
Savoula Amanatidis, Senior Production Supervisor
Sigmund Malinowski, Illustration Coordinator
Lorraine Fumoso, Manufacturing Manager

Recognizing the importance of preserving what has been written, it is a policy of John Wiley & Sons, Inc. to have books of enduring value published in the United States printed on acid-free paper, and we exert our best efforts to that end.

Copyright © 1992, by John Wiley & Sons, Inc.
All rights reserved. Published simultaneously in Canada.

Reproduction or translation of any part of this work beyond that permitted by Sections 107 and 108 of the 1976 United States Copyright Act without the permission of the copyright owner is unlawful. Requests for permission or further information should be addressed to the Permissions Department, John Wiley & Sons.

Library of Congress Cataloging in Publication Data:
Schalkoff, Robert J.
Pattern recognition : statistical, structural and neural approaches / Robert J. Schalkoff.
p. cm.
Includes bibliographical references and index.
ISBN 0-471-52974-5 (cloth)
1. Pattern perception--Statistical methods. I. Title.
Q327.S27 1992
006.4--dc20
91-4751 CIP

Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
About The Author

Robert J. Schalkoff is Professor of Electrical and Computer Engineering at Clemson University in Clemson, South Carolina. His primary research interests are in artificial intelligence and computer vision, with a special emphasis on model-based image understanding, motion and stereo vision, and associated computer architectures. He is also the author of Digital Image Processing and Computer Vision (John Wiley & Sons, Inc., 1989) and Artificial Intelligence: An Engineering Approach (McGraw-Hill, 1990). He received the Ph.D. degree in Electrical Engineering from the University of Virginia in 1979.
Preface

The art of being wise is the art of knowing what to overlook.
The Principles of Psychology [1890], Chapter 22
William James 1842-1910

This book brings together an identifiable 'core' of pattern recognition (PR) ideas, techniques, and applications using statistical, syntactic and neural approaches. It is intended for beginning graduate/advanced undergraduate students as well as practicing engineers and scientists. It is the outgrowth of a graduate course taught in the Electrical and Computer Engineering Department at Clemson University. Like many similar medium-sized and 'mainstream' departments, we realized the need to introduce graduate students to the growing inventory of PR topics found in this book, including a structured introduction to neural networks for PR applications. The text is suitable for use in a one- or two-semester course and may be supplemented by individual student projects and readings from the current literature. Since there is more material than could comfortably be covered in a single semester, individual instructors may choose to emphasize or deemphasize specific topics.

As with most technical disciplines, the concept of 'pattern recognition' means different things to different people. PR researchers are typically fragmented into disjoint communities. It is my belief that PR is becoming a mature technology with roots in diverse areas such as estimation theory, formal languages, and optimization theory. In addition, the interfaces between pattern recognition and other technologies and application areas, such as artificial intelligence and image processing, are quite fuzzy. All three PR approaches (statistical, syntactic, and neural) are currently receiving considerable attention and often share overlapping themes. The choice of one over another is a function of many variables, including constraints of the specific application and the background of the PR system designer. Therefore, I have attempted an exposition that illustrates the similarities and differences among the three approaches. Although statistical and syntactic pattern recognition are different, they share common ideas. An example is the use of formal models for pattern generation, which leads to formal solution procedures for classification/description. In addition, neural network-based approaches share a number of characteristics with 'traditional' pattern recognition techniques. They include hierarchical structures, clustering, pattern association, and learning or training.

I have also tried to emphasize an 'engineering' approach to PR, since I believe we should not be teaching PR as esoteric mathematics or as the 'art' of developing a set of problem-solving heuristics. An understanding of the underlying models and techniques and their respective limitations is a fundamental aspect of PR system design. Like many engineering and scientific disciplines, PR system design forces the designer to consider trade-offs between exact solutions to approximate models and approximate solutions to exact models. One of the caveats I stress to students is 'In PR problem formulations, you will get what you ask for in the solution, whether you recognize it or not.' This allows students to appreciate a number of mathematical results that also appeal to intuition. Examples are as follows:

• When a statistical pattern recognition-based algorithm changes decision regions to account for a change in a priori probabilities.

• When a grammatical approach to describing the structure inherent in a pattern yields a context-sensitive grammar.

• When a feedforward-structure neural net, after training with a well-known algorithm, develops an internal layer that allows the attachment of semantics to the action of the internal units.

After completing this book, students should have a good grounding in the fundamentals of PR, exposure to three diverse PR approaches, and some perspective on the applicability of each approach. They should also be able to begin reading the current literature.

Opportunities abound for the application of pattern recognition techniques to present-day problems. Current examples are automated character recognition, digital image processing and analysis, and recognition and understanding of speech. In fact, many pattern recognition applications arise in situations where they are not expected. Examples of these are weather forecasting and stock market analysis. Individual instructors may wish to emphasize PR application in their respective research domains. My own experience with computer vision problems provided a wealth of examples, some of which are used in this text.

An attempt was made to keep the presentation balanced. The text has four main sections. Part 1 (Chapter 1) introduces the overall subject by asking the question What is pattern recognition? It provides an overview of pattern recognition concerns that are common to all approaches. The three approaches are then briefly introduced and compared. Part 2 (Chapters 2-5) concerns the statistical approach, beginning with a simple example and ending with unsupervised learning through clustering. Part 3 (Chapters 6-9) develops the syntactic or structural approach, beginning with a look at the representational capability of string grammars and corresponding recognition approaches (parsing), continuing with an exploration of higher dimensional representations and graphical approaches, and ending with an introduction to supervised learning (grammatical inference). Finally, Part 4 (Chapters 10-13) details the emerging neural approach, beginning with an overview of neural computing basics, an examination of pattern associators and feedforward nets and associated training algorithms, and concluding with unsupervised learning implementations. In each case, following development of the theory, examples are shown to illustrate the concepts. Each chapter contains a concluding section that cites relevant literature for more in-depth study of specific topics. Approximately 250 references are cited.

Some background in probability, linear algebra, and discrete mathematics is necessary to fully appreciate the topics considered in this book. The appendices document these and other fundamental background concepts and enable the book to be self-contained. In addition, numerous exercises are presented that will challenge and motivate the reader to further explore relevant concepts. Many of these exercises provide the bases for projects and thesis work. Instructors should contact the publisher regarding the availability of a solution manual.

This book completes a trilogy of books I have written over the last three years. The other two are Digital Image Processing and Computer Vision (John Wiley & Sons, Inc.) and Artificial Intelligence: An Engineering Approach (McGraw-Hill). They cover a broad area of computing technology that I believe (or hope) is likely to receive widespread attention as it continues to mature during the '90s.

Acknowledgments

A textbook does not just 'happen,' but rather requires a significant commitment from its author as well as a supporting environment. I have been quite fortunate in this regard. My wife, Leslie, functioned as 'the ultimate pattern recognition system' in transcribing pages of scribbling into an attractive TeX document. Students who used earlier versions of the notes provided both valuable feedback and enhanced motivation, since they proved that the educational objective of this book could be achieved. The environment in the Department of Electrical and Computer Engineering and the College of Engineering at Clemson University was also conducive to this task. The efforts of the professional staff at John Wiley and Sons, especially Steven Elliott and Savoula Amanatidis, deserve special thanks. Finally, in watching my daughter Katie grow, I realize there are lots of things we'll never want to automate.

Robert J. Schalkoff
Clemson, South Carolina
Contents

PART 1 INTRODUCTION AND GENERAL PATTERN RECOGNITION CONCERNS

CHAPTER 1 PATTERN RECOGNITION (PR) OVERVIEW
  What is This Book About?
  Overview
  'Engineering' Approaches to Pattern Recognition
  Relationship of PR to Other Areas
  Pattern Recognition Applications
  Pattern Recognition, Classification, and Description
  Abstract Representation of Pattern Mappings
  Structure of a 'Typical' PR System
  Patterns and Feature Extraction with Examples
  Patterns and Features
  Pattern Distortions: A Fundamental and Difficult Problem
  Example: Feature Extraction Using Generalized Cylinders for 3-D Object Description and Classification
  Example: Generating RST Invariant Features and Application to 2-D Figure Recognition
  Feature Extraction
  Numerical Results and Analysis
  Analysis
  What's Really Different Among Different Patterns?
  The Feature Vector and Feature Space
  Definitions
  Classifiers, Decision Regions and Boundaries, and Discriminant Functions
  Training and Learning in PR Systems
  Using A Priori Knowledge or 'Experience'
  Learning Curves (Improved 'Performance')
  Training Approaches
  Pattern Recognition Approaches
  The Statistical Pattern Recognition Approach (StatPR)
  The Syntactic Pattern Recognition Approach (SyntPR)
  The Neural Pattern Recognition Approach (NeurPR)
  Comparing and Relating StatPR, SyntPR and NeurPR
  Examples of Pattern Recognition Approaches
  StatPR
  SyntPR
  Example: Combinations of StatPR and SyntPR
  Procedures for PR System Engineering
  Other Approaches to PR
  'Black Box' Approaches
  Reasoning-Driven Pattern Recognition
  Overview of PR Literature and Resources
  Books
  Journals
  Conferences
  Exercises

PART 2 STATISTICAL PATTERN RECOGNITION (StatPR)

CHAPTER 2 INTRODUCTION TO STATISTICAL PATTERN RECOGNITION
  Introduction to Statistical Pattern Recognition
  Approaches to Developing StatPR Classifiers
  Simple Examples of Statistical Models and Applications
  The Gaussian Case and Class Dependence
  Gaussian Models for p(x|w_i)
  Discriminant Functions
  Generalized Results (Gaussian Case)
  Decision Surfaces for Specific Cases
  A General Look at the StatPR Approach Structure
  Additional Examples
  Example: Uniform Densities, c = 2, d = 1
  Example: c = 3, d = 2, Gaussian Case
  Extensions
  Training
  Alternative Classification Procedures
  Unsupervised Approaches
  Classifier Performance, Risk, and Errors
  Measures of Classification Performance
  General Measures of Classification Risk
  Bibliographical Remarks
  Exercises

CHAPTER 3 SUPERVISED LEARNING (TRAINING) USING PARAMETRIC AND NONPARAMETRIC APPROACHES
  Introduction
  Parametric Estimation and Supervised Learning
  Approaches to Parameter Estimation
  Supervised versus Unsupervised Learning
  Maximum Likelihood (ML) Estimation
  Formulation
  Use of the Training Set
  The Likelihood Function
  The Bayesian Parameter Estimation Approach
  Supervised Learning Using Nonparametric Approaches
  Nonparametric Approaches
  General Aspects of Nonparametric Density Estimation
  The Basic Approach
  Parzen Windows
  Unit Step Function φ(x)
  Extension to More General Interpolation Functions
  k-nn Nonparametric Estimation
  A Problem with V_n
  Estimation of P(w_i|x) Directly
  Examples of Nonparametric Learning
  Direct Classification Using the Training Set [The Nearest Neighbor Rule (NNR)]
  The NNR Approach
  Bibliographical Remarks
  Exercises

CHAPTER 4 LINEAR DISCRIMINANT FUNCTIONS AND THE DISCRETE AND BINARY FEATURE CASES
  Introduction
  Linear Discriminant Functions
  Fisher's Linear Discriminant
  Discrete and Binary Classification Problems
  Classification Procedures for Discrete Feature Data
  Formulation for the Binary Case
  Classification Procedures for the Binary Case
  Nonparametric Estimation of P(x|w_i)
  Techniques to Directly Obtain Linear Classifiers
  The Concept of Linear Separability
  Design of Linear Classifiers
  Bibliographical Remarks
  Exercises

CHAPTER 5 UNSUPERVISED LEARNING AND CLUSTERING
  Formulation of Unsupervised Learning Problems
  A Modified Training Set
  Unsupervised Learning Approaches
  Clustering for Unsupervised Learning and Classification
  The Clustering Concept and the Search for 'Natural Clusters'
  The c-means Algorithm
  Learning Vector Quantization (LVQ)
  Formal Characterization of General Clustering Procedures
  Clustering Strategies
  'Cluster Swapping' Approaches
  Extended Example: A Hierarchical Clustering Procedure
  Bibliographical Remarks
  Exercises

PART 3 SYNTACTIC PATTERN RECOGNITION (SyntPR)

CHAPTER 6 OVERVIEW
  Syntactic Pattern Recognition (SyntPR) Overview
  Quantifying Structure in Pattern Description and Recognition
  Hierarchical Approaches
  Relational Models
  Grammar-Based Approach and Applications
  Using Formal Linguistics to Model and Describe Patterns
  Structural Analysis and Description Applications
  Elements of Formal Grammars
  Definitions and Conventions
  Grammars and Languages Using Strings
  Grammar Application Modes
  Grammar Types and Productions
  Other Production Constraints in String Grammars
  Additional Introductory Concepts
  Example: Context-Sensitive, Context-Free, and Finite State (Regular) Grammar Productions
  Examples of String Generation as Pattern Description
  Example: 2-D Line Drawing Description Grammar [Shaw 1970]
  Example: Character Description Using PDL
  Example: Object Description Using Projected Cylinder Models
  Extended Example: Blocks World Description
  Remarks on the Heuristic Generation of Grammars
  Bibliographical Remarks
  Exercises

CHAPTER 7 SYNTACTIC RECOGNITION VIA PARSING AND OTHER GRAMMARS
  Recognition of Syntactic Descriptions
  Recognition by String Matching
  Recognition by Parsing
  Parsing
  Fundamental Concepts
  An Abstract View of the Parsing Problem
  Parsing Approaches
  The Cocke-Younger-Kasami (CYK) Parsing Algorithm
  The Approach
  The CYK Table
  (Augmented) Transition Networks (ATNs) in Parsing
  Transition Networks
  Augmented Transition Nets (ATNs)
  Higher Dimensional Grammars
  Introduction
  Tree Grammars
  Traversing and Describing Trees
  Productions and Derivations with Trees
  Stochastic Grammars and Applications
  Stochastic Grammars
  Derivations in a Stochastic Language
  Bibliographical Remarks
  Exercises

CHAPTER 8 GRAPHICAL APPROACHES TO SyntPR
  Graph-Based Structural Representations
  From Digraphs to Semantic Nets to Relational Graphs
  Application of Relational Graphs to PR
  Graph Isomorphism
  Uniqueness of Digraph-Based Structural Description and Isomorphism
  Determining Isomorphism
  Extensions to the Elementary Graph Matching Approach
  A Formal Characterization of the Relational Graph Similarity Problem
  Extensions of Relational Graphs (Attributed Graphs)
  A Structured Strategy to Compare Attributed Graphs
  Relaxing Requirements for Pattern Isomorphism and Subisomorphism
  Match Graphs (MGs)
  Recursive Procedure to Find Cliques [cliques(X, Y)]
  Other Attributed Graph Distance or Similarity Measures
  Design and Selection of Similarity Measures
  Transforming G_i into G_j
  Extended Example: Structural Unification Using Attributed Graphs
  Bibliographical Remarks
  Exercises

CHAPTER 9 LEARNING VIA GRAMMATICAL INFERENCE
  Learning Grammars
  Problem Overview
  Difficulties in Structural Learning
  Problem Formulation
  Characterizing the Grammar Source
  Concerns Related to the Training Set
  Grammatical Inference (GI) Approaches
  Grammatical Inference Objectives
  Intuitive Sample GI Procedure
  Procedures to Generate Constrained Grammars
  Using a Single String to Partially Infer G_learn
  Generation of Canonical Definite Grammars
  General Procedure to Generate CDFSG (G_c)
  Bibliographical Remarks
  Exercises

PART 4 NEURAL PATTERN RECOGNITION (NeurPR)

CHAPTER 10 INTRODUCTION TO NEURAL NETWORKS
  Neurons and Neural Nets
  Introduction
  History
  Neural Networks as a Black Box Approach and 'Artificial' Neural Systems
  Key Neural Network Concepts
  Characteristics of Neural Computing Applications
  'Connectionist' Models and Computing
  Neural Network Structures for PR Applications
  Neural Network Structure (How Should I Connect My Neurons?)
  Learning in Neural Networks
  The Neural PR Application Design Phase
  Why Is Neural Computation So Important?
  Physical Neural Networks
  The Physical Neuron
  Dynamics of a Biological Neural System
  The Artificial Neural Network Model
  Artificial Neuron Activation and Output Characteristics
  Neural Unit Interconnection Strengths (Weights)
  Other Neural Network Parameters
  Network Implementations for Constraint Satisfaction and Matching Problems
  Hardware Realizations of Neural Networks
  Bibliographical Remarks

CHAPTER 11 INTRODUCTION TO NEURAL PATTERN ASSOCIATORS AND MATRIX APPROACHES
  Neural Network-Based Pattern Associators
  Introduction
  Design Procedure
  'Black Box' Structure
  CAM and Other Neural Memory Structures
  Desirable Pattern Associator (PA) Properties
  Matrix Approaches (Linear Associative Mappings) and Examples
  An Elementary Linear Network Structure and Mathematical Representation
  Approach 1: A Linear CAM ('Hopfield') Network
  Approach 2: Matched Filters/Adaptive Filters
  Approach 3: Outer Product Formulations
  Approach 4: Generalized Inverse Applications
  Extended Example: Heteroassociative Memory Design
  Hebbian or Correlation-Based Learning
  Bibliographical Remarks
  Exercises

CHAPTER 12 FEEDFORWARD NETWORKS AND TRAINING BY BACKPROPAGATION
  Multilayer, Feedforward Network Structure
  Introduction
  Overall Feedforward Structure
  The Role of Internal/Hidden Layers (What's 'Hidden'?)
  Training Considerations
  Training the Feedforward Network: The Delta Rule (DR) and Generalized Delta Rule (GDR)
  Overview
  Gradient Descent Approaches for Training
  Training by Sample or Training by Epoch
  Derivation of the DR
  A General Assessment of Error Sensitivity
  Extension of the DR for Units in the Hidden Layers [The Generalized Delta Rule (GDR)]
  Back Propagation: Summary of the Multistep Procedure
  Adding Momentum to the Training Procedure
  Training the Unit Bias Inputs
  Example: 'Handworked' Training of a Single Unit Using the GDR
  Extended Example: Pattern Associator for Character Classification
  Chosen Features
  Case 1: Single 4-Valued Output, Pseudoinverse Solution
  Case 2: Single 4-Valued Output, 2-Layer Neural Network Solution
  Case 3: 4-Output ('One-of') 2-Layer Neural Network Solution
  Case 3: Summary of Results
  Case 4: 4-Output, 3-Layer (Hidden) Neural Network Solution
  Other Ramifications of Hidden Units
  Bibliographical and Historical Remarks
  Exercises

CHAPTER 13 CONTENT ADDRESSABLE MEMORY APPROACHES AND UNSUPERVISED LEARNING IN NeurPR
  Introduction
  The Hopfield Approach to Neural Computing
  Network Parameters
  Network Dynamics
  Hamming Distance and Its Significance in CAM
  Example: Hamming Distances Between Stored States
  Example: Design of a Simple Hopfield Network: Storing and Accessing Stable States
  Why the Hopfield Strategy Should Work
  Additional Examples of CAM Applications in PR
  Example: Character Recognition
  Example: Relational Constraint Satisfaction (Coloring)
  Example: Symbolic Constraint Satisfaction (Labeling)
  Unsupervised Learning in NeurPR: Self-Organizing Networks
  Introduction
  Adaptive Resonance Architectures
  Self-Organizing Feature Maps (Kohonen)
  Summary
  Bibliographical Remarks
  Exercises

APPENDIX 1 Linear Algebra Review
APPENDIX 2 Probability and Random Variables/Vectors
APPENDIX 3 Discrete Mathematics Review
APPENDIX 4 Descent Procedures: Examples and Error Function Contours
APPENDIX 5 Similarity Measures, Matching Techniques, and Scale-Space Approaches
APPENDIX 6 Geometry for Classification and State-Space Visualization

REFERENCES
PERMISSION SOURCE NOTES
INDEX

Part 1

Introduction and General Pattern Recognition Concerns

Pattern Recognition (PR) Overview

Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information upon it.

From James Boswell, Life of Johnson 1791
Samuel Johnson 1709-1784

WHAT IS THIS BOOK ABOUT?

Overview
Machine intelligence will be a dominant technology in the 1990s. Pattern recognition (PR) techniques are often an important component of intelligent systems and are used for both data preprocessing and decision making. Broadly speaking, pattern recognition (PR) is the science that concerns the description or classification (recognition) of measurements. There is little doubt that PR is an important, useful, and rapidly developing technology with cross-disciplinary interest and participation. PR is not comprised of one approach, but rather is a broad body of often loosely related knowledge and techniques. Historically, the two major approaches to pattern recognition are the statistical (or decision theoretic) and the syntactic (or structural) approaches. Recently, the emerging technology of neural networks has provided a third approach, especially for 'black box' implementation of PR algorithms. Since no single technology is always the optimum solution for a given PR problem, all three are explored in this text.

Each of the three interrelated pattern recognition approaches:

• Statistical Pattern Recognition (StatPR)
• Syntactic Pattern Recognition (SyntPR)
• Neural Pattern Recognition (NeurPR)

is defined in the following sections. Where appropriate, computational issues related to practical or even 'real-time' implementation of the corresponding PR systems are considered. In order to achieve this breadth of coverage, the scope of some PR topics is restricted.

'Engineering' Approaches to Pattern Recognition

An engineering approach to problem solving involves incorporating all available and relevant problem information in a structured fashion to formulate a solution. This includes the development or modification of models that perhaps incorporate structural and a priori information, including 'training' data.¹ Therefore, throughout this book, our examination of PR approaches will be guided by these three questions:

1. Are pattern recognition techniques suitable, or even applicable, to the problem at hand? (Can the problem be solved?)
2. Can we develop or modify useful models for the situation and, if necessary, determine the model parameters?
3. If so, are there formal tools and heuristics that may be applied, and does a computationally practical solution procedure exist?

Relationship of PR to Other Areas

Pattern recognition techniques overlap with other areas, such as:

• (Adaptive) signal processing and systems
• Artificial intelligence
• Neural modeling
• Optimization/estimation theory
• Automata theory
• Fuzzy sets
• Structural modeling
• Formal languages

Pattern Recognition Applications

PR applications include:

• Image preprocessing, segmentation, and analysis
• Computer vision
• Seismic analysis
• Radar signal classification/analysis
• Face recognition
• Speech recognition/understanding
• Fingerprint identification
• Character (letter or number) recognition
• Handwriting analysis ('notepad' computers)
• Electrocardiographic signal analysis/understanding
• Medical diagnosis

¹ Pattern recognition (especially statistical) books are replete with the terms 'a priori' and 'a posteriori.' We will define 'a priori' as 'known before' (usually 'before' meaning 'before feature measurement') and 'a posteriori' to mean 'derived from subsequent measurements.'

PATTERN RECOGNITION, CLASSIFICATION, AND DESCRIPTION

Abstract Representation of Pattern Mappings

PR may be characterized as an information reduction, information mapping, or information labeling process. An abstract view of the PR classification/description problem is shown in Figure 1. We postulate a mapping between class-membership space, C, and pattern space, P. This mapping is done via a relation, G_i, for each class, and may be probabilistic. Each class, w_i, generates a subset of 'patterns' in pattern space, where the ith pattern is denoted p_i. Note, however, that these subspaces overlap, to allow patterns from different classes to share attributes. Another relation, M, maps patterns from subspaces of P into observations or measured patterns or features, denoted m_i. Using this concept, the characterization of many PR problems is simply that, given measurement m_i, we desire a method to identify and invert mappings M and G_i for all i. Unfortunately, in practice, these mappings are not functions. Even if they were, they are seldom 1:1, onto, or invertible. For example, Figure 1 shows that identical measurements or observations may result from different p_i, which in turn correspond to different underlying classes. This suggests a potential problem with ambiguity. Nevertheless, it seems reasonable to attempt to model and understand these processes, in the hope that this leads to better classification/description techniques.

The abstract formulation of Figure 1 is well suited for modeling in both StatPR and SyntPR. Looking ahead, Figure 13 shows how more detailed formalizations of these approaches may be related to Figure 1. We may view the realized patterns in Figure 1 as basic 'world data,' which is then measured. Thus, another important aspect of Figure 1 concerns M. This mapping reflects our choice of measurement system. Measurement system design is an important aspect of PR system design, in the sense that good 'features' or 'primitives,' to be derived subsequently, probably require good, or at least adequate, measurements. It is unlikely that erroneous or incomplete measurements will facilitate good PR system performance. Finally, note that patterns that are generated by the same class (p_1 and p_2, from w_1, for example) and 'close' in pattern space, P, do not necessarily yield measurements (m_1 and m_2 in this case) that are also 'close.' This is significant when 'clustering' of measurement (or feature or primitive) data is used for measuring pattern similarity.
[Figure 1: Mappings in an abstract representation of pattern generation/classification/interpretation systems. The figure depicts three spaces: class membership or description space (C); (realized) pattern space (P); and world observation or measurement space.]
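
To make the mappings concrete, the following minimal sketch (in Python) simulates Figure 1's setup under invented assumptions: the class names, the Gaussian pattern-generation relations standing in for the G_i, and the rounding step standing in for M are illustrative choices, not taken from the text. Because the measurement relation is many-to-one, distinct patterns, including patterns from different classes, can yield identical measurements, which is exactly the ambiguity discussed above.

    import random

    # Illustrative generation relations G_i: each class w_i probabilistically
    # produces a "pattern" (a single real number standing in for a point in P).
    def G(class_label):
        if class_label == "w1":
            return random.gauss(10.0, 2.0)  # patterns from w1 cluster near 10
        return random.gauss(12.0, 2.0)      # patterns from w2 cluster near 12

    # Illustrative measurement relation M: coarse quantization. It is
    # many-to-one, hence not invertible.
    def M(pattern):
        return round(pattern)

    random.seed(0)
    samples = [(w, M(G(w))) for w in ("w1", "w2") for _ in range(1000)]

    # Measurements produced by both classes show why recovering the class
    # (inverting M and G_i) from a measurement alone is ambiguous.
    m_w1 = {m for w, m in samples if w == "w1"}
    m_w2 = {m for w, m in samples if w == "w2"}
    print("measurements shared by both classes:", sorted(m_w1 & m_w2))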

Example Use of the Abstract Representation. Consider a two-class problem where w_1 corresponds to the class 'apple' and w_2 corresponds to the class 'hand grenade.' In realized pattern space, there will be some difference among the p_i resulting from apples, since not all apples are the same size, shape, weight, color, and so on. A similar remark holds true for realizations of class w_2. Also, some realizations of 'apple' and 'hand grenade' share similar attributes (e.g., mass, volume, weight). Therefore, basing our PR system on a measurement consisting simply of realized pattern weight is likely to (as the sketch following this list illustrates):

• (Mis)classify some apples and hand grenades as the same.
• (Mis)classify or distinguish between heavy apples and light apples (and similarly for w_2).
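
A small simulation makes the ambiguity concrete. The weight distributions and the decision threshold below are invented for illustration (the text gives no numbers): whenever the two classes' weight distributions overlap, any decision rule based on weight alone must misclassify some realizations.

    import random

    random.seed(1)

    # Hypothetical weight samples in grams; the distributions overlap on purpose.
    apples   = [random.gauss(180, 50) for _ in range(1000)]  # realizations of w1
    grenades = [random.gauss(330, 70) for _ in range(1000)]  # realizations of w2

    threshold = 250.0  # a weight-only decision boundary

    # Heavy apples land above the threshold and are called grenades;
    # light grenades land below it and are called apples.
    apple_errors   = sum(w > threshold for w in apples)
    grenade_errors = sum(w <= threshold for w in grenades)
    print(f"misclassified apples:   {apple_errors}/1000")
    print(f"misclassified grenades: {grenade_errors}/1000")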

Ironically, at the same time we strive to understand and to 'undo' these mappings, alternative (so-called black box) approaches, principally implemented by using neural networks, are receiving increased attention. These black box approaches attempt to eliminate the need for detailed information on both the mappings and inverses and, instead, require a good set of training data (sample mappings between C and M) and a trainable system.
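
For contrast with the model-based view, here is a minimal sketch of the black-box idea, again with invented numbers continuing the weight example: the classifier below never represents G_i or M explicitly; it is trained purely from sample (class, measurement) pairs, here by estimating one centroid per class and assigning new measurements to the nearest centroid.

    import random

    random.seed(2)

    # Training data: sample mappings between C and M, i.e., (class, measurement)
    # pairs. The numbers are illustrative assumptions.
    training = [("apple", random.gauss(180, 50)) for _ in range(500)] + \
               [("grenade", random.gauss(330, 70)) for _ in range(500)]

    # A trivial trainable "black box": nearest-centroid classification.
    # Training consists of estimating one centroid per class from samples alone.
    centroids = {}
    for label in ("apple", "grenade"):
        values = [m for c, m in training if c == label]
        centroids[label] = sum(values) / len(values)

    def classify(measurement):
        return min(centroids, key=lambda c: abs(measurement - centroids[c]))

    print(classify(200.0))  # -> apple
    print(classify(350.0))  # -> grenade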

Structure of a 'Typical' PR System

The structure of a typical pattern recognition system is shown in Figure 2. Notice that it consists of a sensor (an image sensor or camera, for example), a feature extraction mechanism (algorithm), and a classification or description algorithm (depending on the approach).
This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket