GTL 1011
IPR of U.S. Patent 9,007,420

Journal of Cognitive Neuroscience
Volume 3, No. 1, Winter 1991
ISSN 0898-929X

Editors

Michael S. Gazzaniga, Editor-in-Chief
Dartmouth Medical School

Ira B. Black
Robert Wood Johnson Medical School

Stephen M. Kosslyn
Harvard University

Gordon M. Shepherd
Yale University

Editorial Advisory Board

Gary S. Lynch
University of California, Irvine

George A. Miller
Princeton University

Mortimer Mishkin
National Institutes of Health

Associate Editors

Richard Andersen
Massachusetts Institute of Technology

Emilio Bizzi
Massachusetts Institute of Technology

Floyd E. Bloom
Scripps Clinic, La Jolla

Alfonso Caramazza
Johns Hopkins University

Patricia S. Churchland
University of California, San Diego

Daniel C. Dennett
Tufts University

Mitchell Glickstein
University College, London

Lila Gleitman
University of Pennsylvania

Patricia Goldman-Rakic
Yale University

Steven A. Hillyard
University of California

William Hirst
New School for Social Research

David H. Hubel
Harvard University

Jon H. Kaas
Vanderbilt University

Herbert P. Killackey
University of California, Irvine

Marta Kutas
University of California, San Diego

Ralph Linsker
IBM Research

J. Anthony Movshon
New York University

Terence W. Picton
University of Ottawa

Michael I. Posner
University of Oregon

David Premack
University of Pennsylvania

Zenon W. Pylyshyn
University of Western Ontario

Marcus E. Raichle
Washington University

Pasko Rakic
Yale University

David Rumelhart
Stanford University

Stanley Schachter
Columbia University

Terrence J. Sejnowski
The Salk Institute

Larry R. Squire
University of California, San Diego

David C. Van Essen
California Institute of Technology

Edgar Zurif
Brandeis University

Managing Editor
Charlotte Smylie

Editorial Address
Journal of Cognitive Neuroscience
Cognitive Neuroscience Institute
P.O. Box 1204
Norwich, VT 05055

Individuals wishing to submit manuscripts should follow the guidelines provided at the back of this issue.

Journal of Cognitive Neuroscience is indexed or abstracted in: Artificial Intelligence Abstracts, Current Contents, Excerpta Medica, and Linguistics and Language Behavior Abstracts.

Business Offices and Subscription Rates
Subscriptions, address changes, and mailing list correspondence should be addressed to MIT Press Journals, 55 Hayward Street, Cambridge, MA 02142, (617) 253-2889. Journal of Cognitive Neuroscience (ISSN 0898-929X) is published quarterly (Winter, Spring, Summer, and Fall) by The MIT Press, Cambridge, Massachusetts; $56.00 for individuals and $120.00 for institutions. Subscribers outside the United States add $14.00 for postage and handling. Single copies of current issues: $30.00. To be honored free, claims for missing issues must be made immediately upon receipt of the next published issue.

Postmaster
Send address changes to Journal of Cognitive Neuroscience, 55 Hayward Street, Cambridge, MA 02142. Second Class postage paid at Boston, MA, and at additional post offices.

Advertising
Inquiries may be addressed to the Advertising Manager, MIT Press Journals, 55 Hayward Street, Cambridge, MA 02142, (617) 253-2866.

Copyright Information
Permission to photocopy articles for internal or personal use, or the internal or personal use of specific clients, is granted by the copyright owner for users registered with the Copyright Clearance Center (CCC) Transactional Reporting Service, provided that the fee of $5.00 per article-copy is paid directly to CCC, 27 Congress Street, Salem, MA 01970. The fee code for users of the Transactional Reporting Service is 0898-929X/91 $5.00. For those organizations that have been granted a photocopy license with CCC, a separate system of payment has been arranged.

This publication is printed on acid-free paper.
© 1991 by the Massachusetts Institute of Technology

Eigenfaces for Recognition

Matthew Turk and Alex Pentland
Vision and Modeling Group, The Media Laboratory
Massachusetts Institute of Technology

Abstract

We have developed a near-real-time computer system that can locate and track a subject's head, and then recognize the person by comparing characteristics of the face to those of known individuals. The computational approach taken in this system is motivated by both physiology and information theory, as well as by the practical requirements of near-real-time performance and accuracy. Our approach treats the face recognition problem as an intrinsically two-dimensional (2-D) recognition problem rather than requiring recovery of three-dimensional geometry, taking advantage of the fact that faces are normally upright and thus may be described by a small set of 2-D characteristic views. The system functions by projecting face images onto a feature space that spans the significant variations among known face images. The significant features are known as "eigenfaces," because they are the eigenvectors (principal components) of the set of faces; they do not necessarily correspond to features such as eyes, ears, and noses. The projection operation characterizes an individual face by a weighted sum of the eigenface features, and so to recognize a particular face it is necessary only to compare these weights to those of known individuals. Some particular advantages of our approach are that it provides for the ability to learn and later recognize new faces in an unsupervised manner, and that it is easy to implement using a neural network architecture.

INTRODUCTION

The face is our primary focus of attention in social intercourse, playing a major role in conveying identity and emotion. Although the ability to infer intelligence or character from facial appearance is suspect, the human ability to recognize faces is remarkable. We can recognize thousands of faces learned throughout our lifetime and identify familiar faces at a glance even after years of separation. This skill is quite robust, despite large changes in the visual stimulus due to viewing conditions, expression, aging, and distractions such as glasses or changes in hairstyle or facial hair. As a consequence the visual processing of human faces has fascinated philosophers and scientists for centuries, including figures such as Aristotle and Darwin.

Computational models of face recognition, in particular, are interesting because they can contribute not only to theoretical insights but also to practical applications. Computers that recognize faces could be applied to a wide variety of problems, including criminal identification, security systems, image and film processing, and human-computer interaction. For example, the ability to model a particular face and distinguish it from a large number of stored face models would make it possible to vastly improve criminal identification. Even the ability to merely detect faces, as opposed to recognizing them, can be important. Detecting faces in photographs, for instance, is an important problem in automating color film development, since the effect of many enhancement and noise reduction techniques depends on the picture content (e.g., faces should not be tinted green, while perhaps grass should).

Unfortunately, developing a computational model of face recognition is quite difficult, because faces are complex, multidimensional, and meaningful visual stimuli. They are a natural class of objects, and stand in stark contrast to sine wave gratings, the "blocks world," and other artificial stimuli used in human and computer vision research (Davies, Ellis, & Shepherd, 1981). Thus unlike most early visual functions, for which we may construct detailed models of retinal or striate activity, face recognition is a very high level task for which computational approaches can currently only suggest broad constraints on the corresponding neural activity.

We therefore focused our research toward developing a sort of early, preattentive pattern recognition capability that does not depend on having three-dimensional information or detailed geometry. Our goal, which we believe we have reached, was to develop a computational model of face recognition that is fast, reasonably simple, and accurate in constrained environments such as an office or a household. In addition the approach is biologically implementable and is in concert with preliminary findings in the physiology and psychology of face recognition.

The scheme is based on an information theory approach that decomposes face images into a small set of characteristic feature images called "eigenfaces," which may be thought of as the principal components of the initial training set of face images. Recognition is performed by projecting a new image into the subspace spanned by the eigenfaces ("face space") and then classifying the face by comparing its position in face space with the positions of known individuals.

Automatically learning and later recognizing new faces is practical within this framework. Recognition under widely varying conditions is achieved by training on a limited number of characteristic views (e.g., a "straight on" view, a 45° view, and a profile view). The approach has advantages over other face recognition schemes in its speed and simplicity, learning capacity, and insensitivity to small or gradual changes in the face image.

Background and Related Work

Much of the work in computer recognition of faces has focused on detecting individual features such as the eyes, nose, mouth, and head outline, and defining a face model by the position, size, and relationships among these features. Such approaches have proven difficult to extend to multiple views, and have often been quite fragile, requiring a good initial guess to guide them. Research in human strategies of face recognition, moreover, has shown that individual features and their immediate relationships comprise an insufficient representation to account for the performance of adult human face identification (Carey & Diamond, 1977). Nonetheless, this approach to face recognition remains the most popular one in the computer vision literature.

Bledsoe (1966a,b) was the first to attempt semiautomated face recognition with a hybrid human-computer system that classified faces on the basis of fiducial marks entered on photographs by hand. Parameters for the classification were normalized distances and ratios among points such as eye corners, mouth corners, nose tip, and chin point. Later work at Bell Labs (Goldstein, Harmon, & Lesk, 1971; Harmon, 1971) developed a vector of up to 21 features, and recognized faces using standard pattern classification techniques. The chosen features were largely subjective evaluations (e.g., shade of hair, length of ears, lip thickness) made by human subjects, each of which would be quite difficult to automate.

An early paper by Fischler and Elschlager (1973) attempted to measure similar features automatically. They described a linear embedding algorithm that used local feature template matching and a global measure of fit to find and measure facial features. This template matching approach has been continued and improved by the recent work of Yuille, Cohen, and Hallinan (1989) (see Yuille, this volume). Their strategy is based on "deformable templates," which are parameterized models of the face and its features in which the parameter values are determined by interactions with the image.

Connectionist approaches to face identification seek to capture the configurational, or gestalt-like, nature of the task. Kohonen (1989) and Kohonen and Lahtio (1981) describe an associative network with a simple learning algorithm that can recognize (classify) face images and recall a face image from an incomplete or noisy version input to the network. Fleming and Cottrell (1990) extend these ideas using nonlinear units, training the system by backpropagation. Stonham's WISARD system (1986) is a general-purpose pattern recognition device based on neural net principles. It has been applied with some success to binary face images, recognizing both identity and expression. Most connectionist systems dealing with faces (see also Midorikawa, 1988; O'Toole, Millward, & Anderson, 1988) treat the input image as a general 2-D pattern, and can make no explicit use of the configurational properties of a face. Moreover, some of these systems require an inordinate number of training examples to achieve a reasonable level of performance. Only very simple systems have been explored to date, and it is unclear how they will scale to larger problems.

Others have approached automated face recognition by characterizing a face by a set of geometric parameters and performing pattern recognition based on the parameters (e.g., Kaya & Kobayashi, 1972; Cannon, Jones, Campbell, & Morgan, 1986; Craw, Ellis, & Lishman, 1987; Wong, Law, & Tsang, 1989). Kanade's (1973) face identification system was the first (and still one of the few) systems in which all steps of the recognition process were automated, using a top-down control strategy directed by a generic model of expected feature characteristics. His system calculated a set of facial parameters from a single face image and used a pattern classification technique to match the face from a known set, a purely statistical approach depending primarily on local histogram analysis and absolute gray-scale values.

Recent work by Burt (1988a,b) uses a "smart sensing" approach based on multiresolution template matching. This coarse-to-fine strategy uses a special-purpose computer built to calculate multiresolution pyramid images quickly, and has been demonstrated identifying people in near-real-time. This system works well under limited circumstances, but should suffer from the typical problems of correlation-based matching, including sensitivity to image size and noise. The face models are built by hand from face images.

THE EIGENFACE APPROACH

Much of the previous work on automated face recognition has ignored the issue of just what aspects of the face stimulus are important for identification. This suggested to us that an information theory approach of coding and decoding face images may give insight into the information content of face images, emphasizing the significant local and global "features." Such features may or may not be directly related to our intuitive notion of face features such as the eyes, nose, lips, and hair. This may have important implications for the use of identification tools such as Identikit and Photofit (Bruce, 1988).

In the language of information theory, we want to extract the relevant information in a face image, encode it as efficiently as possible, and compare one face encoding with a database of models encoded similarly. A simple approach to extracting the information contained in an image of a face is to somehow capture the variation in a collection of face images, independent of any judgment of features, and use this information to encode and compare individual face images.

In mathematical terms, we wish to find the principal components of the distribution of faces, or the eigenvectors of the covariance matrix of the set of face images, treating an image as a point (or vector) in a very high dimensional space. The eigenvectors are ordered, each one accounting for a different amount of the variation among the face images.

These eigenvectors can be thought of as a set of features that together characterize the variation between face images. Each image location contributes more or less to each eigenvector, so that we can display the eigenvector as a sort of ghostly face which we call an eigenface. Some of the faces we studied are illustrated in Figure 1, and the corresponding eigenfaces are shown in Figure 2. Each eigenface deviates from uniform gray where some facial feature differs among the set of training faces; they are a sort of map of the variations between faces.

Each individual face can be represented exactly in terms of a linear combination of the eigenfaces. Each face can also be approximated using only the "best" eigenfaces, those that have the largest eigenvalues, and which therefore account for the most variance within the set of face images. The best M eigenfaces span an M-dimensional subspace, "face space," of all possible images.

The idea of using eigenfaces was motivated by a technique developed by Sirovich and Kirby (1987) and Kirby and Sirovich (1990) for efficiently representing pictures of faces using principal component analysis. Starting with an ensemble of original face images, they calculated a best coordinate system for image compression, where each coordinate is actually an image that they termed an eigenpicture. They argued that, at least in principle, any collection of face images can be approximately reconstructed by storing a small collection of weights for each face and a small set of standard pictures (the eigenpictures). The weights describing each face are found by projecting the face image onto each eigenpicture.

It occurred to us that if a multitude of face images can be reconstructed by weighted sums of a small collection of characteristic features or eigenpictures, perhaps an efficient way to learn and recognize faces would be to build up the characteristic features by experience over time and recognize particular faces by comparing the feature weights needed to (approximately) reconstruct them with the weights associated with known individuals. Each individual, therefore, would be characterized by the small set of feature or eigenpicture weights needed to describe and reconstruct them, an extremely compact representation when compared with the images themselves.

This approach to face recognition involves the following initialization operations:

1. Acquire an initial set of face images (the training set).
2. Calculate the eigenfaces from the training set, keeping only the M images that correspond to the highest eigenvalues. These M images define the face space. As new faces are experienced, the eigenfaces can be updated or recalculated.
3. Calculate the corresponding distribution in M-dimensional weight space for each known individual, by projecting their face images onto the "face space."

These operations can also be performed from time to time whenever there is free excess computational capacity.

Having initialized the system, the following steps are then used to recognize new face images (a minimal code sketch of both stages follows this list):

1. Calculate a set of weights based on the input image and the M eigenfaces by projecting the input image onto each of the eigenfaces.
2. Determine if the image is a face at all (whether known or unknown) by checking to see if the image is sufficiently close to "face space."
3. If it is a face, classify the weight pattern as either a known person or as unknown.
4. (Optional) Update the eigenfaces and/or weight patterns.
5. (Optional) If the same unknown face is seen several times, calculate its characteristic weight pattern and incorporate it into the known faces.
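
For concreteness, here is a minimal sketch of both stages in Python/NumPy. The function names, array shapes, and threshold arguments are our own illustrative assumptions; the text above does not prescribe an implementation.

    import numpy as np

    def train_eigenfaces(faces, n_eigenfaces):
        # faces: (M, N*N) array, one flattened training image per row
        psi = faces.mean(axis=0)                    # average face
        A = (faces - psi).T                         # (N*N, M) mean-subtracted images
        L = A.T @ A                                 # small M x M matrix (see Eq. 3 ff.)
        eigvals, V = np.linalg.eigh(L)              # eigenvalues in ascending order
        top = np.argsort(eigvals)[::-1][:n_eigenfaces]
        U = A @ V[:, top]                           # eigenfaces, one per column
        U /= np.linalg.norm(U, axis=0)              # normalize each eigenface
        return psi, U

    def recognize(image, psi, U, class_vectors, theta_class, theta_face):
        # Step 1: project the input image onto each eigenface
        omega = U.T @ (image - psi)
        # Step 2: is it a face at all? Check the distance from face space.
        phi = image - psi
        if np.linalg.norm(phi - U @ omega) > theta_face:
            return "not a face"
        # Step 3: classify the weight pattern as a known person or unknown
        name, dist = min(((k, np.linalg.norm(omega - w))
                          for k, w in class_vectors.items()),
                         key=lambda kv: kv[1])
        return name if dist < theta_class else "unknown face"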

Calculating Eigenfaces

Let a face image I(x, y) be a two-dimensional N by N array of (8-bit) intensity values. An image may also be considered as a vector of dimension $N^2$, so that a typical image of size 256 by 256 becomes a vector of dimension 65,536, or, equivalently, a point in 65,536-dimensional space. An ensemble of images, then, maps to a collection of points in this huge space.

Images of faces, being similar in overall configuration, will not be randomly distributed in this huge image space and thus can be described by a relatively low dimensional subspace.

Figure 1. (a) Face images used as the training set.

The main idea of the principal component analysis (or Karhunen-Loève expansion) is to find the vectors that best account for the distribution of face images within the entire image space. These vectors define the subspace of face images, which we call "face space." Each vector is of length $N^2$, describes an N by N image, and is a linear combination of the original face images. Because these vectors are the eigenvectors of the covariance matrix corresponding to the original face images, and because they are face-like in appearance, we refer to them as "eigenfaces." Some examples of eigenfaces are shown in Figure 2.

Let the training set of face images be $\Gamma_1, \Gamma_2, \Gamma_3, \ldots, \Gamma_M$. The average face of the set is defined by $\Psi = \frac{1}{M}\sum_{n=1}^{M}\Gamma_n$. Each face differs from the average by the vector $\Phi_i = \Gamma_i - \Psi$. An example training set is shown in Figure 1a, with the average face $\Psi$ shown in Figure 1b. This set of very large vectors is then subject to principal component analysis, which seeks a set of $M$ orthonormal vectors, $u_n$, which best describes the distribution of the data. The $k$th vector, $u_k$, is chosen such that

$$\lambda_k = \frac{1}{M}\sum_{n=1}^{M}\left(u_k^T \Phi_n\right)^2 \tag{1}$$

is a maximum, subject to

$$u_l^T u_k = \delta_{lk} = \begin{cases} 1, & \text{if } l = k \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

The vectors $u_k$ and scalars $\lambda_k$ are the eigenvectors and eigenvalues, respectively, of the covariance matrix

$$C = \frac{1}{M}\sum_{n=1}^{M}\Phi_n \Phi_n^T = AA^T \tag{3}$$

where the matrix $A = [\Phi_1\ \Phi_2\ \ldots\ \Phi_M]$. The matrix $C$, however, is $N^2$ by $N^2$, and determining the $N^2$ eigenvectors and eigenvalues is an intractable task for typical image sizes. We need a computationally feasible method to find these eigenvectors.

If the number of data points in the image space is less than the dimension of the space ($M < N^2$), there will be only $M - 1$, rather than $N^2$, meaningful eigenvectors. (The remaining eigenvectors will have associated eigenvalues of zero.) Fortunately we can solve for the $N^2$-dimensional eigenvectors in this case by first solving for the eigenvectors of an $M$ by $M$ matrix (e.g., solving a 16 × 16 matrix rather than a 16,384 × 16,384 matrix) and then taking appropriate linear combinations of the face images $\Phi_i$. Consider the eigenvectors $v_i$ of $A^T A$ such that

$$A^T A v_i = \mu_i v_i \tag{4}$$

Premultiplying both sides by $A$, we have

$$A A^T A v_i = \mu_i A v_i \tag{5}$$

from which we see that $A v_i$ are the eigenvectors of $C = AA^T$.

Following this analysis, we construct the $M$ by $M$ matrix $L = A^T A$, where $L_{mn} = \Phi_m^T \Phi_n$, and find the $M$ eigenvectors, $v_l$, of $L$. These vectors determine linear combinations of the $M$ training set face images to form the eigenfaces $u_l$:

$$u_l = \sum_{k=1}^{M} v_{lk} \Phi_k, \qquad l = 1, \ldots, M \tag{6}$$

With this analysis the calculations are greatly reduced, from the order of the number of pixels in the images ($N^2$) to the order of the number of images in the training set ($M$). In practice, the training set of face images will be relatively small ($M \ll N^2$), and the calculations become quite manageable. The associated eigenvalues allow us to rank the eigenvectors according to their usefulness in characterizing the variation among the images. Figure 2 shows the top seven eigenfaces derived from the input images of Figure 1.
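
The reduction from an $N^2 \times N^2$ eigenvector problem to an $M \times M$ one is easy to verify numerically. The sketch below uses random stand-ins for face images (16 images of size 128 × 128, so the full covariance matrix would be 16,384 × 16,384; these sizes are our illustrative assumptions) and checks that $Av$ is an eigenvector of $AA^T$ without ever forming that huge matrix.

    import numpy as np

    M, side = 16, 128
    faces = np.random.rand(M, side * side)     # stand-ins for real face images
    psi = faces.mean(axis=0)
    A = (faces - psi).T                        # (16384, 16)

    L = A.T @ A                                # 16 x 16 instead of 16384 x 16384
    mu, V = np.linalg.eigh(L)
    U = A @ V                                  # columns are the eigenfaces u_l (Eq. 6)

    # Check: u = A v is an eigenvector of A A^T (the covariance matrix up to
    # the 1/M factor) with the same eigenvalue, computed matrix-free.
    u = U[:, -1]
    print(np.allclose(A @ (A.T @ u), mu[-1] * u))   # True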

Using Eigenfaces to Classify a Face Image

The eigenface images calculated from the eigenvectors of $L$ span a basis set with which to describe face images. Sirovich and Kirby (1987) evaluated a limited version of this framework on an ensemble of $M = 115$ images of Caucasian males, digitized in a controlled manner, and found that about 40 eigenfaces were sufficient for a very good description of the set of face images. With $M' = 40$ eigenfaces, RMS pixel-by-pixel errors in representing cropped versions of face images were about 2%.
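
A reconstruction-quality check of this kind can be written in a few lines. Expressing the RMS error as a percentage of the 8-bit intensity range is our reading of the figure quoted above, and `psi` and `U` are assumed to come from a training step like the earlier sketch.

    import numpy as np

    def rms_error_percent(image, psi, U):
        # RMS pixel-by-pixel error of the face-space reconstruction,
        # as a percentage of the 8-bit intensity range.
        phi = image - psi
        reconstruction = U @ (U.T @ phi)       # projection onto face space
        return 100.0 * np.sqrt(np.mean((phi - reconstruction) ** 2)) / 255.0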

Since the eigenfaces seem adequate for describing face images under very controlled conditions, we decided to investigate their usefulness as a tool for face identification. In practice, a smaller $M'$ is sufficient for identification, since accurate reconstruction of the image is not a requirement. In this framework, identification becomes a pattern recognition task. The eigenfaces span an $M'$-dimensional subspace of the original $N^2$ image space. The $M'$ significant eigenvectors of the $L$ matrix are chosen as those with the largest associated eigenvalues. In many of our test cases, based on $M = 16$ face images, $M' = 7$ eigenfaces were used.

A new face image ($\Gamma$) is transformed into its eigenface components (projected into "face space") by a simple operation,

$$\omega_k = u_k^T(\Gamma - \Psi) \tag{7}$$

for $k = 1, \ldots, M'$. This describes a set of point-by-point image multiplications and summations, operations performed at approximately frame rate on current image processing hardware. Figure 3 shows an image and its projection into the seven-dimensional face space.

The weights form a vector $\Omega^T = [\omega_1, \omega_2, \ldots, \omega_{M'}]$ that describes the contribution of each eigenface in representing the input face image, treating the eigenfaces as a basis set for face images.

Figure 1. (b) The average face $\Psi$.

Figure 2. Seven of the eigenfaces calculated from the input images of Figure 1.

The vector may then be used in a standard pattern recognition algorithm to find which of a number of predefined face classes, if any, best describes the face. The simplest method for determining which face class provides the best description of an input face image is to find the face class $k$ that minimizes the Euclidean distance

$$\epsilon_k = \|\Omega - \Omega_k\|^2 \tag{8}$$

where $\Omega_k$ is a vector describing the $k$th face class. The face classes $\Omega_k$ are calculated by averaging the results of the eigenface representation over a small number of face images (as few as one) of each individual. A face is classified as belonging to class $k$ when the minimum $\epsilon_k$ is below some chosen threshold $\theta_\epsilon$. Otherwise the face is classified as "unknown," and optionally used to create a new face class.

Because creating the vector of weights is equivalent to projecting the original face image onto the low-dimensional face space, many images (most of them looking nothing like a face) will project onto a given pattern vector. This is not a problem for the system, however, since the distance $\epsilon$ between the image and the face space is simply the squared distance between the mean-adjusted input image $\Phi = \Gamma - \Psi$ and $\Phi_f = \sum_{i=1}^{M'} \omega_i u_i$, its projection onto face space:

$$\epsilon^2 = \|\Phi - \Phi_f\|^2 \tag{9}$$

Thus there are four possibilities for an input image and its pattern vector: (1) near face space and near a face class, (2) near face space but not near a known face class, (3) distant from face space and near a face class, and (4) distant from face space and not near a known face class.

In the first case, an individual is recognized and identified. In the second case, an unknown individual is present. The last two cases indicate that the image is not a face image. Case three typically shows up as a false positive in most recognition systems; in our framework, however, the false recognition may be detected because of the significant distance between the image and the subspace of expected face images. Figure 4 shows some images and their projections into face space and gives a measure of distance from the face space for each.
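
The four cases reduce to two independent tests, one against each threshold. A minimal sketch follows, reusing the quantities defined above (the weight vector $\Omega$, the face-space distance $\epsilon$, and thresholds $\theta_\epsilon$ and $\theta_\delta$); the variable names are our own.

    import numpy as np

    def classify(omega, eps_face, class_vectors, theta_eps, theta_delta):
        # eps_k: squared distance to each known face class (Eq. 8)
        name, eps_k = min(((k, np.linalg.norm(omega - w) ** 2)
                           for k, w in class_vectors.items()),
                          key=lambda kv: kv[1])
        if eps_face >= theta_delta:
            return "not a face"        # cases 3 and 4: far from face space
        if eps_k < theta_eps:
            return name                # case 1: recognized and identified
        return "unknown individual"    # case 2: near face space, no known class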

Summary of Eigenface Recognition Procedure

To summarize, the eigenfaces approach to face recognition involves the following steps:

1. Collect a set of characteristic face images of the known individuals. This set should include a number of images for each person, with some variation in expression and in the lighting. (Say four images of ten people, so M = 40.)
2. Calculate the (40 × 40) matrix $L$, find its eigenvectors and eigenvalues, and choose the $M'$ eigenvectors with the highest associated eigenvalues. (Let $M' = 10$ in this example.)
3. Combine the normalized training set of images according to Eq. (6) to produce the ($M' = 10$) eigenfaces $u_k$.
4. For each known individual, calculate the class vector $\Omega_k$ by averaging the eigenface pattern vectors $\Omega$ [from Eq. (8)] calculated from the original (four) images of the individual. Choose a threshold $\theta_\epsilon$ that defines the maximum allowable distance from any face class, and a threshold $\theta_\delta$ that defines the maximum allowable distance from face space [according to Eq. (9)].
5. For each new face image to be identified, calculate its pattern vector $\Omega$, the distances $\epsilon_k$ to each known class, and the distance $\epsilon$ to face space. If the minimum distance $\epsilon_k < \theta_\epsilon$ and the distance $\epsilon < \theta_\delta$, classify the input face as the individual associated with class vector $\Omega_k$. If the minimum distance $\epsilon_k > \theta_\epsilon$ but distance $\epsilon < \theta_\delta$, then the image may be classified as "unknown," and optionally used to begin a new face class.
6. If the new image is classified as a known individual, this image may be added to the original set of familiar face images, and the eigenfaces may be recalculated (steps 1-4). This gives the opportunity to modify the face space as the system encounters more instances of known faces.

In our current system calculation of the eigenfaces is done offline as part of the training. The recognition currently takes about 400 msec running rather inefficiently in Lisp on a Sun4, using face images of size 128 × 128. With some special-purpose hardware, the current version could run at close to frame rate (33 msec).

Designing a practical system for face recognition within this framework requires assessing the tradeoffs between generality, required accuracy, and speed. If the face recognition task is restricted to a small set of people (such as the members of a family or a small company), a small set of eigenfaces is adequate to span the faces of interest. If the system is to learn new faces or represent many people, a larger basis set of eigenfaces will be required. The results of Sirovich and Kirby (1987) and Kirby and Sirovich (1990) for coding of face images give some evidence that even if it were necessary to represent a large segment of the population, the number of eigenfaces needed would still be relatively small.

Locating and Detecting Faces

The analysis in the preceding sections assumes we have a centered face image, the same size as the training images and the eigenfaces. We need some way, then, to locate a face in a scene to do the recognition. We have developed two schemes to locate and/or track faces, using motion detection and manipulation of the images in "face space."

Figure 3. An original face image and its projection onto the face space defined by the eigenfaces of Figure 2.

Motion Detecting and Head Tracking

People are constantly moving. Even while sitting, we fidget and adjust our body position, nod our heads, look around, and such. In the case of a single person moving in a static environment, a simple motion detection and tracking algorithm, depicted in Figure 5, will locate and track the position of the head. Simple spatiotemporal filtering (e.g., frame differencing) accentuates image locations that change with time, so a moving person "lights up" in the filtered image. If the image "lights up" at all, motion is detected and the presence of a person is postulated.

After thresholding the filtered image to produce a binary motion image, we analyze the "motion blobs" over time to decide if the motion is caused by a person moving and to determine head position. A few simple rules are applied, such as "the head is the small upper blob above a larger blob (the body)," and "head motion must be reasonably slow and contiguous" (heads are not expected to jump around the image erratically). Figure 6 shows an image with the head located, along with the path of the head in the preceding sequence of frames.

The motion image also allows for an estimate of scale. The size of the blob that is assumed to be the moving head determines the size of the subimage to send to the recognition stage. This subimage is rescaled to fit the dimensions of the eigenfaces.
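
A minimal sketch of this motion pipeline, written with OpenCV, is shown below. The threshold value and the head-above-body rule encoded here are simplified assumptions, not the exact rules used by the system described above.

    import cv2

    def locate_head(prev_gray, curr_gray, thresh=25):
        # Spatiotemporal filtering: frame differencing, then thresholding
        diff = cv2.absdiff(curr_gray, prev_gray)
        _, motion = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(motion, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None                               # no motion, no person
        blobs = sorted(contours, key=cv2.contourArea, reverse=True)[:2]
        boxes = [cv2.boundingRect(b) for b in blobs]  # (x, y, w, h) per blob
        # "The head is the small upper blob above a larger blob (the body)":
        # take the topmost of the two largest blobs as the head.
        return min(boxes, key=lambda b: b[1])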

Using "Face Space" to Locate the Face

We can also use knowledge of the face space to locate faces in single images, either as an alternative to locating faces from motion (e.g., if there is too little motion or many moving objects) or as a method of achieving more precision than is possible by use of motion tracking alone. This method allows us to recognize the presence of faces apart from the task of identifying them.

As seen in Figure 4, images of faces do not change radically when projected into the face space, while the projection of nonface images appears quite different. This basic idea is used to detect the presence of faces in a scene: at every location in the image, calculate the distance $\epsilon$ between the local subimage and face space. This distance from face space is used as a measure of "faceness," so the result of calculating the distance from face space at every point in the image is a "face map" $\epsilon(x, y)$.
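
A brute-force version of this face map is straightforward to write, if far too slow for real use; the sliding-window loop below is purely illustrative, and `psi` and `U` are assumed to be the mean face and eigenfaces trained on size × size patches. A practical system would need a more efficient formulation.

    import numpy as np

    def face_map(image, psi, U, size):
        # image: 2-D gray-scale array; psi, U: mean face and eigenfaces
        # trained on flattened size x size patches.
        H, W = image.shape
        out = np.empty((H - size + 1, W - size + 1))
        for y in range(out.shape[0]):
            for x in range(out.shape[1]):
                phi = image[y:y + size, x:x + size].ravel() - psi
                # distance from face space of this local subimage (Eq. 9)
                out[y, x] = np.linalg.norm(phi - U @ (U.T @ phi)) ** 2
        return out   # minima mark likely face locations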
