`Amir et al.
`
US006539100B1
`
`(10) Patent No.:
`(45) Date of Patent:
`
`US 6,539,100 B1
`Mar. 25, 2003
`
`(54) METHOD AND APPARATUS FOR
`ASSOCIATING PUPILS WITH SUBJECTS
`(75) Inventors: Arnon Amir, Cupertino, CA (US);
`Myron Dale Flickner, San Jose, CA
`(US); David Bruce Koons, San Jose,
`CA (US); Carlos Hitoshi Morimoto,
`San Jose, CA (US)
`(73) Assignee: International Business Machines
`Corporation, Armonk, NY (US)
(*) Notice: Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 0 days.
`(21) Appl. No.: 09/238,979
(22) Filed: Jan. 27, 1999
`
`(51) Int. Cl. .................................................. G06K 9/00
`
`(52) U.S. Cl. ........................................ 382/117; 382/173
`
`(58) Field of Search ................................. 382/117, 199,
`382/128, 173, 286; 351/206, 221, 208,
`210; 434/40, 42
`
`(56)
`
`References Cited
`U.S. PATENT DOCUMENTS
`
4,275,385 A    6/1981  White ................... 340/312
4,625,329 A   11/1986  Ishikawa et al. ......... 382/1
4,931,865 A    6/1990  Scarampi ................ 358/84
5,016,282 A *  5/1991  Tomono et al. ........... 382/117
5,291,560 A *  3/1994  Daugman ................. 382/117
5,430,809 A    7/1995  Tomitaka ................ 382/173
5,432,866 A *  7/1995  Sakamoto ................ 382/128
5,550,928 A    8/1996  Lu et al. ............... 382/116
5,596,362 A    1/1997  Zhou .................... 348/14
`OTHER PUBLICATIONS
Aizawa et al., "Detection and Tracking of Facial Features,"
Proc. of the SPIE Com. and Image Proc. (1995), v. 2501,
Taipei, Taiwan, pp. 1161-1172.
`
Baluja et al., "Neural Network-Based Face Detection,"
Proc. IEEE Conf. on Computer Vision and Pattern Recognition
(1996), San Francisco, CA, pp. 203-208.
Baluja et al., "Rotation Invariant Neural Network-Based
Face Detection," Proc. IEEE Conf. on Computer Vision and
Pattern Recognition (Jun. 1998), Santa Barbara, CA, pp. 38-44.
Bérard et al., "LAFTER: Lips and Face Real Time Tracker,"
Proc. IEEE Conf. on Computer Vision and Pattern Recognition
(Jun. 1997), Puerto Rico, pp. 123-129.
`(List continued on next page.)
Primary Examiner: Andrew W. Johns
Assistant Examiner: Seyed Azarian
(74) Attorney, Agent, or Firm: Dan Hubert

(57) ABSTRACT
A method and apparatus analyzes a scene to determine
which pupils correspond to which subjects. First, a machine-
readable representation of the scene, such as a camera
image, is generated. Although more detail may be provided,
this representation minimally depicts certain visually
perceivable characteristics of multiple pupil candidates
corresponding to multiple subjects in the scene. A machine
such as a computer then examines various features of the
pupil candidates. The features under analysis include (1)
visually perceivable characteristics of the pupil candidates
at one given time ("spatial cues"), and (2) changes in
visually perceivable characteristics of the pupil candidates
over a sampling period ("temporal cues"). The spatial and
temporal cues may be used to identify associated pupil pairs.
Some exemplary spatial cues include interocular distance,
shape, height, and color of potentially paired pupils. In
addition to features of the pupils themselves, spatial cues
may also include nearby facial features such as presence of
a nose/mouth/eyebrows in predetermined relationship to
potentially paired pupils, a similarly colored iris
surrounding each of two pupils, skin of similar color nearby,
etc. Some exemplary temporal cues include motion or blinking
of paired pupils together, etc. With the foregoing
examination, each pupil candidate can be associated with a
subject in the scene.
`
`65 Claims, 2 Drawing Sheets
`
`
`
`
`
`
`
`
`
`
`
[Front-page drawing: FIG. 1 block diagram showing output
device(s), processor, fast-access storage, and nonvolatile
storage within the digital data processing apparatus.]

IPR2022-00093 - LGE
Ex. 1006 - Page 1
`
`
`
`
`OTHER PUBLICATIONS
Birchfield, "Elliptical Head Tracking Using Intensity
Gradients and Color Histograms," Proc. IEEE Conf. on Computer
Vision and Pattern Recognition (1998), pp. 232-237.
Cohen et al., "Feature Extraction From Faces Using
Deformable Templates," International Journal of Computer
Vision (1992), vol. 8, No. 2, pp. 99-111.
Darrell et al., "Active Face Tracking and Pose Estimation in
an Interactive Room," MIT Media Lab, TR-356 (1996), pp. 1-16.
Darrell et al., "A Virtual Mirror Interface Using Real-Time
Robust Face Tracking," Proc. Int'l Conf. on Automatic Face
and Gesture Recognition (Apr. 1998), Japan, pp. 616-621.
Darrell et al., "Integrated Person Tracking Using Stereo,
Color, and Pattern Detection," Proc. IEEE Conf. on Computer
Vision and Pattern Recognition (Jun. 1998), Santa Barbara,
CA, pp. 601-608.
Ebisawa et al., "Effectiveness of Pupil Area Detection
Technique Using Two Light Sources and Image Difference
Method," Proc. of the 15th Ann. Int'l Conf. of IEEE
Engineering in Medicine and Biology Society, vol. 15 (Jan.
1993), pp. 1268-1269.
Ebisawa, "Unconstrained Pupil Detection Technique Using
Two Light Sources and the Image Difference Method,"
Visualization and Intelligent Design in Engineering and
Architecture (1995), pp. 79-89.
Fieguth et al., "Color-Based Tracking of Heads and Other
Mobile Objects at Video Frame Rates," Proc. IEEE Conference
on Computer Vision and Pattern Recognition (1997), pp. 21-27.
`
Govindaraju et al., "A Computational Model For Face
Location," Proc. of the Int'l Conf. on Computer Vision (Dec.
1990), Osaka, Japan, pp. 718-721.
Harville et al., "Tracking People With Integrated Stereo,
Color, and Face Detection," Proc. of the IEEE Conference on
Computer Vision and Pattern Recognition (Jun. 1998), pp.
601-608.
Kothari et al., "Detection of Eye Locations in Unconstrained
Visual Images," Proc. Int'l Conf. on Image Processing
(1996), Switzerland, pp. 519-522.
Poggio et al., "Example-Based Learning for View-Based
Human Face Detection," MIT AI Lab TR-AI-1521 (1994),
pp. 1-20.
Scassellati, "Eye Finding via Face Detection for a Foveated,
Active Vision System," Proceedings of the 15th Conf. on
Artificial Intelligence (AAAI-98), ISBN 0-262-51098-7,
Jul. 26-30, 1998.
Scassellati, "Real-Time Face and Eye Detection," world-wide
web, 1998.
Stiefelhagen et al., "A Model-Based Gaze Tracking System,"
Proc. Joint Symposium on Intelligence and Systems (1996),
pp. 304-310.
Sirohey, "Human Face Segmentation and Identification,"
CAR-TR-695, CS-TR-3176 (1993), pp. 1-33.
Waibel et al., "A Real-Time Face Tracker," Proc. of the 3rd
IEEE Workshop on Applications of Computer Vision (1996),
Sarasota, FL, pp. 142-147.
`* cited by examiner
`
`
`
`
[Sheet 1 of 2: FIG. 1, block diagram of the system,
showing output device(s) 108, fast-access storage 122,
nonvolatile storage 124, and the digital data processing
apparatus 102; FIG. 2, exemplary signal-bearing diskette
200.]
`
`
`
`
[Sheet 2 of 2: FIG. 3, flowchart of sequence 300: start
(302); search for pupil detection candidates (304); filter
(306); associate pupil candidates with faces (308); begin
tracking verified faces and monitor face characteristics
(310); lose track/new face? (312); continue (314).]
`
`
`
`
`1
`METHOD AND APPARATUS FOR
`ASSOCIATING PUPILS WITH SUBJECTS
`
`US 6,539,100 B1
`
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to sophisticated interfaces
between humans and machines. More particularly, the
invention concerns a method and apparatus for analyzing a
scene containing multiple subjects to determine which
pupils correspond to which subjects.
2. Description of the Related Art
As more powerful human-machine interfaces are being
developed, many such interfaces include the capability to
perform user detection. By detecting the presence of a
human user, a machine can manage its own functions more
efficiently, and more reliably respond to human input. For
example, a computer may employ user detection to
selectively activate a screen saver when no users are
present, or to display advertising banners only when a user
is present. As another application, in home-based television
viewing monitors for assessing "Nielson" ratings, it may be
useful to determine how many people are watching a
television. User detection techniques such as face detection
may also be used as a valuable precursor to eye gaze
detection. In addition, face detection will likely be an
important component of future human-machine interfaces that
consider head and facial gestures to supplement mouse,
voice, keyboard, and other user input. Such head and facial
gestures may include nodding, leaning forward, head shaking,
and the like. Thus, user detection is an important tool that
enables a more natural human-machine interface.
Some user detection techniques are already known. For
instance, a number of techniques focus on face detection
using a combination of attributes such as color, shape,
motion, and depth. Some of these approaches, for example,
include template matching as described in U.S. Pat. No.
5,550,928 to Lu et al., and skin color analysis as described
in U.S. Pat. No. 5,430,809 to Tomitaka. Another approach is
the "Interval" system. The Interval system obtains range
information using a sophisticated stereo camera system,
gathers color information to evaluate as flesh tones, and
analyzes face candidate inputs with a neural network trained
to find faces. One drawback of the Interval system is the
substantial computation expense. An example of the Interval
system is described in Darrell et al., "Tracking People With
Integrated Stereo, Color, and Face Detection," Perceptual
User Interface Workshop, 1997. Although the Interval system
may be satisfactory for some applications, certain users
with less powerful or highly utilized computers may be
frustrated with the Interval system's computation
requirements. The following references discuss some other
user detection schemes: (1) T. Darrell et al., "Integrated
Person Tracking Using Stereo, Color, and Pattern Detection,"
1998, and (2) T. Darrell et al., "Active Face Tracking and
Pose Estimation in an Interactive Room," 1996.
As a different approach, some techniques perform user
detection based on pupil detection. Pupil characteristics
may be further analyzed to track eye position and movement,
as described in U.S. Pat. No. 5,016,282 to Tomono et al.
Although the '282 patent and other pupil detection schemes
may be satisfactory for some applications, such approaches
are unable to process multiple faces and multiple pupils in
an input image. Some difficulties include determining which
pupils belong to the same face, and accounting for a
partially off-screen person with only one pupil showing.
`
`15
`
`25
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`2
Thus, when multiple people and multiple pupils are
present in an image, there may be considerable difficulty in
associating pupils with people in order to detect how many
people are present. In this respect, known approaches are
not completely adequate for some applications due to certain
unsolved problems.
SUMMARY OF THE INVENTION
Broadly, the present invention concerns a method and
apparatus for analyzing a scene containing multiple subjects
to determine which pupils correspond to which subjects.
First, a machine-readable representation of the scene, such
as a camera image, is generated. Although more detail may
be provided, this representation minimally depicts certain
visually perceptible characteristics (such as relative
locations, shape, size, etc.) of multiple pupil candidates
corresponding to multiple subjects in the scene. A computer
analyzes various characteristics of the pupil candidates,
such as: (1) visually perceivable characteristics of the
pupil candidates at one given time ("spatial cues"), and (2)
changes in visually perceivable characteristics of the pupil
candidates over a sampling period ("temporal cues"). The
spatial and temporal cues may be used to identify associated
pupil pairs, i.e., two pupils belonging to the same
subject/face. Some exemplary spatial cues include
interocular distance between potentially paired pupils,
horizontal alignment of pupils, same shape/size of pupils,
etc. In addition to features of the pupils themselves,
spatial cues may also include nearby facial features such as
presence of a nose/mouth/eyebrows in predetermined
relationship to potentially paired pupils, similarly colored
irises surrounding the pupils, nearby skin of similar color,
etc. Some exemplary temporal cues include motion or blinking
of paired pupils together. With the foregoing analysis, each
pupil candidate can be associated with a subject in the
scene.
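The pairing of pupil candidates by the spatial cues named above (plausible interocular distance and rough horizontal alignment) can be sketched in a few lines. This is an illustrative sketch only, not the patented implementation; the pixel thresholds are assumptions for a hypothetical camera geometry.

```python
import math
from itertools import combinations

def pair_pupils(points, min_dist=40.0, max_dist=120.0, max_dy=10.0):
    """Return (i, j) index pairs of candidates plausibly forming one face's eyes.

    points: list of (x, y) pupil-candidate centers, in pixels.
    A pair qualifies when its interocular distance falls in a plausible
    range and the two candidates are roughly horizontally aligned.
    """
    pairs = []
    for i, j in combinations(range(len(points)), 2):
        (x1, y1), (x2, y2) = points[i], points[j]
        dist = math.hypot(x2 - x1, y2 - y1)
        if min_dist <= dist <= max_dist and abs(y2 - y1) <= max_dy:
            pairs.append((i, j))
    return pairs
```

A fuller system would weigh additional cues (same shape/size, similarly colored irises) rather than applying hard thresholds.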
In one embodiment, the invention may be implemented to
provide a method for analyzing a scene containing multiple
subjects to determine which pupils correspond to which
subjects. In another embodiment, the invention may be
implemented to provide a computer-driven apparatus
programmed to analyze a scene containing multiple subjects
to determine which pupils correspond to which subjects. In
still another embodiment, the invention may be implemented
to provide a signal-bearing medium tangibly embodying a
program of machine-readable instructions executable by a
digital data processing apparatus to perform operations for
analyzing a scene containing multiple subjects to determine
which pupils correspond to which subjects. Still another
embodiment involves a logic circuit configured to analyze a
scene containing multiple subjects to determine which pupils
correspond to which subjects.
The invention affords its users a number of distinct
advantages. First, unlike prior techniques, the invention is
capable of determining which pupils belong to which
faces/subjects in a scene with multiple subjects. In a scene
with multiple subjects, understanding the pupil-subject
relationship is an important prerequisite for tracking
facial expressions, tracking movement, tracking user
presence/absence, etc. As another advantage, the invention
is inexpensive to implement when compared to other detection
and tracking systems. For example, no dense range sensing is
required. Also, an inexpensive camera may be used when a
suitable lighting scheme is employed to cancel noise. The
analysis provided by the invention is particularly robust
because it is based on the grouping of multiple cues, both
spatial and temporal. The invention also provides a number
of other advantages and benefits, which should be apparent
from the following description of the invention.
`
`
`
`
`3
`BRIEF DESCRIPTION OF THE DRAWINGS
`FIG. 1 is a block diagram of the hardware components
`and interconnections of a machine-driven System for ana
`lyzing a Scene to determine which pupils correspond to
`which subjects.
`FIG. 2 shows an exemplary signal-bearing medium in
`accordance with the invention.
`FIG. 3 is a flowchart depicting a Sequence of operations
`for analyzing a Scene to determine which pupils correspond
`to which subjects.
`
DETAILED DESCRIPTION
The nature, objectives, and advantages of the invention
will become more apparent to those skilled in the art after
considering the following detailed description in connection
with the accompanying drawings. As mentioned above, the
invention concerns a system and method for analyzing a
scene to determine which pupils correspond to which
subjects.
`
`Hardware Components & Interconnections
`Introduction
One aspect of the invention concerns a system for
associating detected pupils with subjects, which may be
embodied by various hardware components and
interconnections. One example is the system 100, shown in
FIG. 1. Generally, the function of the system 100 is to
analyze features of a scene 112, including "spatial" and/or
"temporal" cues exhibited by the scene 112, to determine
which pupils in the scene correspond to which subjects. As
discussed below, one technique to map pupils to subjects is
to find matching pairs of pupils. In the illustrated
example, the scene 112 includes multiple subjects 114-116,
which also may be referred to as "users," "people," etc.
Human subjects are discussed throughout this disclosure for
ease of explanation; however, the invention may also be
practiced with nonhuman subjects such as livestock, zoo
animals, etc.
Although facial analysis or representation of faces in the
scene 112 is unnecessary, the system 100 may prepare a
mapping specifically associating each pupil with a
particular face in the scene 112. As explained below, the
foregoing pupil-subject mapping analysis helps to provide
more natural, user-friendly human-machine interfaces. For
example, if the system 100 is used to operate a computer
game, it can automatically determine how many players are
present.
The system 100 includes a number of different
components, which provide one example of the invention.
Ordinarily skilled artisans (having the benefit of this
disclosure) will recognize that certain components may be
substituted, eliminated, consolidated, or changed in various
ways without departing from the scope of the invention. The
system 100 includes a digital data processing apparatus 102
("computer"), a camera 104, a light source 106, and one or
more output devices 108.
Light Source
The light source 106 may be used for various purposes,
depending upon the manner of implementing the system
100. In one example, the light source 106 may serve to
illuminate the subjects' pupils to aid in pupil detection. In
this example, the light source 106 may include multiple
light-emitting elements, such as two concentric rings of
light-emitting elements as described in the '282 patent
mentioned above. This embodiment works by creating a first
image (using light from one angle) and a second pupil image
(using light from a different angle). Pupils appear dark in
one image and bright in the other, enabling their detection
by computing the difference between the first and second
images.
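The difference technique just described can be sketched as follows. This is an illustration under assumptions, not the patented circuit: it presumes two grayscale frames of the same scene, one lit so pupils appear bright and one lit so pupils appear dark, and the threshold value is an arbitrary assumption.

```python
import numpy as np

def pupil_mask(bright_img: np.ndarray, dark_img: np.ndarray,
               threshold: int = 40) -> np.ndarray:
    """Return a boolean mask of likely pupil pixels.

    bright_img, dark_img: uint8 grayscale frames captured under the two
    illumination angles. Pupils change brightness between the frames far
    more than the rest of the scene, so a difference-and-threshold
    isolates them. Casting to a signed type avoids uint8 wraparound
    where the difference is negative.
    """
    diff = bright_img.astype(np.int16) - dark_img.astype(np.int16)
    return diff > threshold
```

In practice the two frames must be captured close together in time, or subject motion produces spurious difference regions (one motivation for the aspect-ratio filter discussed later).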
The light source 106 may also serve to illuminate the
subjects' faces, to aid in facial analysis if this optional
feature is incorporated into the system 100. This function
may be performed with the same light-emitting components
used to illuminate pupils, or with additional light-emitting
elements.
The light source 106 may be provided by an incandescent
light bulb, fluorescent light bulb, infrared light-emitting
device, candle, vessel of reacting chemicals, light-emitting
diode(s), or another suitable source. Preferably, the light
source 106 uses infrared light, so that the subjects are not
disturbed by the light. To conveniently cast light upon the
subjects 114-116, the light source casts light upon a wide
area (e.g., omnidirectionally) rather than using a
collimated beam such as a laser beam. In one embodiment, the
light source 106 may be omitted, using ambient light
instead, such as room lighting, sunlight, etc.
Camera
The camera 104 comprises a device capable of
representing the appearance of the scene 112 in
machine-readable format. To suit this purpose, the camera
104 may comprise a black/white video camera, color video
camera, camcorder, "still shot" digital camera, etc. The
camera 104 may be sensitive to some or all of the visible
spectrum of light, infrared light, another wavelength of
light, or any other wavelength of emitted energy including
at least the energy emitted by the light source 106. In an
exemplary embodiment, where the light source 106 is an
incandescent bulb, the camera 104 comprises a black/white
video camera.
In one embodiment, a second camera (not shown) may
also be used, where the cameras have different fields of
view. The wide-angle camera may be used to generally locate
the subject, with the narrow-angle camera being used to
monitor more detailed features of the subject. The cameras
may also be used cooperatively to determine the range to the
subjects 114-116 using known stereo computer vision
techniques. Furthermore, various other known
non-vision-based range sensing systems may be used to
provide range information.
Output Device(s)
The output device(s) 108 include one or more devices
that receive the results of the present invention's
association of eyes (pupils) and subjects. For ease of
illustration, only one output device is described, although
there may be multiple output devices. In one embodiment, the
output device 108 may comprise a mechanism reporting the
association between detected pupils and subjects to a human
user; such a mechanism may be a video monitor, sound
speaker, LCD display, light-emitting diode, etc.
Another embodiment of the output device 108 is a
machine whose operation uses pupil-subject mapping as an
input. Some examples include (1) a "Nielson" rating monitor
installed in a home to detect the number of television
viewers, (2) a computer that activates or deactivates
certain functions depending upon whether any subjects (and
how many) are looking at the computer, (3) surveillance or
crowd flow monitoring/management at movies, seminars,
conferences, races, etc., and (4) surveillance or monitoring
of a group of animals in a zoo, farm, ranch, laboratory,
natural habitat, etc.
As another embodiment, the output device 108 may
comprise a photographic camera for taking pictures of a
group of people. The photographer provides input
representing the number of pupils or people in the scene to
the photographic camera (not shown), such as by adjusting an
indicator wheel, setting a switch, rotating a dial, pressing
buttons to enter data in conjunction with a menu shown on
a display screen, etc. In addition to this input, the
photographic camera receives certain electronic input from
the computer 102. This input includes signals representing
the number of pupils detected by the system 100 using the
methods described herein. The photographic camera evaluates
the computer input against the photographer's manual input,
and avoids taking the group picture until the number of
detected pupils (from the computer 102) equals the number of
known pupils (entered by the photographer). In this way, the
photographic camera ensures that the picture is taken when
all subjects' eyes are open.
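The interlock just described reduces to comparing two counts per captured frame. A minimal sketch, with hypothetical function names and an abstract frame representation not drawn from the patent:

```python
def first_good_frame(frames, expected_pupils, count_pupils):
    """Return the index of the first frame whose detected pupil count
    matches the photographer's manual entry, or None if none qualifies.

    count_pupils stands in for whatever pupil-detection routine the
    system provides; frames is any sequence of camera images.
    """
    for i, frame in enumerate(frames):
        if count_pupils(frame) == expected_pupils:
            return i
    return None
```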
Digital Data Processing Apparatus
The computer 102 receives input from the camera 104
and performs computations to associate each eye (pupil) in
the scene 112 with a subject. The computer 102 may also
conduct preliminary analysis of the scene 112 to initially
detect the pupils. As this feature is not necessary to the
invention, however, the computer 102 may obtain such
information from another source.
The computer 102 may be embodied by various hardware
components and interconnections. As shown, the computer
102 includes a processor 118, such as a microprocessor or
other processing machine, coupled to a storage 120. In the
present example, the storage 120 includes a fast-access
storage 122, as well as nonvolatile storage 124. The
fast-access storage 122 may comprise random access memory
(RAM), and may be used to store the programming
instructions executed by the processor 118. The nonvolatile
storage 124 may comprise, for example, one or more magnetic
data storage disks such as a "hard drive," a tape drive, or
any other suitable storage device. The computer 102 also
includes an input/output 110, such as a number of lines,
buses, cables, electromagnetic links, or other means for the
processor 118 to exchange data with the hardware external
to the computer 102, such as the light source 106, camera
104, and output device 108.
Despite the specific foregoing description, ordinarily
skilled artisans (having the benefit of this disclosure) will
recognize that the apparatus discussed above may be
implemented in a machine of different construction, without
departing from the scope of the invention. As a specific
example, one of the components 122 and 124 may be
eliminated; furthermore, the storage 120 may be provided
on-board the processor 118, or even provided externally to
the computer 102.
`
Operation
In addition to the various hardware embodiments
described above, a different aspect of the invention
concerns a method for analyzing a scene and determining
which pupils correspond to which subjects.
Signal-Bearing Media
In the context of FIG. 1, such a method may be
implemented, for example, by operating the computer 102 to
execute a sequence of machine-readable instructions. These
instructions may reside in various types of signal-bearing
media. In this respect, one aspect of the present invention
concerns a programmed product, comprising signal-bearing
media tangibly embodying a program of machine-readable
instructions executable by a digital data processor to
perform a method to associate eyes (pupils) in a scene with
subjects.
This signal-bearing media may comprise, for example,
RAM (not shown) contained within the storage 120, as
represented by the fast-access storage 122 for example.
`
`6
`Alternatively, the instructions may be contained in another
`Signal-bearing media, Such as a magnetic data Storage dis
`kette 200 (FIG. 2), directly or indirectly accessible by the
`processor 118. Whether contained in the storage 120, dis
`kette 200, or elsewhere, the instructions may be stored on a
`variety of machine-readable data Storage media, Such as a
`direct access storage device (DASD) (e.g., a conventional
`“hard drive,” redundant array of inexpensive disks (RAID),
`or etc.), magnetic tape, electronic read-only memory (e.g.,
`ROM, EPROM, or EEPROM), optical storage (e.g.,
`CD-ROM, WORM, DVD, digital optical tape), paper
`"punch cards, or other Suitable Signal-bearing media
`including transmission media Such as digital and analog and
`communication links and wireleSS. In an illustrative embodi
`ment of the invention, the machine-readable instructions
`may comprise Software object code, compiled from a lan
`guage Such as “C.” etc.
Logic Circuitry
In addition to the signal-bearing media discussed above,
the association of pupils with subjects according to this
invention may be implemented in a different way, without
using a processor to execute instructions. Namely, this
technique may be performed by using logic circuitry instead
of executing stored programming instructions with a digital
data processor. Depending upon the particular requirements
of the application with regard to speed, expense, tooling
costs, and the like, this logic may be implemented by
constructing an application-specific integrated circuit
(ASIC) having thousands of tiny integrated transistors. Such
an ASIC may be implemented using CMOS, TTL, VLSI, or
another suitable construction. Other alternatives include a
digital signal processing chip (DSP), discrete circuitry
(such as resistors, capacitors, diodes, inductors, and
transistors), a field programmable gate array (FPGA), a
programmable logic array (PLA), and the like.
In this embodiment, such logic circuitry may be used in
place of the computer 102. Furthermore, the small size of
the logic circuitry may permit installing, embedding, or
otherwise integrating the logic circuitry into the camera
104 to provide an extremely compact overall package.
Overall Sequence of Operation
FIG. 3 shows a sequence 300 to illustrate one example of
the present invention's method for analyzing a scene to
determine which pupils correspond to which subjects. For
ease of explanation, but without any intended limitation,
the example of FIG. 3 is described in the context of the
system 100 described above.
Locating Pupil Candidates
After the sequence 300 is initiated in step 302, step 304
searches for pupil candidates. In the illustrated embodiment,
this operation begins by the camera 104 generating one or
more machine-readable images of the scene 112. This may
involve taking a snapshot, capturing several different
pictures over time, or filming a video image. Next, the
computer 102 analyzes the image(s) to search for pupil
candidates, i.e., features likely to represent pupils. The
search involves identifying any features of the image(s)
that bear certain predefined characteristics.
In one example, the search for pupil candidates may be
started by illuminating the scene 112 with different
subcomponents of the light source having different relative
angles to the subjects. This creates one image with dark
pupils and another image with bright pupils. With this
technique, pupil candidates are identified by computing the
difference between the two images. This technique is
described in detail in the '282 patent, mentioned above.
Although the '282 patent describes one embodiment of
step 304, various other approaches will also be apparent to
ordinarily skilled artisans having the benefit of this
disclosure. The output of step 304 may comprise various
types of machine-readable representation of the candidate
pupils, such as (x,y) coordinates of pupil centers, an
identification of pixels in each camera image corresponding
to pupil candidates, or another representation of the size,
shape, position, and/or other distinguishing features of the
pupil candidates.
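One concrete form for that output, sketched here under the assumption that the earlier stage yields a boolean pupil mask, is the (x, y) center of each connected region of pupil pixels. This pure-Python flood fill is illustrative; a production system would use an optimized connected-components routine.

```python
from collections import deque

def candidate_centers(mask):
    """mask: 2-D list of booleans; returns one (x, y) center per blob."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    centers = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # Flood-fill this blob, collecting its pixel coordinates.
                queue, pixels = deque([(r, c)]), []
                seen[r][c] = True
                while queue:
                    pr, pc = queue.popleft()
                    pixels.append((pr, pc))
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = pr + dr, pc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and mask[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
                xs = [p[1] for p in pixels]
                ys = [p[0] for p in pixels]
                centers.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return centers
```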
Filtering Single Pupil Candidates
Having identified a number of pupil candidates (possible
pupils) in step 304, the computer 102 proceeds to filter
individual candidates to eliminate false candidates (step
306). This operation may consider a number of different
features to eliminate candidates that are not actually
pupils. For instance, the following features of each pupil
candidate may be evaluated:
the pupil candidate's ratio of horizontal size to vertical
size ("aspect ratio"), where an aspect ratio of 1:1
(vertical:horizontal) is sought to filter out motion
disparities that might be mistaken for pupils.
the pupil candidate's size, where a sufficiently small size
is sought to filter out reflective emblems on clothing,
such as so-called "retro" reflectors on running shoes
and jackets, which may otherwise be mistaken as pupils.
the pupil candidate's range, where subjects are expected
to be positioned a certain distance away; this
information may be derived from the camera's focal length,
using two cameras to perceive depth, or using other
distance sensing techniques or hardware.
comparing the pupil candidate to certain model
specifications, such as certain expected size, shape,
color, and shape of the region surrounding the pupil; an
example of this technique is discussed in Kothari &
Mitchell, "Detection of Eye Location in Unconstrained
Visual Images," Proc. Intl. Conf. on Image Processing,
September 1996.
eye blinking, exhibited by disappearance of the pupil
candidate for about 250-600 milliseconds every 10-45
seconds, for example.
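The first two cues above (aspect ratio and size) can be sketched as a simple predicate. The field names and threshold values here are illustrative assumptions, and the range and blink cues are omitted for brevity.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    x: float       # image coordinates of the candidate's center
    y: float
    width: float   # horizontal extent, in pixels
    height: float  # vertical extent, in pixels

def passes_filter(c: Candidate,
                  max_size: float = 20.0,
                  aspect_tolerance: float = 0.25) -> bool:
    """Keep a candidate only if it is small and roughly circular.

    An aspect ratio near 1:1 rejects motion disparities, and a size
    cap rejects large reflective emblems such as "retro" reflectors.
    """
    if c.width > max_size or c.height > max_size:
        return False
    aspect = c.width / c.height
    return abs(aspect - 1.0) <= aspect_tolerance
```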
Having eliminated a number of false candidates in step
306, the remaining (filtered) candidates are especially
likely to represent actual pupils. There still may be some
false candidates, however, due to the candidates' extreme
similarities to pupils or due to limitations of the
filtering process. Accordingly, the filtered results are
still called "candidates."
Associate Pupils with Subjects
Having identified and filtered pupil candidates, step 308
performs the operation of associating the filtered pupil
candidates with subjects. In the exemplary setup of FIG. 1,
this operation is performed by the computer 102. Although
step 308 may also perform the optional tasks of identifying
regions in the image corresponding to faces and determining
which pupils belong to the resultant facial regions, this is
not necessary. In a more general sense, step 308 performs
pupil-subject mapping by determining which filtered pupil
candidates belong to which subject, regardless of whether
the facial regions in the image are identified or analyzed.
In this sense, step 308 is considered to perform
"pupil-subject" mapping, but since eyes are a necessary part
of the face, step 308 may also be considered to perform
"pupil-face" mapping. As an example, in an image containing