US006539100B1

(12) United States Patent                  (10) Patent No.: US 6,539,100 B1
     Amir et al.                           (45) Date of Patent: Mar. 25, 2003

(54) METHOD AND APPARATUS FOR ASSOCIATING PUPILS WITH SUBJECTS

(75) Inventors: Arnon Amir, Cupertino, CA (US); Myron Dale Flickner, San Jose,
     CA (US); David Bruce Koons, San Jose, CA (US); Carlos Hitoshi Morimoto,
     San Jose, CA (US)

(73) Assignee: International Business Machines Corporation, Armonk, NY (US)

( * ) Notice: Subject to any disclaimer, the term of this patent is extended
     or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/238,979

(22) Filed: Jan. 27, 1999

(51) Int. Cl.7 .................. G06K 9/00
(52) U.S. Cl. .................. 382/117; 382/173
(58) Field of Search .......... 382/117, 199, 128, 173, 286; 351/206, 221,
     208, 210; 434/40, 42

(56) References Cited

     U.S. PATENT DOCUMENTS

     4,275,385 A    6/1981  White ................... 340/312
     4,625,329 A   11/1986  Ishikawa et al. ......... 382/1
     4,931,865 A    6/1990  Scarampi ................ 358/84
     5,016,282 A *  5/1991  Tomono et al. ........... 382/117
     5,291,560 A *  3/1994  Daugman ................. 382/117
     5,430,809 A    7/1995  Tomitaka ................ 382/173
     5,432,866 A *  7/1995  Sakamoto ................ 382/128
     5,550,928 A    8/1996  Lu et al. ............... 382/116
     5,596,362 A    1/1997  Zhou .................... 348/14

     OTHER PUBLICATIONS

Aizawa et al., "Detection and Tracking of Facial Features," Proc. of the SPIE
Comm. and Image Proc. (1995), v. 2501, Taipei, Taiwan, pp. 1161-1172.
Baluja et al., "Neural Network-Based Face Detection," Proc. IEEE Conf. on
Computer Vision and Pattern Recognition (1996), San Francisco, CA, pp. 203-208.
Baluja et al., "Rotation Invariant Neural Network-Based Face Detection," Proc.
IEEE Conf. on Computer Vision and Pattern Recognition (Jun. 1998), Santa
Barbara, CA, pp. 38-44.
Berard et al., "LAFTER: Lips and Face Real Time Tracker," Proc. IEEE Conf. on
Computer Vision and Pattern Recognition (Jun. 1997), Puerto Rico, pp. 123-129.

(List continued on next page.)

Primary Examiner-Andrew W. Johns
Assistant Examiner-Seyed Azarian
(74) Attorney, Agent, or Firm-Dan Hubert

(57) ABSTRACT

A method and apparatus analyzes a scene to determine which pupils correspond
to which subjects. First, a machine-readable representation of the scene, such
as a camera image, is generated. Although more detail may be provided, this
representation minimally depicts certain visually perceivable characteristics
of multiple pupil candidates corresponding to multiple subjects in the scene.
A machine such as a computer then examines various features of the pupil
candidates. The features under analysis include (1) visually perceivable
characteristics of the pupil candidates at one given time ("spatial cues"),
and (2) changes in visually perceivable characteristics of the pupil
candidates over a sampling period ("temporal cues"). The spatial and temporal
cues may be used to identify associated pupil pairs. Some exemplary spatial
cues include interocular distance, shape, height, and color of potentially
paired pupils. In addition to features of the pupils themselves, spatial cues
may also include nearby facial features such as presence of a
nose/mouth/eyebrows in predetermined relationship to potentially paired
pupils, a similarly colored iris surrounding each of two pupils, skin of
similar color nearby, etc. Some exemplary temporal cues include motion or
blinking of paired pupils together, etc. With the foregoing examination, each
pupil candidate can be associated with a subject in the scene.

65 Claims, 2 Drawing Sheets

[Front-page figure: thumbnail of FIG. 1, showing the digital data processing
apparatus 102 (fast-access storage 122, non-volatile storage 124 within
storage 120) connected to output device(s) 108 and the scene 112.]

IPR2021-00923
Apple EX1006 Page 1
OTHER PUBLICATIONS

Birchfeld, "Elliptical Head Tracking Using Intensity Gradients and Color
Histograms," Proc. IEEE Conf. on Computer Vision and Pattern Recognition
(1998), pp. 232-237.
Cohen et al., "Feature Extraction From Faces Using Deformable Template,"
International Journal of Computer Vision (1992), vol. 8, No. 2, pp. 99-111.
Darrell et al., "Active Face Tracking and Pose Estimation in an Interactive
Room," MIT Media Lab, TR-356 (1996), pp. 1-16.
Darrell et al., "A Virtual Mirror Interface Using Real-Time Robust Face
Tracking," Proc. Int'l Conf. on Automatic Face and Gesture Recognition (Apr.
1998), Japan, pp. 616-621.
Darrell et al., "Integrated Person Tracking Using Stereo, Color, and Pattern
Detection," Proc. IEEE Conf. on Computer Vision and Pattern Recognition (Jun.
1998), Santa Barbara, CA, pp. 601-608.
Ebisawa et al., "Effectiveness of Pupil Area Detection Technique Using Two
Light Sources and Image Difference Method," Proc. of the 15th Ann. Int'l Conf.
of the IEEE Engineering in Medicine and Biology Society, vol. 15 (Jan. 1993),
pp. 1268-1269.
Ebisawa, "Unconstrained Pupil Detection Technique Using Two Light Sources and
the Image Difference Method," Visualization and Intelligent Design in
Engineering and Architecture (1995), pp. 79-89.
Fieguth et al., "Color-Based Tracking of Heads and Other Mobile Objects at
Video Frame Rates," Proc. IEEE Conference on Computer Vision and Pattern
Recognition (1997), pp. 21-27.
Govindaraju et al., "A Computational Model For Face Location," Proc. of the
Int'l Conf. on Computer Vision (Dec. 1990), Osaka, Japan, pp. 718-721.
Harville et al., "Tracking People With Integrated Stereo, Color, and Face
Detection," Proc. of the IEEE Conference on Computer Vision and Pattern
Recognition (Jun. 1998), pp. 601-608.
Kothari et al., "Detection of Eye Locations in Unconstrained Visual Images,"
Proc. Int'l Conf. on Image Processing (1996), Switzerland, pp. 519-522.
Poggio et al., "Example-Based Learning for View-Based Human Face Detection,"
MIT AI Lab TR-AI-1521 (1994), pp. 1-20.
Scassellati, "Eye Finding via Face Detection for a Foveated, Active Vision
System," Proceedings of the 15th Conf. on Artificial Intelligence (AAAI-98),
ISBN 0-262-51098-7, Jul. 26-30, 1998.
Scassellati, "Real-Time Face and Eye Detection," world-wide-web, 1998.
Sirohey, "Human Face Segmentation and Identification," CAR-TR-695, CS-TR-3176
(1993), pp. 1-33.
Stiefelhagen et al., "A Model-Based Gaze Tracking System," Proc. Joint
Symposium on Intelligence and Systems (1996), pp. 304-310.
Waibel et al., "A Real-Time Face Tracker," Proc. of the 3rd IEEE Workshop on
Applications of Computer Vision (1996), Sarasota, FL, pp. 142-147.

* cited by examiner
[U.S. Patent, Mar. 25, 2003, Sheet 1 of 2, US 6,539,100 B1. FIG. 1: block
diagram of the system: a camera 104 viewing the scene 112, output device(s)
108, and the digital data processing apparatus 102, which contains the
processor 118, input/output 110, and storage 120 (fast-access storage 122 and
non-volatile storage 124). FIG. 2: an exemplary signal-bearing medium,
diskette 200.]
[U.S. Patent, Mar. 25, 2003, Sheet 2 of 2, US 6,539,100 B1. FIG. 3: flowchart
of the sequence 300: START (302); SEARCH FOR PUPIL CANDIDATES (304); FILTER
(306); ASSOCIATE PUPIL CANDIDATES WITH FACES (308); BEGIN TRACKING VERIFIED
FACES AND MONITOR FACE CHARACTERISTICS (310); CONTINUE (312); end (314).]
METHOD AND APPARATUS FOR ASSOCIATING PUPILS WITH SUBJECTS

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to sophisticated interfaces between humans and
machines. More particularly, the invention concerns a method and apparatus for
analyzing a scene containing multiple subjects to determine which pupils
correspond to which subjects.

2. Description of the Related Art

As more powerful human-machine interfaces are being developed, many such
interfaces include the capability to perform user detection. By detecting the
presence of a human user, a machine can manage its own functions more
efficiently, and more reliably respond to human input. For example, a computer
may employ user detection to selectively activate a screen saver when no users
are present, or to display advertising banners only when a user is present. As
another application, in home-based television viewing monitors for assessing
"Nielson" ratings, it may be useful to determine how many people are watching
a television. User detection techniques such as face detection may also be
used as a valuable precursor to eye gaze detection. In addition, face
detection will likely be an important component of future human-machine
interfaces that consider head and facial gestures to supplement mouse, voice,
keyboard, and other user input. Such head and facial gestures may include
nodding, leaning forward, head shaking, and the like. Thus, user detection is
an important tool that enables a more natural human-machine interface.

Some user detection techniques are already known. For instance, a number of
techniques focus on face detection using a combination of attributes such as
color, shape, motion, and depth. Some of these approaches, for example,
include template matching as described in U.S. Pat. No. 5,550,928 to Lu et
al., and skin color analysis as described in U.S. Pat. No. 5,430,809 to
Tomitaka. Another approach is the "Interval" system. The Interval system
obtains range information using a sophisticated stereo camera system, gathers
color information to evaluate as flesh tones, and analyzes face candidate
inputs with a neural network trained to find faces. One drawback of the
Interval system is its substantial computation expense. An example of the
Interval system is described in Darrell et al., "Tracking People With
Integrated Stereo, Color, and Face Detection," Perceptual User Interface
Workshop, 1997. Although the Interval system may be satisfactory for some
applications, certain users with less powerful or highly utilized computers
may be frustrated with the Interval system's computation requirements. The
following references discuss some other user detection schemes: (1) T. Darrell
et al., "Integrated Person Tracking Using Stereo, Color, and Pattern
Detection," 1998, and (2) T. Darrell et al., "Active Face Tracking and Pose
Estimation in an Interactive Room," 1996.

As a different approach, some techniques perform user detection based on pupil
detection. Pupil characteristics may be further analyzed to track eye position
and movement, as described in U.S. Pat. No. 5,016,282 to Tomono et al.
Although the '282 patent and other pupil detection schemes may be satisfactory
for some applications, such approaches are unable to process multiple faces
and multiple pupils in an input image. Some difficulties include determining
which pupils belong to the same face, and accounting for a partially
off-screen person with only one pupil showing.

Thus, when multiple people and multiple pupils are present in an image, there
may be considerable difficulty in associating pupils with people in order to
detect how many people are present. In this respect, known approaches are not
completely adequate for some applications due to certain unsolved problems.

SUMMARY OF THE INVENTION

Broadly, the present invention concerns a method and apparatus for analyzing a
scene containing multiple subjects to determine which pupils correspond to
which subjects. First, a machine-readable representation of the scene, such as
a camera image, is generated. Although more detail may be provided, this
representation minimally depicts certain visually perceptible characteristics
(such as relative locations, shape, size, etc.) of multiple pupil candidates
corresponding to multiple subjects in the scene. A computer analyzes various
characteristics of the pupil candidates, such as: (1) visually perceivable
characteristics of the pupil candidates at one given time ("spatial cues"),
and (2) changes in visually perceivable characteristics of the pupil
candidates over a sampling period ("temporal cues"). The spatial and temporal
cues may be used to identify associated pupil pairs, i.e., two pupils
belonging to the same subject/face. Some exemplary spatial cues include
interocular distance between potentially paired pupils, horizontal alignment
of pupils, same shape/size of pupils, etc. In addition to features of the
pupils themselves, spatial cues may also include nearby facial features such
as presence of a nose/mouth/eyebrows in predetermined relationship to
potentially paired pupils, similarly colored irises surrounding the pupils,
nearby skin of similar color, etc. Some exemplary temporal cues include motion
or blinking of paired pupils together. With the foregoing analysis, each pupil
candidate can be associated with a subject in the scene.

In one embodiment, the invention may be implemented to provide a method for
analyzing a scene containing multiple subjects to determine which pupils
correspond to which subjects. In another embodiment, the invention may be
implemented to provide a computer-driven apparatus programmed to analyze a
scene containing multiple subjects to determine which pupils correspond to
which subjects. In still another embodiment, the invention may be implemented
to provide a signal-bearing medium tangibly embodying a program of
machine-readable instructions executable by a digital data processing
apparatus to perform operations for analyzing a scene containing multiple
subjects to determine which pupils correspond to which subjects. Still another
embodiment involves a logic circuit configured to analyze a scene containing
multiple subjects to determine which pupils correspond to which subjects.

The invention affords its users a number of distinct advantages. First, unlike
prior techniques, the invention is capable of determining which pupils belong
to which faces/subjects in a scene with multiple subjects. In a scene with
multiple subjects, understanding the pupil-subject relationship is an
important prerequisite for tracking facial expressions, tracking movement,
tracking user presence/absence, etc. As another advantage, the invention is
inexpensive to implement when compared to other detection and tracking
systems. For example, no dense range sensing is required. Also, an inexpensive
camera may be used when a suitable lighting scheme is employed to cancel
noise. The analysis provided by the invention is particularly robust because
it is based on the grouping of multiple cues, both spatial and temporal. The
invention also provides a number of other advantages and benefits, which
should be apparent from the following description of the invention.
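The pairing of pupils by spatial cues summarized above (interocular distance,
horizontal alignment, and same shape/size) can be sketched in Python. This is
an illustrative sketch, not the patented implementation: the threshold values
and the greedy pairing order are assumptions chosen for demonstration.

```python
import itertools

# Illustrative thresholds -- assumptions for demonstration, not values
# taken from the patent.
MAX_INTEROCULAR = 120   # max pixel distance between paired pupils
MAX_VERTICAL_SKEW = 15  # paired pupils should be roughly horizontally aligned
MAX_SIZE_RATIO = 1.5    # paired pupils should be of similar size

def plausible_pair(a, b):
    """Apply spatial cues to two pupil candidates, each a dict holding a
    center (x, y) and a radius r."""
    dx = abs(a["x"] - b["x"])
    dy = abs(a["y"] - b["y"])
    big, small = max(a["r"], b["r"]), min(a["r"], b["r"])
    return (0 < dx <= MAX_INTEROCULAR           # interocular distance
            and dy <= MAX_VERTICAL_SKEW         # horizontal alignment
            and big / small <= MAX_SIZE_RATIO)  # same shape/size

def pair_pupils(candidates):
    """Greedily group pupil candidates into pairs, each candidate joining
    at most one pair, using the spatial cues above."""
    pairs, used = [], set()
    for i, j in itertools.combinations(range(len(candidates)), 2):
        if i not in used and j not in used and plausible_pair(
                candidates[i], candidates[j]):
            pairs.append((i, j))
            used.update((i, j))
    return pairs
```

For example, four candidates centered at (100, 200), (160, 203), (400, 210),
and (455, 212) group into two pairs, one per subject, because the
cross-subject distances exceed the assumed interocular limit.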
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of the hardware components and interconnections of a
machine-driven system for analyzing a scene to determine which pupils
correspond to which subjects.

FIG. 2 shows an exemplary signal-bearing medium in accordance with the
invention.

FIG. 3 is a flowchart depicting a sequence of operations for analyzing a scene
to determine which pupils correspond to which subjects.

DETAILED DESCRIPTION

The nature, objectives, and advantages of the invention will become more
apparent to those skilled in the art after considering the following detailed
description in connection with the accompanying drawings. As mentioned above,
the invention concerns a system and method for analyzing a scene to determine
which pupils correspond to which subjects.

Hardware Components & Interconnections

Introduction

One aspect of the invention concerns a system for associating detected pupils
with subjects, which may be embodied by various hardware components and
interconnections. One example is the system 100, shown in FIG. 1. Generally,
the function of the system 100 is to analyze features of a scene 112,
including "spatial" and/or "temporal" cues exhibited by the scene 112, to
determine which pupils in the scene correspond to which subjects. As discussed
below, one technique to map pupils to subjects is to find matching pairs of
pupils. In the illustrated example, the scene 112 includes multiple subjects
114-116, which may also be referred to as "users," "people," etc. Human
subjects are discussed throughout this disclosure for ease of explanation;
however, the invention may also be practical with nonhuman subjects such as
livestock, zoo animals, etc.

Although facial analysis or representation of faces in the scene 112 is
unnecessary, the system 100 may prepare a mapping specifically associating
each pupil to a particular face in the scene 112. As explained below, the
foregoing pupil-subject mapping analysis helps to provide more natural,
user-friendly human-machine interfaces. For example, if the system 100 is used
to operate a computer game, it can automatically determine how many players
are present.

The system 100 includes a number of different components, which provide one
example of the invention. Ordinarily skilled artisans (having the benefit of
this disclosure) will recognize that certain components may be substituted,
eliminated, consolidated, or changed in various ways without departing from
the scope of the invention. The system 100 includes a digital data processing
apparatus 102 ("computer"), a camera 104, a light source 106, and one or more
output devices 108.

Light Source

The light source 106 may be used for various purposes, depending upon the
manner of implementing the system 100. In one example, the light source 106
may serve to illuminate the subjects' pupils to aid in pupil detection. In
this example, the light source 106 may include multiple light-emitting
elements, such as two concentric rings of light-emitting elements as described
in the '282 patent mentioned above. This embodiment works by creating a first
pupil image (using light from one angle) and a second pupil image (using light
from a different angle). Pupils appear dark in one image and bright in the
other, enabling their detection by computing the difference between the first
and second images.

The light source 106 may also serve to illuminate the subjects' faces, to aid
in facial analysis if this optional feature is incorporated into the system
100. This function may be performed with the same light-emitting components
used to illuminate pupils, or with additional light-emitting elements.

The light source 106 may be provided by an incandescent light bulb,
fluorescent light bulb, infrared light-emitting device, candle, vessel of
reacting chemicals, light-emitting diode(s), or another suitable source.
Preferably, the light source 106 uses infrared light, so that the subjects are
not disturbed by the light. To conveniently cast light upon the subjects
114-116, the light source casts light upon a wide area (e.g.,
omnidirectionally) rather than using a collimated beam such as a laser beam.
In one embodiment, the light source 106 may be omitted, using ambient light
instead, such as room lighting, sunlight, etc.

Camera

The camera 104 comprises a device capable of representing the appearance of
the scene 112 in machine-readable format. To suit this purpose, the camera 104
may comprise a black/white video camera, color video camera, camcorder, "still
shot" digital camera, etc. The camera 104 may be sensitive to some or all of
the visible spectrum of light, infrared light, another wavelength of light, or
any other wavelength of emitted energy including at least the energy emitted
by the light source 106. In an exemplary embodiment, where the light source
106 is an incandescent bulb, the camera 104 comprises a black/white video
camera.

In one embodiment, a second camera (not shown) may also be used, where the
cameras have different fields of view. The wide-angle camera may be used to
generally locate the subject, with the narrow-angle camera being used to
monitor more detailed features of the subject. The cameras may also be used
cooperatively to determine the range to the subjects 114-116 using known
stereo computer vision techniques. Furthermore, various other known
non-vision-based range sensing systems may be used to provide range
information.

Output Device(s)

The output device(s) 108 include one or more devices that receive the results
of the present invention's association of eyes (pupils) and subjects. For ease
of illustration, only one output device is described, although there may be
multiple output devices. In one embodiment, the output device 108 may comprise
a mechanism reporting the association between detected pupils and subjects to
a human user; such a mechanism may be a video monitor, sound speaker, LCD
display, light-emitting diode, etc.

Another embodiment of the output device 108 is a machine whose operation uses
pupil-subject mapping as an input. Some examples include (1) a "Nielson"
rating monitor installed in a home to detect the number of television viewers,
(2) a computer that activates or deactivates certain functions depending upon
whether any subjects (and how many) are looking at the computer, (3)
surveillance or crowd flow monitoring/management at movies, seminars,
conferences, races, etc., and (4) surveillance or monitoring of a group of
animals in a zoo, farm, ranch, laboratory, natural habitat, etc.

As another embodiment, the output device 108 may comprise a photographic
camera for taking pictures of a group of people. The photographer provides
input representing the number of pupils or people in the scene to the
photographic camera (not shown), such as by adjusting an indicator wheel,
setting a switch, rotating a dial, pressing buttons to enter data in
conjunction with a menu shown on a display screen, etc.
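This photographic-camera embodiment, which holds off the shutter until the
detected pupil count matches the photographer's entry, can be reduced to a
small gating sketch. It is illustrative only: the function names are
hypothetical, and the rule that each expected person contributes two open-eye
pupils is an assumption, not language from the patent.

```python
def ready_to_shoot(detected_pupils, expected_people):
    """Fire the shutter only when every expected subject shows two open eyes.

    The two-pupils-per-person rule is an illustrative assumption."""
    return detected_pupils == 2 * expected_people

def take_group_picture(expected_people, pupil_counts):
    """Scan successive pupil counts reported by the detector and return the
    index of the first report where the picture would be taken, or None."""
    for frame, count in enumerate(pupil_counts):
        if ready_to_shoot(count, expected_people):
            return frame
    return None
```

With three people expected, a count sequence of [5, 4, 6, 6] would trigger the
shutter at the third report (index 2), when all six pupils are open and
visible.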
In addition to this input, the photographic camera receives certain electronic
input from the computer 102. This input includes signals representing the
number of pupils detected by the system 100 using the methods described
herein. The photographic camera evaluates the computer input against the
photographer's manual input, and avoids taking the group picture until the
number of detected pupils (from the computer 102) equals the number of known
pupils (entered by the photographer). In this way, the photographic camera
ensures that the picture is taken when all subjects' eyes are open.

Digital Data Processing Apparatus

The computer 102 receives input from the camera 104 and performs computations
to associate each eye (pupil) in the scene 112 with a subject. The computer
102 may also conduct preliminary analysis of the scene 112 to initially detect
the pupils. As this feature is not necessary to the invention, however, the
computer 102 may instead obtain such information from another source.

The computer 102 may be embodied by various hardware components and
interconnections. As shown, the computer 102 includes a processor 118, such as
a microprocessor or other processing machine, coupled to a storage 120. In the
present example, the storage 120 includes a fast-access storage 122, as well
as nonvolatile storage 124. The fast-access storage 122 may comprise random
access memory (RAM), and may be used to store the programming instructions
executed by the processor 118. The nonvolatile storage 124 may comprise, for
example, one or more magnetic data storage disks such as a "hard drive," a
tape drive, or any other suitable storage device. The computer 102 also
includes an input/output 110, such as a number of lines, buses, cables,
electromagnetic links, or other means for the processor 118 to exchange data
with the hardware external to the computer 102, such as the light source 106,
camera 104, and output device 108.

Despite the specific foregoing description, ordinarily skilled artisans
(having the benefit of this disclosure) will recognize that the apparatus
discussed above may be implemented in a machine of different construction,
without departing from the scope of the invention. As a specific example, one
of the components 122 and 124 may be eliminated; furthermore, the storage 120
may be provided on board the processor 118, or even provided externally to the
computer 102.

Operation

In addition to the various hardware embodiments described above, a different
aspect of the invention concerns a method for analyzing a scene and
determining which pupils correspond to which subjects.

Signal-Bearing Media

In the context of FIG. 1, such a method may be implemented, for example, by
operating the computer 102 to execute a sequence of machine-readable
instructions. These instructions may reside in various types of signal-bearing
media. In this respect, one aspect of the present invention concerns a
programmed product, comprising signal-bearing media tangibly embodying a
program of machine-readable instructions executable by a digital data
processor to perform a method to associate eyes (pupils) in a scene with
subjects.

This signal-bearing media may comprise, for example, RAM (not shown) contained
within the storage 120, as represented by the fast-access storage 122 for
example. Alternatively, the instructions may be contained in another
signal-bearing media, such as a magnetic data storage diskette 200 (FIG. 2),
directly or indirectly accessible by the processor 118. Whether contained in
the storage 120, diskette 200, or elsewhere, the instructions may be stored on
a variety of machine-readable data storage media, such as a direct access
storage device (DASD) (e.g., a conventional "hard drive," redundant array of
inexpensive disks (RAID), etc.), magnetic tape, electronic read-only memory
(e.g., ROM, EPROM, or EEPROM), optical storage (e.g., CD-ROM, WORM, DVD,
digital optical tape), paper "punch" cards, or other suitable signal-bearing
media, including transmission media such as digital and analog communication
links and wireless links. In an illustrative embodiment of the invention, the
machine-readable instructions may comprise software object code, compiled from
a language such as "C," etc.

Logic Circuitry

In addition to the signal-bearing media discussed above, the association of
pupils with subjects according to this invention may be implemented in a
different way, without using a processor to execute instructions. Namely, this
technique may be performed by using logic circuitry instead of executing
stored programming instructions with a digital data processor. Depending upon
the particular requirements of the application with regard to speed, expense,
tooling costs, and the like, this logic may be implemented by constructing an
application-specific integrated circuit (ASIC) having thousands of tiny
integrated transistors. Such an ASIC may be implemented using CMOS, TTL, VLSI,
or another suitable construction. Other alternatives include a digital signal
processing chip (DSP), discrete circuitry (such as resistors, capacitors,
diodes, inductors, and transistors), a field programmable gate array (FPGA), a
programmable logic array (PLA), and the like.

In this embodiment, such logic circuitry may be used in replacement of the
computer 102. Furthermore, the small size of the logic circuitry may permit
installing, embedding, or otherwise integrating the logic circuitry into the
camera 104 to provide an extremely compact overall package.

Overall Sequence of Operation

FIG. 3 shows a sequence 300 to illustrate one example of the present
invention's method for analyzing a scene to determine which pupils correspond
to which subjects. For ease of explanation, but without any intended
limitation, the example of FIG. 3 is described in the context of the system
100 described above.

Locating Pupil Candidates

After the sequence 300 is initiated in step 302, step 304 searches for pupil
candidates. In the illustrated embodiment, this operation begins by the camera
104 generating one or more machine-readable images of the scene 112. This may
involve taking a snapshot, capturing several different pictures over time, or
filming a video image. Next, the computer 102 analyzes the image(s) to search
for pupil candidates, i.e., features likely to represent pupils. The search
involves identifying any features of the image(s) that bear certain predefined
characteristics.

In one example, the search for pupil candidates may be started by illuminating
the scene 112 with different subcomponents of the light source having
different relative angles to the subjects. This creates one image with dark
pupils and another image with bright pupils. With this technique, pupil
candidates are identified by computing the difference between the two images.
This technique is described in detail in the '282 patent, mentioned above.
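The image-difference search for pupil candidates can be sketched minimally in
Python. Plain nested lists stand in for the grayscale camera frames, and the
threshold is an assumption for illustration, not a value from the patent.

```python
# Minimal sketch of the image-difference technique: pupils appear bright in
# the frame lit from one angle and dark in the frame lit from another, so
# subtracting the two frames highlights pupil candidates. The threshold
# below is an illustrative assumption.

THRESHOLD = 50  # assumed minimum brightness difference marking a pupil

def pupil_candidate_pixels(bright_img, dark_img, threshold=THRESHOLD):
    """Return (row, col) coordinates where the bright-pupil frame exceeds
    the dark-pupil frame by more than `threshold`."""
    coords = []
    for r, (bright_row, dark_row) in enumerate(zip(bright_img, dark_img)):
        for c, (b, d) in enumerate(zip(bright_row, dark_row)):
            if b - d > threshold:
                coords.append((r, c))
    return coords

# Two 4x4 frames of the same scene; only the pupil at (1, 2) changes.
bright = [[10, 10, 10, 10],
          [10, 10, 200, 10],
          [10, 10, 10, 10],
          [10, 10, 10, 10]]
dark = [[10, 10, 10, 10],
        [10, 10, 15, 10],
        [10, 10, 10, 10],
        [10, 10, 10, 10]]

print(pupil_candidate_pixels(bright, dark))  # [(1, 2)]
```

The coordinates produced this way are one possible form for the output of step
304; a real system would then group neighboring pixels into candidate regions.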
Although the '282 patent describes one embodiment of step 304, various other
approaches will also be apparent to ordinarily skilled artisans having the
benefit of this disclosure. The output of step 304 may comprise various types
of machine-readable representation of the candidate pupils, such as (x,y)
coordinates of pupil centers, an identification of pixels in each camera image
corresponding to pupil candidates, or another representation of the size,
shape, position, and/or other distinguishing features of the pupil candidates.

Filtering Single Pupil Candidates

Having identified a number of pupil candidates (possible pupils) in step 304,
the computer 102 proceeds to filter individual candidates to eliminate false
candidates (step 306). This operation may consider a number of different
features to eliminate candidates that are not actually pupils. For instance,
the following features of each pupil candidate

[...]

tracking subjects' movement, facial expressions, presence in the scene 112,
etc.

To map pupils to subjects, step 308 considers "spatial" cues as well as
"temporal" cues. The spatial cues are visually perceivable characteristics of
the pupils or surrounding areas in one image ("static"), whereas the temporal
cues concern changes in visual characteristics over time ("dynamic"). Some
exemplary spatial cues include characteristics of the pupils themselves, and
may be employed to associate pupils with people by matching counterpart pupils
of a pair. Some exemplary pupil characteristics include one or more of the
following:

1) two pupil candidates may be counterparts if they have the same size and/or
shape.

2) two pupil candidates may be counterparts if they have