`
`SAMSUNG EXHIBIT 1007
`Samsung v. Image Processing Techs.
`
`
`
`FOR THE PURPOSES OF INFORMATION ONLY
`
`Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT.
`
AL Albania
AM Armenia
AT Austria
AU Australia
AZ Azerbaijan
BA Bosnia and Herzegovina
BB Barbados
BE Belgium
BF Burkina Faso
BG Bulgaria
BJ Benin
BR Brazil
BY Belarus
CA Canada
CF Central African Republic
CG Congo
CH Switzerland
CI Côte d'Ivoire
CM Cameroon
CN China
CU Cuba
CZ Czech Republic
DE Germany
DK Denmark
EE Estonia
ES Spain
FI Finland
FR France
GA Gabon
GB United Kingdom
GE Georgia
GH Ghana
GN Guinea
GR Greece
HU Hungary
IE Ireland
IL Israel
IS Iceland
IT Italy
JP Japan
KE Kenya
KG Kyrgyzstan
KP Democratic People's Republic of Korea
KR Republic of Korea
KZ Kazakstan
LC Saint Lucia
LI Liechtenstein
LK Sri Lanka
LR Liberia
LS Lesotho
LT Lithuania
LU Luxembourg
LV Latvia
MC Monaco
MD Republic of Moldova
MG Madagascar
MK The former Yugoslav Republic of Macedonia
ML Mali
MN Mongolia
MR Mauritania
MW Malawi
MX Mexico
NE Niger
NL Netherlands
NO Norway
NZ New Zealand
PL Poland
PT Portugal
RO Romania
RU Russian Federation
SD Sudan
SE Sweden
SG Singapore
SI Slovenia
SK Slovakia
SN Senegal
SZ Swaziland
TD Chad
TG Togo
TJ Tajikistan
TM Turkmenistan
TR Turkey
TT Trinidad and Tobago
UA Ukraine
UG Uganda
US United States of America
UZ Uzbekistan
VN Viet Nam
YU Yugoslavia
ZW Zimbabwe
`
`
`
`
`WO 99/35606
`
`PCT/JP99/00010
`
`DESCRIPTION
`
`SYSTEM FOR HUMAN FACE TRACKING
`
`5
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`BACKGROUND OF THE INVENTION
`
The present invention relates to a system for locating a human face within an image, and more particularly to a system suitable for real-time tracking of a human face in video sequences.

Numerous systems have been developed for the detection of a target within an input image. In particular, human face detection within an image is of considerable importance. Numerous devices benefit from the automatic determination of whether an image (or video frame) contains a human face and, if so, where the human face is in the image. Such devices may be, for example, a video phone or a human computer interface. A human computer interface identifies the location of a face, if any, identifies the particular face, and understands facial expressions and gestures.

Traditionally, face detection has been performed using correlation template based techniques which compute similarity measurements between a fixed target pattern and multiple candidate image locations. If any of the similarity measurements exceeds a threshold value then a "match" is declared, indicating that a face has been detected, together with its location. Multiple correlation templates may be employed to detect major facial sub-features. A related technique, known as "view-based eigen-spaces," defines a distance metric based on a parameterizable sub-space of the original image vector space. If the distance metric is below a threshold value then the system indicates that a face has been detected. An alternative face detection technique involves spatial image invariants, which rely on compiling a set of image invariants particular to facial images. The input image is then scanned for positive occurrences of these invariants at all possible locations to identify human faces.
`
Yang et al., in a paper entitled "A Real-Time Face Tracker," discloses a real-time face tracking system. The system acquires a red-green-blue (RGB) image and filters it to obtain chromatic colors (r and g), known as "pure" colors in the absence of brightness. The transformation of red-green-blue to chromatic colors is a transformation from a three dimensional space (RGB) to a two dimensional space (rg). The distribution of facial colors within the chromatic color space is primarily clustered in a small region. Yang et al. determined, after a detailed analysis of skin-color distributions, that the skin colors of different people under different lighting conditions have similar Gaussian distributions in the chromatic color space. To determine whether a particular red-green-blue pixel maps onto the region of the chromatic color space indicative of a facial color, Yang et al. teaches the use of a two-dimensional Gaussian model. Based on the results of the two-dimensional Gaussian model for each pixel within the RGB image, the facial region of the image is determined. Unfortunately, the two-dimensional Gaussian model is computationally intensive and thus unsuitable for inexpensive real-time systems. Moreover, the system taught by Yang et al. uses a simple tracking mechanism which leaves the position of the tracked face susceptible to jittering.
`
Eleftheriadis et al., in a paper entitled "Automatic Face Location Detection and Tracking for Model-Assisted Coding of Video Teleconferencing Sequences at Low Bit-Rate," teaches a system for face location detection and tracking. The system is particularly designed for video data that includes head-and-shoulder sequences of people, which are modeled as elliptical regions of interest. The system presumes that the outlines of people's heads are generally elliptical and have high temporal correlation from frame to frame.
`
`2
`
`10
`
`15
`
`2O
`
`25
`
`'30
`
`35
`
`SAMSUNGEXHBHWOO?
`
`Page 4 of 25
`
`SAMSUNG EXHIBIT 1007
`Page 4 of 25
`
`
`
`WO 99/35606
`
`PCT/JP99/00010
`
`Based on this premise,
`
`the system calculates the
`
`difference between consecutive frames and thresholds the
`
`result to identify regions of significant movement, which
`
`are indicated as non-zero. Elliptical non-zero regions
`
`are located and identified as facial regions.
`
`Unfortunately,
`
`the system taught by Eleftheriadis et al.
`
`is computationally intensive and is not suitable for
`
`real-time applications. Moreover, shadows or partial
`
`occlusions of the person’s face results in non-zero
`
`regions that are not elliptical and therefore the system
`
`may fail to identify such regions as a face.
`
`In
`
`addition, if the orientation of the person’s face is away
`
`from the camera then the resulting outline of the
`
`person’s head will not be elliptical and therefore the
`
`system may fail to identify the person’s head.
`
`Also, if
`
`there is substantial movement within the background of
`
`the image the facial region may be obscured.
`
Hager et al., in a paper entitled "Real-Time Tracking of Image Regions with Changes in Geometry and Illumination," discloses a face tracking system that analyzes the brightness of an image within a window. The pattern of the brightness within the window is used to track the face between frames. The system taught by Hager et al. is sensitive to face orientation changes and to partial occlusions and shadows which obscure the pattern of the image. The system is also incapable of initially determining the position of the face(s).
`
What is desired, therefore, is a face tracking system that is insensitive to partial occlusions and shadows, insensitive to face orientation and/or scale changes, insensitive to changes in lighting conditions, easy to calibrate, and able to determine the initial position of the face(s). In addition, the system should be computationally simple so that it is suitable for real-time applications.
`
`10
`
`15
`
`20
`
`25
`
`3O
`
`35
`
`SAMSUNGEXHBHWOO?
`
`Page 5 of 25
`
`SAMSUNG EXHIBIT 1007
`Page 5 of 25
`
`
`
`
`SUMMARY OF THE INVENTION
`
The present invention overcomes the aforementioned drawbacks of the prior art by providing a system for detecting a face within an image. The system receives the image, which includes a plurality of pixels, where a plurality of the pixels of the image is represented by respective groups of at least three values. The image is filtered by transforming a plurality of the respective groups of the at least three values to respective groups of less than three values, where the respective groups of the less than three values have less dependency on brightness than the respective groups of the at least three values. Regions of the image representative of skin-tones are determined based on the filtering. A first distribution of the regions of the image representative of the skin-tones in a first direction is calculated. A second distribution of the regions of the image representative of the skin-tones in a second direction is calculated, where the first direction and the second direction are different. The face within the image is located based on the first distribution and the second distribution.
`
`25
`
`30
`
`35
`
Using a system that determines skin-tone regions based on a color representation with reduced brightness dependency, together with first and second distributions, permits the face tracking system to be insensitive to partial occlusions and shadows, insensitive to face orientation and/or scale changes, insensitive to changes in lighting conditions, and able to determine the initial position of the face(s). In addition, the decomposition of the image using first and second distributions allows the system to be computationally simple so that it is suitable for real-time applications.

In the preferred embodiment the estimated face location may also be used for tracking the face between frames of a video. For simplicity the face motion may be modeled as a piece-wise constant two-dimensional translation within the image plane. A linear Kalman filter may be used to predict and correct the estimation of the two-dimensional translation velocity vector. The estimated (filtered) velocity may then also be used to determine the tracked positions of faces.
`
The foregoing and other objectives, features, and advantages of the invention will be more readily understood upon consideration of the following detailed description of the invention, taken in conjunction with the accompanying drawings.
`
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

FIG. 1 is a block diagram of an exemplary embodiment of a face detection and tracking system of the present invention.

FIG. 2 is a graph of the distributions of the skin-colors of different people in chromatic color space, with the grey-scale reflecting the magnitude of the color concentration.

FIG. 3 is a circle centered generally within the center of the distribution shown in FIG. 2.

FIG. 4 is an image with a face.

FIG. 5 is a binary image of the face of FIG. 4.

FIG. 6 is a pair of histograms of the binary image of FIG. 5, together with medians and variances for each histogram.
`
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Referring to FIG. 1, a face detection and tracking system 6 includes an image acquisition device 8, such as a still camera or a video camera. A frame grabber 9 captures individual frames from the acquisition device 8 for face detection and tracking. An image processor 11 receives an image 10 from the frame grabber 9 with each pixel represented by a red value, a green value, and a blue value, generally referred to as an RGB image.
`
The image 10 may alternatively be represented by other color formats, such as, for example: cyan, magenta, and yellow; luminance, intensity, and chromaticity (generally referred to as the YIQ color model); hue, saturation, intensity; hue, lightness, saturation; and hue, value, chroma. However, the RGB format is not necessarily the preferred color representation for characterizing skin-color. In the RGB color space the three values [R, G, B] represent not only the color but also its brightness. For example, if the corresponding elements of two pixels, [R1, G1, B1] and [R2, G2, B2], are proportional (i.e., R1/R2 = G1/G2 = B1/B2), then they characterize the same color, albeit at different brightnesses. The human visual system adapts to different brightness and various illumination sources such that a perception of color constancy is maintained within a wide range of environmental lighting conditions. Therefore it is desirable to reduce the brightness information from the color representation, while preserving accurate low dimensional color information. Since brightness is not important for characterizing skin colors under normal lighting conditions, the image 10 is transformed by a transformation 12 (filter) to the chromatic color space. Chromatic colors (r, g), known as "pure" colors in the absence of brightness, are generally defined by a normalization process:

r = R/(R+G+B)
g = G/(R+G+B)

The effect of the transformation 12 is to map the three dimensional RGB image 10 to a two dimensional rg chromatic color space representation. The color blue is redundant after the normalization process because r+g+b=1. Any suitable transformation 12 may be used which results in a color space where the dependence on brightness is reduced, especially in relation to the RGB color space.
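The normalization above can be sketched in a few lines. The convention chosen here for a pure black pixel, whose chromaticity is undefined because R+G+B=0, is an assumption not addressed in the text:

```python
def rgb_to_chromatic(r, g, b):
    """Map an RGB pixel to chromatic (r, g) coordinates.

    The normalization removes most brightness information:
    proportional RGB triples map to the same (r, g) point.
    Black has no defined chromaticity; a neutral grey point is
    returned by convention (an assumption, not from the text).
    """
    total = r + g + b
    if total == 0:
        return (1.0 / 3.0, 1.0 / 3.0)  # assumed convention for black pixels
    return (r / total, g / total)

# Two pixels of the same color at different brightnesses map to the
# same chromatic coordinates:
assert rgb_to_chromatic(200, 100, 50) == rgb_to_chromatic(100, 50, 25)
```

Since r + g + b = 1, the b coordinate is dropped, giving the two dimensional representation described above.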
`
`
`
`
It has also been found that the distributions of the skin-colors of different people are clustered in chromatic color space, as shown in FIG. 2. The grey-scale in FIG. 2 reflects the magnitude of the color concentration. Although the skin colors of different people appear to vary over a wide range, they differ much less in color than in brightness. In other words, the skin-colors of different people are actually quite similar, while mainly differing in intensities.

The two primary purposes of the transformation 12 are (1) to facilitate distinguishing skin from other objects of an image, and (2) to detect skin tones irrespective of the particular color of the person's skin, which differs from person to person and differs for the same person under different lighting conditions. Accordingly, a suitable transformation 12 facilitates the ability to track the face(s) of an image equally well under different lighting conditions, even for people with different ethnic backgrounds.
`
Referring to FIG. 3, the present inventor determined that a straightforward characterization of the chromaticity distribution of the skin tones may be a circle 20 centered generally within the center of the distribution shown in FIG. 2. Alternatively, any suitable regular or irregular polygonal shape (including a circle) may be used, such as a square, a pentagon, a hexagon, etc. The use of a polygonal shape permits simple calibration of the system by adjusting the radius of the polygonal shape. The region encompassed by the polygonal shape therefore defines whether or not a particular pixel is a skin tone. In addition, it is computationally simple to determine whether or not a particular set of rg values is within the region defined by the polygonal shape. If the rg values are within the polygonal shape, otherwise referred to as the skin-tone region, then the corresponding pixel of the image 10 is considered to be a facial feature, or otherwise to have a skin tone.
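A membership test against a circular skin-tone region reduces to one squared-distance comparison per pixel. The center and radius below are assumed calibration values chosen for illustration; the text leaves them to calibration:

```python
# Assumed calibration values for the circular skin-tone region in
# chromatic (r, g) space; not specified in the text.
SKIN_CENTER = (0.45, 0.31)
SKIN_RADIUS = 0.07  # enlarging the radius admits more tones as "skin"

def is_skin_tone(r, g, b):
    """Return True if the pixel's chromatic coordinates fall inside
    the circular skin-tone region."""
    total = r + g + b
    if total == 0:
        return False  # black carries no chromatic information
    dr = r / total - SKIN_CENTER[0]
    dg = g / total - SKIN_CENTER[1]
    # Compare squared distances to avoid a square root per pixel.
    return dr * dr + dg * dg <= SKIN_RADIUS * SKIN_RADIUS

def binary_image(rgb_rows):
    """Map an RGB image (rows of (R, G, B) tuples) to a binary image:
    1 marks a pixel inside the assumed skin-tone region, 0 otherwise."""
    return [[1 if is_skin_tone(*p) else 0 for p in row] for row in rgb_rows]
```

Calibration then amounts to adjusting `SKIN_RADIUS` (and, for other polygonal shapes, substituting the corresponding point-in-polygon test).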
`
Based on whether each pixel of the image 10 is within the skin tone region, the system generates a binary image 14 corresponding to the image 10. The binary image 14 has a value of 1 for each pixel of the image 10 that is identified as a skin tone. In contrast, the binary image 14 has a value of 0 for each pixel of the image that is not identified as a skin tone. It is to be understood that groups of pixels may likewise be compared on a group by group basis, instead of a pixel by pixel basis, if desired. The result is a binary image 14 that contains primarily 1's in those portions of the image 10 that contain skin tones, such as the face, and primarily 0's in the remaining portions of the image. It is noted that some portions of non-facial regions will have skin tone colors and therefore the binary image 14 will include a few 1's at non-face locations. The opposite is also true: facial regions may include pixels that are indicative of non-skin tones and will therefore be indicated by 0's. Such regions may include beards, moustaches, and hair. For example, the image 10 as shown in FIG. 4 may be mapped to the binary image 14 as shown in FIG. 5.
`
Alternatively, the representation of the 0's and 1's may be reversed, if desired. Moreover, any other suitable representation may be used to distinguish those portions that define skin-tones from those portions that do not define skin tones. Likewise, the transformation 12 may instead produce weighted values that are indicative of the likelihood that a pixel (or region of pixels) is indicative of a skin tone.
`
`10
`
`15
`
`20
`
`25
`
`30
`
As shown in FIG. 5, the facial region of the image is generally indicated by the primary grouping of 1's. The additional 1's scattered throughout the binary image 14 do not indicate a facial feature and are generally referred to as noise. In addition, the facial region also includes some 0's, likewise generally referred to as noise.
`
The present inventor came to the realization that the two dimensional binary image 14 of skin tones may further be decomposed into a pair of one dimensional models using a face locator 16. The reduction of the two dimensional representation to a pair of one dimensional representations reduces the computational requirements necessary to calculate the location of the face. Referring to FIG. 6, the mean of the distribution of the 1's (skin-tones) is calculated in both the x and y directions. The distribution is a histogram of the number of 1's in each direction. The mean may be calculated by μ = (1/N)Σx_i. The approximate central location 38 of the face is determined by projecting the x-mean 30 and the y-mean 32 onto the binary image 14. The variance of the distribution in each of the x and y directions is also calculated. The variance may be calculated by σ² = (1/N)Σ(x_i - μ)². The variances 34a-34d indicate the width of the facial feature in the respective directions. Projecting the variances 34a-34d onto the binary image 14 defines a rectangle around the facial region. The mean and variance are generally insensitive to variations for random distributions of noise. In other words, the mean and variance are robust in that such additional 1's and 0's are not statistically important. Under different lighting conditions for the same person, and for different persons, the mean and variance technique defines the facial region. Moreover, the mean and variance are techniques merely requiring the summation of values, which is computationally efficient.
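The decomposition into x and y histograms and the mean/variance calculations can be sketched as follows. The return convention (center, then per-axis spread as standard deviations) is an assumption chosen for illustration:

```python
def locate_face(binary):
    """Locate the face in a binary skin-tone image by decomposing it
    into x and y distributions of the 1's.

    Returns ((x_mean, y_mean), (x_spread, y_spread)), where the means
    give the approximate central location of the face and the spreads
    (standard deviations) indicate its extent in each direction.
    Returns None if no skin-tone pixels are present.
    """
    xs = [x for row in binary for x, v in enumerate(row) if v == 1]
    ys = [y for y, row in enumerate(binary) for v in row if v == 1]
    n = len(xs)
    if n == 0:
        return None
    mx = sum(xs) / n                             # mu = (1/N) * sum(x_i)
    my = sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs) / n      # sigma^2 = (1/N) * sum((x_i - mu)^2)
    vy = sum((y - my) ** 2 for y in ys) / n
    return (mx, my), (vx ** 0.5, vy ** 0.5)
```

Both statistics are plain summations over the histogram samples, which is why the technique remains inexpensive even at video rates.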
`
The system may alternatively use other suitable statistical techniques on the binary image 14 in the x and y directions to determine a location indicative of the central portion of the facial feature and/or its size, if desired. Also, a more complex calculation may be employed if the data has weighted values. The system may also decompose the two-dimensional binary image into directions other than x and y.
`
The face locator and tracker 16 provides the general location of the center of the face and its size. The output of the image processor 11 provides data to a communication module 40 which may transmit or display the image in any suitable format. The face tracking system 6 may enhance the bit rate for the portion of the image containing the face, as suggested by Eleftheriadis.

The estimated face location may also be used for tracking the face between frames of a video. For simplicity the face motion may be modeled as a piece-wise constant two-dimensional translation within the image plane. A linear Kalman filter may be used to predict and correct the estimation of the two-dimensional translation velocity vector. The estimated (filtered) velocity may then also be used to determine the tracked positions of faces.
`
The preferred system model for tracking the motion is:

x(k+1) = F(k)x(k) + w(k)
z(k+1) = H(k+1)x(k+1) + v(k+1)

where x(k) is the true velocity vector to be estimated, z(k) is the observed instantaneous velocity vector, w(k), v(k) are white noise, and F(k) ≡ I, H(k) ≡ I for piece-wise constant motion. The Kalman predictor is:

x̂(k+1|k) = F(k)x̂(k|k),  x̂(0|0) = 0
ẑ(k+1|k) = H(k+1)x̂(k+1|k)

The Kalman corrector is:

x̂(k+1|k+1) = x̂(k+1|k) + K(k+1)Δz(k+1|k)
Δz(k+1|k) = z(k+1) - ẑ(k+1|k)

where K(k+1) is the Kalman gain. The Kalman gain is computed as:

K(k+1) = P(k+1|k)H^T(k+1)[H(k+1)P(k+1|k)H^T(k+1) + R(k+1)]^-1
`
`10
`
`15
`
`20
`
`25
`
`3O
`
`35
`
`10
`
`SAMSUNGEXHBHWOO?
`
`Page 12 of 25
`
`SAMSUNG EXHIBIT 1007
`Page 12 of 25
`
`
`
The covariances are computed as:

P(k+1|k) = F(k)P(k|k)F^T(k) + Q(k),  P(0|0) = P0
P(k+1|k+1) = [I - K(k+1)H(k+1)]P(k+1|k)

where Q(k) = E[w(k)w^T(k)], R(k) = E[v(k)v^T(k)], and P0 = E[x(0)x^T(0)].
In the presence of lighting fluctuation and image noise, the tracked face image may jitter. A nonlinear filtering module may therefore be included in the tracking system to remove the undesirable jittering. A simple implementation of the nonlinear filtering module is to cancel any movement of the tracked face which is smaller in magnitude than a prescribed threshold and shorter in duration than another prescribed threshold.
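One way to realize that nonlinear rule is to hold the reported position fixed until a movement either exceeds the magnitude threshold or persists beyond the duration threshold. The specific threshold values below are assumptions for illustration:

```python
class JitterFilter:
    """Suppress tracked-face movements that are both smaller than a
    magnitude threshold and shorter than a duration threshold.
    Threshold values are assumed, not taken from the text."""

    def __init__(self, min_move=3.0, min_frames=2):
        self.min_move = min_move      # pixels; assumed magnitude threshold
        self.min_frames = min_frames  # frames; assumed duration threshold
        self.stable = None            # last reported (accepted) position
        self.pending = 0              # consecutive frames of small motion

    def update(self, pos):
        """pos is the raw tracked (x, y); returns the de-jittered (x, y)."""
        if self.stable is None:
            self.stable = pos
            return pos
        dx = pos[0] - self.stable[0]
        dy = pos[1] - self.stable[1]
        if (dx * dx + dy * dy) ** 0.5 < self.min_move:
            # Small movement: suppress unless it has persisted long enough.
            self.pending += 1
            if self.pending < self.min_frames:
                return self.stable
        # Large movement, or a small one that persisted: accept it.
        self.pending = 0
        self.stable = pos
        return pos
```

A one-frame twitch of a pixel or two is thus cancelled, while genuine slow drifts still pass once they persist.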
`
A particular application suitable for the face detection and tracking system described herein involves a video phone. Other suitable devices may likewise be used. An image of the background without a person present is obtained by the system. Thereafter, images are obtained in the presence of the person. Each image obtained is compared against the background image to distinguish the foreground portion of the image from the background image previously obtained. The foreground, which is presumably the person, is transmitted to and overlayed on a nice background image displayed on the recipient's video phone in a frame-by-frame manner. The location of the face is determined by the face tracking system to smooth out the movement of the person and remove jitter. Alternatively, the nice background image may be transmitted to the recipient's video phone, preferably only once per session. This provides the benefit of disguising the actual background environment and potentially reducing the bandwidth requirements.

The system may be expanded, using the same teachings, to locate and track multiple faces within an image.
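The foreground/background separation and overlay described above can be sketched as a per-pixel comparison against the stored background; the per-channel difference threshold is an assumed parameter, since the text does not specify how the comparison is made:

```python
def replace_background(frame, background, nice_background, threshold=30):
    """Composite the foreground (presumably the person) over a 'nice'
    background, per the video-phone application described above.

    A pixel is treated as foreground when it differs from the stored
    background image by more than an assumed per-channel threshold.
    All three images are rows of (R, G, B) tuples of equal size.
    """
    out = []
    for frow, brow, nrow in zip(frame, background, nice_background):
        orow = []
        for f, b, n in zip(frow, brow, nrow):
            moved = any(abs(fc - bc) > threshold for fc, bc in zip(f, b))
            orow.append(f if moved else n)  # keep the person, swap the rest
        out.append(orow)
    return out
```

Transmitting only the foreground pixels each frame, with the nice background sent once per session, is what yields the bandwidth saving noted above.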
`
`11
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`SAMSUNGEXHBHWOO?
`
`Page130f25
`
`SAMSUNG EXHIBIT 1007
`Page 13 of 25
`
`
`
The terms and expressions which have been employed in the foregoing specification are used therein as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding equivalents of the features shown and described or portions thereof, it being recognized that the scope of the invention is defined and limited only by the claims which follow.
`
`12
`
`SAMSUNGEXHBHWOO?
`
`Page 14 of 25
`
`SAMSUNG EXHIBIT 1007
`Page 14 of 25
`
`
`
CLAIMS

1. A method of detecting a face within an image comprising the steps of:

(a) receiving said image including a plurality of pixels, where a plurality of said pixels of said image is represented by respective groups of at least three values;

(b) filtering said image by transforming a plurality of said respective groups of said at least three values to respective groups of less than three values, where said respective groups of said less than three values has less dependency on brightness than said respective groups of said at least three values;

(c) determining regions of said image representative of skin-tones based on said filtering of step (b);

(d) calculating a first distribution of said regions of said image representative of said skin-tones in a first direction;

(e) calculating a second distribution of said regions of said image representative of said skin-tones in a second direction, where said first direction and said second direction are different; and

(f) locating said face within said image based on said first distribution and said second distribution.
`
2. The method of claim 1 where said image is from a video containing multiple images.

3. The method of claim 1 where said image includes a human face.
`
`13
`
`SAMSUNGEXHBHWOO?
`
`Page 15 of 25
`
`SAMSUNG EXHIBIT 1007
`Page 15 of 25
`
`
`
4. The method of claim 1 where said at least three values includes a red value, a green value, and a blue value.

5. The method of claim 4 where said respective groups of less than three values includes an r value defined by said red value divided by the summation of said red value, said green value, and said blue value, and a g value defined by said green value divided by the summation of said red value, said green value, and said blue value.

6. The method of claim 1 wherein at least one of said regions is an individual pixel of said image.

7. The method of claim 1 wherein said determining of step (c) is based on a polygonal shape.

8. The method of claim 1 wherein said determining of step (c) is based on a circle.

9. The method of claim 1 wherein at least one of said first distribution and said second distribution is a histogram.

10. The method of claim 1 wherein said first distribution is in an x-direction.

11. The method of claim 10 wherein said second distribution is in a y-direction.

12. The method of claim 11 wherein said first distribution and said second distribution are in orthogonal directions.
`
`10
`
`15
`
`20
`
`25
`
`30
`
`14
`
`SAMSUNGEXHBHWOO?
`
`Page 16 of 25
`
`SAMSUNG EXHIBIT 1007
`Page 16 of 25
`
`
`
13. The method of claim 1 wherein said first distribution and said second distribution are independent of each other.

14. The method of claim 1 further comprising the steps of:

(a) calculating a first generally central location of said first distribution;

(b) calculating a first generally central location of said second distribution; and

(c) locating said face based on said first generally central location of said first distribution and said first generally central location of said second distribution.

15. The method of claim 14 wherein at least one of said first generally central location of said first distribution and said first generally central location of said second distribution is a mean.

16. The method of claim 14 wherein the size of said face is based on the variance of said first distribution and the variance of said second distribution.

17. The method of claim 1 wherein said face is tracked between subsequent frames.

18. The method of claim 17 wherein jitter movement of said face is reduced between said subsequent frames.
`
`15
`
`SAMSUNGEXHBHWOO?
`
`Page17of25
`
`SAMSUNG EXHIBIT 1007
`Page 17 of 25
`
`
`
FIG. 1

[Block diagram: image acquisition device 8 and frame grabber 9 feed the transformation 12; the resulting binary image 14 is passed to the face locator and tracker 16 and then to the communication module 40 — drawing not reproducible in text]

1/5
`
`
`
`
FIG. 3

[Drawing: circle 20 centered generally within the skin-color distribution of FIG. 2 — image not reproducible in text]

2/5
`
`
`
[Half-tone drawing sheet — image not reproducible in text]

3/5
`
`
`
FIG. 5

[Binary image 14 of the face of FIG. 4: 1's mark skin-tone pixels, 0's the remainder — bitmap not reproducible in text]

4/5
`
`
`
FIG. 6

[Binary image with x and y histograms; reference numerals 32, 34c, 34d, 36, and 38 — bitmap not reproducible in text]

5/5
`
`
`
`
`
`
`
INTERNATIONAL SEARCH REPORT

International Application No: PCT/JP 99/00010

A. CLASSIFICATION OF SUBJECT MATTER
IPC 6: G06K9/00

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols): IPC 6: G06K

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Citation of document, with indication, where appropriate, of the relevant passages | Relevant to claim No.

EP 0 654 749 A (HITACHI EUROP LTD), 24 May 1995; see column 8, line 9 - column 9, line 1; column 10, line 15 - line 23; figures | 10-20

WONG C ET AL: "A MOBILE ROBOT THAT RECOGNIZES PEOPLE", PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, HERNDON, VA., NOV. 5-8, 1995, 5 November 1995, pages 346-353, XP000598377, INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS; see paragraph 2.1 - paragraph 2.2 | 9

Further documents are listed in the continuation of box C. Patent family members are listed in annex.