`Hunke
`
`111111
`
`1111111111111111111111111111111111111111111111111111111111111
`US005912980A
[11] Patent Number: 5,912,980
[45] Date of Patent: Jun. 15, 1999
`
`[54] TARGET ACQUISITION AND TRACKING
`
`[76]
`
`Inventor: H. Martin Hunke, 147 Belvedere St.,
`San Francisco, Calif. 94117
`
`[21] Appl. No.: 08/501,944
`
[22] Filed: Jul. 13, 1995
`
[51] Int. Cl.6 ........................ G06K 9/62; G06T 7/20
[52] U.S. Cl. .................... 382/103; 382/165; 348/169
[58] Field of Search ............. 382/103, 164, 165, 169, 236; 348/169, 170, 171, 172, 416
`
`[56]
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`Rander "Real-Time Image-Based Face Tracking." CMU,
`Pittsburgh, PA. pp. 1-51 1993.
`
`Uras et al. "A Computational Approach to Motion Percep(cid:173)
`tion." Biological Cybernetics , vol. 60, pp. 79-87 1988.
`
`Govindaraju et al. "Caption-Aided Face Location in News(cid:173)
`paper Photographs." Proc. IEEE-CS Conf. Computer Vision
`and Pattern Recognition 1989.
`
`Schuster. "Color Object Tracking with Adaptive Modeling."
`Proc. of the Workshop on Visual Behaviors. pp. 91-96, Jun.
`19, 1994.
`
`Duchnowski et al. "Toward Movement-Invariant Automatic
`Lip-Reading and Speech Recognition." 1995 Int. Conf. on
`Acoustics, Speech and Signal Processing, pp. 109-112, vol.
`1, May 12, 1995.
`
`5,280,530
`5,323,470
`5,430,809
`5,473,369
`
`1!1994 Trew eta!. ............................. 382/103
`6/1994 Kara et a!. .............................. 382/103
`7/1995 Tomitaka ................................ 348/169
`12/1995 Abe . ... ... ... ... ... .... ... ... ... ... ... .... .. 348/169
`
`Primary Examiner-Jon Chang
`Attorney, Agent, or Firm---Rosenfeld & Associates
`
`[57]
`
`ABSTRACT
`
`OTHER PUBLICATIONS
`
`Kay et al. "A Versatile Colour System Capable of Fruit
`Sorting and Accurate Object Classification" Proc. 1992
`South African Symp. on Comm. and Signal Proc., pp.
`145-148 Sep. 1992.
`Etoh et al. "Segmentation and 20 Motion Estimation by
`Region Fragments." Proc. Fourth Int. Conf. Computer
`Vision. pp. 192-199 May 1993.
`Dubuisson et al. "Object Contour Extraction Using Color
`and Motion." Proc. 1993 IEEE Computer Society Conf.
`Computer Vision and Pattern Recognition, pp. 471-476 Jun.
`1993.
`
`A method for automatically locating a predetermined target
`class of objects in a video image stream, comprising the
`steps of determining typical colors found in objects of the
`predetermined target class, detecting a moving area in the
`video image stream, determining the colors in the moving
`area, and determining whether the moving area contains
`colors similar to the predetermined target class typical
colors. Additionally, the method allows tracking of such
`located target objects in subsequent frames of the video
`image stream based upon the colors in the target objects.
`
`13 Claims, 5 Drawing Sheets
`
[Representative drawing: virtual camera 42, target update motion analysis 44, individual target color classifier 46, object analysis 48]
`
`
`
`
[Drawing Sheet 1 of 5 - FIG. 1: system configuration - camera, target acquisition module, target tracking modules 1 through N, camera control]
[Drawing Sheet 2 of 5 - FIG. 2: target acquisition - disable regions covered by existing virtual cameras (22), target acquisition motion analysis (24), general target color classifier (26), object analysis (28), create target tracking module and initialize virtual camera and individual target color classifier (30)]
[Drawing Sheet 3 of 5 - FIG. 3: target tracking - virtual camera (42), target update motion analysis (44), individual target color classifier (46), object analysis (48), no-motion bypass (50)]
[Drawing Sheet 4 of 5 - FIG. 4: color distribution used by a general target color classifier (62, 64, 66); FIG. 5: color distribution used by an individual target color classifier (72, 74, 76)]
[Drawing Sheet 5 of 5 - FIG. 6: individual target color classifier detecting skin-like colors; FIG. 7: artificial neural network used for object analysis]
`
`
`
`1
`TARGET ACQUISITION AND TRACKING
`
`BACKGROUND
`
`5,912,980
`
`1. Field of Invention
`This invention relates to the area of image processing in
`general and particularly to the processing of images by
`methods designed to detect, locate, and track distinctive
`target objects in the image.
`2. Discussion of Prior Art
Systems for detecting, localizing, and tracking distinctive targets are used for unsupervised observation applications, videoconferencing, human-computer interaction, and other applications. In general, these systems use a video camera to capture a two-dimensional image stream, which is analyzed by a computer system. The methods for analyzing the image stream must solve the following problems:
Detecting and identifying targets: The system must provide a method to detect and identify targets in the camera image. Generally a target class is defined, describing all objects that are considered as targets. For example, defining the target class as human faces would restrict detecting, localizing, and tracking of objects to human faces. The initial number of targets in the camera image might not be known, and new targets might appear or existing targets disappear in successive camera images. The problem of detecting and localizing targets becomes more difficult if the size, orientation, and exact appearance of the targets are not known, for example if a plurality of arbitrary human faces are to be detected in the camera image.
Localizing targets: The system must be capable of localizing targets by determining their position and size in the camera image.
Tracking targets: The position of each detected and localized target must be tracked in successive images, even though this target might be moving, changing its orientation, or changing its size by changing the distance to the camera. The system should continue to track targets robustly even if lighting conditions change or the tracked target is partially covered by other objects.
Several techniques of the prior art have been developed in an attempt to address these problems:
Template matching: One or more pre-stored images of objects of the target class are used as templates to localize and track targets in the video stream. To locate a target, the templates are shifted over the camera image to minimize the difference between the templates and the corresponding region of the camera image. If the difference can be made small for one template, the camera image contains the target represented by this template. To track the target, this template is shifted over the region of the subsequent camera image where the target's position is assumed.
Model matching: A model for the target class is created, containing information about edges, proportions between edges, and other structural information about objects. Targets are located by extracting these features in the camera image and matching them to the target class model. Tracking of targets can be performed with the same method, but the high computational costs of this approach suggest other techniques, like template matching, for further tracking of targets.
`
`5
`
`10
`
`2
In general, these techniques suffer from several well-known problems:
1. Template matching:
(a) In many applications pre-stored templates cannot cover the variety of objects in the target class. For example, the number of templates required to cover all human faces in all sizes, orientations, etc. would be much higher than a real-time tracking system can manage.
(b) If the pre-stored templates do not cover all objects of the target class, manual operator intervention is required to point out target objects for further tracking.
(c) Partial occlusions of a tracked target object result in substantial differences between the image of the tracked object and the stored template, so that the system loses track of the target.
2. Model matching:
(a) The model for the target class can be very complex depending on the geometrical structure of the objects of this class, resulting in high computational costs to match this model against the camera image.
(b) To extract geometrical structures of the camera image, this image must have a sufficient resolution (for example, in order to locate human faces, the eyes, nose, and mouth must be detectable as important geometrical substructures of a human face), requiring a high amount of data to process.
A fundamental problem of the technique of template matching becomes obvious when locating arbitrary objects of the target class for further tracking using templates. The templates must cover all possible appearances, orientations, and sizes of all objects of the target class in order to locate them. Because this requirement cannot be met in the case of eyes and lips of human faces as the target class, P. W. Rander (Real-Time Image-Based Face Tracking, Carnegie Mellon University, 1993, Pittsburgh, Pa.) requires a user to manually point out these target objects in a camera image in order to generate templates of these objects. These templates are then tracked in subsequent images. U.S. Pat. No. 5,323,470, A. Kara, K. Kawamura, Method and Apparatus for Automatically Tracking an Object, uses template matching to automatically track the face of a person who is being fed by a robotics system, requiring a pre-stored image of the person's face but no manual user input. If the distance of the person to the camera is not known, the template and the camera image will not match each other. It is therefore suggested to use a stereo-based vision subsystem to measure this distance, requiring a second camera. Because of the requirement of the pre-stored image of the target object, this system is unable to locate arbitrary faces. Another severe problem of this technique is its incapability to adjust to a changing appearance of the tracked object. In order to solve this problem, U.S. Pat. No. 5,280,530, T. I. P. Trew, G. C. Seeling, Method and Apparatus for Tracking a Moving Object, updates the template of the tracked object by tracking sub-templates of this template, determining displacements of the positions of each of the sub-templates, and using these displacements to produce an updated template. The updated template allows tracking of the object, though the orientation and appearance of this object might change. This method still requires an initial template of the object to be tracked and is incapable of locating arbitrary objects of a target class, such as human faces.
Model matching is successfully used to locate faces in newspaper articles (V. Govindaraju, D. B. Sher, and S. N. Srihari, Locating Human Faces in Newspaper Photographs,
`
`7
`
`
`
`5,912,980
`
`3
Proc. of IEEE-CS Conf. Computer Vision and Pattern Recognition, 1989, San Diego, Calif.). After detecting edges in the image, a structural model is matched against the located features. There are several significant disadvantages and problems with using the technique of matching structural models for localizing and tracking objects. The model for the target class must describe all possible appearances of targets. If targets appear very differently depending on orientation, the model becomes very complex and does not allow efficient real-time tracking of targets. The process of model matching itself requires a sufficient resolution of the tracked target in the camera image to allow edge detection and feature matching, resulting in a considerable amount of data to process.
The present invention provides a novel image processing and target tracking system, based on a new scheme of dynamic color classification, that overcomes these and other problems of the prior art.
`Objects and Advantages
This invention differs fundamentally from conventional image tracking methods of the prior art in that patterns of color and motion are used as the basis for determining the identity of a plurality of individual moving targets and tracking the position of these targets continuously as a function of time. Specifically, this invention improves on prior art methods in a number of important aspects:
The system can acquire and track targets automatically. Templates or pre-stored images of objects of the desired target class are not necessary.
The system can acquire and track targets in an unsupervised manner. Human intervention is not required to manually select targets.
Size, orientation, and exact appearance of targets need not be known in order to detect and locate targets.
The system acquires and tracks multiple targets simultaneously.
The described tracking system is capable of rapid adjustments to changing lighting conditions and appearance of the tracked target, such as changes in orientation.
The computational costs for the methods described in this invention are substantially smaller than those of the prior art, resulting in significantly faster real-time tracking systems.
The system is very resistant to partial occlusions of a tracked target.
`
`5
`
`4
`that actually occur in the target, so that this individual
`target color classifier classifies all colors typical for the
`individual target as individual target colors;
`tracking the position of each target using the individual
`target color classifier and the target's motion in a search
`region restricted to an estimated position of the target;
`constantly adjusting the individual target color classifier
`to changing appearance of the target, due to changing
`lighting conditions or motion and orientation changes
`of the target.
`determining position and size of all tracked targets to
`adjust position of the camera and zoom lens.
`The herein described invention can be used for all appli(cid:173)
`cations requiring locating and tracking target objects. The
`output of the herein described system comprises of control
`signals to adjust position of the camera and zoom lens, the
`position of tracked targets, and images of tracked targets.
`The images of the tracked targets can be used to generate
`stable image streams of the targets independent of target
`movement. With human faces as target objects, the system
`can be used in applications such as videophoning,
`videoconferencing, observation, etc. The stable image
`stream of a tracked face can be used for human-computer
`interaction by extracting lip and eye movement, etc., for
`communication purposes. Lip movement can be used, for
`example, as an adjunct input to computer systems for speech
`recognition and understanding. Eye position can be used, for
`example, in remote operator eye control systems. Currently
`30 developed systems for extracting such information require a
`speaker to be at pre-defined position and size within the
`camera image. Furthermore operator intervention is often
`required to acquire target objects, like lips and eyes. The
`herein described invention automatically provides a stable
`35 image stream of the acquired and tracked target, containing
`the target at pre-defined size and position independent of
`target motion. In case of human faces as tracked targets,
`stable image streams for each located and tracked face are
`generated, containing a face in pre-defined size and position
`40 independent of speaker movements. System for lip reading
`can use this stable image stream as input containing a
`seemengly nonl-moving speaker.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 shows the system configuration of the target
`acquisition and tracking system according to the present
`invention.
`FIG. 2 shows an embodiment of a target acquisition
`module.
`FIG. 3 shows an embodiment of a target tracking module.
`FIG. 4 shows an example of a color distribution used by
`a general target color classifier.
`FIG. 5 shows an example of a color distribution used by
`55 an individual target color classifier.
`FIG. 6 illustrates the application of an individual target
`color classifier, detecting skin-like colors as individual target
`colors.
`FIG. 7 shows an artificial neural network used for object
`60 analysis.
`
`The system can be implemented using conventional
`hardware, such as a common videocamera and work(cid:173)
`station or PC-type computer.
`Further objects and advantages of this invention will 50
`become apparent from a consideration of the drawings and
`ensuing description.
`
`SUMMARY OF THE INVENTION
`
`This invention provides a novel method, and associated
`apparatus, for acquiring and tracking images of objects in
`real time in virtually arbitrary environments. The method
`comprises the steps of
`creating a general target color classifier, classifying all
`colors typical for objects of a target class (such as
`human faces, etc.) as general target class colors;
`detecting target objects of the target class and locating
`their position in the image using the general target color
`classifier and the object's motion;
`creating an individual target color classifier for each such
`detected and located target by determining the colors
`
`DETAILED DESCRIPTION OF PREFERRED
`EMBODIMENT OF THE INVENTION
`FIG. 1 shows a system configuration according to the
`65 invention. A video camera 2, such as a Sony TR-101 CCD
`camera, captures images of a scene at a resolution of 150 by
`100 pixels and a frame rate of 10 to 50 frames per second.
`
`8
`
`
`
`5,912,980
`
`5
Each image frame from the camera is digitized by a framegrabber, such as a Media Magic Multimedia Card from Media Magic Inc., in real time under control of a computer system 10, such as an HP series 700 model 735. Each digitized image is read into the internal memory of the computer system. The sequence of digitized video frames is hereafter termed the image stream. The video camera is equipped with a zoom lens and is mounted on a pan-tilt unit, which is under control of the computer system. The method is not limited to the frame rates and resolution described here. Instead of using a video stream captured by a video camera, recorded sequences such as from television or a VCR might be used as well.
`A target acquisition module 4 detects, identifies, and
`locates new objects in the image stream. If a new object
`belongs to the target class, it is considered as a new target
`and the target acquisition module creates a target tracking
`module 6 for it. Each target tracking module tracks the
`position and size of a single target. Given that target objects
`move slowly relative to the frame rate of the video, the
`positions of a given target object in successive video frames
`will be substantially proximate. Therefore, once a target
`tracking module has been established for a given target, this
module needs only to examine that portion of the image stream that corresponds to the target's area and its immediate vicinity. The portion of the image that corresponds to
`each target and serves as input to the appropriate target
`tracking module is termed a virtual camera. Each virtual
`camera is a smaller rectangular portion of the video image
`that contains one target. Each virtual camera is translated
`and zoomed electronically by the associated target tracking
`module to maintain the given target within the center of the
`virtual camera's margins. In FIG. 1 the virtual cameras are
`indicated as boxes surrounding the tracked targets in the
`target tracking module 6.
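Conceptually, each virtual camera is just a rectangle tracked in full-image coordinates, cropped out of each frame and recentered as the target moves. The following sketch illustrates this idea; the class name, the margin factor, and the method names are illustrative assumptions, not taken from the patent.

```python
class VirtualCamera:
    """A rectangular subwindow of the camera image that follows one target.

    Minimal sketch; the margin factor of 1.5 is an assumed value chosen so
    the target stays inside the window in the next frame.
    """

    def __init__(self, cx, cy, width, height, margin=1.5):
        self.cx, self.cy = int(cx), int(cy)   # target center, full-image coordinates
        self.w = int(width * margin)          # window size with a safety border
        self.h = int(height * margin)

    def crop(self, frame):
        """Return the subimage covered by this virtual camera (numpy-style array)."""
        fh, fw = frame.shape[:2]
        x0 = max(0, self.cx - self.w // 2)
        y0 = max(0, self.cy - self.h // 2)
        return frame[y0:min(fh, y0 + self.h), x0:min(fw, x0 + self.w)]

    def recenter(self, cx, cy, width, height):
        """Electronic 'pan and zoom': move and resize the window to the
        target's new position and size."""
        self.__init__(cx, cy, width, height)
```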
The camera control module 8 adjusts the position of the video camera and zoom lens to compensate for changes in the position and size of the tracked targets, for example by considering the following set of rules:
If the size of a tracked target exceeds a given threshold, the camera zooms out.
If a tracked target is near to one margin of the physical camera without another tracked target being near to the opposite margin, the camera is moved in the direction of this target.
If a tracked target is near to one margin of the physical camera and another tracked target is near to the opposite margin, the camera zooms out.
If the size of a tracked target falls short of a given minimum size, the camera zooms in.
If two of the above rules interfere with each other, the tracking module tracking the smaller target is removed from the system, and the set of rules is considered again for the remaining targets.
More specific rules or computation of absolute values for adjustments are not necessary, because the inertia of the physical adjustments is large compared to the frame rate, so that incremental adjustments can be conducted on the fly.
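As a rough sketch of how such incremental, rule-based control might be expressed, the function below emits relative commands rather than absolute adjustments. The threshold fractions, command strings, and target tuple layout are assumptions for illustration, only horizontal panning is shown, and the conflict-resolution rule is omitted.

```python
def camera_commands(targets, frame_w, frame_h,
                    max_frac=0.5, min_frac=0.05, margin_frac=0.1):
    """Derive incremental camera commands from tracked target boxes.

    targets: list of (cx, cy, w, h) boxes in pixels. Absolute adjustment
    values are unnecessary because camera inertia is large compared to
    the frame rate.
    """
    cmds = []
    for i, (cx, cy, w, h) in enumerate(targets):
        if w * h > max_frac * frame_w * frame_h:
            cmds.append("zoom_out")            # tracked target too large
        elif w * h < min_frac * frame_w * frame_h:
            cmds.append("zoom_in")             # tracked target too small
        near_left = cx - w / 2 < margin_frac * frame_w
        near_right = cx + w / 2 > (1 - margin_frac) * frame_w
        others = [t for j, t in enumerate(targets) if j != i]
        other_opposite = any(
            (near_left and t[0] + t[2] / 2 > (1 - margin_frac) * frame_w) or
            (near_right and t[0] - t[2] / 2 < margin_frac * frame_w)
            for t in others)
        if (near_left or near_right) and other_opposite:
            cmds.append("zoom_out")            # targets near opposite margins
        elif near_left:
            cmds.append("pan_left")            # move toward the target
        elif near_right:
            cmds.append("pan_right")
    return cmds
```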
If one of the target tracking modules 6 loses track of its target, that module is removed from the system. If a new target not already being tracked by one of the target tracking modules appears in the camera image 2, the target acquisition module 4 creates a new target tracking module 6 to track this new target.
Color and movement are the main features used to acquire and track targets. The main advantages of this approach are
`
`5
`
`6
the independence of these features from the orientation of tracked objects and the easy computation of these features even in low-resolution images, resulting in a significantly reduced amount of data to process compared to other methods for locating and tracking objects. The target acquisition module 4 detects and locates a new target by searching for moving objects in the camera image 2 having general target class colors. General target class colors are colors typical for the predetermined target class. The target class defines a set of objects that are to be located and tracked if they appear in the camera image. For example, if the system is used to track human faces, the target class includes arbitrary human faces, including all skin colors, orientations of faces, sizes, etc. The general target class colors are stored as color templates. These color templates can be determined as described in the following section about general target color classifiers or from an operator-selected area. The color templates are substantially different from templates used by the technique of template matching, because they are independent of the size, position, and orientation of objects. Since human faces come in a variety of skin types, this example makes clear that the target color templates can include multiple color distributions.
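One way to picture such a color template is as one or more two-dimensional histograms over brightness-normalized colors, so that membership of a pixel is independent of where and at what scale an object appears. The sketch below is an assumed histogram formulation, not the patent's specified data structure; the bin count and threshold are illustrative.

```python
import numpy as np

class GeneralTargetColorTemplate:
    """General target class colors as 2D histograms over normalized (r, g).

    Multiple histograms allow multiple color distributions, e.g. one per
    skin type. Bin count and threshold are assumed values.
    """

    def __init__(self, sample_images, bins=32, threshold=0.2):
        self.bins, self.threshold = bins, threshold
        self.hists = [self._normalized_hist(img) for img in sample_images]

    def _normalized_hist(self, img):
        rgb = img.reshape(-1, 3).astype(float)
        s = rgb.sum(axis=1) + 1e-6             # guard against black pixels
        r, g = rgb[:, 0] / s, rgb[:, 1] / s    # brightness-normalized colors
        hist, _, _ = np.histogram2d(r, g, bins=self.bins, range=[[0, 1], [0, 1]])
        return hist / hist.max()               # scale so the modal color is 1.0

    def is_general_target_color(self, pixel):
        """True if the pixel's normalized color is typical for any distribution."""
        s = float(sum(pixel)) + 1e-6
        i = min(int(pixel[0] / s * self.bins), self.bins - 1)
        j = min(int(pixel[1] / s * self.bins), self.bins - 1)
        return any(h[i, j] >= self.threshold for h in self.hists)
```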
The target tracking modules 6 track a target by searching in a subimage determined by the virtual camera for the specific colors occurring in the tracked object, termed individual target colors. For example, in the application of tracking human faces, the target tracking module searches for the specific skin color of the tracked individual. While movement is necessary to locate targets in the target acquisition module, the target tracking module does not require movement to track the position of a target. If the target tracking module does not detect motion, the position of the target is the same as in the previous image.
FIG. 2 shows the target acquisition module 4 in detail. Several methods may be used to transfer information between the steps within this module. In the herein described embodiment a blackboard method is used, allowing multiple modules to simultaneously read and write information on the blackboard. The blackboard is part of the internal memory of the workstation, containing information about which regions of the camera image 2 might contain targets. The target acquisition module 4 initially draws the entire camera image 2 on the target acquisition blackboard, indicating that targets might occur in any part of the camera image. Step 22 erases from the blackboard the regions covered by the virtual cameras 42 of the target tracking modules 6, so that regions containing already tracked targets are not considered in order to locate new targets. To locate a new target, the target acquisition module requires this target to be moving. A target acquisition motion analysis 24 detects motion and erases all regions not containing movement from the target acquisition blackboard, so that these regions are not considered for locating new targets. A general target color classifier (GTCC) classifies colors as typical for the target class or non-typical. In step 26 the GTCC erases from the target acquisition blackboard all regions not containing colors typical for the target class, so that these regions are not considered for locating new targets. The remaining regions on the target acquisition blackboard contain motion and general target class colors outside of the regions covered by the virtual cameras of the target tracking modules. In step 28 an object analysis eventually locates target objects in these remaining regions of the target acquisition blackboard. For each located new target a target tracking module 6 is created in step 30 to allow this target to be tracked through successive video frames. The position and size of the virtual camera 42 of the target tracking module are adjusted so that the located target is centered in the virtual camera with enough of a border that the object will remain within the virtual camera in the next video frame. The colors actually occurring in the tracked object are used to create an individual target color classifier (ITCC), which classifies only those colors typical of this specific target.
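Procedurally, the acquisition blackboard behaves like a boolean mask that each step can only erase from. A minimal sketch under that assumption follows; the helper callables and the mask representation are illustrative, not the patent's data structures.

```python
import numpy as np

def acquire_targets(frame, prev_frame, virtual_cameras,
                    motion_mask, gtcc_mask, object_analysis):
    """One pass of the target acquisition module (sketch of FIG. 2).

    virtual_cameras: list of (x0, y0, x1, y1) regions already being tracked.
    motion_mask, gtcc_mask: callables returning boolean masks of the frame.
    object_analysis: callable mapping a boolean mask to new target boxes.
    """
    h, w = frame.shape[:2]
    blackboard = np.ones((h, w), dtype=bool)      # targets may occur anywhere

    for x0, y0, x1, y1 in virtual_cameras:        # step 22: erase tracked regions
        blackboard[y0:y1, x0:x1] = False

    blackboard &= motion_mask(frame, prev_frame)  # step 24: must be moving
    blackboard &= gtcc_mask(frame)                # step 26: must have target class colors

    return object_analysis(blackboard)            # step 28: locate new targets
```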
The target tracking module 6 is described in detail in FIG. 3. Instead of reading the entire camera image, only the subimage determined by the virtual camera 42 is considered for tracking a target. To transfer information between the parts of the target tracking module, the previously described blackboard method is again used. The target tracking blackboard contains information about possible positions of the tracked target. Initially the entire virtual camera 42 is registered on the blackboard as possibly containing the target. In step 44 a target update motion analysis determines motion. If motion is detected, all regions not containing motion are erased from the target tracking blackboard, so that these regions are not considered for tracking the target. The ITCC 46 searches for colors typical for the specific tracked object. All regions not containing individual target colors are erased from the target tracking blackboard, so that only regions containing movement and individual target colors remain on the blackboard. Finally the object analysis 48 locates the tracked target in the remaining regions of the target tracking blackboard. If no motion is detected in step 44, the position and size of the target are the same as in the previous image and steps 46 and 48 are bypassed, as indicated in 50. The position and size of the tracked target update the position and size of the virtual camera 54, so that the target will be inside the margins of the virtual camera in the next video frame. The ITCC 46 is updated with regard to the colors occurring in the tracked object 52, allowing the system to automatically adjust to changing lighting conditions and appearance of the tracked target. If the ITCC 46 cannot locate individual target colors, or the object analysis 48 detects an unrealistic change in the size of the target, the target tracking module assumes the target has disappeared and is removed from the system.
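Under the same assumptions as the acquisition sketch, one update of a target tracking module might read as follows; the no-motion bypass, the classifier update, and the loss conditions mirror steps 44 through 54 of FIG. 3, while the helper objects and the size-change heuristic are illustrative.

```python
def track_target(vcam, frame, prev_frame, state,
                 motion_mask, itcc, object_analysis, max_growth=2.0):
    """One update of a target tracking module (sketch of FIG. 3).

    state: (cx, cy, w, h) from the previous frame. Returns the new state,
    or None if the target is assumed to have disappeared.
    """
    sub, prev_sub = vcam.crop(frame), vcam.crop(prev_frame)

    moving = motion_mask(sub, prev_sub)
    if not moving.any():                   # step 50: no motion detected, so the
        return state                       # target keeps its previous position

    blackboard = moving & itcc.mask(sub)   # steps 44/46: motion AND individual colors
    if not blackboard.any():
        return None                        # no individual target colors: target lost

    cx, cy, w, h = object_analysis(blackboard)          # step 48
    if w * h > max_growth * state[2] * state[3]:
        return None                        # unrealistic size change: target lost

    itcc.update(sub, (cx, cy, w, h))       # step 52: adapt to lighting/appearance
    vcam.recenter(cx, cy, w, h)            # step 54: keep target inside the window
    return (cx, cy, w, h)
```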
The following sections describe in detail the preferred embodiments of the target acquisition motion analysis 24, target update motion analysis 44, GTCC 26, ITCC 46, and object analysis 28 and 48, and indicate possible ramifications of these modules. The details of the presented embodiments should not be considered as limiting the scope of the invention, but show one of several possibilities to realize this invention. Other functionally equivalent embodiments will be apparent to those skilled in the art.
`A-Target Color Classification
The invention describes a method and apparatus to perform two main tasks: target acquisition and target tracking. In both tasks extracting color is the key factor for fast and reliable operation. The exact appearance of new targets in the camera image is unknown and unpredictable, hence the target acquisition module 4 must detect and identify candidate targets based on features that are independent of the size, location, and orientation of objects. Color is such a feature. Once a target is detected and located by the target acquisition module, the target tracking module 6 tracks its position in successive images. In most applications a target will have colors at least slightly different from the colors occurring in the background, so that knowledge of the distribution of colors in the target can be used advantageously to track the target's position. Confusion with similar colors in the background can be avoided by considering motion as an additional feature, as described in the next section.
`
`8
The images obtained from the video camera eventually are represented as a pixel matrix of RGB-values in the internal memory of the computer system. The main difficulty in using color as a feature is the dependency of these RGB-values on operational conditions, such as the characteristics of the video imaging system used, including the video camera and framegrabber, and the lighting conditions. A single object might have different RGB-values when images are obtained under different operational conditions, such as using different cameras, framegrabbers, or different lighting situations. In order to restrict these dependencies to one module of the system, the scheme of target color classification is introduced. A target color classifier functions as an interface between the RGB-values dependent on operational conditions and a representation independent of these factors, which is used to classify each color pixel as a target color or a non-target color. A color is classified as a target color if it is a color typical for objects of the target class. The GTCC classifies general target class colors, which are typical for objects of the target class, while the ITCC classifies individual target colors, which are typical for an individual target. The target color classifier binds all color dependencies at a central module of the system, so that other modules of the system, such as the object analysis 28, are independent of operational conditions and use only the abstract information of the target color classifier. For example, if artificial neural networks are used in the object analysis 28 to consider geometrical structures of objects, the weights of the network do not have to be retrained in order to use the system in different operational conditions.
Though the following description of a target color classifier specifies several details of the presently preferred embodiment, these details should not be considered as limitations of the system. Instead, the herein described target color classifier is one of several possibilities to abstract from RGB-values to target colors. The target color classifier described here uses distributions of normalized RGB-values to determine the most frequently occurring colors in an object, which are then classified as target colors.
The color of light reflected from an object depends on both the color of the object (expressed as percent reflectance for the given color) and the color composition of the source of light. The color of an object is therefore dependent on the composition of colors in the source of light and these percentages. The brightness of an object is proportional to the brightness of the source of light and is not a feature of the object itself. The first step of the target color classifier therefore consists of eliminating the brightness information by normalizing the three-dimensional RGB-values to two-dimensional brightness-normalized color values:
`
(r, g) = f(R, G, B) = (R/(R+G+B), G/(R+G+B))
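As a minimal sketch of this normalization applied to a whole image (the function name and vectorized form are assumptions; the blue coordinate is dropped because r + g + b = 1):

```python
import numpy as np

def normalize_colors(image):
    """Map an H x W x 3 RGB image to brightness-normalized (r, g) values,
    (r, g) = (R/(R+G+B), G/(R+G+B))."""
    rgb = image.astype(float)
    s = rgb.sum(axis=2, keepdims=True) + 1e-6  # avoid dividing by zero on black pixels
    return rgb[..., :2] / s                    # shape H x W x 2: the (r, g) plane
```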
`
The next step in creating a target color classifier is to use a normalized color distribution of a sample image to determine percentages of reflections of a target with a specific source of light. The color distrib