US005666157A

United States Patent [19]
Aviv

[11] Patent Number: 5,666,157
[45] Date of Patent: Sep. 9, 1997

[54] ABNORMALITY DETECTION AND SURVEILLANCE SYSTEM

[75] Inventor: David G. Aviv, New York, N.Y.

[73] Assignee: ARC Incorporated, New York, N.Y.

[21] Appl. No.: 367,712

[22] Filed: Jan. 3, 1995

[51] Int. Cl.⁶ ........ H04N 7/18
[52] U.S. Cl. ........ 348/152; 348/150; 348/154; 348/155; 348/161
[58] Field of Search ........ 348/143, 161, 150, 154, 155, 152; H04N 7/18

[56] References Cited

U.S. PATENT DOCUMENTS

4,337,482   6/1982   Coutta ................. 348/150
4,737,847   4/1988   Araki et al. ........... 348/161
5,091,780   2/1992   Pomerleau .............. 348/152
5,097,328   3/1992   Boyette ................ 348/150
5,283,644   2/1994   Maeno .................. 348/152
5,512,942   4/1996   Otsuki ................. 348/152
5,546,072   8/1996   Creuseremee et al. ..... 348/143

Primary Examiner: Howard W. Britton
Attorney, Agent, or Firm: Darby & Darby

[57] ABSTRACT

A surveillance system having at least one primary video camera for translating real images of a zone into electronic video signals at a first level of resolution. The system includes means for sampling movements of an individual or individuals located within the zone from the video signal output from at least one video camera. Video signals of sampled movements of the individual is electronically compared with known characteristics of movements which are indicative of individuals having a criminal intent. The level of criminal intent of the individual or individuals is then determined and an appropriate alarm signal is produced.

4 Claims, 5 Drawing Sheets

[Representative drawing: FIG. 1, schematic block diagram of the surveillance system.]

IPR2021-00923
Apple EX1007 Page 1
`
`
`
U.S. Patent    Sep. 9, 1997    Sheet 1 of 5    5,666,157

[FIG. 1: schematic block diagram. Labeled blocks: picture input means 10; picture processing means 12; comparison means 14; database means 16; post processor design logic 17; controller 18; high resolution picture input means 20; picture monitor 22; recorder 24; VCR 26; VCR controller 28. Option and alternate option outputs: to law enforcement, court and other legal facilities.]
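The looping and non-looping recording modes that the detailed description attributes to VCR 26 and VCR controller 28 can be illustrated in software. The following is a minimal Python sketch and not the patented apparatus: the class name `LoopRecorder`, its methods, and the choice to model the tape as an in-memory buffer measured in frames are all assumptions made for illustration.

```python
from collections import deque


class LoopRecorder:
    """Toy model of a loop tape that can be switched to non-looping mode.

    All names here are hypothetical; the patent describes hardware (VCR 26
    and VCR controller 28), not this class.
    """

    def __init__(self, loop_capacity):
        # While looping, the oldest frame is overwritten once the loop is full.
        self._loop = deque(maxlen=loop_capacity)
        # After looping stops, frames are kept here and never overwritten.
        self._retained = []
        self.looping = True

    def record(self, frame):
        """Store one frame according to the current recording mode."""
        if self.looping:
            self._loop.append(frame)
        else:
            self._retained.append(frame)

    def stop_looping(self):
        """Model the signal that changes recording to non-looping mode."""
        if self.looping:
            # Keep the events leading up to the alarm.
            self._retained = list(self._loop)
            self.looping = False

    def tape(self):
        """Return the frames currently held, oldest first."""
        return list(self._loop) if self.looping else list(self._retained)
```

In this sketch, calling `stop_looping()` plays the role of the signal sent from controller 18 to VCR controller 28: the frames recorded before the alarm are preserved, and later frames are appended rather than overwriting them.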
`
`
`
U.S. Patent    Sep. 9, 1997    Sheet 2 of 5    5,666,157

[FIGS. 2A to 2F: successive video frames showing objects (people) labeled A, B, C and D, with arrows indicating direction of movement.]
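The detailed description states that each frame is subtracted from a previously stored frame using any conventional differencing process, so that the difference signal shows what changed between frames such as those of FIGS. 2A to 2F. A minimal Python sketch of one such conventional process follows. The function names, the grey-level threshold, and the representation of a frame as a nested list of integers are assumptions, not details taken from the patent.

```python
def difference_mask(previous, current, threshold=10):
    """Mark pixels whose grey level changed by more than `threshold`.

    `previous` and `current` are equally sized lists of rows of integers.
    The threshold value is an illustrative assumption.
    """
    return [
        [1 if abs(cur - prev) > threshold else 0
         for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


def changed_region(mask):
    """Return (top, left, bottom, right) of the changed pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None  # nothing moved between the two frames
    return (min(rows), min(cols), max(rows), max(cols))
```

Small changes below the threshold (sensor noise, for example) are ignored, and the bounding box gives a crude location for the moved object or limb.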
`
`
`
U.S. Patent    Sep. 9, 1997    Sheet 3 of 5    5,666,157

[FIGS. 2G to 2I: later video frames of the same scene, with arrows indicating direction of movement.]
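The frames of FIG. 2 are not evenly spaced (K, K+5, K+10, K+11 and so on), and the description explains that the interval between analyzed frames may vary: perhaps every 10 frames in a low risk area, every 5 frames or every frame once an alert mode is active, and every frame in a high risk area. The sketch below, in Python, shows one way such a schedule could be expressed. The function names and the `escalated` flag are hypothetical, and the interval values are simply the examples given in the description.

```python
def sampling_interval(high_risk_area, alert_mode, escalated=False):
    """Return how many frames to advance before the next analyzed frame."""
    if high_risk_area or (alert_mode and escalated):
        return 1   # analyze every frame
    if alert_mode:
        return 5   # alert mode: perhaps every 5 frames
    return 10      # low risk area: perhaps every 10 frames


def sampled_frames(total_frames, alert_from=None, high_risk_area=False):
    """List the frame indices analyzed when alert mode starts at `alert_from`."""
    picked = []
    index = 0
    while index < total_frames:
        picked.append(index)
        alert = alert_from is not None and index >= alert_from
        index += sampling_interval(high_risk_area, alert)
    return picked
```

For example, `sampled_frames(31, alert_from=10)` analyzes frames 0 and 10 at the relaxed rate and then every fifth frame after the alert begins.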
`
`
`
U.S. Patent    Sep. 9, 1997    Sheet 4 of 5    5,666,157

[FIGS. 3A to 3D: successive video frames showing a "two on one" interaction of objects (people) labeled A, B and C.]
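The description of FIG. 3 says that a track yields the speed and direction of each motion, its acceleration, and the change in acceleration with respect to time ("jerkiness"). Assuming a track is available as object positions sampled at a fixed time step, these quantities can be estimated by repeated finite differences, as in the Python sketch below. The function names and the list-of-tuples track format are assumptions, and the comparison against stored signatures of known acts is not shown.

```python
import math


def differentiate(vectors, dt):
    """Finite difference of consecutive vectors, divided by time step `dt`."""
    return [
        tuple((b - a) / dt for a, b in zip(v0, v1))
        for v0, v1 in zip(vectors, vectors[1:])
    ]


def track_kinematics(positions, dt):
    """Return (velocity, acceleration, jerk) lists for a track of (x, y) points."""
    velocity = differentiate(positions, dt)
    acceleration = differentiate(velocity, dt)
    jerk = differentiate(acceleration, dt)
    return velocity, acceleration, jerk


def peak_magnitude(vectors):
    """Largest vector length in the list, or 0.0 for an empty list."""
    return max((math.hypot(*v) for v in vectors), default=0.0)
```

Each differentiation shortens the list by one sample, so at least four positions are needed before a jerk estimate exists.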
`
`
`
U.S. Patent    Sep. 9, 1997    Sheet 5 of 5    5,666,157

[FIG. 4: schematic block diagram of a word recognition system. Labeled blocks: amplifier and filtering means; analysis means; trained words and expressions templates or statistical models means; pattern comparator/matching means; post processing and decision logic means. Output label: recognized words and gist. Reference numerals shown: 40, 42, 44, 46, 48.]
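FIG. 4 labels a pattern comparator/matching means that works against trained word templates. The patent describes the word recognition system only as conventional, so the following Python sketch is just one simple possibility: nearest-template matching on feature vectors with a rejection distance. The function name, the feature vectors, and the distance limit are all hypothetical.

```python
import math


def match_word(features, templates, max_distance):
    """Return the trained word whose template is nearest to `features`.

    `templates` maps each word to a feature vector of the same length as
    `features`. Returns None when no template is within `max_distance`.
    """
    best_word = None
    best_distance = math.inf
    for word, template in templates.items():
        distance = math.dist(features, template)  # Euclidean distance
        if distance < best_distance:
            best_word, best_distance = word, distance
    return best_word if best_distance <= max_distance else None
```

Rejecting distant inputs keeps unrelated sounds from being forced onto the nearest trained word.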
`
`
`
ABNORMALITY DETECTION AND SURVEILLANCE SYSTEM

FIELD OF THE INVENTION

This invention generally relates to surveillance systems, and more particularly, to trainable surveillance systems which detect and respond to specific abnormal video and audio input signals.

BACKGROUND OF THE INVENTION

Today's surveillance systems vary in complexity, efficiency and accuracy. Earlier surveillance systems use several closed circuit cameras, each connected to a devoted monitor. This type of system works sufficiently well for low-coverage sites, i.e., areas requiring up to perhaps six cameras. In such a system, a single person could scan the six monitors, in "real" time, and effectively monitor the entire (albeit small) protected area, offering a relatively high level of readiness to respond to an abnormal act or situation observed within the protected area. In this simplest of surveillance systems, it is left to the discretion of security personnel to determine, first, if there is any abnormal event in progress within the protected area, second, the level of concern placed on that particular event, and third, what actions should be taken in response to the particular event. The reliability of the entire system depends on the alertness and efficiency of the worker observing the monitors.

Many surveillance systems, however, require the use of a greater number of cameras (e.g., more than six) to police a larger area, such as at least every room located within a large museum. To adequately ensure reliable and complete surveillance within the protected area, either more personnel must be employed to constantly watch the additionally required monitors (one per camera), or fewer monitors may be used on a simple rotation schedule wherein one monitor sequentially displays the output images of several cameras, displaying the images of each camera for perhaps a few seconds. In another prior art surveillance system (referred to as the "QUAD" system), four cameras are connected to a single monitor whose screen continuously and simultaneously displays the four different images. In a "quaded quad" prior art surveillance system, sixteen cameras are linked to a single monitor whose screen now displays, continuously and simultaneously, all sixteen different images. These improvements allow fewer personnel to adequately supervise the monitors to cover the larger protected area.

These improvements, however, still require the constant attention of at least one person. The above described multiple-image/single screen systems suffered from poor resolution and complex viewing. The reliability of the entire system is still dependent on the alertness and efficiency of the security personnel watching the monitors. The personnel watching the monitors are still burdened with identifying an abnormal act or condition shown on one of the monitors, determining which camera, and which corresponding zone of the protected area is recording the abnormal event, determining the level of concern placed on the particular event, and finally, determining the appropriate actions that must be taken to respond to the particular event.

Eventually, it was recognized that human personnel could not reliably monitor the "real-time" images from one or several cameras for long "watch" periods of time. It is natural for any person to become bored while performing a monotonous task, such as staring at one or several monitors continuously, waiting for something unusual or abnormal to occur, something which may never occur.

As discussed above, it is the human link which lowers the overall reliability of the entire surveillance system. U.S. Pat. No. 4,737,847 issued to Araki et al. discloses an improved abnormality surveillance system wherein motion sensors are positioned within a protected area to first determine the presence of an object of interest, such as an intruder. In the system disclosed by U.S. Pat. No. 4,737,847, zones having prescribed "warning levels" are defined within the protected area. Depending on which of these zones an object or person is detected in, moves to, and the length of time the detected object or person remains in a particular zone determines whether the object or person entering the zone should be considered an abnormal event or a threat.

The surveillance system disclosed in U.S. Pat. No. 4,737,847 does remove some of the monitoring responsibility otherwise placed on human personnel; however, such a system can only determine an intruder's "intent" by his presence relative to particular zones. The actual movements and sounds of the intruder are not measured or observed. A skilled criminal could easily determine the warning levels of obvious zones within a protected area and act accordingly; spending little time in zones having a high warning level, for example.

It is therefore an object of the present invention to provide a surveillance system which overcomes the problems of the prior art.

It is another object of the invention to provide such a surveillance system wherein a potentially abnormal event is determined by a computer prior to summoning a human supervisor.

It is another object of the invention to provide a surveillance system which compares specific measured movements of a particular person or persons with a trainable, predetermined set of "typical" movements to determine the level and type of a criminal or mischievous event.

It is another object of this invention to provide a surveillance system which transmits the data from various sensors to a location where it can be recorded for evidentiary purposes. It is another object of this invention to provide such a surveillance system which is operational day and night.

It is another object of this invention to provide a surveillance system which can cull out real-time events which indicate criminal intent using a weapon, by resolving the low temperature of the weapon relative to the higher body temperature and by recognizing the stances taken by the person with the weapon.

It is yet another object of this invention to provide a surveillance system which eliminates or reduces the number of TV monitors and guards presently required to identify abnormal events, as this system will perform this function in near real time.

INCORPORATED BY REFERENCE

The content of the following references is hereby incorporated by reference.

1. Motz L. and L. Bergstein "Zoom Lens Systems", Journal of Optical Society of America, 3 papers in Vol. 52, 1992.

2. D. G. Aviv, "Sensor Software Assessment of Advanced Earth Resources Satellite Systems", ARC Inc. Report #70-80-A, pp. 2-107 through 2-119; NASA contract NAS-1-16366.

3. Shio, A. and J. Sklansky "Segmentation of People in Motion", Proc. of IEEE Workshop on Visual Motion, Princeton, N.J., October 1991.
`
`
`
4. Agarwal, R. and J. Sklansky "Estimating Optical Flow from Clustered Trajectory Velocity Time".

5. Suzuki, S. and J. Sklansky "Extracting Non-Rigid Moving Objects by Temporal Edges", IEEE, 1992, Transactions of Pattern Recognition.

6. Rabiner, L. and Biing-Hwang Juang "Fundamental of Speech Recognition", Pub. Prentice Hall, 1993 (p. 434-495).

7. Weibel, A. and Kai-Fu Lee Eds. "Readings in Speech Recognition", Pub. Morgan Kaufman, 1990 (p. 267-296).

8. Rabiner L. "Application of Voice Processing to Telecommunication", Proc. IEEE, Vol. 82, No. 2, February, 1994.

SUMMARY OF THE INVENTION

A preferred embodiment of the herein disclosed invention involves a surveillance system having at least one primary video camera for translating real images of a zone into electronic video signals at a first level of resolution and means for sampling movements within the zone from the video camera output. These elements are combined with means for electronically comparing the sampled movements with known characteristics of movements which are indicative of individuals engaged in criminal activity and means for determining the level of such criminal activity. Associated therewith are means for activating at least one secondary sensor and associated recording device having a second higher level of resolution, said activating means being in response to determining a predetermined level of criminal activity.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic block diagram of the video, analysis, control, alarm and recording subsystems of an embodiment of this invention;

FIG. 2A illustrates a frame K of a video camera's output of a particular environment, according to the invention, showing four representative objects (people) A, B, C, and D, wherein objects A, B and D are moving in a direction indicated with arrows, and object C is not moving;

FIG. 2B illustrates a frame K+5 of the video camera's output, according to the invention, showing objects A, B, and D are stationary, and object C is moving;

FIG. 2C illustrates a frame K+10 of the video camera's output, according to the invention, showing the current location of objects A, B, C, D, and E;

FIG. 2D illustrates a frame K+11 of the video camera's output, according to the invention, showing object B next to object C, and object E moving to the right;

FIG. 2E illustrates a frame K+12 of the video camera's output, according to the invention, showing a potential crime taking place between objects B and C;

FIG. 2F illustrates a frame K+13 of the video camera's output, according to the invention, showing objects B and C interacting;

FIG. 2G illustrates a frame K+15 of the video camera's output, according to the invention, showing object C moving to the right and object B following;

FIG. 2H illustrates a frame K+16 of the video camera's output, according to the invention, showing object C moving away from a stationary object B;

FIG. 2I illustrates a frame K+17 of the video camera's output, according to the invention, showing object B moving towards object C;

FIG. 3A illustrates a frame of a video camera's output, according to the invention, showing a "two on one" interaction of objects (people) A, B, and C;

FIG. 3B illustrates a later frame of the video camera's output of FIG. 3A, according to the invention, showing objects A and C moving towards object B;

FIG. 3C illustrates a later frame of the video camera's output of FIG. 3B, according to the invention, showing objects A and C moving in close proximity to object B;

FIG. 3D illustrates a later frame of the video camera's output of FIG. 3C, according to the invention, showing objects A and C quickly moving away from object B;

FIG. 4 is a schematic block diagram of a conventional word recognition system which may be employed in the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring to FIG. 1, the picture input means 10 may be any conventional electronic picture pickup device operational within the infrared or visual spectrum (or both) including a vidicon and a CCD/TV camera of moderate resolution, e.g., a camera about 1½ inches in length and about 1 inch in diameter, weighing about 3 ounces, including for particular deployment a zoom lens attachment. This device is intended to operate continuously and translate the field of view ("real") images within a first observation area into conventional video electronic signals.

Alternatively, a high rate camera/recorder, up to 300 frames/sec (similar to those made by NAC Visual Systems of Woodland Hills, Calif., SONY and others) may be used as the picture input means 10. This would enable the detection of even the very rapid movement of body parts that are indicative of criminal intent, and their recording, as hereinbelow described. The more commonly used camera operates at 30 frames per second and cannot capture such quick body movement with sufficient resolution.

Picture input means 10, instead of operating continuously, may be activated by an "alert" signal from the processor of the low resolution camera or from the audio/word recognition processor when sensing a suspicious event.

Picture input means 10 contains a preprocessor which normalizes a wide range of illumination levels, especially for outside observation. The preprocessor emulates a vertebrate's retina, which has an efficient and accurate normalization process. One such preprocessor (VLSI retina chip) is fabricated by the Carver Meade Laboratory of the California Institute of Technology in Pasadena, Calif. Use of this particular preprocessor chip will increase the automated vision capability of this invention whenever variation of light intensity and light reflection may otherwise weaken the picture resolution.

The signals from the picture input means 10 are converted into digitized signals and then sent to the picture processing means 12. The processor means controlling each group of cameras will be governed by an artificial intelligence system, based on dynamic pattern recognition principles, as further described below. Picture processing means 12 includes an image raster analyzer which effectively segments each image to isolate each pair of people. The image raster analyzer subsystem of picture processing means 12 segments each sampled image to identify and isolate each pair of objects (or people), and each "two on one" group of three people separately.

The "two on one" grouping represents a common mugging situation in which two individuals approach a victim,
`
`
`
one from in front of the victim and the other from behind. The forward mugger tells the potential victim that if he does not give up his money (or watch, ring, etc.), the second mugger will shoot him, stab or otherwise harm him. The group of three people will thus be considered a potential crime in progress and will therefore be segmented and analyzed in picture processing means.

With respect to a zoom lens system useful as an element in the picture input means 10, the essentials of the zoom lens subsystem are described in three papers written by L. Motz and L. Bergstein, in an article titled "Zoom Lens Systems" in the Journal of Optical Society of America, Vol. 52, April, 1992. This article is hereby incorporated by reference.

The essence of the zoom system is to vary the focal length such that an object being observed will be focused and magnified at its image plane. In an automatic version of the zoom system, once an object is in the camera's field-of-view (FOV), the lens moves to focus the object onto the camera's image plane. An error signal, which is used to correct the focus at the image plane, is generated by dividing a CCD array into two halves and measuring the difference between the segments until the object is at the center. Dividing the CCD array into more than two segments, say four quadrants, is a way to achieve automatic centering, as is the case with mono-pulse radar. Regardless of the number of segments, the error signal is used to generate the desired tracking of the object.

In a wide field-of-view (WFOV) operation, there may be more than one object, thus special attention is given to the design of the zoom system and its associated software and firmware control. Assuming three objects, as is the "two on one" potential mugging threat described above, and that the three persons are all in one plane, one can program a shifting from one object to the next, from one face to another face, in a prescribed sequential order. Moreover, as the objects move within the WFOV they will be automatically tracked in azimuth and elevation. In principle, the zoom would focus on the nearest object, assuming that the amount of light on each object is the same, so that the prescribed sequence starting from the closest object will proceed to the remaining objects from, for example, right to left.

However, when the three objects are located in different planes, but still within the camera's WFOV, the zoom, with input from the segmentation subsystem of the picture analysis means 12, will focus on the object closest to the right hand side of the image plane, and then proceed to move the focus to the left, focusing on the next object and on the next sequentially.

In all of the above cases, the automatic zoom can more naturally choose to home-in on the person with the brightest emission or reflection, and then proceed to the next brightness and so forth. This would be a form of an intensity/time selection multiplex zoom system.

The relative positioning of the input camera with respect to the area under surveillance will affect the accuracy by which the image raster analyzer segments each image. In this preferred embodiment, it is beneficial for the input camera to view the area under surveillance from a point located directly above, e.g., with the input camera mounted high on a wall, a utility tower, or a traffic light support tower. The height of the input camera is preferably sufficient to minimize occlusion between the input camera and the movement of the individuals under surveillance.

Once the objects within each sampled video frame are segmented (i.e., detected and isolated), an analysis is made of the detailed movements of each object located within each particular segment of each image, and their relative movements with respect to the other objects.

Each image frame segment, once digitized, is stored in a frame by frame memory storage of picture processing means 12. Each frame from the picture input means 10 is subtracted from a previous frame already stored in processing means 12 using any conventional differencing process. The differencing process involving multiple differencing steps takes place in the processing section 12. The resulting difference signal (outputted from the differencing sub-section of means 12) of each image indicates all the changes that have occurred from one frame to the next. These changes include any movements of the individuals located within the segment and any movements of their limbs, e.g., arms.

Referring to FIG. 3, a collection of differencing signals for each moved object of subsequent sampled frames of images (called a "track") allows a determination of the type, speed and direction (vector) of each motion involved, processing which will extract acceleration, i.e., rate of change of velocity, and change in acceleration with respect to time (called "jerkiness"), and correlating this with stored signatures of known physical criminal acts. For example, subsequent differencing signals may reveal that an individual's arm is moving to a high position (such as the upper limit of that arm's motion, i.e., above his head) at a fast speed. This particular movement could be perceived, as described below, as a hostile movement with a possible criminal activity requiring the expert analysis of security personnel.

The intersection of two tracks indicates the intersection of two moved objects. The intersecting objects, in this case, could be merely the two hands of two people greeting each other, or depending on other characteristics, as described below, the intersecting objects could be interpreted as a fist of an assailant contacting the face of a victim in a less friendly greeting. In any event, the intersection of two tracks immediately requires further analysis and/or the summoning of security personnel. But the generation of an alarm, light and sound devices located, for example, on a monitor will turn a guard's attention only to that monitor, hence the labor savings. In general, however, friendly interactions between individuals are a much slower physical process than is a physical assault vis-a-vis body parts of the individuals involved. Hence, friendly interactions may be easily distinguished from hostile physical acts using current low pass and high pass filters, and current pattern recognition techniques based on experimental reference data.

When a large number of sensors (called a sensor suite) are distributed over a large number of facilities, for example, a number of ATMs (automatic teller machines), associated with particular bank branches and in a particular state or states and all operated under a single bank network control, then only one monitor is required.

A commercially available software tool may enhance object-movement analysis between frames (called optical flow computation). With optical flow computation, specific (usually bright) reflective elements, called farkles, emitted from the clothing and/or the body parts of an individual of one frame are subtracted from a previous frame. The bright portions will inherently provide sharper detail and therefore will yield more accurate data regarding the velocities of the relative moving objects. Additional computation, as described below, will provide data regarding the acceleration and even change in acceleration or "jerkiness" of each moving part sampled.

The physical motions of the individuals involved in an interaction will be detected by first determining the edges of each person imaged. And the movements of the body parts will then be observed by noting the movements of the
`
`
`
edges of the body parts of the individuals involved in the interaction. The differencing process will enable the determination of the velocity and acceleration and rate of acceleration of those body parts.

The now processed signal is sent to comparison means 14 which compares selected frames of the video signals from the picture input means 10 with "signature" video signals stored in memory 16. The signature signals are representative of various positions and movements of the body parts of an individual having various levels of criminal intent. The method for obtaining the data base of these signature video signals in accordance with another aspect of the invention is described in greater detail below.

If a comparison is made positive with one or more of the signature video signals, an output "alert" signal is sent from the comparison means 14 to a controller 18. The controller 18 controls the operation of a secondary, high resolution picture input means (video camera) 20 and a conventional monitor 22 and video recorder 24. The field of view of the secondary camera 20 is preferably at most the same as the field of view of the primary camera 10, surveying a second observation area. The recorder 24 may be located at the site and/or at both a law enforcement facility (not shown) and simultaneously at a court office or legal facility to prevent loss of incriminating information due to tampering.

The purpose of the secondary camera 20 is to provide a detailed video signal of the individual having assumed criminal intent and also to improve false positive and false negative performance. This information is recorded by the video recorder 24 and displayed on a monitor 22. An alarm bell or light (not shown) or both may be provided and activated by an output signal from the controller 18 to summon a supervisor to immediately view the pertinent video images showing the apparent crime in progress and assess its accuracy.

In still another embodiment of the invention, a VCR 26 is operating continuously (using a 6 hour loop-tape, for example). The VCR 26 is being controlled by the VCR controller 28. All the "real-time" images directly from the picture input means 10 are immediately recorded and stored for at least 6 hours, for example. Should it be determined that a crime is in progress, a signal from the controller 18 is sent to the VCR controller 28 changing the mode of recording from tape looping mode to non-looping mode. Once the VCR 26 is changed to a non-looping mode, the tape will not re-loop and will therefore retain the perhaps vital recorded video information of the surveyed site, including the crime itself, and the events leading up to the crime.

When the non-looping mode is initiated, the video signal may also be transmitted to a VCR located elsewhere; for example, at a law enforcement facility and, simultaneously, to other secure locations of the Court and its associated offices.

Prior to the video signals being compared with the "signature" signals stored in memory, each sampled frame of video is "segmented" into parts rel

may vary. For example, in a high risk area, every frame from the CCD/TV camera may be analyzed continuously to ensure that the maximum amount of information is recorded prior to and during a crime. In a low risk area, it may be preferred to sample perhaps every 10 frames from each camera, sequentially.

If, during such a sampling, it is determined that an abnormal or suspicious event is occurring, such as two people moving very close to each other, then the system would activate an alert mode wherein the system becomes "concerned and curious" in the suspicious actions and the sampling rate is increased to perhaps every 5 frames or even every frame. As described in greater detail below, depending on the type of system employed (i.e., video only, audio only or both), during such an alert mode, the entire system may be activated wherein both audio and video systems begin to sample the environment for sufficient information to determine the intent of the actions.

Referring to FIG. 2, several frames of a particular camera output are shown to illustrate the segmentation process performed in accordance with the invention. The system begins to sample at frame K and determines that there are four objects (previously determined to be people, as described below), A-D, located within a particular zone being policed. Since nothing unusual is determined from the initial analysis, the system does not warrant an "alert" status. People A, B, and D are moving according to normal, non-criminal intent, as could be observed.

A crime likelihood is indicated when frames K+10 through K+13 are analyzed by the differencing process. And if the movement of the body parts indicates velocity, acceleration and "jerkiness" that compare positively with the stored digital signals depicting movements of known criminal physical assaults, it is likely that a crime is in progress here.

Additionally, if a high velocity of departure is indicated when person C moves away from person B, as indicated in frames K+15 through K+17, a larger level of confidence is attained in deciding that a physical criminal act has taken place or is about to.

An alarm is generated the instant any of the above conditions is established. This alarm condition will result in sending in Police or Guards to the crime site, activating the high resolution CCD/TV camera to record the face of the person committing the assault, a loud speaker being activated automatically, playing a recorded announcement warning the perpetrator of the seriousness of his actions now being undertaken and demanding that he cease the criminal act. After dark a strong light will be turned on automatically. The automated responses will be actuated the instant an alarm condition is determined by the processor. Furthermore, an alarm signal is sent to the police station, and the same video signal of the event is transmitted to a court appointed data collection office, to the Public Defender's office and the District Attorney's Office.

As described above, it is necessary to compare the result