`
`I, Rachel J. Watters, am a librarian, and the Director of Wisconsin TechSearch
`
`(“WTS”), located at 728 State Street, Madison, Wisconsin, 53706. WTS is an
`
`interlibrary loan department at the University of Wisconsin—Madison.
`
`I have worked as
`
`a librarian at the University of Wisconsin library system since 1998.
`
`I have been
`
employed at WTS since 2002, first as a librarian and, beginning in 2011, as the Director.
`
`Through the course of my employment, I have become well informed about the
`
`operations of the University of Wisconsin library system, which follows standard library
`
`practices.
`
`This Declaration relates to the dates of receipt and availability of the following:
`
Brill, F.Z., Olson, T.J., Tserng, C. (1998). Event Recognition and Reliability Improvements for the Autonomous Video Surveillance System. Image Understanding Workshop: Proceedings of a Workshop held in Monterey, California, Vol. I, 267-283.
`
`
Standard operating procedures for materials at the University of Wisconsin-
`
`Madison Libraries. When a volume was received by the Library, it would be checked
`
`in, stamped with the date of receipt, added to library holdings records, and made
`
`available to readers as soon after its arrival as possible. The procedure normally took a
`
`few days or at most 2 to 3 weeks.
`
Exhibit A to this Declaration is a true and accurate copy of the title page with
`
library date stamp of Image Understanding Workshop: Proceedings of a Workshop held
`
`in Monterey, California (1998), from the University of Wisconsin-Madison Library
`
`
`collection. Exhibit A also includes an excerpt of pages 267 to 283 of that volume,
`
`showing the article entitled Event Recognition and Reliability Improvements for the
`
`Autonomous Video Surveillance System (1998). Based on this information, the date
`
stamp on the journal title page indicates Event Recognition and Reliability
`
`Improvements for the Autonomous Video Surveillance System (1998) was received by
`
`the Kurt F. Wendt Library, University of Wisconsin-Madison on February 4, 1999.
`
Based on the information in Exhibit A, it is clear that the volume was received by

the library on or before February 4, 1999, and was catalogued and made available to

library patrons within a few days or, at most, 2 to 3 weeks after February 4, 1999.
`
`I declare that all statements made herein of my own knowledge are true and that
`
all statements made on information and belief are believed to be true; and further that
`
`these statements were made with the knowledge that willful false statements and the like
`
`so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18
`
`of the United States Code.
`
`
`
`
` Date: August 22, 2019
`
`Wisconsin TechSearch
`
Rachel J. Watters
`
`Director
`
`Memorial Library
`728 State Street
`
`Madison, Wisconsin 53706
`
`
`
`
Image Understanding Workshop

Proceedings of a Workshop held in Monterey, California

Volume I

[Library date stamp: KURT F. WENDT LIBRARY, COLLEGE OF ENGINEERING, FEB 4 1999, UW-MADISON, WI]

Sponsored by:
Defense Advanced Research Projects Agency
`
This document contains copies of reports prepared for the DARPA Image Understanding Workshop. Included are Principal Investigator reports and technical results from the basic and strategic computing programs within DARPA/ISO-sponsored projects and certain technical reports from selected scientists from other organizations.
`
`APPROVED FOR PUBLIC RELEASE
`
`DISTRIBUTION UNLIMITED
`
`The views and conclusions contained in this document are those of the authors and should not be
`
`interpreted as necessarily representing the official policies, either expressed or implied, of the De-
`fense Advanced Research Projects Agency or the Government of the United States of America.
`
`
`
`
Kurt F. Wendt Library
University of Wisconsin - Madison
215 N. Randall Avenue
Madison, WI 53706-1688
`
`Distributed by:
`Morgan Kaufmann Publishers Inc.
`340 Pine Street, 6th Floor
`San Francisco, Calif. 94104-3205
ISBN: 1-55860-583-5
`
`Printed in the United States of America
`
`
`
`
`This material may be protected by Copyright law (Title 17 U.S. Code)
`
`Event Recognition and Reliability Improvements for
`
`the Autonomous Video Surveillance System
`
`Frank Z. Brill, Thomas J. Olson, and Christopher Tserng
`Texas Instruments
`
`P.O. Box 655303, MS 8374, Dallas, TX 75265
`
brill@csc.ti.com, olson@csc.ti.com, tserng@csc.ti.com
`
`Abstract
`
This report describes recent progress in the development of the Autonomous Video Surveillance (AVS) system, a general-purpose system for moving object detection and event recognition. AVS analyzes live video of a scene and builds a description of the activity in that scene. The recent enhancements to AVS described in this report are: (1) use of collateral information sources, (2) camera hand-off, (3) vehicle event recognition, and (4) complex-event recognition. Also described is a new segmentation and tracking technique and an evaluation of AVS performing the best-view selection task.
`
`1. Introduction
`
The Autonomous Video Surveillance (AVS) system processes live video streams from surveillance cameras to automatically produce a real-time map-based display of the locations of people, objects and events in a monitored region. The system allows a user to specify alarm conditions interactively, based on the locations of people and objects in the scene, the types of objects in the scene, the events in which the people and objects are involved, and the times at which the events occur. Furthermore, the user can specify the action to
`cur. Furthermore, the user can specify the action to
`take when an alarm is triggered, e.g., to generate an
`audio alarm or write a log file. For example, the
`user can specify that an audio alarm should be trig-
`gered if a person deposits a briefcase on a given
`table between 5:00pm and 7:00am on a weeknight.
Section 2 below describes recent enhancements to the AVS system. Section 3 describes progress in improving the reliability of segmentation and tracking. Section 4 describes an experiment that quantifies the performance of the AVS "best view selection" capability.

(This research was sponsored in part by the DARPA Image Understanding Program.)
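To make the alarm-specification idea above concrete, the following is a minimal sketch of such a rule in Python. The rule layout, field names, and matching logic are illustrative assumptions; the paper does not describe the AVS rule representation.

    # Sketch of a user-specified alarm rule like the briefcase example
    # above. All names and the matching logic are assumptions.
    from dataclasses import dataclass

    @dataclass
    class AlarmRule:
        event_type: str    # e.g. "DEPOSIT"
        object_type: str   # e.g. "briefcase"
        region: str        # e.g. "table-1"
        start: str         # active from, 24h clock "HH:MM"
        end: str           # active until (window may wrap past midnight)
        action: str        # e.g. "audio-alarm" or "write-log"

        def matches(self, event_type, object_type, region, hhmm):
            """True when a matching event occurs inside the active time
            window (zero-padded "HH:MM" strings compare correctly as text)."""
            if self.start > self.end:  # window wraps past midnight
                in_window = hhmm >= self.start or hhmm < self.end
            else:
                in_window = self.start <= hhmm < self.end
            return (event_type == self.event_type
                    and object_type == self.object_type
                    and region == self.region
                    and in_window)

    # A briefcase deposited on the table between 5:00pm and 7:00am
    # triggers an audio alarm, as in the example of section 1.
    rule = AlarmRule("DEPOSIT", "briefcase", "table-1", "17:00", "07:00", "audio-alarm")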
`
`2. New AVS functionality
`
`The structure and function of the AVS system is
`described in detail in a previous IUW paper [Olson
`and Brill, 1997]. The primary purpose of the cur-
`rent paper is to describe recent enhancements to
the AVS system. These enhancements are described in four sections below: (1) collateral information sources, (2) camera hand-off, (3) vehicle event recognition, and (4) complex-event recognition.
`
`2.1. Collateral information sources
`
Figure 1 shows a diagram of the AVS system. One or more "smart" cameras process the video stream to recognize events. The resulting event streams are sent to a Video Surveillance Shell (VSS), which integrates the information and displays it on a map. The VSS can also generate alarms based on the information in the event streams. In recent work, the VSS was enhanced to accept information from other sources, or "recognition devices", which can identify the objects being reported on by the cameras. For example, a camera may report that there is a person near a door. A recognition device may report that the person near the door is Joe Smith. The recognition device may be a badge reader, a keypad in which a person types their PIN, a face recognition system, or other recognition system.
`
`
`
`
Figure 1: AVS system diagram
`
The recognition device we have incorporated is a voice verification system. The user stands in a predefined location in the room, and speaks his or her name. The system matches the utterance to previously captured examples of the person speaking their name, and reports to the VSS if there is a match. The VSS now knows the identity of the person being observed, and can customize alarms based on the person's identity.
`
A recognition device could identify things other than people, and could classify actions instead of objects. For example, the MIT Action Recognition System (MARS) recognizes actions of people in the scene, such as raising their arms or bending over. MARS is trained by observing examples of the action to be recognized and forming "temporal templates" that briefly describe the action [Davis and Bobick, 1997]. At run time, MARS observes the motion in the scene and determines when the motion matches one of the stored temporal templates. TI has obtained an evaluation copy of the MARS software and used it as a recognition device which identifies actions, and sends the result to the AVS VSS. We successfully trained MARS to recognize the actions of opening a door, and opening the drawer of a file cabinet. When MARS recognizes these actions, it sends a message to the AVS VSS, which can generate an appropriate alarm.
`
`2.2. Camera hand-off
`
As depicted in Figure 1, the AVS system incorporates multiple cameras to enable surveillance of a wider area than can be monitored via a single camera. If the fields of view of these cameras are adjacent, a person can be tracked from one monitored area to another. When the person leaves the field of view of one camera and enters another, the process of maintaining the track from one camera view to another is termed camera hand-off. Figure 2 shows an area monitored by two cameras. Camera-1 monitors the hallway, and Camera-2 monitors the interior of the room. When a person moves through the doorway to enter the room from the hall or vice versa, camera hand-off is necessary to enable the system to know that the person that was being monitored in the hall via Camera-1 is the same as the person being monitored in the room via Camera-2.

Figure 2: Multiple cameras with adjacent fields of view
`
The AVS system accomplishes camera hand-off by integrating the information from the two cameras in the map coordinate system. The AVS "smart" cameras report the locations of the monitored objects and people in map coordinates, so that when the VSS receives reports about a person from two separate cameras, and both cameras are reporting the person's coordinates at about the same map location, the VSS can deduce that the two separate reports refer to the same person. In the example depicted in Figure 2, when a person is standing in the doorway, both cameras can see the person and report his or her location at nearly the same place. The VSS reports this as one person, using a minimum distance to allow for errors in location. When Camera-2 first sees a person at a location near the doorway and reports this to the VSS, the VSS checks to see if Camera-1 recently reported a person near the door. If so, the VSS reports the person in the room as the same one that Camera-1 had been tracking in the hall.
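A minimal sketch of this hand-off test follows. The report format, distance threshold, and time window are illustrative assumptions; the paper specifies only that reports at about the same map location are merged, with a minimum distance allowing for localization error.

    # Sketch of camera hand-off matching in map coordinates (feet).
    # Thresholds and the report structure are assumptions.
    import math
    import time

    MAX_HANDOFF_DIST_FT = 3.0  # assumed allowance for location error
    HANDOFF_WINDOW_S = 2.0     # assumed recency window for the other camera

    recent_reports = []        # entries: (timestamp, camera_id, track_id, (x, y))

    def handoff_match(camera_id, position, now=None):
        """Return the track-ID another camera recently reported near
        `position`, or None if this is a genuinely new person."""
        now = time.time() if now is None else now
        for ts, cam, track, pos in reversed(recent_reports):
            if cam == camera_id or now - ts > HANDOFF_WINDOW_S:
                continue
            if math.dist(pos, position) <= MAX_HANDOFF_DIST_FT:
                return track   # same person: continue the existing track
        return None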
`
`2.3. Vehicle event recognition
`
`This section describes extensions to the existing
`AVS system that enable the recognition of events
`involving interactions of people with cars. These
`new capabilities enable smart security cameras to
`monitor streets, parking lots and driveways and re-
`port when suspicious events occur. For example, a
`smart camera signals an alarm when a person exits
`a car, deposits an object near a building, reenters
`the car, and drives away.
`
`2.3.1. Scope and assumptions
`
Extending the AVS system to handle human-vehicle interactions reliably involved two separable subproblems. First, the system's vocabulary for events and objects must be extended to handle a new class of object (vehicle) and new event types. Second, the AVS moving object detection and tracking software must be modified to handle the outdoor environment, which features variable lighting, strong shadows, atmospheric disturbances, and dynamic backgrounds. The work described here in section 2.3 addresses the first problem, to extend the system for vehicle events in conditions of uniform overcast with little wind. Our approach to handling general outdoor lighting conditions is discussed in section 4.
`
The method is further specialized for imaging conditions in which:

1. The camera views cars laterally.
2. Cars are unoccluded by other cars.
3. When cars and people overlap, only one of the overlapping objects is moving.
4. The events of interest are people getting into and out of cars.
`
`2.3.2. Car detection
`
The first step in expanding the event-recognition capability of the current system was to give it the ability to distinguish between people and cars. The system classifies objects as cars by using their sizes and aspect ratios. The size of an object in feet is obtained using the AVS system's image-coordinate-to-world-coordinate mapping. Once the system has detected a car, it analyzes the motion graph to recognize new events.
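A sketch of this size-and-aspect-ratio test is below. The thresholds are illustrative assumptions; the paper states only that size in feet (via the image-to-world mapping) and aspect ratio are used.

    # Sketch of the person/car classification. Threshold values are
    # assumptions, not taken from the paper.
    def classify_blob(width_ft, height_ft):
        """Label a tracked blob as 'car' or 'person' from its world-space size."""
        aspect = width_ft / height_ft
        if width_ft >= 8.0 and aspect >= 1.5:  # long, wide object: treat as car
            return "car"
        if height_ft > width_ft:               # upright object: treat as person
            return "person"
        return "unknown"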
`
`2.3.3. Car event recognition
`
In principle, car exit and car entry events could be recognized by detecting characteristic interactions of blobs in difference images, in a manner similar to the way AVS recognizes DEPOSIT and REMOVE events. In early experiments, however, this method turned out to be unsatisfactory because the underlying motion segmentation method did not segment cars from people. Whenever people pass near the car they appear to merge with it, and track is lost until they walk away from it.
`
`To solve this problem, a new approach involving
`additional image differencing was developed. The
`technique allows objects to be detected and tracked
`even when their images overlap the image of the
`car. This method requires two reference images:
`one consists of the original background scene
`(background image), and the other is identical to
`the first except it includes the car. The system takes
`differences between the current video image and
`the original reference image as usual. However, it
`also differences the current video image with the
`reference image containing the car. This allows the
`
`
`
`
system to detect objects which may be overlapping the car. Using this technique, it is easy to detect when people enter and exit a car. If an object disappears while overlapping with a car, it probably entered the car. Similarly, if an object appears overlapping a car, it probably exited the car.
`
`2.3.4. Basic method
`
When a car comes to rest, the following steps are taken. First, the image of the car object is removed from its frame and stored. Then, the car image is merged with the background image, creating an updated reference image containing the car. (Terminology: a reference car image is the subregion of the updated reference image that contains the car.) Then, the car background image, the region of the original background image that is replaced by the car image, is stored.
`
For each successive frame, two difference images are generated. One difference image, the foreground difference image, is calculated by differencing the current video image with the updated reference image. The foreground difference image will contain all the blobs that represent objects other than the car, including ones that overlap the car. The second difference image, the car difference image, is calculated using the car background image. The car difference image is formed from the difference between the current frame and the car background image, and contains the large blob for the car itself. Figures 3 and 4 show the construction and use of these images.
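A minimal sketch of the two differencing operations, assuming 8-bit grayscale frames held as NumPy arrays; the threshold value and the whole-image (rather than car-region) car difference are simplifying assumptions.

    # Sketch of the two-reference-image differencing described above.
    import numpy as np

    THRESH = 30  # assumed pixel-difference threshold

    def difference_masks(frame, background, updated_reference):
        """Return (foreground_mask, car_mask) as boolean arrays.

        foreground_mask: blobs for everything except the car, including
        objects overlapping it (frame vs. the reference containing the car).
        car_mask: the car blob itself (frame vs. the original background)."""
        f = frame.astype(np.int16)
        fg = np.abs(f - updated_reference.astype(np.int16)) > THRESH
        car = np.abs(f - background.astype(np.int16)) > THRESH
        return fg, car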
`
`
`
Figure 3: (a) Background image. (b) Car background image. (c) Updated reference image
`
`
`
Figure 4: (a) Current video image. (b) Foreground difference image
`
`
`
`
`
`
Figure 5: Creation of the motion graph. The starred frame represents the frame prior to the background image being updated.
`
`The blobs in the foreground difference image are
`grouped into objects using the normal grouping
`heuristics and placed in the current frame. The
`blobs in the car difference image necessarily repre-
`sent the car, so they are all grouped into one current
`car object and placed in a special reference frame.
`Normal links occur between objects in the previous
`frame and objects in the current frame. Additional-
`ly, the stored car object, which was removed from
its frame (from Step 1), is linked to the current car object, which is in the reference frame. In any given
`sequence, there is only one reference frame.
`
Figure 5 demonstrates the creation of this new mo-
`tion graph. As indicated by the dotted lines, all
`objects maintain their tracks using this method.
`Notice that even though the car object disappears
`from future frames (due to the updated reference
`image), it is not detected to have exited because its
`track is maintained throughout every frame. Using
`this method, the system is able to keep track of the
`car object as well as any objects overlapping the
`car. If an object appears intersecting a car object,
`
`an INCAR event is reported. If an object disap-
`pears while intersecting a car object, an OUTCAR
`event is reported. Figure 6 shows the output of the
`system. The system will continue to operate in this
`manner until the car in the reference frame begins
`to move again.
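A sketch of these two event tests follows; the bounding-box representation (x0, y0, x1, y1) and helper names are assumptions.

    # Sketch of the car-event tests stated above.
    def boxes_intersect(a, b):
        """Axis-aligned bounding-box overlap test."""
        return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

    def car_event(object_bbox, car_bbox, appeared):
        """Per the text above: an object appearing while intersecting the
        car yields INCAR; one disappearing while intersecting it, OUTCAR."""
        if not boxes_intersect(object_bbox, car_bbox):
            return None
        return "INCAR" if appeared else "OUTCAR"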
`
`When the car moves again, the system reverts to its
`normal single-reference—image state. The system
`detects the car’s motion based on the movement of
`its centroid. It compares the position of the cen-
`troid of the stored car object with the centroid of
`the current car object. Figure 7 shows the slight
`movement of the car.
`
`
`
`Figure 6: Final output of system
`
Figure 7: (a) Reference car image. (b) Moving car image. (c) Reference car difference image. (d) Moving car difference image
`
`
`
`
Figure 8: Restoration of normal differencing. The starred frame represents the last frame prior to the original reference image being restored.
`
If the centroid locations differ by more than a threshold, the following sequence of events occurs to restore the system to its original state (a minimal sketch follows the list):

1. An object representing the moving car is created in the current frame.
2. The stored car object is linked to this new moving car object in the current frame.
3. Objects in the previous frame that intersect the moving car are removed from that frame.
4. The car background image is merged with the updated reference image to restore the original reference image.
5. Normal differencing continues.
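The sketch below walks through the five steps under assumed data structures: objects are dicts carrying a bounding box, frames are lists of objects, and images are NumPy-style arrays. None of this layout comes from the paper.

    # Sketch of the five restoration steps; all data structures assumed.
    def boxes_intersect(a, b):
        """Axis-aligned bounding-box overlap test, boxes as (x0, y0, x1, y1)."""
        return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

    def restore_original_state(stored_car, current_frame, previous_frame,
                               reference_image, car_background, car_region):
        # 1. Create an object representing the moving car in the current frame.
        moving_car = {"kind": "car", "bbox": stored_car["bbox"]}
        current_frame.append(moving_car)
        # 2. Link the stored car object to the new moving car object.
        stored_car["next"] = moving_car
        # 3. Remove previous-frame objects that intersect the moving car.
        previous_frame[:] = [obj for obj in previous_frame
                             if not boxes_intersect(obj["bbox"], moving_car["bbox"])]
        # 4. Merge the car background back into the updated reference image,
        #    restoring the original reference image (shapes must match).
        x0, y0, x1, y1 = car_region
        reference_image[y0:y1, x0:x1] = car_background
        # 5. Normal single-reference differencing resumes on later frames.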
`
Figure 8 demonstrates how the system is restored to its original state. Note that there is one continuous track that represents the path of the car throughout.

When the car begins to move again, transient blobs appear in the foreground difference image because the car is in the updated reference image, as seen in Figure 9. Therefore, to create a new moving car object in the current frame, these transient objects, which are identified by their intersection with the location of the resting car, are grouped together as one car object. If there are no transient objects, a copy of the stored car object is inserted into the current frame. This way, there is definitely a car object in the current frame to link with the stored car object. Transient objects might also appear in the previous frame when a car is moving. Therefore, these transient objects must be removed from their frame in order to prevent them from being linked to the new moving car object that was just created in the current frame. After the steps described above occur, the system continues as usual until another car comes to rest.
`
`2.3.5. Experiments: disk-based sequences
`
To test the principles behind the modified AVS system, three sequences of video that represented interesting events were captured to disk. These sequences represented events which the modified system should be able to recognize. Capturing the sequences to disk reduces noise and ensures that the system processes the same frames on every run, making the results deterministic. In addition to these sequences, longer sequences were recorded and run directly from videotape to test how the system would work under less ideal conditions.
`
`
`
`
`
Figure 9: (a) Updated reference image. (b) Current video image. (c) Foreground difference image
`
`
`
2.3.5.1. Simple sequence. The first sequence was filmed from the 3rd story of an office building overlooking the driveway in front of the building. A car drives up and a person exits the car, walks away, deposits a briefcase, and finally reenters the car. Then, the car drives away. In this segment, the system successfully detects the person exiting the car. However, the person entering the car is missed because the person gets grouped with a second person walking near the car.

Further on in the sequence, the car drives up again and a person exits the car, walks away, removes the briefcase, and finally reenters the car. Again, the car drives away. In this segment, both the person entering and exiting the car are recognized. In both these sequences, there was only the one false negative mentioned earlier and no false positives.
`
2.3.5.2. Pickup sequence. This sequence was filmed in front of a house looking at the street in front of the house. In the sequence, a person walks into the scene and waits at the curb. A car drives up, picks up the person, and drives away. The system correctly detects the person entering the car. There are no false positives or negatives.
`
2.3.5.3. Drop off sequence. This sequence was filmed in the same location as the previous one. In this sequence, a car drives up and a person is dropped off. The car drives away with the person still standing in the same location. Then, the person walks off. The system correctly detects the person exiting the car and does not report a false enter event when the car moves away.
`
`2.3.6. Experiments: videotaped sequences
`
`These sequences were run on the system straight
`from videotape. These were all run at a higher
`threshold to accommodate noise on the videotape.
`However, this tended to decrease the performance
`
`of the system.
`
2.3.6.1. Dark day. This is a 15 minute sequence that was recorded from the 3rd floor of a building on a fairly dark day. In that time span, 8 cars passed through the camera's field of view. The system detected 6 cars correctly and one false car (due to people grouped together). One car was not detected due to its small size. The other car was undetected because the system slowed down (due to multiple events occurring) and missed the images with the car in them. In this sequence, two people entered a car. However, both events were missed because the car was not recognized as resting due to the dark lighting conditions on this rainy day.
`
2.3.6.2. Cloudy day. This is a 13 minute sequence in the same location as the previous sequence except it is a cloudy day. In this time span, 9 cars passed through the camera's field of view and all of them were detected by the system. There were a total of 2 people entering a car and 2 people exiting a car. The system successfully detected them all. Additionally, it incorrectly reported one person walking near a car as an instance of a person exiting a car.
`
2.3.6.3. Cloudy day, extended time. This is a 30 minute sequence in the same location as the previous two. In this time span, 28 cars passed through and all of them were detected. The system successfully detected one person exiting a car but missed two others. The two people were missed because the car was on the edge of the camera's field of view and so it was not recognized immediately as a car.
`
`2.3.7. Evaluation of car-event recognition
`
The modified AVS system performs reasonably well on the test data. However, it has only been tested on a small number of videotaped sequences, in which much of the action was staged. Further experiments and further work with live, uncontrolled data will be required to make the system handle outdoor vehicle events as well as it handles indoor events. The technique of using multiple reference images is interesting and can be applied to other problems, e.g., handling repositioned furniture in indoor environments. For more detail on this method, see [Tserng, 1998].
`
`2.4. Complex events
`
The AVS video monitoring technology enables the recognition of specific events such as when a person enters a room, deposits or picks up an object, or loiters for a while in a given area. Although these events are more sophisticated than those detected via simple motion detection, they are still unstructured events that are detected regardless of the context in which they occur. This can result in alarms being generated on events that are not of interest.
`
`
`
`
`For example, if the system is monitoring a room or
`store with the intention of detecting theft, the sys-
`tem could be set up to generate an alarm whenever
`an object is picked up (i.e., whenever a REMOVE
`event occurs). However, no theft has occurred un-
`less the person leaves the area with the object. A
`simple, unstructured event
`recognition system
`would generate an alarm every time someone
`picked up an object, resulting in many false alarms;
`whereas a system that can recognize complex
`events could be programmed to only generate an
`alarm when the REMOVE event is followed by an
`EXIT event. The EXIT event provides context for
`the REMOVE event that enables the system to fil-
`ter out uninteresting cases in which the person does
`not leave the area with the object they picked up.
`This section describes the design and implementa-
`tion of such a complex-event recognition system.
`
`We use the term simple event to mean an unstruc-
`tured atomic event. A complex event is structured,
`in that it is made up of one or more sub-events. The
`sub-events of a complex event may be simple
`events, or they may be complex, enabling the defi-
`nition of event hierarchies. We will simply say
`event to refer to an event that may be either simple
`or complex. In our theft example above, REMOVE
`and EXIT are simple events, and THEFT is a com-
`plex event. A user may also define a further event,
e.g., CRIME-SPREE, which may have one or more
`complex THEFT events as sub-events.
`
`We created a user interface that enables definition
`of a complex event by constructing a list of sub-
`events. After one or more complex events have
`been defined, the sub-events of subsequently de-
`fined complex events can be complex events
`themselves.
`
`2.4.1. Complex-event recognition
`
`Once the user has defined the complex events and
`the actions to take when they occur, the event rec-
`ognition system recognizes these events as they
`occur in the monitored area. For the purposes of
this section, we assume a priori that the simple events can be recognized, and that the objects involved in them can be tracked. In the implementation we will use the methods discussed in [Courtney, 1997; Olson and Brill, 1997] to track
`objects and recognize the simple events. In order to
`recognize a complex event, the system must keep a
`record of the sub-events that have occurred thus
`
`far, and the objects involved in them. Whenever the
first sub-event
`in a complex event’s sequence is
`recognized, an activation for that complex event is
`created. The activation contains the ID of the ob-
`ject involved in the event, and an index, which is
`the number of sub-events in the sequence that have
`been recognized thus far. The index is initialized to
`1 when the activation is created, since the activa-
`tion is only created when the first
`sub-event
`matches. The system maintains a list of current ac-
`tivations for each defined complex-event
`type.
`Whenever any new event is recognized, the list of
`current activations is consulted to see if the newly
`recognized (or incoming) event matches the next
`sub-event in the complex event. If so, the index is
`incremented. If the index reaches the total number
`of sub-events in the sequence, the complete com-
`plex event has been recognized, and any desired
`alarm can be generated. Also, since the complex
`event that was just recognized may also be a sub-
`event of another complex event, the activation lists
`are consulted again (recursively) to see if the indi-
`ces of any other complex event activations can be
`advanced.
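A minimal sketch of this activation mechanism follows. The behavior (create an activation on the first sub-event, advance on a match of event type and object ID, recurse on completion) is from the text; the data layout and example definitions are assumptions.

    # Sketch of complex-event activations. Event names are strings; an
    # activation records the object ID and an index into the sub-event list.
    COMPLEX_EVENTS = {
        "THEFT": ["REMOVE", "EXIT"],
        "CRIME-SPREE": ["THEFT", "THEFT"],
    }

    activations = []  # each: {"event": name, "object_id": id, "index": n}

    def handle_event(event_type, object_id):
        """Feed one recognized event (simple or complex) to the matcher."""
        finished = []
        # Advance activations whose next expected sub-event matches.
        for act in list(activations):
            seq = COMPLEX_EVENTS[act["event"]]
            if act["object_id"] == object_id and seq[act["index"]] == event_type:
                act["index"] += 1
                if act["index"] == len(seq):   # whole sequence recognized
                    activations.remove(act)
                    finished.append(act["event"])
        # Create an activation wherever this event is a first sub-event.
        for name, seq in COMPLEX_EVENTS.items():
            if seq[0] == event_type:
                activations.append({"event": name, "object_id": object_id, "index": 1})
        # A completed complex event may itself be a sub-event: recurse.
        for name in finished:
            print("recognized complex event:", name)
            handle_event(name, object_id)

    handle_event("REMOVE", 7)  # creates a THEFT activation for object 7
    handle_event("EXIT", 7)    # completes THEFT, which starts CRIME-SPREE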
`
To return to our THEFT example, the complex THEFT event has two sub-events, REMOVE and
EXIT. When a REMOVE event occurs, an activation for the THEFT event is created, containing the ID of the person involved in the REMOVE event, and an index set to 1. Later, when another event is recognized by the system, the activation is consulted to see if the event type of this new, incoming event matches the next sub-event in the sequence (in this case, EXIT). If the event type matches, the object ID is also checked, in this case to see if the person EXITing is the same as the person who REMOVEd the object earlier. This is to ensure that we do not signal a THEFT event when one person picks up an object and a different person exits the area. In a closed environment, the IDs used may merely be track-IDs, in which each object that enters the monitored area is assigned a unique track-ID, and the track-ID is discarded when the object is no longer being tracked. If both the event type and the object ID match, the activation's index is incremented to 2. Since there are only 2 sub-events in the complex event in this example, the entire complex event has been recognized, and an alarm is generated if desired. Also, since the THEFT event has been recognized, this newly recognized THEFT event may be a sub-event of
`
`
`
another complex event. When the complex THEFT event is recognized, the current activations are recursively checked to see if the theft is a part of another higher-level event, such as a CRIME-SPREE.
`
`2.4.2. Variations and enhancements
`
`We have described the basic mechanism of defin-
`
`ing and recognizing complex events. There are
several variations on this basic mechanism. One is to allow unordered events, i.e., complex events which are simply the conjunction or disjunction of their sub-events. Another is to allow negated sub-events, which can be used to cancel an activation when the negated sub-event occurs. For example,
`considering the definition for THEFT again, if the
`person pays for the item, it is not a theft. Also, if
`the person puts the item back down before leaving,
`no theft has occurred. A more complete definition
`of theft is one in which “a person picks up an item
`and then leaves without putting it back or paying.”
Assuming we can recognize the simple events REMOVE, DEPOSIT, PAY, and EXIT, the complex
`THEFT event can now be expressed as the ordered
`list (REMOVE, ~DEPOSIT, ~PAY, EXIT), where
`“~” indicates negation. Another application of the
`complex event with negated sub-events is to detect
`suspicious behavior in front of a building. The nor-
mal behavior may be for a person to park the car,
`get out of it, and then come up into the building. If
`the person parks the vehicle and leaves the area
`without coming up into the building, this may be a
car bombing scenario. If we can detect the sub-events for PARK, OUTCAR, ENTER-BUILDING,
`and EXIT, we can define the car-bombing scenario
`as
`(PARK, OUTCAR, ~ENTER-BUILDING,
`EXIT).
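A sketch of the negated sub-event check, extending the activation index logic above, is below. The "~" convention and the two event definitions are from the text; placing the cancellation rule in a single step function, and the data layout, are assumptions.

    # Sketch of negated sub-events in the activation matcher.
    COMPLEX_EVENTS = {
        "THEFT": ["REMOVE", "~DEPOSIT", "~PAY", "EXIT"],
        "CAR-BOMBING": ["PARK", "OUTCAR", "~ENTER-BUILDING", "EXIT"],
    }

    def step(activation, event_type):
        """Return 'cancelled', 'complete', 'advanced', or 'unchanged'."""
        seq = COMPLEX_EVENTS[activation["event"]]
        i = activation["index"]
        # A pending negated sub-event that actually occurs cancels the match.
        while i < len(seq) and seq[i].startswith("~"):
            if seq[i][1:] == event_type:
                return "cancelled"
            i += 1  # skip negated entries when testing the next positive one
        if i < len(seq) and seq[i] == event_type:
            activation["index"] = i + 1
            return "complete" if i + 1 == len(seq) else "advanced"
        return "unchanged"

For the THEFT definition, a DEPOSIT or PAY arriving after REMOVE cancels the activation, while an EXIT skips past the pending negated entries and completes it.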
`
`Another variation is to allow the user to label the
`
`objects involved in the events, which facilitates the
`abi