`
US 20050146605A1
`
`(19) United States
`(12) Patent Application Publication
`Lipton et al.
`
(10) Pub. No.: US 2005/0146605 A1
(43) Pub. Date: Jul. 7, 2005
`
`(54) VIDEO SURVEILLANCE SYSTEM
`EMPLOYING VIDEO PRIMITIVES
`
(76) Inventors: Alan J. Lipton, Falls Church, VA (US);
`Thomas M. Strat, Oakton, VA (US);
`Peter L. Venetianer, McLean, VA
`(US); Mark C. Allmen, Morrison, CO
`(US); William E. Severson, Littleton,
`CO (US); Niels Haering, Arlington, VA
`(US); Andrew J. Chosak, McLean, VA
`(US); Zhong Zhang, Herndon, VA
`(US); Matthew F. Frazier, Arlington,
`VA (US); James S. Sfekas, Arlington,
`VA (US); Tasuki Hirata, Silver Spring,
`MD (US); John Clark, Leesburg, VA
`(US)
`
`Correspondence Address:
`VENABLE LLP
`P.O. BOX 34385
`WASHINGTON, DC 20435-9998 (US)
`
(21) Appl. No.: 09/987,707

(22) Filed: Nov. 15, 2001
`
`Related U.S. Application Data
`
`(63) Continuation-in-part of application No. 09/694,712,
`filed on Oct. 24, 2000.
`
`Publication Classification
`
(51) Int. Cl.7: H04N 7/18
(52) U.S. Cl.: 348/143

(57) ABSTRACT
`
`A video surveillance system is set up, calibrated, tasked, and
`operated. The system extracts video primitives and extracts
`event occurrences from the video primitives using event
`discriminators. The system can undertake a response, such
`as an alarm, based on extracted event occurrences.
`
[Representative drawing, FIG. 1: a computer system 11,
comprising a computer 12 and a computer-readable medium 13,
coupled to video sensors 14, video recorders 15, I/O devices
16, and other sensors 17.]
`
`DOJ EX. 1021
`
`
`
[Drawing sheet 1 of 7.
FIG. 1: computer system 11 (computer 12, computer-readable
medium 13) coupled to video sensors 14, video recorders 15,
I/O devices 16, and other sensors 17.
FIG. 2: blocks 21-24: set up video surveillance system (21);
calibrate video surveillance system (22); task video
surveillance system (23); operate video surveillance
system (24).]
`
`
`
[Drawing sheet 2 of 7.
FIG. 3: blocks 31-35: identify objects; identify spatial
areas; identify temporal attributes; identify interactions;
identify responses.
FIG. 4: blocks 41-45: obtain source video; extract video
primitives; archive video primitives; extract event
occurrences; undertake response, as appropriate.
FIG. 9: blocks 91-94: task video surveillance system; access
archived video primitives; extract event occurrences;
undertake response, as appropriate.]
`
`
`
[Drawing sheet 3 of 7.
FIG. 5: detect objects via motion (51); detect objects via
change (52); generate blobs (53); track objects (54);
determine if trajectory of foreground object is salient (55);
classify objects (56); identify video primitives (57).
FIG. 6: undertake response, as appropriate (61); generate
activity record (62); generate output (63).]
`
`
`
[Drawing sheet 4 of 7.
FIG. 7: obtain source video (71); detect objects via motion
(72); detect objects via change (73); generate blobs (74);
track objects (75); monitor typical object (76); identify
typical sizes of typical object (77).
FIG. 8: blocks 81-88: obtain source video (81); detect objects
via motion (82); detect objects via change (83); generate
blobs (84); track objects (85); identify trackable areas;
monitor typical object; identify typical sizes of typical
objects.]
`
`
`
[Drawing sheet 5 of 7: FIGS. 10 and 11.]
`
`
`
[Drawing sheet 6 of 7.
FIG. 12
Water - Cold Spot: Av. Customers: 4/hour; Av. Dwell Time: 3 Sec.
Sodas - Hot Spot: Av. Customers: 15/hour; Av. Dwell Time: 22 Sec.
FIG. 13
FIG. 14]
`
`
`
[Drawing sheet 7 of 7: FIG. 15.]
`
`
`
`
`VIDEO SURVEILLANCE SYSTEM EMPLOYING
`VIDEO PRIMITIVES
`
`CROSS-REFERENCE TO RELATED
`APPLICATIONS
`
`[0001] This application claims the priority of U.S. patent
`application Ser. No. 09/694,712 filed Oct. 24, 2000, which
`is incorporated herein by reference.
`
`BACKGROUND OF THE INVENTION
`
`FIELD OF THE INVENTION
`
`[0002] The invention relates to a system for automatic
`video surveillance employing video primitives.
`
`REFERENCES
`
`[0003] For the convenience of the reader, the references
`referred to herein are listed below. In the specification, the
`numerals within brackets refer to respective references. The
`listed references are incorporated herein by reference.
`
`[0004] The following references describe moving target
`detection:
`
[0005] {1} A. Lipton, H. Fujiyoshi and R. S. Patil, "Moving
Target Detection and Classification from Real-Time Video,"
Proceedings of IEEE WACV '98, Princeton, N.J., 1998,
pp. 8-14.

[0006] {2} W. E. L. Grimson, et al., "Using Adaptive
Tracking to Classify and Monitor Activities in a Site,"
CVPR, pp. 22-29, June 1998.

[0007] {3} A. J. Lipton, H. Fujiyoshi, R. S. Patil, "Moving
Target Classification and Tracking from Real-time Video,"
IUW, pp. 129-136, 1998.

[0008] {4} T. J. Olson and F. Z. Brill, "Moving Object
Detection and Event Recognition Algorithm for Smart
Cameras," IUW, pp. 159-175, May 1997.
`
`[0009] The following references describe detecting and
`tracking humans:
`
[0010] {5} A. J. Lipton, "Local Application of Optical
Flow to Analyse Rigid Versus Non-Rigid Motion,"
International Conference on Computer Vision, Corfu, Greece,
September 1999.

[0011] {6} F. Bartolini, V. Cappellini, and A. Mecocci,
"Counting people getting in and out of a bus by real-time
image-sequence processing," IVC, 12(1):36-41, January
1994.

[0012] {7} M. Rossi and A. Bozzoli, "Tracking and counting
moving people," ICIP94, pp. 212-216, 1994.

[0013] {8} C. R. Wren, A. Azarbayejani, T. Darrell, and A.
Pentland, "Pfinder: Real-time tracking of the human body,"
Vismod, 1995.

[0014] {9} L. Khoudour, L. Duvieubourg, J. P. Deparis,
"Real-Time Pedestrian Counting by Active Linear Cameras,"
JEI, 5(4):452-459, October 1996.

[0015] {10} S. Ioffe, D. A. Forsyth, "Probabilistic Methods
for Finding People," IJCV, 43(1):45-68, June 2001.

[0016] {11} M. Isard and J. MacCormick, "BraMBLe: A
Bayesian Multiple-Blob Tracker," ICCV, 2001.
`
`[0017] The following references describe blob analysis:
`
[0018] {12} D. M. Gavrila, "The Visual Analysis of
Human Movement: A Survey," CVIU, 73(1):82-98, January
1999.

[0019] {13} Niels Haering and Niels da Vitoria Lobo,
"Visual Event Detection," Video Computing Series, Editor
Mubarak Shah, 2001.
`
`[0020] The following references describe blob analysis for
`trucks, cars, and people:
`
[0021] {14} Collins, Lipton, Kanade, Fujiyoshi, Duggins,
Tsin, Tolliver, Enomoto, and Hasegawa, "A System for
Video Surveillance and Monitoring: VSAM Final Report,"
Technical Report CMU-RI-TR-00-12, Robotics Institute,
Carnegie Mellon University, May 2000.

[0022] {15} Lipton, Fujiyoshi, and Patil, "Moving Target
Classification and Tracking from Real-time Video," 98
Darpa IUW, Nov. 20-23, 1998.
`
`[0023] The following reference describes analyzing a
`single-person blob and its contours:
`
[0024] {16} C. R. Wren, A. Azarbayejani, T. Darrell, and
A. P. Pentland, "Pfinder: Real-Time Tracking of the Human
Body," PAMI, vol. 19, pp. 780-784, 1997.
`
`[0025] The following reference describes internal motion
`of blobs, including any motion-based segmentation:
`
[0026] {17} M. Allmen and C. Dyer, "Long-Range
Spatiotemporal Motion Understanding Using Spatiotemporal
Flow Curves," Proc. IEEE CVPR, Lahaina, Maui, Hi., pp.
303-309, 1991.

[0027] {18} L. Wixson, "Detecting Salient Motion by
Accumulating Directionally Consistent Flow," IEEE Trans.
Pattern Anal. Mach. Intell., vol. 22, pp. 774-781, August
2000.
`
`BACKGROUND OF THE INVENTION
`
[0028] Video surveillance of public spaces has become
extremely widespread and accepted by the general public.
Unfortunately, conventional video surveillance systems
produce such prodigious volumes of data that analysis of
the video surveillance data becomes an intractable problem.
`
`[0029] A need exists to reduce the amount of video
`surveillance data so analysis of the video surveillance data
`can be conducted.
`
`[0030] A need exists to filter video surveillance data to
`identify desired portions of the video surveillance data.
`
`SUMMARY OF THE INVENTION
`
[0031] An object of the invention is to reduce the amount
of video surveillance data so analysis of the video
surveillance data can be conducted.

[0032] An object of the invention is to filter video
surveillance data to identify desired portions of the video
surveillance data.

[0033] An object of the invention is to produce a real-time
alarm based on an automatic detection of an event from
video surveillance data.
`
`
`[0034] An object of the invention is to integrate data from
`surveillance sensors other than video for improved searching
`capabilities.
[0035] An object of the invention is to integrate data from
surveillance sensors other than video for improved event
detection capabilities.
`[0036] The invention includes an article of manufacture, a
`method, a system, and an apparatus for video surveillance.
[0037] The article of manufacture of the invention
includes a computer-readable medium comprising software
for a video surveillance system, comprising code segments
for operating the video surveillance system based on video
primitives.
[0038] The article of manufacture of the invention
includes a computer-readable medium comprising software
for a video surveillance system, comprising code segments
for accessing archived video primitives, and code segments
for extracting event occurrences from accessed archived
video primitives.
[0039] The system of the invention includes a computer
system including a computer-readable medium having
software to operate a computer in accordance with the invention.
[0040] The apparatus of the invention includes a computer
including a computer-readable medium having software to
operate the computer in accordance with the invention.
[0041] The article of manufacture of the invention
includes a computer-readable medium having software to
operate a computer in accordance with the invention.
`[0042] Moreover, the above objects and advantages of the
`invention are illustrative, and not exhaustive, of those that
`can be achieved by the invention. Thus, these and other
`objects and advantages of the invention will be apparent
`from the description herein, both as embodied herein and as
`modified in view of any variations which will be apparent to
`those skilled in the art.
`
`Definitions
`
`[0043] A "video" refers to motion pictures represented in
`analog and/or digital form. Examples of video include:
`television, movies, image sequences from a video camera or
`other observer, and computer-generated image sequences.
`[0044] A "frame" refers to a particular image or other
`discrete unit within a video.
`[0045] An "object" refers to an item of interest in a video.
`Examples of an object include: a person, a vehicle, an
`animal, and a physical subject.
`[0046] An "activity" refers to one or more actions and/or
`one or more composites of actions of one or more objects.
`Examples of an activity include: entering; exiting; stopping;
`moving; raising; lowering; growing; and shrinking.
[0047] A "location" refers to a space where an activity
may occur. A location can be, for example, scene-based or
image-based. Examples of a scene-based location include: a
public space; a store; a retail space; an office; a warehouse;
a hotel room; a hotel lobby; a lobby of a building; a casino;
a bus station; a train station; an airport; a port; a bus; a train;
an airplane; and a ship. Examples of an image-based location
include: a video image; a line in a video image; an area
in a video image; a rectangular section of a video image; and
a polygonal section of a video image.
`[0048] An "event" refers to one or more objects engaged
`in an activity. The event may be referenced with respect to
`a location and/or a time.
[0049] A "computer" refers to any apparatus that is
capable of accepting a structured input, processing the
structured input according to prescribed rules, and producing
results of the processing as output. Examples of a
computer include: a computer; a general purpose computer;
a supercomputer; a mainframe; a super mini-computer; a
mini-computer; a workstation; a micro-computer; a server;
an interactive television; a hybrid combination of a computer
and an interactive television; and application-specific
hardware to emulate a computer and/or software. A computer
can have a single processor or multiple processors,
which can operate in parallel and/or not in parallel. A
computer also refers to two or more computers connected
together via a network for transmitting or receiving
information between the computers. An example of such a
computer includes a distributed computer system for
processing information via computers linked by a network.
[0050] A "computer-readable medium" refers to any storage
device used for storing data accessible by a computer.
Examples of a computer-readable medium include: a magnetic
hard disk; a floppy disk; an optical disk, such as a
CD-ROM and a DVD; a magnetic tape; a memory chip; and
a carrier wave used to carry computer-readable electronic
data, such as those used in transmitting and receiving e-mail
or in accessing a network.
[0051] "Software" refers to prescribed rules to operate a
computer. Examples of software include: software; code
segments; instructions; computer programs; and programmed
logic.
[0052] A "computer system" refers to a system having a
computer, where the computer comprises a computer-readable
medium embodying software to operate the computer.
`[0053] A "network" refers to a number of computers and
`associated devices that are connected by communication
`facilities. A network involves permanent connections such
`as cables or temporary connections such as those made
`through telephone or other communication links. Examples
`of a network include: an internet, such as the Internet; an
`intranet; a local area network (LAN); a wide area network
`(WAN); and a combination of networks, such as an internet
`and an intranet.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`[0054] Embodiments of the invention are explained in
`greater detail by way of the drawings, where the same
`reference numerals refer to the same features.
[0055] FIG. 1 illustrates a plan view of the video
surveillance system of the invention.
`[0056] FIG. 2 illustrates a flow diagram for the video
`surveillance system of the invention.
`[0057] FIG. 3 illustrates a flow diagram for tasking the
`video surveillance system.
`[0058] FIG. 4 illustrates a flow diagram for operating the
`video surveillance system.
`
`
`[0059] FIG. 5 illustrates a flow diagram for extracting
`video primitives for the video surveillance system.
`[0060] FIG. 6 illustrates a flow diagram for taking action
`with the video surveillance system.
`
[0061] FIG. 7 illustrates a flow diagram for semi-automatic
calibration of the video surveillance system.

[0062] FIG. 8 illustrates a flow diagram for automatic
calibration of the video surveillance system.

[0063] FIG. 9 illustrates an additional flow diagram for
the video surveillance system of the invention.

[0064] FIGS. 10-15 illustrate examples of the video
surveillance system of the invention applied to monitoring a
grocery store.

DETAILED DESCRIPTION OF THE
INVENTION
`
[0065] The automatic video surveillance system of the
invention monitors a location for purposes such as
market research or security. The system can be a
dedicated video surveillance installation with purpose-built
surveillance components, or the system can be a retrofit to
existing video surveillance equipment that piggybacks off
the surveillance video feeds. The system is capable of
analyzing video data from live sources or from recorded
media. The system can have a prescribed response to the
analysis, such as recording data, activating an alarm
mechanism, or activating another sensor system. The system is
also capable of integrating with other surveillance system
components. The system produces security or market research
reports that can be tailored according to the needs of an
operator and, as an option, can be presented through an
interactive web-based interface or other reporting mechanism.
`
`[0066] An operator is provided with maximum flexibility
`in configuring the system by using event discriminators.
`Event discriminators are identified with one or more objects
`(whose descriptions are based on video primitives), along
`with one or more optional spatial attributes, and/or one or
`more optional temporal attributes. For example, an operator
`can define an event discriminator (called a "loitering" event
`in this example) as a "person" object in the "automatic teller
`machine" space for "longer than 15 minutes" and "between
`10:00 p.m. and 6:00 a.m."
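
The "loitering" example can be sketched in software as follows. This is only an illustrative sketch, not the implementation of the invention: the names `Observation`, `EventDiscriminator`, and `matches` are hypothetical, and the sketch assumes that some earlier stage has already summarized when a classified object entered and left a named area. It also assumes, for simplicity, that only the time of entry is tested against the time window.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass
class Observation:
    """Hypothetical summary of one tracked object inside one named area."""
    classification: str   # e.g. "person"
    area: str             # e.g. "automatic teller machine"
    entered: datetime     # when the object entered the area
    left: datetime        # when the object left the area


@dataclass
class EventDiscriminator:
    """An object plus optional spatial and temporal attributes."""
    classification: str
    area: str
    min_duration: timedelta
    window_start: time
    window_end: time

    def _in_window(self, moment: time) -> bool:
        # The window may wrap past midnight (10:00 p.m. to 6:00 a.m.).
        if self.window_start <= self.window_end:
            return self.window_start <= moment <= self.window_end
        return moment >= self.window_start or moment <= self.window_end

    def matches(self, obs: Observation) -> bool:
        # Simplifying assumption: only the entry time is checked
        # against the time window.
        return (
            obs.classification == self.classification
            and obs.area == self.area
            and obs.left - obs.entered > self.min_duration
            and self._in_window(obs.entered.time())
        )


loitering = EventDiscriminator(
    classification="person",
    area="automatic teller machine",
    min_duration=timedelta(minutes=15),
    window_start=time(22, 0),
    window_end=time(6, 0),
)
```

Under these assumptions, a "person" who stays in the "automatic teller machine" area from 11:00 p.m. to 11:20 p.m. satisfies `loitering.matches`, whereas the same stay at noon does not.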
`
[0067] Although the video surveillance system of the
invention draws on well-known computer vision techniques
from the public domain, the inventive video surveillance
system has several unique and novel features that are not
currently available. For example, current video surveillance
systems use large volumes of video imagery as the primary
commodity of information interchange. The system of the
invention uses video primitives as the primary commodity
with representative video imagery being used as collateral
evidence. The system of the invention can also be calibrated
(manually, semi-automatically, or automatically) and
thereafter can automatically infer video primitives from video
imagery. The system can further analyze previously processed
video without needing to reprocess the video completely.
By analyzing previously processed video, the system
can perform inference analysis based on previously recorded
video primitives, which greatly improves the analysis speed
of the computer system.
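
As an illustration of analysis based on previously recorded video primitives, the following sketch applies a new query to an archive of primitive records without touching any video. The archive format (one JSON record per line), the field names, and the function names are all assumptions made for this example; the specification does not prescribe a storage format.

```python
import json

# Hypothetical archive: one JSON record of video primitives per line.
ARCHIVE = """\
{"frame": 10, "object_id": 1, "classification": "person", "area": "aisle 3"}
{"frame": 11, "object_id": 2, "classification": "vehicle", "area": "loading dock"}
{"frame": 12, "object_id": 1, "classification": "person", "area": "restricted area A"}
"""


def load_primitives(text):
    """Parse archived primitive records; no video decoding is involved."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def extract_event_occurrences(primitives, classification, area):
    """Return the records that satisfy a simple (object, area) query."""
    return [
        p for p in primitives
        if p["classification"] == classification and p["area"] == area
    ]


primitives = load_primitives(ARCHIVE)
hits = extract_event_occurrences(primitives, "person", "restricted area A")
```

Because only the compact records are scanned, a different query (for example, "vehicle" in "loading dock") can be run over the same archive without reprocessing the source video.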
`
[0068] As another example, the system of the invention
provides unique system tasking. Using equipment control
directives, current video systems allow a user to position
video sensors and, in some sophisticated conventional
systems, to mask out regions of interest or disinterest.
Equipment control directives are instructions to control the
position, orientation, and focus of video cameras. Instead of
equipment control directives, the system of the invention
uses event discriminators based on video primitives as the
primary tasking mechanism. With event discriminators and
video primitives, an operator is provided with a much more
intuitive approach over conventional systems for extracting
useful information from the system. Rather than tasking a
system with an equipment control directive, such as "camera
A pan 45 degrees to the left," the system of the invention
can be tasked in a human-intuitive manner with one or more
event discriminators based on video primitives, such as "a
person enters restricted area A."
`
[0069] Using the invention for market research, the
following are examples of the type of video surveillance that
can be performed with the invention: counting people in a
store; counting people in a part of a store; counting people
who stop in a particular place in a store; measuring how long
people spend in a store; measuring how long people spend
in a part of a store; and measuring the length of a line in a
store.
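
Two of the market research measurements listed above, counting people in a part of a store and measuring how long they spend there, can be sketched as follows. The visit records and the function name `summarise` are hypothetical; the sketch assumes that tracking has already produced, for each object, the times at which it entered and left a named part of the store.

```python
from collections import defaultdict

# Hypothetical visit records:
# (object id, area name, enter time in seconds, leave time in seconds).
visits = [
    (1, "sodas", 0.0, 20.0),
    (2, "sodas", 5.0, 29.0),
    (1, "water", 30.0, 33.0),
]


def summarise(visits):
    """Per area: number of distinct visitors and average dwell time."""
    ids = defaultdict(set)
    dwell = defaultdict(list)
    for object_id, area, enter, leave in visits:
        ids[area].add(object_id)
        dwell[area].append(leave - enter)
    return {
        area: {
            "visitors": len(ids[area]),
            "average_dwell_s": sum(dwell[area]) / len(dwell[area]),
        }
        for area in ids
    }


report = summarise(visits)
```

With the sample records, `report["sodas"]` shows two visitors with an average dwell time of 22 seconds, and `report["water"]` shows one visitor with a dwell time of 3 seconds.
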
[0070] Using the invention for security, the following are
examples of the type of video surveillance that can be
performed with the invention: determining when anyone
enters a restricted area and storing associated imagery;
determining when a person enters an area at unusual times;
determining when changes to shelf space and storage space
occur that might be unauthorized; determining when
passengers aboard an aircraft approach the cockpit; determining
when people tailgate through a secure portal; determining if
there is an unattended bag in an airport; and determining if
there is a theft of an asset.
[0071] FIG. 1 illustrates a plan view of the video
surveillance system of the invention. A computer system 11
comprises a computer 12 having a computer-readable medium
13 embodying software to operate the computer 12 according
to the invention. The computer system 11 is coupled to
one or more video sensors 14, one or more video recorders
15, and one or more input/output (I/O) devices 16. The video
sensors 14 can also be optionally coupled to the video
recorders 15 for direct recording of video surveillance data.
The computer system is optionally coupled to other sensors
17.
[0072] The video sensors 14 provide source video to the
computer system 11. Each video sensor 14 can be coupled
to the computer system 11 using, for example, a direct
connection (e.g., a firewire digital camera interface) or a
network. The video sensors 14 can exist prior to installation
of the invention or can be installed as part of the invention.
Examples of a video sensor 14 include: a video camera; a
digital video camera; a color camera; a monochrome camera;
a camera; a camcorder; a PC camera; a webcam; an
infra-red video camera; and a CCTV camera.
`
[0073] The video recorders 15 receive video surveillance
data from the computer system 11 for recording and/or
provide source video to the computer system 11. Each video
recorder 15 can be coupled to the computer system 11 using,
for example, a direct connection or a network. The video
recorders 15 can exist prior to installation of the invention or
can be installed as part of the invention. Examples of a video
recorder 15 include: a video tape recorder; a digital video
recorder; a video disk; a DVD; and a computer-readable
medium.
`
[0074] The I/O devices 16 provide input to and receive
output from the computer system 11. The I/O devices 16 can
be used to task the computer system 11 and produce reports
from the computer system 11. Examples of I/O devices 16
include: a keyboard; a mouse; a stylus; a monitor; a printer;
another computer system; a network; and an alarm.
`
[0075] The other sensors 17 provide additional input to the
computer system 11. Each other sensor 17 can be coupled to
the computer system 11 using, for example, a direct
connection or a network. The other sensors 17 can exist prior to
installation of the invention or can be installed as part of the
invention. Examples of another sensor 17 include: a motion
sensor; an optical tripwire; a biometric sensor; and a
card-based or keypad-based authorization system. The outputs of
the other sensors 17 can be recorded by the computer system
11, recording devices, and/or recording systems.
`
`[0076] FIG. 2 illustrates a flow diagram for the video
`surveillance system of the invention. Various aspects of the
`invention are exemplified with reference to FIGS. 10-15,
`which illustrate examples of the video surveillance system
`of the invention applied to monitoring a grocery store.
`
[0077] In block 21, the video surveillance system is set up
as discussed for FIG. 1. Each video sensor 14 is orientated
to a location for video surveillance. The computer system 11
is connected to the video feeds from the video equipment 14
and 15. The video surveillance system can be implemented
using existing equipment or newly installed equipment for
the location.
`
[0078] In block 22, the video surveillance system is
calibrated. Once the video surveillance system is in place from
block 21, calibration occurs. The result of block 22 is the
ability of the video surveillance system to determine an
approximate absolute size and speed of a particular object
(e.g., a person) at various places in the video image provided
by the video sensor. The system can be calibrated using
manual calibration, semi-automatic calibration, or automatic
calibration. Calibration is further described after the
discussion of block 24.
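
One way to picture the result of calibration is as a table that relates position in the image to image scale. The sketch below is not the calibration procedure of the invention; it assumes a hypothetical table giving the pixel height of a reference person of assumed height (6 feet) at a few image rows, interpolates linearly between the rows, and uses the resulting scale to estimate approximate absolute size and speed. All names and numbers are illustrative.

```python
from bisect import bisect_left

# Hypothetical calibration table: image row of an object's feet ->
# pixel height of a reference person standing at that row.
CALIBRATION = [(100, 40.0), (300, 120.0), (500, 200.0)]
REFERENCE_HEIGHT_FT = 6.0  # assumed height of the reference person


def feet_per_pixel(row):
    """Linearly interpolate the image scale at a given image row."""
    rows = [r for r, _ in CALIBRATION]
    if row <= rows[0]:
        pixel_height = CALIBRATION[0][1]
    elif row >= rows[-1]:
        pixel_height = CALIBRATION[-1][1]
    else:
        i = bisect_left(rows, row)
        (r0, h0), (r1, h1) = CALIBRATION[i - 1], CALIBRATION[i]
        pixel_height = h0 + (h1 - h0) * (row - r0) / (r1 - r0)
    return REFERENCE_HEIGHT_FT / pixel_height


def approximate_height_ft(row, pixel_height):
    """Approximate absolute height of an object seen at a given row."""
    return pixel_height * feet_per_pixel(row)


def approximate_speed_ft_per_s(row, pixels_moved, seconds):
    """Approximate absolute speed of an object moving near a given row."""
    return pixels_moved * feet_per_pixel(row) / seconds
```

For instance, an object 60 pixels tall at row 300 would be estimated at about 3 feet under these assumed table values, since the reference person occupies 120 pixels there.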
`
[0079] In block 23 of FIG. 2, the video surveillance
system is tasked. Tasking occurs after calibration in block 22
and is optional. Tasking the video surveillance system
involves specifying one or more event discriminators.
Without tasking, the video surveillance system operates by
detecting and archiving video primitives and associated
video imagery without taking any action, as in block 45 in
FIG. 4.
`
[0080] FIG. 3 illustrates a flow diagram for tasking the
video surveillance system to determine event discriminators.
An event discriminator refers to one or more objects
optionally interacting with one or more spatial attributes and/or
one or more temporal attributes. An event discriminator is
described in terms of video primitives. A video primitive
refers to an observable attribute of an object viewed in a
video feed. Examples of video primitives include the
following: a classification; a size; a shape; a color; a texture; a
position; a velocity; a speed; an internal motion; a motion;
a salient motion; a feature of a salient motion; a scene
change; a feature of a scene change; and a pre-defined
model.
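
A video primitive can be pictured as a small record of observable attributes attached to an object. The following sketch is one possible representation, offered only as an assumption: the class name `VideoPrimitive`, its field names, and the choice of which attributes to include are hypothetical and cover only a few of the primitives listed above.

```python
from dataclasses import dataclass, asdict
from typing import Optional, Tuple


@dataclass
class VideoPrimitive:
    """Hypothetical per-object, per-frame record of observable attributes."""
    object_id: int
    frame: int
    classification: Optional[str] = None             # e.g. "person"
    size_pixels: Optional[int] = None                # number of image pixels
    average_rgb: Optional[Tuple[int, int, int]] = None
    position: Optional[Tuple[float, float]] = None   # image coordinates
    velocity: Optional[Tuple[float, float]] = None   # (dx, dy) per frame

    @property
    def speed(self) -> Optional[float]:
        """Speed is the magnitude of the velocity, when velocity is known."""
        if self.velocity is None:
            return None
        dx, dy = self.velocity
        return (dx * dx + dy * dy) ** 0.5


p = VideoPrimitive(object_id=7, frame=120, classification="person",
                   position=(64.0, 200.0), velocity=(3.0, 4.0))
record = asdict(p)  # plain dict, convenient for archiving
```

Attributes that have not been measured are simply left as `None`, so an event discriminator can test only the primitives it needs.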
`
`[0081] A classification refers to an identification of an
`object as belonging to a particular category or class.
`Examples of a classification include: a person; a dog; a
`vehicle; a police car; an individual person; and a specific
`type of object.
`
`[0082] A size refers to a dimensional attribute of an object.
Examples of a size include: large; medium; small; flat; taller
`than 6 feet; shorter than 1 foot; wider than 3 feet; thinner
`than 4 feet; about human size; bigger than a human; smaller
`than a human; about the size of a car; a rectangle in an image
`with approximate dimensions in pixels; and a number of
`image pixels.
`
`[0083] A color refers to a chromatic attribute of an object.
`Examples of a color include: white; black; grey; red; a range
of HSV values; a range of YUV values; a range of RGB
`values; an average RGB value; an average YUV value; and
`a histogram of RGB values.
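
Two of the color examples above, an average RGB value and a histogram of RGB values, can be computed as in the following sketch. The pixel list, the function names, and the choice of four bins per channel are assumptions made for illustration.

```python
from collections import Counter


def average_rgb(pixels):
    """Mean of each channel over an object's pixels."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def rgb_histogram(pixels, bins_per_channel=4):
    """Coarse histogram: each 0-255 channel is quantised into equal bins.

    Assumes bins_per_channel divides 256 evenly.
    """
    width = 256 // bins_per_channel
    return Counter(tuple(v // width for v in p) for p in pixels)


# Hypothetical object made of three reddish pixels and one white pixel.
pixels = [(255, 0, 0), (250, 10, 5), (240, 0, 0), (255, 255, 255)]
mean = average_rgb(pixels)
hist = rgb_histogram(pixels)
```

Here the three reddish pixels fall into the same histogram bin, so the histogram summarizes the object as mostly red even though the average is pulled toward white by the fourth pixel.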
`
`[0084] A texture refers to a pattern attribute of an object.
Examples of texture features include: self-similarity;
spectral power; linearity; and coarseness.
`
`[0085] An internal motion refers to a measure of the
`rigidity of an object. An example of a fairly rigid object is
`a car, which does not exhibit a great amount of internal
`motion. An example of a fairly non-rigid object is a person
`having swinging arms and legs, which exhibits a great
`amount of internal motion.
`
[0086] A motion refers to any motion that can be
automatically detected. Examples of a motion include:
appearance of an object; disappearance of an object; a vertical
movement of an object; a horizontal movement of an object;
and a periodic movement of an object.
`
`[0087] A salient motion refers to any motion that can be
`automatically detected and can be tracked for some period of
`time. Such a moving object exhibits apparently purposeful
`motion. Examples of a salient motion include: moving from
`one place to another; and moving to interact with another
`object.
`
[0088] A feature of a salient motion refers to a property of
a salient motion. Examples of a feature of a salient motion
include: a trajectory; a length of a trajectory in image space;
an approximate length of a trajectory in a three-dimensional
representation of the environment; a position of an object in
image space as a function of time; an approximate position
of an object in a three-dimensional representation of the
environment as a function of time; a duration of a trajectory;
a velocity (e.g., speed and direction) in image space; an
approximate velocity (e.g., speed and direction) in a
three-dimensional representation of the environment; a duration of
time at a velocity; a change of velocity in image space; an
approximate change of velocity in a three-dimensional
representation of the environment; a duration of a change of
velocity; cessation of motion; and a duration of cessation of
motion. A velocity refers to the speed and direction of an
object at a particular time. A trajectory refers to a set of
(position, velocity) pairs for an object for as long as the
object can be tracked or for a time period.
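
Several of these features can be derived from a tracked object's positions over time. The sketch below assumes a hypothetical track stored as timestamped image positions, from which velocities are derived, rather than as stored (position, velocity) pairs; the function names and the speed threshold used to decide that motion has ceased are likewise assumptions.

```python
from math import hypot

# Hypothetical track: (time in seconds, x, y) in image coordinates.
track = [
    (0.0, 0.0, 0.0),
    (1.0, 3.0, 4.0),
    (2.0, 3.0, 4.0),
    (3.0, 3.0, 4.0),
    (4.0, 6.0, 8.0),
]


def trajectory_length(track):
    """Length of the trajectory in image space."""
    return sum(
        hypot(x1 - x0, y1 - y0)
        for (_, x0, y0), (_, x1, y1) in zip(track, track[1:])
    )


def trajectory_duration(track):
    """Duration of the trajectory."""
    return track[-1][0] - track[0][0]


def velocities(track):
    """Velocity (vx, vy) in image space for each step of the track."""
    return [
        ((x1 - x0) / (t1 - t0), (y1 - y0) / (t1 - t0))
        for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:])
    ]


def longest_cessation(track, min_speed=0.5):
    """Longest continuous time during which speed stays below min_speed."""
    longest = current = 0.0
    steps = zip(track, track[1:], velocities(track))
    for (t0, _, _), (t1, _, _), (vx, vy) in steps:
        if hypot(vx, vy) < min_speed:
            current += t1 - t0
            longest = max(longest, current)
        else:
            current = 0.0
    return longest
```

For the sample track, the object moves 5 pixels, pauses for 2 seconds, and then moves another 5 pixels, so the trajectory length is 10 pixels over a duration of 4 seconds.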
`
`
`[0089] A scene change refers to any region of a scene that
`can be detected as changing over a period of time. Examples
of a scene change include: a stationary object leaving a
`scene; an object entering a scene and becoming stationary;
`an object changing position in a scene; and an object
`changing appearance (e.g. color, shape, or size).
`
`[0090] A feature of a scene change refers to a property of
`a scene change. Examples of a feature of a scene change
`include: a size of a scene change in image space; an
`approximate size of a scene change in a three-dimensional
`representation of the environment; a time at which a scene
`change occurred; a location of a scene change in image
`space; and an approximate location of a scene change in a
`three-dimensional representation of the environment.
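
The image-space features of a scene change (its size, location, and time) can be illustrated by comparing two frames. The sketch below uses simple per-pixel differencing with a fixed threshold, which is an assumption chosen for brevity and not the change detection technique of the invention; the frames are tiny hypothetical grayscale grids.

```python
def scene_change_features(before, after, timestamp, threshold=30):
    """Size and location (bounding box) of the pixels that changed."""
    changed = [
        (row, col)
        for row, (line_a, line_b) in enumerate(zip(before, after))
        for col, (a, b) in enumerate(zip(line_a, line_b))
        if abs(a - b) > threshold
    ]
    if not changed:
        return None
    rows = [r for r, _ in changed]
    cols = [c for _, c in changed]
    return {
        "size_pixels": len(changed),
        # (top row, left column, bottom row, right column)
        "bounding_box": (min(rows), min(cols), max(rows), max(cols)),
        "time": timestamp,
    }


before = [[10, 10, 10, 10],
          [10, 10, 10, 10],
          [10, 10, 10, 10]]
after = [[10, 10, 10, 10],
         [10, 200, 200, 10],
         [10, 10, 200, 10]]
features = scene_change_features(before, after, timestamp=12.5)
```

With these grids, three pixels change, and the reported location is the bounding box spanning rows 1 to 2 and columns 1 to 2.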
`
`[0091] A pre-defined model refers to an a priori known
model of an object. Examples of a pre-defined model include: an
`adult; a child; a vehicle; and a semi-trailer.
`
[0092] In block 31, one or more object types of interest
are identified in terms of video primitives or abstractions
thereof. Examples of one or more objects include: an object;
a person; a red object; two objects; two persons; and a
vehicle.
`
[0093] In block 32, one or more spatial areas of interest are
identified. An are