`
TO REQUEST FOR EX PARTE REEXAMINATION OF
U.S. PATENT NO. 7,868,912

AVIGILON EX. 2005
IPR2019-00314
Page 1 of 18
`
`
`
Moving Object Detection and Event Recognition Algorithms for Smart Cameras

Thomas J. Olson
Frank Z. Brill
Texas Instruments
Research & Development
P.O. Box 655303, MS 8374, Dallas, TX 75265
E-mail: olson@csc.ti.com, brill@ti.com
http://www.ti.com/research/docs/iuba/index.html
`
Abstract

Smart video cameras analyze the video stream and translate it into a description of the scene in terms of objects, object motions, and events. This paper describes a set of algorithms for the core computations needed to build smart cameras. Together these algorithms make up the Autonomous Video Surveillance (AVS) system, a general-purpose framework for moving object detection and event recognition. Moving objects are detected using change detection, and are tracked using first-order prediction and nearest neighbor matching. Events are recognized by applying predicates to the graph formed by linking corresponding objects in successive frames. The AVS algorithms have been used to create several novel video surveillance applications. These include a video surveillance shell that allows a human to monitor the outputs of multiple cameras, a system that takes a single high-quality snapshot of every person who enters its field of view, and a system that learns the structure of the monitored environment by watching humans move around in the scene.
`
`1 Introduction
`
Video cameras today produce images, which must be examined by humans in order to be useful. Future 'smart' video cameras will produce information, including descriptions of the environment they are monitoring and the events taking place in it. The information they produce may include images and video clips, but these will be carefully selected to maximize their useful information content. The symbolic information and images from smart cameras will be filtered by programs that extract data relevant to particular tasks. This filtering process will enable a single human to monitor hundreds or thousands of video streams.

In pursuit of our research objectives [Flinchbaugh, 1997], we are developing the technology needed to make smart cameras a reality. Two fundamental capabilities are needed. The first is the ability to describe scenes in terms of object motions and interactions. The second is the ability to recognize important events that occur in the scene, and to pick out those that are relevant to the current task. These capabilities make it possible to develop a variety of novel and useful video surveillance applications.

(The research described in this report was sponsored in part by the DARPA Image Understanding Program.)

1.1 Video Surveillance and Monitoring Scenarios

Our work is motivated by several types of video surveillance and monitoring scenarios.
`
Indoor Surveillance: Indoor surveillance provides information about areas such as building lobbies, hallways, and offices. Monitoring tasks in lobbies and hallways include detection of people depositing things (e.g., unattended luggage in an airport lounge), removing things (e.g., theft), or loitering. Office monitoring tasks typically require information about people's identities: in an office, for example, the office owner may do anything at any
`
`159
`
`AVIGILON EX. 2005
`IPR2019-00314
`Page 2 of 18
`
`
`
time, but other people should not open desk drawers or operate the computer unless the owner is present. Cleaning staff may come in at night to vacuum and empty trash cans, but should not handle objects on the desk.
`
Outdoor Surveillance: Outdoor surveillance includes tasks such as monitoring a site perimeter for intrusion or threats from vehicles (e.g., car bombs). In military applications, video surveillance can function as a sentry or forward observer, e.g. by notifying commanders when enemy soldiers emerge from a wooded area or cross a road.
`
In order for smart cameras to be practical for real-world tasks, the algorithms they use must be robust. Current commercial video surveillance systems have a high false alarm rate [Ringler and Hoover, 1995], which renders them useless for most applications. For this reason, our research stresses robustness and quantification of detection and false alarm rates. Smart camera algorithms must also run effectively on low-cost platforms, so that they can be implemented in small, low-power packages and can be used in large numbers. Studying algorithms that can run in near real time makes it practical to conduct extensive evaluation and testing of systems, and may enable worthwhile near-term applications as well as contributing to long-term research goals.
`
`1.2 Approach
`
The first step in processing a video stream for surveillance purposes is to identify the important objects in the scene. In this paper it is assumed that the important objects are those that move independently. Camera parameters are assumed to be fixed. This allows the use of simple change detection to identify moving objects. Where use of moving cameras is necessary, stabilization hardware and stabilized moving object detection algorithms can be used (e.g., [Burt et al., 1989, Nelson, 1991]). The use of criteria other than motion (e.g., salience based on shape or color, or more general object recognition) is compatible with our approach, but these criteria are not used in our current applications.
`
Our event recognition algorithms are based on graph matching. Moving objects in the image are tracked over time. Observations of an object in successive video frames are linked to form a directed graph (the motion graph). Events are defined in terms of predicates on the motion graph. For instance, the beginning of a chain of successive observations of an object is defined to be an ENTER event. Event detection is described in more detail below.
`
Our approach to video surveillance stresses 2D, image-based algorithms and simple, low-level object representations that can be extracted reliably from the video sequence. This emphasis yields a high level of robustness and low computational cost. Object recognition and other detailed analyses are used only after the system has determined that the objects in question are interesting and merit further investigation.
`
`1.3 Research Strategy
`
The primary technical goal of this research is to develop general-purpose algorithms for moving object detection and event recognition. These algorithms comprise the Autonomous Video Surveillance (AVS) system, a modular framework for building video surveillance applications. AVS is designed to be updated to incorporate better core algorithms or to tune the processing to specific domains as our research progresses.
`
In order to evaluate the AVS core algorithms and event recognition and tracking framework, we use them to develop applications motivated by the surveillance scenarios described above. The applications are small-scale implementations of future smart camera systems. They are designed for long-term operation, and are evaluated by allowing them to run for long periods (hours or days) and analyzing their output.
`
The remainder of this paper is organized as follows. The next section discusses related work. Section 3 presents the core moving object detection and event recognition algorithms, and the mechanism used to establish the 3D positions of objects. Section 4 presents applications that have been built using the AVS framework. The final section discusses the current state of the system and our future plans.
`
`160
`
`AVIGILON EX. 2005
`IPR2019-00314
`Page 3 of 18
`
`
`
`2 Related Work
`
`'
`
`y
`
Our overall approach to video surveillance has been influenced by interest in selective attention and task-oriented processing [Swain and Stricker, 1991, Rimey and Brown, 1993, Camus et al., 1993]. The fundamental problem with current video surveillance technology is that the useful information density of the images delivered to a human is very low; the vast majority of surveillance video frames contain no useful information at all. The fundamental role of the smart camera described above is to reduce the volume of data produced by the camera, and increase the value of that data. It does this by discarding irrelevant frames, and by expressing the information in the relevant frames primarily in symbolic form.
`
2.1 Moving Object Detection
`
Most algorithms for moving object detection using fixed cameras work by comparing incoming video frames to a reference image, and attributing significant differences either to motion or to noise. The algorithms differ in the form of the comparison operator they use, and in the way in which the reference image is maintained. Simple intensity differencing followed by thresholding is widely used [Jain et al., 1979, Yalamanchili et al., 1982, Kelly et al., 1995, Bobick and Davis, 1996, Courtney, 1997] because it is computationally inexpensive and works quite well in many indoor environments. Some algorithms provide a means of adapting the reference image over time, in order to track slow changes in lighting conditions and/or changes in the environment [Karmann and von Brandt, 1990, Makarov, 1996a]. Some also filter the image to reduce or remove low spatial frequency content, which again makes the detector less sensitive to lighting changes [Makarov et al., 1996b, Koller et al., 1994].
`
Recent work [Pentland, 1996, Kahn et al., 1996] has extended the basic change detection paradigm by replacing the reference image with a statistical model of the background. The comparison operator becomes a statistical test that estimates the probability that the observed pixel value belongs to the background.
`
Our baseline change detection algorithm uses thresholded absolute differencing, since this works well for our indoor surveillance scenarios. For applications where lighting change is a problem, we use the adaptive reference frame algorithm of Karmann and von Brandt [1990]. We are also experimenting with a probabilistic change detector similar to Pfinder [Pentland, 1996].
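The adaptive reference frame idea can be illustrated with a simplified per-pixel blending update. This is a sketch only: the actual Karmann and von Brandt [1990] algorithm uses a Kalman-filter formulation, and the function name and gain values below are our own assumptions.

```python
import numpy as np

def update_reference(reference, frame, foreground_mask,
                     alpha_bg=0.05, alpha_fg=0.001):
    """Blend the current frame into the reference image.
    Background pixels adapt quickly, tracking slow lighting drift;
    pixels flagged as foreground adapt very slowly, so moving objects
    are not absorbed into the reference."""
    alpha = np.where(foreground_mask, alpha_fg, alpha_bg)
    return (1.0 - alpha) * reference + alpha * frame
```

Run once per frame, this keeps the reference image close to the true background while remaining insensitive to transient foreground objects.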
`
Our work assumes fixed cameras. When the camera is not fixed, simple change detection cannot be used because of background motion. One approach to this problem is to treat the scene as a collection of independently moving objects, and to detect and ignore the visual motion due to camera motion [e.g., Burt et al., 1989]. Other researchers have proposed ways of detecting features of the optical flow that are inconsistent with a hypothesis of self motion [Nelson, 1991].
`
In many of our applications moving object detection is a prelude to person detection. There has been significant recent progress in the development of algorithms to locate and track humans. Pfinder (cited above) uses a coarse statistical model of human body geometry and motion to estimate the likelihood that a given pixel is part of a human. Several researchers have described methods of tracking human body and limb movements [Gavrila and Davis, 1996, Kakadiaris and Metaxas, 1996] and locating faces in images [Sung and Poggio, 1994, Rowley et al., 1996]. Intille and Bobick [1995] describe methods of tracking humans through episodes of mutual occlusion in a highly structured environment. We do not currently make use of these techniques in live experiments because of their computational cost. However, we expect that this type of analysis will eventually be an important part of smart camera processing.
`
2.2 Event Recognition
`
Most work on event recognition has focussed on events that consist of a well-defined sequence of primitive motions. This class of events can be converted into spatiotemporal patterns and recognized using statistical pattern matching techniques. A number of researchers have demonstrated algorithms for recognizing gestures and sign language [e.g., Starner and Pentland, 1995]. Bobick and Davis [1996] describe a method of recognizing stereotypical motion patterns corresponding to actions such as sitting down, walking, or waving.
`
Our approach to event recognition is based on the video database indexing work of Courtney [1997], which introduced the use of predicates on the motion graph to represent events. Motion graphs are well suited to representing abstract, generic events such as 'depositing an object' or 'coming to rest', which are difficult to capture using the pattern-based approaches referred to above. On the other hand, pattern-based approaches can represent complex motions such as 'throwing an object' or 'waving', which would be difficult to express using motion graphs. It is likely that both pattern-based and abstract event recognition techniques will be needed to handle the full range of events that are of interest in surveillance applications.
`
3 AVS Tracking and Event Recognition Algorithms
`
This section describes the core technologies that provide the video surveillance and monitoring capabilities of the AVS system. There are three key technologies: moving object detection, visual tracking, and event recognition. The moving object detection routines determine when one or more objects enter a monitored scene, decide which pixels in a given video frame correspond to the moving objects versus which pixels correspond to the background, and form a simple representation of the object's image in the video frame. This representation is referred to as a motion region, and it exists in a single video frame, as distinguished from the world objects which exist in the world and give rise to the motion regions.
`
Visual tracking consists of determining correspondences between the motion regions over a sequence of video frames, and maintaining a single representation, or track, for the world object which gave rise to the sequence of motion regions in the sequence of frames. Finally, event recognition is a means of analyzing the collection of tracks in order to identify events of interest involving the world objects represented by the tracks.
`
The moving object detection technology we employ is a 2D change detection technique similar to that described in Jain et al. [1979] and Yalamanchili et al. [1982]. Prior to activation of the monitoring system, an image of the background, i.e., an image of the scene which contains no moving or otherwise interesting objects, is captured to serve as the reference image. When the system is in operation, the absolute difference of the current video frame from the reference image is computed to produce a difference image. The difference image is then thresholded at an appropriate value to obtain a binary image in which the "off" pixels represent background pixels, and the "on" pixels represent "moving object" pixels. The four-connected components of moving object pixels in the thresholded image are the motion regions (see Figure 1).
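The steps just described — absolute differencing against the reference image, thresholding, and extraction of four-connected components — can be sketched as follows. This is a minimal grayscale illustration with our own function name and return format, not the AVS implementation.

```python
import numpy as np

def detect_motion_regions(frame, reference, threshold):
    """Threshold the absolute difference against the reference image,
    then label four-connected components of 'on' pixels.
    Returns a list of (pixel_count, (x0, y0, x1, y1)) motion regions."""
    diff = np.abs(frame.astype(np.int32) - reference.astype(np.int32))
    binary = diff > threshold                    # True = "moving object" pixel
    labels = np.zeros(binary.shape, dtype=np.int32)
    regions = []
    for y, x in zip(*np.nonzero(binary)):
        if labels[y, x]:
            continue
        # flood-fill one four-connected component
        region_id = len(regions) + 1
        stack, pixels = [(y, x)], []
        labels[y, x] = region_id
        while stack:
            cy, cx = stack.pop()
            pixels.append((cy, cx))
            for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                if (0 <= ny < binary.shape[0] and 0 <= nx < binary.shape[1]
                        and binary[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = region_id
                    stack.append((ny, nx))
        ys = [p[0] for p in pixels]
        xs = [p[1] for p in pixels]
        regions.append((len(pixels), (min(xs), min(ys), max(xs), max(ys))))
    return regions
```

A production system would use an optimized connected-component labeler, but the logic is the same.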
`
Simple application of the object detection procedure outlined above results in a number of errors, largely due to the limitations of thresholding. If the threshold used is too low, camera noise and shadows will produce spurious objects, whereas if the threshold is too high, some portions of the objects in the scene will fail to be separated from the background, resulting in breakup, in which a single world object gives rise to several motion regions within a single frame. Our general approach is to allow breakup but use grouping heuristics to merge multiple clustered components into a single motion region and maintain a one-to-one correspondence between motion regions and world objects within each frame.
`
One grouping technique we employ is 2D morphological dilation of the motion regions. This enables the system to merge connected components separated by a few pixels, but using this technique to span large gaps results in a severe performance degradation. Moreover, dilation in the image space may result in incorrectly merging distinct objects which are nearby in the image (a few pixels), but are in fact separated by a large distance in the world (a few feet).
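A minimal four-connected binary dilation of the kind used for this grouping might look like the following sketch; the function name and iteration parameter are our own.

```python
import numpy as np

def dilate(binary, iterations=1):
    """Four-connected binary dilation: grow each 'on' region by one
    pixel per iteration, so components separated by small gaps merge
    when the dilated image is re-labeled."""
    out = binary.copy()
    for _ in range(iterations):
        grown = out.copy()
        grown[1:, :] |= out[:-1, :]    # grow downward
        grown[:-1, :] |= out[1:, :]    # grow upward
        grown[:, 1:] |= out[:, :-1]    # grow rightward
        grown[:, :-1] |= out[:, 1:]    # grow leftward
        out = grown
    return out
```

As the text notes, each extra iteration widens the gap that will be bridged, so large iteration counts both cost time and risk merging genuinely distinct objects.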
`
If 3D information is available, the connected component grouping algorithm makes use of an estimate of the size (in world coordinates) of the objects in the image. The bounding boxes of the connected components are expanded vertically and horizontally by a distance measured in feet (rather than pixels), and connected components with overlapping bounding boxes are merged into a single motion region. The technique for estimating the size of the objects in the image is described in section 3.4 below.
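The world-coordinate grouping step can be sketched as below, assuming boxes are given as (x0, y0, x1, y1) tuples in feet; the function name and the repeated pairwise merge are our own illustration.

```python
def merge_world_boxes(boxes, expand_ft):
    """Expand each bounding box (in world feet) by expand_ft on every
    side, and repeatedly merge boxes whose expanded extents overlap,
    until no more merges are possible."""
    def overlaps(a, b):
        ax0, ay0, ax1, ay1 = a
        bx0, by0, bx1, by1 = b
        e = expand_ft
        return (ax0 - e <= bx1 + e and bx0 - e <= ax1 + e and
                ay0 - e <= by1 + e and by0 - e <= ay1 + e)
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if overlaps(merged[i], merged[j]):
                    a, b = merged[i], merged.pop(j)
                    # union of the two boxes becomes one motion region
                    merged[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                 max(a[2], b[2]), max(a[3], b[3]))
                    changed = True
                    break
            if changed:
                break
    return merged
```

Because the expansion distance is expressed in feet, two regions a few pixels apart in the image but several feet apart in the world are left unmerged, avoiding the failure mode of image-space dilation described above.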
`
The function of the AVS tracking routine is to establish correspondences between the motion regions in the current frame and those in the previous frame. We use the technique of Courtney [1997], which proceeds as follows. First assume that we have computed 2D velocity estimates for the motion regions in the previous frame. These velocity estimates, together with the locations of the centroids in the previous frame, are used to project the locations of the centroids of the motion regions into the current frame. Then, a mutual nearest-neighbor criterion is used to establish correspondences.
`
Let P be the set of motion region centroid locations in the previous frame, with p_i one such location. Let p'_i be the projected location of p_i in the current frame, and let P' be the set of all such projected locations in the current frame. Let C be the set of motion region centroid locations in the current frame. If the distance between p'_i and c_j in C is the smallest for all elements of C, and this distance is also the smallest of the distances between c_j and all elements of P' (i.e., p_i and c_j are mutual nearest neighbors), then establish a correspondence between p_i and c_j by creating a bidirectional strong link between them. Use the difference in time and space between p_i and c_j to determine a velocity estimate for c_j, expressed in pixels per second. If there is an existing track containing p_i, add c_j to it. Otherwise, establish a new track, and add both p_i and c_j to it.
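The mutual nearest-neighbor test can be sketched as follows; the function takes the projected previous-frame centroids and the current-frame centroids as coordinate lists, and the names are our own.

```python
import math

def mutual_nearest_neighbors(projected, current):
    """Match projected previous-frame centroids p'_i to current-frame
    centroids c_j when each is the other's nearest neighbor.
    Returns a list of (i, j) strong-link index pairs."""
    if not projected or not current:
        return []
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    links = []
    for i, p in enumerate(projected):
        j = min(range(len(current)), key=lambda k: dist(p, current[k]))
        # c_j's nearest projected centroid must be p'_i as well
        back = min(range(len(projected)),
                   key=lambda k: dist(current[j], projected[k]))
        if back == i:
            links.append((i, j))
    return links
```

Centroids left unmatched by this test are exactly the candidates for weak links and for event hypotheses such as enter, exit, deposit, and remove.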
`
`
The strong links form the basis of the tracks with a high confidence of their correctness. Video objects which do not have mutual nearest neighbors in the adjacent frame may fail to form correspondences because the underlying world object is involved in an event (e.g., enter, exit, deposit, remove). In order to assist in the identification of these events, objects without strong links are given unidirectional weak links to their (non-mutual) nearest neighbors. The weak links represent potential ambiguity in the tracking process. The motion regions in all of the frames, together with their strong and weak links, form a motion graph.
`
Figure 2 depicts a sample motion graph. In the figure, each frame is one-dimensional, and is represented by a vertical line (F0 – F18). Circles represent objects in the scene, the dark arrows represent strong links, and the gray arrows represent weak links. An object enters the scene in frame F1, and then moves through the scene until frame F4, where it deposits a second object. The first object continues to move through the scene, and exits in frame F6. The deposited object remains stationary. At frame F8 another object enters the scene, temporarily occludes the stationary object at frame F10 (or is occluded by it), and then proceeds to move past the stationary object. This second moving object reverses direction around frames F13 and F14, returns to remove the stationary object in frame F16, and finally exits at frame F17. An additional object enters in frame F5 and exits in frame F8 without interacting with any other object.
`
[Figure 2: A sample motion graph.]

As indicated by the striped fill patterns in Figure 2, the correct correspondences for the tracks are ambiguous after object interactions such as the occlusion in frame F10. The AVS system resolves the ambiguity where possible by preferring to match moving objects with moving objects, and stationary objects with stationary objects. The distinction between moving and stationary tracks is computed using thresholds on the velocity estimates, and hysteresis for stabilizing transitions between moving and stationary. Following an occlusion (which may last for several frames), the frames immediately before and after the occlusion are compared (e.g., frames F9 and F11 in Figure 2). The AVS system examines each stationary object in the pre-occlusion frame, and searches for its correspondent in the post-occlusion frame (which should be exactly where it was before, since the object is stationary). This procedure resolves a large portion of the tracking ambiguities. General resolution of ambiguities resulting from multiple moving objects in the scene is a topic for further research. The AVS system may benefit from inclusion of a "closed world tracking" facility such as that described by Intille and Bobick [1995].
`
`'
`
`.
`
`' ~-
`
`y
`
Certain features of tracks and pairs of tracks correspond to events. For example, the beginning of a track corresponds to an ENTER event, and the end corresponds to an EXIT event. In an online event detection system, it is preferable to detect the event as near in time as possible to the actual occurrence of the event. The previous system which used motion graphs for event detection [Courtney, 1997] operated in a batch mode, and required multiple passes over the motion graph, precluding online operation. The AVS system detects events in a single pass over the motion graph, as the graph is created. However, in order to reduce errors due to noise, the AVS system introduces a slight delay of n frame times (n=3 in the current implementation) before reporting certain events. For example, in Figure 2, an enter event occurs at frame F1. The AVS system requires the track to be maintained for n frames before reporting the enter event. If the track is not maintained for the required number of frames, it is ignored, and the enter event is not reported; e.g., if n > 4, the object in Figure 2 which enters in frame F5 and exits in frame F8 will not generate any events.

A track that splits into two tracks, one of which is moving and the other of which is stationary, corresponds to a DEPOSIT event. If a moving track intersects a stationary track, and then continues to move, but the stationary track ends at the intersection, this corresponds to a REMOVE event. The remove event can be confirmed as soon as the remover disoccludes the location of the stationary object which was removed, and the system can determine that the stationary object is no longer at that location.
`
`164
`
`AVIGILON EX. 2005
`IPR2019-00314
`Page 7 of 18
`
`
`
`'""'""(11"1
`,r.x··m·:,-
`t:..r
`
`~
`
In a manner similar to the occlusion situation described above in section 3.2, the deposit event also gives rise to ambiguity as to which object is the depositor, and which is the depositee. For example, it may have been that the object which entered at frame F1 of Figure 2 stopped at frame F4 and deposited a moving object, and it is the deposited object which then proceeded to exit the scene at F6. Again, the AVS system relies on a moving vs. stationary distinction to resolve the ambiguity, and insists that the depositee remain stationary after a deposit event. The AVS system requires both the depositor and the depositee tracks to extend for n frames past the point at which the tracks separate (e.g., past frame F5 in Figure 2), and that the deposited object remain stationary; otherwise no deposit event is generated.
`
Also detected (but not illustrated in Figure 2) are REST events (when a moving object comes to a stop), and MOVE events (when a RESTing object begins to move again). Finally, one further event that is detected is the LIGHTSOUT event, which occurs whenever a large change occurs over the entire image. The motion graph need not be consulted to detect this event.
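The n-frame reporting delay for ENTER and EXIT events might be sketched as follows, assuming a simple dict-based track record updated once per frame; the field names and calling convention are our own, not the AVS data structures.

```python
def detect_enter_exit(tracks, current_frame, n=3):
    """Report ENTER for tracks that have survived n frames, and EXIT
    for reported tracks whose last observation is at least n frames
    old. Tracks shorter than n frames generate no events at all,
    suppressing noise-induced spurious tracks."""
    events = []
    for t in tracks:
        if not t.get('entered') and current_frame - t['first_frame'] >= n:
            t['entered'] = True
            events.append(('ENTER', t['id'], t['first_frame']))
        if (t.get('entered') and not t.get('exited')
                and t['last_frame'] + n <= current_frame):
            t['exited'] = True
            events.append(('EXIT', t['id'], t['last_frame']))
    return events
```

Calling this once per frame yields single-pass, online event reporting: each event is emitted exactly once, n frames after the evidence for it first appears.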
`
In order to locate objects seen in the image with respect to a map, it is necessary to establish a mapping between image and map coordinates. This mapping is established in the AVS system by having a user draw quadrilaterals on the horizontal surfaces visible in an image, and the corresponding quadrilaterals on a map, as shown in Figure 3. A warp transformation from image to map coordinates is computed using the quadrilateral coordinates.

Once the transformations are established, the system can estimate the location of an object (as in Flinchbaugh and Bannon [1994]) by assuming that all objects rest on a horizontal surface. When an object is detected in the scene, the midpoint of the lower side of the bounding box is used as the image point to project into the map window using the quadrilateral warp transformation [Wolberg, 1990].

4 Applications

The AVS core algorithms described in section 3 have been used as the basis for several video surveillance applications. Section 4 describes three applications that we have implemented: situational awareness, best-view selection for activity logging, and environment learning.
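The image-to-map warp from a pair of corresponding quadrilaterals can be sketched with the standard projective (homography) construction solved from four point pairs; the function name and point format below are our own.

```python
import numpy as np

def quad_warp(image_quad, map_quad):
    """Solve for the 3x3 projective transform H mapping four image
    corners to four map corners (the standard direct linear
    construction with H[2,2] fixed to 1), and return a function that
    warps image points to map coordinates."""
    A, b = [], []
    for (x, y), (u, v) in zip(image_quad, map_quad):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    H = np.append(h, 1.0).reshape(3, 3)
    def warp(x, y):
        u, v, w = H @ np.array([x, y, 1.0])
        return u / w, v / w
    return warp
```

To place an object on the map, the midpoint of the lower side of its bounding box would be passed through the returned warp, under the assumption that the object rests on the calibrated horizontal surface.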
`
4.1 Situational Awareness

The goal of the situational awareness application is to produce a real-time map-based display of the locations of people, objects and events in a monitored region, and to allow a user to specify alarm conditions interactively. Alarm conditions may be based on the locations of people and objects in the scene, the types of objects in the scene, the events in which the people and objects are involved, and the times at which the events occur. Furthermore, the user can specify the action to take when an alarm is triggered, e.g., to generate an audio alarm or write a log file. For example, the user should be able to specify that an audio alarm should be triggered if a person deposits a briefcase on a given table between 5:00 pm and 7:00 am on a weeknight.
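An alarm condition of this form might be checked as in the following sketch; the report field names, the rectangular region format, and the hour-based time window are our own assumptions, not the AVS specification.

```python
def alarm_triggered(report, region, event_type, start_hour, end_hour):
    """Check one location/event report against a user-specified alarm:
    the event type must match, the location must fall inside the
    rectangular alarm region, and the hour of day must fall in a
    window that may wrap past midnight (e.g., 17 to 7)."""
    x, y = report['location']
    x0, y0, x1, y1 = region
    in_region = x0 <= x <= x1 and y0 <= y <= y1
    h = report['hour']
    if start_hour <= end_hour:
        in_window = start_hour <= h < end_hour
    else:                         # window wraps past midnight
        in_window = h >= start_hour or h < end_hour
    return report['event'] == event_type and in_region and in_window
```

A surveillance shell would evaluate each incoming report against every user-defined alarm of this kind and run the associated action (audio alarm, log entry) on a match.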
`
The architecture of the AVS situational awareness system is depicted in Figure 4. The system consists of one or more smart cameras communicating with a Video Surveillance Shell (VSS). Each camera has associated with it an independent AVS core engine that performs the processing described in section 3. That is, the engine finds and tracks moving objects in the scene, maps their image locations to world coordinates, and recognizes events involving the objects. Each core engine emits a stream of location and event reports to the VSS, which filters the incoming event streams for user-specified alarm conditions and takes the appropriate actions.
`
[Figure 4: The situational awareness system.]
`
`166
`
In order to determine the identities of objects (e.g., briefcase, notebook), the situational awareness system communicates with one or more object analysis modules (OAMs). The core engines capture snapshots of interesting objects in the scenes, and forward the snapshots to the OAM, along with the IDs of the tracks containing the objects. The OAM then processes the snapshot in order to determine the type of object. The OAM processing and the AVS core engine computations are asynchronous, so the core engine may have processed several more frames by the time the OAM completes its analysis. Once the analysis is complete, the OAM sends the results (an object type label) and the track ID back to the core engine. The core engine uses the track ID to associate the label with the correct object in the current frame (assuming the object has remained in the scene and been successfully tracked).
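The track-ID handshake between the core engine and the OAM can be sketched as follows; the class and method names are ours, and in the real system the engines and OAMs run as separate asynchronous processes rather than method calls.

```python
class CoreEngine:
    """Associate asynchronously arriving object-type labels with live
    tracks by track ID: the engine keeps processing frames while the
    OAM works, and a label is applied only if its track still exists."""

    def __init__(self):
        self.tracks = {}                 # track_id -> {'label': ...}

    def new_track(self, track_id):
        self.tracks[track_id] = {'label': None}

    def end_track(self, track_id):
        self.tracks.pop(track_id, None)

    def receive_label(self, track_id, label):
        # The OAM result may arrive several frames late; drop it if
        # the object has already left the scene.
        if track_id in self.tracks:
            self.tracks[track_id]['label'] = label
            return True
        return False
```

Keying the reply on the track ID rather than on frame contents is what makes the asynchrony safe: however many frames elapse, the label lands on the right object or is discarded.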
`
The VSS provides a map display of the monitored area, with the locations of the objects in the scene reported as icons on the map. The VSS also allows the user to specify alarm regions and conditions. Alarm regions are specified by drawing them on the map using a mouse, and naming them as desired. The user can then specify the conditions and actions for alarms by creating one