Bassman et al.

[11] Patent Number: 6,044,166
[45] Date of Patent: Mar. 28, 2000

[54] PARALLEL-PIPELINED IMAGE PROCESSING SYSTEM

[75] Inventors: Robert G. Bassman, Austin, Tex.; Bhavesh B. Bhatt, Franklin Park, and J. Call, Allentown, both of N.J.; Michael W. Hansen, New Hope, Pa.; Stephen C. Hsu, East Windsor, N.J.; Gooitzen S. van der Wal, Hopewell, N.J.; Lambert E. Wixson, Rocky Hill, N.J.

[73] Assignee: Sarnoff Corporation, Princeton, N.J.

[21] Appl. No.: 08/606,171

[22] Filed: Feb. 23, 1996

Related U.S. Application Data

[63] Continuation-in-part of application No. 08/372,924, Jan. 17, 1995, abandoned.
[60] Provisional application No. 60/006,097, Oct. 31, 1995.

[51] Int. Cl.7 .......................... G06K 9/00
[52] U.S. Cl. .......................... 382/103; 382/325; 710/129; 710/132
[58] Field of Search .................. 382/103, 107, 296, 307, 325; 395/800, 312, 309; 364/516; 348/149, 716; 701/1; 710/126, 129, 131, 132

[56] References Cited

U.S. PATENT DOCUMENTS

4,163,257    7/1979    White ............................ 358/133
4,187,519    2/1980    Vitols et al. .................... 358/169
.........    ......    Olesen
4,433,325    2/1984    Tanaka et al. .................... 340/937
4,490,851   12/1984    Gerhart et al. ................... 382/43
4,692,806    9/1987    Anderson et al. .................. 358/209
4,839,648    6/1989    Beucher et al. ................... 340/933
4,847,772    7/1989    Michalopoulos et al. ............. 364/436
4,858,113    8/1989    Saccardi ......................... 395/312
5,062,056   10/1991    Lo et al. ........................ 364/516
5,161,107   11/1992    Mayeaux et al. ................... 364/436
5,175,824   12/1992    Soderbery et al. ................. 395/325
5,255,374   10/1993    Alderguia et al. ................. 395/325
5,261,059   11/1993    Hedberg et al. ................... 395/312
5,281,964    1/1994    Iida et al. ...................... 340/936
5,359,674   10/1994    van der Wal
5,586,227   12/1996    Kawana et al.
5,592,237    1/1997    Greenway et al.
5,717,871    2/1998    Hsieh et al. ..................... 395/312

FOREIGN PATENT DOCUMENTS

WO 94/11852    5/1994    WIPO ..................... G08G 1/015

OTHER PUBLICATIONS

English Language Abstract of WO 94/11852, Nov. 10, 1992.
Traitement du Signal, No. 4, 1992, Maciej Orkisz: "Localisation d'Objets Mobiles dans des Scènes Naturelles Filmées par une Caméra Fixe."
Two-Dimensional Vehicle Tracking using Video-Image Processing, Sep. 2, 1992, Yamamoto, Kuwahara and Misra (presented at the 3rd Vehicle Navigation & Information Systems Conference, Oslo, Norway, Sep. 2-4, 1992).
International Search Report from corresponding PCT application PCT/US96/00022.
Hierarchical Model-Based Motion Estimation, Mar. 23, 1992, James R. Bergen et al. (appearing in Proc. of European Conference on Computer Vision-92).

Primary Examiner—Jose L. Couso
Assistant Examiner—Matthew C. Bella
Attorney, Agent, or Firm—William J. Burke

[57] ABSTRACT

Apparatus for image processing a sequence of images, containing a parallel-pipelined image processor comprised of image memories, a pyramid processing circuit, an arithmetic logic unit, a crossbar switch for video routing through the various components of the processor, and signal processors to provide hardware programming through a global bus and also perform image processing operations. Images can be passed directly from the crossbar switch to internal static RAM of the signal processors through a first-in, first-out interface at full video rates.

26 Claims, 5 Drawing Sheets

[Cover drawing: reduced copy of FIG. 6, a block diagram of the vision processing system showing crossbar switch 606, control, and three 3U cards.]

Page 1 of 13

SAMSUNG EXHIBIT 1014
Samsung v. Image Processing Techs.
`
`
U.S. Patent    Mar. 28, 2000    Sheet 1 of 5    6,044,166

[FIG. 1a: video camera 100 coupled directly to traffic-monitoring image processor 102. FIG. 1b: video camera 100 coupled to traffic-monitoring image processor 102 through VCR 104. FIGS. 2a and 2b: camera 100 mounted 30 feet above the roadway, viewing the segment from 50 feet to 300 feet with a 62° field of view spanning a 60-foot width. FIG. 2c: the camera derives a 640x480 pixel image, with 640 pixels spanning 62 degrees.]
`
`
`
U.S. Patent    Mar. 28, 2000    Sheet 2 of 5    6,044,166

[FIG. 3: functional block diagram of the preprocessing portion of the digital image processor; the drawing labels are not legible in this scan.]
`
`
`
`
U.S. Patent    Mar. 28, 2000    Sheet 3 of 5    6,044,166

[FIG. 4: functional block diagram of the detection and tracking portion of the digital image processor; the drawing labels are not legible in this scan.]
`
`
`
U.S. Patent    Mar. 28, 2000    Sheet 4 of 5    6,044,166

[FIG. 5: image frame 500 showing 1-D strip 510 and the image zone projected onto the strip, with cars 504-1 and 504-2. FIG. 5a: curved lane 512 with user-delineated lane boundary and medial strip 518.]
`
`
`
U.S. Patent    Mar. 28, 2000    Sheet 5 of 5    6,044,166

[FIG. 6: block diagram of the implementing hardware of the image processing system; the drawing labels are not legible in this scan.]
`
`
`
`
`
`
PARALLEL-PIPELINED IMAGE PROCESSING SYSTEM

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part application of U.S. patent application Ser. No. 08/372,924, filed Jan. 17, 1995, now abandoned, and forms a 35 U.S.C. Section 111(a) counterpart application for provisional application Ser. No. 60/006,097, filed Oct. 31, 1995.
`
BACKGROUND OF THE INVENTION
`
1. Field of the Invention

The invention relates to an image processing system and, more particularly, to a parallel-pipelined image processing system that digitally processes pixels of successive image frames derived from a video camera to identify changes in a scene represented by the successive image frames.

2. Description of the Background Art

Many computer vision systems, for automatic surveillance and monitoring, seek to detect and segment transitory objects that appear temporarily in the system's field of view. Examples include traffic monitoring applications that count vehicles and automatic surveillance systems for security.

Various types of traffic monitoring systems that utilize vision processing systems are known in the prior art, and examples thereof are respectively disclosed in U.S. Pat. Nos. 4,433,325, 4,847,772, 5,161,107 and 5,313,295.

Various image processing techniques that are useful in traffic monitoring systems have been discussed in the commonly assigned provisional patent application Ser. No. 60/006,104 entitled "METHOD FOR CONSTRUCTING A REFERENCE IMAGE FROM AN IMAGE SEQUENCE" filed Oct. 31, 1995 and incorporated herein by reference. Also incorporated herein by reference is the disclosure of provisional patent application Ser. No. 60/006,098 entitled "IMAGE-BASED OBJECT DETECTION AND TRACKING" filed Oct. 31, 1995 and the disclosure of provisional patent application Ser. No. 60/006,100 (attorney docket DSRC 11912P) entitled "METHOD AND APPARATUS FOR DETERMINING AMBIENT CONDITIONS FROM AN IMAGE SEQUENCE" filed Oct. 31, 1995.

Further, the present invention makes use of pyramid teachings disclosed in U.S. Pat. No. 4,692,806, which issued to Anderson et al. on Sep. 8, 1987, and image flow teachings disclosed in the article "Hierarchical Model-Based Motion Estimation" by Bergen et al., appearing in the Proceedings of the European Conference on Computer Vision, Springer-Verlag, 1992. Both of these teachings are incorporated herein by reference.

Although various image processing techniques are available, there is a need for a more robust image processing system for processing sequences of images to identify changes in the captured scene. A system that is computationally efficient and yet relatively inexpensive to implement would find use in applications such as traffic monitoring and security systems.
`
SUMMARY OF THE INVENTION

The invention is an implementation of the system first disclosed in U.S. patent application Ser. No. 08/372,924, filed Jan. 17, 1995, for processing a sequence of video images of a scene to determine the presence of a particular feature in the scene. The apparatus comprises a parallel-pipelined image processor comprised of image memories, a pyramid processing circuit, an arithmetic logic unit, a crossbar switch for video routing through the various components of the processor, and digital signal processors that provide hardware programming through a global bus and also perform image processing operations. Images can be passed directly from the crossbar switch to the internal static random access memory (SRAM) of the signal processors through a first-in, first-out interface at full video rates.
`
BRIEF DESCRIPTION OF THE DRAWING

The teachings of the present invention can be readily understood by considering the following detailed description in conjunction with the accompanying drawings, in which:

FIGS. 1a and 1b show alternative real time and non-real time ways of coupling a video camera to a traffic monitoring image processor;

FIGS. 2a, 2b and 2c relate to the image field of a video camera viewing a multi-lane roadway;

FIG. 3 is a functional block diagram of the preprocessing portion of the digital image processor of the present invention;

FIG. 4 is a functional block diagram of the detection and tracking portion of the digital image processor of the present invention;

FIGS. 5 and 5a illustrate the manner in which image pixels of a 2D delineated zone of a roadway lane are integrated into a 1D strip; and

FIG. 6 is a block diagram of implementing hardware of the image processing system of the present invention.

To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures.
`
DETAILED DESCRIPTION

The present invention is an implementation of an image processing system embodied in a traffic monitoring system. The traffic monitoring system embodiment comprises at least one video camera for deriving successive image frames of road traffic and a traffic monitoring image processor for digitally processing the pixels of the successive image frames. As shown in FIG. 1a, the output of video camera 100 may be directly applied as an input to traffic-monitoring image processor 102 for digitally processing the pixels of the successive image frames in real time. Alternatively, as shown in FIG. 1b, the output of video camera 100 may be first recorded by video cassette recorder (VCR) 104 and then, at a later time, the pixels of the successive image frames may be read out of the VCR and applied as an input to traffic-monitoring image processor 102 for digitally processing the pixels of the successive image frames.

Video camera 100, which may be a charge-coupled device (CCD) camera, an infrared (IR) camera, a millimeter-wave sensor, or any other imaging sensor, is mounted at a given height over a roadway and has a given field of view of a given length segment of the roadway. As shown in FIGS. 2a and 2b, video camera 100, by way of example, may be mounted 30 feet above the roadway and have a 62° field of view sufficient to view a 60 foot width (5 lanes) of a length segment of the roadway extending from 50 feet to 300 feet with respect to the projection of the position of video camera 100 on the roadway. FIG. 2c shows that video camera 100 derives a 640x480 pixel image of the portion of the roadway within its field of view. For illustrative
purposes, vehicular traffic normally present on the length segment of the roadway has been omitted from the FIG. 2c image.

In a designed vehicular traffic monitoring system, video camera 100 was one of a group of four time-divided cameras, each of which operated at a frame rate of 7.5 frames per second.

A principal purpose of the present invention is to provide a computationally-efficient digital traffic-monitoring image processor that is capable of more accurately detecting, counting and tracking vehicular traffic traveling over the viewed given length segment of the roadway than was heretofore possible. For instance, consider the following four factors which tend to result in detecting and tracking errors or in decreasing computational efficiency:

1. Low Contrast:

A vehicle must be detected based on its contrast relative to the background road surface. This contrast can be low when the vehicle has a reflected light intensity similar to that of the road. Detection errors are most likely under low light conditions and on gray, overcast days. The system may then miss some vehicles or, if the threshold criteria for detection are low, the system may mistake some background patterns, such as road markings, for vehicles.

2. Shadows and Headlight Reflections:

At certain times of day vehicles will cast shadows or cause headlight reflections that may cross neighboring lanes. Such shadows or headlight reflections will often have greater contrast than the vehicles themselves. Prior art traffic monitoring systems may then interpret shadows as additional vehicles, resulting in an overcount of traffic flow. Shadows of large vehicles, such as trucks, may completely overlap smaller cars or motorcycles, and result in the overshadowed vehicles not being counted. Shadows may also be cast by objects that are not within the roadway, such as trees, buildings, and clouds. And they can be cast by vehicles going the other direction on another roadway. Again, such shadows may be mistaken for additional vehicles.

3. Camera Sway:

A camera that is mounted on a utility pole may move as the pole sways in the wind. A camera mounted on a highway bridge may vibrate when trucks pass over the bridge. In either case camera motion results in image motion that causes detection and tracking errors. For example, camera sway becomes a problem if it causes the detection process to confuse one road lane with another, or if it causes a stationary vehicle to appear to move.

4. Computational Efficiency:

Since vehicle travel is confined to lanes and the normal travel direction is one-dimensional along the length of a lane, it is computationally inefficient to employ two-dimensional image processing in detecting and tracking vehicular traffic.

The present invention is directed to a digital traffic-monitoring image processor that includes means for overcoming one or more of these four problems.

Referring to FIG. 3, there is shown a functional block diagram of a preferred embodiment of a preprocessor portion of digital traffic-monitoring image processor 102. Shown in FIG. 3 are analog-to-digital converter 300, pyramid means 302, stabilization means 304, reference image derivation and updating means 306, frame store 308, reference image modifying means 310 and subtractor 312.
The analog video signal input from camera 100 or VCR 104, after being digitized by A/D 300, may be decomposed into a specified number of Gaussian pyramid levels by pyramid means 302 for reducing pixel density and image resolution. Pyramid means 302 is not essential, since the vehicular traffic system could be operated at the resolution of the 640x480 pixel density of video camera 100. However, because this resolution is higher than is needed downstream for the present vehicular traffic system, the use of pyramid means 302 increases the system's computational efficiency. Not all levels of the pyramid must be used in each computation. Further, not all levels of the pyramid need be stored between computations, as higher levels can always be computed from lower ones. However, for illustrative purposes it is assumed that all of the specified number of Gaussian pyramid levels are available for each of the downstream computations discussed below.
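The Gaussian decomposition performed by pyramid means 302 can be sketched in software; the following Python fragment is illustrative only (the patent uses a pyramid processing circuit) and uses the common 5-tap binomial approximation to a Gaussian kernel:

```python
import numpy as np

def gaussian_pyramid(image, levels):
    """Build a Gaussian pyramid: each level blurs with a separable
    5-tap binomial kernel and subsamples by 2 in each dimension,
    quartering the pixel density per level."""
    kernel = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    pyramid = [image.astype(float)]
    for _ in range(levels):
        blurred = pyramid[-1]
        for axis in (1, 0):  # filter rows, then columns (edges replicated)
            blurred = np.apply_along_axis(
                lambda v: np.convolve(np.pad(v, 2, mode="edge"), kernel, "valid"),
                axis, blurred)
        pyramid.append(blurred[::2, ::2])  # subsample by 2
    return pyramid

frame = np.random.rand(480, 640)           # one camera-resolution frame
levels = gaussian_pyramid(frame, 3)
print([level.shape for level in levels])   # each level halves both dimensions
```

Because higher levels can always be recomputed from lower ones, only the levels actually needed downstream have to be stored.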
The first of these downstream computations is performed by stabilization means 304. Stabilization means 304 employs electronic image stabilization to compensate for the problem of camera sway, in which movement may be induced by wind or a passing truck. Camera motion causes pixels in the image to move. Prior art vehicular traffic systems that do not compensate for camera motion will produce false positive detections if the camera moves so that the image of a surface marking or a car in an adjacent lane overlaps a detection zone. Stabilization means 304 compensates for image translation from frame to frame that is due to camera rotation about an axis perpendicular to the direction of gaze. The compensation is achieved by shifting the current image an integer number of rows and columns so that, despite camera sway, it remains fixed in alignment to within one pixel with a reference image derived by means 306 and stored within frame store 308. The required shift is determined by locating two known landmark features in each frame. This is done via a matched filter.
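The matched-filter shift estimate can be sketched as follows; this Python fragment is illustrative only, searches for a single landmark, and the landmark position, patch size and search range are assumptions (the patent locates two landmark features):

```python
import numpy as np

def landmark_shift(reference, current, top_left, size, search=5):
    """Estimate the integer (row, col) displacement of a landmark by
    exhaustive matched-filter search: correlate the reference patch
    against the current frame over a small search window."""
    r0, c0 = top_left
    h, w = size
    template = reference[r0:r0 + h, c0:c0 + w]
    best, best_score = (0, 0), -np.inf
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            patch = current[r0 + dr:r0 + dr + h, c0 + dc:c0 + dc + w]
            score = float(np.sum(template * patch))  # matched-filter response
            if score > best_score:
                best, best_score = (dr, dc), score
    return best

ref = np.zeros((64, 64)); ref[20:24, 30:34] = 1.0  # a bright landmark feature
cur = np.roll(ref, (2, -1), axis=(0, 1))           # camera sway shifts the frame
print(landmark_shift(ref, cur, (18, 28), (8, 8)))  # -> (2, -1)
```

The current image would then be shifted by the negative of this displacement so that it stays aligned with the reference image to within one pixel.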
The problem of low contrast is overcome by the cooperative operation of reference image derivation and updating means 306, frame store 308 and reference image modifying means 310. Means 306 generates an original reference image r0 simply by blurring the first-occurring image frame i0 applied as an input thereto from means 304 with a large Gaussian filter (so that reference image r0 may comprise a higher pyramid level), and then reference image r0 is stored in frame store 308. Following this, the image stored in frame store 308 is updated during a first initialization phase by means 306. More specifically, means 306 performs a recursive temporal filtering operation on each corresponding pixel of the first few image frames of successive stabilized image frames applied as an input thereto from means 304, with the additional constraint that if the difference between the reference image and the current image is too large, the reference image is not updated at that pixel. Put mathematically,
`
r_t(x, y) = (1 - γ)·r_{t-1}(x, y) + γ·i_t(x, y)   if |i_t(x, y) - r_{t-1}(x, y)| < D
r_t(x, y) = r_{t-1}(x, y)                          otherwise                        (1)
`
where r_t represents the reference image after frame t, and i_t represents the t'th frame of the input image frame sequence from means 304. The constant γ determines the "responsiveness" of the construction process.
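For illustration only, the recursive update can be sketched in Python, assuming the conventional exponential-averaging form of the temporal filter; the constants γ=0.05 and D=20 are arbitrary:

```python
import numpy as np

def update_reference(ref, frame, gamma=0.05, d=20.0):
    """Recursive temporal filter: blend the current frame into the
    reference only at pixels where the two differ by less than D;
    elsewhere keep the old reference value."""
    close = np.abs(frame - ref) < d
    out = ref.copy()
    out[close] = (1.0 - gamma) * ref[close] + gamma * frame[close]
    return out

ref = np.full((4, 4), 100.0)
frame = ref.copy()
frame[0, 0] = 105.0   # small difference (illumination drift): blended in
frame[1, 1] = 200.0   # large difference (a passing vehicle): rejected
new_ref = update_reference(ref, frame)
print(new_ref[0, 0], new_ref[1, 1])  # pixel (0,0) moves toward 105; (1,1) stays 100
```

A small γ makes the construction slowly responsive, which is what keeps transitory vehicles out of the reference image.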
The "responsiveness" setting of γ must be sufficiently slow to keep transitory objects, such as moving vehicles or even vehicles that may be temporarily stopped by a traffic jam, out of the reference image, so that, at the end of the first few input image frames to means 306 which comprise the first initialization phase, the stored reference image in frame
`
store 308 will comprise only the stationary background objects being viewed by camera 100. Such a "responsiveness" setting of γ is incapable of adjusting r_t quickly enough to add illumination changes (such as those due to a passing cloud or the auto-iris on camera 100) to the reference image. This problem is solved at the end of the initialization phase by the cooperative updating operation of reference image modifying means 310 (which comprises an illumination/AGC compensator) with that of means 306 and frame store 308. Specifically, when the initialization phase is completed, it is replaced by a second normal operating phase which operates in accordance with the following equation 2 (rather than the above equation 1):
`
`(2)
In this normal operating phase the reference image r_t stored in frame store 308 is then passed to reference image modifying means 310 which computes the illumination-compensated reference image r'_t according to the following equation:
`
r'_t(x, y) = k_t·r_t(x, y) + c_t     (3)

where k_t and c_t are the estimated gain and offset between the reference image r_t and the current image i_t computed by means 310. Means 310 computes this gain and offset by plotting a cloud of points in a 2D space in which the x-axis represents gray-level intensity in the reference image, and the y-axis represents gray-level intensity in the current image, and fitting a line to this cloud. The cloud is the set of points (r_{t-1}(x,y), i_t(x,y)) for all image positions x, y at which |i_t(x,y) - r_{t-1}(x,y)| < D. This approach will work using any method for computing the gain and offset representing illumination change. For example, the gain might be estimated by comparing the histograms of the current image and the reference image. Also, the specific update rules need not use an absolute threshold D as described above. Instead, the update could be weighted by any function of |i_t(x,y) - r_{t-1}(x,y)|.
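The line fit performed by means 310 can be sketched as an ordinary least-squares fit over the thresholded cloud; this Python fragment is illustrative only, and the simulated gain, offset and threshold values are assumptions:

```python
import numpy as np

def estimate_gain_offset(ref, cur, d=20.0):
    """Estimate illumination gain k and offset c by fitting a line
    cur ≈ k*ref + c over the cloud of pixels whose reference/current
    difference is below d (excluding transitory objects)."""
    mask = np.abs(cur - ref) < d
    k, c = np.polyfit(ref[mask], cur[mask], 1)  # least-squares line fit
    return k, c

rng = np.random.default_rng(0)
ref = rng.uniform(50.0, 200.0, size=(64, 64))
cur = 1.05 * ref + 3.0        # simulated global illumination change
cur[10:20, 10:20] = 255.0     # a transitory bright object (ignored by the fit)
k, c = estimate_gain_offset(ref, cur)
print(round(k, 3), round(c, 2))  # recovers the simulated gain 1.05 and offset 3.0
ref_compensated = k * ref + c    # equation (3)
```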
The above approach allows fast illumination changes to be added to the reference image while preventing transitory objects from being added. It does so by giving the cooperative means the flexibility to decide whether the new reference image pixel values should be computed as a function of pixel values in the current image or whether they should be computed simply by applying a gain and offset to the current reference image. By applying a gain and offset to the current reference image the illumination change can be simulated without running the risk of allowing transitory objects to appear in the reference image.

The result is that the amplitude of the stationary background manifesting pixels of the current image (which includes both stationary background manifesting pixels and moving object (i.e., vehicular traffic) manifesting pixels) will always be substantially equal to the amplitude of the stationary background manifesting pixels of the illumination-compensated reference image (which includes solely stationary background manifesting pixels) appearing at the output of reference image modifying means 310. Therefore, subtractor 312, which computes the difference between the amplitudes of corresponding pixels applied as inputs thereto from means 310 and 304, derives an output made up of significantly-valued pixels that manifest solely moving objects (i.e., vehicular traffic) in each one of successive 2D image frames. The output of subtractor 312 is forwarded to the detection and tracking portion of traffic-monitoring image processor 102 shown in FIG. 4.
`
Referring to FIG. 4, there is shown 2D/1D converter 400, vehicle fragment detector 402, image-flow estimator 404, single frame delay 406, pixel-amplitude squaring means 408, vehicle hypothesis generator 410 and shadow and reflected headlight filter 412.

2D/1D converter 400 operates to convert 2D image information received from FIG. 3 that is applied as a first input thereto into 1D image information in accordance with user control information applied as a second input thereto. In this regard, reference is made to FIGS. 5 and 5a. FIG. 5 shows an image frame 500 derived by camera 100 of straight, 5-lane roadway 502 with cars 504-1 and 504-2 traveling on the second lane 506 from the left. Cars 504-1 and 504-2 are shown situated within an image zone 508 delineated by the aforesaid user control information applied as a second input to converter 400. By integrating horizontally the amplitudes of the pixels across image zone 508 and then subsampling the vertically oriented integrated pixel amplitudes along the center of zone 508, 1D strip 510 is computed by converter 400. The roadway need not be straight. As shown in FIG. 5a, curved roadway lane 512 includes zone 514 defined by user-delineated lane boundaries 516, which permits the computation of medial strip 518 by converter 400. In both FIGS. 5 and 5a, the user may employ lane-defining stripes that may be present in the image as landmarks for help in defining the user-delineated lane boundaries.

More specifically, computation by converter 400 involves employing each of pixel positions (x,y) to define integration windows. For example, such a window might be either (a) all image pixels on row y that are within the delineated lane bounds, (b) all image pixels on column x that are within the delineated lane bounds, or (c) all image pixels on a line perpendicular to the tangent of the medial strip at position (x,y). Other types of integration windows not described here may also be used.
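Using integration window type (a) (row-wise integration between the lane bounds), the 2D-to-1D conversion can be sketched as follows; this Python fragment is illustrative only, and averaging is used here as the integration:

```python
import numpy as np

def to_strip(image, left, right):
    """Collapse a user-delineated lane zone into a 1-D strip by
    integrating (here, averaging) pixel amplitudes across each row
    between the per-row lane bounds left[y]..right[y]."""
    strip = np.empty(image.shape[0])
    for y in range(image.shape[0]):
        strip[y] = image[y, left[y]:right[y]].mean()
    return strip

img = np.zeros((6, 8))
img[2, 3:5] = 12.0                           # a bright region inside the lane
left = np.full(6, 3); right = np.full(6, 5)  # a straight, 2-pixel-wide zone
print(to_strip(img, left, right))            # only row 2 integrates to a nonzero value
```

For a curved lane, the per-row bounds simply follow the user-delineated boundaries instead of being constant.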
The 1D output from converter 400 is applied as an input to detector 402, estimator 404 and single frame delay 406, and through means 408 to filter 412. While the respective detection, tracking and filtering functions performed by these elements are independent of whether they operate on 1D or 2D signals, 1D operation is to be preferred because it significantly reduces computational requirements. Therefore, the presence of converter 400, while desirable, is not essential to the performance of these detection, tracking and filtering functions. In the following discussion, it is assumed that converter 400 is present.
Detector 402 preferably utilizes a multi-level pyramid to provide a coarse-to-fine operation to detect the presence and spatial location of vehicle fragments in the 1D strip of successive image frames received from FIG. 3. A fragment is defined as a group of significantly-valued pixels at any pyramid level that are connected to one another. Detector 402 is tuned to maximize the chances that each vehicle will give rise to a single fragment. However, in practice this is impossible to achieve; each vehicle gives rise to multiple fragments (such as separate fragments corresponding to the hood, roof and headlights of the same vehicle). Further, pixels of more than one vehicle may be connected into a single fragment.
One technique for object detection at each strip pixel position is to compute a histogram of the image intensity values within the integration window centered at that pixel position. Based on attributes of this histogram (e.g., the number or percentage of pixels over some threshold value or values), that strip pixel is classified as either "detection" or "background". By performing this operation at each strip pixel, one can construct a one-dimensional array that contains, for each pixel position, the "detection" or "background" label. By performing connected component analysis within this array, adjacent "detection" pixels can be grouped into "fragments".
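The per-pixel labeling and the 1-D connected component grouping can be sketched as follows; in this illustrative Python fragment a simple amplitude threshold stands in for the histogram attributes:

```python
def detect_fragments(strip, threshold):
    """Label each strip pixel "detection" or "background" by threshold,
    then group adjacent detections into fragments with a 1-D connected
    component pass; returns (start, end) index pairs, end exclusive."""
    labels = [value > threshold for value in strip]
    fragments, start = [], None
    for i, detected in enumerate(labels):
        if detected and start is None:
            start = i                       # a new fragment begins
        elif not detected and start is not None:
            fragments.append((start, i))    # the fragment just ended
            start = None
    if start is not None:
        fragments.append((start, len(labels)))
    return fragments

strip = [0, 5, 9, 8, 0, 0, 7, 0]
print(detect_fragments(strip, 4))  # -> [(1, 4), (6, 7)]
```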
Image-flow estimator 404, in cooperation with delay 406, employs the teachings of the aforesaid Bergen et al. article to permit objects to be tracked over time. Briefly, in this case, this involves, at each pixel position, computing and storing the average value contained within the integration window. By performing this operation at each strip pixel, a one-dimensional array of average brightness values is constructed. Given two corresponding arrays for images taken at times t-1 and t, the one-dimensional image "flow" that maps pixels in one array to the other is computed. This can be computed via one-dimensional least-squares minimization or one-dimensional patchwise correlation. This flow information can be used to track objects between each pair of successive image frames.
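The patchwise-correlation variant of the 1-D flow computation can be sketched as follows; this Python fragment is illustrative only, and the sum-of-squared-differences match score, patch size and search range are assumed choices:

```python
import numpy as np

def strip_flow(prev, cur, patch=3, search=4):
    """Estimate per-pixel 1-D flow between two strips by patchwise
    correlation: for each patch in prev, find the displacement into
    cur with the minimum sum of squared differences."""
    n = len(prev)
    flow = np.zeros(n, dtype=int)
    for i in range(n - patch + 1):
        p = prev[i:i + patch]
        best, best_err = 0, np.inf
        for d in range(-search, search + 1):
            j = i + d
            if j < 0 or j + patch > n:
                continue  # displaced patch would fall outside the strip
            err = float(np.sum((cur[j:j + patch] - p) ** 2))
            if err < best_err:
                best, best_err = d, err
        flow[i] = best
    return flow

prev = np.zeros(16); prev[5:8] = [3.0, 9.0, 4.0]  # vehicle pattern at position 5
cur = np.zeros(16);  cur[7:10] = [3.0, 9.0, 4.0]  # moved 2 pixels along the lane
print(strip_flow(prev, cur)[5])  # -> 2
```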
The respective outputs of detector 402 and estimator 404 are applied as inputs to vehicle hypothesis generator 410. Nearby fragments are grouped together as part of the same object (i.e., vehicle) if they move in similar ways or are sufficiently close together. If the positions of multiple fragments remain substantially fixed with respect to one another in each of a train of successive frames, they are assumed to indicate only a single vehicle. However, if the positions of the fragments change from frame to frame, they are assumed to indicate separate vehicles. Further, if a single fragment in one frame breaks up into multiple fragments or significantly stretches out longitudinally in shape from one frame to another, they are also assumed to indicate separate vehicles.
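The grouping rule can be sketched as a single pass over position-sorted fragments; this Python fragment is illustrative only, and the gap and flow-similarity thresholds are assumptions:

```python
def group_fragments(fragments, flows, max_gap=2, max_flow_diff=1):
    """Group position-sorted fragments into vehicle hypotheses: a
    fragment joins the previous group when it is sufficiently close
    and moving in a similar way; otherwise it starts a new group."""
    groups = []
    for fragment, flow in zip(fragments, flows):
        if groups:
            prev_fragment, prev_flow = groups[-1][-1]
            close = fragment[0] - prev_fragment[1] <= max_gap
            similar = abs(flow - prev_flow) <= max_flow_diff
            if close and similar:
                groups[-1].append((fragment, flow))
                continue
        groups.append([(fragment, flow)])
    return groups

# Hood and roof fragments of one vehicle, then a distant second vehicle:
fragments = [(10, 13), (14, 17), (30, 34)]
flows = [2, 2, -1]
print(len(group_fragments(fragments, flows)))  # -> 2
```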
At night, the presence of a vehicle may be indicated only by its headlights. Headlights tend to produce headlight reflections on the road. Lighting conditions on the road during both day and night tend to cause vehicle shadows on the road. Both such shadows and headlight reflections on the road result in producing detected fragments that will appear to generator 410 as additional vehicles, thereby creating false positive errors in the output from generator 410. Shadow and reflected headlight filter 412, which discriminates between fragments produced by valid vehicles and those produced by shadows and reflected headlights, eliminates such false positive errors.
The output from pixel-amplitude squaring means 408 manifests the relative energy in each pyramid-level pixel of the strip output of each of successive image frames from converter 400. Filter 412 discriminates between fragments that are produced by valid vehicles and those produced by shadows and reflected headlights based on an analysis of the relative amplitudes of these energy-manifesting pixels from means 408. The fact that the variance in energy pixel amplitude (pixel brightness) of shadow and reflected headlight fragments is significantly less than that of valid vehicle fragments can be used as a discriminant.
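The variance discriminant can be sketched as follows; this Python fragment is illustrative only, and the variance threshold and sample energy values are assumptions:

```python
import statistics

def is_shadow_or_reflection(fragment_energy, variance_threshold=4.0):
    """Discriminate using the variance of a fragment's energy pixels:
    shadow and reflected-headlight fragments vary far less in
    brightness than fragments of valid vehicles."""
    return statistics.pvariance(fragment_energy) < variance_threshold

vehicle = [4.0, 81.0, 16.0, 64.0, 9.0]   # structured hood/roof/headlight energies
shadow = [25.0, 27.0, 24.0, 26.0, 25.0]  # uniformly dark, low-variance region
print(is_shadow_or_reflection(vehicle), is_shadow_or_reflection(shadow))  # -> False True
```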
Another way of filtering, not shown in FIG. 4, is to employ converter 400 for discriminating between objects and shadows using the background-adjusted reference image. At each pixel position, the following information is computed over the integration window:

(a) the number of pixels with brightness value greater than some threshold p, over all image pixels within the integration window;

(b) the maximum absolute value, over all image pixels within the integration window; and

(c) the number of adjacent pixels (x1,y1) and (x2,y2) within the integration window whose absolute difference, |I(x1,y1) - I(x2,y2)|, exceeds a threshold value.
`
Fragments that have been extracted as described previously can be classified as object or shadow based on these or other properties. For example, if the value of measure (a), summed over all strip pixels within the fragment, exceeds some threshold, then the fragment cannot be a shadow (since shadows would never have positive brightness values in the images applied to converter 400 from FIG. 3). A similar summation using measure (c) provides another test, measuring the amount of texture within the fragment, which can also be thresholded to determine whether a fragment is an object or a shadow. While the input to filter 412 defines all hypothesized vehicle locations, the output therefrom defines only verified vehicle locations. The output from filter 412 is forwarded to utilization means (not shown) which may perform such functions as counting the number of vehicles and computing their velocity and length.
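Measures (a) and (c) can be combined into a simple classification sketch; this Python fragment is illustrative only, and every threshold in it is an assumption:

```python
def classify_fragment(pixels, brightness_thresh=10.0, texture_thresh=15.0,
                      count_thresh=2, edge_thresh=2):
    """Classify a fragment's background-adjusted pixels as object or
    shadow: measure (a) counts bright pixels (a shadow is never
    brighter than the reference), and measure (c) counts large
    adjacent-pixel differences (texture)."""
    bright = sum(1 for v in pixels if v > brightness_thresh)      # measure (a)
    textured = sum(1 for a, b in zip(pixels, pixels[1:])
                   if abs(a - b) > texture_thresh)                # measure (c)
    return "object" if bright >= count_thresh or textured >= edge_thresh else "shadow"

print(classify_fragment([-30.0, -28.0, -31.0, -29.0]))  # uniformly dark -> shadow
print(classify_fragment([12.0, 45.0, 8.0, 52.0]))       # bright and textured -> object
```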
Vehicle fragment detector 402, image-flow estimator 404, and vehicle hypothesis generator 410 may use predetermined camera calibration information in their operation. Further, each of the various techniques of the present invention described above may also be employed to advantage in other types of imaging systems than the vehicular traffic monitoring system disclosed herein.
FIG. 6 depicts a block diagram of an image processing system that primarily finds use in the field of machine or computer vision. As such, the system is known as a vision processing system. This vision processing system (VPS) 600, suitable among other purposes for surveillance and monitoring applications, is shown in block diagram form illustrating the viewing system interconnectivity and data flow. The system is a parallel-pipelined image processing system comprised of image memories (frame stores) 612 and 614, a pyramid processing integrated circuit 608 for image convolution, an arithmetic logic unit (ALU) 616, and a crossbar switch 606 for video routing through the various components. Two digital signal processors (DSP) 626 are present in the system as well, which provide hardware programming through a global bus 628 and also perform image processing operations. Images can be passed directly from the crossbar switch 606 to the internal static RAM (SRAM) 624 of the DSPs through a first-in, first-out interface at full video rates.