Proposal of a mobile camera system for moving object detection
Y. Berviller, E. Tisserand, C. Bataille, H. Guermoud, S. Weber

Laboratoire d'Instrumentation Electronique de Nancy

B.P. 239 - 54506 Vandoeuvre Cedex - France

Abstract
We describe a system for detecting moving elements from a moving camera using a simplified background constraint technique. The method is implemented in hardware in the form of an intelligent camera. In order to keep the design small and able to operate at video frame rate, the algorithm is for a large part implemented in FPGAs. The camera moves in a locally planar environment and is tilted enough to exclude the horizon line from the image. In this paper we assume that the camera moves only in translation and that the objects to detect move in the same direction as the camera (and obviously at higher speed). If the instantaneous speed of the camera is known and is constant between two successively processed images, it is easy to predict (for a large part) the next image to process by using the reverse perspective transform. In order to turn such a prediction into a simple translation of the image, and in order to obtain a uniform spatial resolution in the observed scene, we re-sample the lines and apply to each re-sampled line a specific horizontal scale factor. From each transformed image we « compute » a predicted image. The predicted and true images are subtracted, and objects moving in the scene correspond to a noticeable difference area. The direction and speed of the displacement are estimated with one-dimensional correlation functions between vertical windows in the subtracted images. A study of the limitations and experimental results obtained by simulation on real images are presented.
Keywords: motion detection, collision avoidance, real-time processing, FPGA, intelligent camera
1. Introduction

The detection of moving elements is an important task in autonomous vehicle systems and, for a few years now, in the driving aid features that future vehicles will possess. Among the various concepts, proximity detection (all around the vehicle but forward) aims to avoid rear-lateral collisions. Microwave [2-4] or ultrasound [5,6] radar based systems have already been proposed. Currently, thanks to the continuous cost reduction of CCD cameras as well as of their associated processing electronics, this domain is increasingly explored for on-board applications. The biggest interest of this technique is its ability to recognize, even roughly, the shape of the detected object. This feature is not yet available, for a reasonable cost, with the other mentioned techniques [1]. If we look at motion detection in the image processing area, three types of method are mainly used: those based on optical flow, those based on depth from focus or depth from defocus, and those based on a background constraint [7]. The first are computationally expensive [8], even in their simplified form that uses only one-dimensional correlation [9]. One variant is the active vision principle [12], which does not require the optical flow computation but needs a dynamically pan- and tilt-adjustable camera because of its underlying target tracking algorithm. The second are inadequate for dynamic distance measurements. They require at least ten images [10] when depth from focus is used, while depth from defocus, which needs only two to three images, is less accurate and more sensitive to noise because it relies on second order differentiating operators [11]. Furthermore, the latter needs defocused images and a small depth of field camera, which makes the image difficult to use for a human observer.
Thus we chose the third type of technique. The basic idea is that vehicles generally move in a planar environment [13]. With this assumption it becomes relatively easy, if the speed and direction of the camera platform are known, to estimate the next image in the sequence from the current one. Objects moving relative to the scene do not respect this prediction constraint, and are consequently detectable. In the following sections we describe the optical configuration that we use and some of its limitations, then we expose the global principle we retained. After this, we show the hardware implementation in an intelligent camera system, which mainly consists of FPGA circuits. This solution contributes to building an economical and small-sized device that can be used, for example, in vehicle active mirror applications. We finish with the presentation of some preliminary experimental results.
2. Preliminary

2.1. Optical and geometric configuration

Since the objective is to detect the presence of an object in an area near the camera, we found it judicious to tilt the camera in order to restrict the observed field, as shown in Figure 1. Thus the supervised area takes up almost the entire image and the spatial resolution is enhanced. This can also reduce the problem of over-illumination. We assume that the moving elements to detect approach by following the trajectory of the camera. This trajectory is considered to be collinear with the optical axis.
Figure 1: Optical configuration (camera at height h above the floor, tilted toward the floor; the supervised floor area extends from z_smin to z_smax).

2.2. Perspective equations
We use the following indices to refer to the coordinates in the various coordinate systems:

i: coordinates in the image coordinate system; c: coordinates in the camera coordinate system (O_c, X_c, Y_c, Z_c); s: coordinates in the scene coordinate system (O_s, X_s, Y_s, Z_s).

Let k_x and k_y be the scale factors along the axes X_c and Y_c respectively (in pixels per meter), and f the focal length.

Thus, using the pinhole camera model, we have:

$$x_i = f_x \cdot \frac{x_c}{z_c}, \qquad y_i = f_y \cdot \frac{y_c}{z_c}, \qquad \text{with } f_x = k_x \cdot f \text{ and } f_y = k_y \cdot f \qquad (1)$$
We tilt the camera in such a manner that the extreme values of y_i correspond to finite values of z_s. This gives the following equations:

$$z_{s,min} = h \cdot \tan\!\left(\frac{\pi}{2} - \phi - \alpha\right), \qquad z_{s,max} = h \cdot \tan\!\left(\frac{\pi}{2} - \phi + \alpha\right) \qquad (2)$$

where φ is the tilt angle of the optical axis below the horizontal and α the half vertical field of view.
The relation between the coordinates in the scene coordinate system and in the camera coordinate system is given, using homogeneous coordinates, by:

$$\begin{pmatrix} x_c \\ y_c \\ z_c \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos(\phi) & \sin(\phi) & -h\cos(\phi) \\ 0 & -\sin(\phi) & \cos(\phi) & h\sin(\phi) \\ 0 & 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} x_s \\ y_s \\ z_s \\ 1 \end{pmatrix} \qquad (3)$$
Thus, by substituting in (1), we obtain:

$$y_i = f_y \cdot \frac{(y_s - h)\cos(\phi) + z_s \sin(\phi)}{(h - y_s)\sin(\phi) + z_s \cos(\phi)} \qquad (4)$$

and, in the particular case of points belonging to the floor (y_s = 0):

$$\frac{y_i}{f_y} = \frac{\dfrac{z_s}{h}\tan(\phi) - 1}{\dfrac{z_s}{h} + \tan(\phi)} \qquad (5)$$

$$\frac{x_i}{f_x} = \frac{x_s}{h\sin(\phi) + z_s\cos(\phi)} \qquad (5')$$
2.3. Effect of the « thickness » of the objects

Let L_k be the k-th line of the image (the first being at the top). Equation (5) shows that if an object approaches along Z_s (z_s decreasing), y_i decreases and consequently L_k increases. Motion detection in the scene thus becomes motion detection along L_k. Furthermore, if the parameters α, φ, h and N are known, it is possible to recover z_s from the line index by use of relation (5). Strictly speaking, this relation is only true for points belonging to the floor plane. But if the front face (assumed perpendicular to the floor plane) of the element to detect possesses points belonging to the floor, equation (4) shows that these points have the smallest value of y_i and therefore the largest value of L_k. Thus these points will be detected first and satisfy the planar hypothesis.
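For completeness, recovering the distance of a floor point from its ordinate y_i is a direct inversion of (5); the rearranged form below is ours (elementary algebra), not an equation of the original paper:

$$z_s = h \cdot \frac{1 + \dfrac{y_i}{f_y}\tan(\phi)}{\tan(\phi) - \dfrac{y_i}{f_y}}$$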
2.4. Camera resolution

Let N_min be the minimum number of lines needed in order to keep the resolution in the scene less than or equal to δz along Z_s. Then N_min is given by:

$$N_{min} = \frac{2}{1 - \dfrac{y_i(z_{s,max} - \delta z)}{y_i(z_{s,max})}} + 1 = 1 + \frac{2\,(m - n + p)\,(m\,p - 1)}{n\,(1 + p^2)} \qquad (6)$$

with:

$$m = \frac{z_{s,max}}{h}; \qquad n = \frac{\delta z}{h}; \qquad p = \tan(\phi) = \frac{1 + m\tan(\alpha)}{m - \tan(\alpha)}$$
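As a quick numerical check of (6), the short script below evaluates both forms of the formula; the values of h, z_smax, δz and α are arbitrary example values chosen for illustration, not parameters taken from the paper:

```python
import math

# Example values (assumed for illustration only)
h, z_max, dz, alpha = 1.0, 10.0, 0.5, math.radians(10)

m, n = z_max / h, dz / h
p = (1 + m * math.tan(alpha)) / (m - math.tan(alpha))   # p = tan(phi), from eq. (2)

# Closed form of eq. (6)
n_min = 1 + 2 * (m - n + p) * (m * p - 1) / (n * (1 + p ** 2))

# Cross-check with the ratio form of eq. (6), using the floor relation (5)
def y_i(z):
    return (z / h * p - 1) / (z / h + p)

n_min_check = 2 / (1 - y_i(z_max - dz) / y_i(z_max)) + 1

print(round(n_min, 1), round(n_min_check, 1))   # both give about 66.7 -> at least 67 lines
```

Both expressions agree, which is a useful sanity check when choosing the number of lines for a given resolution δz.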
3. Principle

3.1. General overview

Figure 2 shows the global processing flow of the algorithm that we have developed.
Figure 2: Data flow of the proposed algorithm. Each incoming image (at times t, t+1, t+2, ...) is linearized; the image at time t is used to predict the image at time t+1, and the image at time t+1 to predict the image at time t+2; each prediction is subtracted from the corresponding linearized image; vertical bands are then extracted from the differences, cross-correlated and analysed.

3.2. Linearization
We wish to resample the image in order to obtain a homogeneous spatial resolution in the scene from its image.

The vertical linearization algorithm is thus the following:

for z_s increasing from z_smin to z_smax in steps of δz:
    calculate $y_i(z_s) = f_y \cdot \dfrac{\tan(\phi) - \dfrac{h}{z_s}}{1 + \dfrac{h}{z_s}\tan(\phi)}$

    calculate $k = E\left[\left(\dfrac{y_i(z_s)}{y_i(z_{s,max})} + 1\right) \cdot \dfrac{N}{2}\right]$, where N is the number of lines in the image and E(x) the integer part of x

    copy the line L_k into the resulting image.
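A minimal sketch of this vertical re-sampling step is given below (Python with NumPy). The function name, its parameters and the row-index convention (array row 0 at the top of the image, which sees z_smax) are assumptions made for the example; the geometry follows equations (2) and (5) and the algorithm above:

```python
import numpy as np

def linearize_vertical(img, f_y, h, phi, alpha, dz):
    """Re-sample the image rows so that consecutive output rows correspond
    to equal steps dz along z_s on the floor plane (eq. 5)."""
    N = img.shape[0]
    z_min = h * np.tan(np.pi / 2 - phi - alpha)   # eq. (2)
    z_max = h * np.tan(np.pi / 2 - phi + alpha)

    def y_i(z):                                   # eq. (5), floor points
        return f_y * (np.tan(phi) - h / z) / (1 + (h / z) * np.tan(phi))

    rows = []
    for z in np.arange(z_min, z_max, dz):
        # k = E[(y_i(z)/y_i(z_max) + 1) * N/2]: 0 near z_min, N near z_max
        k = int((y_i(z) / y_i(z_max) + 1) * N / 2)
        k = min(max(k, 0), N - 1)
        # array row 0 is the top of the image (farthest distance), so convert
        # k, counted from the bottom, into an array row index
        rows.append(img[N - 1 - k])
    return np.stack(rows)      # row 0 of the result corresponds to z_s,min
```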
On the other hand, as shown by equation (5'), the lines are spatially more and more compressed as they correspond to farther areas in the scene. Thus, to linearize along x without reducing the supervised zone, it is necessary to compress the lines corresponding to near distances. We keep the same scale as the one given by the points at z_smax (first line of the image).

The expression of a linearized line is thus:

$$\left(\frac{x_i}{f_x}\right)_{lin} = \frac{x_i}{f_x} \cdot \frac{h\sin(\phi) + z_s\cos(\phi)}{h\sin(\phi) + z_{s,max}\cos(\phi)} \qquad (7)$$
And we have the following algorithm:

For each vertically linearized line:

    calculate $a = \dfrac{h\sin(\phi) + z_{s,max}\cos(\phi)}{h\sin(\phi) + z_s\cos(\phi)}$

    calculate $x_{0,lin} = \dfrac{X_{max}}{2} \cdot \dfrac{a - 1}{a}$, where X_max is the number of points per line

    $x_{i,lin} = x_{0,lin}$

    for n increasing from 0 to $E\left[\dfrac{X_{max}}{a}\right]$:

        if $E[n \cdot a + E(a)] < X_{max}$

        then $I(x_{i,lin}) = \dfrac{1}{a}\left[(a - E(a)) \cdot I\big(E(a \cdot n) + E(a)\big) + \sum_{k=0}^{E(a)-1} I\big(E(a \cdot n) + k\big)\right]$

with I(x) the gray level of the pixel of abscissa x in the current line and E(a) the integer part of a.
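The sketch below illustrates this horizontal compression for one line; the assumption that x_{i,lin} advances by one output pixel at each step of the loop is ours, as is the function name:

```python
import numpy as np

def compress_line(line, a):
    """Compress one vertically linearized line by a factor a >= 1,
    averaging roughly a input pixels per output pixel (cf. eq. 7)."""
    x_max = len(line)
    out = np.zeros(x_max, dtype=float)
    ea = int(a)                               # E(a)
    x = int((x_max / 2) * (a - 1) / a)        # x_0,lin: offset that centers the result
    for n in range(int(x_max / a) + 1):
        start = int(a * n)
        if start + ea < x_max and x < x_max:
            # E(a) full pixels plus a fraction (a - E(a)) of the next; weights sum to a
            s = line[start:start + ea].sum() + (a - ea) * line[start + ea]
            out[x] = s / a
            x += 1                            # assumed: one output pixel per loop step
    return out
```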
An image processed by this algorithm is equivalent to an image that would be obtained if the optical axis of the camera were perpendicular to the floor (provided all the objects are planar).

3.3. Prediction

In the linearized image, a translation of the camera implies a translation of the image. Of course, there is a loss, which corresponds to the new part entering the image and which grows as the speed of the camera platform increases.

Let V be the speed of the camera and assume it is constant during the prediction time Δt. The corresponding movement V·Δt implies a translation of $E\left[\dfrac{V \cdot \Delta t}{\delta z}\right]$ lines and the loss of the same number of lines. The most critical hypothesis is the planarity of the whole scene.
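A minimal sketch of this prediction step is given below; the function, the use of an absolute difference and the zeroing of the unpredictable rows are our own illustrative choices, and the row convention (row 0 at z_smin) follows the earlier sketch:

```python
import numpy as np

def predict_and_subtract(prev_lin, curr_lin, v, dt, dz):
    """Predict the current linearized image by shifting the previous one by
    E[V*dt/dz] rows (pure camera translation), then take the difference."""
    n_rows = prev_lin.shape[0]
    shift = min(int(v * dt / dz), n_rows)     # E[V.dt/dz] lines
    predicted = np.zeros_like(prev_lin, dtype=float)
    # a point at distance z at time t lies at distance z - V.dt at time t + dt
    predicted[:n_rows - shift] = prev_lin[shift:]
    diff = np.abs(curr_lin.astype(float) - predicted)
    # the last `shift` rows entered the field of view and cannot be predicted
    if shift > 0:
        diff[n_rows - shift:] = 0.0
    return diff
```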
3.4. Prediction gap

A simple subtraction between the linearized image and its prediction makes it possible to detect a moving object. But some conditions must be met:

* The illumination of the scene must not vary significantly during the prediction time, so that the immobile part of the scene is eliminated. Even if there is no illumination variation, there is a variation of intensity along the Z axis (the one used in shape from shading). But this can be neglected because, with the optical configuration that we impose, the dynamic range along Z is low.

* The gray level of the detected object must be significantly different from that of the floor.

* The displacement of the object to detect must be two to three times δz during Δt, in order to discriminate it from the residual prediction error.

Note that, if there are small repetitive patterns, false alarms may occur by stroboscopic effect.
3.5. Cross-correlation function

We use a function that associates an average intensity profile with a vertical window. This approach has already been used with non-moving cameras and without distance linearization [1]. For the moment, we use three windows horizontally centered in the image.

The detection of an object moving at the same speed as the camera is obtained by analysing both the mean value of the profile function and the cross-correlation function. If we used only the cross-correlation function we could make false detections, because spurious peaks can appear when no moving object is present.

We use a cross-correlation function based on Pearson's correlation coefficient. Thus the effects of intensity variations and of a non locally stationary « signal » are reduced. Furthermore, to prevent errors due to border effects, we assume that the profiles do not vary outside the observed interval.
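A sketch of this profile extraction and Pearson-based cross-correlation is shown below; the function names and window parameters are illustrative assumptions, and the border handling simply holds the edge values constant, as stated above:

```python
import numpy as np

def band_profile(diff_img, x0, width):
    """Average intensity profile of a vertical window of the difference image."""
    return diff_img[:, x0:x0 + width].mean(axis=1)

def pearson_cross_correlation(p1, p2, max_shift):
    """Pearson correlation coefficient between two profiles for line shifts
    in [-max_shift, +max_shift]; profiles are assumed constant outside the
    observed interval (edge values held)."""
    n = len(p1)
    scores = {}
    for s in range(-max_shift, max_shift + 1):
        idx = np.clip(np.arange(n) + s, 0, n - 1)   # hold border values
        a = p1 - p1.mean()
        b = p2[idx] - p2[idx].mean()
        denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
        scores[s] = float((a * b).sum() / denom) if denom > 0 else 0.0
    return scores
```

The position of the maximum of this function over the shift s, multiplied by δz, gives the relative displacement of the object between the two processed images, as used in section 5.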
4. Hardware implementation

4.1. Algorithm adaptation

In order to match the requirements of such a system (small size, cost, power consumption), the following adaptations are made:
* The arithmetic computations, with the exception of the linearization parameters, are made with integers; this reduces the silicon area without decreasing the processing speed.

* The averaged profiles are taken from windows whose width is an integer power of two.

* Only the information in the averaged profiles is memorized.

* The system can operate with cameras that have a shutter operating at image rate, without the need for a de-interlacing device.

By making simulations with real images we have noted the good immunity of the cross-correlation function with respect to quantization noise. Thus we use a video signal quantized on 6 bits.
4.2. General architecture

Figure 3 shows the functional decomposition of the proposed method, which is the following:
* computation of the linearization and prediction parameters,
* linearization and extraction of the vertically averaged profiles,
* prediction, subtraction and cross-correlation of the differences of extracted profiles.

The data exchange between these three phases is made through the memory.

The device output consists of the periodic readout of the cross-correlation functions.
Figure 3: General overview of the architecture. The digital video signal and the synchronization signals feed the « linearization and averaged vertical bands extraction » block; the speed, tilt angle and height feed the « linearization and prediction parameters computing » block; both exchange data through the memory with the « prediction, subtraction and cross-correlation » block.

4.3. Processing blocks implementation
The computation of the linearization and prediction parameters, which does not require a large amount of computing power, is made by a microcontroller. The computing requirements are the following:

The linearization and profile extraction, and the prediction-subtraction and cross-correlation operations, are implemented in FPGAs, which are well suited for such fast processing. The organization of this implementation is shown in Figure 4.
Figure 4: Organization of the two processing blocks implemented in FPGAs. The « linearization and averaged vertical bands extraction » block comprises an accumulator fed by the video signal, a line counter and a column counter driven by the synchronization signals, the selection of the lines and pixels to preserve, and the memory access management. The « prediction, subtraction and cross-correlation » block comprises the predictive subtraction, the cross-correlation, its own memory access management and a synchronous read-out.
We have not yet fully optimized all the critical paths, but the sizes of the two FPGA functions described above are respectively 1930 and 2560 equivalent gates.

The RAM size needed is about 4 kwords of 12 bits.

There are FPGAs that allow the implementation of up to 20 kbits of RAM, but then there is not much room left for logic and, furthermore, we use the RAM for the communication between the processing phases. We therefore use an external static RAM with an access time of 20 ns, which is not critical.

The complete system is built with a classical CCD camera, a digitizing and synchronization interface, two small-density FPGAs, one microcontroller and one static RAM.
5. Preliminary results

5.1. Experimental conditions

We give the results that were obtained by software simulation of the architecture, that is, with the same adaptations of the algorithm. As these simulations are made off-line, we took the image sequences with an 8 mm camera recorder and processed the images one by one with a personal computer. Figure 5 shows the effect of the distance linearization, and one can see the errors implied by non-planar objects.
5.2. Real scene

We made this sequence with two cars. The car that precedes the other carries the camera in the middle of its rear end. Figure 6 shows cross-correlation functions obtained with and without the presence of the car in the field of view.

Figure 5: Effect of the linearization on a real scene. The black object corresponds to the shadow and the front spoiler of the car.
Figure 6 (plot of the cross-correlation value versus the number of lines shifted in the cross-correlation function): Examples of cross-correlation functions obtained in a real traffic scene. The value of δz was 50 cm and the speed of the first vehicle was approximately 30 km/h. The real speed of the approaching vehicle was unknown. The curve with the box points is taken at a moment where no car appears in the field of view, in contrast to the other curve.
5.3. Calibrated scene

Although the primary goal of the system is not the measurement of relative speed, we made several tests to evaluate its possibilities in this domain. We simulated such a scene by taking the images one by one and moving both the camera and the object to detect by a measured distance at each image.

In the example whose results are shown on the left side of Figure 7, we moved the camera by 5 cm and the object by 10 cm at each image. The curve with the box points is the average of ten consecutive cross-correlation functions, and its maximum value is located between 2 and 3 shifted lines. This corresponds to 5 cm, since we linearized the images with a δz of 2 cm, and is in accordance with the real relative motion. The graph on the right side of Figure 7 shows the results if we use a profile that contains an identical object but one that does not move in the scene. In this case the maximum value lies at a negative line shift, because the object moves away by 5 cm per image with respect to the camera. As in the left graph, the curve with the box points is the average with the same characteristics. But the result is a little less good, since the maximum value does not lie exactly midway between -2 and -3.
Figure 7 (two plots of the cross-correlation value versus the number of lines shifted in the cross-correlation function): Cross-correlation functions obtained with a scene where the camera moved 5 cm and one object 10 cm at each image. The linearization was made with a δz of 2 cm. In both graphs the curve with the box points is the mean of ten consecutive functions. The graph on the left side corresponds to profiles containing the moving object. The graph on the right side shows the case of a profile containing an immobile object.
6. Conclusion and future work

We propose a method for detecting approaching elements from a moving CCD camera. This method is relatively simple and thus allows its integration in the form of an intelligent camera. The entire device can fit in a standard industrial camera box.

The preliminary results suggest that an intelligent camera system can advantageously replace competing technologies (ultrasound, microwave, etc.). Indeed, even if the accuracy in terms of distance measurement is not yet comparable with that of these methods, the risk of false detection is quite low and can still be reduced by further processing. In addition, with the results that we expect to obtain soon, we will verify the robustness and probably enhance the accuracy by averaging the cross-correlation functions over time. We also expect to integrate the entire processing stage into one FPGA of medium density, by using the dynamic reconfiguration capability of these circuits.
7. References

1. G. Najm, "Comparison of alternative crash avoidance sensor technologies", Proceedings of the SPIE, vol. 2344, Intelligent Vehicle Highway Systems, 1994.

2. Brooke, "Radar Beams", Automotive Industries, December 1992.

3. Scott, "Radar Mirror Has OE-Potential", Automotive Engineer, August-September 1993.

4. E. Jackson, J. A. Himelick and C. D. Wright, "Commercial Vehicle Applications of Object Detection Systems", Int. Congress on Transportation Electronics, Dearborn, October 1992.

5. Polaroid Corp., "Ultrasonic Ranging System. For Accurate Distance Measurement and Object Detection", Cambridge, MA.

6. T. Clemence and G. W. Hurlbut, "The Application of Acoustic Ranging to the Automatic Control of a Ground Vehicle", IEEE Trans. on Vehicular Technology, vol. VT-32, no. 3, August 1983.

7. Facon, "EDOSI: un systeme d'etude du deplacement d'objets a partir de sequences d'images", Ph.D. thesis, Universite de Technologie de Compiegne, December 1987.

8. Ens and Z. N. Li, "Real Time Motion Stereo", Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, 15-18 June 1993.

9. Ancona and T. Poggio, "Optical Flow from 1D Correlation: Application to a Simple Time-to-Crash Detector", Proceedings of the IEEE Fourth Int. Conf. on Computer Vision, Berlin, 11-14 May 1993.

10. Xiong and S. A. Shafer, "Depth from Focusing and Defocusing", Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, 15-18 June 1993.

11. Surya and M. Subbarao, "Depth from Defocus by Changing Camera Aperture: A Spatial Domain Approach", Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, 15-18 June 1993.

12. Aloimonos, I. Weiss and A. Bandyopadhyay, "Active Vision", Int. J. of Computer Vision, vol. 1, 1988.

13. Elnagar and A. Basu, "Motion detection using background constraint", Pattern Recognition, vol. 28, no. 10, 1995.

14. L. Duckworth, M. L. Frey, C. E. Remer, S. Ritter and G. Vidaver, "Comparative study of non-intrusive traffic monitoring sensors", Proceedings of the SPIE, vol. 2344, Intelligent Vehicle Highway Systems, 1994.