`
`JOSEPH F. TRAUB, Editor
`Columbia University
`
`BARBARA J. GROSZ, Associate Editor
Harvard University
`
`BUTLER W. LAMPSON, Associate Editor
`Digital Equipment Corporation
`
`NILS J. NILSSON, Associate Editor
`Stanford University
`
`ANNUAL REVIEWS INC 4139 EL CAMINO WAY P.O. BOX 10139 PALO ALTO, CALIFORNIA 94303-0897
`
`
`
`
`ANNUAL REVIEWS INC.
`Palo Alto, California, USA
`
`COPYRIGHT © 1987 BY ANNUAL REVIEWS INC., PALO ALTO, CALIFORNIA, USA. ALL
`RIGHTS RESERVED. The appearance of the code at the bottom of the first page of an
article in this serial indicates the copyright owner's consent that copies of the
`article may be made for personal or internal use, or for the personal or internal use
`of specific clients. This consent is given on the condition, however, that the copier
`pay the stated per-copy fee of $2.00 per article through the Copyright Clearance
Center, Inc. (21 Congress Street, Salem, MA 01970) for copying beyond that
`permitted by Sections 107 or 108 of the US Copyright Law. The per-copy fee of
`$2.00 per article also applies to the copying, under the stated conditions, of articles
`published in any Annual Review serial before January 1, 1978. Individual readers,
`and nonprofit libraries acting for them, are permitted to make a single copy of an
article without charge for use in research or teaching. This consent does not extend
`to other kinds of copying, such as copying for general distribution, for advertising
`or promotional purposes, for creating new collective works, or for resale. For such
`uses, written permission is required. Write to Permissions Dept., Annual Reviews
`Inc., 4139 El Camino Way, P.O. Box 10139, Palo Alto, CA 94303-0897 USA.
`
`International Standard Serial Number: 8756-7016
`International Standard Book Number: 0-8243-3202-4
`
`Annual Review and publication titles are registered trademarks of Annual Reviews
`Inc.
`
`Annual Reviews Inc. and the Editors of its publications assume no responsibility
`for the statements expressed by the contributors to this Review.
`
`TYPESET BY AUP TYPESETTERS (GLASGOW) LTD., SCOTLAND
`PRINTED AND BOUND IN THE UNITED STATES OF AMERICA
`
`
`
`
`Annual Review of Computer Science
`Volume 2, 1987
`
`CONTENTS
`
ARTIFICIAL INTELLIGENCE
Common Lisp, Scott E. Fahlman    1
Using Reasoning About Knowledge to Analyze Distributed Systems, Joseph Y. Halpern    37
The Emerging Paradigm of Computational Vision, Steven W. Zucker    69
Nonmonotonic Reasoning, Raymond Reiter    147
Logic, Problem Solving, and Deduction, Drew V. McDermott    187
Planning, Michael P. Georgeff    359
Language Generation and Explanation, Kathleen R. McKeown and William R. Swartout    401
Search Techniques, Judea Pearl and Richard E. Korf    451
Vision and Navigation for the Carnegie-Mellon Navlab, Charles Thorpe, Martial Hebert, Takeo Kanade and Steven Shafer    521

HARDWARE
Techniques and Architectures for Fault-Tolerant Computing, Roy A. Maxion, Daniel P. Siewiorek and Steven A. Elkind    469

SOFTWARE
Knowledge-Based Software Tools, David R. Barstow    21
Network Protocols and Tools to Help Produce Them, Harry Rudin    291

THEORY
Computer Algebra Algorithms, Erich Kaltofen    91
Linear Programming (1986), Nimrod Megiddo    119
Algorithmic Geometry of Numbers, Ravi Kannan    231
Research on Automatic Verification of Finite-State Concurrent Systems, E. M. Clarke and O. Grumberg    269

APPLICATIONS
Computer Applications in Education: A Historical Overview, D. Midian Kurland and Laura C. Kurland    317

INDEXES
Subject Index    557
Cumulative Index of Contributing Authors, Volumes 1-2    564
Cumulative Index of Chapter Titles, Volumes 1-2    565
`
`
`
`
`Ann. Rev. Comput. Sci. 1987. 2: 521-56
Copyright © 1987 by Annual Reviews Inc. All rights reserved
`
`VISION AND NAVIGATION
`FOR THE CARNEGIE-MELLON
`NAVLAB
`
`Charles Thorpe, Martial Hebert, Takeo Kanade,
`Steven Shafer, and the members of the
`Strategic Computing Vision Lab
`
`Department of Computer Science, Carnegie-Mellon University,
`Pittsburgh, Pennsylvania 15213
`
1. INTRODUCTION
`
`Robotics is one place where Artificial Intelligence meets the real world. AI
`deals with symbols, rules, and abstractions, reasoning about concepts and
`relationships. The real world, in contrast, is tangible, full of exceptions to
`the rules, and often stubbornly difficult to reduce to logical expressions.
`Robots must span that gap. They live in the real world and must sense,
`move, and manipulate real objects. Yet to be intelligent, they must also
`reason symbolically. The gap is especially pronounced in the case of
`outdoor mobile robots. The outdoors is constantly changing, due to wind
`in trees, changing sun positions, even due to a robot's own tracks from
`previous runs. And mobility means that a robot is always encountering
new and unexpected events, so static models or preloaded maps are inadequate
to represent the robot's world.
`The tools a robot uses to bridge the chasm between the external world
`and its internal representation include sensors, image understanding to
`interpret sensed data, geometrical reasoning, and concepts of time and of
`motion over time. We are studying those issues by building a mobile robot,
`the Carnegie-Mellon Navlab, and giving it methods of understanding the
`world. The Navlab has perception routines for understanding color video
`images and for interpreting range data. CODGER, our "whiteboard,"
`proposes a new paradigm for building intelligent robot systems. The
`CODGER tools, developed for the Navlab and its smaller cousin the
`
`Terregator, handle much of the modeling of time and geometry, and
provide for synchronization of multiple processes. Our architecture coordinates
control and information flow between the high-level symbolic processes running
on general-purpose computers, and the lower-level control running on dedicated
real-time hardware. The system built from these
`tools is now capable of driving the Navlab along narrow asphalt paths
`near campus while avoiding trees and pausing for joggers that get in its
`way.
`This report describes the Navlab (Singh 1986) and the software we have
`built over the past year: color vision, for finding and following roads
`(Thorpe 1986); 3-D perception, for obstacle avoidance (Hebert 1986); and
`the CODGER whiteboard (Shafer 1986).
`
`2. NAVLAB: NAVIGATION LABORATORY
`
`Navlab is a self-contained laboratory for navigational vision system
`research (see Figures 1 and 2). The motivation for building the Navlab
`came from our earlier experience with the Terregator, a six-wheeled vehicle
`teleoperated from a host computer through a radio link. The Terregator
`had been a reliable workhorse for small-scale experiments, such as the
`Campus Sidewalk navigation system (Goto 1986), but we outgrew its
`capabilities. As we began to experiment with sensor fusion, the Terregator
ran out of space and power for multiple sensors. When we wanted to
expand our test areas, communications to a remote computer in the lab
became more difficult. And as the experiments became more sophisticated,
we found it more productive for the experimenters to test or debug new
programs near or in the vehicle, instead of in a remotely located laboratory.

Figure 1 The Navlab.

Figure 2 Navlab interior.
`
`All these factors culminated in the design and construction of the Navlab
`(Singh 1986).
`Navlab is based on a commercial van chassis, with hydraulic drive and
`electric steering. Computers can steer and drive the van by electric and
`hydraulic servos, or a human driver can take control to drive to a test site
`or to override the computer. The Navlab has room for researchers and
`computers on board, and has enough power and space for all our existing
`and planned sensors. This gets the researchers close to the experiments,
`and eliminates the need for video and digital communications with remote
`computers.
`Features of the Navlab include:
`
`• Onboard computers: We have five computer racks, one for low-level
`controllers and power smoothing, one for video distribution, VCRs,
communications and miscellaneous equipment, two racks for general-purpose
processors (currently Sun workstations), and one for a Warp
`processor.
`• Onboard researchers: There is always a safety driver in the driver's seat.
There is room for four researchers in the back, with a terminal or
`workstation for each. An overhead shelf holds video monitors and
`additional terminals. The researchers can monitor both their programs
`and the vehicle's motion.
`• Onboard power: The Navlab carries two 5500-W generators, plus power
`conditioning and battery backup for critical components.
`• Onboard sensors: Above the cab is a pan mount carrying our laser
`scanner and a mounting rail for a color TV camera. There will eventually
`be a separate pan/tilt mount for stereo cameras.
`• Evolving controller: The first computer controller for the Navlab is
`adequate for our current needs. It steers the van along circular arcs, and
`has commands to set speed and acceleration and to ask for the current
`dead-reckoned position estimate. The controller will evolve to do
`smoother motion control and to interface with an inertial guidance
system (possibly even with GPS satellite navigation). It will also eventually
watch vital signs such as computer temperature and vehicle
`hydraulic pressure.
`
`3. COLOR VISION
`
`The first application of a mobile robot is to find and follow roads using a
`video camera. Two classes of algorithms can be used for this problem:
`road edge extraction, and color classification. Algorithms of the first class
`attempt to extract the edges of the road as seen by a camera, backprojecting
`them on the ground plane in order to compute the appropriate steering
commands for the vehicle (Davis et al 1986, Waxman et al 1985). Algorithms
of the second class classify the pixels of a color image as being on-road
or off-road based on the color characteristics of the road (Turk et al
`1987).
The Navlab uses color vision, specifically multi-class adaptive color
classification, to find and follow roads. Image points are classified into "road"
`or "non-road" principally on the basis of their color. Since the road is not
`a uniform color, color classification must have more than one road model,
`or class, and more than one non-road class. Because conditions change
`from time to time and from place to place over the test course, the color
models must be adaptable. Once the image is classified, the road is identified
by means of an area-based voting technique that finds the most likely
`location for the road in the image.
`
`3.1 Vision Principles for the Real World
`We based the development of our vision system on the following principles:
`
`Assume variation and change On sunny days there are shadowed areas,
`sunlit areas, and patches with dappled sunlight. On rainy days, there are
`dry patches and wet patches. Some days there are wet, dry, sunny, and
`shadowed areas all in the same image. The road has clean spots and other
`places covered with leaves or with drips of our own hydraulic fluid. And
`as the sun goes behind a cloud or as the vehicle turns, lighting conditions
`change. We therefore need more than one road and non-road color model
`at any one time; those color models must adapt to changing conditions;
`and we need to process images frequently so that the change from one
`image to the next will be moderate.
`
`Use few geometric parameters A complete description of the road's shape
`in an image can be complex. The road can bend gently or turn abruptly,
can vary in width, and can go up- or downhill. However, the more parameters
there are, the greater the chance of error in finding those
`parameters. Small misclassifications in an image could give rise to fairly
`large errors in perceived road geometry. Furthermore, if all the road
`parameters can vary, there are ambiguous interpretations: Does the road
`actually rise, or does it instead get wider as it goes? We describe the road
with only two free parameters: its orientation and its distance from the
`vehicle. Road width is fixed, we assume a flat world, and we decree that
`the road is straight. While none of these assumptions is true over a long
`stretch of the road, they are nearly true within any one image; and the
errors in road position that originate in our oversimplifications are balanced
by the smaller chance of bad interpretations. If our system classifies
`a few pixels incorrectly as road, the worst it will do is to find a slightly
`incorrect road. A method that tries to fit more parameters, on the other
`hand, may interpret parts of the road perfectly, but could find an abrupt
`turn or sudden slope near any bad pixels.
`
`Work in the image The road can be found either by projecting the road
shape into the image and searching in image coordinates, or by back-projecting
the image onto the ground and searching in world coordinates.
`The problem with the latter approach comes in projecting the image onto
`an evenly spaced grid in the world. The points on the world grid close to
`the vehicle correspond to a big area in the lower part of the image; points
`farther away may correspond to one or a few pixels near the top. Unless
`one uses a complex weighting scheme, some image pixels (those at the top
`that project to distant world points) will have more weight than other
`(lower) points. A few noisy pixels can then have a big or a small effect,
`depending on where in the image they lie. On the other hand, working
`directly in the image makes it much easier to weight all pixels evenly. We
`can directly search for the road shape that has the most road pixels and
`the fewest non-road pixels. Moreover, projecting a road shape is much
`more efficient than back-projecting all the image pixels.
`
`Calibrate directly A complete description of a camera must include its
`position and orientation in space, its focal length and aspect ratio, lens
`effects such as fisheye distortion, and nonlinearities in the optics or sensor.
The general calibration problem of trying to measure each of these variables
is difficult. It is much easier, and more accurate, to calibrate the
`whole system than to tease apart the individual parameters. The easiest
`method is to take a picture of a known object and build a lookup table
`that relates each world point to an image pixel and vice versa. Projecting
`road predictions into the image and back-projecting detected road shapes
`onto the world are done by means of table lookup (or table lookup for
`close-by values, with simple interpolations). Such a table is straightforward
`to build and provides good accuracy, and there are no instabilities in the
`calculations.
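
As a concrete illustration of such a table, the following sketch back-projects an image location onto the ground by interpolating between stored calibration entries. The grid size, data layout, and names here are illustrative assumptions, not the actual Navlab code; only the table-lookup idea comes from the text.

#define CAL_ROWS 30          /* one calibration entry per reduced image pixel (an assumption) */
#define CAL_COLS 32

typedef struct { double x, y; } GroundPoint;

/* cal[r][c] holds the ground-plane point seen by image pixel (r, c),
   filled in once from pictures of a known object. */
static GroundPoint cal[CAL_ROWS][CAL_COLS];

/* Back-project an image location by bilinear interpolation between the
   four surrounding table entries. */
GroundPoint image_to_ground(double row, double col)
{
    int r0 = (int)row, c0 = (int)col;
    double fr, fc;
    GroundPoint p;

    if (r0 < 0) r0 = 0;
    if (c0 < 0) c0 = 0;
    if (r0 > CAL_ROWS - 2) r0 = CAL_ROWS - 2;
    if (c0 > CAL_COLS - 2) c0 = CAL_COLS - 2;
    fr = row - r0;
    fc = col - c0;

    p.x = (1-fr)*((1-fc)*cal[r0][c0].x + fc*cal[r0][c0+1].x)
        +    fr *((1-fc)*cal[r0+1][c0].x + fc*cal[r0+1][c0+1].x);
    p.y = (1-fr)*((1-fc)*cal[r0][c0].y + fc*cal[r0][c0+1].y)
        +    fr *((1-fc)*cal[r0+1][c0].y + fc*cal[r0+1][c0+1].y);
    return p;
}

Projecting in the opposite direction, from the ground into the image, works the same way with a table indexed by world coordinates.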
`
`Use outside constraints Even without a map of our test course or an
`expensive inertial navigation system, we know (based on the previous
`image and on vehicle motion) approximately where the road should be.
`Our "whiteboard," described in Section 5, can predict where the road
`should appear if the road were straight and vehicle navigation were perfect.
`Adding a suitable margin for curved roads and sloppy navigation still
`gives useful limits on where in the image to look for the road.
`
`
`Test with real data We ran our VCR nearly every time we took the
`vehicle out, to collect images under as many conditions as possible. We
`recorded sunny days, cloudy days, rainy days, leaves on trees, leaves
`turning color, leaves falling, early morning, noon, after dusk, even a partial
`solar eclipse. Strategies that worked well on one set of images did not
`always work on the others. We selected the toughest images, ran our best
`algorithms and printed the classification results, changed parameters or
`algorithms, reran the data set, and compared results. This gave us the best
`chance of being methodical and of not introducing new bugs as we went.
`When the image processing worked to our satisfaction, we ran simulations
`in the lab that included the whiteboard, range processing, path planning,
`and a vehicle simulator, with the vision component processing stored
`images and interacting with the rest of the system. When the simulations
`worked in the lab, we moved them to the vehicle. Only after the simulations
`worked on the vehicle's computers, and we were sure that all necessary
`software was on the van, did we go into the field for real tests. Even then
`not everything worked, but there were many fewer bugs than there would
`have been without the simulations and tests.
`
`3.2 Road Following Algorithm
`We followed the same principles in building and tuning adaptive color
`classification for following roads. Figure 3 shows a relatively simple scene
`to help explain our algorithm. As shown in Figure 4, the algorithm involves
`three stages: 1. Classify each pixel. 2. Use the results of classification to
`vote for the best-fit road position. 3. Collect new color statistics based on
`the detected road and non-road regions. Pixel classification is done by
`standard pattern classification. Each class is represented by the means,
`variances, and covariances of red, green, and blue values, and by its a
`priori likelihood based on expected fraction of pixels in that class. For
`each pixel, calculating the class to which it most likely belongs involves
`finding how far the pixel's values lie from the mean of each class, where
`distance is measured in standard deviations of that class. Figures 5 and 6
`show how each pixel is classified and how well it matches.
`Once each point has been classified, we must find the most likely location
of the road. We assume the road is locally flat, straight, and has parallel
`sides. The road geometry can then be described by two parameters as
`shown in Figure 7: 1. The intercept, which is the image column of the
`road's "vanishing point." This is where the road centerline intercepts the
horizon (or more precisely the vanishing line of the locally flat plane of
`the road; since the camera is fixed to the vehicle, this vanishing line is
`constant independent of the vehicle's pitch, roll, and yaw). The intercept
`gives the road's direction relative to the vehicle. 2. The orientation of the
`road in the image, which tells how far the vehicle is to the right or left of
`the centerline.
`We set up a two-dimensional parameter space, with intercept as one
`dimension and orientation as the other. Each point classified as road votes
`for all road intercept/orientation combinations to which it could belong,
`while non-road points cast negative votes, as shown in Figures 8 and 9. The
`orientation/intercept pair that receives the most votes is the one that
`contains the most road points, and it is reported as the road. For the case
`of Figure 3, the votes in orientation/intercept space look like Figure 10.
Figure 11 shows the detected position and orientation of the road. Because
this method does not rely on the exact local geometry of the road, it is very
robust: the road may actually curve or not have parallel edges, and the
segmentation may not be completely correct, yet the answer it produces is still
adequate to generate an appropriate steering command.
Figure 3 Original image.

Once the road has been found in an image, the color statistics of the
road and offroad models are modified for each class by resampling the
detected regions (Figure 12) and updating the color models. The updated
`color statistics will gradually change as the vehicle moves into a different
road color, as lighting conditions change, or as the colors of the surrounding
grass, dirt, and trees vary. As long as the processing time per
`image is low enough to provide a large overlap between images, the
`statistics adapt as the vehicle moves. The road is picked out by hand in
`the first image. Thereafter, the process is automatic, using the segmentation
`from each image to calculate color statistics for the next.
`There are several variations on this basic theme. One variation is to
`smooth the images first. This throws out outliers and tightens the road
`and non-road clusters. Another is to have more than one class for road
`and for non-road, for instance, one for wet road and one for dry, or one
for shadows and one for sun. Other variations change the voting for best road.
`
Figure 4 Color vision for road following, including color classification, Hough transform
for road region detection, and updating multiple road and nonroad models. (Block diagram:
image; road/non-road classification; road model with width, position, orientation, and
surface appearance (RGBT); non-road model appearances (RGBT); self-clustering and update;
vehicle motion.)
`
`
`
`
`Figure 5 Segmented image. Color and texture cues are used to label points below the
`horizon into two road and two offroad classes.
`
`Figure 6 Road probability image. The pixels that best match typical road colors are
`displayed brightest.
`
`
`
`
`
Figure 7 Hough transform that considers the geometry of road position and orientation.
Geometry of locally flat, straight, and parallel road regions can be described by only
P (road direction relative to the vehicle) and θ (vehicle position relative to the road
center). Point A classified as road could be a part of the road with the shown combination
of (P, θ), and thus casts a positive vote for it. Point B classified as off-road, however,
will cast a negative vote for that (P, θ) combination.
`
Besides adding votes for road pixels, we subtract votes for non-road
`points. Votes are weighted according to how well each point matches road
or non-road classes. Finally, an image contains clues other than color:
for instance, visual texture. Roads tend to be smooth, with less high-frequency
variation than grass or leaves, as shown in Figure 13. We
`calculate a normalized texture measure and use that in addition to color
`in the road classification.
`
`
`Figure 8 A point classified as road could be part of roads with different orientations and
`vanishing points.
`
Figure 9 The point from Figure 8 would vote for these orientation/intercept values in the
parameter space.
`
`
`Figure 10 Votes for best road orientation and intercept, and point with most votes (dark
`square), for the road in Figure 3.
`
`Figure 11 Detected road, from the point with the most votes shown in Figure 10.
`
`
`Figure 12 Updating road and nonroad model colors, leaving safety zone around the
`detected road region.
`
3.3 Implementation, Details, and Results
`The implementation of road following runs in a loop of six steps: image
`reduction, color classification, texture classification, combining color and
`texture results, voting for road position, and color update. These steps are
`shown in Figure 14 and are explained in detail below.
`
`IMAGE REDUCTION We create a pyramid of reduced-resolution R, G, and
`B images. Each smaller image is produced by simple 2 x 2 averaging of
`the next larger image. Other reduction methods, such as median filtering,
`are more expensive and produce no noticeable improvement in the system.
`We start with 480 x 512 pixel images and typically use the images reduced
`to 30 x 32 for color classification. We use less-reduced versions of the
`images for texture classification. Image reduction is used mainly to improve
`speed, but as a side effect the resulting smoothing reduces the effect of
`scene anomalies such as cracks in the pavement.
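
As an illustrative sketch (array layout and names are assumptions for exposition, not the actual Navlab code), one level of that reduction for a single color band might be written:

/* One level of image reduction by 2 x 2 averaging.  Applied separately to
   the R, G, and B bands, and repeated until the 480 x 512 input reaches
   the sizes used for classification. */
void reduce_2x2(const unsigned char *src, int rows, int cols, unsigned char *dst)
{
    int r, c;
    int out_cols = cols / 2;

    for (r = 0; r < rows / 2; r++)
        for (c = 0; c < out_cols; c++)
            dst[r * out_cols + c] = (unsigned char)
                ((src[(2*r)   * cols + 2*c] + src[(2*r)   * cols + 2*c + 1] +
                  src[(2*r+1) * cols + 2*c] + src[(2*r+1) * cols + 2*c + 1] + 2) / 4);
}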
`
Figure 13 Zoomed picture of road-nonroad boundary. The road (at left) is much less
textured than the grass (at right).

COLOR CLASSIFICATION Each pixel (in the 30 x 32 reduced image) is
labeled as belonging to one of the road or non-road classes by standard
maximum-likelihood classification. We usually have two road and two
`non-road classes. Each class is represented by the mean R, G, and B values
`of its pixels, by a 3 x 3 covariance matrix, and by the fraction of pixels
`expected a priori to be in that class. The classification procedure calculates
`the probability that a pixel belongs to each of the classes, assigns the label
`of the most probable class, and records the maximum road and non-road
`probabilities for each pixel.
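
In outline, that per-pixel step might look like the sketch below. The data layout, names, and the use of precomputed inverse covariances are illustrative assumptions, but the score is the standard Gaussian log-likelihood the text describes: distance from the class mean measured in standard deviations of that class, combined with the a priori fraction of pixels.

#define NCLASSES 4                      /* two road and two non-road classes */

typedef struct {
    double mean[3];                     /* mean R, G, B of the class */
    double inv_cov[3][3];               /* inverse of the 3 x 3 covariance matrix */
    double log_det;                     /* log determinant of the covariance */
    double log_prior;                   /* log of the expected fraction of pixels */
    int    is_road;
} ColorClass;

/* Log-likelihood (up to a constant) that pixel rgb belongs to class k. */
static double class_score(const ColorClass *k, const double rgb[3])
{
    double d[3], q = 0.0;
    int i, j;
    for (i = 0; i < 3; i++) d[i] = rgb[i] - k->mean[i];
    for (i = 0; i < 3; i++)
        for (j = 0; j < 3; j++)
            q += d[i] * k->inv_cov[i][j] * d[j];     /* squared Mahalanobis distance */
    return k->log_prior - 0.5 * k->log_det - 0.5 * q;
}

/* Label one pixel and record the best road and non-road scores. */
int classify_pixel(const ColorClass cls[NCLASSES], const double rgb[3],
                   double *best_road, double *best_nonroad)
{
    int k, best = 0;
    double best_s = -1e30;
    *best_road = *best_nonroad = -1e30;
    for (k = 0; k < NCLASSES; k++) {
        double s = class_score(&cls[k], rgb);
        if (s > best_s) { best_s = s; best = k; }
        if (cls[k].is_road  && s > *best_road)    *best_road = s;
        if (!cls[k].is_road && s > *best_nonroad) *best_nonroad = s;
    }
    return best;
}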
`
`TEXTURE CALCULATION This is composed of six substeps:
`
`• Calculate texture at high resolution by running a Roberts operator over
`the 240 x 256 image.
`• Calculate a low-resolution texture by applying a Roberts operator to
`the 60 x 64 image.
• Normalize the texture by dividing the high-resolution texture by a combination
of the average pixel value for that area (to handle shadow
`interiors) and the low-resolution texture (to remove the effect of shadow
`boundaries). The average pixel value is the value from the corresponding
`pixel in the 120 x 128 reduced image.
`
normalized gradient = high-freq. gradient / (α x low-freq. gradient + β x mean pixel value)

Typical values for the coefficients are α = 0.2 and β = 0.8.
`• Threshold. Produce a binary image of "microedges" by thresholding
`the normalized gradient. A fairly low threshold, such as 1, is usually
`adequate.
`• Count Edges. Count the number of edges in each pixel block. This gives
`a 30 x 32 pixel texture magnitude image. Figure 15 shows the texture
`image derived from Figure 3. Each texture pixel has a value between 0
`and 256, which is the number of pixels in the corresponding area of the
`full-resolution image that are microedges.
• Classify Texture. Classify each pixel in the 30 x 32 image as road or
non-road on the basis of texture, and calculate a confidence for each
label. We found experimentally that a fixed mean and standard deviation
for road and non-road textures were better than adaptive texture parameters.
Our best results were with road mean and standard deviation of 0 and 25,
and non-road values of 175 and 100. Effectively, any pixel block of the
image with more than 35 microedges above threshold is considered textured,
and is therefore classified as non-road (see the sketch after this list).
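
The following sketch ties those substeps together. The array sizes, block bookkeeping, and names are illustrative assumptions; the Roberts operator, the normalization with α = 0.2 and β = 0.8, and the microedge threshold of 1 come from the steps above.

#include <stdlib.h>

#define ALPHA 0.2
#define BETA  0.8

/* Roberts gradient magnitude at (r, c): sum of absolute cross differences. */
static int roberts(const unsigned char *img, int cols, int r, int c)
{
    return abs(img[r*cols + c]     - img[(r+1)*cols + c + 1]) +
           abs(img[r*cols + c + 1] - img[(r+1)*cols + c]);
}

/* hi: 240 x 256 image, lo: 60 x 64 image, mean: 120 x 128 image.
   texture[r][c] counts the microedges in the block of the high-resolution
   gradient that corresponds to pixel (r, c) of the 30 x 32 texture image. */
void texture_image(const unsigned char *hi, const unsigned char *lo,
                   const unsigned char *mean, int texture[30][32])
{
    int r, c;

    for (r = 0; r < 30; r++)
        for (c = 0; c < 32; c++)
            texture[r][c] = 0;

    for (r = 0; r + 1 < 240; r++) {
        for (c = 0; c + 1 < 256; c++) {
            int    rl = r/4 < 59 ? r/4 : 58, cl = c/4 < 63 ? c/4 : 62;
            double g    = roberts(hi, 256, r, c);
            double norm = g / (ALPHA * roberts(lo, 64, rl, cl) +
                               BETA  * mean[(r/2)*128 + c/2] + 1e-6);
            if (norm >= 1.0)                  /* microedge threshold of 1 */
                texture[r/8][c/8]++;
        }
    }
}

The resulting per-block counts are then labeled road or non-road with the fixed texture means and standard deviations given in the last step above.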
`
`Figure 14 Processing cycle for color vision.
`
`
`Figure 15 Low-resolution texture image, showing textures from Figure 3. The brighter
`blocks are image areas with more visual texture.
`
`
`COMBINATION OF COLOR AND TEXTURE RESULTS Color is somewhat more
`reliable than texture, so the color probabilities are weighted somewhat
`more than the probabilities calculated by texture. The result of this step is
`a final classification into road or non-road, and a "confidence" calculated
`by
`
Max(road confidence, non-road confidence) - Min(road confidence, non-road confidence).
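
A minimal sketch of that combination follows. The linear weight w favoring color over texture is an assumption for illustration; the text says only that color is weighted somewhat more, without giving a value.

/* Combine color and texture evidence for one pixel.  w > 0.5 weights the
   color probabilities more heavily than the texture ones. */
double combined_confidence(double color_road, double color_nonroad,
                           double tex_road, double tex_nonroad,
                           double w, int *label_is_road)
{
    double road    = w * color_road    + (1.0 - w) * tex_road;
    double nonroad = w * color_nonroad + (1.0 - w) * tex_nonroad;
    *label_is_road = road > nonroad;
    /* "confidence" = Max(road, non-road) - Min(road, non-road) */
    return road > nonroad ? road - nonroad : nonroad - road;
}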
`
`VOTE FOR BEST ROAD POSITION This step uses a 2-D parameter space
`similar to a Hough transform. Parameter 1 is the column of the road's
`vanishing point, quantized into 32 buckets because the image on which
`
`the classification and voting are based has 32 columns. Parameter 2 is the
`road's angle from vertical in the image, ranging from -1 to 1 radian in
`0.1 radian steps. A given road point votes for all possible roads that
`would contain that point. The locus of possible roads whose centerlines go
`through that point is an arctangent curve in the parameter space. Because
`the road has a finite width, the arctan curve has to be widened by the width
of the road at that pixel's image row. Road width for a given row is not
constant over all possible road angles, but it is close enough to constant
that the exact calculation is not worth its expense. Each pixel's vote is
`weighted by its calculated confidence. Pixels classified as non-road cast
`negative votes (with their weights reduced by a factor of 0.2) while road
`pixels add votes. In pseudo C code, the voting for a pixel at (row, col) is
`
for (theta = -1; theta <= 1; theta += 0.1) {
    center = col + arctan(theta);
    for (c = center - width/2; c <= center + width/2; c++) {
        parameter_space[theta][c] += confidence;
    }
}
`
`At the end of voting, one road intercept/angle pair will have the most
`votes. That intercept and angle describe the best road shape in the scene.
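
Expanding the pseudo code above into a full accumulation over the classified 30 x 32 image might look like the sketch below. The array names, the per-row half-width table, and the bucket bookkeeping are assumptions for illustration; the 0.1-radian steps, the negative votes scaled by 0.2, and the confidence weighting come from the text, and the vanishing-point offset simply mirrors the pseudo code above.

#include <math.h>

#define ROWS   30
#define COLS   32
#define NTHETA 21                 /* -1 to 1 radian in 0.1 radian steps */

/* is_road and confidence come from the combined color/texture step;
   half_width[r] is the road half-width, in columns, at image row r. */
void vote_for_road(const int is_road[ROWS][COLS],
                   const double confidence[ROWS][COLS],
                   const double half_width[ROWS],
                   int *best_theta, int *best_intercept)
{
    static double space[NTHETA][COLS];
    int r, c, t, i;

    for (t = 0; t < NTHETA; t++)
        for (i = 0; i < COLS; i++)
            space[t][i] = 0.0;

    for (r = 0; r < ROWS; r++)
        for (c = 0; c < COLS; c++) {
            /* road pixels add their confidence; non-road pixels subtract
               0.2 times theirs */
            double w = is_road[r][c] ? confidence[r][c]
                                     : -0.2 * confidence[r][c];
            for (t = 0; t < NTHETA; t++) {
                double theta  = -1.0 + 0.1 * t;
                double center = c + atan(theta);   /* as in the pseudo code */
                int lo = (int)floor(center - half_width[r]);
                int hi = (int)ceil(center + half_width[r]);
                for (i = lo < 0 ? 0 : lo; i <= hi && i < COLS; i++)
                    space[t][i] += w;
            }
        }

    *best_theta = 0;
    *best_intercept = 0;
    for (t = 0; t < NTHETA; t++)
        for (i = 0; i < COLS; i++)
            if (space[t][i] > space[*best_theta][*best_intercept]) {
                *best_theta = t;
                *best_intercept = i;
            }
}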
`
`COLOR UPDATE The parameters of the road and non-road classes need to
`be recalculated to reflect changing colors. We divide the image into four
regions plus a "safety zone": left offroad, right offroad, upper road, and
lower road. We leave a 64-pixel wide "safety zone" along the road boundary,
which allows for small errors in locating the road, or for limited road
`curvature. For each of the four regions, we calculate the means of red,
`green, and blue. We use the calculated parameters to form four classes,
`and reclassify the image using a limited classification scheme. The limited
`reclassification allows road pixels to be classified as either of the two road
`classes, but not as non-road, and allows non-road pixels to be reclassified
`only as one of the non-road classes. The reclassified pixels are used as
masks to recalculate class statistics. The loop of classify-pixels/recalculate-statistics
is repeated, typically 3 times or until no pixels switch classes. The
`final reclassified pixels are used to calculate the means, variances, and
`covariances of R, G, and B for each of the classes, to be used to classify
`the next image. Limited reclassification is based on distance from a pixel's
values to the mean values of a class, rather than the full maximum likelihood
scheme used in classifying a new image. This tends to give classes
`based on tight clusters of pixel values, rather than lumping all pixels into
`classes with such wide variance that any pixel value is considered likely.
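
A compact sketch of that loop follows. The flat pixel layout and names are illustrative assumptions; the restriction of road pixels to road classes, the distance-to-mean assignment, and the limit of about three passes come from the text. The full means, variances, and covariances for the next image would then be computed from the final labels.

#define NPIX (30 * 32)

/* Classes 0 and 1 are road, 2 and 3 are non-road.  Road pixels may move
   only between the road classes and non-road pixels only between the
   non-road classes; assignment is by distance to the class mean. */
void limited_reclassify(const double rgb[NPIX][3], const int is_road[NPIX],
                        double mean[4][3], int label[NPIX])
{
    int iter, p, k, i, changed = 1;

    for (iter = 0; iter < 3 && changed; iter++) {   /* typically 3 passes */
        double sum[4][3] = {{0.0}};
        int    n[4] = {0};
        changed = 0;

        for (p = 0; p < NPIX; p++) {
            int first = is_road[p] ? 0 : 2;
            int best = first;
            double bestd = 1e30;
            for (k = first; k < first + 2; k++) {
                double d = 0.0;
                for (i = 0; i < 3; i++)
                    d += (rgb[p][i] - mean[k][i]) * (rgb[p][i] - mean[k][i]);
                if (d < bestd) { bestd = d; best = k; }
            }
            if (label[p] != best) { label[p] = best; changed = 1; }
            for (i = 0; i < 3; i++) sum[best][i] += rgb[p][i];
            n[best]++;
        }
        for (k = 0; k < 4; k++)
            if (n[k] > 0)
                for (i = 0; i < 3; i++)
                    mean[k][i] = sum[k][i] / n[k];
    }
}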
`
`
`CALIBRATION There is no need for complete geometric calibration. The
`vision algorithms calculate the road's shape (road width and location of
`the horizon) from the first training image. We also take two calibration
pictures, with a meter stick placed perpendicular to the vehicle, 8 and 12 m
`in front. Then, during the run, given the centerline of a detected road
`in image coordinates, it is easy to get the x position of the road at 8 and
12 m and then to calculate the vehicle's position on the road.
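
One way that last step might be computed is sketched below; the formulas are an illustrative reconstruction (fitting a line through the two calibrated rows), not taken from the Navlab code.

#include <math.h>

/* x8 and x12 are the road centerline's lateral offsets, in meters, at the
   calibrated 8 m and 12 m rows (positive to one side of the camera axis). */
void vehicle_on_road(double x8, double x12, double *heading, double *offset)
{
    double slope = (x12 - x8) / (12.0 - 8.0);   /* lateral change per meter ahead */
    *heading = atan(slope);                     /* road direction relative to vehicle */
    *offset  = x8 - 8.0 * slope;                /* centerline offset extrapolated back
                                                   to the vehicle; its sign tells which
                                                   side of the centerline the vehicle is on */
}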
`
`PERFORMANCE This algorithm is reliable. Running on the Navlab, with
`predictions of where the road should appear, our failure rate is close to 0.
`The occasional remaining problems come from one of three causes:
`
`• The road is covered with leaves or snow, so one road color class and
`one non-road color class are indistinguishable.
`• Drastic changes in illumination occur between pictures (e.g. the sun
suddenly emerges from behind a cloud) so all the colors change dramatically
from one image to the next.
`• Sunlight is so bright and shadows are so dark in the same scene that we
`hit the hardware limits of the camera. It is possible to have pixels so
`bright that all color is washed out, and others in the same image so dark
`that all color is lost in the noise.
`
`Not every image is classified perfectly, but almost all are good enough for
`navigation. We sometimes find the road rotated in the image from its
`correct location, so we report an intercept off to one side and an angle off
to the other side. But since the path planner looks ahead about the same
distance as the center of the image, the steering target is still in approximately
the correct location, and the vehicle stays on the road. This algorithm runs
in about 10 sec p