AD-A249 972

Neural Network Perception for Mobile Robot Guidance

Dean A. Pomerleau
February 16, 1992
CMU-CS-92-115

School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

Submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy.

© 1992 by Dean A. Pomerleau

Support for this work has come from DARPA, under contracts DACA76-85-C-0019,
DACA76-85-C-0003, DACA76-85-C-0002, DACA76-89-C-0014 and DAAE07-90-C-R059.
These contracts were monitored by the Topographic Engineering Center and by TACOM.
This research was also funded in part by grants from the Fujitsu Corporation and the
Shimizu Corporation.

IPR2013-00424 - Ex. 1005
Toyota Motor Corp., Petitioner

School of Computer Science

DOCTORAL THESIS
in the field of
Computer Science

Neural Network Perception for Mobile Robot Guidance

DEAN POMERLEAU

Submitted in Partial Fulfillment of the Requirements
for the Degree of Doctor of Philosophy

Dedicated to Terry, Glen and Phyllis

Abstract

Vision based mobile robot guidance has proven difficult for classical machine
vision methods because of the diversity and real time constraints inherent in the
task. This thesis describes a connectionist system called ALVINN (Autonomous
Land Vehicle In a Neural Network) that overcomes these difficulties. ALVINN
learns to guide mobile robots using the back-propagation training algorithm.
Because of its ability to learn from example, ALVINN can adapt to new situations
and therefore cope with the diversity of the autonomous navigation task.

But real world problems like vision based mobile robot guidance present a
different set of challenges for the connectionist paradigm. Among them are:

• How to develop a general representation from a limited amount of real
training data,

• How to understand the internal representations developed by artificial neural
networks,

• How to estimate the reliability of individual networks,

• How to combine multiple networks trained for different situations into a
single system,

• How to combine connectionist perception with symbolic reasoning.

This thesis presents novel solutions to each of these problems. Using these
techniques, the ALVINN system can learn to control an autonomous van in under 5
minutes by watching a person drive. Once trained, individual ALVINN networks
can drive in a variety of circumstances, including single-lane paved and unpaved
roads, and multi-lane lined and unlined roads, at speeds of up to 55 miles per hour.
The techniques also are shown to generalize to the task of controlling the precise
foot placement of a walking robot.

Acknowledgements

I wish to thank my advisor, Dr. David Touretzky, for his support and technical
advice during my graduate studies. Dave has not only provided invaluable
feedback, he has also given me the latitude I've needed to develop as a researcher. I am
also grateful to Dr. Charles Thorpe for the opportunities, resources and expertise
he has provided me. Without Chuck's support, this research would not have been
possible. I also wish to thank the other members of my committee, Dr. Takeo
Kanade and Dr. Terrence Sejnowski, for their insightful analysis and valuable
comments concerning my work.

I owe much to all the members of the ALV/UGV project. Their technical
support and companionship throughout the development of the ALVINN system
have made my work both possible and enjoyable. I would like to specifically
thank Jay Gowdy and Omead Amidi, whose support software underlies much of
the ALVINN system. James Frazier also deserves thanks for his patience during
many hours of test runs on the Navlab.

Interaction with members of the Boltzmann group at Carnegie Mellon has
also been indispensable. From them I have not only learned about all aspects of
connectionism, but also how to communicate my thoughts and ideas. In particular,
the insights and feedback provided by discussions with John Hampshire form the
basis for much of this thesis. I am grateful to Dave Plaut, whose helpful suggestions
in the early stages of this work put me on the right track.

Other people who have contributed to the success of this thesis are the members
of the SM2 group. In particular, Ben Brown and Hiroshi Ueno have given me the
opportunity, incentive and support I have needed to explore an alternative domain
for connectionist mobile robot guidance.

I am also in debt to my office mates, Spiro Michaylov and Nevin Heintze.
They have helped me throughout our time as graduate students, with everything
from explaining LaTeX peculiarities to feeding my fish. I would like to thank my
parents, Glen and Phyllis, for encouraging my pursuit of higher education, and
for all the sacrifices they've made to provide me with a world of opportunities.
Finally, I am especially grateful to my fiancee, Terry Jessie, for her constant love,
support and patience during the difficult months spent preparing this dissertation.
Her presence in my life has helped me keep everything in perspective.

Dean A. Pomerleau

February 16, 1992

Contents

1 Introduction
    1.1 Problem Description
    1.2 Robot Testbed Description
    1.3 Dissertation Overview

2 Network Architecture
    2.1 Architecture Overview
    2.2 Input Representations
        2.2.1 Preprocessing Practice
        2.2.2 Justification of Preprocessing
    2.3 Output Representation
        2.3.1 1-of-N Output Representation
        2.3.2 Single Graded Unit Output Representation
        2.3.3 Gaussian Output Representation
        2.3.4 Comparing Output Representations
    2.4 Internal Network Structures

3 Training Networks "On-The-Fly"
    3.1 Training with Simulated Data
    3.2 Training "on-the-fly" with Real Data
        3.2.1 Potential Problems
        3.2.2 Solution - Transform the Sensor Image
        3.2.3 Transforming the Steering Direction
        3.2.4 Adding Diversity Through Buffering
    3.3 Performance Improvement Using Transformations
    3.4 Discussion

4 Training Networks With Structured Noise
    4.1 Transitory Feature Problem
    4.2 Training with Gaussian Noise
    4.3 Characteristics of Structured Noise
    4.4 Training with Structured Noise
    4.5 Improvement from Structured Noise Training
    4.6 Discussion

5 Driving Results and Performance
    5.1 Situations Encountered
        5.1.1 Single Lane Paved Road Driving
        5.1.2 Single Lane Dirt Road Driving
        5.1.3 Two-Lane Neighborhood Street Driving
        5.1.4 Railroad Track Following
        5.1.5 Driving in Reverse
        5.1.6 Multi-lane Highway Driving
    5.2 Driving with Alternative Sensors
        5.2.1 Night Driving Using Laser Reflectance Images
        5.2.2 Training with a Laser Range Sensor
        5.2.3 Contour Following Using Laser Range Images
        5.2.4 Obstacle Avoidance Using Laser Range Images
    5.3 Quantitative Performance Analysis
    5.4 Discussion

6 Analysis of Network Representations
    6.1 Weight Diagram Interpretation
    6.2 Sensitivity Analysis
        6.2.1 Single Unit Sensitivity Analysis
        6.2.2 Whole Network Sensitivity Analysis
    6.3 Discussion

7 Rule-Based Multi-network Arbitration
    7.1 Symbolic Knowledge and Reasoning
    7.2 Rule-based Driving Module Integration
    7.3 Analysis and Discussion

8 Output Appearance Reliability Estimation
    8.1 Review of Previous Arbitration Techniques
    8.2 OARE Details
    8.3 Results Using OARE
        8.3.1 When and Why OARE Works
    8.4 Shortcomings of OARE

9 Input Reconstruction Reliability Estimation
    9.1 The IRRE Idea
    9.2 Network Inversion
    9.3 Backdriving the Hidden Units
    9.4 Autoencoding the Input
    9.5 Discussion

10 Other Applications - The SM2
    10.1 The Task
    10.2 Network Architecture
    10.3 Network Training and Performance
    10.4 Discussion

11 Other Vision-based Robot Guidance Methods
    11.1 Non-learning Autonomous Driving Systems
        11.1.1 Examples
        11.1.2 Comparison with ALVINN
    11.2 Other Connectionist Navigation Systems
    11.3 Other Potential Connectionist Methods
    11.4 Other Machine Learning Techniques
    11.5 Discussion

12 Conclusion
    12.1 Contributions
    12.2 Future Work

Chapter 1

Introduction

A truly autonomous robot must sense its environment and react appropriately.
Previous mobile robot perception systems have relied on hand-coded algorithms
for processing sensor information. In this dissertation I develop techniques which
enable artificial neural networks (ANNs) to learn the visual processing required
for mobile robot guidance. The power and flexibility of these techniques are
demonstrated in two domains, wheeled vehicle navigation and legged robot foot
positioning.

The central claims of this dissertation are:

• By appropriately constraining the problem, the network architecture and the
training algorithm, ANNs can quickly learn to perform many of the complex
perception tasks required for mobile robot navigation.

• A neural network-based mobile robot perception system is able to robustly
handle a wider variety of situations than hand-programmed systems because
of the ability of ANNs to adapt to new sensors and situations.

• Artificial neural networks are not just black boxes. Their internal
representations can be analyzed and understood.

• The reliability of ANNs can be estimated with a relatively high precision.
These reliability estimates can be employed to arbitrate between multiple
expert networks, and hence facilitate the modular construction of
connectionist systems.

• By combining neural network-based perception with symbolic reasoning,
an autonomous navigation system can achieve accurate low-level control
and exhibit intelligent high-level behavior.

Figure 1.1: Block diagram of sensor based mobile robot guidance
(Blocks shown: Sensors, Perceptual Processing, Motor Controller and Actuators, with a Control Feedback path.)

1.1 Problem Description

To function effectively, an autonomous mobile robot must first be capable of
purposeful movement in its environment. This dissertation focuses on the problem of
how to employ neural network based perception to guide the movement of such
a robot. To navigate effectively in a complex environment, a mobile robot must
be equipped with sensors for gathering information about its surroundings. In this
work, I have chosen imaging sensors as the primary source of information
because of their ability to quickly provide dense representations of the environment.
The imaging sensors actually employed include color and black-and-white video
cameras, a scanning laser rangefinder and a scanning laser reflectance sensor¹.

¹ Work is also underway in using the same techniques to interpret the output from a sonar array
and an infrared camera.

The imaging sensors provide input to the component of an autonomous mobile
robot which will be the primary focus of this dissertation, the perceptual
processing module (see Figure 1.1). The job of the perceptual processing module is to
transform the information about the environment provided by one or more
imaging sensors into an appropriate high level motor command. The motor command
appropriate for a given situation depends both on the current state of the world as
reported by the sensors, and the perception module's knowledge of appropriate
responses for particular situations. The motor responses produced by the perception
module take the form of elementary movement directives, such as "drive the robot
along an arc with a 30m radius" or "move the robot's foot 2.5cm to the right".

The elementary movement directives are carried out by a controller which
manipulates the robot's actuators. The determination of the correct motor torques
required to smoothly and accurately perform the elementary movement directives
is not addressed in this dissertation. While it is possible to use connectionist
techniques for low level control [Jordan & Jacobs, 1990, Katayama & Kawato, 1991],
this aspect of the problem is implemented using classical PID control in each of
the systems described in this work.

The two mobile robot domains used to develop and demonstrate the techniques
of this thesis are autonomous outdoor driving and precise foot positioning for a
robot designed to walk on the exterior of a space station. Because it embodies
most of the difficulties inherent in any mobile robot task, autonomous outdoor
driving is the primary focus of this work.

In autonomous outdoor driving, the goal is to safely guide the robot through
the environment. Most commonly, the environment will consist of a network of
roads with varying characteristics and markings. In this situation, the goal is to
keep the robot on the road, and in the correct lane when appropriate. There is
frequently the added constraint that a particular route should be followed through
the environment, requiring the system to make decisions such as which way to
turn at intersections. An additional desired behavior for the autonomous robot is
to avoid obstacles, such as other cars, when they appear in its path.

The difficulty of outdoor autonomous driving stems from four factors. They are:

• Task variations due to changing road type

• Appearance variations due to lighting and weather conditions

• Real time processing constraints

• High level reasoning requirements

A general autonomous driving system must be capable of navigating in a
wide variety of situations. Consider some of the many driving scenarios people
encounter every day: There are multi-lane roads with a variety of lane markers.
There are two-lane roads without lane markers. There are situations, such as
city or parking lot driving, where the primary guidance comes not from the road
delineations, but from the need to avoid other cars and pedestrians.

The second factor making autonomous driving difficult is the variation in
appearance that results from environmental factors. Lighting changes and deep
shadows make it difficult for perception systems to consistently pick out important
features during daytime driving. The low light conditions encountered at night
make it almost impossible for a video-based system to drive reliably. In addition,
missing or obscured lane markers make driving difficult for an autonomous system
even under favorable lighting conditions.

Given enough time, a sophisticated image processing system might be able to
overcome these difficulties. However, the third challenging aspect of autonomous
driving is that there is a limited amount of time available for processing sensor
information. To blend in with traffic, an autonomous system must drive at a
relatively high speed. To drive quickly, the system must react quickly. For
example, at 50 miles per hour a vehicle is traveling nearly 75 feet per second. A
lot can happen in 75 feet, including straying a significant distance from the road,
if the system isn't reacting quickly or accurately enough.
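
The arithmetic behind this example can be checked with a short sketch. The conversion constants are standard; the decision rates in the loop are hypothetical values chosen only for illustration and are not figures reported in this dissertation.

```python
# Convert a driving speed into distance covered per second and per steering decision.
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def feet_per_second(miles_per_hour):
    """Return the speed in feet per second for a speed given in miles per hour."""
    return miles_per_hour * FEET_PER_MILE / SECONDS_PER_HOUR


def feet_per_decision(miles_per_hour, decisions_per_second):
    """Return the distance traveled between two consecutive steering decisions."""
    return feet_per_second(miles_per_hour) / decisions_per_second


if __name__ == "__main__":
    speed = 50  # miles per hour, as in the example in the text
    print(f"{speed} mph = {feet_per_second(speed):.1f} ft/s")
    # Hypothetical decision rates, assumed for illustration only.
    for rate in (1, 5, 10, 25):
        distance = feet_per_decision(speed, rate)
        print(f"  {rate:2d} decisions/s -> {distance:5.1f} ft between decisions")
```

The slower the perception system, the farther the vehicle travels "blind" between steering updates, which is the real time constraint described above.
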

Finally, an autonomous driving system not only must perform sophisticated
perceptual processing, it also must make high level symbolic decisions such as
which way to turn at intersections. This dissertation shows that the first three
factors making mobile robot guidance difficult can be overcome using artificial
neural networks for perception, and the fourth can be handled by combining
artificial neural networks with symbolic processing.

1.2 Robot Testbed Description

The primary testbed for demonstrating the applicability of the ideas developed
in this dissertation to autonomous outdoor driving is the CMU Navlab, shown
in Figure 1.2. The Navlab is a modified Chevy van equipped with forward and
backward facing color cameras and a scanning laser rangefinder for sensing the
environment. These are the primary sensory inputs the system receives. The
Navlab also contains an inertial navigation system (INS) which can maintain the
vehicle's position relative to its starting location. The Navlab is outfitted with
three Sun Sparcstations, which are used for perceptual processing and other high
level computation. The Navlab also has a 68020-based processor for controlling
the steering wheel and accelerator and for monitoring the vehicle's status.

Figure 1.2: The CMU Navlab Autonomous Navigation Testbed

The Navlab can be controlled by computer or driven by a person just like a
normal car. This human controllability is useful for getting the Navlab to a test site,
and, as will be seen in Chapter 3, for teaching an artificial neural network to drive by
example. I have used two other robots to demonstrate the power of connectionist
robot guidance: a ruggedized version of the Navlab called Navlab II, and a walking
robot called the Self Mobile Space Manipulator (SM2). These additional testbeds
will be described in more detail in Chapters 5 and 10, respectively.

1.3 Dissertation Overview

The goal of this thesis is to develop techniques that enable artificial neural networks
to guide mobile robots using visual input. In Chapter 2, I present the simple neural
network architecture that serves as the basis for the connectionist mobile robot
guidance system I develop called ALVINN (Autonomous Land Vehicle In a Neural
Network). The architecture consists of a single hidden layer, feedforward network
(see Figure 1.3). The input layer is a two dimensional retina which receives input
from an imaging sensor such as a video camera or scanning laser rangefinder. The
output layer is a vector of units representing different steering responses, ranging
from a sharp left to a sharp right turn. The network receives as input an image of
the road ahead, and produces as output the steering command that will keep the
vehicle on the road.

Figure 1.3: ALVINN driving network architecture

Although some aspects of ALVINN's architecture and I/O representation are
unique, the network structure is not the primary reason for ALVINN's success.
Instead, much of its success can be attributed to the training methods presented
in Chapters 3 and 4. Using the training "on-the-fly" techniques described in
Chapter 3, individual three-layered ALVINN networks can quickly learn to drive
by watching a person steer. These methods allow ALVINN to learn about new
situations first hand, as a person drives the vehicle. The ability to augment the
limited amount of live training data available from the sensors with artificial images
depicting rare situations is shown to be crucial for reliable network performance
in both Chapters 3 and 4.

Using the architecture described in Chapter 2, and the training techniques from
Chapters 3 and 4, ALVINN is able to drive in a wide variety of situations, described
in Chapter 5. As a preview, some of ALVINN's capabilities include driving on
single-lane paved and unpaved roads, and multi-lane lined and unlined roads, at
speeds of up to 55 miles per hour.

But developing networks that can drive is not enough. It is also important to
understand how the networks perform their functions. In order to quantitatively
understand the internal representation developed by individual driving networks, I
develop a technique called sensitivity analysis in Chapter 6. Sensitivity analysis
is a graphical technique which provides insight into the processing performed by
a network's individual hidden units, and into the cooperation between multiple
hidden units to carry out a task. The analysis techniques in Chapter 6 illustrate
that ALVINN's internal representation varies widely depending on the situations
for which it is trained. In short, ALVINN develops filters for detecting image
features that correlate with the correct steering direction.

A typical filter developed by ALVINN is shown in Figure 1.4. It depicts
the connections projecting to and from a single hidden unit in a network trained
on video images of a single-lane, fixed width road. This hidden unit receives
excitatory connections (shown as white spots) from a road shaped region on the
left of the input retina (see the schematic). It makes excitatory connections to the
output units representing a sharp left turn. This hidden unit is stimulated when a
road appears on the left, and suggests a left turn in order to steer the vehicle back
to the road center. Road-shaped region detectors such as this are the most common
type of feature filters developed by networks trained on single-lane unlined roads.
In contrast, when trained for highway driving ALVINN develops feature detectors
that determine the position of the lane markers painted on the road.

This situation specificity allows individual networks to learn quickly and drive
reliably in limited domains. However, it also severely limits the generality of
individual driving networks. Chapters 7, 8 and 9 focus on techniques for combining
multiple simple driving networks into a single system capable of driving in a wide
variety of situations. Chapter 7 describes rule-based techniques for integrating
multiple networks and a symbolic mapping system. The idea is to use a map of
the environment to determine which situation-specific network is appropriate for
the current circumstances. The symbolic mapping module is also able to provide
ALVINN with something the networks lack, namely the ability to make high level
decisions such as which way to turn at intersections.

However, rule-based arbitration is shown to have significant shortcomings.
Foremost among them is that it requires detailed symbolic knowledge of the
environment, which is often difficult to obtain. In Chapters 8 and 9, I develop
connectionist multi-network arbitration techniques to complement the rule-based
methods of Chapter 7. These techniques allow individual networks to estimate
their own reliability in the current situation. These reliability estimates can be
used to weight the responses from multiple networks and to determine when a new
network needs to be trained.
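
As a rough illustration of how reliability estimates might be used for arbitration, the sketch below forms a reliability-weighted average of the steering responses from several networks and flags the case where no network appears trustworthy. Everything here is an assumption made for illustration: the scalar steering encoding in [-1, 1], the function names, and the 0.5 threshold are hypothetical, and the reliability values are simply taken as given. The estimation methods themselves are the subject of Chapters 8 and 9 and are not reproduced here.

```python
def combine_responses(responses, reliabilities):
    """Return the reliability-weighted average of per-network steering responses.

    responses     -- one steering value per network (assumed encoding:
                     -1.0 = sharp left, 0.0 = straight, +1.0 = sharp right)
    reliabilities -- one non-negative reliability estimate per network
    """
    if len(responses) != len(reliabilities):
        raise ValueError("need exactly one reliability estimate per response")
    total = sum(reliabilities)
    if total <= 0:
        raise ValueError("no network reports a positive reliability")
    return sum(r * w for r, w in zip(responses, reliabilities)) / total


def needs_new_network(reliabilities, threshold=0.5):
    """Return True when every network falls below the (assumed) reliability threshold."""
    return max(reliabilities) < threshold


if __name__ == "__main__":
    # Hypothetical outputs from three situation-specific networks.
    responses = [-0.4, 0.1, 0.6]
    reliabilities = [0.9, 0.1, 0.0]
    print("combined steering:", round(combine_responses(responses, reliabilities), 3))
    print("train a new network?", needs_new_network(reliabilities))
```

With weights like these the combined response is dominated by the network that reports the highest reliability, while a uniformly low set of estimates signals an unfamiliar situation.
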

Figure 1.4: Diagram of weights projecting to and from a typical hidden unit in a
network trained on roads with a fixed width. This hidden unit acts as a filter for a
road on the left side of the visual field as illustrated in the schematic.
(Panels: Weight to Output Units; Weight from Input Retina; schematic legend: Road, Non-Road.)

Chapter 10 illustrates the flexibility of connectionist mobile robot guidance
by demonstrating its use in a very different domain, the control of a two-legged
walking robot designed to inspect the space station exterior. The crucial task in this
domain is to precisely position the foot of the robot in order to anchor it without
damaging either the space station or the robot. The same methods developed to
steer an autonomous vehicle are employed to safely guide the foot placement of
this walking robot.

In Chapter 11, the neural network approach to autonomous robot guidance
is compared with other techniques, including hand-programmed algorithms and
other machine learning methods. Because of its ability to adapt to new situations,
ALVINN is shown to be more flexible than previous hand-programmed systems for
mobile robot guidance. The connectionist approach employed in ALVINN is also
demonstrated to have distinct advantages over other machine learning techniques
such as nearest neighbor matching, decision trees and genetic algorithms.

Finally, Chapter 12 summarizes the results and discusses the contributions of
this dissertation. It concludes by presenting areas for future work.

Chapter 2

Network Architecture

The first steps in applying artificial neural networks to a problem involve choosing
a training algorithm and a network architecture. The two decisions are intimately
related, since certain training algorithms require, or are best suited to, specific
network architectures. For this work, I have chosen a multi-layered perceptron (MLP) and
the back-propagation training algorithm [Rumelhart, Hinton & Williams, 1986]
for the following reasons:

• The task requires supervised learning from examples (i.e. given sensor
input, the network should respond with a specific motor response). This
rules out unsupervised/competitive learning algorithms like Kohonen's
self-organizing feature maps [Kohonen, 1990] which learn to classify inputs on
the basis of statistically significant features, but not to produce particular
desired responses.

• The system should learn relatively quickly, since one of the goals is to rapidly
adapt to new driving situations. This rules out certain supervised
training algorithms/architectures such as Boltzmann Machines [Hopfield, 1982],
which are notoriously slow at learning.

• The task of determining the correct motor response from sensor input was
not expected to require substantial run-time knowledge about recent inputs.
Thus, it was decided that the extensions of the back-propagation algorithm
to fully recurrent networks [Pineda, 1987, Pearlmutter, 1988] were not
necessary.

The decision to use artificial neural networks in the first place, and to use
back-propagation over other closely related neural network training algorithms like
quickprop [Fahlman, 1988] and radial basis functions [Poggio & Girosi, 1990],
can be better understood after presentation of the architecture and training scheme
actually employed in this work, and hence will be discussed in Chapter 11.

Once the decision is made to use a feedforward multi-layered perceptron
as the underlying network architecture, the question then becomes "what form
should the MLP take?" This question can be divided into three components:
the input representation, the output representation, and the network's internal
structure. I will discuss each of the three components separately, theoretically
and/or empirically justifying the choices made.

2.1 Architecture Overview

The architecture of the perception networks chosen for mobile robot guidance
consists of a multi-layer perceptron with a single hidden layer (see Figure 2.1).
The input layer of the network consists of a 30x32 unit "retina" which receives
images from a sensor. Each of the 960 units in the input retina is fully connected
to the hidden layer of 4 units, which in turn is fully connected to the output layer.
The output layer represents the motor response the network deems appropriate for
the current situation. In the case of a network trained for autonomous driving, the
output layer consists of 30 units and is a linear representation of the direction the
vehicle should steer in the current situation. The middle output unit represents
the "travel straight ahead" condition, while units to the left and right of center
represent successively sharper left and right turns.
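
The layer sizes above (960 inputs, 4 hidden units, 30 outputs) can be made concrete with a minimal forward-pass sketch. Several details are assumptions introduced for illustration and are not taken from this section: the sigmoid activation, the small random initial weights, and especially the decoding of a steering value from the most active output unit. The output representations actually compared are the subject of Section 2.3.

```python
import math
import random

RETINA_ROWS, RETINA_COLS = 30, 32      # input retina
N_INPUT = RETINA_ROWS * RETINA_COLS    # 960 input units
N_HIDDEN = 4
N_OUTPUT = 30                          # sharp left ... straight ... sharp right


def sigmoid(x):
    """Logistic activation (assumed; not specified in this section)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng, scale=0.1):
    """Return (weights, biases) for a fully connected layer with small random weights."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def layer_forward(inputs, weights, biases):
    """Compute the activations of one fully connected layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(retina, hidden_layer, output_layer):
    """Propagate a flat list of 960 retina activations through both layers."""
    hidden = layer_forward(retina, *hidden_layer)
    output = layer_forward(hidden, *output_layer)
    return hidden, output


def steering_from_output(output):
    """Assumed decoding: position of the most active unit, scaled to [-1, 1].

    -1.0 is the sharpest left turn, 0.0 is straight ahead, +1.0 the sharpest right.
    """
    peak = max(range(len(output)), key=lambda i: output[i])
    center = (len(output) - 1) / 2.0
    return (peak - center) / center


if __name__ == "__main__":
    rng = random.Random(0)
    hidden_layer = init_layer(N_INPUT, N_HIDDEN, rng)
    output_layer = init_layer(N_HIDDEN, N_OUTPUT, rng)
    retina = [rng.random() for _ in range(N_INPUT)]  # stand-in for a sensor image
    hidden, output = forward(retina, hidden_layer, output_layer)
    print(len(hidden), "hidden units,", len(output), "output units")
    print("steering value:", steering_from_output(output))
```

With untrained random weights the steering value is meaningless; the sketch only shows how data flows through the three layers.
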

To control a mobile robot using this architecture, input from one of the robot's
sensors is reduced to a low-resolution 30x32 pixel image and projected onto
the input retina. After completing a forward pass through the network, a motor
response is derived from the output activation levels and performed by the low level
controller. In the next sections, I will expand this high level description, giving
more details of how the processing actually proceeds and why this architecture
was chosen.
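
One simple way to reduce a sensor image to the 30x32 retina is block averaging, sketched below. This is an assumption made for illustration only; the preprocessing actually used is described in Section 2.2.1 and may differ. The function name `reduce_image` and the 480x512 source size in the usage example are hypothetical.

```python
def reduce_image(image, out_rows=30, out_cols=32):
    """Reduce a gray-level image (a list of equal-length rows) by block averaging.

    Each output pixel is the mean of the rectangular block of source pixels
    that maps onto it.
    """
    in_rows, in_cols = len(image), len(image[0])
    if in_rows < out_rows or in_cols < out_cols:
        raise ValueError("source image is smaller than the retina")
    reduced = []
    for r in range(out_rows):
        # Row range of the source block feeding output row r.
        r0 = r * in_rows // out_rows
        r1 = (r + 1) * in_rows // out_rows
        row = []
        for c in range(out_cols):
            c0 = c * in_cols // out_cols
            c1 = (c + 1) * in_cols // out_cols
            block = [image[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        reduced.append(row)
    return reduced


def flatten(image):
    """Flatten a 2-D image into the flat list expected by an input layer."""
    return [pixel for row in image for pixel in row]


if __name__ == "__main__":
    # Hypothetical 480x512 source image with a left-to-right brightness ramp.
    source = [[x / 511.0 for x in range(512)] for _ in range(480)]
    retina = reduce_image(source)
    print(len(retina), "x", len(retina[0]), "retina,", len(flatten(retina)), "units")
```

The flattened result has 960 values, matching the number of input units described above.
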

Figure 2.1: ALVINN driving network architecture

2.2 Input Representations

Perhaps the most important factor determining the performance of a particular
neural network architecture on a task is the input representation. In determining
the input representation, there are two schools of thought. The first believes that the
best input representation is one which has been extensively preprocessed to make
"important" features prominent and therefore easy for the network to incorporate
in its processing. The second school of thought contends that it is best to give
the network the "raw" input and let it learn from experience what features are
important.

Four factors to consider when deciding the extent of preprocessing to perform
on the input are the existence of a preprocessing algorithm, its necessity, its
complexity and its generality. If straightforward algorithms are known to perform
crucial initial processing steps applicable in a wide variety of situations, then it
is advisable to use them. Such is the case in speech recognition, where the raw
speech input, represented as the amplitude of sound waves over time, is converted
into coefficients representing amplitudes at various frequencies over time using
a fast Fourier transform preprocessing step [Waibel et al., 1987]. This FFT
preprocessing is known to be a useful and widely applicable first step in automatic
processing of speech data [O'Shaughnessy, 1987]. Furthermore, algorithms to
compute a signal's Fourier transform, while complex, are well understood and
have efficient implementations on a variety of computer architectures. Finally,
not only is the information contained in the Fourier transform proven useful in
previous speech recognition systems, the Fourier transform also has the property
that little information in the original signal is lost in the transformation. This
ensures that important input features are not lost as a result of the FFT.
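
The two properties of the Fourier transform mentioned above can be illustrated on a toy signal: it exposes the amplitude at each frequency, and it can be inverted to recover the original samples. The sketch below deliberately uses a naive O(n^2) discrete Fourier transform rather than an FFT, to keep it short and dependency-free; an FFT computes the same coefficients faster. The 64-sample frame and the two sine components are assumptions chosen for illustration and have nothing to do with the speech systems cited.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft(spectrum):
    """Inverse transform; recovers the time-domain samples from the coefficients."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def amplitude_spectrum(frame):
    """Return the normalized coefficient magnitude for each non-negative frequency bin."""
    n = len(frame)
    return [abs(c) / n for c in dft(frame)[: n // 2 + 1]]


if __name__ == "__main__":
    n = 64
    # Toy "waveform": a strong component at bin 4 plus a weaker one at bin 10.
    frame = [math.sin(2 * math.pi * 4 * t / n) + 0.5 * math.sin(2 * math.pi * 10 * t / n)
             for t in range(n)]

    spectrum = amplitude_spectrum(frame)
    strongest = max(range(len(spectrum)), key=lambda k: spectrum[k])
    print("strongest frequency bin:", strongest)

    # Round trip: the inverse transform reproduces the original samples
    # up to floating-point rounding error.
    recovered = inverse_dft(dft(frame))
    worst = max(abs(r - s) for r, s in zip(recovered, frame))
    print("round-trip error below 1e-9:", worst < 1e-9)
```

Applying the transform to successive short frames of a longer recording would give the "amplitudes at various frequencies over time" representation described in the text.
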

On the other hand, if appropriate preprocessing algorithms are not known
for the task, or if the known algorithms are too complex to be practical, the
solution is to give the n
