ALVINN:
AN AUTONOMOUS LAND VEHICLE
IN A NEURAL NETWORK

Technical Report AIP-77

Dean A. Pomerleau

Department of Psychology
Carnegie Mellon University
Pittsburgh, PA 15213-3890

The Artificial Intelligence and Psychology Project
Departments of Computer Science and Psychology
Carnegie Mellon University

Learning Research and Development Center
University of Pittsburgh

Approved for public release; distribution unlimited.
`
`
`
`
January 1989
This research was supported by the Computer Sciences Division, Office of Naval Research, under contract numbers N00014-86-K-0678, N00014-87-K-0385, and N00014-87-K-0533, by National Science Foundation Grant EET-8716324, by the Defense Advanced Research Projects Agency (DOD) monitored by the Space and Naval Warfare Systems Command under Contract N00039-87-C-0251, and by the Strategic Computing Initiative of DARPA, through ARPA Order 5351, and monitored by the U.S. Army Engineer Topographic Laboratories under contract DACA76-85-C-0003 titled "Road Following".

Reproduction in whole or in part is permitted for any purpose of the United States Government. Approved for public release; distribution unlimited.
`
`
`
`
`
`
REPORT DOCUMENTATION PAGE

1a. Report Security Classification: Unclassified
3. Distribution/Availability of Report: Approved for public release; distribution unlimited
4. Performing Organization Report Number(s): AIP-77
6a. Name of Performing Organization: Carnegie Mellon University
6c. Address: Department of Psychology, Pittsburgh, Pennsylvania 15213
7a. Name of Monitoring Organization: Computer Sciences Division, Office of Naval Research
7b. Address: 800 N. Quincy Street, Arlington, Virginia 22217-5000
8a. Name of Funding/Sponsoring Organization: Same as monitoring organization
9. Procurement Instrument Identification Number: N00014-86-K-0678
10. Source of Funding Numbers (program element, project, task, work unit): N/A
11. Title (Include Security Classification): ALVINN: An Autonomous Land Vehicle in a Neural Network
12. Personal Author(s): Pomerleau, Dean A.
13a. Type of Report: Technical
14. Date of Report: January 1989
15. Page Count: 13
16. Supplementary Notation: Paper was presented at the IEEE Conference on Neural Information Processing Systems -- Natural and Synthetic, Denver, November 1988
18. Subject Terms: autonomous navigation, neural networks, road following, machine vision
19. Abstract: See reverse side
21. Abstract Security Classification: Unclassified
22a. Name of Responsible Individual: Dr. Alan L. Mayrovitz
22b. Telephone: (202) 696-4302
22c. Office Symbol: N00014

DD Form 1473, 84 MAR. 83 APR edition may be used until exhausted; all other editions are obsolete.

Security Classification of This Page: Unclassified
`
ABSTRACT

ALVINN (Autonomous Land Vehicle In a Neural Network) is a 3-layer back-propagation network designed for the task of road following. Currently ALVINN takes images from a camera and a laser range finder as input and produces as output the direction the vehicle should travel in order to follow the road. Training has been conducted using simulated road images. Successful tests on the Carnegie Mellon autonomous navigation test vehicle indicate that the network can effectively follow real roads under certain field conditions. The representation developed to perform the task differs dramatically when the network is trained under various conditions, suggesting the possibility of a novel adaptive autonomous navigation system capable of tailoring its processing to the conditions at hand.
`
`
`
`
`
`
INTRODUCTION

Autonomous navigation has been a difficult problem for traditional vision and robotic techniques, primarily because of the noise and variability associated with real world scenes. Autonomous navigation systems based on traditional image processing and pattern recognition techniques often perform well under certain conditions but have problems with others. Part of the difficulty stems from the fact that the processing performed by these systems remains fixed across various driving situations.

Artificial neural networks have displayed promising performance and flexibility in other domains characterized by high degrees of noise and variability, such as handwritten character recognition [Jackel et al., 1988] [Pawlicki et al., 1988] and speech recognition [Waibel et al., 1988]. ALVINN (Autonomous Land Vehicle In a Neural Network) is a connectionist approach to the navigational task of road following. Specifically, ALVINN is an artificial neural network designed to control the NAVLAB, the Carnegie Mellon autonomous navigation test vehicle.
`
NETWORK ARCHITECTURE

ALVINN's current architecture consists of a single hidden layer back-propagation network (see Figure 1). The input layer is divided into three sets of units: two "retinas" and a single intensity feedback unit. The two retinas correspond to the two forms of sensory input available on the NAVLAB vehicle: video and range information. The first retina, consisting of 30x32 units, receives video camera input from a road scene. The activation level of each unit in this retina is proportional to the intensity in the blue color band of the corresponding patch of the image. The blue band of the color image is used because it provides the highest contrast between the road and the non-road. The second retina, consisting of 8x32 units, receives input from a laser range finder. The activation level of each unit in this retina is proportional to the proximity of the corresponding area in the image. The road intensity feedback unit indicates whether the road is lighter or darker than the non-road in the previous image. Each of these 1217 input units is fully connected to the hidden layer of 29 units, which is in turn fully connected to the output layer.
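The layer sizes above (a 30x32 video retina plus an 8x32 range finder retina plus one feedback unit, giving 1217 inputs, 29 hidden units, and 46 outputs) can be sketched as a minimal forward pass. This is an illustrative sketch only: the sigmoid activation, random initial weights, and function names are assumptions for illustration, not the report's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Input layer: 30x32 video retina + 8x32 range retina + 1 feedback unit
N_INPUT = 30 * 32 + 8 * 32 + 1   # 1217 input units
N_HIDDEN = 29
N_OUTPUT = 46                    # 45 direction units + 1 road intensity feedback unit

rng = np.random.default_rng(0)
W_ih = rng.normal(0, 0.1, (N_HIDDEN, N_INPUT))   # input -> hidden (fully connected)
W_ho = rng.normal(0, 0.1, (N_OUTPUT, N_HIDDEN))  # hidden -> output (fully connected)

def forward(video, rangefinder, feedback):
    """video: (30, 32) array, rangefinder: (8, 32) array, feedback: scalar."""
    x = np.concatenate([video.ravel(), rangefinder.ravel(), [feedback]])
    h = sigmoid(W_ih @ x)
    return sigmoid(W_ho @ h)   # 45 turn-curvature activations + 1 feedback activation
```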
`
The output layer consists of 46 units, divided into two groups. The first set of 45 units is a linear representation of the turn curvature along which the vehicle should travel in order to head towards the road center. The middle unit represents the "travel straight ahead" condition while units to the left and right of the center represent successively sharper left and right turns. The network is trained with a desired output vector of all zeros except for a "hill" of activation centered on the unit representing the correct turn curvature, which is the curvature which would bring the vehicle to the road center 7 meters ahead of its current position. More specifically, the desired activation levels for the nine units centered around the correct turn curvature unit are 0.10, 0.32, 0.61, 0.89, 1.00, 0.89, 0.61, 0.32 and 0.10. During testing, the turn curvature dictated by the network is taken to be the curvature represented by the output unit with the highest activation level.
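The desired-output scheme just described, a hill of activation (0.10, 0.32, 0.61, 0.89, 1.00, 0.89, 0.61, 0.32, 0.10) centered on the correct curvature unit, with the most active unit taken as the answer at test time, might be built as follows. The function names and the clipping of the hill at the edges of the output vector are assumptions for illustration.

```python
import numpy as np

HILL = np.array([0.10, 0.32, 0.61, 0.89, 1.00, 0.89, 0.61, 0.32, 0.10])

def target_vector(correct_unit, n_units=45):
    """Desired activations for the 45 direction units: all zeros except a
    hill of activation centered on the correct turn-curvature unit."""
    t = np.zeros(n_units)
    for offset, value in enumerate(HILL, start=-4):
        i = correct_unit + offset
        if 0 <= i < n_units:          # hill is clipped at the edges (assumption)
            t[i] = value
    return t

def dictated_curvature(direction_outputs):
    """At test time the dictated curvature is the most active output unit."""
    return int(np.argmax(direction_outputs))
```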
`
Figure 1: ALVINN Architecture (road intensity feedback unit, 45 direction output units, 8x32 range finder input retina, 30x32 video input retina)

The final output unit is a road intensity feedback unit which indicates whether the road is lighter or darker than the non-road in the current image. During testing, the activation of the output road intensity feedback unit is recirculated to the input layer in the style of Jordan [Jordan, 1988] to aid the network's processing by providing rudimentary information concerning the relative intensities of the road and the non-road in the previous image.
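The recirculation scheme above amounts to a simple loop over successive images: the activation of the output road intensity feedback unit becomes the input feedback unit's activation for the next image. The function names and the initial feedback value below are assumptions for illustration.

```python
import numpy as np

def drive(frames, forward, initial_feedback=0.5):
    """frames: sequence of (video, rangefinder) images. forward: the network's
    forward pass, returning 46 output activations. The output road intensity
    feedback unit (index 45) is recirculated, Jordan-style, as the input
    feedback activation for the next image."""
    feedback = initial_feedback
    curvatures = []
    for video, rangefinder in frames:
        out = forward(video, rangefinder, feedback)
        curvatures.append(int(np.argmax(out[:45])))  # dictated turn curvature
        feedback = out[45]                           # recirculated to next step
    return curvatures
```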
`
TRAINING

Training on actual road images is logistically difficult, because in order to develop a general representation, the network must be presented with a large number of training exemplars depicting roads under a wide variety of conditions. Collection of such a data set would be difficult, and changes in parameters such as camera orientation would require collecting an entirely new set of road images. To avoid these difficulties we have developed a simulated road generator which creates road images to be used as training exemplars for the network. The simulated road generator uses nearly 200 parameters in order to generate a variety of realistic road images. Some of the most important parameters are listed in Figure 2.

Figure 3 depicts the video images of one real road and one artificial road generated with a single set of values for the parameters from Figure 2. Although not shown in Figure 3, the road generator also creates corresponding simulated range finder images.
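A drastically simplified sketch of such a parameterized generator follows; the report's generator uses nearly 200 parameters, while this toy version uses a handful inspired by Figure 2, and all names, the straight-road rendering, and the noise model are illustrative assumptions.

```python
import numpy as np

def generate_road_image(road_center, road_width, road_intensity,
                        nonroad_intensity, noise, rng, shape=(30, 32)):
    """Render a toy straight road as a 30x32 intensity image.
    road_center and road_width are in columns; intensities lie in [0, 1]."""
    rows, cols = shape
    img = np.full(shape, nonroad_intensity)
    left = int(road_center - road_width / 2)
    right = int(road_center + road_width / 2)
    img[:, max(left, 0):min(right, cols)] = road_intensity
    img += rng.normal(0, noise, shape)          # crude intensity variability
    return np.clip(img, 0.0, 1.0)

rng = np.random.default_rng(0)
snapshot = generate_road_image(road_center=16, road_width=10,
                               road_intensity=0.8, nonroad_intensity=0.3,
                               noise=0.05, rng=rng)
```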
`
`
`
`
• size of video camera retina
• 3D position of video camera
• direction of video camera
• field of view of video camera
• size of range finder retina
• 3D position of range finder camera
• direction of range finder
• field of view of range finder
• position of vehicle relative to road center
• road direction
• road curvature
• rate of road curvature change
• road curve length
• road width
• rate of road width change
• road intensity
• left non-road intensity
• right non-road intensity
• road intensity variability
• non-road intensity variability
• rate of road intensity change
• rate of non-road intensity change
• position of image saturation spots
• size of image saturation spots
• shape of image saturation spots
• position of obstacles
• size of obstacles
• shape of obstacles
• intensity of obstacles
• shadow size
• shadow direction
• shadow intensity

Figure 2: Parameters for simulated road generator
`
`
`
`
Figure 3: Real and simulated road images (left: real road image; right: simulated road image)

At the relatively low resolution being used it is difficult to distinguish between real and simulated roads.
`
Network training is performed using artificial road "snapshots" from the simulated road generator and the Warp back-propagation simulator described in [Pomerleau et al., 1988]. Training involves first creating a set of 1200 different road snapshots by randomly varying the parameters used by the simulated road generator. Back-propagation is then performed using this set of exemplars until only asymptotic performance improvements appear likely. During the early stages of training, the input road intensity unit is given a random activation level. This is done to prevent the network from merely learning to copy the activation level of the input road intensity unit to the output road intensity unit, since their activation levels should almost always be identical because the relative intensity of the road and the non-road does not often change between two successive images. Once the network has developed a representation that uses image characteristics to determine the activation level for the output road intensity unit, the network is given as input whether the road would have been darker or lighter than the non-road in the previous image. Using this extra information concerning the relative brightness of the road and the non-road, the network is better able to determine the correct direction for the vehicle to travel.
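The procedure above (back-propagation over randomly generated snapshots, with the input feedback unit's activation randomized during early training) might be sketched as follows. The learning rate, the number of "warmup" epochs with randomized feedback, and plain squared-error back-propagation with sigmoid units are assumptions, not the report's actual settings.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train(exemplars, W_ih, W_ho, epochs=40, lr=0.01, warmup_epochs=10, rng=None):
    """exemplars: list of (input_vector, target_vector) pairs from the
    simulated road generator. input_vector[-1] is the road intensity
    feedback unit; during early epochs its activation is randomized so the
    network cannot merely copy it to the output feedback unit."""
    rng = rng or np.random.default_rng(0)
    for epoch in range(epochs):
        for x, t in exemplars:
            x = x.copy()
            if epoch < warmup_epochs:
                x[-1] = rng.uniform()              # random feedback activation
            h = sigmoid(W_ih @ x)
            y = sigmoid(W_ho @ h)
            # standard back-propagation for sigmoid units and squared error
            dy = (y - t) * y * (1 - y)
            dh = (W_ho.T @ dy) * h * (1 - h)
            W_ho -= lr * np.outer(dy, h)
            W_ih -= lr * np.outer(dh, x)
    return W_ih, W_ho
```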
`
PERFORMANCE

Three methods are used to evaluate ALVINN's performance. The first test uses novel artificial road images. After 40 epochs of training on the 1200 simulated road snapshots, the network correctly dictates a turn curvature within two units of the correct answer approximately 90% of the time on novel artificial road snapshots. The second, more informative test involves "driving" down a simulated stretch of road. Specifically, the artificial road generator has an interactive mode in which the road image scrolls in response to an externally specified speed and direction of travel. After the training described above, ALVINN can drive the artificial road generator at a constant speed on trips of several miles without straying from the simulated road. The primary testing of ALVINN's performance is conducted on the NAVLAB (see Figure 4). The NAVLAB is a modified Chevy van equipped with 3 Sun computers, a Warp, a video camera, and a laser range finder, which serves as a testbed for the CMU autonomous land vehicle project [Thorpe et al., 1987]. Performance of the network to date is comparable to that achieved by the best traditional vision-based autonomous navigation algorithm at CMU under the limited conditions tested. Specifically, the network can accurately drive the NAVLAB at a speed of 1/2 meter per second along a 400 meter path through a wooded area of the CMU campus under sunny fall conditions. Under similar conditions on the same course, the ALV group at CMU has recently achieved similar driving accuracy at a speed of one meter per second by implementing their image processing autonomous navigation algorithm on the Warp computer. In contrast, the ALVINN network is currently simulated using only an on-board Sun computer, and dramatic speedups are expected when tests are performed using the Warp.

Figure 4: NAVLAB, the CMU autonomous navigation test vehicle.
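The first test's scoring rule above, counting a snapshot as correct when the most active direction unit lies within two units of the correct one, might be computed as follows; the function name is an assumption.

```python
import numpy as np

def within_two_units_accuracy(outputs, correct_units):
    """outputs: (n_snapshots, 45) direction-unit activations on novel
    artificial road snapshots; correct_units: (n_snapshots,) indices of the
    correct turn-curvature units. Returns the fraction of snapshots whose
    dictated curvature falls within two units of the correct answer."""
    dictated = np.argmax(outputs, axis=1)
    return float(np.mean(np.abs(dictated - correct_units) <= 2))
```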
`
NETWORK REPRESENTATION

The representation developed by the network to perform the road following task depends dramatically on the characteristics of the training set. When trained on examples of roads with a fixed width, the network develops a representation in which hidden units act as filters for roads at different positions. Figures 5, 6 and 7 are diagrams of the weights projecting to and from single hidden units in such a network.
`
`
`
`
Figure 5: Diagram of weights projecting to and from a typical hidden unit in a network trained on roads with a fixed width. This hidden unit acts as a filter for a single road on the left side of the visual field as illustrated by the schematic. (The diagram shows the weight to the output feedback unit, the weight from the input feedback unit, the weight from the bias unit, the weights to the direction output units, and the weights from the video camera and range finder retinas, with road and non-road regions marked in the schematic.)
`
`
`
`
Figure 6: Diagram of weights projecting to and from a typical hidden unit in a network trained on roads with a fixed width. This hidden unit acts as a filter for two roads, one slightly left and one slightly right of center, as illustrated by the schematic.
`
`
`
`
Figure 7: Diagram of weights projecting to and from a typical hidden unit in a network trained on roads with a fixed width. This hidden unit acts as a filter for three roads, as illustrated by the trimodal excitatory connections to the direction output units.
`
`
`
`
As indicated by the weights to and from the feedback unit in Figure 5, this hidden unit expects the road to be lighter than the non-road in the previous image and supports the road being lighter than the non-road in the current image. More specifically, the weights from the video camera retina support the interpretation that this hidden unit is a filter for a single light road on the left side of the visual field (see the small schematic to the left of the weights from the video retina in Figure 5). This interpretation is also supported by the weights from the range finder retina. This hidden unit is excited if there is high range activity (i.e. obstacles) on the right and inhibited if there is high range activity on the left of the visual field where this hidden unit expects the road to be. Finally, the single road filter interpretation is reflected in the weights from this hidden unit to the direction output units. Specifically, this hidden unit makes excitatory connections to the output units on the far left, dictating a sharp left turn to bring the vehicle back to the road center.

Figure 6 illustrates the weights to and from a hidden unit with a more complex representation. This hidden unit acts as a filter for two roads, one slightly left and one slightly right of center. The weights from the video camera retina along with the explanatory schematic in Figure 6 show the positions and orientations of the two roads. This hidden unit makes bimodal excitatory connections to the direction output units, dictating a slight left or slight right turn. Finally, Figure 7 illustrates a still more complex hidden unit representation. Although it is difficult to determine the nature of the representation from the video camera weights, it is clear from the weights to the direction output units that this hidden unit is a filter for three different roads, each dictating a different travel direction. Hidden units which act as filters for one to three roads are the representation structures most commonly developed when the network is trained on roads with a fixed width.
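Weight diagrams like those in Figures 5, 6 and 7 come from mapping a hidden unit's incoming weights back onto the shapes of the two input retinas. A sketch follows; the function name and the assumed layout of the weight matrix (video weights first, then range finder weights, then the feedback weight) are illustrative assumptions.

```python
import numpy as np

def hidden_unit_weight_maps(W_ih, unit):
    """Split one hidden unit's incoming weights back into the shapes of the
    two input retinas for display as a weight diagram (cf. Figures 5-7)."""
    w = W_ih[unit]
    video = w[:30 * 32].reshape(30, 32)                       # video camera retina
    rangefinder = w[30 * 32:30 * 32 + 8 * 32].reshape(8, 32)  # range finder retina
    feedback = w[-1]                                          # input feedback weight
    return video, rangefinder, feedback
```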
`
The network develops a very different representation when trained on snapshots with widely varying road widths. A typical hidden unit from this type of representation is depicted in Figure 8. One important feature to notice from the feedback weights is that this unit is filtering for a road which is darker than the non-road. More importantly, it is evident from the video camera retina weights that this hidden unit is a filter solely for the left edge of the road (see schematic to the left of the weights from the video retina in Figure 8). This hidden unit supports a rather wide range of travel directions. This is to be expected, since the correct travel direction for a road with an edge at a particular location varies substantially depending on the road's width. This hidden unit would cooperate with hidden units that detect the right road edge to determine the correct travel direction in any particular situation.
`
`DISCUSSION AND EXTENSIONS
`
`The diIdDct Jepiaeotllioas developed for di1I'ereDt cin:nJl\'lIDce5 illustrate a key advan(cid:173)
`tale provided by neunI oetwOlb for IDtoDOIDOUI navipDOIl. NlDlely. in this pmdigm
`the data. Dot the propmuner. de1etmiDes 1M saliem maae features crucial to ICcun&e
`roM naviptioo. From a practical standpoint. tbis dill respoasiveness his dramatically
`sped ALVINN's development. Once a ralisDc arti1lcia11'01d geoerllor wu developed.
`blCk-propaptiOD producted in half an hour a relMively succeaful road fonowin, system.
`
`
`
`
Figure 8: Diagram of weights projecting to and from a typical hidden unit in a network trained on roads with different widths.
`
`
`
`
It took many months of algorithm development and parameter tuning by the vision and autonomous navigation groups at CMU to reach a similar level of performance using traditional image processing and pattern recognition techniques.

More speculatively, the flexibility of neural network representations provides the possibility of a very different type of autonomous navigation system in which the salient sensory features are determined for specific driving conditions. By interactively training the network on real road images taken as a human drives the NAVLAB, we hope to develop a system that adapts its processing to accommodate current circumstances. This is in contrast with other autonomous navigation systems at CMU [Thorpe et al., 1987] and elsewhere [Dunlay & Seida, 1988] [Dickmanns & Zapp, 1986] [Kuan et al., 1988]. Each of these implementations has relied on a fixed, highly structured and therefore relatively inflexible algorithm for finding and following the road, regardless of the conditions at hand.
`
There are difficulties involved with training "on-the-fly" with real images. If the network is not presented with sufficient variability in its training exemplars to cover the conditions it is likely to encounter when it takes over driving from the human operator, it will not develop a sufficiently robust representation and will perform poorly. In addition, the network must not solely be shown examples of accurate driving, but also how to recover (i.e. return to the road center) once a mistake has been made. Partial initial training on a variety of simulated road images should help eliminate these difficulties and facilitate better performance.
`
Another important advantage gained through the use of neural networks for autonomous navigation is the ease with which they assimilate data from independent sensors. The current ALVINN implementation processes data from two sources, the video camera and the laser range finder. During training, the network discovers how information from each source relates to the task, and weights each accordingly. As an example, range data is in some sense less important for the task of road following than is the video data. The range data contains information concerning the position of obstacles in the scene, but nothing explicit about the location of the road. As a result, the range data is given less significance in the representation, as is illustrated by the relatively small magnitude weights from the range finder retina in the weight diagrams. Figures 5, 6 and 8 illustrate that the range finder connections do correlate with the connections from the video camera, and do contribute to choosing the correct travel direction. Specifically, in all three figures, obstacles located outside the area in which the hidden unit expects the road to be located increase the hidden unit's activation level while obstacles located within the expected road boundaries inhibit the hidden unit. However, the contributions from the range finder connections aren't necessary for reasonable performance. When ALVINN was tested with normal video input but an obstacle-free range finder image as constant input, there was no noticeable degradation in driving performance. Obviously, under off-road driving conditions obstacle avoidance would become much more important and hence one would expect the range finder retina to play a much more significant role in the network's representation. We are currently working on an off-road version of ALVINN to test this hypothesis.
`
`
`
`
Other current directions for this project include conducting more extensive tests of the network's performance under a variety of weather and lighting conditions. These will be crucial for making legitimate performance comparisons between ALVINN and other autonomous navigation techniques. We are also working to increase driving speed by implementing the network simulation on the on-board Warp computer.

Additional extensions involve exploring different network architectures for the road following task. These include 1) giving the network additional feedback information by using Elman's [Elman, 1988] technique of recirculating hidden activation levels, 2) adding a second hidden layer to facilitate better internal representations, and 3) adding local connectivity to give the network a priori knowledge of the two dimensional nature of the input.
`
In the area of planning, interesting extensions include stopping for, or planning a path around, obstacles. One area of planning that clearly needs work is dealing sensibly with road forks and intersections. Currently, upon reaching a fork, the network may output two widely discrepant travel directions, one for each choice. The result is often an oscillation in the dictated travel direction and hence inaccurate road following. Beyond dealing with individual intersections, we would eventually like to integrate a map into the system to enable global point-to-point path planning.
`
CONCLUSION

More extensive testing must be performed before definitive conclusions can be drawn concerning the performance of ALVINN versus other road followers. We are optimistic concerning the eventual contributions neural networks will make to the area of autonomous navigation. But perhaps just as interesting are the possibilities of contributions in the other direction. We hope that exploring autonomous navigation, and in particular some of the extensions outlined in this paper, will have a significant impact on the field of neural networks. We certainly believe it is important to begin researching and evaluating neural networks in real world situations, and we think autonomous navigation is an interesting application for such an approach.

This work would not have been possible without the input and support provided by Dave Touretzky, Joseph Tebelskis, George Gusciora and the CMU Warp group, and particularly Charles Thorpe, Jill Crisman, Martial Hebert, David Simon, and the rest of the CMU ALV group.
`
REFERENCES

[Dickmanns & Zapp, 1986] Dickmanns, E.D., Zapp, A. (1986) A curvature-based scheme for improving road vehicle guidance by computer vision. "Mobile Robots", SPIE-Proc. Vol. 727, Cambridge, MA.

[Dunlay & Seida, 1988] Dunlay, R.T., Seida, S. (1988) Parallel off-road perception processing on the ALV. Proc. SPIE Mobile Robot Conference, Cambridge, MA.

[Elman, 1988] Elman, J.L. (1988) Finding structure in time. Technical report 8801, Center for Research in Language, University of California, San Diego.

[Jackel et al., 1988] Jackel, L.D., Graf, H.P., Hubbard, W., Denker, J.S., Henderson, D., Guyon, I. (1988) An application of neural net chips: Handwritten digit recognition. Proceedings of IEEE International Conference on Neural Networks, San Diego, CA.

[Jordan, 1988] Jordan, M.I. (1988) Supervised learning and systems with excess degrees of freedom. COINS Tech. Report 88-27, Computer and Information Science, University of Massachusetts, Amherst, MA.

[Kuan et al., 1988] Kuan, D., Phipps, G. and Hsueh, A.-C. (1988) Autonomous robotic vehicle road following. IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 10, Sept. 1988.

[Pawlicki et al., 1988] Pawlicki, T.F., Lee, D.S., Hull, J.J., Srihari, S.N. (1988) Neural network models and their application to handwritten digit recognition. Proceedings of IEEE International Conference on Neural Networks, San Diego, CA.

[Pomerleau, 1989] Pomerleau, D.A. (1989) ALVINN: An Autonomous Land Vehicle In a Neural Network. To appear in Advances in Neural Information Processing Systems, Vol. 1, D.S. Touretzky (ed.), Morgan Kaufmann.

[Pomerleau et al., 1988] Pomerleau, D.A., Gusciora, G.L., Touretzky, D.S., and Kung, H.T. (1988) Neural network simulation at Warp speed: How we got 17 million connections per second. Proceedings of IEEE International Conference on Neural Networks, San Diego, CA.

[Thorpe et al., 1987] Thorpe, C., Hebert, M., Kanade, T., Shafer, S. and the members of the Strategic Computing Vision Lab (1987) Vision and navigation for the Carnegie Mellon NAVLAB. Annual Review of Computer Science Vol. II, Ed. Joseph Traub, Annual Reviews Inc., Palo Alto, CA.

[Waibel et al., 1988] Waibel, A., Hanazawa, T., Hinton, G., Shikano, K., Lang, K. (1988) Phoneme recognition: Neural networks vs. hidden Markov models. Proceedings from Int. Conf. on Acoustics, Speech and Signal Processing, New York, New York.
`