`
`IEEE TRANSACTIONS C/N SYSTEMS. MAN, AND CYBERNETICS, VOL . 20. NO. 2, MARCH/APRIL 1990
`
`
`
`
`
`
`A Fault-Tolerant Architecture for an
`
`
`
`Automatic Vision-Guided Vehicle
`
`MANSUR R. KABUKA, MEMBER, IEEE, SURJADI HARJADI STUDENT MEMBER, IEEE,
`AND AKMAL YOUNIS
`
`
`
`
`
`
`
`
`
`Abstract —A fault-tolerant architecture for an automatic navigation
`
`
`
`
`
`
`
`
`
`
`system is presented. The system employs a mixed type of architecture in
`
`
`
`
`
`
`
`
`
`which the speed advantages of both pipelined and parallel architectures
`
`
`
`
`
`
`
`
`
`are exploited to achieve real-time navigation. The fault-tolerant archi-
`
`
`
`
`
`
`
`
`
`
`tecture is presented using two reconfiguration strategies. To evaluate the
`
`
`
`
`
`
`
`
`
`proposed architecture, its reliability, availability, and safety are investi-
`
`
`
`
`
`
`
`
`
`
`gated using Markov models. In addition, the feasibility of implementing
`
`
`
`
`
`the proposed architecture is studied.
`
`
`
`
`
`
`
`
`
`
`ing any unknown environment has yet to be achieved,
`
`
`
`
`
`
`
`
`although great advancements have been made. On the
`
`
`
`
`
`
`
`
`contrary, if the environment is controllable a likely situ-
`
`
`
`
`
`
`
`
`ation in indoor industrial settings—the situation is simpli-
`
`
`
`
`
`
`
`fied and practical solutions could be implemented.
`
`
`
`
`
`
`
`
`In a controlled environment, the AGV usually navi-
`
`
`
`
`
`
`
`gates by correlating the information previously stored
`
`
`
`
`
`
`
`
`about the environment with the information it gathers
`
`
`
`
`
`
`
`
`
`
`
`along its way. One way of providing this information is in
`
`
`
`
`
`
`
`
`
`
`the use of marks and patterns that can be discretely
`
`
`
`
`
`
`
`
`placed within the environment. The AGV determines its
`
`
`
`
`
`
`
`
`position relative to these marks and subsequently locates
`
`
`
`
`
`
`
`
`
`itself in the environment. Some of the marks already
`
`
`
`
`
`
`
`developed include laser-detected corner cubes [19] and
`
`
`
`
`retroreflective "spot marks" [22].
`
`
`
`
`
`
`
`
`Artificial intelligence also has been used for purposes
`
`
`
`
`
`
`
`
`
`
`of path planning and navigation [23], [24]. If the environ-
`
`
`
`
`
`
`
`
`ment is assumed to be known completely, algorithmic
`
`
`
`
`
`
`
`
`
`
`methods can be employed to plan the path as a one-time
`
`
`
`
`
`
`
`
`
`off-line operation [25]—[27]. Although this can prove to be
`
`
`
`
`
`
`
`
`
`fast for path-planning execution, it does not account for
`
`
`
`
`
`
`
`
`
`
`the possibility of the path being blocked due to temporary
`
`
`
`
`
`
`
`
`
`
`reasons. In such a situation, the AGV's should have the
`
`
`
`
`
`
`
`ability to revise, during navigation, their previously
`
`
`
`
`
`
`
`
`
`planned paths so as to obtain optimal or suboptimal
`
`
`solutions [28]-[30].
`
`
`
`
`
`
`
`
`
`To increase the efficiency and minimize the chance of
`
`
`
`
`
`
`
`
`
`
`accidents, an AGV system must be designed with a maxi-
`
`
`
`
`
`
`
`
`mum regard towards reliability, availability, and safety. In
`
`
`
`
`
`
`
`
`this paper, a fault-tolerant architecture is proposed for
`
`
`
`
`
`
`
`
`
`
`the implementation of the AGV presented in [29] and [30]
`
`
`
`
`
`
`
`
`
`to attain these properties. No matter how well designed
`
`
`
`
`
`
`
`
`
`
`
`the system is, its circuit components may fail at a crucial
`
`
`
`
`
`
`
`
`
`
`
`moment. This can prove to be hazardous not only to the
`
`
`
`
`
`
`
`
`
`
`AGV but also to its environment. In a less serious condi-
`
`
`
`
`
`
`
`
`
`
`
`tion, the AGV can lose its way and wander until rescued,
`
`
`
`
`
`
`
`
`
`with the consequence of wasting both time and efficiency.
`
`
`
`
`
`
`
`
`Moreover, the interruption of material flow caused by
`
`
`
`
`
`
`
`
`malfunction can seriously disrupt the whole system's op-
`
`
`
`
`
`
`
`
`eration. Under more serious circumstances, the AGV can
`
`
`
`
`
`
`
`collide with another AGV, causing massive destruction
`
`
`
`
`
`
`
`
`
`
`
`that could have been avoided with the use of a fault-
`
`
`
`
`
`
`
`
`
`tolerant AGV. On the other hand, a fault-tolerant AGV
`
`
`
`
`
`
`
`
`
`will continue its work uninterrupted and thus inflict a
`
`
`
`
`
`positive influence on overall productivity.
`
`
`1.
`
`
`
`
`THE study of automatic guided vehicles (AGV) has
`
`INTRODUCTION
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`received a great deal of attention in the past decade
`
`
`
`
`
`
`
`
`
`due in part to the increasingly complex modern trans-
`
`
`
`
`
`
`
`
`
`
`portation problem and to the vast area of related applica-
`
`
`
`
`
`
`
`
`
`tions that involve monotonous and tedious tasks or haz-
`
`
`ardous environments.
`
`
`
`
`
`
`
`
`
`
`
`In fact, AGV's can be considered one of the key factors
`
`
`
`
`
`
`
`
`in flexible manufacturing systems. These systems are used
`
`
`
`
`
`
`
`
`
`to maximize the throughput by attempting to equalize the
`
`
`
`
`
`
`
`
`workload among their various components. This can be
`
`
`
`
`
`
`
`
`achieved by premeditated task planning and devising of
`
`
`
`
`
`
`
`efficient navigation methods for the autonomous trans-
`
`
`
`
`
`
`
`porters. These methods should be flexible, inexpensive,
`
`
`
`
`
`
`
`
`and easily modifiable. Initial efforts in obtaining these
`
`
`
`
`
`
`
`
`
`methods for robot navigation included the use of buried
`
`
`
`
`
`
`
`
`
`
`wires [1] and painted lines [2], [3]. The buried-wire method
`
`
`
`
`
`
`
`
`
`is still the most popular among the Japanese manufactur-
`
`
`
`
`
`ers of mobile robots [4].
`
`
`
`
`
`
`
`Research has concentrated lately on navigation with
`
`
`
`
`
`
`
`
`
`
`little or no a priori knowledge of the surrounding environ-
`
`
`
`
`
`
`
`
`ment. Since the knowledge about the environment is
`
`
`
`
`
`
`
`
`
`minimal, the system must depend on its sensing mecha-
`
`
`
`
`
`
`
`
`
`nisms for navigation. Most of the current research has
`
`
`
`
`
`
`
`
`relied upon visual sensors [5]—[13], ultrasonic rangefinders
`
`
`
`
`
`
`
`
`
`[14], tactile sensors [15]—[18], and laser rangefinders
`
`
`
`
`
`
`
`
`
`[19]—[21]. These systems are usually designed with the aim
`
`
`
`
`
`
`
`of being completely autonomous. However, most mobile
`
`
`
`
`
`
`
`
`
`
`robots are still inept in this regard, the limitations defined
`
`
`
`
`
`
`
`finally by their sensing capabilities. Realistic implementa-
`
`
`
`
`
`
`
`
`tion of completely autonomous robots capable of navigat-
`
`
`
`
`
`
`
`
`
`Manuscript received February 24, 1988; revised August 8, 1989.
`
`
`
`
`
`
`
`
`
`
`The authors are with the Department of Electrical and Computer
`
`
`
`
`
`
`
`
`
`
`
`Engineering, University of Miami, P. O. Box 248294, Coral Gables, FL
`
`
`33124.
`
`
`
`
`IEEE Log Number 8932019.
`
`
`
`
`
`0018-9472/90/0300-0380$01.00 ©1990 IEEE
`
`AHM, Exh. 1007, p. 1
`
`
`
`KABLIKA et at.: A FAULT-TOLERANT ARCHITECTURE FOR AN AUFOSIA1 I(' VISION-GUIDED VEIIR LF
`
`381
`
`
`
`
`
`
`
`
`
`
`The navigation system, presented in [29] and [30], is
`
`
`
`
`
`
`
`
`briefly described in Section II. The fault-tolerant architec-
`
`
`
`
`
`
`
`
`
`
`
`ture of the system is presented in Section III. An evalua-
`
`
`
`
`
`
`
`
`
`tion of the proposed architecture is investigated in Sec-
`
`
`
`
`
`
`
`
`
`tion IV, in which reliability, availability, and safety are
`
`
`
`
`
`
`
`
`
`discussed using Markov models. In Section V, a feasibility
`
`
`
`
`
`
`
`
`
`study of the proposed architecture is conducted to illus-
`
`
`
`
`
`
`
`
`
`
`trate that it could be implemented to perform the re-
`
`
`
`
`
`
`
`
`
`quired real-time navigation of the AGV. Also, this feasi-
`
`
`
`
`
`
`
`
`
`
`bility study could be considered, in a wider context, useful
`
`
`
`
`
`
`
`
`for implementing any application in which the transfor-
`
`
`
`
`
`
`
`
`
`mation of a path network map into an interconnected
`
`
`
`graph is needed.
`
`
`b)
`
`
`a)
`
`
`
`
`
`II. AGV SYSTEM DESCRIPTION
`
`
`
`
`
`
`
`
`The AGV goes through three different phases to
`
`
`
`achieve real-time navigation.
`
`
`
`
`
`
`
`
`Path network learning: During this phase a scaled
`
`
`
`
`
`
`
`
`
`
`map of the path network is presented to the system
`
`
`
`
`
`
`
`via a camera. The system automatically extracts
`
`
`
`
`
`
`
`
`information, such as location and number of inci-
`
`
`
`
`
`
`
`
`
`dent edges of a node, node adjacency, edge path
`
`
`
`
`
`
`
`
`direction trace, and edge length, using image pro-
`
`
`
`
`
`
`
`cessing techniques. Information, such as width and
`
`
`
`
`
`
`
`
`
`
`height of an edge, whether an edge is directed or
`
`
`
`
`
`
`
`
`
`
`not, and the maximum load an edge can support, is
`
`
`
`
`
`
`
`
`
`input to the system interactively and stored in the
`
`
`
`
`
`
`
`
`path network database. The learning process is not
`
`
`
`
`
`
`
`
`
`entirely carried out in an interactive mode due to
`
`
`
`
`
`
`
`
`
`
`the fact that it would have been time consuming and
`
`
`
`
`prone to human errors.
`
`
`
`
`
`
`
`
`Automatic path scheduling: In this phase, the infor-
`
`
`
`
`
`
`
`
`
`mation in the path network database is used to
`
`
`
`
`
`
`
`
`
`
`generate a path that could be traced by the AGV to
`
`
`
`
`
`
`
`
`navigate from the source node to the destination
`
`
`
`
`
`
`
`
`note. The requested navigation could be specified to
`
`
`
`
`
`
`
`
`the system in one of three possible alternatives:
`
`
`
`
`1) initial, destination nodes;
`
`
`
`
`
`
`
`
`
`2) initial, destination, and part of the path cross-
`
`
`ing nodes;
`
`
`
`
`
`
`
`
`3) initial, destination, and all the path crossing
`
`nodes.
`
`
`
`
`
`
`c) Real-time navigation: During real-time navigation,
`
`
`
`
`
`
`
`the AGV extracts information about the environ-
`
`
`
`
`
`
`
`
`ment and confirms it with the previously planned
`
`
`
`
`
`
`
`
`path. For the purpose of guidance, a finite-state
`
`
`
`
`
`
`
`
`
`machine is designed to control the navigation of the
`
`
`
`
`
`
`
`
`
`AGV. The AGV identifies nodes by decoding a bar
`
`
`
`
`
`
`
`
`
`code attached to each node and detects its current
`
`
`
`
`
`
`
`
`
`
`position by monitoring a painted line on the floor. If
`
`
`
`
`
`
`
`
`
`
`an unexpected node is reached, or a path is found
`
`
`
`
`
`
`
`
`blocked by an obstacle, a double heuristic search
`
`
`
`
`
`
`
`
`
`
`technique is used to generate a new path, to guide
`
`
`
`
`
`
`
`the system back to the required destination.
`
`
`
`
`
`
`
`
`
`For the AGV to perform the previously mentioned
`
`
`
`
`
`
`
`
`
`tasks, it utilizes a multiple instruction multiple data stream
`
`
`
`
`
`
`
`
`(MIMD) architecture in which three parallel paths are
`
`identified.
`
`
`
`
`
`
`
`
`I) Node/edge pipelined unit (NPU): This path consists
`
`
`
`
`
`
`
`of four image-processing pipelined modules used for
`
`
`
`
`
`segmentation, smoothing, thinning, and node/edge
`
`
`
`
`
`
`
`
`extraction. The NPU first transforms the input im-
`
`
`
`
`
`
`
`
`
`
`age of the path network into a graph of intercon-
`
`
`
`
`
`
`
`
`
`nected segments whose widths are one pixel in each
`
`
`
`
`
`
`
`
`direction. This graph is processed to extract infor-
`
`
`
`
`
`
`
`mation about network nodes and their interconnect-
`
`
`
`
`
`
`
`
`ing edges. The extracted information is stored in
`
`
`
`
`
`
`
`
`order to be used for automatic path planning.
`
`
`
`
`
`
`
`
`
`2) Bar-code pipelined unit (BPU): When the AGV navi-
`
`
`
`
`
`
`
`
`
`
`
`gates along a path, it decodes the bar codes of the
`
`
`
`
`
`
`
`
`nodes it encounters using the BPU. The decoded
`
`
`
`
`
`
`
`information is compared with the stored informa-
`
`
`
`
`
`
`
`
`
`
`
`tion about the path in order for the AGV to deter-
`
`
`
`
`
`
`
`
`mine if it is on the right path.
`
`
`
`
`
`
`
`
`
`
`3) Obstacle avoidance unit (OAU): This unit is used for
`
`
`
`
`
`
`detecting unexpected obstacles using an ultrasonic
`
`
`
`
`
`
`
`
`rangefinder. For the occasion that an obstacle is
`
`
`
`
`
`
`
`
`detected, by observing both a discontinuity in the
`
`
`
`
`
`
`
`
`
`painted line and a shorter range than the expected
`
`
`
`
`
`
`
`
`
`distance to the next node, the authors developed an
`
`
`
`
`
`
`
`
`algorithm in which the AGV attempts to maneuver
`
`
`
`
`
`
`
`
`
`
`around it, if possible, or else backtracks to the last
`
`
`
`
`
`
`
`
`
`reached node and a new path is replanned. Ultra-
`
`
`
`
`
`
`
`
`sonic obstacle avoidance has been chosen because it
`
`
`
`
`
`
`
`
`is inexpensive, easy to implement, and gives the
`
`
`
`
`
`
`
`required range information. The range data ob-
`
`
`
`
`
`
`
`tained is considered of acceptable resolution be-
`
`
`
`
`
`
`
`
`
`cause it is used for detection, not for recognition.
`
`
`
`
`
`III. AGV FAULT-TOLERANT ARCHITECTURE
`
`
`
`
`
`
`
`
`
`
`Since the AGV consists of three parallel paths, each of
`
`
`
`
`
`
`
`
`
`
`
`which uses a pipelined architecture as shown in Fig. 1, it
`
`
`
`
`
`
`
`
`is difficult to apply fault-tolerance design techniques to
`
`
`
`
`
`
`
`
`
`
`
`the system as a whole. Hence a modular approach is used
`
`
`
`
`
`
`
`
`
`
`in which each of the three parallel paths is designed
`
`independently.
`
`
`
`
`
`
`
`
`
`
`In this paper, the issue of fault tolerance is addressed
`
`
`
`
`
`
`
`
`
`
`for the NPU, due to its inherent complexity as compared
`
`
`
`
`
`
`
`
`
`
`
`to the BPU and OAU, but the concept could similarly be
`
`
`
`
`
`
`
`
`applied to these units. Two architectures arc presented
`
`
`
`
`
`
`
`
`
`for achieving fault tolerance in the NPU. Although both
`
`
`
`
`
`
`
`
`
`architectures are able to recover from two faults only,
`
`
`
`
`
`
`
`
`
`their concept of operation could easily be extended to
`
`
`
`
`
`
`include any number of faults in.
`
`
`
`
`
`
`
`
`
`
`In the first architecture, shown in Fig. 2, a direct
`
`
`
`
`
`
`
`
`replacement strategy is employed to reconfigure the sys-
`
`
`
`
`
`
`
`
`
`tem after fault detection and location. In this strategy,
`
`
`
`
`
`
`
`
`
`
`one of the spares is downloaded with the state and
`
`
`
`
`
`
`
`
`
`
`program of the faulty processor in order to replace it
`
`
`
`
`
`
`
`
`when normal operations are resumed after recovery. While
`
`AHM, Exh. 1007, p. 2
`
`
`
`382
`
`IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, vol.. 20, NO. 2, MARCI I /APRIL 1990
`
`
`
`Ultraeonla
`FlongeFinder
`
`
`
`
`Picture
`Input
`Unit
`
`Segnen tat ion
`Module
`
`Smoothing
`Module
`
`Thinning
`Module
`
`
`
`Obetocle
`
`Pooicio ce
`
`Module
`
`
`
`Node/Edge
`tl
`E
`t
`Module
`
`Bar -Code
`2 t ct
`A Decoder
`
`Coneole
`
`Hord
`Di.k
`
`
`
`Mos ter
`
`Processor
`
`
`
`Interproceesor
`
`
`Interrupt
`
`Network
`
`
`
`
`
`
`
`
`Arbiter
`
`
`
`
`Common
`Plenary
`odule I
`
`
`RGV
`
`Steering
`
`Controller
`
`
`Arbiter
`
`
`
`
`Common
`
`Memory
`
`Module II
`
`
`
`Electric
`
`Compiles
`
`
`Moto,
`
`Sensors
`
`
`Motor,
`
`Fig. 1. AGV system architecture.
`
`Fl
`
`BI
`
`82
`
`Flu
`
`PIPI
`
`PIP2
`
`SPF12
`
`SPN I
`
`IM
`
`SPPIk
`
`SPP02
`
`SPP01
`
`bk
`
`PIPk
`
`SPP0k
`
`Switch In
`
`Switch_O t
`
`ES1
`
`IDI
`
`SVC
`
`102
`
`SPI2
`
`SPI I
`
`Sof twore
`
`Sul tc her
`
`Selection
`Generator
`
`SP I
`
`SP2
`
`852
`
`SP:r
`
`SP0I
`
`M
`
`I B2 I
`
`Bk
`
`
`
`
`
`
`Fig. 2. Direct replacement architecture.
`
`
`
`
`
`
`
`
`
`
`
`in the second architecture, shown in Fig. 3, a rippling
`
`
`
`
`
`
`
`
`
`replacement strategy is employed in which a faulty pro-
`
`
`
`
`
`
`
`
`
`
`
`cessor is replaced by its successor in the pipeline, and that
`
`
`
`
`
`
`
`
`
`
`
`successor is in turn replaced by its successor, and so on,
`
`
`
`
`
`
`
`
`
`
`
`until the last processor in the pipeline is replaced by one
`
`
`
`
`
`
`
`
`of the spare processors. Hence the reconfiguration pro-
`
`
`
`
`
`
`
`
`
`cess results in a shift-of-functions operation that starts at
`
`
`
`
`
`
`
`
`
`
`
`
`the faulty processor and ends up at one of the spare
`
`processors.
`
`
`
`
`
`
`
`
`
`In both architectures, each module of the NPU is
`
`
`
`
`
`
`
`
`
`assumed to consist of a number of similar processors
`
`
`
`
`
`
`
`
`
`
`connected in a pipeline, with the total number of proces-
`
`
`
`
`
`
`
`
`
`
`sors in all four modules equal to k. Both architectures
`
`
`
`
`
`
`
`
`
`employ the concept of standby sparing by having two
`
`AHM, Exh. 1007, p. 3
`
`
`
`
`KABUKA el
`
`
`
`
`
`
`
`
`
`
`
`
`A FAULT-TOLERANT ARCIIITE(TURE FOR AN AUTOMATIC VISION-GUIDED VEIIIUI F
`
`
`353
`
`31
`
`32
`
`-
`
`PIU
`
`PIP'
`
`12
`
`f3k
`
`P1Pk
`
`S
`
`[51
`
`ES2
`
`5P2
`
`IM
`
`
`ESI
`
`
`ID2
`
`
`SVC
`
`
`
`Sof luore
`
`
`
`Switcher
`
`Bypass
`Generator
`
`
`
`
`er1O , y
`
`Ell
`
`' B21
`
`6k
`
`
`
`
`
`
`Fig. 3. Rippling replacement architecture.
`
`
`
`
`
`
`
`
`
`
`spare processors ready to be substituted for faulty proces-
`
`
`
`
`
`
`
`
`
`
`sors, once one or two faults have been detected and
`
`
`
`
`
`
`
`
`located. They differ mainly in the reconfiguration strat-
`
`
`
`
`
`
`
`
`
`
`
`
`egy, by which a faulty processor is replaced by one of the
`
`
`spare processors.
`
`
`
`
`
`
`
`Moreover, to achieve their respective strategies, they
`
`
`
`
`
`
`use a combined hardware/software switching mechanism,
`
`
`
`
`
`
`
`
`
`
`which is initiated when a fault is detected. The hardware
`
`
`
`
`
`
`
`portion is responsible for establishing proper physical
`
`
`
`
`
`
`
`
`links between the spare processors and the nonfaulty
`
`
`
`
`
`
`
`
`processors, while the software portion is responsible for
`
`
`
`
`
`
`
`
`downloading the states and programs needed for each
`
`
`
`
`
`
`
`
`
`architecture so as to get the spare processor functioning
`
`
`
`in the pipeline.
`
`
`
`
`
`A. Direct Replacement Architecture
`
`
`
`
`
`
`
`The direct replacement architecture proposed for the
`
`
`
`
`
`
`
`
`
`
`
`NPU is shown in Fig. 2, where the following basic units
`
`
`are identified.
`
`
`
`
`
`
`
`
`
`a) State verification controller (SVC): This unit is re-
`
`
`
`
`
`
`
`sponsible for scanning and testing the pipelined
`
`
`
`
`
`
`
`
`processors to globally detect any faults. On the
`
`
`
`
`
`
`
`
`other hand, faults are detected locally within each
`
`
`
`
`
`
`
`processor running its own self-diagnostics; when a
`
`
`
`
`
`
`
`
`
`fault is detected, it alters an internal register con-
`
`
`
`
`
`
`
`
`
`
`taining its order (state) in the pipeline. When one or
`
`
`
`
`
`
`
`
`
`
`two faults are detected by the SVC, it generates the
`
`
`
`
`
`
`
`
`
`codes of the faulty processors on its output lines
`
`
`
`
`
`
`(ID,, ID2) and activates the corresponding enable
`
`
`
`
`
`
`
`
`
`
`lines (ES„ ES2). These signals are used in turn by
`
`
`
`
`
`
`
`
`
`
`the other units in the system to initiate the switching
`
`
`
`
`
`
`
`
`
`
`mechanism. Each of the ID codes has s bits, where
`
`
`
`
`
`
`
`
`
`s is given by flog 2 k1, so that each pipelined proces-
`
`
`
`
`
`
`
`sor is given a unique identifying code.
`
`121
`ES_ 1
`
`
`SSPI_I
`
`SSP! 2
`
`
`
`0 EN
`
`
`2
`
`
`
`V
`
`SSP
`
`
`
`
`
`102
`ES_2
`
`
`SSP' 1
`
`SSPI_2
`
`
`
`
`0 EN
`
`
`2
`
`
`
`
`
`
`Select
`
`
`
`X
`
`
`
`
`
`511 _2
`
`SSP!
`
`
`
`
`
`
`
`
`
`
`Fig. 4 Switch in basic design.
`
`
`
`
`
`
`
`
`
`b) Switch,,,: This logic control unit establishes proper
`
`
`
`
`
`
`
`
`links from the outputs of the pipelined nonfaulty
`
`
`
`
`
`
`
`
`processors preceding the faulty ones, to the inputs
`
`
`
`
`
`
`
`
`
`of the spare processors (SPIT, SPI2 ). The basic logic
`
`
`
`
`
`
`
`
`
`design of this unit is shown in Fig. 4.
`
`
`
`
`
`
`
`
`c) Switch„„,: This logic control unit establishes proper
`
`
`
`
`
`
`
`
`links from the outputs of the spare processors (SPO,,
`
`
`
`
`
`
`
`
`SPO2 ) back to inputs of the nonfaulty pipelined
`
`
`
`
`
`
`
`processors succeeding the faulty ones. The basic
`
`
`
`
`
`
`
`
`
`
`
`
`logic design of this unit is shown in Fig. 5. In this
`
`
`
`
`
`
`
`design, when one of the demultiplexer (DEMUX's)
`
`
`
`
`
`
`
`
`
`
`
`is disabled, all its outputs are equal to 0. Also, the
`
`
`
`
`
`
`
`
`
`shown oa gate symbols represent banks of cnt gates.
`
`
`
`AHM, Exh. 1007, p. 4
`
`
`
`
`384
`
`
`
`
`
`
`
`IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, voi.. 20, NO. 2. MAR('H/APRII, 1990
`
`
`
`
`
`
`
`
`
`
`
`101
`ES
`
`
`EN
`
`
`Se le
`c t
`
`
`
`
`V
`
`0
`
`
`
`Input
`
`SP0
`
`
`
`
`
`102
`ES
`
`EN
`
`Select
`
`V
`
`0
`I put
`
`SP0_2
`
`
`
`SPPD
`
`
`I
`
`
`
`
` SPPO_2
`
`__D—
`
`5PPO_k
`
`
`
`
`
`
`Fig. 5. Switch„„, basic design.
`
`E5_1
`
`I0_"
`
`=5_2
`
`10_2
`
`Enable
`
`DECODER
`
`I rpu t
`
`Ennble
`
`DECODER
`
`Input
`
`BI
`
`57
`
`D I
`
`
`HE
`
`
`
`
`
`
`
`
`Fig. 6. Selection bypass) generator basic design.
`
`
`
`
`
`
`
`
`
`d) Selection generator: This combinational logic unit is
`
`
`
`
`
`
`
`
`used to generate selection signals (R, where i =
`
`
`
`
`
`
`
`
`
`
`
`0,1, . . .,k), which are used in turn by the steering
`
`
`
`
`
`
`
`
`
`logic at the input of each pipelined processor to
`
`
`
`
`
`
`
`
`reroute the tokens intended for the faulty processor
`
`
`
`
`
`
`
`
`
`to the spare processor. The steering logic is simply a
`
`
`
`
`
`
`
`
`
`
`MUX that is set to appropriately control the flow of
`
`
`
`
`
`
`
`information through the pipeline before and after
`
`
`
`
`
`
`
`
`the reconfiguration process. The basic design of this
`
`
`
`
`
`
`unit is shown in Fig. 6.
`
`
`
`
`
`
`
`
`e) Spare processors: These processors are identical to
`
`
`
`
`
`
`
`
`the processors used in the pipeline. Whenever one
`
`
`
`
`
`
`
`
`
`
`or two faults arc detected, they can be used to
`
`
`
`
`replace the faulty processors.
`
`
`
`
`
`
`
`
`
`
`f) Pipeline input unit (PIU): This unit controls the in-
`
`
`
`
`
`
`
`
`
`
`puts to the pipeline, depending on the state of the
`
`
`g)
`
`
`
`
`
`
`
`
`
`system. In normal operation, it supplies image infor-
`
`
`
`
`
`
`
`mation to the pipelined processors, while during
`
`
`
`
`
`
`
`
`
`recovery from one or two faults, it prohibits the
`
`
`
`
`
`
`
`
`
`input from the image memory (IM) to the system.
`
`
`
`
`
`
`
`
`Software switcher: This unit is responsible for down-
`
`
`
`
`
`
`
`
`loading spare processors with the states and pro-
`
`
`
`
`
`
`
`
`
`
`grams of faulty processors in order to take over their
`
`
`
`
`
`
`
`
`
`role in the pipeline after recovery. It mainly consists
`
`
`
`
`
`
`
`of a general-purpose processor and an attached
`
`
`
`
`
`
`
`
`memory. The processor is interrupted by the recov-
`
`
`
`
`
`
`
`
`
`
`ery signal (R) issued by SVC once recovery from a
`
`
`
`
`
`
`
`
`fault is required. The interrupt service routine (ISR)
`
`
`
`
`
`
`performs the necessary downloading. The attached
`
`
`
`
`
`
`
`
`
`memory stores a duplicate of the programs of all
`
`
`
`
`
`
`pipeline processors. Also, it contains a look-up table
`
`
`
`
`
`
`
`
`
`for storing the starting and ending addresses of each
`
`AHM, Exh. 1007, p. 5
`
`
`
`
`
`
`
`
`
`
`
`
`KABUKA et at.: A FAULT-TOLERANT ARCHITECTURE FOR AN AUTOMATIC VISION-CL I OE!) NTH I('LE
`
`
`
`
`
`385
`
`
`
`
`
`
`
`
`
`
`
`program. The algorithm of ISR is detailed as fol-
`
`lows.
`
`
`
`
`
`
`1) read ES,, ES2 , ID,, ID2
`
`
`
`
`
`
`2) FOR i =1 TO no_of_spares Do
`
`
`
`
`
`
`IF (used_spares = i — 1) AND (ES, = 1) THEN
`
`BEGIN
`
`
`
`
`
`
`read lookup [/.0„1], lookup ]/D„2.]
`
`
`
`
`
`
`read memory segment starting at
`lookup
`
`
`
`
`
`
`
`[ID,, 1] and ending at lookup [ID„2]
`
`
`
`
`
`write memory segment into SP,
`
`
`
`
`
`
`
`
`
`write state of faulty processor (ID,) into SP,
`
`
`
`used_spares = used_spares + 1
`
`END
`
`
`
`
`
`
`
`In the preceding algorithm, the variable "used_
`
`
`
`
`
`
`
`
`
`spares" is initialized with 0 upon system power up.
`
`
`
`
`
`
`
`Moreover, the algorithm could be easily modified
`
`
`
`
`
`
`
`
`
`for recovery from m number of faults by initializing
`
`
`
`
`
`
`
`
`
`
`"no_of spares with the m (in our system, m = 2).
`
`
`
`
`
`
`
`The system employs software redundancy, by stor-
`
`
`
`
`
`
`
`
`ing a duplicate of all pipelined processors programs
`
`
`
`
`
`
`
`
`in the attached memory, to achieve the required
`
`
`
`
`
`
`
`downloading during recovery. Also, the recovery pe-
`
`
`
`
`
`
`
`
`
`
`riod is the time required to read the programs of
`
`
`
`
`
`
`
`
`faulty processors and then write them, along with
`
`
`
`
`
`
`
`
`
`the states of the faulty to the spare processors.
`
`
`
`
`
`
`
`
`
`
`h) Image memory (IM): This unit consists of two mem-
`
`
`
`
`
`
`
`
`
`
`ory buffers, each of which is capable of storing the
`
`
`
`
`
`
`
`
`whole image. This buffering scheme is employed to
`
`
`
`
`
`
`maintain concurrency of operations between the
`
`
`
`
`
`
`
`
`
`camera, which is scanning the scene and storing its
`
`
`
`
`
`
`
`
`
`
`image into one buffer, and the PIU, which is reading
`
`
`
`
`
`
`
`
`the previously scanned frame from the other buffer.
`
`
`
`
`
`
`
`
`The camera and PIU operations are switched be-
`
`
`
`
`
`tween the buffers every frame.
`
`
`
`
`B. Rippling Replacement Architecture
`
`
`
`
`
`
`
`The rippling replacement architecture proposed for the
`
`
`
`
`
`
`
`
`
`
`
`NPU is shown in Fig. 3, where the following basic units
`
`
`are identified.
`
`
`
`
`
`
`
`
`
`a) State verification controller (SVC): This unit has the
`
`
`
`
`
`
`
`
`same function and design of the corresponding unit
`
`
`
`in Fig. 2.
`
`
`
`
`
`
`
`
`b) Bypass generator: This combinational logic unit is
`
`
`
`
`
`
`
`responsible for the generation of bypass control
`
`
`
`
`
`
`
`signals (B,, where i = 0,1,. . .,k) used by the steering
`
`
`
`
`
`
`
`
`
`logic at the output of each pipelined processor to
`
`
`
`
`
`
`
`
`
`bypass that processor when it is detected as faulty.
`
`
`
`
`
`
`
`
`
`
`
`The basic design of this unit is mainly the same as
`
`
`
`
`
`that given in Fig. 5.
`
`
`
`
`
`
`
`
`
`Pipelined input unit (PIU): This unit has the same
`
`
`
`
`
`
`
`
`
`dual function of the corresponding unit in Fig. 2,
`
`
`
`
`
`
`
`
`
`with the exception that during recovery from one or
`
`
`
`
`
`
`
`
`
`
`two faults, it not only prohibits the input from IM
`
`
`
`
`
`
`
`
`but also passes downloading tokens supplied by the
`
`
`
`
`
`
`
`
`software switcher to the pipeline to provide proper
`
`recovery.
`
`
`c)
`
`
`
`
`
`
`
`
`
`
`
`d) Software switcher: The function of this unit is to
`
`
`
`
`
`
`
`download each of the processors succeeding the
`
`
`
`
`
`
`
`
`
`faulty processor in the pipeline with the state and
`
`
`
`
`
`
`
`
`program of its preceding processor. This is accom-
`
`
`
`
`
`
`
`
`
`plished by sending command tokens to the latter in
`
`
`
`
`
`
`
`
`
`
`order to pass its state to its successor. This mecha-
`
`
`
`
`
`
`
`
`
`nism is employed to maintain a high degree of
`
`
`
`
`
`
`
`
`
`parallelism during recovery, and it is applied to all
`
`
`
`
`
`
`
`processors succeeding the faulty, except the one
`
`
`
`
`
`
`
`immediately succeeding it, which must be down-
`
`
`
`
`
`
`
`
`
`loaded with the entire state and program of the
`
`
`faulty processor.
`
`
`
`
`
`
`
`
`
`This unit is similar in configuration to the correspond-
`
`
`
`
`
`
`
`
`
`
`
`
`
`ing unit in Fig. 2, although its ISR is different due to the
`
`
`
`
`
`
`difference between the replacement strategies employed.
`
`
`
`
`
`
`
`
`
`Also, its attached memory contains, in addition to the
`
`
`
`
`
`
`
`
`
`programs and look-up table, a download table that con-
`
`
`
`
`
`
`
`
`
`tains the necessary command tokens to initiate the down-
`
`
`
`
`
`
`
`
`
`loading of each processor's state and program to its
`
`
`
`
`
`
`
`
`
`
`successor. The algorithm of the ISR is detailed as follows.
`
`
`
`
`
`read ES,, ES2, ID,, ID2
`
`
`
`
`
`
`FOR i =1 TO no_of_spares DO
`
`
`
`
`
`
`
`IF (used_spares = i — 1) AND (ES ' = 1) THEN
`
`
`BEGIN
`
`
`
`
`
`FOR j = k DOWN TO ID, +1 DO
`
`BEGIN
`
`
`
`read download [IDj]
`
`
`
`
`
`send download Lip/1 to PIU
`
`END
`
`
`
`
`
`
`read lookup [ID,, 1], lookup [ID,,2]
`
`
`
`
`
`
`
`read memory segment starting at lookup [ ID,,1]
`
`
`
`
`
`and ending at lookup [ID,2]
`
`
`
`
`
`
`
`write memory segment
`into processor number
`ID, +1
`
`
`
`
`
`
`
`
`
`write state of faulty processor (/D,) into processor
`
`
`number ID, + 1
`
`
`
`used_spares = used_spares + 1
`
`END
`
`
`
`
`
`
`
`In the preceding algorithm, the software redundancy
`
`
`
`
`
`
`
`
`
`employed is more than that of the direct replacement
`
`
`
`
`
`
`
`strategy architecture due to the downloading routines
`
`
`
`
`
`
`
`
`
`added to the program of each pipelined processor, en-
`
`
`
`
`
`
`
`
`
`abling it to download its state to its successor.
`
`
`
`
`
`
`
`
`
`
`The recovery period is the time required not only to
`
`
`
`
`
`
`
`
`
`read the programs of the faulty processors and write
`
`
`
`
`
`
`
`
`
`them to their successors, but also the time required
`
`
`
`
`
`
`
`
`
`to send the download table entries. Hence the re-
`
`
`
`
`
`
`covery period for rippling replacement architecture
`
`
`
`
`
`
`
`
`
`is longer than that of the direct replacement archi-
`
`
`
`
`
`
`
`
`tecture. This conclusion is in compliance with the
`
`
`
`
`
`
`
`fact that the rippling replacement architecture is
`
`
`
`
`
`
`
`
`using less hardware in the switching mechanism, and
`
`
`
`
`
`
`
`
`
`therefore it must lose some of the speed advantage
`
`
`
`of hardware switching.
`
`
`
`
`
`
`
`
`
`
`e) Image memory (IM): This unit has the same configu-
`
`
`
`
`
`
`
`
`ration as the corresponding unit in Fig. 2.
`
`AHM, Exh. 1007, p. 6
`
`
`
`
`386
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`IEEE TRANSACTIONS ON SYSTEMS. MAN, AND CYBERNFTI('S, Vol- 20, NO. 2, mAatil/APRIL 1990
`
`
`
`
`
`C. Switching Mechanism
`
`
`
`
`
`
`
`
`In both architectures, the switching mechanism is de-
`
`
`
`
`
`
`
`
`
`fined as the combined set of coordinated actions per-
`
`
`
`
`
`
`
`
`
`
`
`formed by the different units of the system to achieve the
`
`
`
`
`
`
`
`
`
`
`required switching at the beginning and end of the recov-
`
`
`
`
`
`
`
`
`
`ery period. This mechanism is illustrated using the follow-
`
`
`
`ing algorithmic approach.
`
`
`
`
`
`
`
`1) WHILE (No fault is detected) oo
`
`BEGIN
`
`
`
`
`
`
`
`a) SVC scans and tests processors' states
`
`
`
`
`
`
`
`
`
`b) PIU reads image tokens from IM and feeds
`
`
`
`
`them to the pipeline
`
`
`END
`
`2) SVC
`
`
`
`
`
`
`
`a) generates ID code(s) of failing processor(s)
`
`
`
`
`
`
`
`b) activates ES,, or ES2, or both
`
`
`
`
`
`c) activates recovery signal (R)
`
`
`
`3) PAR BEGIN:
`
`
`
`
`
`
`
`a) PIU prohibits input to the pipeline
`
`
`
`
`
`
`
`
`
`b) proper B,'s are generated by the selection (or
`
`
`bypass) generator
`
`
`
`
`
`
`
`
`c) software switcher reads code(s) of the faulty
`
`processor(s)
`
`
`PAR END
`
`
`
`
`
`
`4) CASE replacement strategy employed is
`
`
`direct replacement:
`
`BEGIN
`
`
`software switcher
`
`
`
`
`
`
`a) reads program of faulty processor(s)
`
`
`
`
`
`
`
`
`
`b) downloads the state and program of faulty to
`
`
`the spare(s)
`
`
`
`
`END
`
`
`rippling replacement:
`
`BEGIN
`
`
`software switcher
`
`
`
`
`
`
`
`
`a) issues command tokens to initiate shift of func-
`
`
`tions operation
`
`
`
`
`
`
`b) reads program of faulty processor(s)
`
`
`
`
`
`
`
`
`
`c) downloads the state and program of the faulty
`
`
`
`to its successor(s)
`
`
`END
`
`
`END CASE
`
`
`
`
`
`
`SVC deactivates R, i.e., GOTO 1.
`
`
`
`
`
`
`
`In this mechanism, the combined hardware/software ap-
`
`
`
`
`
`
`
`
`
`
`proach of switching is revealed by the interaction of the
`
`
`
`
`
`
`
`
`hardware components, i.e., SVC, PIU, selection (or by-
`
`
`
`
`
`
`
`
`
`pass) generator, switch U11, switch,„, and the steering logic,
`
`
`
`
`
`
`
`
`with the software components, i.e., the software switcher
`
`
`
`
`
`
`
`
`
`and its attached memory. This combination makes use of
`
`
`
`
`
`
`
`
`the advantages of both hardware switching, i.e., speed
`
`
`
`
`
`
`
`and software switching, i.e., noncomplex switching cir-
`
`cuitry.
`
`
`
`IV. EVALUATION OF FAULT-TOLERANT
`
`ARCHITECTURES
`
`
`
`
`
`
`
`The direct and rippling replacement architectures that
`
`
`
`
`
`
`
`
`
`were discussed earlier achieve the same degree of fault
`
`
`
`
`
`
`
`
`tolerance as opposed to the nonredundant system, which
`
`
`
`(a)
`
`
`
`OF
`
`x
`
`y
`
`
`(b)
`
`
`
`
`
`Fig. 7. Markov modes.
`
`
`
`
`
`
`
`
`
`
`
`
`
`has no fault tolerance at all, i.e., it fails whenever a fault
`
`
`
`
`
`
`
`
`condition exists. This fault tolerance advantage is achieved
`
`
`
`
`
`
`
`
`at the expense of additional hardware and software
`
`
`
`
`
`
`
`needed in both redundant architectures. The hardware
`
`
`
`
`
`
`
`redundancy ratio in the rippling replacement architecture
`
`
`
`
`
`
`
`
`
`is less than that of the direct replacement architecture.
`
`
`
`
`
`
`
`
`
`Conversely, the software redundancy ratio is higher in the
`
`
`
`rippling replacement architecture.
`
`
`
`
`
`
`
`
`
`
`As far as the delay of the redundant architectures is
`
`
`
`
`
`
`concerned, the direct replacement architecture introduces
`
`
`
`
`
`
`
`
`
`a delay, as compared to the nonredundant system, given
`
`by
`
`
`[4+3r]d
`
`
`where
`
`
`
`
`
`
`
`
`
`r 1, if at least one spare is used