`
`Fault Tolerant Distributed Architectures for in-Vehicular Networks
`Author(s): Syed Misbahuddin and Nizar Al-Holou
`Source: SAE Transactions, Vol. 110, Section 7: JOURNAL OF PASSENGER CARS:
`ELECTRONIC AND ELECTRICAL SYSTEMS (2001), pp. 277-281
`Published by: SAE International
`Stable URL: https://www.jstor.org/stable/44718335
`Accessed: 04-11-2021 13:20 UTC
`
`
`AHM, Exh. 1005, p. 1
`
`
`
2001-01-0673
`
`Fault Tolerant Distributed Architectures
`for in-Vehicular Networks
`
`Syed Misbahuddin
`King Fahd University of Petroleum and Minerals Saudi Arabia
`
Nizar Al-Holou
`University of Detroit Mercy
`
`Copyright© 2001 Society of Automotive Engineers, Inc.
`
`ABSTRACT
`
The growing use of automotive electronics mandates the introduction
of multiple processors in vehicles. Automotive electronic systems have
to operate in harsh environments with a wide temperature range, high
humidity, unpredictable vibrations and rapid voltage variation. In such
environments, automotive electronic systems become vulnerable to
intermittent and transient failures. Depending upon the importance of
the tasks performed by a processor, its failure inside an automotive
electronic system may lead to serious consequences. Fault tolerant
computing techniques are used to keep computer systems running in
spite of the failure of one or more processors. The concept of fault
tolerance is well known in many applications such as airplanes,
industry, and military. However, the question of fault tolerant design
has drawn little attention in automotive electronics. In this paper,
various fault tolerant architectures for automotive applications are
proposed. In these schemes, fault tolerance is achieved by assigning
the tasks of a failing processor to another processor in the system.
In this way, the automotive electronic system may continue to function
despite multiple processor failures.
`
I. INTRODUCTION
`
Electronics has been introduced into automobiles to provide efficient
implementation of all automotive functions. In the initial electronics
implementations, a single processor called the electronic control unit
provided complete vehicular control. In order to improve existing
features as well as to introduce new ones, more processors are being
introduced into automotive electronic systems. When multiple
processors are used in an automobile system, the failure of one may
make the feature associated with the failing processor unavailable.
Depending upon the importance of the failing processor, the automotive
system may come to a complete halt. In the current state of the art,
there is no fault tolerant architecture available in automobiles [18].
Because of the cost and space restrictions associated with automotive
electronics, special fault tolerant methods should be developed. In
this paper, various fault tolerant distributed processing
architectures for automotive applications are proposed. The proposed
schemes use combined software-hardware approaches to address fault
tolerance in automobiles. In section II, a single bus based
architecture is outlined along with a fault tolerant algorithm for
detecting faults in processors. Section III extends the idea of the
single bus based scheme to a hierarchical distributed architecture. In
section IV a multi-network scheme is presented. Finally, conclusions
are given in section V.
`
`II. SINGLE BUS BASED FAULT TOLERANT
`DISTRIBUTED PROCESSING ARCHITECTURE
`
Single vehicle-wide networks provide many advantages such as
economical multiplexing, flexibility in adding or removing control
nodes, and a single communication protocol [1]. In a single bus
multiplexing network, all intelligent nodes are connected to the
system bus through interfaces. In this section, a single bus based
fault tolerant distributed processing architecture is proposed. This
architecture allows a vehicular system to function in spite of
multiple processor failures. Fault tolerance is achieved by assigning
the tasks of the failed processor to a functioning processor, which
continues its original tasks in addition to the assigned tasks of the
failed processor.
`
`ARCHITECTURAL FEATURES OF THE SINGLE BUS
`BASED ARCHITECTURE
`
The proposed system consists of n processing nodes (P1, P2, ..., Pn),
m sensor groups (SG1, SG2, ..., SGm) and k actuator groups (ACG1,
ACG2, ..., ACGk), as shown in Figure 1. The sensor and actuator groups
consist of smart sensors and actuators [2]. All processors are
connected to the bus via network interface logic, which is subject to
random errors [3]. The processing power of a processor becomes
unavailable when the processor and/or its interface logic becomes
faulty. This scheme uses a central controller unit (CCU). The CCU
monitors the performance of all processors connected to the automotive
multiplexing bus. On detecting the failure of a processing node, the
CCU executes a fault tolerant algorithm that assigns the tasks of the
failed processor to another processor.

[Figure 1 block labels: Central controller unit (CCU), Code Memory,
µC, MUX]

Figure 1: Block diagram of single bus based fault tolerant distributed
processing architecture
`
In this proposed architecture, each processor is interfaced with a
two-port (dual-port) memory module. One port of each memory module is
connected to its own processor and the second port is connected to the
main bus. Dual-port memory allows any processor in the system to
access any other processor's memory via the multiplexing bus without
involving that processor. This feature is conducive to implementing
the proposed fault tolerant scheme discussed later. In this system,
the code memory module shown in Figure 1 holds the significant program
code segments for all processors.
`
`FAULT DIAGNOSTIC ALGORITHM FOR THE
`PROPOSED SINGLE BUS BASED ARCHITECTURE
`
The proposed single bus based scheme allows an automotive system to
function when one or more processors fail. To implement this scheme,
the central controller unit (CCU) performs a supervisory action,
during which it identifies the faulty processor in the system and
takes the appropriate action to keep the system running. The CCU uses
a variable called the processor index (PINDX), which points to the ith
processor at a given instant of time. The CCU periodically sends a
diagnostic message to the processor indicated by PINDX, so that all
processors in the system are checked in turn. If a processor is not
faulty, it responds to the CCU's diagnostic message by sending an
acknowledgment message to the CCU. If the CCU does not receive this
acknowledgment within a predefined interval of time, it marks that
processor as faulty and assigns its tasks to another processor in the
system that is performing less critical tasks. The CCU can transfer
the critical program code of the faulty processor to the assigned
processor's memory by accessing the faulty processor's memory via the
multiplexing bus. Alternatively, the assigned processor may directly
execute the critical code of the faulty processor by reading it from
the code memory of the faulty processor. For the first option, the
assigned processor's memory is partitioned into two parts, so that one
half holds the critical program code of the faulty processor and the
other half holds the program code of the assigned processor. The
assigned processor continues performing its original tasks in addition
to this new assignment on a time-sharing basis. It can access the
sensors and actuators related to the faulty processor via the serial
bus. The fault diagnostic algorithm executed by the central controller
is summarized in the flow chart shown in Figure 2.
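
The memory partitioning used by the first option can be illustrated
with a small Python sketch. It is only a model under stated
assumptions: the class name, the byte-array representation and the
choice of the lower half for the processor's own code are all
hypothetical, and time sharing between the two programs is not
modelled.

```python
class PartitionedMemory:
    """Memory of the assigned processor, split into two equal halves.

    Hypothetical model: the lower half keeps the processor's own code,
    the upper half receives the faulty processor's critical code.
    """

    def __init__(self, size: int):
        self.half = size // 2
        self.cells = bytearray(size)

    def load_own(self, code: bytes) -> None:
        """Place the assigned processor's own program in the lower half."""
        self._write(0, code)

    def load_critical(self, code: bytes) -> None:
        """Copy the faulty processor's critical code into the upper half."""
        self._write(self.half, code)

    def _write(self, base: int, code: bytes) -> None:
        # Each program must fit inside its own half of the memory.
        if len(code) > self.half:
            raise ValueError("code does not fit in one half of the memory")
        self.cells[base:base + len(code)] = code
```

A transfer that does not fit in one half is rejected rather than
allowed to overwrite the other program, which is one possible policy
and not something the scheme above prescribes.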
`
[Figure 2 flow chart. Recoverable steps: Begin; send message to the
processor pointed to by PINDX; if ACK is received, PINDX = PINDX + 1;
otherwise mark the processor as faulty and send a message to a
processor Pj to perform the tasks of the faulty processor.]

Figure 2: Fault diagnostic algorithm performed by the central
controller unit
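
The supervisory loop of the CCU can be sketched in Python as follows.
This is a minimal illustration, not the authors' implementation: the
class and method names are hypothetical, `send_diagnostic` stands in
for the bus message plus acknowledgment timeout, and the rule for
choosing the substitute (the healthy processor with the fewest
reassigned tasks) is an assumption, since the text only requires
"another processor".

```python
from typing import Callable, Dict, Optional, Set


class CentralController:
    """Illustrative model of the CCU supervisory loop (names hypothetical)."""

    def __init__(self, n_processors: int,
                 send_diagnostic: Callable[[int], bool]):
        # send_diagnostic(i) models "send a diagnostic message to
        # processor i and wait for the predefined interval"; it returns
        # True if an acknowledgment arrived in time, False on timeout.
        self.n = n_processors
        self.send_diagnostic = send_diagnostic
        self.pindx = 0                          # processor index (PINDX)
        self.faulty: Set[int] = set()
        self.assignments: Dict[int, int] = {}   # faulty -> substitute

    def pick_substitute(self) -> Optional[int]:
        # Assumption: prefer the healthy processor that currently
        # carries the fewest reassigned tasks (ties go to the lowest id).
        healthy = [p for p in range(self.n) if p not in self.faulty]
        if not healthy:
            return None
        load = {p: 0 for p in healthy}
        for sub in self.assignments.values():
            if sub in load:
                load[sub] += 1
        return min(healthy, key=lambda p: (load[p], p))

    def step(self) -> None:
        """Poll the processor pointed to by PINDX, then advance PINDX."""
        p = self.pindx
        if p not in self.faulty and not self.send_diagnostic(p):
            self.faulty.add(p)                  # mark processor as faulty
            # Tasks that had earlier been handed to p must move as well.
            orphans = [f for f, s in self.assignments.items() if s == p]
            for f in orphans + [p]:
                sub = self.pick_substitute()
                if sub is None:
                    self.assignments.pop(f, None)
                else:
                    self.assignments[f] = sub
        self.pindx = (self.pindx + 1) % self.n  # wrap to the first node
```

Calling `step()` once per polling period visits every processor in
turn. Re-homing the tasks of an earlier failure when its substitute
also fails is an extension added here for completeness.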
`
In the proposed scheme, it is assumed that each processor is capable
of executing the tasks performed by any other processor. Therefore, no
redundant processors are needed to implement the fault tolerant
scheme. Instead, normally working processors are used to execute the
faulty processor's tasks. This approach is cost effective in the sense
that fault tolerance is achieved by software and limited hardware. A
failure of the CCU would be catastrophic for the operation of the
whole system because a subsequent failure in any processor would not
be detected. To avoid this problem, a single line called the central
controller's active line (CCA) can be used, as proposed in [4]. As
long as the CCU is not faulty, an active high signal is available on
the CCA. Whenever this active high signal goes low, a watchdog timer
becomes active. If the high logic level does not reappear on the CCA
within a defined time period, the watchdog timer interrupts one of the
processors in the system to take over the responsibilities of the CCU.
The assigned processor continues its original assignment on a
time-sharing basis.
III. DEVELOPMENT OF A HIERARCHICAL
DISTRIBUTED PROCESSING SYSTEM (HDPS)

In the single bus based fault tolerant distributed system discussed in
section II, all processors in the system are connected to one global
bus. In this situation, the global bus becomes congested when the data
traffic increases. The global bus then becomes the bottleneck for the
whole system, and its failure will bring the whole system to a halt.
To overcome this problem, a hierarchical distributed processing system
is proposed in this section, in which the concepts of global and local
buses are used. This scheme is based upon the classification of the
automotive system into functional subsystems. Automotive electronic
systems can be divided into functional subsystems according to their
physical locations and functions as follows [5]:

1. Vehicle Drive Control Group (VDCG): This group may include the
   engine control, transmission control, cruise control, suspension
   control, steering control, throttle control, traction control, four
   wheel steering control and knock control.
2. Intelligent and Security Group (ISG): This group may include the
   air bag control, automatic collision avoidance and notifier system,
   ABS, engine immobilizer control and LoJack system.
3. Intelligent Transportation System Group (ITSG): This group provides
   support for the Intelligent Transportation System (ITS). The ITSG
   may include the navigation computer and ITS support control [6].
4. Body Control Group (BCG): This group may include the instrument
   cluster control, trip computer, climate controller, tachometer and
   fuel gauge control.

THE ARCHITECTURAL FEATURES OF HDPS

Based on the classification of the automotive electronic system, a
hierarchical distributed processing system is proposed and discussed
in this section. The proposed system consists of a global bus, a
central controller unit (CCU), a code memory module and N functional
subgroups (G1-GN), as shown in Figure 3. Each subgroup also contains
one special purpose processor called the coordinator processor (CP).
The purpose of the coordinator processor is to provide a communication
facility between processors located in different subgroups. Also, the
coordinator processor in each group can provide the performance
history of all processors inside the corresponding subgroup.

[Figure 3 block labels: Central Controller, Global Bus]

Figure 3: Architecture of proposed hierarchical distributed processing
architecture

A subgroup G consists of a smart sensor group, a smart actuator group,
a number of homogeneous processing elements and a local bus, as shown
in Figure 4. All processors within a subgroup communicate with each
other through the local bus. The processors in different subgroups can
also communicate with each other through the global bus.

[Figure 4 block labels: Local bus]
SSM = Smart sensor module
SAM = Smart actuator module
P = Processor
Figure 4: Architecture of a typical subgroup in the proposed
hierarchical distributed architecture
`
FAULT DIAGNOSTIC ALGORITHM IN THE HDPS

In automobiles, if a processor in a significant subsystem fails, the
whole automotive system may fail. To avoid complete failure of the
system, there should be a fault diagnostic and fault tolerant
algorithm for the system. In this section, a fault diagnostic
algorithm is presented for the proposed HDPS. The central controller
unit (CCU) implements this algorithm, shown in Figure 5. To implement
the fault diagnostic algorithm, the CCU periodically sends diagnostic
messages to all processors in each subgroup. The CCU selects a
subgroup using a variable called GINDX. The variable PINDX points to a
processor in the group pointed to by GINDX. At the beginning of the
algorithm, GINDX is initialized to point to the first group in the
system. The algorithm then tests GINDX against NG, the total number of
groups in the system. If GINDX is greater than or equal to NG, GINDX
is reset to point to the first subgroup. Otherwise, PINDX is
initialized to 0, so that it points to the very first processor in the
subgroup pointed to by GINDX. A diagnostic message is sent to the
processor pointed to by PINDX in the subgroup indicated by GINDX.
Before sending the diagnostic message, the CCU checks whether PINDX
has reached NP, the number of processors in the group pointed to by
GINDX. If so, the CCU resets PINDX to 0, increments GINDX, and
transfers control to the next group. If the CCU has not yet sent
messages to all processors in the subgroup, it sends the diagnostic
message to the processor pointed to by PINDX. The CCU then expects an
acknowledgment message from the processor within a specified interval
of time. If no acknowledgment is received within that interval, the
CCU assumes that the processor is faulty. In that case, the CCU
assigns the tasks of the faulty processor to another processor in the
same subgroup. The assigned processor continues its original
assignment on a time-sharing basis.

[Figure 5 flow chart. Recoverable steps: GINDX = GINDX + 1; initialize
PINDX in the group pointed to by GINDX; increment PINDX in the group
pointed to by GINDX; send message to the processor pointed to by PINDX
in the group pointed to by GINDX; transfer code to assigned processor.]

Figure 5: Fault diagnostic algorithm performed by the control unit in
HDPS

IV. DEVELOPMENT OF A MULTIPLE NETWORK
DISTRIBUTED PROCESSING SYSTEM (MNDPS)
FOR AUTOMOTIVE APPLICATIONS

The global bus in the hierarchical distributed system discussed in
section III may pose the same limitation as in the single bus based
architecture. That is, a failure in the global bus may lead to the
system malfunctioning. To avoid this situation, a multiple network
distributed processing system (MNDPS) is proposed in this section.
Different groups of an automobile system may need different bus
speeds. To accommodate this need, multiple buses can be introduced in
automobiles. These buses are connected with each other through
bridges. Figure 6 shows the proposed multiple network scheme for
automotive applications. Each sub network consists of the necessary
processors, smart sensors and smart actuators. A coordinator processor
(CP) is included in each sub network. Processors within a sub network
communicate with each other through the local bus. For communication
among sub networks, the processors can use coordinator processors and
bridges. Because of the multiple bus characteristics, individual sub
networks can use individual protocols. In this scheme, no global bus
is used. This feature eliminates the global bus as a single point of
failure, which is a weakness of the proposed HDPS. In the proposed
MNDPS, a central controller unit (CCU) is attached to one of the sub
networks. This sub network is called the supervisory sub network
(SSN). The SSN contains a code memory which holds the critical program
code of all processors. The CCU sends diagnostic messages to all
processors in the whole system. If a processor is found to be
malfunctioning in a sub network, the CCU can assign the tasks of the
faulty processor to any other processor in the same sub network. The
assigned processor continues its original assignment on a time-sharing
basis. The critical program code of the faulty processor is
transferred to the assigned processor. The fault diagnostic algorithm
proposed for HDPS can be applied to MNDPS without major changes.
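
The two-level polling loop described for the hierarchical scheme, with
GINDX selecting a subgroup and PINDX selecting a processor inside it,
can be sketched in Python as follows. The function name, the `ack`
callback (standing in for a diagnostic message plus acknowledgment
timeout) and the choice of the first healthy processor in the same
subgroup as the substitute are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Proc = Tuple[int, int]   # (group index, processor index within the group)


def scan_once(group_sizes: List[int],
              ack: Callable[[int, int], bool],
              faulty: Set[Proc]) -> Dict[Proc, Optional[Proc]]:
    """Perform one full pass of the CCU over every subgroup.

    ack(g, p) returns False when processor p of group g does not
    acknowledge the diagnostic message in time. The result maps each
    newly detected faulty processor to its substitute in the same
    subgroup, or to None if that subgroup has no healthy processor left.
    The set `faulty` is updated in place.
    """
    reassigned: Dict[Proc, Optional[Proc]] = {}
    for gindx, size in enumerate(group_sizes):   # GINDX walks the groups
        for pindx in range(size):                # PINDX restarts at 0
            node = (gindx, pindx)
            if node in faulty or ack(gindx, pindx):
                continue
            faulty.add(node)
            # The substitute must sit on the same local bus or sub network.
            healthy = [(gindx, q) for q in range(size)
                       if (gindx, q) not in faulty]
            reassigned[node] = healthy[0] if healthy else None
    return reassigned
```

Running `scan_once` repeatedly with the same `faulty` set reproduces
the periodic polling; processors already marked faulty are skipped on
later passes.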
`
[Figure 6 block labels: Sub network 1, Supervisory sub network,
Sub network n]
SSM = Smart sensor module
SAM = Smart actuator module
CP = Coordinator processor
B = Bridge
Figure 6: Multiple Networks distributed processing system for
automotive applications

V. CONCLUSION

The issue of fault tolerant automotive multiplexing design has not
received enough attention. Because of the cost sensitivity of the auto
industry, economical fault tolerant design approaches are required for
automotive multiplexing architectures. In this paper, attempts have
been made to provide some solutions to the issue of fault tolerance
from the perspective of automotive multiplexing. Various fault
tolerant schemes have been proposed in which fault tolerance is
achieved by software and minimal hardware to meet the cost constraints
of the automotive industry.

VI. REFERENCES

1. Buce D. Emacus, "Aspects and issues of Multiple Networks," SAE
   paper 950293, 1995, pp. 79-94.

2. Manual Alba, "A system Approach to Smart Sensors and Smart Actuator
   Design," SAE paper 880554, 1988, pp. 61-73.

3. John R. Wagner, "Failure mode testing tool for Automotive
   electronic controllers," IEEE Transactions on Vehicular Technology,
   Vol. 43, No. 1, Feb. 1994, pp. 156-163.

4. Cornelius Peter, "Smart Actuator for Body Equipment," SAE paper
   910883, 1991, pp. 237-244.

5. Shuji Mizutani, "Car Electronics," Nippondenso Co. Ltd, 1992.

6. Ronald K. Jurgen, "The electronic Motorist," IEEE Spectrum, March
   1995, pp. 37-48.
`
`