Design for Fault-Tolerance in System ES/9000 Model 900

Lisa Spainhower, Jack Isenberg, Ram Chillarege, Joseph Berding
International Business Machines Corporation
South Road, Poughkeepsie, NY 12602

Abstract

The ES/9000 Model 900 is IBM's high-end fault-tolerant commercial processor. Although high-end commercial processors were traditionally designed to be very reliable, this is the first that is implemented as a fault-tolerant machine. The design combines circuit-level concurrent error detection, fault identification and reconfiguration with system-level techniques when multiple functional resources are available. It provides true graceful degradation during Central Processor or Channel reconfiguration and repair. This paper:

•  Discusses the design point for this processor and the trade-offs involved.

•  Shows the error detection and on-line repair process of a Central Processor, with the work recovered on an alternate Central Processor, transparent to the application.

•  Describes Dynamic Path Selection and the hot-pluggable channels.

•  Illustrates the fault-tolerance techniques used in the Level 1 Cache and the Central Store.

1 Introduction

This paper presents the design for fault-tolerance in the ES/9000 Model 900 high-end commercial processor. The Model 900 is a 6-way tightly coupled multiprocessor with a two-level cache, expanded storage and fiber optic channels. Compared to its predecessor, the 3090 Model 600J, it provides about twice the processing power and the following reductions in components: a 30% reduction in the number of chips, a 40% reduction in the number of Thermal Conduction Modules (TCMs are the second-level multichip packaging vehicle), and a 40% reduction in the number of signal cables.

IBM may have patents or pending patent applications covering subject matter described herein. Licenses under IBM's utility patents are available on reasonable and nondiscriminatory terms and conditions. Inquiries relative to licensing should be directed, in writing, to: IBM Corporation, Director of Contracts and Licensing, Armonk, NY 10504. Trademarks: ES/9000, ES/3090, ESCON, SYSTEM/390, System/370, 3090, MVS/XA, MVS/ESA, S/390.

0731-3071/92 $03.00 © 1992 IEEE

The machine also makes significant advances in processor design, implementing out-of-sequence execution, multiple execution elements, etc. This paper focuses only on the design of the fault-tolerant aspects of the machine; details of the processor organization can be found in [Liptay92].

The design of a fault-tolerant machine in this market segment involves a complex set of trade-offs. These trade-offs involve cost, performance, packaging, maintenance strategy, operating system support and customer requirements. Thus, the choices of techniques to achieve fault-tolerance have to be weighed both from a top-down and a bottom-up view without losing perspective of the final application program or customer view. Since the design team does not start with a clean slate, there is a significant evolutionary process from the earlier generation of machines. The earlier generation of machines has been proven to be very reliable, using extensive error checking and recovery techniques in the processor. Furthermore, the packaging, screening and burn-in of components provide additional leverage toward very large mean times between failures. To turn a high-end machine of this class into a fault-tolerant machine requires a very careful selection of the design point. Clearly, the challenge in this market segment is in designing a fault-tolerant machine that continues to be performance driven and cost competitive.

This paper first discusses the design point. The standard circuit design process in the high end enforces the principles of concurrent error detection and fault isolation. To take this design point and make it fault-tolerant, one needs to enhance it to provide total concurrent error detection and recovery with on-line non-disruptive reconfiguration [Pradhan86]. This is designed by integrating the circuit-level capabilities of error detection with system-level redundancy to provide complete recovery and reconfiguration. The resulting fault-tolerant system provides graceful degradation of processing power until repair, with complete transparency to the existing application base.

Following the discussion on the design point, there are three sections discussing the details of the major subsystems: the Central Processors (CP), the Channels and the Storage Subsystem. In each section we have illustrated some of the key ideas used in this design and how they have been implemented. Clearly, no single paper can exhaustively discuss all the elements that go into building the fault-tolerant system.
The power subsystem and operating system layer are beyond the scope of this paper. This paper should interest system architects, integrators, and researchers in fault-tolerant computing.

2 The Design Point

A critical part of any such design is a clear recognition of customer requirements. In the high end there has always been a very strong emphasis on data integrity. Data integrity implies that all computation is checked and, in the event of an unrecoverable error, the program is aborted or the machine stopped. This is the reason why concurrent error detection is so advanced in most previous IBM high-end processors. Data integrity, however, requires only that an error not be allowed to propagate; an unrecoverable error could result in the job or program being terminated. This may result in a system outage if the job being terminated is a critical software subsystem that failed software-level recovery. In the case of a permanent hardware failure which requires repair, the processor would be check-stopped. In recent years, the requirements have gone beyond data integrity alone. The goal is that no computation be lost, thereby never requiring program termination, and that repair be on-line and non-disruptive, providing continuous availability.

In most commercial environments a slight loss of performance is acceptable during the process of repair. This is particularly true considering that a) the processors are expensive and an expensive idle processor is usually unacceptable, and b) although systems are operated at almost full utilization (around 90%), they are rarely fully utilized. A small shortfall in processing capability can be handled by workload management algorithms in the dispatcher or by shedding some less important workloads. With repair becoming non-disruptive, true graceful degradation becomes the goal in these environments.

The design of a high-end processor is primarily driven by performance and cost trade-offs. For performance reasons, the organization of the Central Processor Complex (CPC) has several identical elements in parallel, such as processors, channels, levels of storage, etc. Thus there is a fundamental capability for high availability from the redundant resources that are deployed for performance. At the same time, each of these elements is designed at the circuit level to have concurrent fault detection, recovery by retry, and fault identification and isolation. The strategy for fault-tolerance is to enhance the existing Error Detection and Fault Isolation (EDFI) techniques and couple them with system-level techniques exploiting the identical elements in the CPC. The resulting design can identify errors and recover transient and intermittent failures by retry. For permanent faults requiring repair, the CPC will isolate the failing element, reconfigure the system around it, recover by rolling back to an error-free consistent state and resume execution. The failed hardware can be replaced without affecting the rest of the system.

In the following sub-sections we describe the error handling and maintenance capability and then describe the design goals for each of the subsystems.

Error Handling and Maintenance

In order to ensure data integrity and provide on-line diagnosis, the circuits in the high end employ concurrent error detection and fault isolation. To achieve the fault-tolerance and continuous-availability goals, it was critical that this capability be enhanced. Thus the design goal was to increase the level of error detection from what was typically in the 90% range to total concurrent error detection. The design point for error detection is to ensure that an error is first identified within the same machine cycle. The reporting and recording may take additional cycles. Every latch is protected by a code and there are no naked latches. All dataflows, arrays, and control busses have ECC or checking, and state machines have parity predict, etc. Expanding to this level of coverage added less than 3% to the number of logic chips. One reason for the small increase in overhead is that many of the hard-to-check logic chips, like sequence logic, are very pin-limited, and thus the logic could be duplicated on the chip for checking without costing more chips. Another reason was that in a heavily pipelined machine at the high end, a lot of the circuits are in parallel, where the checking levels are already very high.
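
As an illustration of the parity-predict idea, the sketch below models a 4-bit incrementing counter whose next-state parity is predicted from the current state alone and compared against the parity of the state actually produced, flagging a mismatch in the same cycle. This is a hypothetical software model for exposition only, not the Model 900 circuit design.

    # Minimal sketch of parity prediction on a simple state machine (a 4-bit
    # incrementing counter).  Hypothetical example; the Model 900 applies the
    # same principle in hardware.

    WIDTH = 4
    MASK = (1 << WIDTH) - 1

    def parity(x: int) -> int:
        """Popcount mod 2 of a register value."""
        return bin(x).count("1") & 1

    def predicted_next_parity(state: int) -> int:
        """Predict the parity of (state + 1) without examining the new state.
        Incrementing flips the trailing run of 1s and, unless the counter
        wraps, the next 0 bit; parity flips when that count is odd."""
        trailing_ones = 0
        while trailing_ones < WIDTH and (state >> trailing_ones) & 1:
            trailing_ones += 1
        flipped = trailing_ones if trailing_ones == WIDTH else trailing_ones + 1
        return parity(state) ^ (flipped & 1)

    def step(state: int, inject_fault: bool = False):
        """Advance one cycle; the checker runs concurrently with the update."""
        next_state = (state + 1) & MASK
        if inject_fault:              # model a single-bit circuit fault
            next_state ^= 0b0100
        error = parity(next_state) != predicted_next_parity(state)
        return next_state, error

    state = 0
    for cycle in range(8):
        state, error = step(state, inject_fault=(cycle == 5))
        print(f"cycle {cycle}: state={state:04b} error_detected={error}")
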
There are several failure modes that this design point addresses; however, for the purpose of understanding the error detection and recovery strategy we discuss a few of the dominant circuit fault models. Broadly, errors in the logic can occur due to solid hard circuit faults, intermittent faults, and transient faults. The solid hard circuit faults, commonly called permanent faults, occur when a digital circuit no longer yields a correct output given a specific set of inputs. Any time the specific input is repeated, the incorrect output is produced. An intermittent fault is an occurrence where a specific event produces an incorrect result, but the same inputs at a different point in time may produce the correct result. This can occur as a result of a design error, marginal circuits or loading conditions. Transients occur when environmental conditions, noise, or cosmic particles cause an incorrect result, but the circuit itself functions correctly.

To provide on-line error correction and repair, the error must not only be identified but also resolved as either transient or permanent. For a transient, the recovery should be guided to the source of the data that had an error; for a permanent, the exact identity of the replaceable part must be determined. As an example, the design ground rules require that error checkers be placed at the driver for any signals that leave the replaceable part, and immediately after the receiver of any signals entering a replaceable part. This allows most failures to be identified to the correct part without further analysis. The on-line EDFI would determine the Field Replaceable Unit (FRU) [Bossen82] to be replaced.
It is necessary to be able to determine whether an error is due to a permanent physical failure requiring repair, or only a transient that can be handled without physical part replacement. In an ES/9000 system it is not always easy to distinguish which type of fault has occurred, but the design handles all of them. The entire logic structure called for the capability to back out and retry all internal operations, with appropriate thresholds for determining success. Since the retry of an operation can involve different paths through the logic than the original error, the system of thresholds is very fine grained and rather complex. For example, if a fault occurs on a directory entry in a cache, the retry of the operation may use a different cache set and, therefore, the threshold must be on the use of the directory address and set, not on the actual operation being performed. This has led to a very extensive threshold system implementation. A detailed description of the error detection techniques can be found in [Bossen92]. Note that a failure due to a hard circuit fault might be recoverable through instruction retry, since the machine is run unoverlapped during the retry, thereby using different circuit combinations. Intermittents, on the other hand, could cause errors several times and then go away. To deal with either type of fault, multiple retries with appropriate thresholds are used to determine whether a unit needs to be taken out for repair or whether the system can continue with the unit containing the failure.
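
The retry-and-threshold mechanism can be pictured with the small sketch below. It is an illustrative software model only; the class, the resource keys and the threshold value are assumptions, not the Model 900 implementation. The essential point it captures is that error counts are charged to the resource exercised (for example, a directory address and set) rather than to the operation, and a unit is called out for repair only when a resource exceeds its threshold.

    from collections import defaultdict

    # Illustrative model of fine-grained retry thresholds; names and the
    # threshold value are assumptions for the sketch.
    RETRY_THRESHOLD = 3

    class RetryManager:
        def __init__(self):
            # Error counts are keyed by the resource exercised (e.g. a cache
            # directory address and set), not by the operation performed.
            self.error_counts = defaultdict(int)

        def execute(self, operation, resource_key):
            """Run an operation, retrying on detected errors.  A resource whose
            cumulative error count exceeds the threshold is flagged for repair."""
            while True:
                ok, result = operation()
                if ok:
                    return result                     # error (if any) was transient
                self.error_counts[resource_key] += 1  # charge the resource, not the op
                if self.error_counts[resource_key] > RETRY_THRESHOLD:
                    raise RuntimeError(
                        f"threshold exceeded on {resource_key}: fence the unit for repair")

    # Example: a directory lookup that fails twice (transient) and then succeeds.
    attempts = iter([(False, None), (False, None), (True, "hit")])
    mgr = RetryManager()
    print(mgr.execute(lambda: next(attempts), resource_key=("L2 directory", 0x1A, 2)))
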
If a failure is determined to be a physical failure requiring repair, the design requires the capability to fence off all external interfaces to the element being repaired so that reconfiguration and maintenance can be performed while the remainder of the system continues in operation. Thus all interfaces have the capability to "fence", or de-gate, the drivers and/or receivers of the interfaces. When a failure is determined to require repair, a service request is initiated. The Processor Controller Element is provided with a Remote Support Facility (RSF) which can report failure information automatically. An auto-dial for service from the RSF will result in the correct routing of the information to the customer service engineer, parts stocking location, and, if further analysis of the failure information is required, to the remote support center.
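
A minimal sketch of the fencing idea follows, using a simple object model invented purely for illustration; the real controls are hardware de-gates on the interface drivers and receivers.

    # Illustrative model of interface fencing; the classes and names are
    # assumptions for the sketch, not the machine's actual controls.

    class Interface:
        """One driver/receiver pair on the boundary of a replaceable element."""
        def __init__(self, name):
            self.name = name
            self.fenced = False

        def fence(self):
            """De-gate the driver and receiver so no signals cross the boundary."""
            self.fenced = True

        def receive(self, signal):
            return None if self.fenced else signal   # fenced: signals are ignored

    class ReplaceableElement:
        def __init__(self, name, interface_names):
            self.name = name
            self.interfaces = [Interface(n) for n in interface_names]

        def fence_all(self):
            """Isolate the element so it can be powered off and repaired while
            the rest of the system keeps running."""
            for iface in self.interfaces:
                iface.fence()

    cp1 = ReplaceableElement("CP1", ["to-SCE0", "to-SCE1"])
    cp1.fence_all()    # invoked automatically once repair is determined necessary
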
Design Goals for Subsystems

The design choices for fault-tolerance pay careful attention to both the physical layout of the machine and the logical boundaries. Figure 1 shows the physical structure of the Model 900. Each Central Processor (CP) is contained on one board in 4 TCMs, which include two 128K-byte store-through Level 1 (L1) Caches. Each CP contains multiple execution elements (each for different types of instructions) and implements out-of-sequence instruction execution. There are two System Control Element (SCE) boards, each with 6 TCMs, and each board has 8 card positions to accept up to 512 MBytes of central storage. The SCE includes several elements: a store-in Level 2 cache (2 MBytes per SCE), the directories and store buffers.

Figure 1: ES9000/900 System Structure

There are two Intercommunications Connection Elements (ICE) per system; each is packaged on a board with 5 TCMs. This element contains an I/O processor, channel control and switching function, and the Expanded Storage controls. The Expanded Storage is packaged on cards, and up to 8 GBytes is available. The system can be configured with up to a total of 256 channels, 96 of which can be parallel channels and the remainder Enterprise Systems Connection Architecture (ESCON) fiber optic channels. To avoid confusion between an individual CP and the package including the TCM boards, storage and channels, the package is called the Central Processor Complex (CPC) or simply the processor.

The CPC hardware can be divided into three logical elements. Figure 2 shows the Central Processors (CPs), Storage Subsystem, and the Channel Subsystem (CSS). A different approach was taken in each logical section depending on element redundancy and the concurrent error detection and reconfiguration capability of the element. We examine the goals of each of these logical elements here and describe the details of the design in the following sections.

•  The Model 900 is a 6-way tightly coupled multiprocessor, which is an inherently redundant design point. The implementation ensures that a program dispatched on any of the CPs will execute architecturally correctly. If a CP takes a catastrophic failure, the work can be moved to another CP and the failing CP fenced out of the processor resource pool and repaired. Degrading from 6 CPs to 5 CPs costs less than 1/6th of the capacity (due to lower MP degradation), which is quite tolerable for most commercial environments; a small numeric illustration appears after this list. This graceful degradation, together with non-disruptive repair, provides a very cost-effective solution in a high-end CPC. Additional steps taken in the internal design of the CP have the additional benefit of greatly reducing the likelihood of a CP taking a catastrophic failure. The operating system's primary function is to schedule work to the CPs using algorithms to best manage the workload. Achieving good performance for the important jobs even in a degraded configuration is a natural fallout of the operating system scheduler.

Figure 2: ES9000/900 Logical Structure (CPs: CP TCMs, SCE TCMs; Storage Subsystems: SCE TCMs, ICE TCMs, Central and Expanded Storage cards; Channel Subsystem: ICE TCM, Channel cards; Support Subsystems: power, cooling, Processor Controller)

•  The Channel Subsystem (CSS) is the means through which the processor attaches to DASD and tape storage devices, display controllers, switches, and communication networks. It is made up of TCM data path and control logic and up to 256 card-on-board channels. The CSS has considerable inherent redundancy. Redundant paths are used in such a way that if one is busy, an alternate is used, thus increasing performance rather than waiting for the busy one to be freed. Performance and availability requirements determine the number of redundant paths to any particular I/O device. These are generally sufficient to tolerate the loss of one path with no noticeable performance degradation. If a channel should fail, the failing channel can be varied out of the active configuration and concurrently repaired. Channel packaging of one parallel channel per card or two ESCON channels per card facilitates the concurrent repair. ESCON [Elliot92] architecture further facilitates the ease with which redundant paths can be configured so that an alternate path can be used to address an I/O device. The Channel Subsystem TCM provides configuration boundaries for fault tolerance in the data funnel from the individual channel to Central Storage.

•  The Storage Subsystem is responsible for the handling of system data. It is made up of Central Storage, Expanded Storage, and the L2 Cache. This data is usually accessed synchronously with the CP's operation. The storage subsystem controls manage the data through the use of Least Recently Used (LRU) algorithms. The storage system is not duplicated, since this has adverse performance effects and the cost implications are prohibitive. However, several techniques are applicable to make storage fault-tolerant: ECC, scrubbing, array reconfiguration and sparing. Array reconfiguration is the ability to delete a section of storage and move the data for the target address to another area in the storage array. For the central and expanded storage arrays, spare memory array chips are added for automatic replacement of a failing memory chip. Fault avoidance has also been used most extensively in this element. An entire level of packaging was removed, compared to the Model 600J, by integrating the Central Storage cards into the System Control Element.

•  Support subsystems, which are not critical to the performance or cost/performance characteristics of the Model 900, widely use explicit redundancy. Each major physical element (e.g. SCE, CP, ICE) has its own power boundary. Within that boundary there are power supplies for each required voltage level. For each voltage level within the power boundary there is one more supply than needed to meet the power requirements. All N+1 supplies are normally active. When one fails, the remaining N automatically re-adjust their output to meet the needs of the load. Likewise, subsequent to a repair, all N+1 automatically re-adjust their output to meet the needs which were being provided by N supplies. Details of this can be found in [Covi92]. The air cooling subsystem fans and blowers are similarly designed. Pumps are redundant in the water cooling subsystem and are periodically switched to detect any latent faults in the inactive pump. The Processor Controller Element (PCE), which provides system operations and service interfaces as well as participating in recovery and reconfiguration, is duplexed. One side is active, and the backup continuously runs diagnostic programs and communicates with the active via "I am well" signals. These explicitly redundant subsystems (power supplies, fans, blowers, pumps, PCE) are all concurrently maintainable.
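
The graceful-degradation argument made for the CPs above can be illustrated with a small calculation. The multiprocessing-efficiency model below is purely hypothetical (the 2% per-CP interference factor is an assumption, not a measured Model 900 characteristic); it simply shows why removing one of six CPs costs less than one sixth of the capacity.

    # Hypothetical illustration of why a 6-way to 5-way degradation costs
    # less than 1/6 of system capacity.  The interference factor is assumed.
    INTERFERENCE_PER_EXTRA_CP = 0.02

    def capacity(n_cps: int) -> float:
        """Relative MP capacity: each CP delivers slightly less than 1.0
        because of multiprocessing interference."""
        return n_cps * (1.0 - INTERFERENCE_PER_EXTRA_CP * (n_cps - 1))

    full = capacity(6)        # 6 * 0.90 = 5.40
    degraded = capacity(5)    # 5 * 0.92 = 4.60
    loss = (full - degraded) / full
    print(f"capacity lost while one of six CPs is out for repair: {loss:.1%}")
    # prints about 14.8%, versus the 16.7% a naive 1/6 estimate would suggest
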
The following sections describe the design of the three subsystems introduced above: the CPs, the Channel Subsystem, and the Storage Subsystem.

3 The Central Processor

All System/390 Central Processor Complexes with more than one Central Processor have the ability to dynamically vary any CP on-line and off-line and, as long as one of the N CPs is on-line, continue instruction execution. With the exclusion of special operations for which hardware may not be provided on all CPs (i.e., the Vector Facility and the Integrated Cryptographic Facility), each CP in a multiprocessing environment is capable of executing any task dispatched by the operating system, and it is unpredictable where any particular task will be executed.

Figure 3: Instruction Retry and Recovery

A major goal of the Model 900 design was to exploit this capability and continue uninterrupted operation in the event that one CP becomes unavailable. Accomplishing this goal required the following designs:

1. The processing state in the failed CP must be rolled back to a consistent, error-free state.

2. The state of the storage system as seen by all other CPs must be architecturally accurate and error free.

3. The state of the hardware capability of the failed CP (transient failure vs. permanent failure) must be determined.

4. The failed CP must be removed from the configuration before it propagates any corrupted data.

5. The work that was running on the failed CP must be transferred to an operational processor.

6. The failed CP must be repaired and returned to service without a system interruption.

The implementation of this is best illustrated by an example involving instruction retry and recovery. Figure 3 illustrates the pipeline and four instructions that are issued. Note that this machine implements out-of-sequence execution, and this example also illustrates the leveraging of a performance technique to boost fault-tolerance at the circuit level. Out-of-sequence execution allows multiple instructions to be issued without requiring that they finish in the same order that they are issued, so long as they complete in the same order. There is a fine distinction between finish and complete: finish is when the instruction finishes execution, and complete is when the results of the instruction are put away into storage. Essentially, once an instruction completes it cannot be backed out, since its results are posted in storage and architecturally the instruction has executed. Prior to completing, an instruction can be backed out provided all other pending stores from instructions issued after it are also cancelled and the storage is left in a consistent state.

Assume a failure occurs at time T2 as shown in Figure 3. The clocks are immediately frozen at time T2. Note that the last completed instruction was instruction number 1, which completed at time T1. Although instructions 3 and 4 had both finished (they had their results calculated and put into temporary facilities awaiting completion), they cannot be flagged as completed since instruction number 2 has not completed. Since instruction execution results are maintained in temporary facilities until the instruction is determined to be complete, it is possible to back out of all intermediate pipeline operations (including the finished instructions) by flushing the pipeline and temporary facilities. This leaves the processing state of the program in execution at an error-free state, that is, at a point in time corresponding to the completion of instruction number 1. Since there is assumed to be a failure in the hardware of the CP, a physically separate Processor Controller examines the pipeline contents, architected facilities, and temporary facilities, and puts them back into a consistent state. The Processor Controller obtains the CP state by scanning out the internal facilities using totally separate scan clocks so that the logic clocks can remain stopped.

Next, the storage system must be ensured to be consistent. The Model 900 is particularly well adapted to this because of its storage hierarchy design. One of the advantages of the store-through L1 Data Cache is that it provides isolation of internal Central Processor faults. Because the shared L2 Cache contains any critical system data resulting from CP instruction execution, the L1 Cache copy is not critical and can be discarded upon detection of an error.
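
The following sketch shows why the store-through L1 makes this discard safe; the class names are assumptions for illustration only. Every store is propagated to the shared L2 as it completes, so nothing architecturally committed lives solely in the L1.

    # Illustrative model of a store-through L1 in front of a store-in L2.
    # Names and structure are assumptions for the sketch.

    class StoreInL2:
        def __init__(self):
            self.lines = {}
        def store(self, addr, value):
            self.lines[addr] = value
        def fetch(self, addr):
            return self.lines.get(addr)

    class StoreThroughL1:
        def __init__(self, l2):
            self.lines = {}              # private, disposable copy
            self.l2 = l2
        def store(self, addr, value):
            self.lines[addr] = value
            self.l2.store(addr, value)   # store-through: L2 always has the data
        def purge(self):
            """On a CP error the L1 copy is simply discarded; nothing is lost."""
            self.lines.clear()

    l2 = StoreInL2()
    l1 = StoreThroughL1(l2)
    l1.store(0x100, "completed result")
    l1.purge()                           # CP error detected: discard the L1
    assert l2.fetch(0x100) == "completed result"   # system data is intact in L2
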
Determining whether the failure detected was a transient error or a true permanent circuit failure is done by retry. The Model 900 CP register management [Liptay90], the main function of which is to control out-of-sequence instruction execution, provides an audit trail of changes effected by incomplete instructions, which is used by instruction retry for checkpoint restart. In order to do retry, the Processor Controller sets the internal state of the logic in the CP to the state it would be in at the completion of instruction number 1. The logic clocks are then stepped to see if the following instructions execute successfully with no errors. If they do, the error was a transient, and normal operation follows. If retry is unsuccessful after a thresholded number of retries, the failure is determined to be permanent and the work must be moved. At this point the failed processor is put into the CP check-stopped state.
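
The sequence just described, rolling back to the last completed instruction, retrying, and check-stopping if a threshold is exceeded, is summarized by the sketch below. It is a deliberately simplified software model (the class, the retry limit and the in-order completion loop are assumptions for exposition); the real mechanism is implemented by the hardware and the Processor Controller.

    # Simplified model of checkpoint/rollback instruction retry.  Names,
    # the retry limit and the structure are assumptions for the sketch.
    RETRY_LIMIT = 2

    class CP:
        def __init__(self, program):
            self.program = program      # instruction callables; may raise on error
            self.completed = 0          # index of the next instruction to complete
            self.temporaries = {}       # finished-but-not-completed results
            self.storage = {}           # architected state; cannot be backed out

        def run(self):
            retries = 0
            while self.completed < len(self.program):
                try:
                    self._execute_from(self.completed)
                    retries = 0
                except RuntimeError:
                    # Error detected: flush the temporaries and roll back to the
                    # last completed instruction (the consistent checkpoint).
                    self.temporaries.clear()
                    retries += 1
                    if retries > RETRY_LIMIT:
                        raise SystemExit("permanent fault: CP check-stopped; "
                                         "work is moved to another CP")

        def _execute_from(self, index):
            pending = list(range(index, len(self.program)))
            # Finish may happen out of order (modeled here by reversing), but
            # completion, i.e. posting results to storage, is strictly in order.
            for i in reversed(pending):
                self.temporaries[i] = self.program[i]()
            for i in pending:
                self.storage[i] = self.temporaries.pop(i)
                self.completed = i + 1

    # Example: the second instruction hits one transient error and is retried.
    faults = {"remaining": 1}
    def flaky():
        if faults["remaining"]:
            faults["remaining"] -= 1
            raise RuntimeError("error detected")
        return "ok"

    cp = CP([lambda: 1, flaky, lambda: 3])
    cp.run()
    print(cp.storage)    # {0: 1, 1: 'ok', 2: 3} after one transparent retry
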
CP Checkstop [ESA/390] indicates severe CP damage for which processing on that CP must be terminated. In a uniprocessor, CP Checkstop is the equivalent of System Checkstop, and all instruction processing ceases. In the past, when there was more than one CP and the Processor Availability Facility (PAF) was not available, the operating system would initiate Alternate CP Recovery (ACR).

Figure 4: Concurrent CP Repair after Processor Availability Facility

ACR logically removes the failing CP from the system and attempts to transfer work that was in progress on the failing CP to an operative CP [MVS/XA]. The Model 900 introduces the Processor Availability Facility, whereby the failing CP can be physically isolated for repair while the work in progress is completely recovered and run on the N-1 CPs.

One of the Model 900 objectives was to provide concurrent maintenance for redundant subsystems, whether inherent or explicit. Figure 4 shows the procedure for concurrent repair of a CP. The failed CP needs to be both logically and physically isolated from the operational components of the Processor Complex. Physical isolation is provided by a power boundary. Each CP has its own independent power supplies, which are not shared with any other hardware and which may be powered up and down independently of the power status of other power boundaries. Logical isolation is provided by fencing, an internal mechanism by which any physically connected on-line logical unit disables the receipt of signals from the fenced CP. Fencing is invoked automatically by the hardware. Also automatically invoked through the PCE is an auto-call for repair. The auto-call will result in the customer service engineer arranging a time convenient to the customer for CP concurrent maintenance. At that time, the customer service engineer performs on-line maintenance procedures, beginning with powering down the CP boundary, followed by replacing the failed part, powering the boundary back on, and, with operations assistance, varying the CP back into the configuration. Again, throughout this repair process, N-1 CPs continue uninterrupted instruction processing, and upon its completion all CPs are online, executing tasks.

Processor Availability Facility

The Processor Availability Facility (PAF), designed for the Model 900, presents the check-stopped condition, along with the details of the error status and the complete status of the process in execution, to the operating system. Upon examination of the error status, the operating system stores the task in the normal dispatch queue to allow resumption of execution on another processor according to normal dispatch priorities [Daly91].

In the past, when retry was not successful, ACR would ABnormally END (ABEND) the task and would initiate software recovery. Depending on the robustness of the recovery and the criticality of the task that was ABENDed, the system may or may not recover. In general, for control program tasks, where the recovery is quite robust, recovery has been successful. If the task was an application, termination generally would not cause the system to fail. However, for many subsystem and other critical component tasks, it was quite common for the system to fail. With a successful PAF operation, the task is not ABENDed and no other software recovery mechanisms need be invoked. PAF provides online reconfiguration in addition to the concurrent error detection and fault identification, making this a fault-tolerant processor. During and subsequent to the PAF operation, N-1 CPs continue to execute instructions normally, no tasks are lost, and no intervention is required. The normal priority of dispatching ensures optimum usage of the system while it is running with one less CP.
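
From the operating system's point of view, PAF recovery amounts to putting the saved task back on the ordinary dispatch queue, as in the sketch below. The classes and fields are illustrative assumptions, not the MVS dispatcher or PAF interfaces.

    from collections import deque

    # Illustrative sketch of the dispatch-queue view of PAF recovery.
    class Dispatcher:
        def __init__(self, n_cps):
            self.online_cps = set(range(n_cps))
            self.ready_queue = deque()

        def cp_checkstop(self, cp_id, saved_task_state):
            """PAF presents the check-stop and the complete task state; the task
            goes back on the normal dispatch queue rather than being ABENDed."""
            self.online_cps.discard(cp_id)             # the failed CP is fenced out
            self.ready_queue.appendleft(saved_task_state)

        def dispatch(self):
            """Any remaining CP picks the task up at its normal priority."""
            if self.ready_queue and self.online_cps:
                return next(iter(self.online_cps)), self.ready_queue.popleft()
            return None

    d = Dispatcher(n_cps=6)
    d.cp_checkstop(cp_id=2, saved_task_state={"psw": 0x1000, "regs": [0] * 16})
    print(d.dispatch())    # the task resumes on one of the five remaining CPs
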
4 The Channel Subsystem

The Model 900 Channel Subsystem comprises TCM control and storage path logic and card-on-board channels. A maximum of 256 channels can be attached. All 256 may be ESCON, or up to 96 can be parallel. ESCON, a new I/O architecture, is standard on the Model 900. It provides a high-speed, long-distance fiber optic serial channel. A direct channel-to-I/O connection is point-to-point. ESCON also provides for channels to attach to a star-distributed dynamic switch, the ESCON Director. The Director allows dynamic switching between any combination of up to 60 channels and control units. This enhanced connectivity facilitates providing redundant channels to I/O device paths.

There are two ESCON channels per card or one parallel channel per card. Depending on the mix of ESCON and parallel channels, a Model 900 may have a maximum of 128 to 176 cards in the Channel Subsystem. Asymmetry is permitted, but if symmetrical, 64 to 88 cards are in one power boundary and the other 64 to 88 in a separate power boundary.
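
The card counts quoted above follow directly from the stated packaging rules; a small check of the arithmetic (using only figures given in the text) is shown below.

    # Check of the 128-to-176 card range: 2 ESCON channels per card, 1 parallel
    # channel per card, 256 channels total, at most 96 of them parallel.
    def cards(parallel_channels: int, total_channels: int = 256) -> int:
        escon_channels = total_channels - parallel_channels
        return parallel_channels + escon_channels // 2

    print(cards(0))     # all ESCON:        256 / 2        = 128 cards
    print(cards(96))    # maximum parallel: 96 + 160 / 2   = 176 cards
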
Dynamic Path Selection

For performance reasons, the 370/XA architecture was developed to relieve the operating system of the task of managing I/O activity. Channel redundancy has long been a requirement of high-end processor complexes. In order to achieve throughput, more than one channel must be capable of accessing the same I/O device.

Figure 5: Dynamic Path Selection (Channel Subsystem; 3990 Storage Controller; selection of reconnect path)

This is so that I/O operations are not delayed when the channel is busy with a different I/O operation. When a busy condition is encountered, the Channel Subsystem can choose another route to the device. The status of physical connections available, the choice of physical connection to be used, and the routing of operations are all controlled by the Channel Subsystem. Figure 5 shows that an I/O request that may be initiated along one physical path can complete along a different path determined by the Channel Subsystem or Storage Controller. It executes the architecture which defines a connection between an I/O device and the operating system independent o
