POWER4 Systems: Design for Reliability

Douglas Bossen, Joel Tendler, Kevin Reick

IBM Server Group, Austin, TX

AMD EX1024
U.S. Patent No. 6,895,519

0001

POWER4 Microprocessor

• 2-way SMP system on a chip
    - > 1 GHz processor frequency
• Storage Hierarchy
    - L1: 32 KB Data, 64 KB Instruction per processor
    - L2: ~1.5 MB per chip
    - L3: 32 MB per chip
• Chip interconnect:
    - New Distributed Switch design
    - Buses operate at _ processor speed
• Technology:
    - 0.18 µm lithography
        - Copper, SOI
    - 174 million transistors

[Figure: POWER4 chip block diagram. Labels: two >1 GHz cores, Shared L2, L3 Dir, chip-to-chip communication, GX Bus, L3, Mem.]

System Building Block

• 4-chip, 8-way SMP module
• New Distributed Switch design
• Hybrid switch and bus configuration
• Enables aggressive cache-to-cache transfers
• Buses extended in multi-module configurations

[Figure: four-chip module. Each of the four chips shows two cores (C C), L2, IOCC, a GX Bus, L3 Ctl/Dir, L3, and Memory. Buses are labeled "To other modules" and "From other modules".]

32-way System Logical Structure

[Figure: logical structure of the 32-way system. Labels: 16 GX Buses, 16 L3 caches, and 16 Memory blocks, drawn in two groups of eight.]

32-way Components

• Hardware
    - 4 multi-chip modules with 16 POWER4 chips
    - 16 L3 dual chip modules with 32 16 MB EDRAM chips
    - Up to 16 memory controllers
    - Thousands of DRAM chips to support up to 256 GB RAM
    - GX to Remote I/O Bridge Chips
    - PCI Host Bridge Chips
    - PCI-PCI Bridge Chips
    - Redundant Power Supplies and Air Moving Facilities
    - I/O devices
• Code
    - Service Processor Code
    - System Firmware Code
    - Operating System Code

System comprised of hardware & software components interacting and affecting system RAS
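
The hardware counts on this slide can be cross-checked against the per-chip figures given earlier in the deck. The sketch below is only a consistency check; the two-cores-per-chip figure is taken from the "2-way SMP system on a chip" bullet, and the variable names are illustrative.

```python
# Counts quoted on the "32-way Components" slide
MODULES = 4            # multi-chip modules
POWER4_CHIPS = 16      # POWER4 chips in total
EDRAM_CHIPS = 32       # L3 EDRAM chips in total
EDRAM_MB = 16          # capacity of each EDRAM chip, in MB

# From the "POWER4 Microprocessor" slide: 2-way SMP system on a chip
CORES_PER_CHIP = 2

cores = POWER4_CHIPS * CORES_PER_CHIP            # 32, the "32-way" system
chips_per_module = POWER4_CHIPS // MODULES       # 4, the "4-chip" module
total_l3_mb = EDRAM_CHIPS * EDRAM_MB             # 512 MB of L3 in the system
l3_mb_per_chip = total_l3_mb // POWER4_CHIPS     # 32, "L3: 32 MB per chip"

print(cores, chips_per_module, total_l3_mb, l3_mb_per_chip)
```

The derived values agree with the 32-way, 4-chip module, and 32 MB L3 per chip figures stated on the other slides.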

Chip Design Driven by System Requirements

Fault Avoidance
• Component minimization
• Intrinsic SER Mitigation

Recovery
• Local chip level: ECC, refresh
• Chip interactions: bus/command retry
• System error handling: UE, PCI bus retry

Diagnosis and Reconfiguration
• Run time / IPL diagnostics
• Local chip level: spare bits, lines, elements
• System level: CPU, cache, memory sparing + degraded modes of operation

Repair Policy
• Minimize system down time
• Concurrent and deferred maintenance

Fault Avoidance: SER Mitigation With ECC

Chip Example

Failure Category                                   | Failure Rate (Relative) | Source              | FITS        | MTBF
Intrinsic                                          | 1                       | Vendor DB           | 10-100      | > 1140 yrs
Intermittent, Marginal, Pattern & Stress Sensitive | 2-3                     | Empirical Field Avg | 20-300      | > 380 yrs
Aggregate Chip SER                                 | 150-2000                | Vendor Appl Notes   | 1500-140000 | 0.8-76 yrs

POWER4 Chip Set Design Rule:

Implement ECC or equivalent recovery on all arrays sufficient to keep residual SER at or below each chip’s IFR
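
The MTBF column in the table follows from the FITS column by arithmetic. The sketch below assumes the usual definition of 1 FIT as one failure per 10^9 device-hours and a year of 8760 hours; the function name is illustrative.

```python
FIT_HOURS = 1e9          # 1 FIT = 1 failure per 10^9 device-hours
HOURS_PER_YEAR = 8760    # 365 days * 24 hours; leap years ignored


def mtbf_years(fits: float) -> float:
    """Mean time between failures, in years, for a failure rate in FITs."""
    if fits <= 0:
        raise ValueError("fits must be positive")
    return FIT_HOURS / fits / HOURS_PER_YEAR


# FIT ranges copied from the table (low, high)
fit_ranges = {
    "Intrinsic": (10, 100),
    "Intermittent": (20, 300),
    "Aggregate chip SER": (1500, 140000),
}

for name, (low, high) in fit_ranges.items():
    # The highest FIT rate gives the shortest MTBF, and vice versa.
    print(f"{name}: {mtbf_years(high):.1f} to {mtbf_years(low):.1f} years")
```

At 100 FITs the result is about 1141 years and at 300 FITs about 380 years, matching the "> 1140 yrs" and "> 380 yrs" bounds; 1500 to 140000 FITs gives roughly 76 down to 0.8 years.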

Recovery: ECC, Retry, UE Handling

ECC or hardware refresh: Memory, Caches, ERAT, TLB
Retry: Module, GX & PCI buses, and I/O link
UE handling: ECC protected arrays, buses using retry

• Hamming SEC-DED augmented for special uncorrectable error (UE) handling
• Mainstore ECC designed for chip kill + redundant bit steering
• Masks intermittent errors
• Supports UE handling
• Increases unmasked MTBF from 4 months to > 20 years

• Marking, moving UE data
• AIX to support process terminate and software partition reboot
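
To make the SEC-DED behavior concrete, here is a minimal sketch of a single-error-correct, double-error-detect code, using a toy extended Hamming (8,4) layout. The word width, bit layout, and function names are assumptions for illustration; the slide does not give the actual POWER4 code, and the "special UE" augmentation is not modeled.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits as an 8-bit extended Hamming codeword.

    Index 0 holds overall parity; indices 1..7 form a Hamming(7,4) word
    with check bits at positions 1, 2, and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    w = [0] * 8
    w[3], w[5], w[6], w[7] = d
    w[1] = w[3] ^ w[5] ^ w[7]
    w[2] = w[3] ^ w[6] ^ w[7]
    w[4] = w[5] ^ w[6] ^ w[7]
    w[0] = sum(w[1:]) % 2
    return w


def decode(word: List[int]) -> Tuple[str, Optional[int]]:
    """Return ("ok" | "corrected" | "ue", data or None)."""
    w = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if w[pos]:
            syndrome ^= pos          # XOR of set positions is 0 for a valid word
    overall = sum(w) % 2             # 0 when the overall parity still holds
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single-bit error and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        w[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return "ue", None
    data = w[3] | (w[5] << 1) | (w[6] << 2) | (w[7] << 3)
    return status, data
```

A single flipped bit anywhere in the codeword is repaired silently, which is how ECC masks intermittent errors; two flipped bits are reported as an uncorrectable error (UE) rather than being miscorrected.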

Recovery: Main Store ECC and Extensions

• Memory scrubbing corrects soft single bit errors in background while memory is idle, preventing multiple bit errors
• Bit scattering allows normal single bit ECC error processing to function even with a chip kill failure by scattering memory chip bits across separate ECC words
• Bit steering dynamically reassigns memory I/O if error threshold is reached on same bit

[Figure: row of memory chips, two marked XXXX, followed by a spare memory chip.]
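
The effect of bit scattering can be illustrated with a small model. The sketch below compares a naive layout, where one chip's bits land in the same ECC word, with a scattered layout, where each bit of a chip goes to a different ECC word. The chip count, chip width, and word size are hypothetical values chosen for the example, not POWER4 parameters.

```python
from typing import Callable, Dict, Tuple

CHIPS = 18           # hypothetical number of DRAM chips read per access
BITS_PER_CHIP = 4    # hypothetical x4 DRAM parts
WORD_BITS = CHIPS    # 72 bits per access, split into 4 ECC words of 18 bits

Mapping = Callable[[int, int], Tuple[int, int]]


def packed(chip: int, bit: int) -> Tuple[int, int]:
    """Naive layout: consecutive chip bits fill one ECC word, then the next."""
    flat = chip * BITS_PER_CHIP + bit
    return flat // WORD_BITS, flat % WORD_BITS


def scattered(chip: int, bit: int) -> Tuple[int, int]:
    """Scattered layout: bit i of every chip is placed in ECC word i."""
    return bit, chip


def worst_case_bad_bits(mapping: Mapping, dead_chip: int) -> int:
    """Largest number of bad bits any one ECC word sees if dead_chip fails."""
    per_word: Dict[int, int] = {}
    for bit in range(BITS_PER_CHIP):
        word, _ = mapping(dead_chip, bit)
        per_word[word] = per_word.get(word, 0) + 1
    return max(per_word.values())


print(worst_case_bad_bits(packed, 0))     # all four bad bits hit one ECC word
print(worst_case_bad_bits(scattered, 0))  # one bad bit per ECC word
```

With the scattered layout a whole-chip failure contributes at most one bad bit to any ECC word, so ordinary single-bit correction still recovers the data.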

Recovery: Additional Logic to Avoid Checkstops

[Figure: L1 Data Cache, L2 Cache, L3 Cache, and Main Memory, with four ECC labels. Fault isolation registers: DL1 FIR, L2 FIR, L3 FIR, MS FIR. Status labels: OK, UE, SUE, SUE. Other labels: Service Processor, FRU Callout, Synch Machine Check.]

Recovery: More Logic to Avoid Checkstops

On PCI Error Detect:

• Slot freeze on error
• Return all 1’s to Device Driver
• Driver calls firmware, reset slot
• Driver recovery, retry operation
• Threshold, fault isolation logged

[Figure: I/O path from the POWER4 chip (two cores, L2, IOCC, L3 Ctl/Dir, L3, Memory) over the GX Bus to the Remote I/O (RIO) Bridge Chip, then over the Remote I/O Bus to the PCI Host Bridge (PHB) Chip, two PCI-PCI Bridge Chips, and a PCI Adapter; one PCI bus is marked XXXX. AIX Dev Driver and System Firmware are also shown.]
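
The PCI error steps listed on this slide can be sketched as a small simulation. Everything here is a hypothetical model: the `Slot` class, `firmware_reset`, `driver_read`, and the threshold value are illustrative names and numbers, not AIX or firmware interfaces.

```python
from typing import List, Optional

ALL_ONES = 0xFFFFFFFF  # value a frozen slot returns to the device driver


class Slot:
    """Toy PCI slot that freezes itself when an (injected) error occurs."""

    def __init__(self, faults_pending: int = 0) -> None:
        self.frozen = False
        self.faults_pending = faults_pending  # errors injected for the demo
        self.log: List[str] = []

    def read(self, value: int) -> int:
        if not self.frozen and self.faults_pending > 0:
            self.faults_pending -= 1
            self.frozen = True                 # slot freeze on error
            self.log.append("error detected, slot frozen")
        return ALL_ONES if self.frozen else value


def firmware_reset(slot: Slot) -> None:
    """Stand-in for the firmware call that resets and unfreezes the slot."""
    slot.frozen = False
    slot.log.append("slot reset")


def driver_read(slot: Slot, value: int, threshold: int = 3) -> Optional[int]:
    """Read with recovery: reset and retry until the error threshold is hit."""
    errors = 0
    while True:
        data = slot.read(value)
        if not slot.frozen:
            return data                        # normal completion
        errors += 1
        if errors >= threshold:
            slot.log.append("threshold reached, fault isolation logged")
            return None                        # give up; slot stays frozen
        firmware_reset(slot)                   # driver calls firmware, then retries
```

A slot with one transient fault recovers on the first retry, while a slot that keeps failing is left frozen once the threshold is reached, so the error never escalates to a system checkstop.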

Diagnosis & Reconfiguration: IPL RAS Design

POWER4 Chip
    - Built-in Self Test (BIST)
    - Chips / single core can be deconfigured
    - Spare bits switched in for single cell failures in L1, L2 caches, and in L2, L3 directories

L3
    - Line delete capability maps out bad bits
    - L3 cache bypassed for more serious failures

Main Memory
    - ECC and redundant bit steering
    - Entire memory card deconfigured for serious failure

I/O
    - Redundant Remote I/O link takes over in case of failure
    - I/O drawer deconfigured during boot if failure detected to allow IPL to proceed
    - PCI adapter marked unavailable if error detected and can be replaced via Hot Swap later

Diagnosis & Reconfiguration: Run Time RAS Design

POWER4 Chip
    - Internal checkers, parity, ECC, UE & Special UE handling
    - FIR error capture, Who’s on First logic
    - Spare bits / quiesce mode

L3
    - Internal checkers, parity, ECC, UE & Special UE handling
    - FIR error capture, Who’s on First logic
    - Cache line delete
    - Bypass on next boot following UE

Main Memory
    - ECC, address checks, UE & Special UE handling
    - FIR error capture, Who’s on First logic
    - Memory scrubbing, chip kill, bit steering
    - Card deconfigure on next boot following UE

I/O
    - GX bus error checkers, UE & Special UE handling
    - FIR error capture, Who’s on First logic
    - Remote I/O link hardware failover
    - PCI device detect, recover, isolate, deconfigure

Diagnosis & Reconfiguration: Fault Isolation Registers

• FIR design guidelines:
    - Capture error at source
    - Identify component showing error symptoms
    - Leads to:
        FIR bits can be OR’d but carefully

[Figure: two Components connected by a Bus, with a FIR; the pattern repeats (• • •).]

Diagnosis & Reconfiguration: Who’s on First Logic

• Requirement:
    - Separate cause from effect to isolate faulty component from components propagating fault

• Solution:
    - Each FIR starts a timer when it detects an error condition
    - FIRs freeze timer when checkstop finally occurs
    - FIR measuring longest elapsed time identifies causing component

• Used for:
    - FRU callout
    - Reconfigure system
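
The timer scheme described on this slide can be sketched in a few lines. This is a software model under assumed names (`FIR`, `detect_error`, `freeze`, `first_failure`) and abstract time units; the real logic is implemented in hardware.

```python
from typing import List, Optional


class FIR:
    """Toy fault isolation register with a Who's on First timer."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.started_at: Optional[int] = None
        self.elapsed: Optional[int] = None

    def detect_error(self, now: int) -> None:
        # Start the timer on the first error only; later errors do not restart it.
        if self.started_at is None:
            self.started_at = now

    def freeze(self, now: int) -> None:
        # Freeze the timer when the checkstop occurs.
        if self.started_at is not None and self.elapsed is None:
            self.elapsed = now - self.started_at


def first_failure(firs: List[FIR], checkstop_time: int) -> Optional[str]:
    """Name the component whose FIR has been counting the longest."""
    for fir in firs:
        fir.freeze(checkstop_time)
    flagged = [fir for fir in firs if fir.elapsed is not None]
    if not flagged:
        return None
    return max(flagged, key=lambda fir: fir.elapsed).name


# Example: an L3 fault propagates to L2 and then to the core before checkstop.
l3, l2, core, io = FIR("L3"), FIR("L2"), FIR("core"), FIR("I/O")
l3.detect_error(100)
l2.detect_error(105)
core.detect_error(110)
print(first_failure([core, l2, l3, io], checkstop_time=120))  # L3
```

The FIR that saw the error earliest accumulates the longest elapsed time, so it is named as the cause even though the other FIRs also recorded errors.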

Diagnosis & Reconfiguration: Role of Service Processor

• Controls First Failure Data Capture
    - FIR and Who’s on First logic enable capability

• Responsible for reconfiguring system
    - Use spares for dynamic reconfiguration where possible
    - Deconfigure failing part and bypass otherwise
    - FRU callout

• Allows for component de-allocation before hard failures occur in conjunction with Operating System
    - Processor
    - Chip
    - L3 line deletes
    - L3 and memory (on next IPL)
    - PCI adaptor and I/O devices

Repair Policy: Maximize System Availability

POWER4 Chip
    - Internal array spare bits eliminate single bit error causes allowing continued operation without repair
    - Core / chip deconfiguration support deferred repair for other errors

L3
    - Cache line delete eliminates single bit error causes allowing continued operation without repair
    - L3 bypass supports deferred repair for other problems

Main Memory
    - Bit steering eliminates single bit error causes allowing continued operation without repair
    - Chip kill allows non-degraded operation and deferred repair for other DRAM failures
    - Card deconfiguration allows deferred repair for card logic failures

I/O
    - Remote I/O link hardware failover allows continued operation
    - PCI device hot plug for concurrent repair

Power and Cooling
    - Redundancy and hot plug to allow for concurrent repair

POWER4 RAS Design

• Focus on maximizing system operation
    - Avoid faults through masking and recovery
    - Dynamically bypass faulty componentry
    - Allow for concurrent repair where possible, deferred repair otherwise

• Requires special handling to uniquely identify failure source

• Total system design encompassing all hardware components, system firmware and operating system code

• Mainframe RAS attributes in a UNIX server
