United States Patent [19]
Kizuka

US005796937A
[11] Patent Number: 5,796,937
[45] Date of Patent: Aug. 18, 1998

[54] METHOD OF AND APPARATUS FOR DEALING WITH PROCESSOR ABNORMALITY IN MULTIPROCESSOR SYSTEM

[75] Inventor: Yoshitaka Kizuka, Kawasaki, Japan

[73] Assignee: Fujitsu Limited, Kawasaki, Japan

[21] Appl. No.: 53(1,739

[22] Filed: Sep. 29, 1995

[30] Foreign Application Priority Data
Sep. 29, 1994 [JP] Japan .................................... 6-235422

[51] Int. Cl.6 ...................................................... G06F 11/00
[52] U.S. Cl. ................................ 395/182.11; 395/182.09; 395/182.05; 364/268; 364/268.3
[58] Field of Search .......................... 395/182.11, 182.09, 182.01, 181, 182.05

[56] References Cited

U.S. PATENT DOCUMENTS

3,787,816   1/1974   Hauck ........ 395/182.01
3,812,468   5/1974   Wollum ....... 395/182.09 X
3,937,936   2/1976   Saporito ..... 395/182.09
4,415,973  11/1983   Evans ........ 395/182.11 X
4,503,534   3/1985   Budde ........ 395/182.11 X
4,654,846   3/1987   Goodwin ...... 395/182.11
4,807,228   2/1989   Dahbura ...... 395/182.11
4,866,712   9/1989   Chao ......... 395/181 X
4,933,838   6/1990   Elrod ........ 395/182.01
5,003,464   3/1991   Ely .......... 395/182.09 X
5,214,778   5/1993   Glider ....... 395/181

Primary Examiner-Robert W. Beausoliel, Jr.
Assistant Examiner-Dieu-Minh Le
Attorney, Agent, or Firm-Staas & Halsey
[57] ABSTRACT

A multiprocessor system has processors for processing distributed works, a monitoring facility for detecting an abnormality in any one of the processors, an administration facility for providing information about the abnormal processor and information about a redundant processor, and a work allocation facility for seeking the distributed works of the abnormal processor from a work table according to these pieces of information and allocating the sought works to given ones of the processors. The system includes an abnormality measures table that selectively describes measures to be taken for each of the distributed works against an abnormality. The work allocation facility determines, for each of the distributed works of the abnormal processor, a measure to be taken according to the abnormality measures table and allocates the distributed works of the abnormal processor to given ones of the processors. If the abnormality is recursive, allocating any work for which a specific measure such as rerun or continuation is to be taken is suspended. If the redundant processor is being initialized, allocating works to the redundant processor is delayed.

4 Claims, 18 Drawing Sheets
`
[Representative drawing on the cover page: a portion of Fig. 1A. Labeled elements: current processor(s), including P2; monitoring facility (to detect an abnormal processor among P1 to P3); bus 8; correspondence table (to describe types of processors); abnormality classification table (to describe states of processors and possibility of recursive abnormality); administration facility (to identify abnormal and redundant processors, see if the redundant processor is being initialized, and see if the abnormality is recursive). Signals: abnormality; information about abnormal and redundant processors.]

`AHM, Exh. 1006, p. 1
`
`

`

[Fig. 1A: basic structure of the multiprocessor system (first half). Labeled elements: current processors P1, P2, and P3; bus 8; monitoring facility 1 (to detect an abnormal processor among P1 to P3); administration facility 2 (to identify abnormal and redundant processors, see if the redundant processor is being initialized, and see if the abnormality is recursive); correspondence table (to describe types of processors); abnormality classification table 5 (to describe states of processors and possibility of recursive abnormality). Signals: abnormality; information about abnormal and redundant processors.]
`
`

`

[Fig. 1B: basic structure of the multiprocessor system (second half). Labeled elements: redundant processor P4; nonvolatile shared memory 9; work allocation facility (to allocate works of the abnormal processor to substitute processors after initialization of the redundant processor, and suspend allocation of works to be rerun or continued if the abnormality is recursive); work table (to describe works of processors); abnormality measures table (to describe measures to be taken for each work when an abnormality occurs). Signal: allocation state.]
`
`

`

[Fig. 2A: works distributed over current processors P1, P2, and P3 and redundant processor P4. Works shown include the distributed OS, communication controls a, b, and c, and communication applications B and C. Legend: P1 to P3: current processor; P4: initialized redundant processor.]
`
`

`

[Fig. 2B: reallocation of works among current processors P1 and P3 and redundant processor P4 after an abnormality in P2 (marked X), with restoration of P2. Works shown include the distributed OS, communication controls a, b, and c, communication applications B and C, and the printing server. Legend: (i) continuation; (ii) rerun; drawback; halt.]
`
`

`

[Fig. 2C: state of the works after a crush of P4, with P2 marked X. Works shown include the distributed OS, communication controls a and c, communication applications B and C, and the printing server. Legend: (i) continuation; (ii) rerun; drawback; halt.]
`
`

`

[Fig. 2D: restoration of P2 and restoration of P4. Works shown include the distributed OS, communication controls a, b, and c, communication applications B and C, and the printing server. Legend: (i) continuation; (ii) rerun; drawback; halt.]
`
`

`

[Fig. 2E: state of the works across processors P1 to P4 after recovery. Works shown include the distributed OS, communication controls a, b, and c, communication applications A, B, and C, and the printing server. Legend: (i)' continuation; (iii) resumption; (recovery).]
`
`

`

[Fig. 3A: general view of the multiprocessor system (first half). Labeled elements: current processor module(s), including PM #002; monitoring facility 11; administration facility 12, containing a definition unit 14 (to administer the correspondence table of Fig. 4), an abnormality classification decision unit 15 (to administer the abnormality classification table of Fig. 5), and an administration unit 16 (to administer units 14 and 15 and communicate with unit 13); bus 24; reference numerals 25 and 26. Signals: crush; installation; allocation state.]
`
`

`

[Fig. 3B: general view of the multiprocessor system (second half). Labeled elements: PM #003 (CPU and memory); nonvolatile shared memory 28; monitoring facility 11; work allocation facility 13, containing a work allocation control unit 17 (to communicate with unit 12, administer the work table of Fig. 6, and provide instructions to allocate works of the abnormal PM to other PMs), a measures determining unit 23 (to administer the abnormality measures table of Fig. 7), a halt unit 18, a withdrawal unit 19, a drawback-resumption unit 20, a rerun unit 21, and a continuation unit 22. Signals: installation; information about abnormal and redundant processors.]
`
`

`

U.S. Patent    Aug. 18, 1998    Sheet 10 of 18    5,796,937

Fig. 4 (correspondence table 31)
Columns: NAME OF PM; MOUNTING NUMBER OF PM
Names listed: pmOa, pmOb, pmOc
Mounting numbers listed: #001, #002, #003, #004
In this figure: #001 to #003: current PM; #004: redundant PM

Fig. 5 (abnormality classification table 32)
Columns: NAME OF PM; STATE OF PM; POSSIBILITY OF RECURSIVE ABNORMALITY
Names listed: pmOa, pmOb, pmOc
States listed: RESTORING; OPERATING; CHANGING; INITIALIZING (INTO HOT STANDBY STATE)
Possibilities listed: YES; NO; YES; NO
`
`

`

[Fig. 6 (Sheet 11 of 18): work table 33. Column headings include WORK and DESTINATION PM. Works listed: distributed OS; communication control a; communication control b; communication control c; communication application A; communication application B; communication application C; printing server; interactive service. Entries in the PM columns: pmOa, pmOb, pmOc, and REDUNDANT.]
`
`

`

Fig. 7 (Sheet 12 of 18): abnormality measures table 34

WORK ID   WORK                               MEASURES AGAINST PM ABNORMALITY
1         BASE OF OS                         SYSTEM HALT
2         DISTRIBUTED OS (SYSTEM SERVICE)    CONTINUATION
3         COMMUNICATION CONTROL              DRAWBACK AND RESUMPTION, CONTINUATION AND RECOVERY
4         HOST LINKAGE SERVICE (X)           RERUN
5         HOST LINKAGE SERVICE (Y)           RERUN AND RECOVERY
6         INTERACTIVE SERVICE                HALT
7         COMMUNICATION APPLICATION          DRAWBACK AND RESUMPTION, CONTINUATION
8         PRINTING SERVER                    RERUN
`
`

`

[Fig. 8A (Sheet 13 of 18): flowchart, procedures part 1. Inputs: CRUSH (MOUNTING NUMBER); INSTALLATION (MOUNTING NUMBER). Decisions: IS SYSTEM PROCESS CONTINUABLE? (YES/NO); RECURSIVE ABNORMALITY? (YES/NO). Visible step label: S22.]
`
`

`

[Fig. 8B (Sheet 14 of 18): flowchart, procedures part 1 (continued). Boxes shown: DELETE MOUNTING NUMBER OF ABNORMAL PM FROM CORRESPONDENCE TABLE 31 THROUGH ADMINISTRATION FACILITY 12; HALT OR DRAW BACK WORKS OF ABNORMAL PM AND WITHDRAW TAKEOVER INFORMATION THROUGH WORK ALLOCATION FACILITY 13; CHANGE MOUNTING NUMBER OF ABNORMAL PM TO THAT OF REDUNDANT PM IN CORRESPONDENCE TABLE 31 THROUGH ADMINISTRATION FACILITY 12; CONTINUE, RERUN, OR DRAW BACK AND WITHDRAW WORKS OF ABNORMAL PM THROUGH WORK ALLOCATION FACILITY 13; END. Visible step labels: S27, S28.]
`
`

`

[Fig. 9A: flowchart, procedures part 2. Decision: IS PM TO BE INSTALLED IN DRAWBACK STATE OR OBJECT OF WORK TO BE RECOVERED? (YES/NO). Boxes shown: DESCRIBE MOUNTING NUMBER OF INSTALLED PM IN CORRESPONDENCE TABLE 31 THROUGH ADMINISTRATION FACILITY 12; DESCRIBE MOUNTING NUMBER AND NEW NAME OF INSTALLED PM IN CORRESPONDENCE TABLE 31 THROUGH ADMINISTRATION FACILITY 12. Visible step labels: S31, S33, S34, S37. Connectors: *4, *5, *6.]
`
`

`

[Fig. 9B: flowchart, procedures part 2 (continued). Boxes shown: RELEASE BLOCKING OF INSTALLED PM AND RESUME WORKS DRAWN BACK, OR RECOVER WORKS THROUGH WORK ALLOCATION FACILITY 13; LET INSTALLED PM EXECUTE WORKS; END. Decision: DOES WORK TABLE 33 DESCRIBE WORKS OF INSTALLED PM? (YES/NO). Visible step labels: S32, S35, S36. Connectors: *4, *5, *6.]
`
`

`

[Fig. 10A (Sheet 17 of 18): halting system. Legend: P1 to P3: current processor.]

[Fig. 10B (Sheet 17 of 18): drawing back and resuming abnormal processor P1 (no redundant processor). Stages labeled: drawing back; resuming works.]
`
`

`

[Fig. 10C (Sheet 18 of 18): rerunning or continuing works by substitute processors P2 and P3 (no redundant processor). Stages labeled: rerunning or continuing works of processor P1; restoring.]

[Fig. 10D (Sheet 18 of 18): rerunning or continuing works by redundant processor P4. Stage labeled: rerunning or continuing works of processor P1.]
`
`

`

METHOD OF AND APPARATUS FOR DEALING WITH PROCESSOR ABNORMALITY IN MULTIPROCESSOR SYSTEM

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a method of and an apparatus for dealing with a processor abnormality in a multiprocessor system, and particularly, to a multiprocessor system having processors for processing distributed works. When a monitoring facility detects an abnormality in any one of the processors, an administration facility provides information about the detected abnormal processor, as well as information about a redundant processor, to a work allocation facility, which seeks the distributed works of the abnormal processor from a work table according to these pieces of information and allocates the sought works to given ones of the processors.

The present invention allocates the distributed works of the abnormal processor to the other processors in a way that improves the fault tolerance of the multiprocessor system and secures 24-hour operation of the system.

2. Description of the Related Art

A multiprocessor system according to a prior art loosely couples processors, each having a CPU and a memory, through a high-speed bus and distributes works including an operating system (OS), applications, and communication control to the processors.

To improve the fault tolerance of the system, it is important to provide improved measures to deal with a processor abnormality. The prior art is incapable of optionally setting measures to deal with an abnormality depending on the processing conditions of the system and the requirements of a user. For example, the prior art is incapable of localizing the influence of an abnormality in one processor to protect other processors.

If the cause of a processor abnormality is a software failure such as an error in take-over information about a work processed by the abnormal processor, the abnormality will necessarily occur in a substitute processor that reruns or continues the work of the abnormal processor. This will involve another substitute process, which will again be abnormal, thereby expanding the processor abnormality. Consequently, the fault tolerance of the system will deteriorate.

Works shared by an abnormal processor must be allocated to a redundant processor after the redundant processor is initialized, or the works will be incorrectly taken over by the redundant processor and the redundant processor will ineffectively serve as a substitute processor.
`
SUMMARY OF THE INVENTION

An object of the present invention is to deal with a processor abnormality in various ways, suppress an expansion of the processor abnormality, and effectively use a redundant processor.

In order to attain the above object, the present invention provides a multiprocessor system having processors for processing distributed works, a monitoring facility for detecting an abnormality in any one of the processors (P1 to P4), an administration facility for providing information about the detected abnormal processor as well as information about a redundant processor, and a work allocation facility for seeking the distributed works of the abnormal processor from a work table according to these pieces of information and allocating the sought works to given ones of the processors. The system includes an abnormality measures table that selectively describes measures to be taken for each of the distributed works against an abnormality. The work allocation facility determines, for each of the distributed works of the abnormal processor, a measure to be taken according to the abnormality measures table and allocates the distributed works of the abnormal processor to given ones of the processors. If the abnormality is recursive, allocation of any work for which a specific measure such as rerun or continuation is to be taken is suspended. If the redundant processor is being initialized, allocating works to the redundant processor is delayed.
`
BRIEF DESCRIPTION OF THE DRAWINGS

The above object and features of the present invention will be more apparent from the following description of the preferred embodiments with reference to the accompanying drawings, wherein:
FIGS. 1A and 1B show a basic structure of a multiprocessor system according to the present invention;
FIGS. 2A, 2B, 2C, 2D, and 2E explain measures to deal with an abnormality occurring in one of processors of the multiprocessor system;
FIGS. 3A and 3B explain the operation of the multiprocessor system according to the present invention;
FIG. 4 shows a processor module correspondence table according to the present invention;
FIG. 5 shows an abnormality classification table according to the present invention;
FIG. 6 shows a work table (corresponding to FIGS. 2A to 2D) according to the present invention;
FIG. 7 shows an abnormality measures table according to the present invention;
FIGS. 8A and 8B explain procedures (part 1) to deal with a crash or an installation of a processor module, according to the present invention;
FIGS. 9A and 9B explain procedures (part 2) to deal with a crash or an installation of a processor module, according to the present invention; and
FIGS. 10A to 10D show examples of conventional measures to deal with a processor abnormality.
`
DESCRIPTION OF THE PREFERRED EMBODIMENTS

Before describing the embodiments of the present invention, the related art and the disadvantages therein will be described with reference to the related figures.

FIGS. 10A to 10D show measures to deal with a hardware or software abnormality in any one of processors P1 to P4. Among these processors, the processors P1 to P3 are current processors, and the processor P4 is a redundant processor. It is supposed that an abnormality occurs in the current processor P1.

Measures to deal with the abnormality include:
halting all of the current processors;
drawing back works processed by the abnormal processor P1 and resuming the works from the beginning after the abnormal processor P1 is restored to a normal state;
letting substitute processors rerun the works of the abnormal processor P1 from the beginning; and
letting the substitute processors continue the works of the abnormal processor P1 from the time when the abnormality occurred.
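
The difference between rerunning a work from the beginning and continuing it from the point of the abnormality can be illustrated with a short Python sketch. It assumes, purely for illustration, that a work is a list of steps and that the take-over information is the index of the first unfinished step. The names `run_work`, `rerun`, and `continue_from` are hypothetical and do not come from the patent.

```python
def run_work(steps, start, fail_at=None):
    """Run steps[start:] and return (results, take_over_index).

    take_over_index is the index of the first step that was not completed,
    or len(steps) if the work finished. fail_at simulates an abnormality
    that strikes just before the step with that index.
    """
    results = []
    for i in range(start, len(steps)):
        if i == fail_at:
            return results, i
        results.append(steps[i]())
    return results, len(steps)


def rerun(steps):
    # Rerun: a substitute processor starts the work again from the beginning.
    return run_work(steps, start=0)


def continue_from(steps, take_over_index):
    # Continuation: a substitute processor resumes at the recorded position.
    return run_work(steps, start=take_over_index)


# Hypothetical three-step work; the abnormality occurs before step 1.
steps = [lambda: "a", lambda: "b", lambda: "c"]
done, take_over = run_work(steps, start=0, fail_at=1)
rerun_results, _ = rerun(steps)
continued_results, _ = continue_from(steps, take_over)
```

In this sketch a rerun repeats step 0, while a continuation skips it and relies on the recorded position being correct, which is why an error in the take-over information would carry over to the substitute processor.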
`
In this specification, the "substitute processor" may be any one of the redundant and current normal processors.

The multiprocessor system selects one or a plurality of these measures. The selection is carried out solely by an operating system (OS), and therefore there is no room for a user or an application to freely select one or a plurality of them.

In the above circumstances, these measures are taken without actively determining whether the abnormality is a hardware abnormality or a software abnormality.

The software abnormality is caused by an error in a work program per se or in take-over information that is produced and used during the execution of a work program in each processor.

To solve these problems, the present invention adopts an abnormality measures table that selectively describes measures to be taken for each of works shared by processors of a multiprocessor system against an abnormality. If an abnormality repeatedly occurs (recursive error) during restoration of the processor that has caused the abnormality or during a change-over to a substitute processor, the system determines that the abnormality is due to a software error and suspends allocation of the related work by rerun or continuation to the substitute processor. If a redundant processor is being initialized, the system delays the allocation of works to the redundant processor. In this way, the system provides various measures to deal with a processor abnormality, suppresses an expansion of the processor abnormality, and effectively uses a redundant processor.
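
The recursive-error rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the state names are modeled on the states listed in Fig. 5, and treating an abnormality that occurs in the restoring or changing state as recursive is an interpretation of the text. `ProcessorState`, `is_recursive`, and `likely_cause` are hypothetical names.

```python
from enum import Enum


class ProcessorState(Enum):
    OPERATING = "operating"
    RESTORING = "restoring"        # the abnormal processor is being restored
    CHANGING = "changing"          # change-over to a substitute processor
    INITIALIZING = "initializing"  # redundant processor entering hot standby


# Assumption: an abnormality seen in these states is a repeat of an earlier one.
RECURSIVE_STATES = frozenset({ProcessorState.RESTORING, ProcessorState.CHANGING})


def is_recursive(state_when_abnormal):
    """Return True if an abnormality in this state is treated as recursive."""
    return state_when_abnormal in RECURSIVE_STATES


def likely_cause(state_when_abnormal):
    # A recursive abnormality is attributed to a software error; otherwise
    # this sketch leaves the cause undetermined.
    return "software" if is_recursive(state_when_abnormal) else "undetermined"
```

A caller would consult `is_recursive` before handing a work to a substitute processor by rerun or continuation.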

FIGS. 1A and 1B explain the principle of a multiprocessor system according to the present invention.

A processor monitoring facility 1 monitors the operating states of current processors P1 to P3 and detects an abnormality in any one of them.

An administration facility 2 notifies a work allocation facility 3 of information about an abnormal processor and a redundant processor and administers information about a possibility of recursive abnormality as well as information about whether or not a redundant processor is being initialized. The recursive abnormality usually occurs when one processor takes over a work from another.

The work allocation facility 3 allocates works of the abnormal processor sought from a work table 6 to given processors according to measures described in an abnormality measures table 7. If the administration facility 2 notifies the work allocation facility 3 that there is a possibility of recursive abnormality, the facility 3 suppresses allocating works to be rerun or continued to the other processors. If the facility 2 notifies the facility 3 that the redundant processor is being initialized, the facility 3 delays allocating works to the redundant processor.
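
The two conditions that hold back an allocation can be written as one small decision function. The following Python sketch is an illustration only: the function name `decide`, its parameters, and the `Decision` values are hypothetical, and checking the recursive condition before the initialization condition is an assumption, since the text does not state an order.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"  # carry out the measure now
    SUSPEND = "suspend"  # do not hand the work to another processor
    DELAY = "delay"      # wait until the redundant processor is initialized


# Measures that move a work onto a substitute processor.
TAKEOVER_MEASURES = {"rerun", "continuation"}


def decide(measure, recursive_possible, to_redundant, redundant_initializing):
    """Decide what to do with one work of the abnormal processor."""
    if recursive_possible and measure in TAKEOVER_MEASURES:
        return Decision.SUSPEND
    if to_redundant and redundant_initializing:
        return Decision.DELAY
    return Decision.PROCEED
```

For example, `decide("rerun", True, False, False)` yields `Decision.SUSPEND`, while a continuation aimed at a redundant processor that is still initializing yields `Decision.DELAY`.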

A correspondence table 4 (refer to FIG. 4) describes the classification of the current and redundant processors, i.e., the names and mounting numbers thereof.

An abnormality classification table 5 (refer to FIG. 5) describes the state, such as a restoring, operating, or initializing state, of each processor as well as a possibility of recursive abnormality of each processor.

The work table 6 (refer to FIG. 6) describes works and processors that share the works.

The abnormality measures table 7 (refer to FIG. 7) describes, for each work, measures such as halt, drawback, rerun, and continuation to deal with an abnormality.
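
One possible in-memory layout for the four tables is sketched below in Python. The field names, the `Tables` class, and all sample rows are hypothetical and chosen only for illustration; the sketch simply holds what each table is said to describe and shows the lookup of the works shared by a given processor.

```python
from dataclasses import dataclass


@dataclass
class Tables:
    correspondence: dict  # table 4: processor name -> mounting number
    classification: dict  # table 5: name -> (state, recursive abnormality possible)
    work_table: dict      # table 6: work -> names of processors sharing it
    measures: dict        # table 7: work -> measures against an abnormality

    def works_of(self, name):
        """Works shared by the named processor, in alphabetical order."""
        return sorted(w for w, names in self.work_table.items() if name in names)

    def measures_for(self, name):
        """Measures to take for each work of the named processor."""
        return {w: self.measures[w] for w in self.works_of(name)}


# Hypothetical sample contents, not taken from the figures.
tables = Tables(
    correspondence={"pm_a": 1, "pm_b": 2, "pm_c": 3},
    classification={
        "pm_a": ("operating", False),
        "pm_b": ("operating", False),
        "pm_c": ("operating", False),
    },
    work_table={
        "distributed OS": ["pm_a", "pm_b", "pm_c"],
        "communication control b": ["pm_b"],
        "printing server": ["pm_b"],
    },
    measures={
        "distributed OS": ["continuation"],
        "communication control b": ["continuation"],
        "printing server": ["rerun"],
    },
)
```

Keeping the measures keyed by work rather than by processor mirrors the idea that the response to an abnormality is chosen per work.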

The multiprocessor system also has a high-speed bus 8 and a nonvolatile shared memory 9 for storing take-over information to be transferred from an abnormal processor to substitute processors. The processors of the system are the current processors P1 to P3 and the redundant processor P4.

For the sake of explanation, the tables 4 to 7 are separated from each other. The arrangements and storage of information contained in these tables are optional.

The monitoring facility 1 notifies the administration facility 2 of an abnormality occurring in any one of the current processors P1 to P3.

The administration facility 2 refers to the correspondence table 4 and abnormality classification table 5 to identify the abnormal processor and a redundant processor and determine whether or not the abnormality is recursive and whether or not the redundant processor is being initialized. These pieces of information are sent to the work allocation facility 3.

The work allocation facility 3 refers to the work table 6 to identify works shared by the abnormal processor. According to measures sought from the abnormality measures table 7, the facility 3 allocates the works of the abnormal processor to the redundant processor P4, etc. Thereafter, the facility 3 notifies the administration facility 2 of the allocation states of the works.

Not only the redundant processor P4 but also any one of the current normal processors may serve as a substitute processor. Accordingly, the current normal processors execute newly allocated works in addition to works originally shared thereto.

The administration facility 2 updates the correspondence table 4 according to information about the abnormal and redundant processors. The facility 2 also updates the abnormality classification table 5 according to information from the work allocation facility 3.
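
One way to picture the update of the correspondence table is the Python sketch below. It follows the two box labels of Fig. 8B (change the mounting number of the abnormal module to that of the redundant module, or delete it), but representing the table as a dictionary and clearing the entry to `None` are assumptions made for illustration. The function name `update_correspondence` and the sample names are hypothetical.

```python
def update_correspondence(table, abnormal_name, redundant_mounting=None):
    """Return a new name -> mounting-number table after an abnormality.

    With a redundant module available, the abnormal module's name is
    re-pointed at the redundant module's mounting number. Without one,
    the mounting number is cleared.
    """
    updated = dict(table)  # leave the caller's table untouched
    if redundant_mounting is None:
        updated[abnormal_name] = None
    else:
        updated[abnormal_name] = redundant_mounting
    return updated


# Hypothetical table: three current modules; mounting number 4 is the spare.
table = {"pm_a": 1, "pm_b": 2, "pm_c": 3}
with_spare = update_correspondence(table, "pm_b", redundant_mounting=4)
without_spare = update_correspondence(table, "pm_b")
```

Under this reading, code that addresses a module by name keeps working after the change-over, because only the mounting number behind the name changes.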

The contents of the work table 6 and abnormality measures table 7 may be set or updated by a user or application according to the capacity and utilization mode of the multiprocessor system.

As mentioned above, the method of dealing with an abnormal processor according to the present invention is applied to a multiprocessor system having processors for processing distributed works, the monitoring facility 1 for detecting an abnormality in any one of the processors, the administration facility 2 for providing information about the detected abnormal processor and information about a redundant processor, and the work allocation facility 3 for seeking the distributed works of the abnormal processor from the work table 6 according to these pieces of information and allocating the sought works to given ones of the processors. The method of the present invention uses the abnormality measures table 7 that selectively describes measures to be taken for each of the distributed works against an abnormality thereof. The method lets the work allocation facility 3 determine, for each of the distributed works of the abnormal processor, a measure to be taken according to the abnormality measures table 7 and allocate the works to given ones of the processors accordingly.

The apparatus according to the present invention for dealing with a processor abnormality in a multiprocessor system having processors to process distributed works has the monitoring facility 1 for detecting an abnormality in any one of the processors, the administration facility 2 for providing information about the abnormal processor and information about a redundant processor, the abnormality measures table 7 that selectively describes measures to be taken for each of the distributed works against an abnormality, and the work allocation facility 3 for seeking the distributed works of the abnormal processor from the work table 6 according to these pieces of information and allocating the sought works to given ones of the processors according to the measures described in the abnormality measures table 7.
`
With the abnormality measures table 7 that describes, for each work, measures such as halt, drawback, rerun, and continuation to deal with an abnormality and with the abnormality classification table 5 that describes a possibility of recursive abnormality for each processor and whether or not a redundant processor is being initialized, the system is capable of selecting the destination of each of the works of the abnormal processor, suppressing an expansion of the abnormality, and efficiently using the redundant processor.

FIGS. 2A, 2B, 2C, 2D, and 2E explain measures to deal with an abnormality that has occurred in a processor for processing distributed works in a multiprocessor system. The multiprocessor system includes current processors P1 to P3 and a redundant processor P4 that has been initialized to a hot standby state. It is supposed that an abnormality has occurred in the current processor P2.

Works of the current processor P2, now an abnormal processor, are allocated to the other processors as follows:
a distributed operating system (OS) is continued by the current processors P1 and P3 and redundant processor P4;
communication control b is continued by the redundant processor P4;
a communication application A is continued by the current processor P3;
a communication application B is drawn back;
a printing server is rerun by the current processor P1; and

At this time, the work of the abnormal processor P2 is blocked against processing requests from the other processors.

A multiprocessor system according to an embodiment of the present invention will be explained with reference to FIGS. 3A to 8B. For the sake of explanation, this system has processor modules (PMs) each including a CPU and a memory.

FIGS. 3A and 3B are general views showing the multiprocessor system. The system has a monitoring facility 11, an administration facility 12, a work allocation facility 13, a definition unit 14, an abnormality classification decision unit 15, an administration unit 16, a work allocation control unit 17, a halt unit 18, a withdrawal unit 19, a drawback-resumption unit 20, a rerun unit 21, a continuation unit 22, a measures determining unit 23, a high-speed bus 24, the processor modules (PMs) 25 to 27, and a nonvolatile shared memory 28.

FIGS. 4 to 7 show a variety of tables, that is, a correspondence table 31, an abnormality classification table 32, a work table 33, and an abnormality measures table 34, respectively. Note that each processor module (PM) is identified by a name used by software but not by a mounting number. The correspondence between the name and the mounting number is described in the table 31.

The monitoring facility 11, administration facility 12, and work allocation facility 13 are realized by one or a plurality of PMs, which receive a signal indicating an installation or a crash from any one of
