`[191
`Oct. 25, 1994
`[45] Date of Patent:
`Marron
`
`[11] Patent Number:
`
`5,359,730
`
`USOOS359730A
`
`[54] METHOD OF OPERATING A DATA
`PROCESSING SYSTEM HAVING A
`DYNAMIC SOFTWARE UPDATE FACILITY
`
`[75]
`
`Inventor:
`
`[73] Assignee:
`
`Assaf Marron, Poughkeepsie, NY.
`International Business Machines
`Corporation, Armonk, NY.
`
`[21] Appl. No.: 985,762
`
`[22] Filed:
`
`Dec. 4, 1992
`
`
`[51]
`Int. C1.5 ................................... G06F 9/06
`[52] US. Cl. ............................. 395/650; 364/DIG. 1;
`364/280; 364/286
`[58] Field of Search .................. 364/DIG. 1 MS File;
`395/650, 700
`
`[56]
`
`References Cited
`U.S. PATENT DOCUMENTS
`
`4,604,690 8/1986 Crabtree et a1.
`............. 364/DIG. 1
`4,607,332
`8/1986 Goldberg ...........
`364/DIG. 2
`
`....... 364/DIG. 1
`4,809,170 2/1989 Leblang et a1.
`
`..
`....... 364/DIG. 1
`4,862,349 8/1989 Foreman et al.
`................... 364/DIG. 1
`4,953,079
`8/1990 Ward et a1.
`4,954,941
`9/1990 Redman .................
`364/DIG. 1
`
`.......
`4,980,822 12/1990 Brantley et al.
`364/DIG. 1
`........... 364/DIG. 1
`5,210,854 5/1993 Beaverton et a1.
`OTHER PUBLICATIONS
`
`G. Etzkorn, “Change Programming In Distributed Sys-
`tem”, Intnl. Workshop on Configurable and Distributed
`Systems, pp. 140—151, London UK, Mar. 25—27, 1992.
`O. Frieder et a1., “Dynamic Program Modification In
`Telecommunication Systems”, Proceedings of
`the
`
`IEEE Seventh Conf. on Software Engrg. for Telecom-
`munication Switching Systems, pp. 168, 172, 1989.
`
`Primary Examiner—Thomas M. Heckler
`Attorney, Agent, or Firm—William B. Porter; Douglas
`R. McKechnie
`
`[57]
`
`ABSTRACT
`
`A dynamic software update facility (DSUF) is installed
`in a data processing system for the purpose of non-dis-
`ruptively replacing old operating system programs or
`modules with new updated versions thereof while pro—
`viding continuous availability and operation of the sys-
`tem. The new versions are loaded into the system along
`with change instructions providing information con-
`trolling the update. Task or process control blocks con-
`tain markers indicating the corresponding tasks are safe
`or unsafe to run the new programs. The markers are set
`initially to unsafe. A change descriptor table is stored
`and contains control
`information derived from the
`change instructions. When the DSUF is activated, an
`interrupt handler is installed and traps are stored in the
`old programs at entry points and safety points therein.
`Entry point traps are tripped when a task or process
`enters the old program and interrupts are generated that
`are handled by the interrupt handler to route tasks
`which are unsafe to the 01d program and tasks which
`are safe to a new program. When all tasks are safe, the
`new programs replace the old programs. When safety
`point traps are tripped, a task or process may change its
`state from unsafe to safe when predetermined condi-
`tions are met.
`
`19 Claims, 4 Drawing Sheets
`
`
`
`l/ O DEVICES
`AND
`
`PROGRAM LIBRARIES
`
`Page 1 of 12
`
`GOOGLE EXHIBIT 1017
`
`Page 1 of 12
`
`GOOGLE EXHIBIT 1017
`
`
`
`US. Patent
`
`Oct. 25, 1994
`
`Sheet 1 of 4
`
`5,359,730
`
`FIG. 1
`
`14
`
`1o
`
`CPU1
`
`12
`
`\ PROCESS 1
`
`
`73
`
`PROGRAM
`CHECK
`J
`
`
`CPU2
`
`PROCESSZ
`
`15
`
`
`
`MEMORY
`
`16
`
`PCFLIH
`
`
`
`I / O SUBSYSTEM
`
`I / O DEVICES
`AND
`
`PROGRAM LIBRARIES
`
`PROGRAM
`
`PROGRAM
`
`A!
`
`
`
`A' CHANGE
`
`INST.
`
`B!
`
`B' CHANGE
`
`29
`
`INST.
`
`27
`
`Page 2 of 12
`
`Page 2 of 12
`
`
`
`US. Patent
`
`Oct. 25, 1994
`
`Sheet 2 of 4
`
`5,359,730
`
` 34
`
`& CHANGE-INSTRUCTIONS
`
`INSTALL
`IN LIBRARY
`
`INSTALL NEW PROGRAMS
`
`43
`
`PREPARE
`
`FIG. 2A
`
`
`
`ACTIVATE
`
`36
`
`
`
`38
`
`4o
`
`
`
`
`
`
`
`
`
`
`MARK ALL TASKS AND
`PROCESSES 'UNSAFE'
`
`LOAD NEW PROGRAMS
`IN MEMORY
`
`ANALYZE CHANGES
`
`CREATE CHDESC
`
`INSTALL PCFLIH
`
`INTERCEPT
`
`
`
`47
`
`48
`
`50
`
`52
`
`INSTALL TRAPS AT ALL
`ENTRY POINTS IN OLD
`PROGRAMS & IN ALL
`SAFETY POINTS
`
`
`
`I
`
`54
`
`SAVE TIME STAMPS OF
`WHEN TRAPS STORED
`
`
`
`
`
`EXECUTE PCFLIH L--
`
`
`DSUF
`TRAP
`?
`
`60
`
`YES
`
`EXECUTE TRAP POINT
`
`ROUTINE (FIG. 3)
`
`A——————A
`
`Page 3 of 12
`
`Page 3 of 12
`
`
`
`US. Patent
`
`Oct. 25, 1994
`
`Sheet 3 of 4
`
`5,359,730
`
`COMMIT
`
`SWITCH OVER TO
`NEW PROGRAMS
`
`FIG. 28
`
`ALL
`
`NO
`
`
`PROCESSES
`
`SAFE ?
`
`
`
`
`76
`
`SAFETYPOINT
`TRAP
`
`DOES PROCESS
`
`
`N0
`MEET CONDITIONS
`
`
`
`To BECOME
`
`YES
`
`82
`
`RETURN To
`PROCESS
`
`77
`
`83
`
`PART OF
`NO
`
`
`COORDINATED
`GROUP ?
`
`YES
`
`85
`
`. 73
`
`MARK PROCESS
`"SAFE'
`
`87
`
`MARK PROCESS
`'SAFE'
`
`EXECUTE DSCE.
`IF ANY
`
`RETURN TO
`PROCESS
`
`FIG. 3
`68
`
`30
`\
`
`CHECK REASON FOR
`INVOCATION- IS IT DUE
`TO SAFETY POINT TRAP
`OR ENTRY POINT TRAP
`
`ENTRY POINT
`
`TRAP
`
`
`70
`
`IS STATE OF
`
`
`PROCESS SAFE
`0R UNSAFE '2
`
`
`UNSAFE
`SAFE
`
`72
`
`7"
`
`ROUTE TO NEW
`FROGRAM
`
`ROUTE TO OLD
`PROGRAM
`
`
`79
`LAST
`
`REMAIN'NG
`
`
`PROCESS ?
`
`
`
`YES
`
`80
`
`EXECUTE DSCE.
`H: ANY
`
`84
`o
`
`P ND
`SUS E
`
`E
`PROC SS
`
`86
`
`RETURN CONTROL
`TO SYSTEM TO
`EXECUTE OTHER
`
`RETURN PROCESSES
`
`81
`
`UNSUSPEND
`SUSPENDED
`PROCESSES AND
`
`Page 4 Of 12
`
`Page 4 of 12
`
`
`
`US. Patent
`
`Oct.-25, 1994
`
`Sheet 4 of 4
`
`5,359,730
`
`FIG. 4
`
`CHANGE
`NODE 2
`
`‘
`
`95
`
`7
`
`CHANGE
`NODE 1
`
`96
`
`,
`
`103 MODULE
`NODE 1.1
`
`98
`
`HASH TABLE
`
`TRAP
`POINT
`NODE 1.1.1
`
`TRAP
`POINT
`NODE 1.1.2
`
`TRAP
`POINT
`NODE 1.1.3
`
`TRAP
`POINT
`NODE 1.2.1
`
`HASH COLLISION LINK
`
`1 14
`
`TRAP POINT
`
`Page 5 of 12
`
`Page 5 of 12
`
`
`
`1
`
`5,359,730
`
`2
`rent version to the new version. When all procedures
`have been replaced by their new version, the program
`update is complete.” (pg. 169) A ”procedure .
`.
`. that has
`changed between versions may be updated only when it
`is not active.” In contrast, the invention updates active
`tasks and uses entry points and safety points, and does
`not examine any stack. The invention allows for con-
`current execution of the old program and the new pro-
`gram by multiple tasks. Also, the invention does not
`require interception of every program exit.
`The invention involves the use of “safety points”
`which are system observable events and conditions.
`These events and conditions control the routing of tasks
`to the old program or to the new program. One may
`relate the concept of a safety point in a program to a
`sync point in a database (DB) transaction. All DB
`changes must be permanently written in the data base
`once a sync point is reached, all of them should be
`backed out if the transaction aborts prior to reaching a
`sync point, and in some database managers, none of the
`changes are visible to other transactions until a sync
`point has been reached. The differences between the
`DB sync point and the program change safety points
`are:
`
`Safety points are chosen anew with each change to
`the program. Sync points usually remain the same,
`even if the program flow or the database structure
`changes.
`Safety points most often reside in modules which are
`not being changed, while sync points are often
`embedded in the program constituting the transac-
`tion.
`
`Sync points are either explicit (system call), or trivi-
`ally implicit (end of transaction implies a sync
`point). Safety points are explicit, but cannot be
`observed in the program. They must be specified
`externally to the program. It is not possible to code
`a system call saying “This task is now Safe for any
`change”.
`Safety points lose their meaning when the change is
`fully applied or the system is restarted with all new
`modules. Though the code in and around the safety
`point continues to execute, it bears no further sig-
`nificance as a safety point. The sync point is part of
`the ongoing logical significance of the program.
`These differences also apply when comparing the
`concept of safety points and the concepts related to
`Checkpoint-Restart, and the points in time when the
`latter can be performed.
`
`SUMMARY OF THE INVENTION
`
`The problem of replacing program modules while a
`system is running has been an open issue for many years.
`There are numerous sub problems which make a solu-
`tion difficult to implement. This invention describes a
`method that solves most known sub problems. The
`method is applicable to most commercial operating
`systems, including MVS/ESA and UNIX (TM AT&T)
`operating systems. The constraints within which the
`solution must operate, are:
`1. The method should handle arbitrarily unstructured
`code, which may be called concurrently by multi-
`ple processes, using any method of call which is
`physically possible with the underlying machine
`architecture.
`
`2. The running code (the old version which is being
`changed) should not have required or otherwise
`
`METHOD OF OPERATING A DATA PROCESSING
`SYSTEM HAVING A DYNAMIC SOFTWARE
`UPDATE FACILITY
`
`BACKGROUND OF THE INVENTION
`
`This invention relates to the field of data processing,
`and, more particularly, to improvements in a method
`for dynamically making software changes in a running
`system.
`.
`There are commercially available data processing
`systems such as IBM ESA/390 data processing systems,
`which operate with many resident programs or modules
`such as those of the commercially available IBM
`MVS/ESA operating system. (“IBM”, “ESA/390”,
`and “MVS/ESA” are trademarks of International Busi-
`ness Machines Corporation) When a system is running,
`such resident modules are accessible to each other in
`many different ways, and multiple tasks and processes
`can independently access the programs. From time to
`time, various operating system modules are updated and
`it becomes necessary to substitute new versions for the
`old versions. The problem thus exists of how to effect
`non-disruptive replacement while the system is running
`and in consideration of the complex environment where 25
`one or more different processes are concurrently using
`the programs being replaced.
`The general problem is known and has been recog-
`nized in the prior art. A paper, “Change Programming
`in Distributed System”, by G. Etzkorn, International
`Workshop on Configurable and Distributed Systems,
`pages 140—151, London, UK, Mar. 25—27, 1992, de’
`scribes a method of dynamically reconfiguring pro-
`grams in a system in which the programs communicate
`by message passing between ports. Reconfiguration
`occurs only when the system has reached a “reconfigu-
`ration state” and stays in such state while the changes
`are applied or made. The method requires a first series
`of reconfiguration commands that place the system in
`the reconfiguration state and then a series of change
`commands which effect the change. A change is made
`by reconfiguring an old version out of the system and
`configuring a new version into the system. The inven—
`tion differs from such a system in several ways but the
`major points of distinction are as follows. First, the
`invention is not based on message passing but upon the
`use of entry points and safety points and the normal
`interaction of processes with the programs to be
`changed. Second, in the invention, both old and new
`programs may be executed concurrently via multitask-
`ing while the system described in the paper completely
`reconfigures an old program out of the way.
`Another paper “Dynamic Program Modification in
`Telecommunication Systems”, by O. Frieder et 21.,
`Proceedings of the IEEE SEVENTH CONFER-
`ENCE 0N SOFTWARE ENGINEERING FOR
`TELECOMMUNICATION SWITCHING SYS-
`TEMS, pages 168-172, 1989, proposes a solution for a
`subset of the problems in a distributed telecommunica-
`tions environment. The updating process described in
`this paper replaces programs having plural procedures,
`one procedure at a time. “The updating system inter-
`rupts the program and examines the current state of its
`runtime stack. Based on this information and the list of
`
`10
`
`15
`
`20
`
`3O
`
`35
`
`4s
`
`50
`
`55
`
`65
`
`all procedures that each procedure can call (generated
`by the language compiler), the updating system calcu-
`lates when each procedure may be updated. Updating a
`procedure involves changing its binding from its cur-
`
`Page 6 of 12
`
`Page 6 of 12
`
`
`
`3
`undergone a restructure, a rewrite or other modifi-
`cation in order to position it for the dynamic
`change at hand. An “ordinary” change should be
`applied to “ordinary” and existing code with the
`help of an external facility, and with the help of an 5
`administrative process.
`3. Process blocking (quiescing) during implementa-
`tion of a dynamic change must be kept to a mini-
`mum. Deadlocks are prohibited.
`There are also many problems that need to be solved. 10
`Five problems are discussed next.
`Problem 1: Coordinating Concurrent Changes to
`Multiple Modules— This is the most complex problem
`in managing dynamic changes. It revolves around the
`dependency of multiple processes on different versions 15
`of shared modules. For example, suppose two processes
`P1 and P2 call program A and program B occasionally.
`Changes to programs A and B involve loading two
`new, updated programs A’ and B’ into memory. It is
`possible that process P1 is at a point in which all new 20
`calls to A or B can be and should be routed to the new
`versions, namely, A’ and B’, while process P2 is at a
`point where it is still dependent on the old version of A
`and B, and invoking a new version will cause an error.
`There are two common techniques to resolve this 25
`problem. One technique is to shut down the whole
`system before the change is implemented, and restart it
`after loading the new versions of the programs. The
`shutdown guarantees (in most cases) that both processes
`P1 and P2 are at a point where there is no outstanding 30
`dependence on programs A and B. Naturally, this solu-
`tion suffers from the disadvantage that it causes a pro-
`longed system outage which is often undesirable.
`The second technique is to require that each process
`be contained within a “transaction”, i.e., a short, inde- 35
`pendent work request. A transaction processing system
`is then replicated by hardware and software redun-
`dancy. Initially, all the transactions are processed by
`one copy of the transaction processing system (TPS)
`while the other copy is idle. The change is implemented 40
`on the idle copy of the TPS which in turns begins to
`process all newly arriving transactions. All new trans-
`actions unconditionally execute the new version of the
`program. Meanwhile, the original copy of the TPS
`continues to execute the old transactions (uncondition- 45
`ally calling only the old version of the program) until
`they are all completed, at which time it becomes idle
`and is then candidate for change. This solution does not
`commonly work in “legacy systems” which are not
`structured to process in this manner. The need for dy- 50
`namic change exists nevertheless in those legacy sys—
`tems and a solution which does not require a major
`restructure is desired. Also, this solution requires a con-
`siderable amount of redundancy which implies in-
`creased cost. This solution does not address the cases of 55
`long running processes (e.g. “batch jobs”) which are
`never rerouted to the new system, and may not enjoy
`the benefits of dynamic change. Such long running
`processes also may delay indefinitely the time when the
`original system can be changed again. Also, many appli- 60
`cation program changes carry dependencies which
`survive beyond the life of a single transaction. Thus,
`new transactions are not all eligible for executing the
`new code and some may have to still execute the old
`code.
`The invention solves this problem by performing
`conditional routing. Whenever a process invokes a pro—
`gram for which more than one version exists, the system
`
`65
`
`5,359,730
`
`4
`routes the call to the version required by this process,
`based on the state of the given process.
`Problem 2: Physical Replacement of Isolated Mod—
`ules, i.e., replacing a single module independently of
`any other change and when no synchronization be-
`tween multiple tasks is required— This problem is
`readily solved in an environment where all entries to
`the module are conveniently intercepted by the system.
`This situation exists when entry to the module is ef—
`fected via some system call (LINK, ATTACH, FORK,
`EXEC, etc.) A straight forward solution is to route all
`calls which are directed at the module, and which were
`issued after the request for change was accepted by the
`system, to a new version of the module.
`However, the problem is substantially more difficult
`if callers are allowed to call the module in ways which
`bypass all operating system services. In addition, forc-
`ing every program invocation to be constantly filtered
`by the operating system, lest there be a change pending,
`may consume substantial processor resources, and is
`considered prohibitive in a system in which there are
`frequently called functions such as those included in
`operating system kernels.
`A second possible solution is to use the machine’s
`ability to create interrupts based on program behavior
`such as program event
`recording (PER)
`in the
`ESA/390 system. This mechanism is limited in its abil-
`ity to handle multiple concurrent outstanding changes,
`and imposes a substantial performance penalty on the
`system.
`is to
`Another solution which has been suggested,
`modify the original program code in memory in a way
`that it will redirect the execution of the program to a
`new version loaded elsewhere in memory. Such inter-
`ception needs to be specific to the module at hand and
`either cause a debugger to be invoked, or force uncon-
`ditional execution of the new version of the code. This
`solution lacks the ability to coordinate between changes
`to multiple modules. In accordance with the inventive
`solution, a combination of comprehensive trapping with
`registration and filtering of safety points is used as de-
`scribed below.
`Problem 3: Surgical Replacement of a Portion of a
`Module (e.g., a Control Sections (CSECT))——— The
`problem here is to be able to replace a piece of the
`object code in a running program (usually a CSECT in
`a load module) while resolving all possible references to
`and from this code segment. As is well known a
`“CSECT” is an independent segment of a program and
`provides a scope of recognition of names. In an MVS
`environment, replacing a CSECT in the nucleus or
`kernel is a typical example of significant value. One may
`want
`to consider
`implementing even more local
`“patches” (part of a CSECT) with the proposed
`method. The solution is to treat each CSECT as a mod-
`ule being replaced in the manner of the invention,
`namely, the CSECT is compiled and linked indepen-
`dently as a separate program. Pointers in the old module
`which point to the old CSECT, are treated in the same
`way as pointers to a module.
`Problem 4: Changing Data Structures— In problem
`1, if the change to programs A and B involves a change
`to the layout or format of a data structure, the prior art
`does not offer any mechanism which enables the change
`to the program to be reflected in a pre-existing data
`structure. The solution provided herein does address
`this problem. If the data structure is private or local to
`the existing process, then by a proper choice of safety
`
`Page 7 of 12
`
`Page 7 of 12
`
`
`
`5
`point, the data structure can be changed immediately
`upon reaching the safety point. A program which ef-
`fects the change in the data structure, is not part of the
`update facility but is part of a change package which
`also includes change-instructions and updated pro-
`grams. If the data structure is shared by multiple pro-
`cesses, the data structure can be changed when all of the
`processes which share the data structure, have been
`blocked at safety points. This technique is based on the
`manner in which coordinated groups are handled in
`accordance with the invention. Problem 5. Synchroniz-
`ing Multiple Tasks— Problem I referred to dependen-
`cies within a single process. The problem is magnified if
`a certain set of processes P1,P2, .
`.
`. ,Pn, must all start
`calling the new versions A’, B’, together, at a synchro-
`nized point, while other processes Q1,Q2, .
`.
`. Qm, must
`still call the old versions of those programs indefinitely.
`This problem is solved by treating the processes as a
`coordinated group, as described below.
`One of the objects of the invention is to provide an
`improved method for non-disruptively installing new
`versions of resident programs in a data processing sys-
`tem while the system is running.
`A further object of the invention is to non-disrup-
`tively install new versions of operating system modules
`while the system is running and one or more processes
`are executing which use and access such modules.
`Still another object of the invention is to provide a
`dynamic update facility that uses the combination of
`traps and safety points to effect transition between old
`and new versions of operating system programs.
`Briefly, in accordance with the invention, when a
`new version of a module is installed, every invocation
`of the old version is intercepted by the system. A dy-
`namic software update facility (DSUF),
`then deter-
`mines the state of the process which invoked the pro-
`gram. If the process is “unsafe”, the DSUF passes con-
`trol to the old version of the program. If the process is
`“safe” the DSUF passes control to the new version of
`the program. When the change is first installed all pro-
`cesses are initially considered unsafe. The developer of
`the change, provides along with the new programs
`change-instructions including a set of conditions under
`which a process can undergo a state transition from
`unsafe to safe. The DSUF, upon its initialization sets
`itself up to capture all process transitions from an unsafe
`state to a safe state. Thus the DSUF has complete
`knowledge about the state of each process and can
`exercise the conditional routing of control according to
`the developers specifications. The conditions for state
`transition are called “safety points”.
`DRAWINGS
`
`Other objects and advantages of the invention will be
`apparent from the following description taken in con-
`nection with the accompanying drawings wherein:
`FIG. 1 is a block diagram of a data processing system
`embodying the invention;
`FIGS. 2A and 2B, when joined at reference line
`A—A form a block diagram illustrating the general
`operation of the invention;
`FIG. 3 is a flow chart of a trap point routine shown in
`FIG. 1;
`FIG. 4 is a schematic diagram illustrating the tree
`structure of the change descriptor table shown in FIG.
`1.
`
`Page 8 of 12
`
`5,359,730
`
`6
`
`DETAILED DESCRIPTION
`
`Referring now to the drawings, and first to FIG. 1, a
`data processing system (DPS) 10 comprises a plurality
`of central processing units (CPUs) 12 and 14 (also re-
`ferred to as CPUI and CPU2) connected to a common
`memory 16 and to an I/O subsystem 18 by busses 13 and
`15. 1/0 subsystem 18 is further connected to various
`I/O devices and program libraries 20. DPS 10 is prefer-
`ably a large, commercially available IBM ESA/390
`computing system that runs under IBM MVS/ESA
`operating system and is classed as a multitasking, multi-
`processing DPS. FIG. 1 schematically illustrates mem-
`ory 16 as it appears when the system is operating in the
`RUN phase described hereinafter. For the purpose of
`illustrating the invention, assume that memory 16 stores
`two modules 22 and 24 that contain two old programs
`A and B respectively, which programs are part of the
`operating system and are shared by various other pro-
`grams and processes being executed in either or both of
`CPUl and CPU2. The general problem which the in-
`vention addresses and solves is how to replace modules
`22 and 24 with updated modules 23 and 25 respectively
`containing new program A’ and new program B’, while
`the system is running subject to the constraints and
`sub-problems stated previously. The new programs A’
`and B’ are changed or updated versions of old programs
`A and B. Each module may have more than one entry
`point.
`The general operation of DPS 10 will now be de-
`scribed with reference to both FIGS. 1 and 2. Prior to
`execution of PROCESSI and PROCESSZ, standard
`process control blocks (PCBs) 17 and 19 are created by
`the operating system in memory 16 which blocks con-
`tain information about the processes. Such blocks or
`extensions thereto are modified in accordance with the
`invention to include a selectively settable marker or
`flag, or bit for each change indicating whether the cor-
`responding process is safe or unsafe to use the updated
`program(s) being provided with such change. Also,
`information about the state of the conditions which
`
`make a process eligible to be marked safe, are stored in
`the PCB or an extension thereto. These conditions com-
`
`prise the events and/or states when the process is
`deemed “safe”. A dynamic software update facility
`(DSUF) 28 is loaded into memory 16 at system initial
`program load (IPL) and is selectively activated thereaf-
`ter to dynamically update installed operating system
`programs A and B.
`Prior to activation of the DSUF, the new programs
`are created by a change programmer modifying the old
`programs, recompiling, and linking the new programs
`to form load modules 23 and 25. Such modules are with
`machine readable change-instructions 27 and 29, for
`dynamically installing the new modules and programs.
`The change-instructions identify all entry points in the
`old programs and all safety points wherever located,
`such safety points being the events or conditions which
`make a process eligible for executing the new code. If
`the change involves a change in a data structure, a
`program called data structure change effector (DSCE)
`is also packaged to make the change in the data struc-
`ture. The exemplary change does not include a data
`change and thus no DSCE is shown in the drawings.
`It is believed that a detailed discussion of “safety
`points”, at this place in the description, will facilitate a
`better understanding of the invention. When a change
`programmer is developing a change to a system, the
`
`IO
`
`15
`
`20
`
`25
`
`3O
`
`35
`
`40
`
`45
`
`50
`
`55
`
`65
`
`Page 8 of 12
`
`
`
`7
`
`5,359,730
`
`8
`
`programmer can analyze the change modules to deter-
`mine the dependencies of tasks or processes on execu-
`tion of the old version and of the new version. During
`such analysis, the programmer can determine the condi-
`tions which must be satisfied when processes can stop
`executing the old program and start executing the new
`program. Often, conditions can be translated into events
`in the life of a task, or the combination of an event with
`an observable state of the process. These events, states,
`and associated conditions are deemed to be “safety
`points”. A safety point is specified by the change pro-
`grammer designing a change. Examples of safety points
`are when a task is: started after the change was imple-
`mented (that is, all new tasks are “safe”), entering or
`exiting a particular module (either one of the changed
`modules or another unchanged module), making a par-
`ticular system call, executing an instruction at a given
`offset at a given module, observed as swapped out,
`observed as being in a problem or user state (as opposed
`to a supervisor state), observed as being in a wait state
`awaiting completion of some other task or awaiting
`some new work to be assigned to it, running under a
`given job name, and not running under a given job
`name.
`
`These events and conditions are either observed by
`the system since they include a system call, or observed
`by the DSUF which receives control and tests for
`safety conditions before marking a process as safe. Sub-
`sequently, any task attempting to execute the old code
`will be routed either to the old code if the process is
`unsafe or to the new code if the process is safe. The
`change-instructions specify the conditions which allow
`the DSUF to determine which version of the program
`can be used by the process. Safety points can be in the
`old program,
`in the process itself, or in some other
`program, task or process. The safety points also include
`code, referred to hereinafter as “safety point code” that
`is executed at or near the safety point. Such code in-
`cludes a safety point trap, a “wait” instruction that
`places the process in a wait state, etc.
`DSUF 28 operates in phases, the different phases
`being shown in FIG. 2 as labeled boxes located along
`the left side of FIG. 2. Different more detailed actions
`which occur during such phases, are shown as labeled
`boxes located along the right side of FIG. 2. The differ-
`ent phases include an INSTALL phase 34, a PRE-
`PARE phase 36, an ACTIVATE phase 38, a RUN
`phase 40, and a COMMIT phase 42.
`When DSUF 28 is activated, INSTALL phase 34
`performs step 43 to store load modules 23 and 25 and
`change-instructions 27 and 29, in program library 20.
`Next, during PREPARE phase 36, step 44 initially
`marks all processes and tasks as “unsafe”. Such marking
`is done by setting the safety status in the corresponding
`PCBs 17 and 19. Step 46 then loads copies of the new
`programs A’ and B’ from the library into memory 16 in
`such a manner that the new programs are initially “hid-
`den” from the rest of the system. That is, no pointer is
`created allowing direct access to either program by the
`rest of the system—only DSUF has direct access ini-
`tially. Step 47 analyzes the changes by reading the
`change-instructions 27 and 29 and step 48 then creates a
`change descriptor (CHDESC) table 32, in memory 16,
`for storing the change information including the spe-
`cific conditions and events which make each task eligi-
`ble to be “safe”. Table 32 is described below relative to
`FIG. 4 and contains information for controlling the
`update process.
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`45
`
`50
`
`55
`
`65
`
`During ACTIVATE phase 38, step 50 enables an
`intercept in a standard program check first level inter-
`rupt handler (PCFLIH) 26 so that from this point on,
`DSUF 28 receives control on every program check
`interrupt. Step 52 installs traps 53 and 55 in memory 16
`at all entry points in the old programs and at all safety
`points in old programs A and B and elsewhere as prede-
`termined by the change programmer and set forth in the
`change-instructions. Each trap is a hex byte x00 in the
`first byte and may include a second byte that is an ac-
`cess index into the hash table in CHDESC 32, described
`below. Alternatively, a trap can comprise a machine
`instruction which either causes an interrupt or other—
`wise enables the invocation of the DSUF. Step 54 then
`saves time stamps of when the traps were stored. The
`system is thus initialized and prepared for the run phase
`during which various processes are executed or run in
`CPUs 12 and 14.
`
`During RUN phase 40, the system appears as shown
`in FIG. 1. When a process being executed by one of the
`CPUs, e.g. PROCESSl, enters program A, an attempt
`is made to execute the first trap byte x00. Such code is
`an invalid instruction and the attempt to execute it pro-
`duces a program check interrupt causing PCFLIH 26 to
`be executed in step 56. DSUF 28 receives control from
`PCFLIH, examines the cause of the interrupt, and de—
`termines in step 58 whether the trap is a DSUF trap.
`When an interrupt occurs, information is passed indicat-
`ing the source of the interrupt and the determination of
`step 58 looks at such source. If the trap is not a DSUF
`trap, control is returned to PCFLIH to continue execu-
`tion. If step 58 produces a positive or ‘YES’ result, then
`trap point routine 30 is executed in step 60 to determine
`from the safety status recorded in the associated PCB
`whether program A or program A’ should be executed,
`and to route or pass control to the appropriate new or
`old program for execution.
`At a later time, a user can request the change to be
`committed and thereby initiate COMMIT phase 42.
`Alternatively,
`the COMMIT phase can be initiated
`automatically, e.g. by lapse of a predetermined amount
`of time sufficient in duration to reasonably insure that
`the new programs will work properly. In COMMIT
`phase 42, step 62 determines if all the processes are safe,
`i.e., have all the processes been marked “safe”. If so,
`step 64 switches over to the new programs and step 66
`ends the COMMIT phase. If any process is not safe,
`step 62 bypasses step 64 and the new programs are not
`committed. The switching over to the new programs is
`done by locating in each entry point node in CHDESC
`32 the arrays of addresses that point to the old code and
`then storing in such addresses pointers to the new code.
`Any task still using a “saved” old address executes
`correctly since the trap remains in place and routine 30
`routes all callers of the old code to the new code. The
`changes can be backed out of by a similar process of
`applying an ordinary change where old and new ver-
`sions exchange their roles. The advantage over plain
`removal of the trap is that safety rules can (optionally)
`be applied. For example, if a process is already execut-
`ing the new code, it can keep doing so but all subse—
`quent new processes would go back to the old code.
`Referring to FIG. 3, when trap point routine 30 is
`executed, step 68 checks the reason for invocation, i.e.,
`whether the reason is because of an entry point trap or
`a safety point trap. The tripping of a safety point trap
`indicates a state transition from unsafe to safe has oc~
`curred. If the reason is an entry point trap, step 70 looks
`
`Page 9 of 12
`
`Page 9 of 12
`
`
`
`9
`at a safety status marker in the PCB of