`Chan et al.
`
`[11]
`[45]
`
`Patent Number:
`Date of Patent:
`
`4,812,981
`Mar. 14, 1989
`
`[541
`
`MEMORY MANAGEMENT SYSTEM
`IMPROVING THE EFFICIENCY OF FORK
`OPERATIONS
`Carl Chan, Framingham; Edwar A.
`[75] Inventors:
`Feustel, Sherbom, both of Mass.
`Assignee:
`Prime Computer, Inc., Natick, Mass.
`App1.No.: 790,839
`Filed:
`Oct. 24, 1985
`
`Int. Cl.‘ ..................... .. G06F 15/00; G06F 12/12
`US. Cl. ................................... .. 364/200; 364/300
`Field of Search
`364/200 MS File, 900 MS File,
`364/300
`
`References Cited
`U.S. PATENT DOCUMENTS
`364/200
`4,l04,7l8 8/1978
`364/200
`4,511,214 5/1935
`Hardy
`364/200
`4,584,639 4/1986
`..
`4,589,093 5/1986 lppolito =1 ..
`.. 364/900
`364/2(X)
`4,758,951 1/19ss Sznyter, 111 ...................... ..
`
`OTHER PUBLICATIONS
`A UNIX System Implementation for System/370 by W.
`A. Felton, G. L. Miller, and J. M. Milner-published by
`AT&T Bell Laboratories Technical Journal, vol. 63,
`No. 8, Oct. 1984, pages 1751 through 1767.
`Virtual Memory by Peter J. Denning-published in
`
`Computing Surveys, vol. 2, No. 3, Sep. 1970, pages 153
`through 189.
`The Evolution of UNIX System Performance by J.
`Feder—-published by AT&T Bell Laboratories Techni
`cal Journal vol. 63, No. 8, Oct. 1984, pp. 1794-18l4.
`Primary Examiner—Raulfe B. Zache
`Assistant Examiner-Thomas C. Lee
`Attorney, Agent, or Firm-Lahive & Cock?eld
`[57]
`ABSTRACT
`Methods and apparatus for implementing fork opera
`tions on UNIX or UNIX-emulating operating systems,
`particularly in multi-user environments reduce the copy
`time, the number of page faults and, consequently the
`input-output (“l/O”) operations between the central
`processing unit, main memory and auxilliary memory.
`In one aspect of the invention, fork operations are exe
`cuted by rede?ning those pages of the parent process
`image resident in main memory as pages of a child pro
`cess image and modifying the page maps accordingly.
`Page faults are thereby eliminated for pages located in
`auxiliary memory. Additional improvements in perfor
`mance are obtained by monitoring the level of main
`memory utilization and selecting optical procedures
`based on the amount of excess capacity in main mem
`ory.
`
`10 Claims, 2 Drawing Sheets
`
`mm
`mm)’
`——->
`5
`'- ------- -——-.
`/i~-g
`m2‘ MP5 :
`L ———————— ——J
`J “"51???” 1
`24/». m/m/z/rr :
`1 6211/1444
`1
`
`aim/1
`mama/P
`/_2
`
`22)
`20]
`Ila/45'2"‘: , Vii/1467f?
`, rm/ 1
`1 6‘0Pf/I6‘
`1
`1444/15/11 1.4541549
`
`1
`
`40mm”
`mm/ arr/a5
`
`“m”
`
`1
`
`APPLE 1023
`
`
`
`US. Patent Mar. 14, 1989
`
`Sheet 1 of2
`
`4,812,981
`
`mm”
`1!
`l,__________
`
`FIG. 1
`
`2
`
`
`
`US. Patent
`
`Mar. 14,1989
`
`Sheet 2 of2
`
`4,812,981
`
`CE?)
`
`PA/FEW
`
`FIG. 2
`
`3
`
`
`
`1
`
`4,812,981
`
`MEMORY MANAGEMENT SYSTEM IMPROVING
`THE EFFICIENCY OF FORK OPERATIONS
`
`2
`that is to say, programs coded to interface with UNIX
`on one data processing system can be run on other
`hardware as well. Many of the data processing systems
`currently being manufactured employ either the UNIX
`operating system or an operating system that emulates
`UNIX functions. The Prime Computer Series 50 sys
`tems, for example, can be con?gured to run PRIMIX, a
`UNIX emulation which operates in conjunction with
`the Series 50 system’s PRIMOS operating system. (PRI
`MIX and PRIMOS are trademarks of Prime Computer,
`Inc.).
`UNIX is a multi-user, timesharing, operating system
`using a tree-structured ?le system. Other noteworthy
`functional features are its logical I/O capabilities, pipes.
`and forks. “Logical I/O” allows the user to specify the
`input and output ?les of a program at runtime rather
`than at compile time, thus providing greater flexibility.
`“Piping” is a UNIX feature which allows the buffering
`of input and output to and from other processes. “Fork
`ing” is a UNIX method for creating new processes.
`By themselves, these UNIX features offer no inherent
`bene?ts. However, the UNIX command environment
`(called the SHELL) provides easy access to these oper
`ating system capabilities and also allows them to be used
`in different combinations. With the proper selection and
`ordering of system commands, logical I/O, pipes, and
`forks, a UNIX user at command level can accomplish
`work which on another system would require writing
`and generating an entirely new program. This ability to
`easily create application program equivalents from
`command level is one of the unique and primary bene
`?ts of the UNIX operating system.
`A problem can often arise when a UNIX-type operat
`ing system is implemented in a multiple user environ
`ment. The basic operation of a “fork” in the UNIX
`system involves spawning a new process, and then
`copying the process image of the parent (the process
`making the fork call) to the child (the newly spawned
`process). The time required to copy over a process
`image is proportional to the size of the process image.
`Therefore, a generic problem of running a UNIX-type
`system on a large virtual memory CPU architecture, is
`that the process image copy time of a fork operation can
`be a performance bottleneck.
`The performance problem is further aggravated in a
`multi-user environment because many fork operations
`are taking place simultaneously. Each page fault gener
`ated while copying the process image from parent to
`child can potentially displace a page in another process,
`indirectly causing the other process to incur more page
`faults when it runs. Thus, the incremental paging caused
`by each fork operation can dramatically increase the
`overall paging rate of the system. The end result can be
`very poor performance when the system is heavily
`used. This problem is typically referred to as “thrash
`ing”.
`There exists a need for better methods and systems of
`memory management, particularly when UNIX-type
`operating systems are employed in multiple user, virtual
`memory architectures. A memory management system
`that can reduce the performance bottlenecks and poten~
`tial thrashing associated with UNIX fork operations,
`would satisfy a substantial need in the industry.
`
`SUMMARY OF THE INVENTION
`The present invention is useful whenever UNIX-type
`operating systems are implemented in data processors
`
`5
`
`15
`
`25
`
`35
`
`50
`
`BACKGROUND OF THE INVENTION
`The technical ?eld of this invention is data processing
`and, in particular, memory management for multiple
`user systems that employ demand paging and a virtual
`memory architecture.
`Data processing systems generally include a central
`processing unit (“CPU”), a storage system (or “main
`memory”), peripheral devices and associated interfaces.
`Typically, the main memory consists of relatively low
`cost, high-capacity, digital storage devices. The periph
`eral device may be, for example, a non-volatile auxiliary
`memory medium, such as a magnetic disk or magnetic
`tape drive.
`In order to carry out tasks, the central processing unit
`of such systems executes a succession of instructions
`which operate on data. The succession of instructions
`and the data those instructions reference are referred to
`as a program. The execution of a program by the central
`processing unit is referred to as a process. The address
`space that contains the instructions and data for a pro
`cess is called the process image. A process image typi
`cally is distributed between the main memory and the
`auxiliary memory, depending upon the size of the pro
`cess and the immediate needs of the central processing
`unit.
`An architecture called virtual memory is employed in
`many data processing systems where service is pro
`vided to a number of users. On some systems, virtual
`memory is divided into segments, each of which is fur
`ther divided into pages with each page accomodating a
`?xed number of words. In a number of systems, for
`example, the Series 50 systems manufactured by Prime
`Computer, Inc. of Natick, Massachusetts, the addresses
`of the segments and pages are allocated by an operating
`system and can be arranged in the main memory in a
`random fashion. To the user, however, an appearance
`of continuity and unlimited memory space is presented
`and the user need not be concerned with the actual
`interleaved nature of programs in the main memory.
`Moreover, at any given time some pages of a program
`may reside in the auxiliary memory rather than the main
`memory. Page maps and the like, are employed by the
`operating system to translate virtual memory addresses
`into physical locations in main or auxiliary memory.
`When a process begins, it is necessary to retrieve
`pages containing instructions or data from memory.
`When a page of a program is needed by a process but is
`not resident in main memory, a “page fault” occurs.
`Demand paging refers to a method of fetching pages
`from auxiliary memory. Demand paging systems trans
`port just one page (containing the next instruction to be
`55
`executed or data to be manipulated) into main memory
`at a time. A replacement policy is typically employed
`by the operating system at the same time to “page out”
`instructions or data that are not likely to be needed
`again in the near future. For example, the least recently
`used page can be transferred from main memory to
`auxiliary memory. Each time a page is transferred from
`one memory location to another, the address and page
`maps must be revised accordingly.
`One operating system which has gained widespread
`acceptance in the industry is the UNIX operating sys
`tem. (UNIX is a trademark of the AT&T Bell Laborato
`ries). The UNIX system is designed to be “portabie",
`
`65
`
`4
`
`
`
`4,812,981
`4
`3
`having virtual memory architectures. The invention
`where there is more main memory than is needed to run
`the number of currently active programs, it is possible
`reduces memory management problems in UNIX oper
`to eliminate the parent page I/O operation to auxiliary
`ations as well as those typically encountered when
`memory. An optimizing feature of the invention for
`other operating systems are used to emulate UNIX
`operations.
`pages in the second state, therefore, calls for two page
`faults to be taken when the system is lightly used as the
`In one aspect of the invention, a fork call is executed
`by rede?ning those pages of the parent process which
`parent page is copied to the child page. In this manner,
`the minimum number of I/O operations is always
`are resident in main memory as pages of a child process
`chosen.
`image; modifying the page maps in response to the
`rede?ned pages; and accessing the auxiliary memory to
`For parent pages in the third state (not resident), an
`optimizing feature can again be employed. For non-resi
`copy the remaining pages of the parent process image
`dent pages, the conventional technique requires three
`not resident in main memory as demanded for the child
`image.
`I/O operations when the system is heavily used. When
`practicing the invention, only two I/O’s must be carried
`In conventional UNIX and UNIX-emulating systems,
`the process image copying is accomplished by examin
`out. One 1/0 is required to read in the parent page from
`ing each active segment in the parent’s logical address
`auxiliary memory, and another is needed to copy it to
`the child page in auxiliary memory.
`space and copying all previously referenced pages in
`However, if physical memory usage is low, then the
`the segment from the parent to the child, page by page.
`optimization used for parent pages found in the second
`If the parent page is resident in memory, it is copied to
`state can also be used. That is, the parent page is copied
`the child. If the parent page is not resident in memory,
`directly to the child page. Although two page faults are
`then even before the copy can occur, a page fault must
`generated, the 1/0 to write the child page to auxiliary
`take place and the address maps must be updated. The
`memory is eliminated. Once again, better performance
`page must be transferred into main memory and copied
`is achieved because it takes less real time to complete
`to satisfy the child process. In most cases, a page fault
`will also occur whenever the parent page is copied
`the page faults than to complete an I/O, and the mini
`mal number of I/O operations is always chosen.
`(whether or not it is resident in main memory) because
`Alternatively for pages not resident in main memory
`space must be made available for the child page. The
`overall result is often substantial CPU page-fault and
`on a lightlyr used system, the parent page can be read
`copying overhead.
`into main memory and then rede?ned as the child page.
`This approach reduces CPU overhead even further by
`If the case of a blank page is ignored, each parent
`eliminating a page fault on the child and the copying
`page can be examined and found to be in one of three
`time from the parent to the child page.
`states, when copying a parent process image into a child
`image:
`The invention will next be described in connection
`with certain preferred embodiments; however, it should
`1. Resident in main memory and not modi?ed
`be clear that various changes and modi?cations can be
`2. Resident and modi?ed
`made without departing from the spirit or scope of the
`3. Not resident in main memory
`invention. For example, although the summary and
`(A parent page can also be found to be blank-mot yet
`detailed description describe a data processing system
`referenced-because typically every page of each ac
`tive segment of the parent’s logical address space is
`employing a main memory and an auxiliary (i.e. disk)
`memory, other levels of memory can also be added. In
`examined even though some of the pages in a given
`many systems, including the Prime Series 50 systems, a
`segment may not be in use. In this situation, both the
`high speed memory or data cache is also incorporated
`conventional method and the present invention pass
`over the blank page and proceed to the next page or
`to store a limited amount of data that are most likely to
`next active segment).
`be needed by a user at a particular time; the use of such
`data caches and associated data veri?cation means is, of
`For pages found in the ?rst state (resident and not
`modi?ed), the present invention substantially reduces
`course, compatible with the present invention. Simi
`larly, the invention can ?nd use in multiple processor
`the copy time, the number of page faults and conse
`quently, the input/output (“I/O") operations between
`data processing systems where the execution of differ
`the CPU, main memory and auxiliary memory when
`ent aspects of a process are conducted in parallel.
`conducting a UNIX fork operation. By effectively pass
`ing the set of physical pages belonging to parent process
`(i.e., the parent working set) to the child, page faults and
`possible I/O operations associated with the creation of
`child pages are eliminated. Additionally, parent-to
`child copying time is eliminated. Since it is the child
`process that usually continues to be executed after a
`fork operation has been initiated, the loss of immediate
`access to the parent process image is an insigni?cant
`trade-off. Most of the time, the parent process simply
`waits for the child process to complete its work.
`When practicing the present invention on pages in the
`second state (resident but modi?ed), the virtual page
`normally is ?rst paged out to auxiliary memory, before
`rede?ning it as the child page. Thus, at least one I/O
`operation is required. Nonetheless, this is preferable
`over the CPU overhead, page fault and possible I/O
`operations necessary to create a child page under prior
`art techniques. However, on a lightly used system
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`FIG. 1 is a schematic block diagram of a data process
`ing system employing the present invention; and
`FIG. 2 is flow chart depicting a method of practicing
`the present invention.
`Detailed Description
`In FIG. 1, a schematic block diagram of a memory
`management system 10 is shown including central pro
`cessor 12, main memory 14, an auxiliary memory device
`16, page maps 18 (typically stored in main memory) and
`a page availability counter 24 (also typically stored in
`main memory). The central processor 12 further in
`cludes an image copying system 20 and a page fault
`handler 22. The image copying system and page fault
`handler can be implemented by hardware, ?rmware or
`software and are typically implemented by a combina
`tion of these means. In function, the portions of the
`
`50
`
`55
`
`65
`
`20
`
`35
`
`45
`
`5
`
`
`
`5
`central processor 12 not germane to the present inven
`tion as well as the main memory 14, auxiliary memory
`device 16 and page maps 18 in the present embodiment
`are similar to corresponding elements in the Series 50
`Systems manufactured by Prime Computer, Inc., of 5
`Framingham, Mass.
`In the illustrated embodiment, the central processor
`12 can be operated by the PRIMOS operating system.
`PRMIX, a UNIX-emulating program, is run on
`PRIMOS, translating UNIX system calls into equiva- 1O
`lent PRIMOS functions. The central processor 12 exe
`cutes fork operations by employing the image copying
`system 20 in conjunction with page fault handler 22.
`When a fork operation is called for, the imaging copy
`ing system 20 rede?nes the pages of the parent image
`that are resident in main memory (and not modi?ed) as
`pages of the child process image, and revises the page
`maps 18 in response to the rede?ned pages. It also acti
`vates the page fault handler 22 to retrieve from auxiliary
`memory those pages of the parent process image not
`resident in main memory. Additionally, the image copy
`ing system 20 interacts with the page fault handler 22 to
`copy modi?ed pages into auxiliary memory when nec
`essary and to copy selected pages into auxiliary memory 25
`when new pages are needed for the child process image.
`The page availability counter 24 determines the level of
`system utilization and activates certain optimizing fea
`tures, as discussed below, to reduce the number of I/O
`operations.
`In the FIG. 2, the operation of the image copying
`system 20 is further illustrated by a ?ow chart of the
`steps followed in creating the child image. The image
`copying system of the present invention searches for the
`?rst page of the parent image and determines whether 35
`the parent page is in main memory. The location of the
`page (resident or non resident) is determined, for exam
`ple, by examination of a page map residence bit in Prime
`Series 50 computers. If the page is in main memory, the
`page is examined to determine whether the contents
`have been modi?ed. Similarly, for the example of Prime
`Series 50 systems, the state of page (modi?ed or not
`modi?ed) is determined by examining a page map modi
`?ed state bit. If the parent page is in main memory and
`has not been modi?ed, the parent page is simply rede- 45
`?ned as the child page.
`If the page in main memory has been modi?ed and
`the system is heavily used, the contents must ?rst be
`recorded in the auxiliary memory. Once this is com
`pleted, the parent page is de?ned as a child page and the 50
`page maps are revised accordingly. Likewise, if the
`parent page was not located in main memory and the
`system is heavily used, the parent page is obtained from
`auxiliary memory and then copied back to auxiliary
`memory to form the child image.
`The optimizing features of the invention are also
`shown in FIG. 2. These features are employed when the
`system is only lightly used and it is not likely that the
`main memory capacity (or some fractional “threshold"
`thereof) will be exceeded. The optimizing features are
`used in those instances when a parent page in main
`memory has been modi?ed or when the parent page is
`found in auxiliary memory. In these aspects of the in
`vention, the system ?rst determines whether a main
`memory threshold has been exceeded. This is accom- 65
`plished for example in the Prime Series 50 systems by
`examining the page availability counter. In one embodi
`ment, the system is considered lightly used if 100 or
`
`4,812,981
`6
`more pages are available (unused) in main memory at
`any given time.
`If the parent page is resident in main memory but has
`been modi?ed, and the main memory capacity has not
`been exceeded, the parent page is just copied onto a
`new child page. If the parent page is in auxiliary mem
`ory, and main memory capacity has not been exceeded,
`then the parent page is read into main memory and
`either copied or rede?ned to create a child page.
`Once this process is completed for the ?rst page of
`the child image, it is repeated over and over again as the
`subsequent pages are demanded until the child image is
`completed.
`We claim:
`1. A method of copying a parent process (or task)
`having pages of information to a child process (or task)
`during a fork operation in a data processing system
`which employs virtual addressing to reference a physi
`cal memory, at least one page map to de?ne the rela
`tionship between virtual addresses and their physical
`locations, and demand paging to allocate a logic address
`space, de?ning a process, between a main memory and
`a lower density auxiliary memory, the pages of the
`parent process distributed in the main memory and the
`auxiliary memory wherein said pages resident in the
`main memory comprise modi?ed and unmodi?ed pages,
`the method comprising;
`a. rede?ning the unmodi?ed pages of the parent pro
`cess resident in main memory as pages of a child
`process;
`b. modifying the page map to indicate the pages resi
`dent in main memory rede?ned from the parent
`process to the child process; and
`c. copying pages of the parent process not resident in
`main memory from the auxiliary memory to create
`pages for the child process in said logic address
`space.
`2. The method of claim 1 further includes determin
`ing whether an excess of main memory capacity exists;
`copying the modi?ed parent pages to create new child
`pages in main memory if excess capacity exists; and
`rede?ning the modi?ed parent pages as pages of the
`child process after copying the modi?ed pages to auxil
`iary memory if excess capacity does not exist.
`3. The method of claim 2 wherein the step of deter
`mining whether excess memory capacity exists is con
`ducted by polling a page availability counter.
`4. The method of claim 1 wherein the step of copying
`the auxiliary memory pages of the parent process fur
`ther includes determining whether an excess of main
`memory capacity exists; reading the parent pages into
`main memory and copying the parent pages to create
`new child pages in main memory if excess capacity
`exists; and copying the parent pages to child pages in
`auxiliary memory after reading the parent pages in from
`auxiliary memory if excess memory capacity does not
`exist
`5. The method of claim 4 wherein the step of deter
`mining whether excess memory capacity exists is con
`ducted by polling a page availability counter.
`6. The method of claim 1 wherein the step of copying
`the auxiliary memory pages of the parent process fur
`ther includes determining whether an excess of main
`memory capacity exists; reading the parent pages into
`main memory and then rede?ning the parent pages as
`child pages if excess capacity exists; and copying the
`parent pages to child pages in auxiliary memory after
`
`30
`
`55
`
`6
`
`
`
`7
`reading the parent pages in from auxiliary memory if
`excess memory capacity does not exist.
`7. The method of claim 6 wherein the step of deter
`mining whether excess memory capacity exists is con
`ducted by polling a page availability counter.
`8. A process image copying system for copying a
`parent process having pages of information into a child
`process during a fork operation in a data processing
`system which employs virtual addressing to reference a
`physical memory, at least one page map to de?ne the
`relationship between virtual addresses and their physi
`cal locations, and demand paging to allocate a logic
`address space, which address space de?nes a process (or
`task), distributed between a main memory and a lower
`density auxiliary memory, the pages of the parent pro
`cess distributed in the main memory and the auxiliary
`memory wherein said pages resident in the main mem
`ory comprise modi?ed and unmodi?ed pages, the pro
`cess copying system comprising;
`
`8
`a. rede?ning means for rede?ning the unmodified
`pages of the parent process resident in main mem
`ory as pages of the child process;
`b. map modifying means for modifying the page map
`in response to page rede?nitions by the rede?ning
`means;
`0. copying means for copying pages of the parent
`process not resident in the main memory to create
`pages for the child process in said logic address
`space; and
`d. central processing means connected to said rede
`?ning means, said map modifying means, and said
`copying means for controlling and coordinating
`the activities of each of said means.
`9. The system of claim 8 wherein the system further
`includes a memory capacity monitoring means for de
`termining whether an excess of main memory capacity
`exists.
`10. The system of claim 9 wherein the monitoring
`means is a page availability counter.
`‘
`* i
`i
`#
`
`4,812,981
`
`20
`
`25
`
`30
`
`35
`
`45
`
`55
`
`65
`
`7