`Zwiegincew et al.
`
`(10) Patent N0.:
`(45) Date of Patent:
`
`US 6,317,818 B1
`Nov. 13, 2001
`
`US006317818B1
`
`(54) PRE-FETCHING or PAGES PRIOR TO A
`HARD PAGE FAULT SEQUENCE
`
`6,047,363 *
`
`4/2000 Lewchuk ............................ .. 711/213
`
`* cited by examiner
`
`(75)
`
`Inventors: Arthur Zwiegincew, Bothell; James E.
`Walsh, Kirkland, both of WA (US)
`
`Primary Examirter—KeVin Verbrugge
`(74) Attorney, Agent, or Firm—Merchant & Gould
`
`(73) Assignee: Microsoft Corporation, Redmond, WA
`(U5)
`
`( * ) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 0 days.
`
`(21) APPL N05 09/2823499
`(22)
`Filed.
`Man 30, 1999
`
`Int. Cl.7 .................................................... .. G06F 12/00
`(51)
`(52) U.S. Cl.
`............................................................ .. 711/213
`(58) Field of Search ................................ .. 711/217, 221,
`711/203, 209, 204, 213; 712/207
`
`(56)
`
`References Cited
`
`Us‘ PATENT DOCUMENTS
`5,150,472 *
`9/1992 Blank ct al.
`....................... .. 395/425
`.. 395/421.03
`5,694,568 * 12/1997 Harrison, 111 et al.
`
`..................... .. 709/219
`5,925,100 *
`7/1999 Drewry et al.
`
`ABSTRACT
`(57)
`Hard page fault patterns of an application program module
`are analyzed in order to determine the pages that will be
`retrieved from disk storage during a common hard page fault
`scenario. Copies of, or references to, the determined pages
`are stored in a scenario file, along with an index referred to
`as a page sequence. The scenario file may also include a
`prologue indicating events that lead to a hard page fault
`scenario and an epilogue that may indicate subsequent hard
`page fault scenarios. Execution of the application program
`module is monitored to detect the occurrence of a hard page
`fault scenario. When a hard page fault scenario is detected.
`21 Corresponding scenario file is fetched from disk storage
`and the determined pages, or copies thereof, are transferred
`into RAM. The determined pages, or copies thereof, may be
`placed on a stand-by list in RAM and later soft-faulted into
`the Working set of the application program upon request by
`the application program module
`thereby avoiding a
`Sequence of hard Page faults
`
`29 Claims, 3 Drawing Sheets
`
`30
`
`1
`
`300
`
`
`
`K
`
`302
`
`
`
`MONITOR APPLICATION
`PROGRAM MODULE FOR
`PAGE FAULT SCENARIOS
`
`
`
`310
`
`
`
`308
`
`DETECT PAGE
`FAULT SCENARIO
`
`304
`
`
`
`
`CRIATE SCENARIO
`FILE
`
`
`
`ANALYZE PAGE
`FAULT SCENARIO
`
`
`5CEF’,‘ILAER'0
`EXET
`
`YES
`OPEN SCENARIO FILE
`IN PRE—FETCHER
`
`812
`
`314
`ALLOCATE MEMORY SPACE IN RAM
`FOR COPIES OF MEMORY PAGES
`IN SEQUENCE FILE
`
`31 6
`
`31 8
`
`TRANSFER COPIES OF
`MEMORY PAGES INTO RAM
`
`SET UP PAGE TABLE ENTRIES
`TO REFLECT NEW MEMORY
`PAGES IN RAM
`
`1
`
`APPLE 1010
`
`APPLE 1010
`
`1
`
`
`
`U
`
`eu.m_inmi.9_ourP_..................................................................-LSan
`
`.mmo:zos__89>wz_mwmooE
`
`"mm_E<a<:2:
`
`US 6,317,818 B1
`
`000o_
`
`M_mm?
`
`0_2_39_1.w.5:N_map
`
`<m~_<4<oo.__
`
`
`1m_o<”_mm:z.
`
`tmo<#m:z_wxmozfizmamaxwaxmwm,_m_m_<:
`
`32:MMMmm...«.2
`
`<mm<mo;>'@ImmSn.s_ooAIIVIIml
`xmozfiz.,/
`
`
`Eosmm..........------..............--..........--$4..92may/5.X
`
`Nair
`
`Eosmz
`
`Emsmo<z<s_
`
`s_<mwomn_
`
`Emacs.
`
`
`s_<mwomn.
`
`mm—.ZO_._.<0_._n_n_<
`
`$590.2
`
`oz:<mmao
`
`_2Em>m
`
`8.«E3:marma
`
`
`
`om<om>m§m_w:o_>_
`
`zo:<o:&<
`
`2<mooE
`
`»zmsmo<z<s_
`
`mm._:oos_
`
`m:3002s_<aoomn_
`
`oz:<mmn_o
`
`sm:m>m
`
`_..0_u_
`
`2
`
`
`
`
`
`
`U.S. Patent
`
`Nov. 13, 2001
`
`Sheet 2 013
`
`US 6,317,818 B1
`
`250
`
`
`PAGE FAULT
`SCENARIO
`DETECTOR
`
`
`
`255
`
`PAGE FAULT
`SCENARIO
`ANALYZER
`
`
`
`APPLICATION
`PROGRAM
`MODULE
`
`SCENARIO
`FILE
`
`260
`
`
`
`DISK
`STORAGE
`
`VIRTUAL
`MEMORY
`
`MEMORY
`MANAGEMENT
`UN IT
`
`DISK DRIVE
`
`
`
`
`
`DECOMPRESSOR
`
`SCENARIO
`FILE
`
`DEFRAGGER
`
`F|G.2
`
`COMPRESSOR!
`
`265
`
`3
`
`
`
`U.S. Patent
`
`Nov. 13, 2001
`
`Sheet 3 013
`
`US 6,317,818 B1
`
`301
`
`300
`
`START
`
`302
`
`MONITOR APPLICATION
`PROGRAM MODULE FOR
`
`PAGE FAULT SCENARIOS
`
`31 0
`
`308
`
`CREATE SCENARIO
`FILE
`
`ANALYZE PAGE
`FAULT SCENARIO
`
`"'3°3
`
`
`
`304
`
`DETECT PAGE
`FAULT SCENARIO
`
`306
`
`
`
`SCENAR|O
`FILE
`EXIST
`
`?
`
`YES
`
`312
`
`OPEN SCENARIO FILE
`IN PRE—FETCHER
`
`314
`
`ALLOCATE MEMORY SPACE IN RAM
`FOR COPIES OF MEMORY PAGES
`IN SEQUENCE FILE
`
`316
`
`31 8
`
`MEMORY PAGES INTO RAM
`
`SET UP PAGE TABLE ENTRIES
`TO REFLECT NEW MEMORY
`PAGES IN RAM
`
`320
`
`4
`
`
`
`US 6,317,818 B1
`
`1
`PRE-FETCHING OF PAGES PRIOR TO A
`HARD PAGE FAULT SEQUENCE
`FIELD OF THE INVENTION
`
`This invention relates in general to the management of
`pages for improved performance of an application program
`module during hard page fault intensive scenarios. More
`particularly, the present invention relates to the reduction of
`hard page faults by pre-fetching pages into memory prior to
`the occurrence of a hard page fault sequence.
`BACKGROUND OF THE INVENTION
`
`In a computer system, physical memory refers to a
`hardware device that is capable of storing information. In
`common usage, physical memory refers to semiconductor
`storage (RAM) that is connected to the processing unit of the
`computer system. Many modern processing units and oper-
`ating systems also support virtual memory. Virtual memory
`is a technique that allows smaller and/or partially simulated
`memory devices to be represented as a large uniform pri-
`mary mcmory source. In operation, application program
`modules access memory through virtual addresses, which
`are then mapped by the operating system in conjunction with
`a memory management unit (MMU) onto physical memory
`addresses.
`
`In the context of a paging memory system, a “page” is
`dcfincd as a fixcd-sizc block of bytes whosc physical address
`can be changed via the MMU, working in conjunction with
`a Virtual Memory Manager. Apage is either mapped onto a
`physical address or is not present in RAM, in which case it
`is stored on a disk storage in a page file). A “hard page fault”
`is an exception that occurs when an application program
`module attempts to access a virtual memory page that is
`marked as being not present in RAM. When a hard page
`fault occurs, the Virtual Memory Manager must access disk
`storage to retrieve the data for the requested page .
`Application program modules are typically disk-bound. In
`other words, disk access and transfer times are limiting
`factors of the performance speed of an application program
`module. Disk access time refers to the time required by a
`disk drive to access disk storage and respond to a request for
`a data read or write operation. Therefore, the performance of
`an application program module is significantly limited dur-
`ing hard page fault intensive scenarios.
`There are various potential solutions to the performance
`bottleneck caused by disk access time during hard page fault
`scenarios. An obvious potential solution is to reduce disk
`access time. The reduction of disk access time is primarily
`a hardware consideration and is not easily accomplished.
`However, other potential solutions involve the manipulation
`of memory storage through software program modules.
`For example, one prior solution involves manipulating
`pages such that related blocks of memory are stored together
`on the same or an adjacent page. More specifically, appli-
`cation program module code is typically stored in pages in
`the order in which a compiler processed source code, not in
`the order in which it will be executed. Therefore, when a
`page is accessed by an application program module, it is
`likely that only a portion of the requested code is stored
`thereon and one or more hard page faults will occur to
`retrieve additional
`requested code from other pages.
`Manipulating the pages so that related code is stored on the
`same or adjacent pages reduces the number of pages
`rcquircd to cxccutc the code and thus rcduccs hard page
`faults. Implementing this approach requires an extra per-
`application effort. Also, it is not always possible to manipu-
`late code in pages in an efficient manner.
`
`10
`
`15
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`2
`Another prior solution involves strategically ordering
`pages in disk storage. According to this prior solution, the
`order in which pages will likely be accessed during typical
`usage of an application program is determined based on the
`assumption that disk access patterns are similar from run to
`run. Then, pages are stored in disk storage in the determined
`order. A strategic ordering of pages will result in a reduction
`of hard page fault times. However, this approach is some-
`what limited by the fact pages may be accessed more than
`once by an application program. Therefore, additional hard
`page faults may occur when a particular page must be
`re-retrieved from disk storage. Strategically ordering pages
`in disk storage tcnds to work best when it is cmploycd to
`reduce hard page faults in a single hard page fault scenario,
`typically boot.
`Another prior technique to reduce the performance bottle-
`neck caused by disk access time during hard page fault
`scenarios involves decreasing the amount of pages associ-
`ated witl1 an application program module. Reducing the
`number of pages containing code executed by an application
`program module necessarily reduces the number of hard
`page faults that may possibly occur during execution of the
`application program module. However,
`the reduction of
`memory associated with an application program module
`requires significant eifort on the part of the programmer, or
`improvements in compiler technologies,
`to streamline the
`application program module. Also, end-users demand appli-
`cation program modules having extremely robust function-
`ality and complex graphics capabilities. Thus, it is becoming
`increasingly more diflicult to streamline application pro-
`gram modules while meeting market demands.
`Thus, there remains a need for a method and system for
`improving the performance of an application program mod-
`ule by reducing disk access time without burdening the
`programmer.
`Thcrc furthcr remains a nccd in the art for a method and
`
`system for reducing hard page faults during execution of an
`application program module without detracting from the
`robustness of the application program module.
`SUMMARY OF THE INVENTION
`
`The present invention meets the needs described above by
`providing a system and method for improving the perfor-
`mance of an application program module by reducing the
`occurrence of hard page faults during the operation of an
`application program module. The present invention may be
`embodied in an add-on software program module that oper-
`ates in conjunction with the application program module. In
`this manner, no effort is required on the part of the appli-
`cation programmer to manipulate or modify the application
`program module in order
`to improve performance.
`Furthermore, the add-on software program module does not
`detract from the intended operation of the application pro-
`gram modulc.
`invention, a
`According to one aspect of the present
`scenario file is created which comprises ordered copies of
`pages that are likely to be retrieved from disk storage by an
`application program module during a hard page fault. The
`scenario file is stored in the disk storage. Then, the execution
`of the application program module is monitored until either
`an explicit begin-of-scenario instruction is detected, or a
`hard page fault scenario is detected. A hard page fault
`sccnario may comprise any situation or cvcnt that is likely
`to trigger a hard page fault, i.e., one or more requested pages
`will not be available in RAM and will be retrieved from disk
`storage. In response to the detection of a begin-of-scenario
`
`5
`
`
`
`US 6,317,818 B1
`
`3
`instruction or a hard page fault scenario, the scenario file is
`fetched from disk storage and the ordered copies of the
`pages are transferred into a standby list in RAM. In this
`manner,
`the requested pages will be soft faulted into a
`working set of the application program module, and no hard
`page fault will occur. In another aspect of the invention, a
`hard page fault scenario analyzer is provided for analyzing
`a hard page fault scenario of an application program module
`in order to determine the pages that will be retrieved from
`disk storage upon the occurrence of a hard page fault
`scenario. The hard page fault scenario analyzer creates a
`scenario file comprising copies of the pages in the deter-
`mined order. A hard page fault scenario detector is provided
`for monitoring the execution of the application module,
`detecting a hard page fault scenario and sending a message
`to a pre-fetcher. The hard page fault scenario detector may
`be manual or automatic. A pre-fetcher retrieves a scenario
`file from disk storage and transfers the copies of the deter-
`mined pages into RAM. The copies of the determined pages
`are placed on a standby list
`in RAM. Accordingly,
`the '
`determined pages will be available in RAM during a hard
`page fault scenario and will be soft-faulted into the working
`set of the application program module when they are
`requested by the application program module,
`thereby
`avoiding a hard page fault.
`According to another aspect of the invention, a scenario
`file is created which comprises ordered references to pages
`that are likely to be retrieved from disk storage by an
`application program module during a hard page fault
`scenario, rather than the actual pages themselves. The sce-
`nario file is stored in the disk storage. Then, the execution of
`the application program module is monitored until either an
`explicit begin-of-scenario instruction is detected, or a hard
`page fault scenario is detected. A hard page fault scenario
`may comprise any situation or event that is likely to trigger
`a hard page fault, i.e., one or more requested pages will not
`be available in RAM and will be retrieved from disk storage.
`In response to the detection of a begin-of-scenario instruc-
`tion or a hard page fault scenario, the pages referenced by
`the scenario file are fetched from disk storage in the optimal
`manner and are transferred into a standby list in RAM. In
`this manner, the requested pages will be soft faulted into a
`working set of the application program module, and no hard
`page fault will occur. This aspect of the invention will result
`in more seek operations on disk, but will still allow reading
`of the required pages in an optimal manner, rather than the
`‘as needed’ ordering if the pages are hard faulted into RAM.
`This aspect of the invention also reduces the disk space
`requirements over the previously mentioned aspect.
`These and other aspects, features and advantages of the
`present
`invention may be more clearly understood and
`appreciated from a review of the following detailed descrip-
`tion of the disclosed embodiments and by reference to the
`appended drawings and claims.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`10
`
`15
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`4
`an application program module during hard page fault
`scenarios. The present invention is operable to detect and
`analyze a potential hard page fault scenario. In analyzing the
`potential hard page fault scenario,
`the present invention
`determines which pages will have to be retrieved from disk
`upon the occurrence of a hard page fault sequence. Then,
`prior to the occurrence of the potential hard page fault
`sequence, the present invention pre-fetches into RAM most,
`if not all, of the determined pages. Accordingly, no hard
`page fault will occur when the application program module
`attempts to access the determined pages. Thus, the perfor-
`mance speed of the application program module is improved
`due to fewer disk seeks. Those skilled in the art should
`appreciate that the definition of a hard page fault scenario is
`arbitrary, i.e., a single lengthy scenario could be ‘split’ into
`multiple smaller scenarios. Thus, pages may be pre-fetched
`in segments, one segment per ‘smaller’ scenario, resulting in
`a spreading out of the demand for a single large amount of
`memory in which to read the prefetched pages.
`A hard page fault occurs when a page requested by the
`application program is not available in RAM. The present
`invention attempts to determine the order in which pages are
`likely to be requested in a particular hard page fault scenario.
`A “scenario file” is created to store a “page sequence,”
`which comprises ordered copies of each determined page. A
`pre-fetcher is used to load an appropriate scenario file into
`RAM prior to the occurrence of a particular hard page fault
`sequence. Then, during the particular hard page fault
`scenario, the requested pages will be present in RAM and no
`hard page fault will occur.
`The following description will hereinafter refer to the
`drawing,
`in which like numerals indicate like elements
`throughout the several figures. FIG. 1 and the following
`discussion are intended to provide a brief and genera
`description of a suitable computing environment 100 for
`implementation of the present invention. The exemplary
`operating environment 100 includes a conventional persona
`computer system 120, including a processing unit 121, a
`system memory 122, and a system bus 123 that couples the
`system memory 122 to the processing unit 121. The system
`memory 122 includes read only memory (ROM) 124 anc
`random access memory (RAM) 125. A basic input/output
`system 126 (BIOS), containing the basic routines that help
`to transfer information between elements within the persona
`computer system 120, such as during start-up, is stored in
`ROM 124.
`
`The personal computer system 120 further includes a hare
`disk drive 127, a floppy disk drive 128, e.g., to read from or
`write to a removable magnetic disk 129, and an optical disk
`drive 130, e.g., for reading a CD-ROM disk 131 or to reac
`from or write to other optical media. The hard disk drive
`127, removable magnetic disk drive 128, and optical disk
`drive 130 are connected to the system bus 123 by a hard disk
`drive interface 132, a removable magnetic disk drive inter-
`face 133, and an optical drive interface 134, respectively.
`The drives and their associated computer-readable media
`provide nonvolatile storage for the personal computer sys-
`tem 120. Although the description of computer-readable
`media above refers to a hard disk, a removable magnetic
`disk and a CD-ROM disk, it should be appreciated by those
`skilled in the art that other types of media that are readable
`by a computer system, such as magnetic cassettes, flash
`memory cards, digital video disks, Bernoulli cartridges, and
`the like, may also be used in the exemplary operating
`environment.
`
`FIG. 1 is a block diagram of a personal computer illus-
`trating an operating environment for an exemplary embodi-
`ment of the present invention.
`FIG. 2 is a functional block diagram illustrating operation
`of an exemplary embodiment of the present invention.
`FIG. 3 is a flow diagram illustrating operation of an
`exemplary embodiment of the present invention.
`DETAILED DESCRIPTION
`
`60
`
`65
`
`The present invention is directed to a system and method
`for managing pages in order to improve the performance of
`
`The computer system 120 may include additional input
`devices (not shown), such as a microphone, joystick, game
`
`6
`
`
`
`US 6,317,818 B1
`
`10
`
`15
`
`25
`
`30
`
`35
`
`5
`pad, satellite dish, scanner, or the like. These and other input
`devices are often connected to the processing unit 121
`through a serial port interface 146 that is coupled to the
`system bus 123, but may be connected by other interfaces,
`such as an IEEE 1394 bus or a universal serial bus (USB).
`A monitor 147 or other type of display device is also
`connected to the system bus 123 via an interface, such as a
`video adapter 148. In addition to the monitor, personal
`computer systems typically include other peripheral output
`devices (not shown), such as speakers or printers.
`The personal computer system 120 may operate in a
`networked environment using logical connections to one or
`more remote computer systems, such as a remote computer
`system 149. The remote computer system 149 may be a
`server, a router, a peer device or other common network
`node, and typically includes many or all of the elements
`described relative to the personal computer system 120,
`although only a memory storage device 150 has been
`illustrated in FIG. 1. The logical connections depicted in
`FIG. 1 include a local area network (LAN) 151 and a wide ,
`area network
`152. Such networking environments
`are commonplace in ofiices, enterprise-wide computer
`networks, intranets and the Internet.
`When used in a LAN networking environment, the per-
`sonal computer system 120 is connected to the LAN 151
`through a network interface 153. When used in a WAN
`networking environment, the personal computer system 120
`typically includes a modem 154 or other means for estab-
`lishing communications over a WAN 152, such as the
`Internet. The modem 154, which may be internal or external,
`is connected to the system bus 123 via the serial port
`interface 146. In a networked environment, program mod-
`ules depicted relative to the personal computer system 120,
`or portions thereof, may be stored in the remote memory
`storage device 150. It will be appreciated that the network
`connections shown are exemplary and other me ans of estab-
`lishing a communications link between the computer sys-
`tems may be used. It will be further appreciated that the
`invention could equivalently be implemented on host or
`server computer systems other than personal computer
`systems, and could equivalently be transmitted to the host
`computer system by means other than a CD-ROM,
`for
`example, by way of the network connection interface 153.
`Anumber of program modules may be stored in the drives
`and RAM 125 of the computer system 120. Program mod-
`ules control how the computer system 120 functions and
`interacts with the user, with I/O devices or with other
`computers. Program modules comprise routines, data struc-
`tures and other software or firmware components. Examples
`of program modules are operating systems 135 and appli-
`cation program modules 138. In an exemplary embodiment,
`the present invention may comprise one or more memory
`management program modules 137 stored on the drives or
`RAM 125 of the computer 100. Specifically, program mod-
`ules 137 of the present invention may comprise computer
`implemented instructions for determining which pages will
`have to be retrieved from disk during a potential hard page
`fault scenario and pre-fetching the determined pages into
`RAM prior to the occurrence of the potential hard page fault
`sequence.
`Those sldlled in the art will appreciate that the invention
`may be practiced with other computer system
`configurations, including hand-held devices, multiprocessor
`systcms, microprocessor-bascd or programmablc consumer
`electronics, minicomputers, mainframe computers, and the
`like. The invention may also be practiced in distributed
`computing environments where tasks are performed by
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6
`remote processing devices that are linked through a com-
`munications network.
`In a distributed computing
`environment, program modules may be located in both local
`and remote memory storage devices.
`FIG. 2 is a functional block diagram illustrating operation
`of an exemplary embodiment of the present invention. As
`shown, an application program module 205 interacts with a
`virtual memory device 210 to access data stored in pages.
`The virtual memory device 210 communicates with a
`memory management unit (MMU) 215, which maps virtual
`memory addresses onto physical memory addresses in the
`RAM 220 of the computer system. The MMU, which is a
`well known device, maintains a record of pages present in
`the RAM 220. An MMU includes a lookaside cache;
`requests for pages not recorded by the MMU are redirected
`to the VMM. When the requested pages are not present in
`the RAM 220, they must be retrieved from disk storage 230.
`In such a situation, the VMM instructs a disk drive 225 to
`transfer the requested pages into the RAM 220. Typically,
`the requested pages are swapped with less used pages in the
`RAM 220. Accordingly, the least recently used pages are
`stored on the disk storage 230 and the VMM updates its
`records to reflect the new pages in the RAM 220. Swapping
`is a memory management technique that is well known in
`the art. Those skilled in the art will appreciate that above
`description of memory management has been provided by
`way of example only and that other algorithms may be, and
`often are, employed.
`A hard page fault scenario analyzer 240 anticipates and
`analyzes hard page fault scenarios. As mentioned, a hard
`page fault scenario is a situation in which a hard page fault
`sequence is highly likely to occur. The hard page fault
`scenario analyzer logs various hard page fault scenarios that
`occur during operation of the application program module
`205. The logged hard page fault scenarios are then analyzed
`to determine if they re-occur frequently, and if they do, they
`are put in a scenario file. This analysis can occur program-
`matically on the end-user’s computer system, or in advance
`by the developer of a particular software product. As an
`example, suppose the application program module 205 is a
`word processor and that an anticipated hard page fault
`scenario is the situation in which the user selects a well
`
`known “file open” command. In response to the “file open”
`command, the application program will display a graphical
`representation of a file directory. However,
`in order to
`display the graphical representation of the file directory, a
`sequence of hard page faults will occur because the word
`processor must retrieve a particular set of pages from disk.
`In accordance with an exemplary embodiment of the present
`invention, the hard page fault scenario analyzer 240 antici-
`pates the “open file” hard page fault scenario of the example
`and determines the set of pages that will need to be retrieved
`from disk upon the occurrence of the hard page fault. The
`determination of pages that will need to be retrieved from
`disk will be described in greater detail below. The detection
`of particular classes of hard page fault scenarios may be built
`into the system. For example, application launch is virtually
`always a hard page fault scenario, so an exemplary embodi-
`ment of the present invention may be configured such that
`any launch operation of an application program will be
`considered to be a hard page fault scenario.
`Based on the determined pages,
`the hard page fault
`scenario analyzer 240 creates a scenario file 245, which is
`stored in the disk storage 230. The scenario filc 245 may
`comprise ordered copies of the pages that will likely be
`retrieved from disk due to one or more hard page faults
`during the hard page fault scenario. The scenario file 245
`
`7
`
`
`
`US 6,317,818 B1
`
`7
`may comprise multiple copies of a single page. Similarly,
`multiple scenario files may include copies of the same pages.
`In this manner, the copies of the pages may be transferred
`into RAM 220 from the sequence file 245 in the appropriate
`order without
`the occurrence of a hard page fault. The
`transfer of the copies of pages into RAM is discussed below.
`Asequence file 245 may also comprise ordered references
`to the pages that are likely to be retrieved from disk storage
`230 during a hard page fault scenario, rathcr than copies of
`the actual pages themselves. In response to the detection of
`a begin-of-scenario instruction or a hard page fault scenario,
`the pages referenced in the scenario file are fetched from
`disk storage in the optimal manner and are transferred into
`RAM. In this manner, the requested pages will be available
`in RAM when requested, and no hard page fault will occur.
`This exemplary embodiment of the invention will result in
`more seek operations on disk, but will still allow reading of
`the required pages in an optimal manner, rather than the ‘as
`needed’ ordering if the pages are hard faulted into RAM. As
`compared to a scenario file 245 comprising ordered copies '
`of the determined pages, a scenario file comprising refer-
`ences to the determined pages reduces disk space require-
`ments.
`
`10
`
`15
`
`The hard page fault sccnario analyzer 240 may comprise
`functionality for automatically analyzing hard page fault
`scenarios and generating corresponding scenario files. By
`way of illustration, the hard page fault analyzer 240 may log
`hard page faults that occur upon execution of a process
`during operation of an application program module 205.
`During idle time of the application program module 205, the
`hard page fault scenario analyzer 240 may write the log of
`hard page faults to a log file. Then, a pattern matching
`algorithm may be used to find a pattern of hard page faults
`based on all log files generated for the process same. If a
`pattern of hard page faults is found, a new scenario file may
`be generated based on the pages that are retrieved from disk
`during the pattern. Automatically generated scenario files
`may be subject to subsequent refinement, i.e., they may be
`inp11t into the pattern-matching algorithm.
`The hard page fault analyzer 240 may also comprise
`various application program interfaces (APIs) for allowing
`an application program module 205 to explicitly designate
`certain scenarios as hard page fault scenarios and to instruct
`the hard page fault scenario analyzer 240 to create a corre-
`sponding sequence file. Those skilled in the art will appre-
`ciate that the use of such APIs is likely to be faster and more
`accurate than using functionality for automatically analyz-
`ing hard page fault scenarios.
`A scenario file 245 may further include an index referred
`to herein as a “page sequence.” The elements of a page
`sequence may be stored as triples in the form of: {image
`name, page offset, pre-fetch offset}, where a pre-fetch offset
`is the offset
`into the scenario file. Physically, a page
`sequence may be implemented as: {image name, count,
`(page offset, pre-fetch offset)
`The scenario file 245 may also comprise various other
`components, such as a “scenario file ID,” a “process image
`name,” a “prologue” and an “epilogue.” A scenario file ID
`may be a standard 128-bit COM GUID, or an equivalent
`thereof, which is well known in the art. A process image
`name may allow hard page fault scenarios to be constrained
`to certain processes (e.g. winword.exe). Prologues and epi-
`logucs may be provided as aids for determining the order of
`hard page fault scenarios. For example, a component of the
`hard page fault scenario analyzer 240 may be operable to
`keep track of events leading to and following hard page fault
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`8
`scenarios. A prologue may be used to record events that
`typically lead to the hard page fault scenario associated with
`the scenario file 245. The elements of prologues may be
`stored as pairs in the form of: {image name, page offset}.
`Similarly, an epilogue may be used to help to predict
`whether another hard page fault scenario will follow the
`hard page fault scenario associated with the scenario file
`245. For example, an epilogue may be implemented as a list
`of scenario file IDs, and a decay factor to minimize the
`number of coincidental hints.
`
`After the hard page fault scenario analyzer 240 has
`analyzed various hard page fault scenarios of an application
`program module 205 and has stored corresponding scenario
`files 245 in the disk storage 230, the hard page fault scenario
`detector 250 monitors operation of the application program
`module 205 for the occurrence of a hard page fault scenario.
`A hard page fault scenario detector may be manual or
`automatic. A manual hard page fault scenario detector may
`be programmed to send messages to the pre-fetcher upon the
`occurrence of particular events. An automatic hard page
`fault scenario detector may be operable to analyze prologues
`and epilogues in scenario files to predict when a hard page
`fault scenario will occur. When the hard page fault scenario
`detector 250 detects a hard page fault scenario, it sends a
`message to a pre-fetcher 255. The message sent to the hard
`page fa11lt scenario detector 250 instructs the pre-fetcher to
`fetch from disk storage 230 the scenario filc 245 associated
`with the detected hard page fault scenario. Apre-fetcher 255
`typically exists as part of an operating system.
`In response to the message received from the hard page
`fault scenario detector 240, the pre-fetcher 255 accesses the
`disk drive 225 to retrieve the appropriate scenario file from
`the disk storage 230. The pre-fetcher 255 then transfers into
`the RAM 220 the page sequence of the retrieved scenario file
`245. The newly transferred pages are placed on a standby list
`in the RAM 220, which is a technique that is well known to
`those skilled in the art. As such, the newly transferred pages
`do not increase the working set, i.e., the pages currently
`utilized by the application program module 205. Then, as the
`newly transferred pages are requested by the application
`program 205, they are soft-faulted into the working set and
`a hard page fault is avoided.
`While pre-fetching scenario files 2