Beardsley et al.

US006381677B1

(10) Patent No.: US 6,381,677 B1
(45) Date of Patent: Apr. 30, 2002
`
`(54) METHOD AND SYSTEM FOR STAGING
`DATA INTO CACHE
`
(75) Inventors: Brent Cameron Beardsley; Michael
Thomas Benhase; Joseph Smith Hyde;
Thomas Charles Jarvis; Douglas A.
Martin; Robert Louis Morton, all of
Tucson, AZ (US)
`
`(73) Assignee: International Business Machines
`Corporation, Armonk, NY (US)
`
(*) Notice: Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 0 days.
`
`(21) Appl. No.: 09/136,626
`
(22) Filed: Aug. 19, 1998
`
(51) Int. Cl.7 ................................................ G06F 12/12
(52) U.S. Cl. ........................ 711/137; 711/136; 711/113; 711/160
(58) Field of Search ................................. 711/113, 137, 136, 112, 160, 118
`
(56) References Cited
`
`U.S. PATENT DOCUMENTS
`
4,437,155 A 3/1984 Sawyer et al.
4,458,316 A 7/1984 Fry et al.
4,467,421 A * 8/1984 White ........................ 711/118
4,468,730 A 8/1984 Dodd et al.
4,489,378 A 12/1984 Dixon et al.
4,490,782 A 12/1984 Dixon et al.
4,533,995 A 8/1985 Christian et al.
4,571,674 A * 2/1986 Hartung ...................... 711/160
4,574,346 A * 3/1986 Hartung ...................... 711/112
4,583,166 A 4/1986 Hartung et al.
4,603,382 A 7/1986 Cole et al.
4,636,946 A 1/1987 Hartung et al.
4,875,155 A 10/1989 Iskiyan et al.
4,882,642 A 11/1989 Tayler et al.
4,956,803 A 9/1990 Tayler et al.
4,979,108 A 12/1990 Crabbe, Jr.
`
`
5,134,563 A 7/1992 Tayler et al.
5,263,145 A 11/1993 Brady et al.
5,297,265 A 3/1994 Frank et al.
5,426,761 A 6/1995 Cord et al.
5,432,919 A 7/1995 Falcone et al.
5,432,932 A 7/1995 Chen et al.
5,434,992 A 7/1995 Mattson
`
`(List continued on next page.)
`
FOREIGN PATENT DOCUMENTS

EP 674263 3/1995
JP 6052060 2/1994
`
`OTHER PUBLICATIONS
`
Improving Most Recently Used Change Prefetching, IBM
`Technical Disclosure Bulletin, vol. 36, No. 08, Aug. 1993.
`Optimized Look-Ahead Extension on Sequential Access,
`IBM Technical Disclosure Bulletin, vol. 39, No. 11, Nov.
`1996.
`
`(List continued on next page.)
`
`Primary Examiner-Matthew Kim
`Assistant Examiner-B. R. Peugh
`(74) Attorney, Agent, or Firm-David W. Victor; Konrad
`Raynes Victor & Mann
`
(57) ABSTRACT
`
`Disclosed is a system for caching data. After determining a
`sequential access of a first memory area, such as a direct
`access storage device (DASD), a processing unit stages a
`group of data sets from the first memory area to a second
`memory, such as cache. The processing unit processes a data
`access request (DAR) for data sets in the first memory area
`that are included in the sequential access and reads the
`requested data sets from the second memory area. The
processing unit determines a trigger data set from a plurality
`of trigger data sets based on a trigger data set criteria. The
`processing unit then stages a next group of data sets from the
`first memory area to the second memory area in response to
`reading the determined trigger data set.
`
`21 Claims, 6 Drawing Sheets
`
[Representative front-page drawing: flowchart excerpt of the staging logic of FIGS. 2a-2b, staging from the first track in the DAR through the remainder of the stage group and, where indicated, the next stage group or the tracks in the intent count.]
`
`
`
`
U.S. PATENT DOCUMENTS

5,440,686 A 8/1995 Dahman et al.
5,440,727 A 8/1995 Bhide et al.
5,446,871 A 8/1995 Shomler et al.
5,481,691 A 1/1996 Day, III et al.
5,504,861 A 4/1996 Crockett et al.
5,551,003 A 8/1996 Mattson et al.
5,574,950 A 11/1996 Hathorn et al.
5,590,308 A 12/1996 Shih
5,592,618 A 1/1997 Micka et al.
5,606,688 A 2/1997 McNutt et al.
5,615,329 A 3/1997 Kern et al.
5,623,509 A 4/1997 Shomler
5,623,608 A * 4/1997 Ng ............................. 711/137
5,627,990 A 5/1997 Cord et al.
5,636,359 A 6/1997 Beardsley et al.
5,651,136 A 7/1997 Denton et al.
5,867,685 A * 2/1999 Fuld et al. .................. 711/113
6,049,850 A * 4/2000 Vishlitzky et al. .......... 711/136
OTHER PUBLICATIONS
`
`Direct Access Storage Device Cache Segment Management,
`IBM Technical Disclosure Bulletin, vol. 37, No. 08, Aug.
`1994.
`Cache System for Hard Disk System Utilizing the Access
`Data Address, IBM Technical Disclosure Bulletin, vol. 38,
`No. 01, Jan. 1995.
`Direct Memory Access Paging and Remote DRAM Access
`Through an Optimized Memory Mapping Mechanism, IBM
`Technical Disclosure Bulletin, vol. 38, No. 06, Jun. 1995.
`Non-Volatile Cache Storage Allocation Algorithm, IBM
`Technical Disclosure Bulletin, vol. 38, No. 12, Dec. 1995.
`Fixed Storage Allocation of Input-Output Buffers, IBM
`Technical Disclosure Bulletin, vol. 39, No. 03, Mar. 1996.
`Remote Copy Administrator's Guide and Reference,
`DFSMS/MVS Version 1, Third Edition, Jul. 1996, IBM Doc.
`No. SC35-0169-02.
`
Pinter and Yoaz; Tango: a Hardware-based Data Prefetching
Technique for Superscalar Processors; Proceedings of the
29th Annual IEEE/ACM International Symposium on
Microarchitecture MICRO-29, Dec. 2-4, 1996, Paris,
France.
Patterson, et al.; Informed Prefetching and Caching; Proceedings
of the 15th ACM Symposium on Operating Systems
Principles, Dec. 3-6, 1995, Copper Mountain Resort,
Colorado.
Tomkins et al.; Informed Multi-Process Prefetching and
Caching; Performance Evaluation Review Special Issue,
vol. 25, No. 1, Jun. 1997; 1997 ACM Sigmetrics International
Conference on Measurement and Modeling of Computer
Systems.
Patterson and Gibson; Exposing I/O Concurrency with
Informed Prefetching; Proceedings of the Third International
Conference on Parallel and Distributed Information
Systems, Sep. 28-30, 1994, Austin, Texas.
Shih, et al.; A File-Based Adaptive Prefetch Caching
Design; Proceedings, 1990 IEEE International Conference
on Computer Design: VLSI in Computers and Processors,
Sep. 17-19, 1990, Hyatt Regency Cambridge, Cambridge,
MA.
King, et al.; Management of a Remote Backup Copy for
Disaster Recovery; ACM Transactions on Database Systems,
vol. 16, No. 2, Jun. 1991, pp. 338-368.
U.S. Serial No. 09/149,052, filed Sep. 8, 1998 (Dkt.
#TU9-98-014 18.23).
U.S. Serial No. 09/136,630, filed Aug. 19, 1998 (Dkt.
#TU9-98-034 18.34).
U.S. Serial No. 09/135,842, filed Aug. 18, 1998 (Dkt.
#TU9-98-033 18.35).
`
`* cited by examiner
`
`
`
[Drawing sheets 1-6, Apr. 30, 2002: FIG. 1 — hardware environment with host systems 4a, b, c, channels 10a, b, c, storage controller 8, and DASD 6; FIGS. 2a-2b — flowcharts for setting up stage operations on receiving a DAR, testing whether the first track in the DAR lies beyond the trigger track of its stage group and whether the tracks in the intent count span a second stage group, then staging from the first track to the end of the stage group, the next stage group, or the remainder of the intent count; FIG. 3 — flowchart for processing a DAR track by track, waiting for tracks to be staged, returning data to the host, determining the RICP, and prestaging the next stage group; FIG. 4 — layout of tracks on DASD 6 in Stage Groups 1-3 (approximately tracks 0-29, 30-59, and 60-89) staged into cache 12; FIGS. 5-6 — flowcharts for processing track requests and track demotions, incrementing the num_reads, busy_for_stg, and prestg_no_acc counters and calling Update SGS; FIG. 7 — flowchart for updating the stage group size (increasing or decreasing it) and aging the counters by dividing busy_for_stg and prestg_no_acc by C and resetting num_reads.]
`
`
`METHOD AND SYSTEM FOR STAGING
`DATA INTO CACHE
`
`BACKGROUND OF THE INVENTION
`
1. Field of the Invention
The present invention relates to a method and system for
staging data from a memory device to a cache and, in
particular, to staging data in anticipation of data access
requests.
2. Description of the Related Art
In current storage systems, a storage controller manages
data transfer operations between a direct access storage
device (DASD), which may be a string of hard disk drives
or other non-volatile storage devices, and host systems and
their application programs. To execute a read operation
presented from a host system, i.e., a data access request
(DAR), the storage controller must physically access the
data stored in tracks in the DASD. A DAR is a set of
contiguous data sets, such as tracks, records, fixed blocks, or
any other grouping of data. The process of physically
rotating the disk in the DASD to the requested track and then
physically moving the reading unit to the disk section to read
data is often time consuming. For this reason, current
systems stage data into a cache memory of the storage
controller in advance of the host requests for such data.
A storage controller typically includes a large buffer
managed as a cache to buffer data accessed from the attached
DASD. In this way, data access requests (DARs) can be
serviced at electronic speeds directly from the cache, thereby
avoiding the electromechanical delays associated with reading
the data from the DASD. Prior to receiving the actual
DAR, the storage controller receives information on a
sequence of tracks in the DASD involved in the upcoming
read operation or that data is being accessed sequentially.
The storage controller will then proceed to stage the sequential
tracks into the cache. The storage controller would then
process DARs by accessing the staged data in the cache. In
this way, the storage controller can return cached data to a
read request at the data transfer speed in the storage controller
channels, as opposed to non-cached data, which is
transferred at the speed of the DASD device.
In prior art staging systems, a fixed number of tracks may
be staged. In such fixed track staging systems, often one of
the tracks being staged is already in cache. Other systems
provide for staging data that will be subject to a sequential
read operation, otherwise known as prestaging data. An
example of a sequential prestaging routine to stage multiple
tracks is described in U.S. Pat. No. 5,426,761, entitled
"Cache and Sequential Staging and Method," which patent
is assigned to International Business Machines Corporation
("IBM") and which is incorporated herein by reference in its
entirety. U.S. Pat. No. 5,426,761 discloses an algorithm for
sequential read operations to prestage multiple tracks into
cache when the tracks are in the extent range of the sequential
read operation, the track is not already in the cache, and
a maximum number of tracks have not been prestaged into
cache.
Tracks staged into cache may be demoted according to a
Least Recently Used (LRU) algorithm to ensure that the
staged data in cache does not exceed a predetermined
threshold. Current IBM storage controllers that stage data
into cache are described in the IBM publications "IBM 3990/
9390 Storage Control Reference," IBM publication no.
GA32-0274-04 (IBM Copyright, 1994, 1996), which publication
is incorporated herein by reference, and "Storage
Subsystem Library: IBM 3990 Storage Control Reference
(Models 1, 2, and 3)," IBM document no. GA32-0099-06
(IBM Copyright 1988, 1994).
One problem with current staging systems is the inability
to adjust the rate of staging and/or the number of tracks
staged to accommodate variances in DARs. If the number of
tracks staged is not carefully controlled, then too many or
too few tracks will be staged into cache. Staging too few
tracks will delay responding to DARs because the response
to the DAR will have to wait while the requested data is
staged in from DASD. On the other hand, staging too much
data into cache in advance of when the data is needed will
result in wasted cache space. Further, if staged data is not
accessed for a considerable period of time, then the staged
data may be demoted according to a LRU algorithm. In such
case, the demoted staged data will not be available for the
anticipated DAR.
SUMMARY OF THE PREFERRED
EMBODIMENTS
To overcome the limitations in the prior art described
above, preferred embodiments disclose a system for caching
data. After determining a sequential access of a first memory
area, a processing unit stages a group of data sets from the
first memory area to a second memory. The processing unit
processes a data access request (DAR) for data sets in the
first memory area that are included in the sequential access
and reads the requested data sets from the second memory
area. The processing unit determines a trigger data set from
a plurality of trigger data sets based on a trigger data set
criteria. The processing unit then stages a next group of data
sets from the first memory area to the second memory area
in response to reading the determined trigger data set.
In further embodiments, the processing unit receives a
data access request (DAR), information indicating a range of
data to be accessed, and a first data set number indicating a
first data set of the data sets. The processing unit stages a
group of data sets from the first memory area to a second
memory area in response to processing the information. The
processing unit then processes a data access request (DAR)
for data sets in the first memory area that are in the range.
Upon reading a data set indicated as a trigger data set, the
processing unit stages a next group of data sets from the first
memory area to the second memory area in response to
reading the trigger data set.
In still further embodiments, the number of data sets in the
group of data sets may be adjusted. The number of data sets
in the group is decreased upon determining that a number of
staged data sets demoted from the second memory area that
have not been subject to a DAR exceeds a first predetermined
threshold. The number of data sets in the group is increased
upon determining that a number of times requested data in
a processed DAR are not staged to the second memory area
exceeds a second predetermined threshold.
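The adjustment rule described above can be sketched as follows. This is an illustrative model only, not the patent's implementation; the function and parameter names, thresholds, and size bounds are all hypothetical:

```python
def adjust_stage_group_size(size, demoted_unread, misses,
                            demote_threshold, miss_threshold,
                            min_size=1, max_size=64):
    """Resize the stage group per the rule described above.

    demoted_unread: staged data sets demoted from the second memory
        area without ever being read (staging too far ahead, so the
        group shrinks).
    misses: DARs whose requested data was not yet staged (staging
        too little, so the group grows).
    Thresholds and size bounds are hypothetical values.
    """
    if demoted_unread > demote_threshold:
        size = max(min_size, size - 1)   # too much prestaged: shrink
    elif misses > miss_threshold:
        size = min(max_size, size + 1)   # too many misses: grow
    return size
```

A controller would call this periodically with its accumulated counters and reset them afterward.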
Preferred embodiments seek to balance two competing
goals. The first goal concerns providing sufficient tracks in
cache to continuously make data available to DARs from
host systems and avoid a cache miss. A cache miss occurs
when the cache does not include the data requested by the
host. The second goal concerns conserving available
memory space in cache. To accomplish this second goal, too
much data cannot be staged into cache in advance of when
the DAR will be received. Otherwise, the staged data will
remain unaccessed in cache and consume cache space which
could otherwise be made available for other operations.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`Referring now to the drawings in which like reference
`numbers represent corresponding parts throughout:
`
`
`
FIG. 1 is a block diagram illustrating a software and
hardware environment in which preferred embodiments of
the present invention are implemented;
FIGS. 2a, b illustrate logic to set-up stage operations in
response to receiving a data access request (DAR) in accordance
with preferred embodiments of the present invention;
FIG. 3 illustrates logic to process the DAR in accordance
with preferred embodiments of the present invention;
FIG. 4 illustrates how tracks in the DASD are arranged in
stage groups in accordance with preferred embodiments of
the present invention;
FIG. 5 illustrates logic to process a DAR request to check
if a sufficient level of data is being staged in advance of
sequential DARs in accordance with preferred embodiments
of the present invention;
FIG. 6 illustrates logic to process a demotion of a track to
determine if too many tracks are being staged in advance of
the sequential DARs in accordance with preferred embodiments
of the present invention; and
FIG. 7 illustrates logic to alter the stage group size to
optimize the number of tracks being staged in accordance
with preferred embodiments of the present invention.
`
DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENTS

In the following description, reference is made to the
accompanying drawings which form a part hereof, and in
which are shown, by way of illustration, several embodiments
of the present invention. It is understood that other embodiments
may be utilized and structural changes may be made
without departing from the scope of the present invention.

Hardware and Software Environment
FIG. 1 illustrates a hardware environment in which preferred
embodiments are implemented. A plurality of host
systems 4a, b, c are in data communication with a DASD 6
via a storage controller 8. The host systems 4a, b, c may be
any host system known in the art, such as a mainframe
computer, workstations, etc., including an operating system
such as WINDOWS®, AIX®, UNIX®, MVS™, etc. AIX is
a registered trademark of IBM; MVS is a trademark of IBM;
WINDOWS is a registered trademark of Microsoft Corporation;
and UNIX is a registered trademark licensed by the
X/Open Company LTD. A plurality of channel paths 10a, b,
c in the host systems 4a, b, c provide communication paths
to the storage controller 8. The storage controller 8 and host
systems 4a, b, c may communicate via any network or
communication system known in the art, such as LAN,
TCP/IP, ESCON®, SAN, SNA, Fibre Channel, SCSI, etc.
ESCON is a registered trademark of International Business
Machines Corporation ("IBM"). The host system 4a, b, c
executes commands and receives returned data along a
selected channel 10a, b, c. The storage controller 8 issues
commands to physically position the electromechanical
devices to read the DASD 6. In preferred embodiments, the
structure of the storage controller 8 and interface between
the storage controller 8 and host system may include aspects
of the storage controller architecture described in the following
U.S. patent applications assigned to IBM: "Failover
System for a Multiprocessor Storage Controller," by Brent
C. Beardsley, Matthew J. Kalos, Ronald R. Knowlden, Ser.
No. 09/026,622, filed on Feb. 20, 1998; and "Failover and
Failback System for a Direct Access Storage Device," by
Brent C. Beardsley and Michael T. Benhase, Ser. No.
08/988,887, filed on Dec. 11, 1997, both of which applications
are incorporated herein by reference in their entirety.

The storage controller further includes a cache 12. In
preferred embodiments, the cache 12 is implemented in a
high speed, volatile storage area within the storage controller
8, such as a DRAM, RAM, etc. The length of time since
the last use of a record in cache 12 is maintained to
determine the frequency of use of cache 12. The least
recently used (LRU) data in cache 12 is demoted according
to LRU or other cache memory management techniques
known in the art. In alternative embodiments, the cache 12
may be implemented in alternative storage areas accessible
to the storage controller 8. Data can be transferred between
the channels 10a, b, c and the cache 12, between the
channels 10a, b, c and the DASD 6, and between the DASD
6 and the cache 12. In alternative embodiments with
branching, data retrieved from the DASD 6 in response to a
read miss can be concurrently transferred to both the channel
10a, b, c and the cache 12, and a data write can be
concurrently transferred from the channel 10a, b, c to both
a non-volatile storage unit and cache 12.

Preferred embodiments may use different input/output
(I/O) interfaces to transfer data among the host systems 4a,
b, c, the storage controller 8, and the DASD 6. In the IBM
ESA/390 system, which utilizes the IBM mainframe system,
an I/O operation operates through the execution of a series
of linked channel command words (CCW). The CCW
designates the storage area associated with the operation, the
action to be taken whenever transfer to or from the area is
completed, and other options. In the IBM mainframe
system, a Locate Record channel command is used to
specify the operation, location and number of consecutive
records or tracks to process. The Locate Record command
includes a count parameter that specifies the number of
records or tracks that will be accessed in the data transfer
commands that follow the Locate Record command. Also
included is a field including the seek address (cylinder and
head, CCHH), which is the track where data transfer operations
begin. The actual I/O operations follow the Locate
Record command.

The Locate Record command is sent with a Define Extent
command that specifies the first and last addressable tracks
in the extent. The Define Extent command further includes
a field defining attributes and limitations on the associated
I/O operation. For instance, the Define Extent command
includes a field indicating whether associated I/O operations
are part of a sequential access operation. Details of the
Locate Record and Define Extent commands are described
in the IBM publication "IBM 3990 Storage Controller
Reference," IBM document no. GA32-0099-06, which publication
was incorporated herein by reference above.

Alternatively, preferred embodiments may be implemented
using the Small Computer System Interface (SCSI).
In SCSI, a request to transfer data is accomplished with
commands such as PRE-FETCH, READ(6), READ(10), and
READ LONG. All these commands request a device server,
e.g., the storage controller 8, to transfer data from a device,
e.g., the DASD 6, to a client, e.g., the host systems 4a, b, c.
These commands include logical block address and transfer
length fields. The logical block address provides the
address of the first fixed block address in the data request.
The transfer length specifies the number of contiguous
logical blocks of data to be transferred starting from the
logical block address. SCSI commands are described in
"Information Technology-SCSI-3 Block Commands
(SBC)," published by ANSI, reference no. ANSI NCITS
306-199x (Rev. 8c, November 1997), which publication is
incorporated herein by reference in its entirety.
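The SCSI read commands discussed above carry a logical block address and a transfer length. As a minimal sketch, assuming the standard READ(10) command descriptor block layout from the SCSI block commands standard (opcode 0x28 in byte 0, a 4-byte big-endian logical block address in bytes 2-5, and a 2-byte big-endian transfer length in bytes 7-8) rather than anything specific to this patent:

```python
def parse_read10(cdb: bytes):
    """Extract the LBA and transfer length from a READ(10) CDB.

    Assumes the standard layout: byte 0 is the opcode (0x28),
    bytes 2-5 hold the big-endian logical block address, and
    bytes 7-8 hold the big-endian transfer length in blocks.
    """
    assert len(cdb) == 10 and cdb[0] == 0x28
    lba = int.from_bytes(cdb[2:6], "big")
    transfer_length = int.from_bytes(cdb[7:9], "big")
    return lba, transfer_length

# e.g. a request for 64 blocks starting at LBA 4096:
cdb = bytes([0x28, 0, 0x00, 0x00, 0x10, 0x00, 0, 0x00, 0x40, 0])
# parse_read10(cdb) -> (4096, 64)
```

The transfer length recovered here is what the specification later calls the intent count for a SCSI request.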
`
`Staging Data Into Cache
`In preferred embodiments, data is staged from the DASD
`6 into the cache 12 when the storage controller 8 is informed
`
`
`
or detects an upcoming series of sequential data access
requests. In this way, the storage controller 8 can return data
to the read request from the faster cache 12 as opposed to
physically accessing the DASD 6 to access the requested
data. The read command may inform the storage controller
8 that the following read operation is part of a sequential
operation comprising consecutive tracks to read. For
instance, in the IBM mainframe environment, the Define
Extent command indicates whether the following I/O operations
are part of a sequential access. Alternatively, the
storage controller 8 may utilize a predictive buffer memory
management scheme to detect whether a sequence of data
transfer operations are part of a sequential read command.
Such predictive buffer memory management schemes are
described in U.S. Pat. No. 5,623,608, entitled "Method and
Apparatus for Adaptive Circular Predictive Buffer
Management," assigned to IBM, and which patent is incorporated
herein by reference in its entirety. These predictive
memory buffer schemes may be used to detect sequential
access for SCSI or mainframe type requests when the DARs
do not specifically indicate whether the DAR is part of an
ongoing sequential access.
A data access includes both sequential and non-sequential
data access operations. In a data access that is not part of a
sequential access operation, the storage controller 8 is provided
a specific range of tracks, i.e., the intent count, to
access. For instance, the Locate Record command and SCSI
read command provide a count or transfer length field
indicating blocks of data to access. The storage controller 8
will not access tracks beyond the range specified by the
intent count. Alternatively, the storage controller 8 may
receive a command indicating that data will be requested
sequentially. For such sequential access operations where
data will be accessed sequentially over a period of time, the
storage controller 8 will prestage data until the storage
controller 8 receives indication from another command or
from predictive buffer management techniques that sequential
access has ended.
In preferred embodiments, an "intent count" is determined
from the information provided in the DAR indicating
the number of units of blocks, records or tracks to transfer.
In the IBM S/390 environment, the intent count is determined
from a count parameter in the Locate Record command
which specifies the number of records or tracks in the data
transfer command. In SCSI, the intent count may be determined
from the transfer length field indicating the
number of contiguous logical blocks to transfer.
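The intent count described above is simply the count carried by the request itself. A minimal sketch, using hypothetical request structures and field names rather than the patent's actual data formats:

```python
def intent_count(request):
    """Derive the intent count from a DAR.

    For a Locate Record-style request, the count parameter gives the
    number of records or tracks to transfer; for a SCSI-style request,
    the transfer length gives the number of contiguous logical blocks.
    The dictionary keys here are hypothetical.
    """
    if request["type"] == "locate_record":
        return request["count"]            # records or tracks
    if request["type"] == "scsi_read":
        return request["transfer_length"]  # contiguous logical blocks
    raise ValueError("unknown request type")
```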
Data is staged from the DASD 6 to cache 12 in "stage
groups." A "stage group" is a number of sequential tracks,
fixed blocks, stripes in a RAID system, or any other data
grouping, that are staged at a time to cache 12 from the
DASD 6. The size of the stage group can vary. In the RAID
environment, data is striped across multiple disks. For
instance, in a 4+P RAID, four disks would store data blocks
in a stripe unit and a fifth disk would store parity for the four
data blocks. In this way a stripe includes all data blocks,
including the parity data, for a parity group. In further
embodiments, the DASD 6 may store the data in a format
such as Count Key Data (CKD), and the storage controller
may receive a request in fixed blocks. In such case, the
storage controller would translate the fixed block request to
the CKD tracks in the DASD 6 to stage data into the cache
12.
The storage controller 8 transfers data from the cache 12
across the channels 10a, b, c to the host systems 4a, b, c. In
a sequential data access request, after transferring the data
across the channel 10a, b, c, the storage controller 8 would
stage the next stage group in anticipation of the data to
access. In this way, the storage controller 8 does not have to
access data directly from the DASD 6 and instead may
access the data for a DAR from the cache 12. In the SCSI
environment, if the storage controller receives a READ
command having a transfer length of 64 blocks, wherein
each block is 512 bytes, the storage controller 8 can stage
eight blocks at a time into the cache 12. While the storage
controller 8 is transferring an eight block group across the
channels 10a, b, c to the host system 4a, b, c, the storage
controller 8 would stage the next eight blocks into cache 12.
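The group-at-a-time behavior in the preceding paragraph can be sketched as a simple loop over stage groups. This is a simplified sequential model with hypothetical names; a real controller would overlap staging of group n+1 with the transfer of group n over concurrent hardware paths:

```python
def serve_sequential_read(dasd, start, total, group=8):
    """Serve a read of `total` blocks in stage groups of `group` blocks.

    `dasd` is a list standing in for the DASD. Each stage group is
    copied into a cache list first, then "transferred" to the host
    from the cache rather than directly from the DASD.
    """
    cache, host_data = [], []
    for off in range(0, total, group):
        n = min(group, total - off)
        staged = dasd[start + off : start + off + n]
        cache.extend(staged)                    # stage group into cache
        host_data.extend(cache[-len(staged):])  # transfer from cache
    return host_data
```

For a 64-block request served in groups of eight, the loop stages and transfers eight groups, mirroring the example in the text.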
The storage controller 8 also maintains a "trigger track,"
which is an offset track within a stage group that, when
processed, directs the storage controller 8 to stage the next
stage group into cache 12. When the storage controller 8
reads the trigger track when processing a DAR, the storage
controller 8 then stages the next stage group into cache 12.
In this way, the storage controller 8 limits the amount of data
staged in cache 12 by controlling the number of stage groups
staged at a time and controlling when to stage by designating
a trigger track. The trigger track must be selected to balance
two competing goals: (1) stage sufficiently ahead, i.e., earlier
in the stage group, so that there is sufficient data to accommodate
subsequent DARs versus (2) delay staging to minimize
cache 12 usage. To balance these goals, the storage
controller 8 may maintain different trigger tracks for different
types of hosts and for sequential versus non-sequential
data access requests. To provide a slower stage process to
accommodate slower transfers of data from the cache 12 to
the channels 10a, b, c, the trigger track would be located
closer to the end of the stage group, versus a trigger track
located closer to the beginning of the stage group when the
transfer speed is high.
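The trigger-track mechanism above amounts to: a read that lands on the trigger offset kicks off staging of the next group. A minimal sketch with hypothetical helper names, not the patent's implementation:

```python
def read_track(track, group_start, group_size, trigger_offset, stage_group):
    """Process a read of `track`; fire the next-group stage at the trigger.

    `stage_group(start)` is a hypothetical callback that stages the
    stage group beginning at `start` from the DASD into cache.
    Returns True if reading this track triggered the next stage.
    """
    if track == group_start + trigger_offset:
        stage_group(group_start + group_size)  # prestage the next group
        return True
    return False
```

Moving `trigger_offset` toward `group_size` delays the next stage (conserving cache), while moving it toward zero stages earlier (avoiding misses), which is exactly the trade-off the text describes.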
There are two factors that affect how quickly data is
accessed from cache. The first factor concerns the speed of
the channel 10a, b, c and the second factor concerns the type
of access operation, e.g., sequential or non-sequential. With
respect to the first factor, with hosts 4a, b, c having slower
channels 10a, b, c, data transfer from the cache 12 to the host
takes longer and data remains staged in cache for a longer
period of time. To accommodate slower hosts, the storage
controller 8 can locate the trigger track closer to the end of
the stage group, e.g., ten tracks from the end of the stage
group, to delay staging of the next stage group. On the other
hand, to accommodate hosts 4a, b, c having faster channel
10a, b, c data transfer rates, the storage controller 8 would
locate the trigger track closer to the beginning of the stage
group, e.g., an offset of 15 tracks, to stage the data into cache
12 earlier because the DARs are more frequent, thus requiring
more data readily available in the cache 12.
With respect to the second factor, sequential data accesses
are often accessed less frequently than non-sequentially
accessed data. Sequential requests are often made when the
host 4a, b, c is requesting blocks of data to set up for an
operation. In such case, the host 4a, b, c may not access the
data immediately. On the other hand, non-sequential
accesses are often for application access requests that need
the entire block of requested data immediately in order to
proceed. Thus, for sequential data accesses, the storage
controller 8 can locate the trigger track closer to the end of
the stage group to slow the stage rate. The stage rate may be
slowed to accommodate the less frequent data accesses that
occur during sequential accesses. Alternatively, for non-sequential
accesses, the storage controller 8 can locate the
trigger track closer to the beginning of the stage group to
stage data sooner because with non-sequential accesses, the
data accesses are more frequent.
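The two placement heuristics above (slow channel or sequential access: trigger late in the group; fast channel or non-sequential access: trigger early) can be condensed into a small selection function. The specific offsets echo the examples in the text (offset 15, ten tracks from the end), but the combination rule and function itself are illustrative, not the patent's algorithm:

```python
def trigger_offset(group_size, fast_channel, sequential):
    """Pick the trigger track offset within a stage group.

    Fast channels and non-sequential access consume staged data
    quickly, so trigger near the start of the group; slow channels
    and sequential access tolerate delayed staging, so trigger near
    the end. Offsets are illustrative values from the text.
    """
    if fast_channel and not sequential:
        return 15                 # early: offset 15 into the group
    return group_size - 10        # late: ten tracks from the end
```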
`
`
`
The actual stage group size and trigger track locations
may be determined empirically by performing test runs
under different scenarios with different cache, storage
controller, host, and DASD systems to select optimal trigger
track and stage group sizes for a particular system architecture.
`The storage controller 8 may also provide a "remaining
`intent count to stage" (RICP) value indicating the number of
`tracks, records or data blocks that remain to be staged for the
`intent count access range. When processing the trigger track,
`the storage controller 8 processes the RICP to determine
`whether the number of tracks remaining to be staged is less
`than the tracks that will be staged if the next stage group is
`staged. The storage controller 8 would only stage the next
`stage group to the extent that such a stage is needed to satisfy
`the intent count.
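The RICP check described above simply caps the next stage at whatever remains of the intent count. As a one-line sketch with hypothetical names:

```python
def tracks_to_stage(ricp, group_size):
    """Return how many tracks to stage for the next stage group.

    ricp: remaining intent count to stage. The controller never
    stages past the intent count, so the next stage is the smaller
    of the stage group size and what remains.
    """
    return min(ricp, group_size)
```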
`In preferred embodiments, each track staged