`that resources can be used at 100% of their maximum rate without side effects
`from interference. A later example takes a more realistic view.
`
`Example
`
`Given the following performance and cost information:
`
`a 50-MIPS CPU costing $50,000
`
`an 8-byte-wide memory with a 200-ns cycle time
`
an 80-MB/sec I/O bus with room for 20 SCSI buses and controllers
`
`SCSI buses that can transfer 4 MB/sec and support up to 7 disks per bus
`(these are also called SCSI strings)
`
a $2,500 SCSI controller that adds 2 milliseconds (ms) of overhead to perform a disk I/O
`
an operating system that uses 10,000 CPU instructions for a disk I/O
`
`a choice of a large disk containing 4 GB or a small disk containing 1 GB,
`each costing $3 per MB
`
both disks rotate at 3600 RPM, have a 12-ms average seek time, and can transfer 2 MB/sec
`
`the storage capacity must be 100 GB, and
`
the average I/O size is 8 KB
`
Evaluate the cost per I/O per second (IOPS) of using small or large drives.
Assume that every disk I/O requires an average seek and average rotational
delay. Use the optimistic assumption that all devices can be used at 100% of
capacity and that the workload is evenly divided between all disks.
`
`Answer
`
I/O performance is limited by the weakest link in the chain, so we evaluate the maximum performance of each link in the I/O chain for each organization to determine the maximum performance of that organization.
Let's start by calculating the maximum number of IOPS for the CPU, main memory, and I/O bus. The CPU I/O performance is determined by the speed of the CPU and the number of instructions to perform a disk I/O:
`
Maximum IOPS for CPU = 50 MIPS / 10,000 instructions per I/O = 5000
`
The maximum performance of the memory system is determined by the memory cycle time, the width of the memory, and the size of the I/O transfers:
`
Maximum IOPS for main memory = ((1/200 ns) × 8 bytes) / 8 KB per I/O ≈ 5000
`
`
The I/O bus maximum performance is limited by the bus bandwidth and the size of the I/O:
`
Maximum IOPS for the I/O bus = 80 MB/sec / 8 KB per I/O ≈ 10,000
`
Thus, no matter which disk is selected, the CPU and main memory limit the maximum performance to no more than 5000 IOPS.
Now it's time to look at the performance of the next link in the I/O chain, the SCSI controllers. The time to transfer 8 KB over the SCSI bus is
`
SCSI bus transfer time = 8 KB / 4 MB/sec = 2 ms
Adding the 2-ms SCSI controller overhead means 4 ms per I/O, making the maximum rate per controller
`
Maximum IOPS per SCSI controller = 1 / 4 ms = 250 IOPS
`
`All the organizations will use several controllers, so 250 IOPS is not the limit for
`the whole system.
The final link in the chain is the disks themselves. The time for an average disk I/O is

I/O time = 12 ms + 0.5 / 3600 RPM + 8 KB / 2 MB/sec = 12 + 8.3 + 4 = 24.3 ms
`
`so the disk performance is
`
Maximum IOPS (using average seeks) per disk = 1 / 24.3 ms ≈ 41 IOPS
`
`The number of disks in each organization depends on the size of each disk: 100
`GB can be either 25 4-GB disks or 100 1-GB disks. The maximum number of
`I/Os for all the disks is:
Maximum IOPS for 25 4-GB disks = 25 × 41 = 1025
Maximum IOPS for 100 1-GB disks = 100 × 41 = 4100
`
`Thus, provided there are enough SCSI strings, the disks become the new limit to
`maximum performance: 1025 IOPS for the 4-GB disks and 4100 for the 1-GB
`disks.
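
To make the chain of limits easy to check, here is a small Python sketch (ours, not part of the original example) that recomputes each link's maximum IOPS from the parameters given above; the variable names are invented for illustration.

# Sketch: recompute the maximum IOPS of each link in the I/O chain
# from the parameters given in the example. Names are illustrative.
MIPS = 50e6                  # 50-MIPS CPU
INSTR_PER_IO = 10_000        # operating-system instructions per disk I/O
MEM_CYCLE_S = 200e-9         # 200-ns memory cycle time
MEM_WIDTH_BYTES = 8          # 8-byte-wide memory
IO_BUS_BW = 80e6             # 80-MB/sec I/O bus
IO_SIZE = 8_000              # the text treats an 8-KB I/O as 8000 bytes when dividing by MB/sec rates
SCSI_BW = 4e6                # 4 MB/sec per SCSI string
SCSI_OVERHEAD_S = 2e-3       # 2-ms controller overhead per I/O
SEEK_S = 12e-3               # 12-ms average seek
RPM = 3600
DISK_BW = 2e6                # 2-MB/sec disk transfer rate

cpu_iops = MIPS / INSTR_PER_IO                         # 5000
mem_iops = (MEM_WIDTH_BYTES / MEM_CYCLE_S) / IO_SIZE   # 5000
bus_iops = IO_BUS_BW / IO_SIZE                         # 10,000
scsi_iops = 1 / (IO_SIZE / SCSI_BW + SCSI_OVERHEAD_S)  # 2 ms transfer + 2 ms overhead = 250
rotation_s = 0.5 * 60 / RPM                            # half a rotation = 8.3 ms
disk_iops = 1 / (SEEK_S + rotation_s + IO_SIZE / DISK_BW)  # 1 / 24.3 ms, about 41
print(cpu_iops, mem_iops, bus_iops, scsi_iops, disk_iops)
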
While we have determined the performance of each link of the I/O chain, we still have to determine how many SCSI buses and controllers to use and how many disks to connect to each controller, as this may further limit maximum performance. The I/O bus is limited to 20 SCSI controllers, and the SCSI
standard limits disks to 7 per SCSI string. The minimum number of SCSI strings for the 4-GB disks is

Minimum number of SCSI strings for 25 4-GB disks = 25/7 or 4

and for the 1-GB disks

Minimum number of SCSI strings for 100 1-GB disks = 100/7 or 15
`
`We can calculate the maximum IOPS for each configuration:
`Maximum IOPS for 4 SCSI strings = 4 * 250 = 1000 IOPS
`Maximum IOPS for 15 SCSI strings = 15 * 250 = 3750 IOPS
`
The maximum performance of this number of controllers is slightly lower than the disk I/O throughput, so let's also calculate the number of controllers so they don't become a bottleneck. One way is to find the number of disks they can support per string:

Number of disks per SCSI string at full bandwidth = 250/41 = 6.1 or 6
`
`and then calculate the number of strings:
`
Number of SCSI strings for full bandwidth 4-GB disks = 25/6 = 4.2 or 5

Number of SCSI strings for full bandwidth 1-GB disks = 100/6 = 16.7 or 17
`
`This establishes the performance of four organizations: 25 4-GB disks with 4
`or 5 SCSI strings and 100 1-GB disks with 15 to 17 SCSI strings. The maximum
performance of each option is limited by the bottleneck, the smallest term in each list:
`
`4-GB disks, 4 strings = Min(5000,5000,10000,1025,1000) = 1000 IOPS
`
`4-GB disks, 5 strings = Min(5000,5000,10000,1025,1250) = 1025 IOPS
`
`1-GB disks, 15 strings = Min(5000,5000,10000,4100,3750) = 3750 IOPS
`1-GB disks, 17 strings = Min(5000,5000,10000,4100,4250) = 4100 IOPS
`
`We can now calculate the cost for each organization:
`
`
4-GB disks, 4 strings = $50,000 + 4 × $2,500 + 25 × (4096 × $3) = $367,200

4-GB disks, 5 strings = $50,000 + 5 × $2,500 + 25 × (4096 × $3) = $369,700

1-GB disks, 15 strings = $50,000 + 15 × $2,500 + 100 × (1024 × $3) = $394,700

1-GB disks, 17 strings = $50,000 + 17 × $2,500 + 100 × (1024 × $3) = $399,700
`
`Finally, the cost per IOPS for each of the four configurations is $367, $361,
`$105, and $97, respectively. Calculating maximum number of average I/Os per
`second assuming 100% utilization of the critical resources, the best
`cost/performance is the organization with the small disks and the largest number
`of controllers. The small disks have 3.4 to 3.8 times better cost/performance than
`the large disks in this example. The only drawback is that the larger number of
`disks will affect system availability unless some form of redundancy is added
`(see pages 520-521).
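
The bottleneck and cost arithmetic above reduces to a few lines of code. The following sketch (ours) reuses the per-link limits derived earlier to reproduce the IOPS, cost, and cost-per-IOPS figures for the four organizations.

# Sketch: bottleneck IOPS and cost per IOPS for the four organizations,
# using the per-link limits derived above.
CPU_IOPS, MEM_IOPS, BUS_IOPS = 5000, 5000, 10_000
DISK_IOPS, STRING_IOPS = 41, 250
CPU_COST, STRING_COST, COST_PER_MB = 50_000, 2_500, 3

organizations = [      # (number of disks, disk size in MB, SCSI strings)
    (25, 4096, 4), (25, 4096, 5), (100, 1024, 15), (100, 1024, 17),
]
for disks, size_mb, strings in organizations:
    iops = min(CPU_IOPS, MEM_IOPS, BUS_IOPS, disks * DISK_IOPS, strings * STRING_IOPS)
    cost = CPU_COST + strings * STRING_COST + disks * size_mb * COST_PER_MB
    print(f"{disks} disks, {strings} strings: {iops} IOPS, ${cost:,}, ${cost / iops:.0f} per IOPS")
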
`
The above example assumed that resources can be used at 100% of their capacity. It is instructive to see what the bottleneck is in each organization.
`
`Example
`
`For the organizations in the last example, calculate the percentage of utilization
`of each resource in the computer system.
`
`Answer
`
`Figure 9.31 gives the answer.
`
Resource      4-GB disks,   4-GB disks,   1-GB disks,   1-GB disks,
              4 strings     5 strings     15 strings    17 strings
CPU           20%           21%           75%           82%
Memory        20%           21%           75%           82%
I/O bus       10%           10%           38%           41%
SCSI buses    100%          82%           100%          96%
Disks         98%           100%          91%           100%
`
`FIGURE 9.31 The percentage of utilization of each resource given the four
`organizations in the previous example. Either the SCSI buses or the disks are the
`bottleneck.
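
Each entry in Figure 9.31 is simply the achieved IOPS divided by that resource's maximum; here is a short sketch (ours) that reproduces the table.

# Sketch: utilization = achieved IOPS / maximum IOPS of each resource.
shared_limits = {"CPU": 5000, "Memory": 5000, "I/O bus": 10_000}
configs = {  # achieved IOPS, SCSI-string limit, disk limit (from the previous example)
    "4-GB disks, 4 strings":  (1000, 1000, 1025),
    "4-GB disks, 5 strings":  (1025, 1250, 1025),
    "1-GB disks, 15 strings": (3750, 3750, 4100),
    "1-GB disks, 17 strings": (4100, 4250, 4100),
}
for name, (iops, scsi_max, disk_max) in configs.items():
    util = {res: iops / lim for res, lim in shared_limits.items()}
    util["SCSI buses"] = iops / scsi_max
    util["Disks"] = iops / disk_max
    print(name, {res: f"{u:.0%}" for res, u in util.items()})
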
`
In reality, buses cannot deliver close to 100% of their bandwidth without a severe increase in latency and reduction in throughput due to contention. A variety of rules of thumb have evolved to guide I/O design:
`
`
No I/O bus should be utilized more than 75% to 80%;
`
`No disk string should be utilized more than 40%;
`
`No disk arm should be seeking more than 60% of the time.
`
`Example
`
`Recalculate performance in the example above using these rules of thumb, and
`show the utilization of each component. Are there other organizations that
`follow these guidelines and improve performance?
`
`Answer
`
Figure 9.31 shows that the I/O bus is far below the suggested guidelines, so we concentrate on the utilization of seek time and the SCSI buses. The utilization of seek time per disk is
`
Time of average seek / Time between I/Os = 12 ms / (1 / 41 IOPS) = 12/24 = 50%
`
`which is below the rule of thumb. The biggest impact is on the SCSI bus:
`
Suggested IOPS per SCSI string = 1 / 4 ms × 40% = 100 IOPS.
`
`With this data we can recalculate IOPS for each organization:
`
`4-GB disks, 4 strings = Min(5000,5000,7500,1025,400) = 400 IOPS
`
`4-GB disks, 5 strings = Min(5000,5000,7500,1025,500) = 500 IOPS
`
`1-GB disks, 15 strings = Min(5000,5000,7500,4100,1500) = 1500 IOPS
`
`1-GB disks, 17 strings = Min(5000,5000,7500,4100,1700) = 1700 IOPS
`
`Under these assumptions, the small disks have about 3.0 to 4.2 times the
`performance of the large disks.
`Clearly, the string bandwidth is the bottleneck now. The number of disks per
`string that would not exceed the guideline is
Number of disks per SCSI string at full bandwidth = 100/41 = 2.4 or 2
`
`and the ideal number of strings is
`
Number of SCSI strings for full bandwidth 4-GB disks = 25/2 = 12.5 or 13
`
Number of SCSI strings for full bandwidth 1-GB disks = 100/2 = 50
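
The recalculation under the rules of thumb follows the same pattern: derate the I/O bus to 75% and each SCSI string to 40% of its maximum, while the disk limit is unchanged because seek utilization (about 50%) is already under the 60% guideline. A sketch (ours):

# Sketch: bottleneck IOPS with the rules of thumb applied.
CPU_IOPS = MEM_IOPS = 5000
BUS_IOPS = int(10_000 * 0.75)     # 7500: I/O bus held to 75% utilization
STRING_IOPS = int(250 * 0.40)     # 100: each SCSI string held to 40% utilization
DISK_IOPS = 41                    # unchanged: seek time is ~50%, under the 60% guideline
for disks, strings in [(25, 4), (25, 5), (100, 15), (100, 17)]:
    iops = min(CPU_IOPS, MEM_IOPS, BUS_IOPS, disks * DISK_IOPS, strings * STRING_IOPS)
    print(f"{disks} disks, {strings} strings: {iops} IOPS")
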
`
9.9 Putting It All Together: The IBM 3990 Storage Subsystem
`
`The IBM 360/370 I/O architecture has evolved over a period of 25 years.
Initially, the I/O system was general purpose, and no special attention was paid to any particular device. As it became clear that magnetic disks were the chief consumers of I/O, the IBM 360 was tailored to support fast disk I/O. IBM's
`dominant philosophy is to choose latency over throughput whenever it makes a
`difference. IBM almost never uses a large buffer outside the CPU; their goal is
`to set up a clear path from main memory to the I/O device so that when a device
`is ready, nothing can get in the way. Perhaps IBM followed a corollary to the
`quote on page 526: you can buy bandwidth, but you need to design for latency.
`As a secondary philosophy, the CPU is unburdened as much as possible to allow
`the CPU to continue with computation while others perform the desired I/O
`activities.
`The example for this section is the high-end IBM 3090 CPU and the 3990
`Storage Subsystem. The IBM 3090, models 3090/100 to 3090/600, can contain
`one to six CPUs. This 18.5-ns-clock-cycle machine has a 16-way interleaved
`memory that can transfer eight bytes every clock cycle on each of two
(3090/100) or four (3090/600) buses. Each 3090 processor has a 64-KB, 4-way set-associative, write-back cache, and the cache supports pipelined access taking
`two cycles. Each CPU is rated about 30 IBM MIPS (see page 78), giving at
`most 180 MIPS to the IBM 3090/600. Surveys of IBM mainframe installations
`suggest a rule of thumb of about 4 GB of disk storage per MIPS of CPU power
`(see Section 9.12).
`It is only fair warning to say that IBM terminology may not be self-evident,
`although the ideas are not difficult. Remember that this I/O architecture has
`evolved since 1964. While there may well be ideas that IBM wouldn't include if
`they were to start anew, they are able to make this scheme work, and make it
`work well.
`
The 3990 I/O Subsystem Data-Transfer Hierarchy
`and Control Hierarchy
`
The I/O subsystem is divided into two hierarchies:
`
`1. Control-This hierarchy of controllers negotiates a path through a maze of
`possible connections between the memory and the I/O device and controls
`the timing of the transfer.
`
`2. Data-This hierarchy of connections is the path over which data flows
`between memory and the I/O device.
`
`After going over each of the hierarchies, we trace a disk read to help understand
`the function of each component.
`For simplicity, we begin by discussing the data-transfer hierarchy, shown in
Figure 9.33 (page 548). This figure shows one section of the hierarchy that contains up to 64 large IBM disks; using 64 of the recently announced IBM 3390
`disks, this piece could connect to over one trillion bytes of storage! Yet this
`
`
`96 channels to a 3090/600. Because they are "multiprogrammed," channels can
`actually service several disks. For historical reasons, IBM calls this block
`multiplexing.
Channels are connected to the 3090 main memory via two speed-matching buffers, which funnel all the channels into a single port to main memory. Such
`buffers simply match the bandwidth of the I/O device to the bandwidth of the
`memory system. There are two 8-byte buffers per channel.
`The next level down the data hierarchy is the storage director. This is an
intermediary device that allows the many channels to talk to many different I/O
`devices. Four to sixteen channels go to the storage director depending on the
`model, and two or four paths come out the bottom to the disks. These are called
`two-path strings or four-path strings in IBM parlance. Thus, each storage
`director can talk to any of the disks using one of the strings. At the top of each
`string is the head of string, and all communication between disks and control
`units must pass through it.
`At the bottom of the datapath hierarchy are the disk devices themselves. To
`increase availability, disk devices like the IBM 3380 provide four paths to
`connect to the storage director; if one path fails, the device can still be
`connected.
The redundant paths from main memory to the I/O device not only improve availability, but also can improve performance. Since the IBM philosophy is to avoid large buffers, the path from the I/O device to main memory must remain connected until the transfer is complete. If there were a single hierarchical path from devices to the speed-matching buffer, only one I/O device in a subtree
`could transfer at a time. Instead, the multiple paths allow multiple devices to
`transfer simultaneously through the storage director and into memory.
`The task of setting up the datapath connection is that of the control hierarchy.
Figure 9.34 shows both the control and data hierarchies of the 3990 I/O subsystem. The new device is the I/O processor. The 3090 channel controller and I/O processor are load/store machines similar to DLX, except that there is no
`memory hierarchy. In the next subsection we see how the two hierarchies work
`together to read a disk sector.
`
Tracing a Disk Read in the IBM 3990 I/O Subsystem
`
`The 12 steps below trace a sector read from an IBM 3380 disk. Each of the 12
`steps is labeled on a drawing of the full hierarchy in Figure 9.34 (page 550).
`
1. The user sets up a data structure in memory containing the operations that should occur during this I/O event. This data structure is termed an I/O control block, or IOCB, which also points to a list of channel command words (CCWs). This list is called a channel program. Normally, the operating system provides the channel program, but some users write their own. The operating system checks the IOCB for protection violations before the I/O can continue.
`
`
Location   CCW              Comment
CCW1:      Define Extent    Transfers a 16-byte parameter to the storage director.
                            The channel sees this as a write data transfer.
CCW2:      Locate Record    Transfers a 16-byte parameter to the storage director as
                            above. The parameter identifies the operation (read in
                            this case) plus seek, sector number, and record ID. The
                            channel again sees this as a write data transfer.
CCW3:      Read Data        Transfers the desired disk data to the channel and then
                            to main memory.

FIGURE 9.35 A channel program to perform a disk read, consisting of three channel command words (CCWs). The operating system checks the CCWs for virtual memory access violations by simulating their execution. These instructions are linked so that only one START SUBCHANNEL instruction is needed.
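
To make the shape of a channel program concrete, here is a purely illustrative Python sketch of an IOCB pointing at the three CCWs of Figure 9.35. The field names and values are invented for clarity; they are not IBM's actual record layouts, which are packed binary formats interpreted by the channel.

# Illustrative only: a toy model of an I/O control block (IOCB) pointing at
# the three-CCW channel program of Figure 9.35. Fields are invented, not IBM's.
from dataclasses import dataclass
from typing import List

@dataclass
class CCW:
    command: str        # "Define Extent", "Locate Record", or "Read Data"
    parameter: bytes    # 16-byte parameter passed to the storage director
    buffer_addr: int    # main-memory address involved in the transfer

@dataclass
class IOCB:
    device: str         # which disk the channel program addresses
    program: List[CCW]  # the channel program: a linked list of CCWs

channel_program = [
    CCW("Define Extent", bytes(16), 0x1000),
    CCW("Locate Record", bytes(16), 0x1010),  # read, plus seek, sector number, record ID
    CCW("Read Data", b"", 0x2000),            # disk data lands here in main memory
]
iocb = IOCB(device="IBM 3380, string 0, drive 3", program=channel_program)
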
`
`
`3. The I/O processor uses the control wires of one of the channels to tell the
`storage director which disk is to be accessed and the disk address to be read. The
`channel is then released.
`
`4. The storage director sends a SEEK command to the head-of-string controller
`and the head-of-string controller connects to the desired disk, telling it to seek to
`the appropriate track, and then disconnects. The disconnect occurs between
`CCW2 and CCW3 in Figure 9.35.
`
`Upon completion of these first four steps of the read, the arm on the disk
`seeks the correct track on the correct IBM 3380 disk drive. Other I/O operations
`can use the control and data hierarchy while this disk is seeking and the data is
rotating under the read head. The I/O processor thus acts like a multiprogrammed system, working on other requests while waiting for an I/O event to complete.
`An interesting question arises: When there are multiple uses for a single disk,
`what prevents another seek from screwing up the works before the original
request can continue with the I/O event in progress? The answer is that the disk appears busy to the programs in the 3090 between the time a START SUBCHANNEL instruction starts a channel program (step 2) and the end of that channel program. An attempt to execute another START SUBCHANNEL instruction would receive busy status from the channel or from the disk device.
`After both the seek completes and the disk rotates to the desired point relative
`to the read head, the disk reconnects to a channel. To determine the rotational
`position of the 3380 disk, IBM provides rotational positional sensing (RPS), a
`feature that gives early warning when the data will rotate under the read head.
`IBM essentially extends the seek time to include some of the rotation time,
thereby tying up the datapath as little as possible. Then the I/O can continue:
`
`5. When the disk completes the seek and rotates to the correct position, it
`contacts the head-of-string controller.
`
`
`6. The head-of-string controller looks for a free storage director to send the
`signal that the disk is on the right track.
`
7. The storage director looks for a free channel so that it can use the control wires to tell the I/O processor that the disk is on the right track.
`
8. The I/O processor simultaneously contacts the storage director and the I/O device (the IBM 3380 disk) to give the OK to transfer data, and tells the channel controller where to put the information in main memory when it arrives at the channel.
`
There is now a direct path between the I/O device and memory, and the transfer can begin:
`
`9. When the disk is ready to transfer, it sends the data at 3 megabytes per
`second over a bit-serial line to the storage director.
`
`10. The storage director collects 16 bytes in one of two buffers and sends the
`information on to the channel controller.
`
`11. The channel controller has a pair of 16-byte buffers per storage director and
`sends 16 bytes over a 3-MB or 4.5-MB per second, 8-bit-wide datapath to the
`speed-matching buffers.
`
12. The speed-matching buffers take the information coming in from all channels. There are two 8-byte buffers per channel that send 8 bytes at a time to the appropriate locations in main memory.
`
`Since nothing is free in computer design, one might expect there to be a cost
`in anticipating the rotational delay using RPS. Sometimes a free path cannot be
established in the time available due to other I/O activity, resulting in an RPS miss. An RPS miss means the 3990 I/O Subsystem must either:
`
`• Wait another full rotation-16.7 ms-before the data is back under the head,
`or
`
`• Break down the hierarchical datapath and start all over again!
`
`Lots of RPS misses can ruin response times.
As mentioned above, the IBM I/O system evolved over many years, and Figure 9.36 shows the change in response time for a few of those changes. The first improvement concerns the path for data after reconnection. Before the System/370-XA, the data path through the channels and storage director (steps 5 through 12) had to be the same as the path taken to request the seek (steps 1
`through 4). The 370-XA allows the path after reconnection to be different, and
`this option is called dynamic path reconnection (DPR). This change reduced the
`time waiting for the channel path and the time waiting for disks (queueing
`delay), yielding a reduction in the total average response time of 17%. The
`second change in Figure 9.36 involved a new disk design. Improvements to the
`
`
• A large number of I/O devices at a time
`
`• High performance
`
`• Low latency
`
Substantial expandability and low latency are hard to get at the same time. IBM channel-based systems achieve the third and fourth goals by utilizing hierarchical data paths to connect a large number of devices. The many devices and parallel paths allow simultaneous transfers and, thus, high throughput. By avoiding large buffers and providing enough extra paths to minimize delay from congestion, channels offer low-latency I/O as well. To maximize use of the hierarchy, IBM uses rotational positional sensing to extend the time that other tasks can use the hierarchy during an I/O operation.
Therefore, a key to performance of the IBM I/O subsystem is the number of rotational positional misses and congestion on the channel paths. A rule of thumb is that the single-path channels should be no more than 30% utilized and the quad-path channels should be no more than 60% utilized, or too many rotational positional misses will result. This I/O architecture dominates the
`industry, yet it would be interesting to see what, if anything, IBM would do
`differently if given a clean slate.
`
9.10 Fallacies and Pitfalls
`
Fallacy: I/O plays a small role in supercomputer design
`
The goal of the Illiac IV was to be the world's fastest computer. It may not have achieved that goal, but it showed I/O as the Achilles' heel of high-performance machines. In some tasks, more time was spent in loading data than in computing. Amdahl's Law demonstrated the importance of high performance in all the parts of a high-speed computer. (In fact, Amdahl made his comment in reaction to claims for performance through parallelism made on behalf of the Illiac IV.) The Illiac IV had a very fast transfer rate (60 MB/sec), but very small, fixed-head disks (12-MB capacity). Since they were not large enough, more storage was provided on a separate computer. This led to two ways of measuring I/O overhead:
`
Warm start-Assuming the data is on the fast, small disks, I/O overhead is the time to load the Illiac IV memory from those disks.

Cold start-Assuming the data is on the other computer, I/O overhead must include the time to first transfer the data to the Illiac IV fast disks.
`
Figure 9.37 shows ten applications written for the Illiac IV in 1979. Assuming warm starts, the supercomputer was busy 78% of the time and waiting for I/O 22% of the time; assuming cold starts, it was busy 59% of the time and waiting for I/O 41% of the time.
`
`
`The most telling example comes from the IBM 360. It was decided that the
`performance of the ISAM system, an early database system, would improve if
`some of the record searching occurred in the disk controller itself. A key field
`was associated with each record, and the device searched each key as the disk
`rotated until it found a match. It would then transfer the desired record. For the
`disk to find the key, there had to be an extra gap in the track. This scheme is
`applicable to searches through indices as well as data.
The speed at which a track can be searched is limited by the speed of the disk and by the number of keys that can be packed on a track. On an IBM 3330 disk the key is typically 10 characters, but the total gap between records is equivalent to 191 characters if there were a key. (The gap is only 135 characters if there is no key, since there is no need for an extra gap for the key.) If we assume the data is also 10 characters and the track has nothing else on it, then a 13,165-byte track can contain
`
13,165 / (191 + 10 + 10) = 62 key-data records
`
This performance is

16.7 ms (1 revolution) / 62 = 0.25 ms/key search
`
In place of this scheme, we could put several key-data pairs in a single block and have smaller inter-record gaps. Assuming there are 15 key-data pairs per block and the track has nothing else on it, then

13,165 / (135 + 15 × (10 + 10)) = 13,165 / 435 = 30 blocks of key-data pairs
The revised performance is then

16.7 ms (1 revolution) / (30 × 15) = 0.04 ms/key search
Yet as CPUs got faster, the CPU time for a search was trivial. While the strategy made early machines faster, programs that use the search-key operation in the I/O processor run six times slower on today's machines!
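
The arithmetic in this pitfall is easy to verify mechanically; the sketch below (ours) recomputes both layouts, though its unrounded results differ slightly from the rounded figures in the text.

# Sketch of the IBM 3330 key-search arithmetic from the pitfall above.
TRACK_BYTES = 13_165
REV_MS = 16.7                                   # one revolution at 3600 RPM

records = TRACK_BYTES // (191 + 10 + 10)        # 62 key-data records, one per block
print(records, REV_MS / records)                # ms per key search, one record per block

blocks = TRACK_BYTES // (135 + 15 * (10 + 10))  # 30 blocks of 15 key-data pairs
print(blocks, REV_MS / (blocks * 15))           # ms per key search, 15 pairs per block
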
`
`Fallacy: Comparing the price of media versus the price of the packaged
`system.
`
This happens most frequently when new memory technologies are compared to magnetic disks. For example, comparing the DRAM-chip price to the magnetic-disk packaged price in Figure 9.16 (page 518) suggests the difference is less than a factor of 10, but it's much greater when the price of packaging DRAM is included. A common mistake with removable media is to compare the media cost not including the drive to read the media. For example, optical media costs
`only $1 per MB in 1990, but including the cost of the optical drive may bring the
`price closer to $6 per MB.
`
`Fallacy: The time of an average seek of a disk in a computer system is the
`time for a seek of one-third the number of cylinders.
`
This fallacy comes from confusing the way manufacturers market disks with the expected performance, and from the false assumption that seek times are linear in distance. The 1/3-distance rule of thumb comes from calculating the distance of a seek from one random location to another random location, not including the current cylinder and assuming there are a large number of cylinders. In the past, manufacturers listed the seek of this distance to offer a consistent basis for comparison. (As mentioned on page 516, today they calculate the "average" by timing all seeks and dividing by the number.) Assuming (incorrectly) that seek time is linear in distance, and using the manufacturers' reported minimum and "average" seek times, a common technique to predict seek time is:
`
Time_seek = Time_minimum + (Distance / Distance_average) × (Time_average - Time_minimum)
`
`The fallacy concerning seek time is twofold. First, seek time is not linear
`with distance; the arm must accelerate to overcome inertia, reach its maximum
`traveling speed, decelerate as it reaches the requested position, and then wait to
`allow the arm to stop vibrating (settle time). Moreover, in recent disks
`sometimes the arm must pause to control vibrations. Figure 9.38 (page 558)
`plots time versus seek distance for an example disk. It also shows the error in
`the simple seek-time formula above. For short seeks, the acceleration phase
`plays a larger role than the maximum traveling speed, and this phase is typically
`modeled as the square root of the distance. Figure 9.39 (page 558) shows
`accurate formulas used to model the seek time versus distance for two disks.
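
To illustrate the shape of the error, the sketch below (ours) contrasts the naive linear interpolation above with a simple acceleration-limited model in which short seeks grow as the square root of distance. The minimum and "average" times and the model coefficients are made-up values for illustration, not the measured ones in Figure 9.39.

import math

def seek_linear(distance, dist_avg, t_min, t_avg):
    # Naive linear interpolation from the minimum and "average" seek times.
    return t_min + (distance / dist_avg) * (t_avg - t_min)

def seek_sqrt_model(distance, a=0.6, b=0.01, settle=2.0):
    # Toy acceleration-limited model: the square-root term dominates short
    # seeks, the linear term dominates long ones, plus a fixed settle time.
    return settle + a * math.sqrt(distance) + b * distance

for d in (1, 10, 100, 500, 1000):   # seek distance in cylinders
    print(d, round(seek_linear(d, dist_avg=333, t_min=2.0, t_avg=12.0), 2),
          round(seek_sqrt_model(d), 2))
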
The second problem is that the average in the product specification would be true only if there were no locality to disk activity. Fortunately, there is both temporal
`and spatial locality (page 403 in Chapter 8): disk blocks get used more than once
`and disk blocks near the current cylinder are more likely to be used than those
`farther away. For example, Figure 9.40 (page 559) shows sample measurements
`of seek distances for two workloads: a UNIX timesharing workload and a
`business-processing workload. Notice the high percentage of disk accesses to the
`same cylinder, labeled distance 0 in the graphs, in both workloads.
`Thus, this fallacy couldn't be more misleading. The Exercises debunk this
`fallacy in more detail.
`
`
9.12 Historical Perspective and References
`
`The forerunner of today's workstations was the Alto developed at Xerox Palo
`Alto Research Center in 1974 [Thacker et al. 1982]. This machine reversed
traditional wisdom, making instruction set interpretation take a back seat to the display: the display used half the memory bandwidth of the Alto. In addition to
`the bit-mapped display, this historic machine had the first Ethernet [Metcalfe
`and Boggs 1976] and the first laser printer. It also had a mouse, invented earlier
by Doug Engelbart of SRI, and a removable cartridge disk. The 16-bit CPU implemented an instruction set similar to the Data General Nova and offered writable control store (see Chapter 5, Section 5.8). In fact, a single microprogrammable engine drove the graphics display, mouse, disks, network, and, when
`there was nothing else to do, interpreted the instruction set.
`The attraction of a personal computer is that you don't have to share it with
`anyone. This means response time is predictable, unlike timesharing systems.
`Early experiments in the importance of fast response time were performed by
Doherty and Kelisky [1979]. They showed that if computer-system response time increased by a second, user think time did also. Thadhani [1981] showed a
`jump in productivity as computer response times dropped to a second and
`another jump as they dropped to a half-second. His results inspired a flock of
`studies, and they supported his observations [IBM 1982]. In fact, some studies
`were started to disprove his results! Brady [1986] proposed differentiating entry
`time from think time (since entry time was becoming significant when the two
`were lumped together) and provided a cognitive model to explain the more than
`linear relationship between computer response time and user think time.
The ubiquitous microprocessor has inspired not only personal computers in the 1970s, but also the current trend toward moving controller functions into I/O devices in the late 1980s and 1990s. For example, microcoded routines in a central CPU made sense for the Alto in 1975, but technological changes soon made separate microprogrammable controllers for I/O devices economical. These were then replaced by application-specific integrated circuits. I/O devices continued this trend by moving controllers into the devices themselves. These are called
`intelligent devices, and some bus standards (e.g., IPI and SCSI) have been
`created just for these devices. Intelligent devices can relax the timing constraints
`by handling many of the low-level tasks and queuing the results. For example,
`many SCSI-compatible disk drives include a track buffer on the disk itself,
`supporting read ahead and connect/disconnect. Thus, on a SCSI string some
`disks can be seeking and others loading their track buffer while one is
`transferring data from its buffer over the SCSI bus.
`Speaking of buses, the first multivendor bus may have been the PDP-11
`Unibus in 1970. DEC encouraged other companies to build devices that would
`plug into their bus, and many companies did. A more recent example is SCSI,
`
`
which stands for small computer systems interface. This bus, originally called SASI, was invented by Shugart and was later standardized by ANSI.
`Sometimes buses are developed in academia; the NuBus was developed by Steve
Ward and his colleagues at MIT and used by several companies. Alas, this open-door policy on buses is in contrast to companies with proprietary buses using
`patented interfaces, thereby preventing competition from plug-compatible
vendors. This practice also raises costs and lowers availability of I/O devices that plug into proprietary buses, since such devices must have an interface designed just for that bus. Levy [1978] has a nice survey on issues in buses.
We must also give a few references to specific I/O devices. Readers interested in the ARPANET should see Kahn [1972]. As mentioned in one of
`the section quotes, the father of computer graphics is Ivan Sutherland, who
`received the ACM Turing Award in 1988. Sutherland's Sketchpad system
`[1963] set the standard for today's interfaces and displays. See Foley and Van
`Dam [1982] and Newman and Sproull [1979] for more on computer graphics.
`Scranton, Thompson, and Hunter [1983] were among the first to report the
`myths concerning seek times and distances for magnetic disks.
Comments on the future of disks can be found in several sources. Goldstein [1987] projects the capacity and I/O rates for IBM mainframe installations in 1995, suggesting that the ratio is no less than 3.7 GB per IBM mainframe MIPS today, and that it will grow to 4.5 GB per MIPS in 1995. Frank [1987] speculated
`on the physical recording density, proposing the MAD formula on disk growth
`that