United States Patent [19]

Lewchuk et al.

[11] Patent Number: 6,058,461

[45] Date of Patent: May 2, 2000

US006058461A
Primary Examiner—Eddie P. Chan
Assistant Examiner—Yamir Encarnacion
Attorney, Agent, or Firm—Conley, Rose & Tayon, P.C.; Lawrence J. Merkel
`
[57] ABSTRACT
`
A computer system includes one or more microprocessors. The microprocessors assign a priority level to each memory operation as the memory operations are initiated. In one embodiment, the priority levels employed by the microprocessors include a fetch priority level and a prefetch priority level. The fetch priority level is higher priority than the prefetch priority level, and is assigned to memory operations which are the direct result of executing an instruction. The prefetch priority level is assigned to memory operations which are generated according to a prefetch algorithm implemented by the microprocessor. As memory operations are routed through the computer system to main memory and corresponding data transmitted, the elements involved in performing the memory operations are configured to interrupt the transfer of data for the lower priority memory operation in order to perform the data transfer for the higher priority memory operation. While one embodiment of the computer system employs at least a fetch priority and a prefetch priority, the concept of applying priority levels to various memory operations and interrupting data transfers of lower priority memory operations to higher priority memory operations may be extended to other types of memory operations, even if prefetching is not employed within a computer system. For example, speculative memory operations may be prioritized lower than non-speculative memory operations throughout the computer system.
`
`17 Claims, 4 Drawing Sheets
`
[54] COMPUTER SYSTEM INCLUDING PRIORITIES FOR MEMORY
     OPERATIONS AND ALLOWING A HIGHER PRIORITY MEMORY
     OPERATION TO INTERRUPT A LOWER PRIORITY MEMORY
     OPERATION

[75] Inventors: W. Kurt Lewchuk, Austin; Brian D. McMinn, Buda;
     James K. Pickett, Austin, all of Tex.

[73] Assignee: Advanced Micro Devices, Inc., Sunnyvale, Calif.

[21] Appl. No.: 08/982,588

[22] Filed: Dec. 2, 1997

[51] Int. Cl.7 .......................................... G06F 13/18
[52] U.S. Cl. ............................ 711/158; 711/137; 711/151
[58] Field of Search .................... 711/151, 158, 137; 709/240;
     710/113, 114, 119, 120, 121, 40, 240, 241, 242, 243, 244,
     264; 712/207
`
[56] References Cited

U.S. PATENT DOCUMENTS

4,729,093   3/1988  Mothersole et al. .............. 712/207
4,755,933   7/1988  Teshima et al. ................. 711/157
5,367,657  11/1994  Khare et al. ................... 712/207
5,619,663   4/1997  Mizrahi-Shalom et al. .......... 711/137
5,673,415   9/1997  Nguyen et al. .................. 711/151
5,721,865   2/1998  Shintani et al. ................ 712/207

OTHER PUBLICATIONS

Tullsen et al., "Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor," Proceedings of the 23rd Annual International Symposium on Computer Architecture, Philadelphia, PA, May 1996, pp. 191-202. [Online] http://www.cs.washington.edu/research/smt.
[Representative drawing: FIG. 2, block diagram of one embodiment of bus bridge 12 and main memory 14, showing CPU interface 40, request queue 44, control unit 46, open page/priority storage 48, data buffer 50, and the address/tag, R/W, priority, and data/tag connections.]
`
[Sheet 1 of 4, FIG. 1: Block diagram of one embodiment of a computer system, including microprocessors 10A and 10B, bus bridge 12, main memory 14, graphics controller 18, PCI devices, and a secondary bus bridge.]
`
[Sheet 2 of 4, FIG. 2: Block diagram of one embodiment of bus bridge 12 and main memory 14 in greater detail.]
`
[Sheet 3 of 4, FIG. 3: Flowchart of bus bridge operation upon receiving a new memory operation. Blocks: Receive New Memory Operation; Higher Priority than In-Progress Operation?; Same Page as In-Progress Operation?; Continue In-Progress Operation; Interrupt In-Progress Operation; Perform New Operation; Resume In-Progress Operation.]
`
`
`
[Sheet 4 of 4, FIGS. 4 and 5: Timing diagrams of memory interface signals, for a single memory operation and for a first memory operation interrupted by a second.]
`
`
`COMPUTER SYSTEM INCLUDING
`PRIORITIES FOR MEMORY OPERATIONS
`AND ALLOWING A HIGHER PRIORITY
MEMORY OPERATION TO INTERRUPT A
`LOWER PRIORITY MEMORY OPERATION
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`
`This invention is related to the field of computer systems
`and, more particularly,
`to memory latency issues within
`computer systems.
`2. Description of the Related Art
`Superscalar microprocessors achieve high performance
`by executing multiple instructions per clock cycle and by
`choosing the shortest possible clock cycle consistent with
the design. On the other hand, superpipelined microprocessor designs divide instruction execution into a large number of subtasks which can be performed quickly, and assign pipeline stages to each subtask. By overlapping the execution of many instructions within the pipeline, superpipelined microprocessors attempt to achieve high performance.
Superscalar microprocessors demand low memory latency due to the number of instructions attempting concurrent execution and due to the increasing clock frequency (i.e. shortening clock cycle) employed by the superscalar microprocessors. Many of the instructions include memory operations to fetch (read) and update (write) memory operands. The memory operands must be fetched from or conveyed to memory, and each instruction must originally be
`fetched from memory as well. Similarly, superpipelined
`microprocessors demand low memory latency because of
`the high clock frequency employed by these microproces-
`sors and the attempt to begin execution of a new instruction
`each clock cycle. It is noted that a given microprocessor
`design may employ both superscalar and superpipelined
`techniques in an attempt
`to achieve the highest possible
`performance characteristics.
`Microprocessors are often configured into computer sys-
`tems which have a relatively large, relatively slow main
`memory. Typically, multiple dynamic random access
`memory (DRAM) modules comprise the main memory
`system. The large main memory provides storage for a large
`number of instructions and/or a large amount of data for use
`by the microprocessor, providing faster access to the instruc-
`tions and/or data than may be achieved from a disk storage,
`for example. However, the access times of modern DRAMs
`are significantly longer
`than the clock cycle length of
`modern microprocessors. The memory access time for each
`set of bytes being transferred to the microprocessor is
`therefore long. Accordingly, the main memory system is not
`a low latency system. Microprocessor performance may
`suffer due to high memory latency.
`In order to allow low latency memory access (thereby
`increasing the instruction execution efficiency and ulti-
`mately microprocessor performance), computer systems
`typically employ one or more caches to store the most
`recently accessed data and instructions. Additionally,
`the
`microprocessor may employ caches internally. A relatively
`small number of clock cycles may be required to access data
`stored in a cache, as opposed to a relatively larger number
`of clock cycles required to access the main memory.
`Low memory latency may be achieved in a computer
`system if the cache hit rates of the caches employed therein
`are high. An access is a hit in a cache if the requested data
is present within the cache when the access is attempted. On the other hand, an access is a miss in a cache if the requested data is absent from the cache when the access is attempted. Cache hits are provided to the microprocessor in a small number of clock cycles, allowing subsequent accesses to occur more quickly as well and thereby decreasing the effective memory latency. Cache misses require the access to receive data from the main memory, thereby increasing the effective memory latency.
`In order to increase cache hit rates, computer systems may
`employ prefetching to “guess” which data will be requested
`by the microprocessor in the future. The term prefetch, as
`used herein, refers to transferring data (e.g. a cache line) into
`a cache prior to a request for the data being received by the
`cache in direct response to executing an instruction (either
`speculatively or non-speculatively). A request is in direct
response to executing the instruction if the definition of the
`instruction according to the instruction set architecture
`employed by the microprocessor includes the request for the
`data. A “cache line” is a contiguous block of data which is
`the smallest unit for which a cache allocates and deallocates
`storage. If the prefetched data is later accessed by the
`microprocessor, then the cache hit rate may be increased due
`to transferring the prefetched data into the cache before the
`data is requested.
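
For illustration only (the patent leaves the prefetch algorithm open), a minimal sketch of one common scheme, sequential next-line prefetching, is shown below; the 64-byte cache line size and the function name are assumptions, not part of the disclosure.

    #include <stdint.h>

    #define CACHE_LINE_BYTES 64  /* assumed cache line size for illustration */

    /* Given the address of a demand access, return the address of the next
     * sequential cache line as a candidate to prefetch into the cache. */
    uint64_t next_line_prefetch_addr(uint64_t demand_addr) {
        uint64_t line_base = demand_addr & ~(uint64_t)(CACHE_LINE_BYTES - 1);
        return line_base + CACHE_LINE_BYTES;
    }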
`Unfortunately, prefetching can consume memory band-
`width at an inopportune time with respect to the occurrence
`of non-speculative memory operations. For example, a
`prefetch memory operation may be initiated just slightly
`prior to the initiation of a non-prefetch memory operation.
`As the prefetch memory operation is occupying the memory
`system already,
`the latency of the non-prefetch memory
`operation is increased by the amount of time the memory
`system is occupied with the prefetch request. Particularly if
`the prefetch is incorrect (i.e. the prefetched data is not used
`later by the requester), the increased latency may decrease
performance of the microprocessor (and the overall computer system).
`SUMMARY OF THE INVENTION
`
`The problems outlined above are in large part solved by
`a computer system in accordance with the present invention.
`The computer system includes one or more microprocessors.
`The microprocessors assign a priority level to each memory
`operation as the memory operations are initiated. In one
`embodiment, the priority levels employed by the micropro-
`cessors include a fetch priority level and a prefetch priority
`level. The fetch priority level is higher priority than the
`prefetch priority level, and is assigned to memory operations
`which are the direct result of executing an instruction. The
`prefetch priority level is assigned to memory operations
`which are generated according to a prefetch algorithm
`implemented by the microprocessor. As memory operations
`are routed through the computer system to main memory
`and corresponding data transmitted, the elements involved
`in performing the memory operations are configured to
`interrupt the transfer of data for the lower priority memory
`operation in order to perform the data transfer for the higher
`priority memory operation.
`Advantageously, even though memory bandwidth is con-
`sumed by the prefetch memory operations,
`the latency
`experienced by the fetch memory operations may not be
`significantly impacted due to the interrupting of the prefetch
`memory operations to perform the fetch memory operations.
`Performance of the computer system may be increased due
`to the lack of impact on the latency of the fetch memory
operations by the prefetch memory operations. Furthermore,
more aggressive prefetch algorithms (e.g. algorithms which generate more prefetch memory operations) may be employed because concerns regarding increasing the memory latency of non-prefetch memory operations through interference by the prefetch memory operations are substantially allayed. The more aggressive prefetch algorithms may lead to increased prefetch effectiveness, further decreasing overall effective memory latency. Performance of the microprocessors employing the more aggressive prefetch algorithms may thereby be increased, and overall performance of the computer system may accordingly be improved.
`While one embodiment of the computer system employs
`at least a fetch priority and a prefetch priority, the concept of
`applying priority levels to various memory operations and
`interrupting data transfers of lower priority memory opera-
`tions to higher priority memory operations may be extended
`to other types of memory operations, even if prefetching is
`not employed within the computer system. For example,
speculative memory operations may be prioritized lower than non-speculative memory operations throughout the
`computer system. Performance of the computer system may
`thereby be increased.
`Broadly speaking, the present invention contemplates a
`method for transferring data in a computer system. A first
`memory operation having a first priority is initiated.
`Subsequently, a second memory operation having a second
`priority is initiated. At least a portion of data corresponding
to the first memory operation is transferred. The transferring
`is interrupted if the second priority is higher than the first
`priority, and data corresponding to the second memory
`operation is transferred during the interruption.
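
A minimal simulation sketch of this method is given below; the beat counts, names, and print-based "transfers" are illustrative assumptions only, not the claimed implementation.

    #include <stdio.h>

    typedef struct { const char *name; int priority; int beats; } op_t;

    /* Transfer the first operation beat by beat; if a higher priority
     * operation has arrived mid-transfer, interrupt, transfer its data
     * in full, and then resume the remaining beats of the first. */
    static void transfer_with_interrupt(const op_t *first, const op_t *second,
                                        int arrival_beat) {
        for (int b = 0; b < first->beats; b++) {
            if (b == arrival_beat && second->priority > first->priority)
                for (int s = 0; s < second->beats; s++)
                    printf("beat %d of %s\n", s, second->name);
            printf("beat %d of %s\n", b, first->name);
        }
    }

    int main(void) {
        op_t low  = { "first operation (lower priority)",   0, 4 };
        op_t high = { "second operation (higher priority)", 1, 4 };
        transfer_with_interrupt(&low, &high, 2); /* second op arrives at beat 2 */
        return 0;
    }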
`The present invention further contemplates a computer
`system comprising at
`least one microprocessor, a main
`memory, and a bus bridge. The microprocessor is configured
`to initiate a first memory operation and to subsequently
`initiate a second memory operation. Additionally, the micro-
`processor is configured to assign a first priority to the first
`memory operation responsive to a first
`type of the first
`memory operation, and to assign a second priority to the
`second memory operation responsive to a second type of the
`second memory operation. The main memory is configured
`to store data including first data corresponding to the first
`memory operation and second data corresponding to the
`second memory operation. Coupled between the micropro-
`cessor and the main memory, the bus bridge is configured to
`initiate transfer of the first data from the main memory to the
`microprocessor responsive to the first memory operation.
`Furthermore, the bus bridge is configured to interrupt trans-
`fer of the first data upon receiving the second memory
`operation if the second priority is higher than the first
`priority.
Moreover, the present invention contemplates a bus
`bridge for a computer system, comprising a CPU interface
`block and a memory controller. The CPU interface block is
`coupled to receive bus operations from a microprocessor.
`The bus operations include memory operations, and each
`memory operation includes a priority assigned by an initia-
`tor of the memory operation. Coupled to the CPU interface
`block and a memory, the memory controller is configured to
`receive each memory operation and the priority from the
`CPU interface block. The memory controller is configured to
`interrupt an in-progress memory operation to service a
`subsequent memory operation if a first priority correspond-
`ing to the in-progress memory operation is lower than a
`second priority corresponding to the subsequent memory
`operation.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`Other objects and advantages of the invention will
`become apparent upon reading the following detailed
`description and upon reference to the accompanying draw-
`ings in which:
`FIG. 1 is a block diagram of one embodiment of a
`computer system.
`FIG. 2 is a block diagram of one embodiment of a bus
bridge shown in FIG. 1.
`FIG. 3 is a flowchart illustrating operation of one embodi-
`ment of the bus bridge shown in FIGS. 1 and 2 upon
`receiving a memory operation.
`FIG. 4 is a timing diagram illustrating operation of certain
`signals upon an interface between the bus bridge shown in
`FIGS. 1 and 2 and the main memory shown in FIG. 1,
`according to one embodiment of the bus bridge and the main
`memory, for a memory operation.
`FIG. 5 is a timing diagram illustrating operation of certain
`signals upon the interface between the bus bridge shown in
`FIGS. 1 and 2 and the main memory shown in FIG. 1,
`according to one embodiment of the bus bridge and the main
`memory, for a first memory operation interrupted by a
`second memory operation.
`While the invention is susceptible to various modifica-
`tions and alternative forms, specific embodiments thereof
`are shown by way of example in the drawings and will
`herein be described in detail.
`It should be understood,
`however, that the drawings and detailed description thereto
`are not intended to limit the invention to the particular form
`disclosed, but on the contrary, the intention is to cover all
`modifications, equivalents and alternatives falling within the
`spirit and scope of the present invention as defined by the
`appended claims.
`DETAILED DESCRIPTION OF THE
`INVENTION
`
`Turning now to FIG. 1, a block diagram of one embodi-
`ment of a computer system including one or more micro-
`processors (e.g. microprocessors 10A and 10B shown in
`FIG. 1) coupled to a variety of system components through
`a bus bridge 12 is shown. Other embodiments are possible
`and contemplated. In the depicted system, a main memory
`14 is coupled to bus bridge 12 through a memory bus 16, and
`a graphics controller 18 is coupled to bus bridge 12 through
`an AGP bus 20. Finally, a plurality of PCI devices 22A—22B
`are coupled to bus bridge 12 through a PCI bus 24. A
`secondary bus bridge 26 may further be provided to accom-
`modate an electrical interface to one or more EISA or ISA
`devices 28 through an EISA/ISA bus 30. Microprocessors
`10A and 10B are coupled to bus bridge 12 through a CPU
`bus 34 and a priority line 38. Alternatively, independent
`buses may be coupled between bus bridge 12 and each of
`microprocessors 10A and 10B. As illustrated by the dotted
`illustration of microprocessor 10B, embodiments of com-
`puter system 5 employing only one microprocessor are
`contemplated. Additionally, embodiments employing more
`than two microprocessors are contemplated.
`Generally speaking, microprocessors 10A and 10B are
`configured to initiate memory operations upon CPU bus 34
`in order to transfer data to and from main memory 14.
`Microprocessors 10A and 10B assign a priority to each
`memory operation, and transmit that priority concurrently
`with initiation of the memory operation. The assigned pri-
`ority is transmitted via priority line 38. In one embodiment,
at least two priority levels are defined: a fetch priority level and a prefetch priority level. The fetch priority level is assigned to memory operations which are the direct result of
`executing an instruction. These memory operations may be
`either read memory operations or write memory operations.
`The prefetch priority level is assigned to prefetch memory
`operations generated in accordance with the prefetch algo-
`rithm employed by microprocessors 10A and 10B. Prefetch
`memory operations may be read memory operations. It is
`noted that microprocessors 10A and 10B may employ any
`suitable prefetch algorithm. A variety of well-known
`prefetch algorithms may be used, for example.
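
As a concrete sketch of the two levels and their assignment, the types and names below are illustrative assumptions; the disclosure defines only the relative ordering of the levels.

    #include <stdint.h>

    /* Higher numeric value = higher priority (ordering per the disclosure). */
    typedef enum { PRIORITY_PREFETCH = 0, PRIORITY_FETCH = 1 } priority_t;
    typedef enum { OP_READ, OP_WRITE } rw_t;

    typedef struct {
        uint64_t   address;   /* conveyed when the operation is initiated */
        rw_t       rw;        /* read/write nature of the operation */
        priority_t priority;  /* e.g. transmitted via priority line 38 */
    } mem_op_t;

    /* Fetch: direct result of executing an instruction (read or write). */
    mem_op_t make_fetch_op(uint64_t addr, rw_t rw) {
        mem_op_t op = { addr, rw, PRIORITY_FETCH };
        return op;
    }

    /* Prefetch: generated by the prefetch algorithm (reads, in this sketch). */
    mem_op_t make_prefetch_op(uint64_t addr) {
        mem_op_t op = { addr, OP_READ, PRIORITY_PREFETCH };
        return op;
    }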
`Bus bridge 12 receives the memory operations initiated by
`microprocessors 10A and 10B, and transfers the data to/from
main memory 14 via memory bus 16. For a read memory operation, the data is returned to the microprocessor 10A or 10B via CPU bus 34. Data is transferred to
`bus bridge 12 via CPU bus 34 for a write memory operation,
`and the data is subsequently transmitted to main memory 14
`via memory bus 16.
`If bus bridge 12 is in the process of performing a data
`transfer for a memory operation to main memory 14 and the
`memory operation is assigned a prefetch priority level, the
`read memory operation may be interrupted to perform a data
`transfer for another memory operation which is assigned a
`fetch priority level. Advantageously, prefetch memory
`operations (which are assigned the prefetch priority level) do
`not interfere with access to memory for fetch memory
`operations (i.e. memory operations assigned the fetch pri-
`ority level). The fetch memory operations may be completed
`with a latency similar to the latency that is experienced in the
absence of prefetch memory operations. Even though the prefetch memory operations consume memory system bandwidth, they may not substantially increase the memory latency of the fetch memory operations. Performance of the
`microprocessors 10A and 10B (and overall performance of
`the computer system) may be increased by the lack of
`increase in the memory latency for fetch operations due to
`the occurrence of prefetch memory operations. Furthermore,
`the interruption of prefetch memory operations to perform
`higher priority fetch memory operations may allow for more
`aggressive prefetch algorithms to be employed within
`microprocessors 10A and 10B. Since the latency of fetch
`memory operations is substantially unaffected by the
`prefetch memory operations, more aggressive prefetching
`may be permissible.
`Subsequent to transferring data in response to the fetch
`memory operation, bus bridge 12 is configured to resume
`transferring data for the interrupted, lower priority memory
`operation. The lower priority memory operation is thereby
`completed. It is noted that the interruption of the transfer of
`data may occur upon memory bus 16 or upon CPU bus 34,
`depending upon the embodiment. For example, if CPU bus
`34 employs tagging to identify the address transfer of a
`memory operation with the corresponding data transfer, the
`tag may be conveyed with each data transfer on CPU bus 34.
`To interrupt a lower priority data transfer to perform a higher
`priority data transfer,
`the tag of the higher priority data
`transfer is conveyed. Subsequently,
`the tag of the lower
`priority data transfer is conveyed to complete the data
`transfer of the lower priority memory operation.
`In one embodiment, bus bridge 12 is configured to inter-
`rupt a data transfer to main memory 14 if the lower priority,
`in-progress memory operation and the higher priority
`memory operation are within the same “page”. As used
`herein, a “page” refers to a block of data stored within the
`same row of the DRAMs which comprise main memory 14.
The row is accessed via a row address provided by bus bridge 12, and then the column address of the particular
`datum being addressed is provided (typically using the same
address lines used to provide the row address). Additional data within the row can be accessed by providing another column address without providing the row address again (referred to as a “page hit”). Reading or writing additional data from the same row in this manner (referred to as “page mode”) may allow for lower latency access to the data, since
`the row address need not be provided in between each
`column access.
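
The decision FIG. 3 illustrates can be sketched compactly as below; the 4 KB page size is an assumed DRAM row geometry (real controllers derive the row from the bank/row address mapping), not a parameter of the disclosure.

    #include <stdbool.h>
    #include <stdint.h>

    #define PAGE_BYTES 4096ULL  /* assumed DRAM row (page) size */

    /* Two addresses fall in the same "page" if they map to the same row. */
    static bool same_page(uint64_t a, uint64_t b) {
        return (a / PAGE_BYTES) == (b / PAGE_BYTES);
    }

    /* Per FIG. 3: interrupt the in-progress operation only for a new
     * operation that is both higher priority and a hit in the open page. */
    bool should_interrupt(int new_priority, uint64_t new_addr,
                          int cur_priority, uint64_t cur_addr) {
        return (new_priority > cur_priority) && same_page(new_addr, cur_addr);
    }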
`
`By interrupting a lower priority memory operation to
`perform a higher priority memory operation in the same
`page only, the higher priority memory operation may be
`performed quickly (e.g. with a page hit timing). If a different
`page were accessed, then the current page would be deac-
`tivated and the new page accessed by providing the row
`address of the higher priority memory operation, then the
`corresponding column addresses. Subsequently,
`the new
`page would be deactivated and the page corresponding to the
`lower priority memory operation re-established. The time
`spent deactivating and activating pages may outweigh the
`latency savings for the higher priority memory operation.
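
As a rough worked example with purely hypothetical timings: if a page hit costs 2 cycles but a page switch costs a 3-cycle precharge plus a 3-cycle row activation in each direction, a cross-page interruption adds roughly 2×(3+3) = 12 cycles of page-management overhead to save at most a few cycles of queueing delay for the higher priority operation, whereas a same-page interruption adds essentially none.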
`While the present disclosure may refer to the prefetch
`priority level and the fetch priority level for memory opera-
`tions (with the fetch priority level being a higher priority
`than the prefetch priority level), it is contemplated that other
`priority levels may be assigned for other purposes in other
`embodiments. Furthermore, even if prefetching is not
employed, the assignment of priority levels to different types
`of memory operations may be advantageous. For example,
`speculative memory operations (performed due to the execu-
`tion of speculative instructions) might be assigned a lower
`priority level
`than non-speculative memory operations
`(performed due to the execution of non-speculative
`instructions). In this manner, speculative memory operations
`could be interrupted to perform non-speculative memory
`operations. Since the non-speculative memory operations
`have been confirmed as being required according to the
`execution of the program and the speculative memory
`operations may or may not be required, it may be advanta-
`geous to interrupt the speculative memory operations to
`decrease the latency of the non-speculative memory opera-
`tions. As another example, write back operations to update
`memory with updates made to a cache line within the cache
`of a microprocessor may be assigned a lower priority than
`memory operations to fill a cache line within the cache.
`As used herein, a “memory operation” is a transfer of data
`between an initiator and a memory (or a master and a slave,
`respectively). A “read memory operation” is a transfer of
`data from the slave (i.e.
`the memory) to the master. For
`example, microprocessor 10A or 10B may initiate a read
`memory operation to transfer data from main memory 14 to
the microprocessor. A “write memory operation” is a transfer
`of data from the master to the slave (i.e. the memory). For
`example, microprocessor 10A or 10B may initiate a write
`memory operation to transfer data from the microprocessor
`to main memory 14. Memory operations may be of different
`sizes. However, memory operations to transfer data to and
`from the cache (e.g. prefetch memory operations and many
`fetch memory operations) may be performed using a cache
`line size. Generally, several transfer cycles (or “beats”) on
`both memory bus 16 and CPU bus 34 are used to transfer a
`cache line of data. For example, four beats is a typical
`number to transfer a cache line. Interrupting a memory
`operation to perform a higher priority memory operation
`may comprise inserting the beats for the higher priority
memory operation between two of the beats for the lower
`priority memory operation. To “initiate” a memory
`operation, at least the address of the memory operation is
conveyed to the slave. Additional control information
`(including, e.g. the priority level and the read/write nature of
`the memory operation) may be conveyed concurrent with
`the memory operation or using a predefined protocol with
respect to conveyance of the address. Initiating a memory operation may require more than one bus clock cycle, depending upon the protocol of CPU bus 34. Data may be
`conveyed at a time subsequent to initiation of the memory
`operation.
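
Where CPU bus 34 tags each data beat in this way, the receiver can reassemble interleaved transfers by steering every beat to the line buffer named by its tag. A minimal sketch, with the beat count, tag width, and buffer pool all assumed for illustration:

    #include <stdint.h>

    #define BEATS_PER_LINE 4  /* e.g. four beats per cache line */
    #define MAX_TAGS       8  /* assumed number of outstanding operations */

    typedef struct {
        uint64_t beat_data[BEATS_PER_LINE];
        int      beats_received;
    } line_buffer_t;

    static line_buffer_t buffers[MAX_TAGS];

    /* Each beat carries its operation's tag, so beats of a higher priority
     * transfer may be inserted between beats of a lower priority transfer
     * and still reach the correct line buffer. */
    void receive_beat(int tag, uint64_t data) {
        line_buffer_t *buf = &buffers[tag];
        if (buf->beats_received < BEATS_PER_LINE)
            buf->beat_data[buf->beats_received++] = data;
    }

    int line_complete(int tag) {
        return buffers[tag].beats_received == BEATS_PER_LINE;
    }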
`In addition to the above described functionality, bus
`bridge 12 generally provides an interface between micro-
`processors 10A and 10B, main memory 14, graphics con-
`troller 18, and devices attached to PCI bus 24. When an
`operation is received from one of the devices connected to
`bus bridge 12, bus bridge 12 identifies the target of the
`operation (e.g. a particular device or, in the case of PCI bus
`24, that the target is on PCI bus 24). Bus bridge 12 routes the
`operation to the targeted device. Bus bridge 12 generally
`translates an operation from the protocol used by the source
`device or bus to the protocol used by the target device or bus
`and routes the operation appropriately. Bus bridge 12 may
`further be responsible for coherency activity to ensure a
`proper result for the operation, etc.
`In addition to providing an interface to an ISA/EISA bus
`from PCI bus 24, secondary bus bridge 26 may further
`incorporate additional
`functionality, as desired. For
`example,
`in one embodiment, secondary bus bridge 26
`includes a master PCI arbiter (not shown) for arbitrating
`ownership of PCI bus 24. An input/output controller (not
`shown), either external from or integrated with secondary
`bus bridge 26, may also be included within computer system
`5 to provide operational support for a keyboard and mouse
`32 and for various serial and parallel ports, as desired. An
`external cache unit (not shown) may further be coupled to
`CPU bus 34 between microprocessors 10A and 10B and bus
`bridge 12 in other embodiments. Alternatively, the external
`cache may be coupled to bus bridge 12 and cache control
`logic for the external cache may be integrated into bus
`bridge 12.
`Main memory 14 is a memory in which application
`programs are stored and from which microprocessors 10A
`and 10B primarily execute. A suitable main memory 14
`comprises DRAM (Dynamic Random Access Memory),
`SDRAM (Synchronous DRAM), or RDRAM (RAMBUS
`DRAM).
`PCI devices 22A—22B are illustrative of a variety of
`peripheral devices such as, for example, network interface
`cards, video accelerators, audio cards, hard or floppy disk
`drives or drive controllers, SCSI (Small Computer Systems
`Interface) adapters and telephony cards. Similarly, ISA
`device 28 is illustrative of various types of peripheral
`devices, such as a modem, a sound card, and a variety of data
`acquisition cards such as GPIB or field bus interface cards.
`Graphics controller 18 is provided to control the rendering
`of text and images on a display 36. Graphics controller 18
`may embody a typical graphics accelerator generally known
`in the art to render three-dimensional data structures which
`can be effectively shifted into and from main memory 14.
`Graphics controller 18 may therefore be a master of AGP
`bus 20 in that it can request and receive access to a target
`interface within bus bridge 12 to thereby obtain access to
`main memory 14. A dedicated graphics bus accommodates
`rapid retrieval of data from main memory 14. For certain
operations, graphics controller 18 may further be configured to generate PCI protocol transactions on AGP bus 20. The
`AGP interface of bus bridge 12 may thus include function-
`ality to support both AGP protocol transactions as well as
PCI protocol target and initiator transactions. Display 36 is any electronic display upon which an image or text can be presented. A suitable display 36 includes a cathode ray tube (“CRT”), a liquid crystal display (“LCD”), etc.
`It is noted that, while the AGP, PCI, and ISA or EISA
`buses have been used as examples in the above description,
`any bus architectures may be substituted as desired.
`Turning now to FIG. 2, a block diagram of one embodi-
`ment of bus bridge 12 and main memory 14 is shown in
`greater detail. Other embodiments are possible and contem-
`plated. Only portions of bus bridge 12 pertaining to the
`present disclosure are shown in FIG. 2. Other portions may
`be implemented as desired. As shown in FIG. 2, bus bridge
`12 includes a CPU interface block 40 and a main memory
`controller 42. Main memory controller 42 may include a
`request queue 44, a control unit 46, an open page/priority
`storage 48, and a data buffer 50. CPU interface block 40 is
`coupled to CPU bus 34 and priority line 38. Additionally,
`CPU interface block 40 is coupled to main memory con-
`troller 42 via an address/tag bus 52, a R/W line 54, a priority
`line 56, and a data/tag bus 58. Each of address/tag bus 52,
`R/W line 54, priority line 56, and data/tag bus 58 are coupled
`to request queue 44, and data/tag bus 58 is coupled to data
`buffer 50. Request queue 44, data buffer 50, and open
`page/priority storage 48 are coupled to control unit 46.
`Additionally, control unit 46 is coupled to an address and
`control bus 16A and a data bus 16B which comprise memory
bus 16. Data buffer 50 is also coupled to data bus 16B. Main memory 14 comprises a plurality of DRAM banks 60A-60N. Each DRAM bank 60A-60N comprises one or more DRAMs, and each DRAM bank 60A-60N is coupled to memory bus 16. The DRAMs included in main memory
`14 may comprise any type of DRAM, including standard
`asynchronous DRAM, SDRAM, etc.
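
The queuing structure described above might be modeled as follows; the queue depth, field widths, and open-page record are illustrative assumptions rather than the patent's layout.

    #include <stdbool.h>
    #include <stdint.h>

    #define REQUEST_QUEUE_DEPTH 8  /* assumed depth of request queue 44 */

    typedef struct {            /* one entry of request queue 44 */
        uint64_t address;       /* from address/tag bus 52 */
        uint32_t tag;           /* identifies the operation on CPU bus 34 */
        bool     is_write;      /* from R/W line 54 */
        int      priority;      /* from priority line 56 */
        uint64_t write_data;    /* via data/tag bus 58, for write operations */
    } request_t;

    typedef struct {            /* open page/priority storage 48 */
        bool     page_open;
        uint64_t open_row;      /* row address of the currently open page */
        int      open_priority; /* priority of the in-progress operation */
    } open_page_t;

    typedef struct {            /* main memory controller 42 (sketch) */
        request_t   queue[REQUEST_QUEUE_DEPTH];
        int         head, tail, count;
        open_page_t page;
    } memory_controller_t;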
`CPU interface block 40 is configured to receive bus
`operations from microprocessors 10A and 10B upon CPU
`bus 34, and to initiate bus operations upon CPU bus 34 in
`response to operations received from other devices attached
`thereto (e.g. coherency operations in response to memory
accesses performed by other devices, etc.). If CPU interface
`block 40 receives a memory operation upon CPU bus 34,
`CPU interface block 40 routes the address of the memory
`operation and the corresponding tag from CPU bus 34 upon
`address/tag bus 52 to main memory controller 42.
`Additionally, the read/write nature of the memory operation
`is conveyed via read/write line 54 and the corresponding
`priority (received upon priority line 38) is conveyed upon
`priority line 56. If the memory operation is a write memory
`operation, the corresponding data is conveyed via data/tag
`bus 58.
`
`Request queue 44 stores the information provided by CPU
`interface block 40. If request queue 44 is empty prior to
`receipt of a memory operation and main memory 14 is idle,
`the memory operation may be selected by control unit 46 for
`presentation to main memory 14. Additionally, if a memory
`op