`Bowes et al.
`
US005546547A
[11] Patent Number: 5,546,547
[45] Date of Patent: Aug. 13, 1996
`
[54] MEMORY BUS ARBITER FOR A COMPUTER SYSTEM HAVING A DSP
     CO-PROCESSOR
[75] Inventors: Michael J. Bowes, Cupertino; Farid A. Yazdy,
     Belmont, both of Calif.
[73] Assignee: Apple Computer, Inc., Cupertino, Calif.
[21] Appl. No.: 189,138
[22] Filed: Jan. 28, 1994
[51] Int. Cl.6 G06F 13/00
[52] U.S. Cl. 395/294; 395/293
[58] Field of Search 395/325, 425
[56] References Cited

U.S. PATENT DOCUMENTS

4,641,238    2/1987   Kneib             395/325
4,928,234    5/1990   Kitamura et al.   395/425
4,987,529    1/1991   Craft et al.      395/325
5,263,163   11/1993   Holt et al.       395/325
5,301,282    4/1994   Amini et al.      395/325
5,313,591    5/1994   Averill           395/325
5,353,417   10/1994   Fuoco et al.      395/325

Primary Examiner: Jack B. Harvey
Assistant Examiner: David A. Wiley
Attorney, Agent, or Firm: Blakely, Sokoloff, Taylor & Zafman

[57] ABSTRACT
An arbitration scheme for a computer system in which a
digital signal processor resides on the computer system’s
memory bus without requiring a block of dedicated static
random access memory. An arbitration cycle is divided into
10 slices of which 5 slices are provided in each arbitration
loop to the digital signal processor. Two slices are provided
each to the system’s I/O interface and to the peripheral bus
controller. A final slice is provided to the system’s CPU. A
default state when no memory bus resource is requesting the
system memory bus parks the memory bus on the CPU. The
arbitration scheme provides sufficient bandwidth for real-time
signal processing by the digital signal processor operating
from the system’s dynamic random access memory
while also providing sufficient bandwidth for a local area
network interface through the system’s I/O interface.
`8 Claims, 5 Drawing Sheets
`
[Representative drawing: block diagram showing CPU 10, MCA 200,
ROM 12, main memory subsystem (DRAM) 14, memory bus 110, DSP 20,
I/O interface DMA controller 30, NuBus controller 40, NuBus 130,
video 131 and communications 132. Other labels are not recoverable
from the extracted text.]

`Apple Exhibit 1005
`Page 1 of 25
`
`
`
U.S. Patent    Aug. 13, 1996    Sheet 1 of 5    5,546,547

[Figure 1 (Prior Art): block diagram showing CPU 10, ROM 12, main
memory subsystem (DRAM) 14, I/O interface 15, mass storage 16,
memory bus 100, DSP 20 and SRAM 24.]
`
`
`
U.S. Patent    Aug. 13, 1996    Sheet 2 of 5    5,546,547

[Figure 2: block diagram showing CPU 10, MCA 200, ROM 12, main
memory subsystem (DRAM) 14, memory bus 110, DSP 20, I/O interface
DMA controller 30, NuBus controller 40, NuBus 130, video 131 and
communications 132. Other labels are not recoverable from the
extracted text.]
`
`
`
U.S. Patent    Aug. 13, 1996    Sheet 3 of 5    5,546,547

[Figure 3: state diagram of the arbitration scheme. Drawing labels
are not recoverable from the extracted text.]
`
`
`
U.S. Patent    Aug. 13, 1996    Sheet 4 of 5    5,546,547

[Figure 4: block diagram of the MCA 200 showing a DSP bus interface
unit, a DRAM interface unit, a non-DSP bus interface unit, a DSP
state machine (DSP CLCK), non-DSP state machines (BCLK), bus
arbitration unit logic 240 and a DSP watchdog timer 241. Signal
labels include DSP REQ, IOB REQ, CPUBGN, DSPBGN, NUBBGN, IOBBGN,
DSP LOCK and CPU LOCK. Other labels are not recoverable from the
extracted text.]
`
`
`
U.S. Patent    Aug. 13, 1996    Sheet 5 of 5    5,546,547

[Figure 5: timing diagram of a state transition sequence. Drawing
labels are not recoverable from the extracted text.]
`
`
`
`
`
`
`
`
MEMORY BUS ARBITER FOR A
COMPUTER SYSTEM HAVING A DSP
CO-PROCESSOR
BACKGROUND OF THE INVENTION
1. Related Applications
This application is related to U.S. patent application Ser.
No. 08/189,139, entitled “Dual Bus Concurrent Multi-Channel
Direct Memory Access Controller and Method”, Ser. No.
08/189,132, entitled “Multiple Register Set Direct Memory
Access Channel Architecture”, and Ser. No. 08/189,131,
entitled “Direct Memory Access Channel Architecture and
Method for Reception of Network Information”, each of
which is assigned to the assignee of the present invention
and filed concurrently herewith.
`2. Field of the Invention
`The present invention relates to digital computer system
`architecture. More particularly, the present invention relates
`to a computer architecture in which multiple processing
`units share a common memory bus and support real-time
`applications.
`3. Description of Related Art
Until recently, telecommunications and computing were
considered to be entirely separate disciplines. Telecommunications
was analog and done in real-time whereas computing
was digital and performed at a rate determined by the
processing speed of a computer. Today, such technologies as
speech processing, sound processing, electronic facsimile
and image processing have blurred these lines. In the coming
years, computing and telecommunications will become
almost indistinguishable in a race to support a broad range
of new multimedia (i.e., voice, video and traditional data)
applications. These applications are made possible by
emerging digital-processing technologies, which include:
compressed audio (both high fidelity audio and speech), high
resolution still images, video, and high speed signal transmission
such as by means of modem or facsimile exchange.
The emerging technologies will allow for collaboration at a
distance such as by video conferencing.
Each of these aspects of real-time information processing
may require dedicated processors designed for their implementation.
However, it is becoming more and more common
to use programmable digital signal processors (DSPs) available
on the market today, such as the AT&T® DSP3210.
DSPs are autonomous processors having their own real-time
operating systems. As such, they are ideally suited to real-time
audio and image signal processing.
In handling real-time information such as speech recognition
and modem functionality, a DSP requires a large
amount of bandwidth to memory for processing the sheer
volume of data required to effectuate real-time computing.
FIG. 1 illustrates a typical computer architecture in which a
CPU 10 is coupled to a memory bus 100. The memory bus
100 may also be referred to as the system bus or CPU bus.
In any event, it is this bus which couples the system’s CPU
10 to the I/O interface 15 and the various components of the
memory subsystem. In FIG. 1, the CPU is in communication
with a ROM 12 and the main memory subsystem 14 through
the memory bus 100. The main memory subsystem 14
usually comprises a memory controller and a large array of
dynamic random access memory (DRAM) for supporting
operating applications and data for access by the CPU over
the memory bus 100. The main memory subsystem is
distinguished from mass storage 16 which may comprise
hard magnetic disk drives or a CD-ROM which provide for
relatively slow access, high volume storage of information.
Access times to mass storage are slower in part because of
the need to process requests through the I/O interface and the
inherently slower nature of mass storage devices. The
memory subsystem’s DRAM, on the other hand, is semiconductor
memory which provides for fairly quick storage
and retrieval for operating applications and which may be
fed from the slower mass storage 16 through the I/O
interface 15 over the memory bus 100 to meet the requirements
of the CPU 10.
`In many computer systems, the I/O interface 15 is also a
`direct memory access (DMA) controller which manages the
`transfer of data between I/O devices and the main memory
`subsystem without requiring the CPU to perform that task.
`The I/O interface 15 may also be used for coupling the
`computer system to other computer systems over a network
`such as an Ethernet local area network.
Also shown coupled to the memory bus 100 in FIG. 1 is
a digital signal processor (DSP) 20. The logic or controller
needed to couple the DSP to the memory bus is not shown.
The DSP 20 may be an off-the-shelf DSP such as the
AT&T® DSP3210. Most DSPs include an on-board cache of
static random access memory (SRAM) which in the case of
the AT&T DSP3210 is an 8-Kbyte SRAM cache. In prior art
computer systems, because of the high bandwidth required
for real-time processing by a DSP, it has not been possible
for the DSP to run off of the computer system’s DRAM in
the way the CPU 10 utilizes it without adversely affecting
the rest of the computer system. Thus, there has been
provided a large block of SRAM 24 for use by the DSP 20.
This has allowed the memory bus to be relatively free of
DSP requests yielding the freedom to process CPU requests
and requests from I/O devices or networks through the I/O
interface without having to contend for the bandwidth
required by the DSP 20.
A significant disadvantage to the prior art computer
architecture of FIG. 1 is the requirement of a substantial
block of static random access memory 24. SRAMs are
significantly more expensive than DRAM which greatly
increases the cost of computer systems which incorporate
SRAM. One object of the emerging multimedia technologies
is to bring these technologies to the mass market in
configurations as inexpensive as possible. It is therefore one
object of the present invention to provide a computer
architecture which incorporates DSP technology for real-time
data processing without requiring the inclusion of
expensive SRAM to support the DSP.
`
`SUMMARY OF THE INVENTION
From the foregoing, it can be appreciated that a computer
architecture which provides for a digital signal processor to
operate as a co-processor with a CPU over a common
memory bus without requiring expensive static random
access memory would be greatly advantageous. Accordingly,
it is an object of the present invention to provide a
mechanism and method for arbitrating the memory bus
bandwidth to efficiently allow the use of a digital signal
processor and a CPU over a common memory bus sharing
the system’s dynamic random access memory subsystem
without requiring an expensive block of static random access
memory. It is further an object of the present invention to
provide an arbitration scheme in which, in addition to the
CPU and digital signal processor, a DMA controller and a
peripheral card expansion bus controller might also access
the memory bus with all bus masters receiving sufficient
bandwidth over the memory bus to provide for real-time
isochronous data processing.
These and other objects of the present invention are
provided by a computer architecture in which a CPU and
digital signal processor (DSP) as well as other memory bus
masters are provided on a common memory bus with the
system’s dynamic random access memory subsystem. An
application specific integrated circuit (ASIC) for arbitrating
between bus masters is provided which implements an
arbitration scheme for sharing the bandwidth of the system’s
memory bus to provide the DSP with sufficient access to the
DRAM to carry out real-time, isochronous data processing
while still allowing the system’s I/O controller to satisfy
Ethernet communication requirements over the memory bus.
The arbitration scheme of the present invention is an adaptive
arbitration scheme that varies access to the memory bus
as a function of time and depends upon what operations the
various bus masters are requesting. The next state of the bus
arbiter depends on the present state, in addition to the
amount of bus traffic and history of prior requests.
The arbitration scheme is tuned to maximize accessibility
of the memory bus to the DSP which has by far the greatest
bandwidth requirements. A cycle in the arbitration scheme is
divided into 10 time slices of which 5 slices are available for
the DSP to assert the memory bus. However, the total
amount of time in a given cycle that the DSP is allotted the
memory bus is limited by a watchdog timer in the arbitration
module so as not to choke off the other potential bus masters.
During the arbitration cycle, each of the potential bus
masters may signal a bus request over the memory bus and
a priority scheme defined by a state diagram determines
which bus master will next have access to the memory bus.
In the preferred embodiment of the present invention, to
save a pin on the CPU, the CPU does not utilize the means
for requesting the memory bus. Instead, the default state for
the arbiter is to assign the memory bus to the CPU when no
other bus master is requesting it. Further, at the end of each
arbitration cycle, the last frame is reserved for the CPU
which may then carry out any desired operation.
Finally, the present invention is implemented in such a
way that the clock rate of the DSP need not be synchronized
with the rest of the modules in the computer system. In this
way, enhancements in DSP technology will allow faster
digital signal processors to be incorporated in computer
systems without requiring total redesign of the architecture.
The arbiter ASIC effectively has a separate set of state
machines devoted to the time domain of the digital signal
processor which operate in conjunction with the state
machines of the modules operating in the time domain of the
rest of the computer system. A brief amount of time is
required for resynchronizing to the different time domains
on the memory bus which is justified by the increased
flexibility in the design of the computer architecture.

BRIEF DESCRIPTION OF THE DRAWINGS
The objects, features and advantages of the present invention
will be apparent from the following detailed description
in which:
FIG. 1 illustrates a prior art computer architecture in
which a digital signal processor requires an expensive static
random access memory block.
FIG. 2 illustrates a block diagram of a computer architecture
incorporating the present invention arbitration
scheme.
FIG. 3 illustrates a state diagram of the arbitration scheme
for assigning bandwidth slots to the various components of
the preferred embodiment computer architecture.
FIG. 4 illustrates a more detailed logical representation of
the arbiter for implementing the arbitration scheme of the
present invention.
FIG. 5 illustrates a timing diagram showing a simple state
transition sequence in accordance with a preferred embodiment
implementation of the present invention.

`DETAILED DESCRIPTION OF THE
`INVENTION
A method and apparatus are described for the arbitration
of a number of bus masters over a memory bus in a computer
system architecture supporting a digital signal processor.
Throughout this detailed description numerous details are
specified such as I/O devices and network standards, in order
to provide a thorough understanding of the present invention.
To one skilled in the art, however, it will be understood
that the present invention may be practiced without such
specific details. In other instances well-known control structures
and gate level circuits have not been shown in detail in
order not to obscure unnecessarily the present invention.
Particularly, the arbitration module for implementing the
arbitration scheme of the present invention, in the preferred
embodiment, is implemented in an application specific integrated
circuit (ASIC). With today’s manufacturing technology,
the development of ASICs generally does not require
the rendering of fully detailed circuit diagrams. The definition
of logic functionality and state diagrams allows computer
aided design techniques to design the desired integrated
circuit. Accordingly, the present invention will be
described primarily in terms of functionality to be implemented
by an ASIC. The software used for developing the
ASIC chip from the logic defined for the preferred embodiment
will be attached as an appendix. Those of ordinary skill
in the art, once given the following descriptions of the
various functions to be carried out by the present invention,
will be able to implement the necessary logic in an ASIC or
other technology without undue experimentation.
A portion of the disclosure of this patent document
contains material which is subject to copyright protection.
The copyright owner has no objection to the facsimile
reproduction by anyone of the patent disclosure, as it
appears in the Patent and Trademark Office patent files or
records, but otherwise reserves all copyrights whatsoever.
The present invention concerns a computer architecture in
which a digital signal processor (DSP) operates as a true
co-processor in the computer system. That is, an arbitration
technique and mechanism are implemented which allow a
DSP to reside on the system’s CPU or memory bus and share
the memory bus resources with the other potential bus
masters on the memory bus. The scheme is implemented
such that the DSP is provided with sufficient bandwidth to
perform real-time digital signal processing using the system’s
dynamic random access memory (DRAM) and not
requiring the incorporation of an expensive block of static
random access memory (SRAM). DRAM is far less expensive
than SRAM and the elimination of a block of SRAM
greatly reduces the cost of computer systems. The DSP is
provided with sufficient bandwidth on the memory bus to
perform real-time isochronous signal processing, yet at the
same time the arbitration scheme of the present invention
prevents the other bus masters from being starved of the
bandwidth they require in carrying out their operations.
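
As a rough illustration of how the memory bus is divided, the slot counts
given in the abstract and summary (10 slots per arbitration loop: 5 for the
DSP, 2 each for the I/O interface and the peripheral bus controller, and 1
for the CPU) can be turned into per-master slot shares. The Python sketch
below is not part of the patent. The names `SLOTS_PER_LOOP` and
`slot_shares` are hypothetical, and a slot share only approximates
bandwidth: the text does not say that all slots last equally long, and the
DSP's total time is further limited by a watchdog timer.

```python
from fractions import Fraction

# Slot counts per arbitration loop, as stated in the abstract and summary.
# The dictionary and function names are illustrative, not from the patent.
SLOTS_PER_LOOP = {
    "DSP": 5,
    "I/O interface": 2,
    "NuBus controller": 2,
    "CPU": 1,
}


def slot_shares(slots):
    """Return each bus master's fraction of the slots in one loop."""
    total = sum(slots.values())
    return {master: Fraction(count, total) for master, count in slots.items()}


if __name__ == "__main__":
    for master, share in slot_shares(SLOTS_PER_LOOP).items():
        print(f"{master}: {share} of the slots")
```

Under these assumptions the DSP owns half of the slots in a fully contended
loop, while every other bus master still has at least one slot reserved in
each pass.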
`
While the present invention arbitration scheme will be
described in terms of a particular implementation of a computer
system, it should be understood that the broad concept
of providing for a DSP on a memory bus and sharing the
system’s DRAM may be extended to other more complicated
systems, or generalized to simpler systems.
`Overview of the Present Invention Computer
`System
Referring now to FIG. 2, a preferred embodiment computer
architecture implementing the present invention is
shown. The constituents of the computer architecture are
shown coupled to a CPU or memory bus 110. The memory
bus 110 provides the signal paths for the exchanging of data
between the various elements on the memory bus. Further
provided by the memory bus are control lines for such things
as bus requests and bus granting signals and other system
level control signals. The signals required for implementing
the present invention will be described further herein. As
with the prior art computer systems, the architecture illustrated
in FIG. 2 has the CPU 10, the ROM 12 and the
system’s main memory subsystem 14 coupled to the
memory bus 110. It should be understood that in the illustration
the main memory subsystem 14 coupled to the
memory bus refers to the system’s DRAM and the controller
required for writing to and reading from the DRAM based
on a requested transaction. The CPU 10 may be considered
a potential bus master as contrasted with the memory
components 12 and 14 which are considered bus slaves. The
ROM 12 and main memory subsystem 14 will not drive the
memory bus for transactions on their own behalf.
The preferred embodiment computer architecture of FIG.
2 has a number of other potential bus masters coupled to it.
The I/O interface 30 is used for coupling the memory bus to
the I/O bus 120 for providing the computer system with a
number of I/O capabilities. The I/O interface 30 of the
present invention is also a DMA controller which controls
the transactions between the main memory system 14 and
I/O devices without requiring the resources of the CPU 10
for these transactions. The DMA controller of the preferred
embodiment implementation is described in co-pending
U.S. patent applications: Ser. No. 08/189,139, entitled “Dual
Bus Concurrent Multi-Channel Direct Memory Access Controller
and Method”, Ser. No. 08/189,132, entitled “Multiple
Register Set Direct Memory Access Channel Architecture”,
and Ser. No. 08/189,131, entitled “Direct Memory Access
Channel Architecture and Method for Reception of Ethernet
Packets”, assigned to the assignee of the present invention
and filed concurrently herewith.
The I/O bus 120 of the preferred embodiment implementation
as illustrated in FIG. 2 provides the avenue for a
number of different I/O devices to be incorporated in the
computer system. For example, the SCSI port 121 is coupled
through the I/O bus 120. SCSI stands for small computer
system interface which is one of the computer industry’s
standards for coupling I/O devices such as the hard disk
drive 122 or floppy drives or other storage media and other
I/O devices to today’s microcomputer systems. The computer
system’s serial port 125 is also coupled to the I/O bus
120. The serial port can be used for attaching a modem or
printer or other serial devices. Finally, the I/O bus 120 is
used to couple the computer system to a local area network
through the Ethernet port 128. The Ethernet port 128 is used
for attaching the computer system of the present invention to
a local area network such as Ethernet or other LAN 129.
This allows the computer system to communicate with other
computer systems and share common resources such as
community printers, etc. As will be described further herein,
the provision of an Ethernet connection to the present
invention computer system introduces constraints for the
bandwidth sharing of the memory bus.
`Another potential bus master coupled to the system bus
`110 of the preferred embodiment computer system is a
`peripheral card expansion bus controller. The peripheral
card expansion bus controller illustrated in the figure is the
NuBus controller 40. While the present illustration uses a
NuBus controller, other peripheral card expansion bus protocols
are generally known. The NuBus controller 40 is used
`to provide communication between the various memory bus
`masters and the system’s expansion bus 130, called NuBus.
`The NuBus provides a fast backplane for coupling such
`things as video controllers 131, RAM expansion cards, mass
`storage device controller cards, or other communications
`devices 132. Again, the preferred embodiment computer
`system is constrained by some of the NuBus requirements in
`the bandwidth sharing of the memory bus 110.
`Also, in FIG. 2 there is illustrated another potential bus
`master, the digital signal processor (DSP) 20. Unlike prior
`art computer systems, the present invention provides for the
`DSP 20 to reside on the system’s memory bus and operate
`from the computer system’s main memory subsystem 14. In
`implementing the present invention this greatly reduces
`system cost by eliminating the need for an expensive block
of SRAM. In the preferred embodiment implementation, the
DSP 20 is an AT&T DSP3210 which provides an internal 8K
SRAM cache. This is an off-the-shelf DSP which is highly
programmable and has a fairly well defined operating system.
The DSP can be programmed to carry out such functions
as speech processing, audio channel control, modem
emulation, image processing and the like. Many of these
functions are real-time operations and require a tremendous
amount of the memory bus bandwidth between the DSP and
the DRAM of the main memory subsystem 14. For reasons
that will be described further herein, the DSP 20 is not
constrained to operating at the same clock speed as the rest
of the components of the computer system. This will require
some resynchronizing for various operations to be
described, but provides for flexibility as newer technology
and faster DSPs are developed.
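
The text does not say how signals are resynchronized between the DSP clock
domain and the rest of the system. One widely used technique, assumed here
purely for illustration, is a two-stage (two flip-flop) synchronizer: a
signal arriving from the other clock domain is passed through two registers
clocked by the receiving domain, so it becomes visible only after a short
delay. The Python model below is a behavioral sketch; the class
`TwoStageSynchronizer` and the helper `synchronize` are hypothetical names
and are not taken from the patent or its appendix.

```python
class TwoStageSynchronizer:
    """Behavioral model of a two flip-flop synchronizer (an assumed technique)."""

    def __init__(self):
        self.stage1 = 0  # first register, samples the incoming signal
        self.stage2 = 0  # second register, drives the synchronized output

    def clock_edge(self, async_input):
        """Advance one edge of the receiving clock and return the output."""
        # Both registers update together: stage2 takes the old stage1 value.
        self.stage2, self.stage1 = self.stage1, async_input
        return self.stage2


def synchronize(samples):
    """Feed the input values seen at successive receiving-clock edges."""
    sync = TwoStageSynchronizer()
    return [sync.clock_edge(value) for value in samples]
```

In this model a change sampled at one receiving-clock edge appears at the
output on the following edge, which is the kind of brief resynchronization
delay the text refers to.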
`Finally, in FIG. 2 there is coupled to the memory bus 110
`the memory controller and arbiter (MCA) 200. In the
preferred embodiment implementation, the arbiter 200 is an
application specific integrated circuit (ASIC) for arbitrating
the memory bus 110 between the various bus masters subject
to the constraints each imposes to provide optimal bandwidth
for each, particularly the DSP which is responsible for
a significant amount of real-time signal processing. In an
`alternative embodiment, the arbiter logic could be designed
`in some other form of logic.
`
`Arbitration Constraints
Because the primary motivation of the present invention
is to incorporate a digital signal processor as a co-processor
on a computer system’s memory bus while providing it
sufficient bandwidth to utilize the system’s DRAM rather
than an expensive block of SRAM, it is necessary to talk
about the various modes of operation of the DSP, particularly
for carrying out real-time processing. As was indicated,
the DSP 20 is programmable and has an 8-Kbyte internal
SRAM cache. Ideally, software written for the DSP should
be segmented in such a way that blocks may be loaded into
the cache of the DSP allowing the DSP to run as much of the
time as possible from its internal cache. Thus, one mode in
which the DSP will utilize the memory bus 110 is to read a
large block of memory from the DRAM 14 into its internal
SRAM. This first mode of operation on the memory bus can
be referred to as a “block read”. Another mode of operation
concerns the handling of data that has already been processed
by the DSP. In many cases it will be necessary to push
that data back out to the DRAM so that some other parts of
the computer system can utilize it. Thus, the capability of
bursting data out is a second mode of operation which may
be referred to further herein as a “block write”.
Because of the nature of some software, it is not always
guaranteed that the code is going to be divisible into discrete
blocks for block read operations. There may be times when
it is necessary to repeatedly access the DRAM, effectively
supporting a scheme where the DSP executes code directly
from the DRAM. As contrasted with a block read, this might
be considered a situation of “back-to-back single reads”
from the DRAM. The occurrence of these back-to-back
operations is recognizable and will be considered by the
arbiter scheduling algorithm to be described further herein.
Finally, during the course of normal operation there may be
situations in which the DSP requires only a single piece of
data from the DRAM. These types of read operation are
referred to as “scattered-single reads”. Because the DSP will
be processing many real-time operations as well as others,
each of these modes of operation where the DSP must have
access to the memory bus must be taken into consideration
by the arbitration scheme of the present invention.
In addition to the DSP’s huge requirement for bandwidth
on the memory bus, the arbiter 200 must also contend with
constraints imposed by other resources on the memory bus.
Particularly, the I/O interface 30 provides the interconnection
of the computer system to an Ethernet local area
network. This Ethernet connection should never be choked
off because losing an Ethernet packet results in a tremendous
penalty in time for recovery. Thus, the I/O interface 30 must
be guaranteed at least a predetermined amount of time for
every arbitration cycle to support the system’s Ethernet
connection. The floppy disk drive, serial ports and audio
input/output DMA, in a similar fashion, also require some
guaranteed memory bus bandwidth. Likewise, with all other
constraints taken into account, the system’s SCSI performance
should not be allowed to degrade below an acceptable
level.
Similar to the I/O interface constraints, there are constraints
imposed by the computer system being coupled to an
expansion bus through the NuBus controller 40. Like the I/O
interface, there are certain latencies that NuBus introduces
which must be provided for thus requiring a guaranteed
amount of memory bus availability each arbitration cycle for
the NuBus controller. Finally, though the CPU does very
little in the way of real-time operation in a system having a
DSP, it still must have some access every arbitration cycle
to the main memory subsystem 14.
[...] times require the memory bus. When this occurs, the
resource propagates a bus request signal over the memory
bus 110 to the arbiter 200. A figure referred to further herein
will indicate more clearly what signals are required for the
various bus requests. When the memory bus is available for
assignment, the arbiter 200 will issue a bus grant signal to
one of the resources according to a priority scheme which
takes into consideration all of the constraints described
above as well as various previous states of the system and
the present state of the system.
Referring now to FIG. 3, a state diagram is illustrated
which is used to derive the logic of the arbiter ASIC 200.
This state diagram is the preferred embodiment arbitration
scheme for the implemented computer system described
with respect to FIG. 2 which has, in addition to a CPU and
a DSP on the memory bus, an I/O interface for coupling the
system to a local area network and other I/O devices as well
as a NuBus interface with each imposing considerable
constraints. The state diagram illustrated in FIG. 3 is instrumental
in developing the VERILOG code or equivalent in
the production of the arbiter ASIC 200. The code for the
preferred embodiment arbitration scheme is set forth in
Appendix A.
The bus arbiter 200 follows the priority scheme illustrated
by the state diagram of FIG. 3. The arbiter receives the bus
request signals from the potential bus masters, and based on
the present state and a specified priority order asserts an
appropriate bus grant signal. In the preferred embodiment
implementation, to save a pin on the CPU, the CPU 10 does
not issue bus request signals. Instead, the state of the
memory bus assignment defaults to the CPU and remains
parked on the CPU until other resources request the memory
bus. The CPU is also provided with one time slot in the
priority scheme to be described in which it is granted the
memory bus.
Each circle in the arbiter state diagram of FIG. 3 represents
a present state memory bus assignment and the arrows
represent the state transition from a current bus master to the
next bus master when the specified conditions are met.
Because the DSP has the largest bus bandwidth requirement,
the system is optimized to meet its need and support its
real-time operations. In order to optimize this requirement
and still give a reasonable bandwidth to the other bus
masters, the DSP is assigned 5 time slots among a total of 10
in the arbitration loop. The arbitration loop starts at DSP_1
state 301 and ends at the CPU state 310. The CPU state 310
is the point of reference which also designates the completion
of the entire arbitration loop. The NuBus controller 40
and the I/O interface 30 each own 2 time slots in the
arbitration loop while the CPU 10 owns only 1 slot. The
sequence through the arbitration loop is the following:
DSP_1->NuB_1->DSP_2->IOB_1->DSP_3->NuB_2->DSP_4->IOB_2->DSP_5->CPU.
This will be the
order of memory bus ownership if all bus masters are
requesting the bus in every state, though with some restrictions
to be described further herein.
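
The slot order quoted above can be sketched as a small simulation. This is
a simplified reading, not the patent's state machine: it assumes that each
master's request stays constant for a whole pass and that a slot whose
owner is not requesting is simply skipped, and it ignores the restrictions
that the text defers to later sections. Because the CPU issues no request
signal, the sketch always grants the CPU its own slot, which also models
the bus staying parked on the CPU when nothing else is requested. The names
`ARBITRATION_LOOP`, `slot_owner` and `grants_for_one_pass` are hypothetical.

```python
# Slot order as given in the text; the labels are the patent's state names.
ARBITRATION_LOOP = [
    "DSP_1", "NuB_1", "DSP_2", "IOB_1", "DSP_3",
    "NuB_2", "DSP_4", "IOB_2", "DSP_5", "CPU",
]


def slot_owner(slot):
    """Map a state label such as 'DSP_3' to its bus master, 'DSP'."""
    return slot.split("_")[0]


def grants_for_one_pass(requesting):
    """Return the bus masters granted the bus during one pass of the loop.

    `requesting` is the set of masters ('DSP', 'NuB', 'IOB') asserting a
    bus request. Assumption: a slot whose owner is not requesting is
    skipped. The CPU issues no request and always receives its own slot.
    """
    grants = []
    for slot in ARBITRATION_LOOP:
        master = slot_owner(slot)
        if master == "CPU" or master in requesting:
            grants.append(master)
    return grants


if __name__ == "__main__":
    print(grants_for_one_pass({"DSP", "NuB", "IOB"}))
    print(grants_for_one_pass(set()))
```

With every master requesting, one pass reproduces the ten-slot sequence
from the text; with no requests at all, only the CPU holds the bus.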
`The state d