Reddy

US005712664A

[11] Patent Number: 5,712,664
[45] Date of Patent: Jan. 27, 1998

[54] SHARED MEMORY GRAPHICS ACCELERATOR SYSTEM

[75] Inventor: Chitranjan N. Reddy, Milpitas, Calif.

[73] Assignee: Alliance Semiconductor Corporation, San Jose, Calif.

[21] Appl. No.: 136,553

[22] Filed: Oct. 14, 1993

[51] Int. Cl.6 ........ G09G 5/00
[52] U.S. Cl. ......... 345/200; 345/201; 395/508; 395/519
[58] Field of Search .. 345/133, 132, 345/189-191, 204, 203, 200, 201;
                        395/162, 164, 166, 185, 375, 129, 725, 141, 508,
                        519; 348/584, 441, 586; 365/200
`
[56] References Cited

U.S. PATENT DOCUMENTS

4,191,956        3/1980   Groothuis ............ 345/201
4,228,528       10/1980   Cenker ............... 365/200
4,812,836        3/1989   Kurakake et al. ...... 345/201
4,816,815        3/1989   Yoshiba .............. 345/201
4,951,232        8/1990   Hannah ............... 345/189
4,956,708        9/1990   Itagaki .............. 348/441
5,031,092        7/1991   Edwards et al. ....... 395/375
5,083,294        1/1992   Okajima .............. 365/200
5,2[illegible]62 4/1993   Matsuo et al. ........ 345/133
5,258,843       11/1993   Truong ............... 348/586
5,293,540        3/1994   Trani et al. ......... 348/584
5,297,148        3/1994   Harari et al. ........ 365/200
5,303,334        4/1994   Synder et al. ........ 395/129
5,319,388        6/1994   Mattison et al. ...... 345/200
5,321,806        6/1994   Meinerth et al. ...... 395/162
5,363,500       11/1994   Takeda ............... 345/201
5,386,573        1/1995   Okamoto .............. 395/725
5,392,393        2/1995   Deering .............. 395/162
5,396,586        3/1995   Van Aken ............. 395/141
5,537,128        7/1996   Keene et al. ......... 345/200
`
FOREIGN PATENT DOCUMENTS

4-211293         8/1992   Japan.

OTHER PUBLICATIONS

TMS34010 User's Guide, Texas Instruments, pp. 1-5 through 1-7, 1986.
IBM Technical Disclosure Bulletin, 35(1A):45-46, Jun. 1992.
Tandy, "TRS-80 Color Computer Technical Reference Manual," pp. 17-21, 1981.

Primary Examiner-Steven Saras
Attorney, Agent, or Firm-Limbach & Limbach L.L.P.
[57] ABSTRACT

A shared memory graphics accelerator system that provides graphics
display data to a display includes a central processing unit for
generating graphics display data and graphics commands for processing
the display data. An integrated graphics display memory element
includes both a graphics accelerator connected to receive display data
and graphics commands from the central processing unit and an on-chip
frame buffer memory element. The on-chip frame buffer memory element is
connected to receive display data from the graphics accelerator via a
display data distribution bus. An off-chip frame buffer memory element
is also connected to the display data distribution bus to receive
display data from the graphics accelerator. The graphics accelerator
selectively distributes display data to the on-chip frame buffer memory
element and to the off-chip frame buffer memory element based on
predetermined display data distribution criteria.

3 Claims, 2 Drawing Sheets
`
[Representative drawing. Labels: CPU, IGDM. Reference numerals: 102, 104, 106, 120, 121, 122.]
`
`Page 1 of 7 ZTE EXHIBIT 1026
`
`
`
U.S. Patent        Jan. 27, 1998        Sheet 1 of 2        5,712,664

[FIG. 1 (PRIOR ART). Labels: CRT, FRAME BUFFER (DRAM). Reference numerals: 10, 14, 16, 18, 22.]

[FIG. 2. Labels: CRT, CPU, IGDM. Reference numerals: 102, 104, 106, 116, 118, 120, 121, 122.]
`
`
`
U.S. Patent        Jan. 27, 1998        Sheet 2 of 2        5,712,664

[FIG. 3. Labels: RAMDAC, CRT, IGDM (two), DRAM (two), CPU. Reference numerals: 300, 302, 304, 306, 307, 308, 310, 312, 314, 316, 318, 320.]

[FIG. 4. Labels: RAMDAC, CRT, CPU, IGDM.]
`
`
`
SHARED MEMORY GRAPHICS ACCELERATOR SYSTEM

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to the visual display of a computer
graphics image and, in particular, to a graphics display system that
integrates both a graphics accelerator engine and a portion of the
graphics frame buffer memory on the same monolithic chip.

2. Discussion of the Prior Art

A video graphics system typically uses either VRAM or DRAM frame
buffers to store the pixel display data utilized in displaying a
graphics or video image on a display element such as a CRT.

A VRAM frame buffer includes two ports that are available for the pixel
data to flow from the memory to the display. One port is known as the
serial port and is totally dedicated to refreshing the display screen
image. The other port is a random access port that is used for
receiving pixel updates generated by a CPU or a graphics accelerator
engine. A typical VRAM arrangement allocates 99% of the available
bandwidth to the random port thereby allowing the system to display
fast moving objects and to support large display CRTs.

However, in a DRAM-based video system, the pixel data updates and the
screen refresh data contend for a single frame buffer memory port. This
contention reduces the amount of bandwidth available for pixel data
updates by the CPU and the graphics engine, resulting in a lower
performance graphics display system.

However, in most applications, the DRAM solution is preferable to the
VRAM solution at the expense of lower performance because DRAMs are
cheaper than VRAMs.

FIG. 1 shows a conventional graphics display system 10 wherein a CPU 12
writes pixel display data on data bus 11 to be displayed on the CRT
screen 14 through a graphics accelerator (GXX) 16 onto a DRAM frame
buffer 18 via data bus 19. The CPU 12 also provides certain higher
level graphics command signals 20 to the graphics accelerator 16 to
manipulate the display data stored in the DRAM frame buffer 18.

The graphics accelerator 16 retrieves display data from the frame
buffer 18 via data bus 19 utilizing reference address bus 21, processes
the retrieved display data based on the CPU command signals 20 and
writes the new pixel data back to the frame buffer 18.

The pixel data is displayed on the CRT 14 through a random access
memory digital-to-analog converter (RAMDAC) 22 that receives the data
via a data display bus 24.

The graphics accelerator 16 also reads display data from the frame
buffer 18 via data bus 19 and sends it to the RAMDAC 22 via the data
display bus 24 to meet the periodic refresh requirements of the CRT
display 14.

Thus, as illustrated in FIG. 1, the bandwidth of the data bus 19 is
shared by three functions: display refresh, CPU display data update,
and graphics accelerator display manipulation. As the display size
(i.e., the number of pixels to be displayed on the CRT screen 14)
increases, the display updates and display manipulation functions are
reduced because of the bandwidth limitations of the data bus 19 caused
by the fixed refresh requirements of the CRT 14.

While these limitations can be addressed by increasing the data bus
width or by increasing its speed, both of these solutions have either
physical or practical limitations. Increasing the bus width increases
the silicon area and the package pin count. Increasing the speed of the
bus requires utilization of more complex silicon process technology.

SUMMARY OF THE INVENTION

The present invention provides a graphics display system that enhances
performance by integrating a portion of the frame buffer storage space
and the graphics accelerator engine on the same chip while at the same
time maintaining the flexibility to expand the frame buffer size as
needed.

Generally, the present invention provides a shared memory graphics
accelerator system that provides display data to a display element. The
shared memory graphics accelerator system includes a central processing
unit that generates both display data and graphics commands for
processing the display data. An integrated graphics display memory
element includes both a graphics accelerator that receives display data
and graphics commands from the central processing unit and an on-chip
frame buffer memory element that is connected to receive display data
from the graphics accelerator via a display data distribution bus. An
off-chip frame buffer memory element is also connected to the data
distribution bus to receive display data from the graphics accelerator.
The graphics accelerator selectively distributes the display data to
the on-chip memory element and to the off-chip memory element based on
predefined display data distribution criteria.

The above-described integrated solution increases the performance of
the graphics display system because display data retrieval from the
on-chip frame buffer is much faster than from an external frame buffer
and the DRAM timing constraints are reduced, thus achieving improved
system performance. This integrated solution also allows the display
memory size to be expanded by adding external memory so that large
displays can be accommodated on an as-needed basis. Also, the frame
buffer space can be distributed among several integrated solutions,
thereby increasing both the display bandwidth and the parallel
processing capability between the CRT display and the CPU.

A better understanding of the features and advantages of the present
invention will be obtained by reference to the following detailed
description and accompanying drawings which set forth an illustrative
embodiment in which the principles of the invention are utilized.

DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram illustrating a conventional graphics
subsystem.

FIG. 2 is a schematic diagram illustrating a shared memory graphics
accelerator system in accordance with the present invention.

FIG. 3 is a schematic diagram illustrating a shared memory graphics
accelerator system in accordance with the present invention in a
distributed display arrangement.

FIG. 4 is a schematic diagram illustrating a shared memory graphics
accelerator system in accordance with the present invention but with no
expansion memory.
`
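The contention on data bus 19 described in the Background (display refresh, CPU update, and accelerator manipulation sharing one bus) can be made concrete with a small calculation. The following Python sketch is illustrative only: the bus bandwidth, screen resolutions, color depth, and refresh rate are hypothetical values chosen for the example and do not come from this patent.

```python
def refresh_bandwidth(width, height, bits_per_pixel, refresh_hz):
    """Bytes per second the frame buffer must supply just to refresh the screen."""
    return width * height * bits_per_pixel // 8 * refresh_hz


def update_fraction(bus_bytes_per_s, width, height, bits_per_pixel, refresh_hz):
    """Fraction of a single shared bus left for CPU updates and accelerator work."""
    used = refresh_bandwidth(width, height, bits_per_pixel, refresh_hz)
    return max(0.0, 1.0 - used / bus_bytes_per_s)


# Hypothetical shared bus of 100 MB/s (an assumption, not a figure from the patent).
BUS = 100_000_000
for w, h in [(640, 480), (1024, 768), (1280, 1024)]:
    left = update_fraction(BUS, w, h, 8, 60)
    print(f"{w}x{h}: {left:.1%} of the bus left for updates")
```

Under these assumed numbers, the share of the bus left for updates and manipulation shrinks as the pixel count grows, which is the effect the Background attributes to the fixed refresh requirements of the CRT.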
DETAILED DESCRIPTION OF THE INVENTION

The present invention addresses the data bus bandwidth problem common
to conventional DRAM-based graphics display systems by integrating a
portion of the display data frame buffer memory space on the graphics
accelerator chip
and thereby allowing simultaneous access to both on-chip DRAM frame
buffer data and off-chip DRAM frame buffer data while maintaining the
flexibility to increase the display data memory size externally to meet
a variety of CRT display size requirements.

FIG. 2 shows a shared memory graphics accelerator system 100 that
includes a central processing unit (CPU) 102 that sends pixel display
data via address/data bus 104 and graphics command signals via a
control bus 106 to a single integrated graphics display memory (IGDM)
108. Those skilled in the art will appreciate that the bus widths are
CPU-dependent.

The integrated graphics display memory element 108 includes a graphics
accelerator (GXX) 110 that receives the pixel display data and
distributes it between an on-chip DRAM frame buffer 112 and an off-chip
DRAM frame buffer 114 via a display data distribution bus 120, using a
common address bus 115. The data distribution between on-chip memory
112 and off-chip memory 114 is based upon user defined criteria loaded
onto the integrated graphics display memory element 108 during
power-up. This information can be stored either in the CPU hard disk or
in a boot-up EPROM. This distribution of the pixel display data is
optimized for maximum CPU updates onto the on-chip display buffer DRAM
112 and the off-chip DRAM 114 and, at the same time, for supporting a
maximum display size refresh on the CRT display 116.

By splitting the display frame buffer into an on-chip DRAM portion 112
and an off-chip DRAM portion 114, the graphics accelerator engine 110
can double the pixel read data to a RAMDAC 118 by simultaneously
accessing on-chip and off-chip frame buffer display data and
multiplexing it onto the distributed data bus 120 using control signals
121. A FIFO memory 122 provides a buffer between the RAMDAC 118 which
requires continuous display data input and the distributed data bus
120, which is shared for display update, display manipulate and display
refresh operations.

It is also possible for the graphics accelerator engine 110 to read
on-chip DRAM 112 at a much faster rate than it can read off-chip DRAM
114, thereby making more CPU 102 update time available for on-chip DRAM
112. This increase in CPU update bandwidth can, for example, be
translated into a faster moving image portion which can be stored onto
the on-chip DRAM 112 and a slower moving portion which can be stored
onto the off-chip DRAM 114. Those skilled in the art will appreciate
that this distribution of the load can be implemented many different
ways between the on-chip DRAM 112 and the off-chip DRAM 114 to meet the
performance requirements of the total graphics display system.

Those skilled in the art will also appreciate that successful
implementation of the integrated graphics display memory element 108
described above requires that the on-chip DRAM frame buffer 112 have
substantially different characteristics than a monolithic DRAM used for
data storage.

A typical monolithic DRAM requires a 200 nsec. refresh cycle every 15.6
µsec., which is equivalent to a 1.28% refresh overhead. During this
refresh time, no data may be read from the DRAM; the time is used
primarily for refreshing the DRAM cell data. This refresh overhead time
needs to be constant (or as small as possible) with increasing chip
density. Unfortunately, chip power dissipation must be increased with
increasing chip density in order to maintain constant overhead.

For the integrated graphics display memory element 108, the on-chip
DRAM frame buffer memory 112 is implemented with substantially
increased refresh frequency (much less than 15.6 µsec.) to reduce the
on-chip power dissipation. For example, a 16 Mbit on-chip DRAM frame
buffer memory 112 could have one 200 nsec. refresh cycle every 2 µsec.,
which translates to a 10% refresh overhead. While this refresh overhead
is a significant portion of the total available bandwidth, with
improved on-chip DRAM access time resulting from integration of the
DRAM 112 with the graphics accelerator 110, overall system performance
is improved significantly. Those skilled in the art will appreciate
that, as more of the system sub-blocks, such as the RAMDAC 118, are
integrated with the graphics accelerator 110 and the on-chip DRAM frame
buffer memory 112, the refresh overhead is optimized with respect to
improved on-chip DRAM access time and increased on-chip power
dissipation to provide improved total system performance. Furthermore,
increased refresh frequency permits smaller memory storage cell
capacitance which reduces total chip size.

Thus, the on-chip DRAM 112 has a substantially higher refresh frequency
than the monolithic off-chip DRAM 114. The integrated graphics display
memory element 108 includes means for supporting the multiple refresh
frequency requirements of the on-chip DRAM 112 and the off-chip DRAM
114.

In some low power applications, average power dissipation can be
reduced by increasing both the memory cell size and the refresh
interval. Another way to reduce power is to increase the number of DRAM
sense amplifiers, but this solution increases chip size.

Those skilled in the art will appreciate that the FIG. 2 configuration
of system 100 can be implemented utilizing available integrated circuit
technology.

FIG. 3 shows two integrated graphics display memory elements (IGDM) 300
and 302 connected in parallel between a display data output bus 304 and
RAMDAC 306 and to CPU 307 via an address and data bus 308, without any
external memory, to display a contiguous image on the CRT screen 310
using a frame buffer DRAM 312 on-chip to integrated graphics display
element 300 and a frame buffer DRAM 314 on-chip to integrated graphics
display element 302. Thus, the two integrated graphics display memory
elements 300 and 302 provide the total frame buffer storage space for
pixel display data to be displayed on the CRT screen 310. Each of
integrated graphics display memory elements 300 and 302 can receive CPU
instructions via the CPU control bus 316 and can display portions of
the required image on the CRT screen 310. Also the two integrated
graphics display memory elements 300 and 302 can communicate with each
other via the control signal bus 318 and address/data path 320 to split
the image or redistribute the load among themselves without CPU
intervention, thereby increasing the total system performance.

One possible example of load sharing in the environment of the FIG. 3
system could arise when one integrated graphics display memory element
works on even lines of the CRT display while the other integrated
graphics display memory element is drawing odd lines on the CRT screen
310. Those skilled in the art will recognize that it is also possible
to subdivide the CRT screen 310 even further into multiple small
sections with each section being serviced by a corresponding integrated
graphics display memory element; these integrated graphics display
memory elements can be cascaded to display a contiguous image on the
CRT screen 310.
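The description of FIG. 2 says only that display data is split between the on-chip DRAM 112 and the off-chip DRAM 114 according to user-defined criteria, with faster moving image portions suited to the on-chip memory. One possible criterion can be sketched as follows. This is a hypothetical illustration, not the patent's method: the function name `distribute`, the region tuples, and the greedy "fastest first" rule are all assumptions made for the example.

```python
def distribute(regions, on_chip_capacity):
    """Place the most frequently updated regions on-chip until capacity runs out.

    regions: list of (name, size_bytes, updates_per_second) tuples (hypothetical).
    Returns (on_chip_names, off_chip_names).
    """
    on_chip, off_chip = [], []
    free = on_chip_capacity
    # Fastest-changing regions get first claim on the faster on-chip DRAM.
    for name, size, rate in sorted(regions, key=lambda r: r[2], reverse=True):
        if size <= free:
            on_chip.append(name)
            free -= size
        else:
            off_chip.append(name)
    return on_chip, off_chip


# Made-up image regions: a slow background and two faster-moving items.
regions = [("background", 600_000, 1), ("sprite", 50_000, 60), ("cursor", 1_000, 30)]
print(distribute(regions, on_chip_capacity=100_000))
```

With these made-up inputs the sprite and cursor land on-chip and the background goes off-chip. Other criteria (for example, splitting by screen position) would fit the description equally well.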
`
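The refresh overhead figures quoted above (1.28% for a typical monolithic DRAM and 10% for the example on-chip DRAM) follow from dividing the refresh cycle time by the refresh interval. A minimal Python check, using only the timings stated in the description:

```python
def refresh_overhead(cycle_ns, interval_ns):
    """Fraction of time the DRAM spends refreshing and is unavailable for data access."""
    return cycle_ns / interval_ns


monolithic = refresh_overhead(200, 15_600)  # 200 nsec. cycle every 15.6 usec.
on_chip = refresh_overhead(200, 2_000)      # 200 nsec. cycle every 2 usec.

print(f"monolithic DRAM: {monolithic:.2%}")  # 1.28%
print(f"on-chip DRAM:    {on_chip:.2%}")     # 10.00%
```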
`
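The even/odd load sharing example given for the FIG. 3 system, and its extension to more than two elements, amounts to assigning scan lines round-robin. The sketch below is an assumption-laden illustration: the function name `assign_lines` and the modulo rule for more than two elements are hypothetical, since the description leaves the finer subdivision open.

```python
def assign_lines(num_lines, num_elements=2):
    """Map each scan line to the element that draws it, round-robin by line number."""
    owners = {element: [] for element in range(num_elements)}
    for line in range(num_lines):
        owners[line % num_elements].append(line)
    return owners


owners = assign_lines(8)
print(owners[0])  # even lines: [0, 2, 4, 6]
print(owners[1])  # odd lines:  [1, 3, 5, 7]
```

Calling `assign_lines(num_lines, 4)` would model four cascaded elements, each servicing every fourth line.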
`
`
It is well known that, because the number of pixels on a CRT screen is
smaller than the frame buffer size due to the aspect ratio of the CRT
screen and the binary nature of the memory increments, there are always
extra bits left in the frame buffer that are unused by the CRT display.
During power-up of either the FIG. 2 or the FIG. 3 system, the graphics
accelerator engine can check the entire frame buffer storage space for
any failed bits and then map these failed bits onto the excess memory
space available in the frame buffer. This becomes important since, as
the combined graphics accelerator and on-chip DRAM die size increases,
the number of fully functional chips drops dramatically. The excess
space needed to repair the faulty frame buffer bits can be allocated
from the on-chip frame buffer DRAM so that the access delay penalty
occurring during the faulty bit access can be reduced, since the
on-chip DRAM is much faster than off-chip DRAM. This fail bit feature
can be implemented utilizing techniques disclosed in the following two
co-pending and commonly-assigned applications: (1) U.S. Ser. No.
08/041,909, filed Apr. 2, 1990 (Issue Fee has been paid) and (2) U.S.
Ser. No. 08/083,198, filed Jun. 25, 1993. Both of these applications
are hereby incorporated by reference.

As shown in FIG. 4, for smaller display sizes, a single integrated
graphics display memory element without any external memory can be used
initially. As the display size requirements increase, external display
memory can be added in conjunction with an on-chip display memory
availability. As described above, it is also possible to connect
multiple integrated graphics display memory elements in parallel to
meet the display size requirements and, at the same time, to execute
multiple instructions in parallel, thereby increasing the CRT display
performance.

It should be understood that various alternatives to the embodiment of
the invention described herein may be employed in practicing the
invention. It is intended that the following claims define the scope of
the invention and that structures and methods within the scope of these
claims and their equivalents be covered thereby.

What is claimed is:

1. An integrated graphics display memory element utilizable in a
graphics accelerator system that provides graphics display data to a
display element for display thereby, wherein the graphics accelerator
system includes a central processing unit that generates graphics
display data and graphics commands for processing graphics display data
and an off-chip frame buffer memory element having a first refresh
frequency requirement, the integrated graphics display memory element
comprising:

  a graphics accelerator that can be connected to receive graphics
  display data and graphics commands from the central processing unit
  via a CPU data bus and a control signal bus, respectively;

  a data distribution bus connected to the graphics accelerator;

  an on-chip frame buffer memory element connected to the data
  distribution bus for receiving graphics display data from the
  graphics accelerator;

  an output data storage element connected to the data distribution bus
  for receiving graphics display data from the graphics accelerator and
  the on-chip frame buffer memory element as output display data,

  the off-chip frame buffer memory element being connectable to the
  data distribution bus for providing graphics display data to the
  output data storage element as output display data, and

  wherein the on-chip frame buffer memory element has a second refresh
  frequency requirement lower than the first refresh frequency of the
  off-chip frame buffer element.

2. An integrated graphics display memory element as in claim 1 and
wherein the on-chip frame buffer memory element has a cell size greater
than the cell size of the off-chip frame buffer memory element.

3. A shared memory graphics accelerator system that provides graphics
display data to a display element for display thereby, the shared
memory graphics accelerator system comprising:

  a central processing unit that generates graphics display data and
  graphics commands for processing graphics display data;

  an integrated graphics display memory element that includes both a
  graphics accelerator connected to receive graphics display data and
  graphics commands from the central processing unit and an on-chip
  frame buffer memory element connected to receive graphics display
  data from the graphics accelerator via a display data distribution
  bus; and

  an off-chip frame buffer memory element connected to receive graphics
  display data from the graphics accelerator via the data distribution
  bus;

  wherein the graphics accelerator selectively distributes display data
  to the on-chip frame buffer memory element and to the off-chip frame
  buffer memory element based on pre-defined display data distribution
  criteria, and

  wherein the display data distribution criteria are pre-defined such
  that the graphics accelerator selectively distributes display data
  corresponding to fast moving images to the on-chip frame buffer
  memory element and display data corresponding to slowing moving
  images of the off-chip frame buffer memory element.
`
`* * * * *
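The power-up failed-bit mapping mentioned in the specification (checking the frame buffer for failed bits and mapping them onto excess memory not used by the display) can be sketched as a simple address remap table. This is a hypothetical illustration only: the patent defers the actual technique to the co-pending applications it incorporates by reference, and the names `build_remap` and `translate`, the cell-level granularity, and the first-free-spare policy are all assumptions.

```python
def build_remap(total_cells, visible_cells, failed):
    """Map each failed cell in the displayed region to a working spare cell.

    Cells 0..visible_cells-1 hold displayed pixels; the rest are unused excess.
    failed is the set of cell addresses found bad during the power-up check.
    """
    spares = [a for a in range(visible_cells, total_cells) if a not in failed]
    bad_visible = sorted(a for a in failed if a < visible_cells)
    if len(bad_visible) > len(spares):
        raise ValueError("not enough spare cells to repair the frame buffer")
    return dict(zip(bad_visible, spares))


def translate(address, remap):
    """Redirect an access to its spare cell if the original cell failed."""
    return remap.get(address, address)


remap = build_remap(total_cells=16, visible_cells=12, failed={3, 13})
print(remap)                                     # {3: 12}
print(translate(3, remap), translate(4, remap))  # 12 4
```

Cell 13 is itself a failed spare, so it is skipped and cell 3 is redirected to cell 12.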
`
`
`
`
UNITED STATES PATENT AND TRADEMARK OFFICE
CERTIFICATE OF CORRECTION

PATENT NO.  : 5,712,664
DATED       : January 27, 1998
INVENTOR(S) : Chitranjan N. Reddy

It is certified that error appears in the above-identified patent and
that said Letters Patent is hereby corrected as shown below:

Claim 2, line 3, "greater" should be -- lesser --.

Signed and Sealed this

Seventeenth Day of November, 1998

Attest:

Attesting Officer

BRUCE LEHMAN