`
US005712664A
`
United States Patent [19]
Reddy

[11] Patent Number: 5,712,664
[45] Date of Patent: Jan. 27, 1998
`
`[54] SHARED MEMORY GRAPHICS
`ACCELERATOR SYSTEM
`
`
[75] Inventor: Chitranjan N. Reddy, Milpitas, Calif.
`
`
`[73] Assignee: Alliance Semiconductor Corporation,
`San Jose, Calif.
`
`
[21] Appl. No.: 136,553

[22] Filed: Oct. 14, 1993

[51] Int. Cl.6: G09G 5/00
[52] U.S. Cl.: 345/200; 345/201; 395/508; 395/519
[58] Field of Search: 345/133, 132, 189-191, 204, 203, 200, 201;
395/162, 164, 166, 185, 375, 129, 725, 141, 508, 519;
348/584, 441, 536; 365/200
`
`
Primary Examiner: Steven Saras
Attorney, Agent, or Firm: Limbach & Limbach L.L.P.
`
`[57]
`
`ABSTRACT
`
A shared memory graphics accelerator system that provides
graphics display data to a display includes a central processing
unit for generating graphics display data and graphics commands
for processing the display data. An integrated graphics display
memory element includes both a graphics accelerator connected to
receive display data and graphics commands from the central
processing unit and an on-chip frame buffer memory element. The
on-chip frame buffer memory element is connected to receive
display data from the graphics accelerator via a display data
distribution bus. An off-chip frame buffer memory element is also
connected to the display data distribution bus to receive display
data from the graphics accelerator. The graphics accelerator
selectively distributes display data to the on-chip frame buffer
memory element and to the off-chip frame buffer memory element
based on predetermined display data distribution criteria.
`
`3 Claims, 2 Drawing Sheets
`
`
`
`
`Page 1 of 7
`
`HTC-LG-SAMSUNG EXHIBIT 1026
`
`
`
U.S. Patent    Jan. 27, 1998    Sheet 1 of 2    5,712,664

[Drawing sheet 1 of 2: FIG. 1 (PRIOR ART), showing the graphics
accelerator (GXX), the DRAM frame buffer, and elements 14 and 22;
FIG. 2.]
`
`
`
U.S. Patent    Jan. 27, 1998    Sheet 2 of 2    5,712,664

[Drawing sheet 2 of 2: FIG. 3.]
`
`
`
`SHARED MEMORY GRAPHICS
`ACCELERATOR SYSTEM
`
BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`
The present invention relates to the visual display of a
computer graphics image and, in particular, to a graphics
display system that integrates both a graphics accelerator
engine and a portion of the graphics frame buffer memory on
the same monolithic chip.
`2. Discussion of the Prior Art
`
A video graphics system typically uses either VRAM or
DRAM frame buffers to store the pixel display data utilized
in displaying a graphics or video image on a display element
such as a CRT.
`
A VRAM frame buffer includes two ports that are available
for the pixel data to flow from the memory to the
display. One port is known as the serial port and is totally
dedicated to refreshing the display screen image. The other
port is a random access port that is used for receiving pixel
updates generated by a CPU or a graphics accelerator
engine. A typical VRAM arrangement allocates 99% of the
available bandwidth to the random port, thereby allowing the
system to display fast moving objects and to support large
display CRTs.
In a DRAM-based video system, however, the pixel data
updates and the screen refresh data contend for a single
frame buffer memory port. This contention reduces the
amount of bandwidth available for pixel data updates by the
CPU and the graphics engine, resulting in a lower performance
graphics display system.
Nevertheless, in most applications the DRAM solution is
preferred over the VRAM solution, despite its lower
performance, because DRAMs are cheaper than VRAMs.
FIG. 1 shows a conventional graphics display system 10
wherein a CPU 12 writes pixel display data on data bus 11
to be displayed on the CRT screen 14 through a graphics
accelerator (GXX) 16 onto a DRAM frame buffer 18 via
data bus 19. The CPU 12 also provides certain higher level
graphics command signals 20 to the graphics accelerator 16
to manipulate the display data stored in the DRAM frame
buffer 18.
`
The graphics accelerator 16 retrieves display data from
the frame buffer 18 via data bus 19 utilizing reference
address bus 21, processes the retrieved display data based on
the CPU command signals 20, and writes the new pixel data
back to the frame buffer 18.
`
The pixel data is displayed on the CRT 14 through a
random access memory digital-to-analog converter
(RAMDAC) 22 that receives the data via a data display bus
24.
`
The graphics accelerator 16 also reads display data from
the frame buffer 18 via data bus 19 and sends it to the
RAMDAC 22 via the data display bus 24 to meet the
periodic refresh requirements of the CRT display 14.
Thus, as illustrated in FIG. 1, the bandwidth of the data
bus 19 is shared by three functions: display refresh, CPU
display data update, and graphics accelerator display
manipulation. As the display size (i.e., the number of pixels
to be displayed on the CRT screen 14) increases, the display
update and display manipulation functions are reduced
because of the bandwidth limitations of the data bus 19
caused by the fixed refresh requirements of the CRT 14.
While these limitations can be addressed by increasing the
data bus width or by increasing its speed, both of these
solutions have either physical or practical limitations.
Increasing the bus width increases the silicon area and the
package pin count. Increasing the speed of the bus requires
utilization of more complex silicon process technology.
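
By way of illustration only, the bandwidth squeeze described above can be put in numbers. The following Python sketch estimates how much of a single shared frame buffer bus remains for CPU updates and accelerator manipulation once display refresh is served. The bus rate, pixel depth, refresh rate, and display sizes are assumptions chosen for the example and do not appear in this specification; blanking intervals are ignored.

```python
def refresh_bandwidth_bytes_per_s(width, height, bits_per_pixel, refresh_hz):
    """Bytes per second the frame buffer must supply just to refresh the screen."""
    return width * height * bits_per_pixel // 8 * refresh_hz


def update_share(bus_bytes_per_s, width, height, bits_per_pixel, refresh_hz):
    """Fraction of the shared bus left for CPU updates and display manipulation."""
    refresh = refresh_bandwidth_bytes_per_s(width, height, bits_per_pixel, refresh_hz)
    return max(0.0, 1.0 - refresh / bus_bytes_per_s)


# Hypothetical figures: a 100 MB/s bus, 8 bits per pixel, 60 Hz refresh.
BUS_BYTES_PER_S = 100_000_000
for w, h in [(640, 480), (1024, 768), (1280, 1024)]:
    share = update_share(BUS_BYTES_PER_S, w, h, 8, 60)
    print(f"{w}x{h}: {share:.1%} of the bus left for updates")
```

Under these assumed figures the share left for updates shrinks steadily as the pixel count grows, which is the effect attributed above to the fixed refresh requirements of the CRT.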
`
SUMMARY OF THE INVENTION
`
`
The present invention provides a graphics display system
that enhances performance by integrating a portion of the
frame buffer storage space and the graphics accelerator
engine on the same chip while at the same time maintaining
the flexibility to expand the frame buffer size as needed.
Generally, the present invention provides a shared
memory graphics accelerator system that provides display
data to a display element. The shared memory graphics
accelerator system includes a central processing unit that
generates both display data and graphics commands for
processing the display data. An integrated graphics display
memory element includes both a graphics accelerator that
receives display data and graphics commands from the
central processing unit and an on-chip frame buffer memory
element that is connected to receive display data from the
graphics accelerator via a display data distribution bus. An
off-chip frame buffer memory element is also connected to
the data distribution bus to receive display data from the
graphics accelerator. The graphics accelerator selectively
distributes the display data to the on-chip memory element
and to the off-chip memory element based on predefined
display data distribution criteria.
The above-described integrated solution increases the
performance of the graphics display system because display
data retrieval from the on-chip frame buffer is much faster
than from an external frame buffer and the DRAM timing
constraints are reduced. This integrated solution also allows
the display memory size to be expanded by adding external
memory so that large displays can be accommodated on an
as-needed basis. Also, the frame buffer space can be
distributed among several integrated solutions, thereby
increasing both the display bandwidth and the parallel
processing capability between the CRT display and the CPU.
A better understanding of the features and advantages of
the present invention will be obtained by reference to the
following detailed description and accompanying drawings
which set forth an illustrative embodiment in which the
principles of the invention are utilized.

`DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a schematic diagram illustrating a conventional
`graphics subsystem.
`FIG. 2 is a schematic diagram illustrating a shared
`memory graphics accelerator system in accordance with the
`present invention.
FIG. 3 is a schematic diagram illustrating a shared
memory graphics accelerator system in accordance with the
present invention in a distributed display arrangement.
`FIG. 4 is a schematic diagram illustrating a shared
`memory graphics accelerator system in accordance with the
`present invention but with no expansion memory.
`DETAILED DESCRIPTION OF THE
`INVENTION
`
`
The present invention addresses the data bus bandwidth
problem common to conventional DRAM-based graphics
display systems by integrating a portion of the display data
frame buffer memory space on the graphics accelerator chip,
thereby allowing simultaneous access to both on-chip
DRAM frame buffer data and off-chip DRAM frame buffer
data while maintaining the flexibility to increase the display
data memory size externally to meet a variety of CRT
display size requirements.
FIG. 2 shows a shared memory graphics accelerator
system 100 that includes a central processing unit (CPU)
102 that sends pixel display data via address/data bus 104
and graphics command signals via a control bus 106 to a
single integrated graphics display memory (IGDM) 108.
Those skilled in the art will appreciate that the bus widths
are CPU-dependent.
The integrated graphics display memory element 108
includes a graphics accelerator (GXX) 110 that receives the
pixel display data and distributes it between an on-chip
DRAM frame buffer 112 and an off-chip DRAM frame
buffer 114 via a display data distribution bus 120, using a
common address bus 115. The data distribution between
on-chip memory 112 and off-chip memory 114 is based upon
user defined criteria loaded onto the integrated graphics
display memory element 108 during power-up. This
information can be stored either in the CPU hard disk or in a
boot-up EPROM. This distribution of the pixel display data
is optimized for maximum CPU updates onto the on-chip
display buffer DRAM 112 and the off-chip DRAM 114 and,
at the same time, for supporting a maximum display size
refresh on the CRT display 116.
By splitting the display frame buffer into an on-chip
DRAM portion 112 and an off-chip DRAM portion 114, the
graphics accelerator engine 110 can double the pixel read
data to a RAMDAC 118 by simultaneously accessing
on-chip and off-chip frame buffer display data and
multiplexing it onto the distributed data bus 120 using control
signals 121. A FIFO memory 122 provides a buffer between
the RAMDAC 118, which requires continuous display data
input, and the distributed data bus 120, which is shared for
display update, display manipulate and display refresh
operations.
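
By way of illustration only, the buffering role of the FIFO memory 122 can be sketched in Python as follows. The sketch assumes a simple alternating order between words read from the on-chip DRAM 112 and the off-chip DRAM 114; the specification states only that the two sources are multiplexed onto the distributed data bus 120 under control signals 121, so the alternation, the function name `refill_fifo`, and the FIFO depth are all hypothetical.

```python
from collections import deque


def refill_fifo(fifo, on_chip_words, off_chip_words, depth):
    """Interleave words from both frame buffer portions into the FIFO.

    Stops when the FIFO holds `depth` words or both sources are drained.
    The strict alternation is an assumption made for this sketch.
    """
    sources = [iter(on_chip_words), iter(off_chip_words)]
    turn = 0
    while len(fifo) < depth and sources:
        src = sources[turn % len(sources)]
        try:
            fifo.append(next(src))
            turn += 1
        except StopIteration:
            # This source is exhausted; keep draining the other one.
            sources.remove(src)
    return fifo


fifo = refill_fifo(deque(), ["a0", "a1"], ["b0", "b1", "b2"], depth=4)
print(list(fifo))
```

In such an arrangement the RAMDAC side would pop words from the front of the FIFO at the pixel rate while the bus side refills it in bursts between update and manipulate operations.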
It is also possible for the graphics accelerator engine 110
to read on-chip DRAM 112 at a much faster rate than it can
read off-chip DRAM 114, thereby making more CPU 102
update time available for on-chip DRAM 112. This increase
in CPU update bandwidth can, for example, be translated
into a faster moving image portion which can be stored onto
the on-chip DRAM 112 and a slower moving portion which
can be stored onto the off-chip DRAM 114. Those skilled in
the art will appreciate that this distribution of the load can be
implemented many different ways between the on-chip
DRAM 112 and the off-chip DRAM 114 to meet the
performance requirements of the total graphics display
system.
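
By way of illustration only, one of the many possible distribution criteria can be sketched in Python as follows. The sketch assumes that each image portion carries an update-rate figure and a size, and it places the fastest changing portions in the on-chip DRAM 112 until that memory is full; the `Region` type, its fields, and the greedy placement rule are hypothetical and are not taken from this specification.

```python
from dataclasses import dataclass

ON_CHIP = "on-chip DRAM 112"
OFF_CHIP = "off-chip DRAM 114"


@dataclass
class Region:
    name: str
    updates_per_second: float  # hypothetical measure of how fast the portion changes
    size_bytes: int


def distribute(regions, on_chip_capacity):
    """Place the most frequently updated regions on-chip until it is full."""
    placement = {}
    free = on_chip_capacity
    for region in sorted(regions, key=lambda r: r.updates_per_second, reverse=True):
        if region.size_bytes <= free:
            placement[region.name] = ON_CHIP
            free -= region.size_bytes
        else:
            placement[region.name] = OFF_CHIP
    return placement


regions = [
    Region("desktop", 1, 500_000),
    Region("video window", 30, 300_000),
    Region("cursor", 60, 4_096),
]
print(distribute(regions, on_chip_capacity=400_000))
```

A table of this kind could be among the user defined criteria loaded at power-up, although the specification leaves the form of those criteria open.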
`
`
Those skilled in the art will also appreciate that successful
implementation of the integrated graphics display memory
element 108 described above requires that the on-chip
DRAM frame buffer 112 have substantially different
characteristics than a monolithic DRAM used for data storage.
`
`
A typical monolithic DRAM requires a 200 nsec. refresh
cycle every 15.6 µsec., which is equivalent to a 1.28%
refresh overhead. During this refresh time, no data may be
read from the DRAM; the time is used primarily for
refreshing the DRAM cell data. This refresh overhead time needs
to be constant (or as small as possible) with increasing chip
density. Unfortunately, chip power dissipation must be
increased with increasing chip density in order to maintain
constant overhead.
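
By way of illustration only, the refresh overhead quoted above is simply the refresh cycle time divided by the interval between refresh cycles. The short Python sketch below reproduces the figure, assuming the interval is 15.6 microseconds, which is the value consistent with the stated 1.28% overhead; the function name is hypothetical.

```python
def refresh_overhead(refresh_cycle_ns, refresh_interval_ns):
    """Fraction of time the DRAM is busy refreshing and cannot serve reads."""
    return refresh_cycle_ns / refresh_interval_ns


# 200 ns refresh cycle every 15.6 microseconds (15,600 ns).
monolithic = refresh_overhead(200, 15_600)
print(f"monolithic DRAM refresh overhead: {monolithic:.2%}")
```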
`
For the integrated graphics display memory element 108,
the on-chip DRAM frame buffer memory 112 is implemented
with substantially increased refresh frequency
(i.e., a refresh interval much less than 15.6 µsec.) to reduce
the on-chip power dissipation. For example, a 16 Mbit
on-chip DRAM frame buffer memory 112 could have one
200 nsec. refresh cycle every 2 µsec., which translates to a
10% refresh overhead. While this refresh overhead is a
significant portion of the total available bandwidth, with
improved on-chip DRAM access time resulting from
integration of the DRAM 112 with the graphics accelerator 110,
overall system performance is improved significantly. Those
skilled in the art will appreciate that, as more of the system
sub-blocks, such as the RAMDAC 118, are integrated with
the graphics accelerator 110 and the on-chip DRAM frame
buffer memory 112, the refresh overhead is optimized with
respect to improved on-chip DRAM access time and increased
on-chip power dissipation to provide improved total system
performance. Furthermore, increased refresh frequency permits
smaller memory storage cell capacitance, which reduces total
chip size.
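
By way of illustration only, the trade-off between a higher refresh overhead and a shorter access time can be sketched in Python as follows. The refresh figures (a 200 nsec. cycle every 2 microseconds for the on-chip DRAM, every 15.6 microseconds for a monolithic DRAM) follow the example above; the 40 nsec. and 80 nsec. access times are purely hypothetical values chosen to show the comparison and are not taken from this specification.

```python
def effective_accesses_per_us(access_time_ns, refresh_cycle_ns, refresh_interval_ns):
    """Data accesses per microsecond after subtracting time lost to refresh."""
    available_fraction = 1.0 - refresh_cycle_ns / refresh_interval_ns
    return available_fraction * 1000.0 / access_time_ns


# On-chip DRAM: 10% refresh overhead, hypothetical 40 ns access time.
on_chip = effective_accesses_per_us(40, 200, 2_000)
# Off-chip monolithic DRAM: 1.28% overhead, hypothetical 80 ns access time.
off_chip = effective_accesses_per_us(80, 200, 15_600)

print(f"on-chip:  {on_chip:.2f} accesses per microsecond")
print(f"off-chip: {off_chip:.2f} accesses per microsecond")
```

Under these assumed access times the on-chip memory still delivers more accesses per unit time despite spending roughly eight times as large a fraction of its time on refresh.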
Thus, the on-chip DRAM 112 has a substantially higher
refresh frequency than the monolithic off-chip DRAM 114.
`The integrated graphics display memory element 108
`includes means for supporting the multiple refresh fre-
`quency requirements of the on-chip DRAM 112 and the
`off-chip DRAM 114.
`In some low power applications, average power dissipa-
`tion can be reduced by increasing both the memory cell size
`and the refresh interval. Another way to reduce power is to
`increase the number of DRAM sense amplifiers, but this
`solution increases chip size.
`Those skilled in the art will appreciate that the FIG. 2
`configuration of system 100 can be implemented utilizing
`available integrated circuit technology.
FIG. 3 shows two integrated graphics display memory
elements (IGDM) 300 and 302 connected in parallel
between a display data output bus 304 and RAMDAC 306
and to CPU 307 via an address and data bus 308, without any
external memory, to display a contiguous image on the CRT
screen 310 using a frame buffer DRAM 312 on-chip to
integrated graphics display element 300 and a frame buffer
DRAM 314 on-chip to integrated graphics display element
302. Thus, the two integrated graphics display memory
elements 300 and 302 provide the total frame buffer storage
space for pixel display data to be displayed on the CRT
screen 310. Each of integrated graphics display memory
elements 300 and 302 can receive CPU instructions via the
CPU control bus 316 and can display portions of the
required image on the CRT screen 310. Also, the two
integrated graphics display memory elements 300 and 302
can communicate with each other via the control signal bus
318 and address/data path 320 to split the image or
redistribute the load among themselves without CPU
intervention, thereby increasing the total system performance.
`
One possible example of load sharing in the environment
of the FIG. 3 system could arise when one integrated
graphics display memory element works on even lines of the
CRT display while the other integrated graphics display
memory element is drawing odd lines on the CRT screen
310. Those skilled in the art will recognize that it is also
possible to subdivide the CRT screen 310 even further into
multiple small sections with each section being serviced by
a corresponding integrated graphics display memory element;
these integrated graphics display memory elements
can be cascaded to display a contiguous image on the CRT
screen 310.
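
By way of illustration only, the even/odd load sharing example can be sketched in Python as follows. The sketch assumes that scan lines are assigned round-robin, so that with two elements one element owns the even lines and the other owns the odd lines; the function names and the generalization to more than two elements are hypothetical.

```python
def owner_of_line(line, num_elements=2):
    """Index of the integrated graphics display memory element that draws this line."""
    return line % num_elements


def split_lines(height, num_elements=2):
    """Map each element index to the scan lines it services."""
    return {e: list(range(e, height, num_elements)) for e in range(num_elements)}


print(split_lines(6))
```

With `num_elements=2` the mapping corresponds to the even/odd split between elements 300 and 302; larger values give one possible way of subdividing the screen among cascaded elements.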
`
`
It is well known that, because the number of pixels on a
CRT screen is smaller than the frame buffer size due to the
aspect ratio of the CRT screen and the binary nature of the
memory increments, there are always extra bits left in the
frame buffer that are unused by the CRT display. During
power-up of either the FIG. 2 or the FIG. 3 system, the
graphics accelerator engine can check the entire frame buffer
storage space for any failed bits and then map these failed
bits onto the excess memory space available in the frame
buffer. This becomes important since, as the combined
graphics accelerator and on-chip DRAM die size increases,
the number of fully functional chips drops dramatically. The
excess space needed to repair the faulty frame buffer bits can
be allocated from the on-chip frame buffer DRAM so that
the access delay penalty occurring during the faulty bit
access can be reduced, since the on-chip DRAM is much
faster than off-chip DRAM. This failed bit feature can be
implemented utilizing techniques disclosed in the following
two co-pending and commonly-assigned applications: (1)
U.S. Ser. No. 08/041,909, filed Apr. 2, 1990 (Issue Fee has
been paid) and (2) U.S. Ser. No. 08/083,198, filed Jun. 25,
1993. Both of these applications are hereby incorporated by
reference.
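
By way of illustration only, the power-up remapping of failed locations onto the excess frame buffer space can be sketched in Python as follows. The sketch works at the granularity of byte addresses rather than individual bits, and it assumes a 1024 by 768 display at one byte per pixel held in a 1 Mbyte frame buffer; these figures, the function names, and the table-lookup approach are assumptions for the example and are not the techniques of the incorporated applications.

```python
def build_remap(failed_addresses, visible_size, total_size):
    """Map each failed visible address to a working spare address.

    Addresses in [0, visible_size) hold displayed pixels; addresses in
    [visible_size, total_size) are the excess space unused by the display.
    """
    failed = set(failed_addresses)
    good_spares = (a for a in range(visible_size, total_size) if a not in failed)
    remap = {}
    for bad in sorted(a for a in failed if a < visible_size):
        try:
            remap[bad] = next(good_spares)
        except StopIteration:
            raise ValueError("not enough working spare locations") from None
    return remap


def translate(address, remap):
    """Redirect an access to its spare location if the address is known bad."""
    return remap.get(address, address)


VISIBLE = 1024 * 768  # bytes actually displayed (assumed 8 bits per pixel)
TOTAL = 1 << 20       # assumed 1 Mbyte frame buffer
remap = build_remap([5, 100, VISIBLE], VISIBLE, TOTAL)
print(TOTAL - VISIBLE, "spare bytes;", remap)
```

Note that a failed location inside the excess space itself is skipped when spares are handed out, so that a faulty spare is never used as a replacement.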
As shown in FIG. 4, for smaller display sizes, a single
integrated graphics display memory element without any
external memory can be used initially. As the display size
requirements increase, external display memory can be
added in conjunction with the available on-chip display
memory. As described above, it is also possible to connect
multiple integrated graphics display memory elements in
parallel to meet the display size requirements and, at the
same time, to execute multiple instructions in parallel,
thereby increasing the CRT display performance.
It should be understood that various alternatives to the
embodiment of the invention described herein may be
employed in practicing the invention. It is intended that the
following claims define the scope of the invention and that
systems and methods within the scope of these claims and
their equivalents be covered thereby.
`What is claimed is:
1. An integrated graphics display memory element utilizable
in a graphics accelerator system that provides graphics
display data to a display element for display thereby,
wherein the graphics accelerator system includes a central
processing unit that generates graphics display data and
graphics commands for processing graphics display data and
an off-chip frame buffer memory element having a first
refresh frequency requirement, the integrated graphics display
memory element comprising:
a graphics accelerator that can be connected to receive
graphics display data and graphics commands from the
central processing unit via a CPU data bus and a control
signal bus, respectively;
`
a data distribution bus connected to the graphics
accelerator;
an on-chip frame buffer memory element connected to the
data distribution bus for receiving graphics display data
from the graphics accelerator;
an output data storage element connected to the data
distribution bus for receiving graphics display data
from the graphics accelerator and the on-chip frame
buffer memory element as output display data,
the off-chip frame buffer memory element being connectable
to the data distribution bus for providing graphics
display data to the output data storage element as
output display data, and
wherein the on-chip frame buffer memory element has a
second refresh frequency requirement lower than the
first refresh frequency of the off-chip frame buffer
element.
2. An integrated graphics display memory element as in
claim 1 and wherein the on-chip frame buffer memory
element has a cell size greater than the cell size of the
off-chip frame buffer memory element.
3. A shared memory graphics accelerator system that
provides graphics display data to a display element for
display thereby, the shared memory graphics accelerator
system comprising:
a central processing unit that generates graphics display
data and graphics commands for processing graphics
display data;
an integrated graphics display memory element that
includes both a graphics accelerator connected to
receive graphics display data and graphics commands
from the central processing unit and an on-chip frame
buffer memory element connected to receive graphics
display data from the graphics accelerator via a display
data distribution bus; and
an off-chip frame buffer memory element connected to
receive graphics display data from the graphics
accelerator via the data distribution bus;
wherein the graphics accelerator selectively distributes
display data to the on-chip frame buffer memory element
and to the off-chip frame buffer memory element
based on pre-defined display data distribution criteria,
and
wherein the display data distribution criteria are
pre-defined such that the graphics accelerator selectively
distributes display data corresponding to fast moving
images to the on-chip frame buffer memory element
and display data corresponding to slowing moving
images of the off-chip frame buffer memory element.
`*****
`
`
`
`
`UNITED STATES PATENT AND TRADEMARK OFFICE
`CERTIFICATE OF CORRECTION
`
PATENT NO. : 5,712,664

DATED : January 27, 1998

INVENTOR(S) : Chitranjan N. Reddy

It is certified that error appears in the above-identified patent and that said Letters Patent is hereby
corrected as shown below:
`
Claim 2, line 3, "greater" should be -- lesser --.
`
`Signed and Sealed this
`
`Seventeenth Day of November, 1998
`
Attest:

BRUCE LEHMAN

Attesting Officer

Commissioner of Patents and Trademarks
`
`
`
`
`