US006104417A

United States Patent [19]
Nielsen et al.

[11] Patent Number: 6,104,417
[45] Date of Patent: *Aug. 15, 2000
[54] UNIFIED MEMORY COMPUTER ARCHITECTURE WITH DYNAMIC GRAPHICS MEMORY ALLOCATION

[75] Inventors: Michael J. K. Nielsen, San Jose; Zahid S. Hussain, Palo Alto, both of Calif.

[73] Assignee: Silicon Graphics, Inc., Mountain View, Calif.

[*] Notice: This patent issued on a continued prosecution application filed under 37 CFR 1.53(d), and is subject to the twenty year patent term provisions of 35 U.S.C. 154(a)(2).

[21] Appl. No.: 08/713,779

[22] Filed: Sep. 13, 1996
[51] Int. Cl.7 .................................................. G06F 13/16
[52] U.S. Cl. .............................. 345/521; 345/511; 345/503
[58] Field of Search ..................... 345/521, 511, 503, 507, 512
[56] References Cited

U.S. PATENT DOCUMENTS

5,450,542  9/1995  Lehman et al. ................. 345/512
5,640,543  6/1997  Farrell et al. ................ 345/512

Primary Examiner—Kee M. Tung
Attorney, Agent, or Firm—Wagner, Murabito & Hao
`
[57] ABSTRACT

A computer system provides dynamic memory allocation for graphics. The computer system includes a memory controller, a unified system memory, and memory clients each having access to the unified system memory via the memory controller. Memory clients can include a graphics rendering engine, a CPU, an image processor, a data compression/expansion device, an input/output device, and a graphics back end device. The computer system provides read/write access, through the memory controller, to the unified system memory, for each of the memory clients. Translation hardware is included for mapping virtual addresses of pixel buffers to physical memory locations in the unified system memory. Pixel buffers are dynamically allocated as tiles of physically contiguous memory. Translation hardware is implemented in each of the computational devices which are included as memory clients in the computer system, including primarily the rendering engine.

29 Claims, 13 Drawing Sheets
`
[Front-page drawing: unified system memory coupled through the memory controller to the graphics rendering engine, graphics back end, input/output, image processor, and data compression/expansion device.]

Volkswagen 1003
U.S. Patent          Aug. 15, 2000          Sheet 1 of 13          6,104,417

[FIG. 1 (PRIOR ART), Sheet 1 of 13: CPU 106, graphics processor 112, and image processor 116, each coupled through its own controller (main memory controller 110, graphics memory controller 114, IP memory controller 118) to a separate memory (main memory 102, dedicated graphics memory 104, dedicated IP memory 105).]
[FIG. 2A, Sheet 2 of 13: unified system memory coupled through the memory controller to the graphics rendering engine, graphics back end, input/output, image processor, and data compression/expansion device.]
[FIG. 2B, Sheet 3 of 13: graphics rendering and memory controller IC, with the rendering engine integrated, coupled to the graphics back end (64 bits @ 100 MHz), to input/output (133 MHz), and to the unified system memory (USM) (133 MHz).]
[FIG. 2C, Sheet 4 of 13: internals of the graphics rendering and memory controller IC: CPU/IPCE interface, GBE interface, input/output interface, and memory controller.]
[FIG. 3A, Sheet 5 of 13: an exemplary tile 300, 512 bytes wide. FIG. 3B: an exemplary pixel buffer 302 composed of tiles 300.]
[FIG. 3C, Sheet 6 of 13: address translation scheme: a GEN TLB index selects among frame buffer TLBs A, B, and C (256 x 16 bits each), a texture TLB (112 x 16 bits), a CID TLB (16 x 16 bits), and linear A and B TLBs (32 x 32 bits each); a GEN offset completes the address.]
[FIG. 4, Sheet 7 of 13: memory controller block diagram: a command pipe driving CS_N, RAS_N, CAS_N, WE_N, Mem_Addr, Memmask_out, and ECCmask; and a data pipe with ECC generate/correct between Memdata2mem_in/out, Memdata2client_in, and Memmask_in, with a decode stall signal ('1' hold, '0' enable).]
[FIG. 5, Sheet 8 of 13: timing of memory client requests: Clk, Clientreq.valid, Clientreq.adr, Clientreq.cmd, Clientreq.msg, Clientreq.ecc, and Clientres.gnt; a request is latched into the queue.]
[FIG. 6, Sheet 9 of 13: timing of memory client write data: Clientres.wrrdy, Clientres.oe, Clientres.wrmsg, Memdata2mem_in (Data0 ... DataN), Memmask_in (Mask0 ... MaskN). FIG. 7: timing of memory client read data: Clientres.rdrdy, Clientres.rdmsg, Memdata2client_out (Data0 ... DataN).]
[FIG. 8, Sheet 10 of 13: timing of a write to a new page: We_n, Mem_addr, Memdata2mem_out, Memmask_out, Ecc_out, Eccmask; precharge, activate, then write.]
[FIG. 9, Sheet 11 of 13: timing of a read to a new page: Cs_n, Ras_n, Cas_n, We_n, Mem_addr, Memdata2mem_out, Memmask_out, Ecc_out, Eccmask; precharge.]
[FIG. 10, Sheet 12 of 13: address bit mappings for 16 Mbit and 64 Mbit banks: the address from the CPU (bits 29:0) and the internal address (bits 24:0), each split into row address, column address, EBS, and IBS fields. Key: IBS - Internal Bank Select; EBS - External Bank Select.]
[FIG. 11, Sheet 13 of 13: flow diagram for bank state machines, with transitions such as "RR or RW and Tras = 0". Key: PR - Page Read; PW - Page Write; RR - Random Read; RW - Random Write.]
UNIFIED MEMORY COMPUTER ARCHITECTURE WITH DYNAMIC GRAPHICS MEMORY ALLOCATION
BACKGROUND OF THE INVENTION

The present invention relates to the field of computer systems. Specifically, the present invention relates to a computer system architecture including dynamic memory allocation of pixel buffers for graphics and image processing.
BACKGROUND OF THE INVENTION
Typical prior art computer systems often rely on peripheral processors and dedicated peripheral memory units to perform various computational operations. For example, peripheral graphics display processors are used to render graphics images (synthesis) and peripheral image processors are used to perform image processing (analysis). In typical prior art computer systems, CPU main memory is separate from peripheral memory units which can be dedicated to graphics rendering or image processing or other computational functions.

With reference to Prior Art FIG. 1, a prior art computer graphics system 100 is shown. The prior art computer graphics system 100 includes three separate memory units: a main memory 102, a dedicated graphics memory 104, and a dedicated image processing memory (image processor memory) 105. Main memory 102 provides fast access to data for a CPU 106 and an input/output device 108. The CPU 106 and input/output device 108 are connected to main memory 102 via a main memory controller 110. Dedicated graphics memory 104 provides fast access to graphics data for a graphics processor 112 via a graphics memory controller 114. Dedicated image processor memory 105 provides fast access to buffers of data used by an image processor 116 via an image processor memory controller 118. In the prior art computer graphics system 100, CPU 106 has read/write access to main memory 102 but not to dedicated graphics memory 104 or dedicated image processor memory 105. Likewise, the image processor 116 has read/write access to dedicated image processor memory 105, but not to main memory 102 or dedicated graphics memory 104. Similarly, graphics processor 112 has read/write access to dedicated graphics memory 104 but not to main memory 102 or dedicated image processor memory 105.

Certain computer system applications require that data, stored in main memory 102 or in one of the dedicated memory units 104, 105, be operated upon by a processor other than the processor which has access to the memory unit in which the desired data is stored. Whenever data stored in one particular memory unit is to be processed by a designated processor other than the processor which has access to that particular memory unit, the data must be transferred to a memory unit to which the designated processor has access. For example, certain image processing applications require that data, stored in main memory 102 or dedicated graphics memory 104, be processed by the image processor 116. Image processing is defined as any function(s) that apply to two dimensional blocks of pixels. These pixels may be in the format of file system images, fields, or frames of video entering the prior art computer system 100 through video ports, mass storage devices such as CD-ROMs, fixed-disk subsystems, and Local or Wide Area Network ports. In order to enable image processor 116 to access data stored in main memory 102 or in dedicated graphics memory 104, the data must be transferred or copied to dedicated image processor memory 105.
One problem with the prior art computer graphics system 100 is the cost of high performance peripheral dedicated memory systems such as the dedicated graphics memory unit 104 and dedicated image processor memory 105. Another problem with the prior art computer graphics system 100 is the cost of high performance interconnects for multiple memory systems. Another problem with the prior art computer graphics system 100 is that the above discussed transfers of data between memory units require time and processing resources.

Thus, what is needed is a computer system architecture with a single unified memory system which can be shared by multiple processors in the computer system without transferring data between multiple dedicated memory units.
SUMMARY OF THE INVENTION

The present invention pertains to a computer system providing dynamic memory allocation for graphics. The computer system includes a memory controller, a unified system memory, and memory clients each having access to the system memory via the memory controller. Memory clients can include a graphics rendering engine, a central processing unit (CPU), an image processor, a data compression/expansion device, an input/output device, and a graphics back end device. In a preferred embodiment, the rendering engine and the memory controller are implemented on a first integrated circuit (first IC) and the image processor and the data compression/expansion are implemented on a second IC. The computer system provides read/write access to the unified system memory, through the memory controller, for each of the memory clients. Translation hardware is included for mapping virtual addresses of pixel buffers to physical memory locations in the unified system memory. Pixel buffers are dynamically allocated as tiles of physically contiguous memory. Translation hardware, for mapping the virtual addresses of pixel buffers to physical memory locations in the unified system memory, is implemented in each of the computational devices which are included as memory clients in the computer system.

In a preferred embodiment, the unified system memory is implemented using synchronous DRAM. Also in the preferred embodiment, tiles are comprised of 64 kilobytes of physically contiguous memory arranged as 128 rows of 128 pixels wherein each pixel is a 4 byte pixel. However, the present invention is also well suited to using tiles of other sizes. Also in the preferred embodiment, the dynamically allocated pixel buffers are comprised of n² tiles where n is an integer.

The computer system of the present invention provides functional advantages for graphical display and image processing. There are no dedicated memory units in the computer system of the present invention aside from the unified system memory. Therefore, it is not necessary to transfer data from one dedicated memory unit to another when a peripheral processor is called upon to process data generated by the CPU or by another peripheral device.
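By way of illustration only (this sketch is not part of the original disclosure), the preferred tile geometry above can be checked arithmetically: a 64 kilobyte tile that is 128 rows tall has a row width, in pixels, determined by the pixel depth.

```c
#include <assert.h>

/* Illustrative check of the tile geometry described above: a tile is
 * 64 KB of physically contiguous memory, 128 rows tall; the row width
 * in pixels depends on the number of bytes per pixel. */
enum { TILE_BYTES = 64 * 1024, TILE_ROWS = 128 };

static int tile_width_pixels(int bytes_per_pixel)
{
    return TILE_BYTES / (TILE_ROWS * bytes_per_pixel);
}
```

For 4 byte pixels this gives the 128x128 tile of the preferred embodiment; 2 byte and 1 byte pixels give the 256x128 and 512x128 arrangements described later in the detailed description.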
BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

Prior Art FIG. 1 is a circuit block diagram of a typical prior art computer system including peripheral processors and associated dedicated memory units.

FIG. 2A is a circuit block diagram of an exemplary unified system memory computer architecture according to the present invention.

FIG. 2B is an internal circuit block diagram of a graphics rendering and memory controller IC including a memory controller (MC) and a graphics rendering engine integrated therein.

FIG. 2C is an internal circuit block diagram of the graphics rendering and memory controller IC of FIG. 2B.

FIG. 3A is an illustration of an exemplary tile for dynamic allocation of pixel buffers according to the present invention.

FIG. 3B is an illustration of an exemplary pixel buffer comprised of n² tiles according to the present invention.

FIG. 3C is a block diagram of an address translation scheme according to the present invention.

FIG. 4 is a block diagram of a memory controller according to the present invention.

FIG. 5 is a timing diagram for memory client requests issued to the unified system memory according to the present invention.

FIG. 6 is a timing diagram for memory client write data according to the present invention.

FIG. 7 is a timing diagram for memory client read data according to the present invention.

FIG. 8 is a timing diagram for an exemplary write to a new page performed by the unified system memory according to the present invention.

FIG. 9 is a timing diagram for an exemplary read to a new page performed by the unified system memory according to the present invention.

FIG. 10 shows external banks of the memory controller according to the present invention.

FIG. 11 shows a flow diagram for bank state machines according to the present invention.
DETAILED DESCRIPTION OF THE INVENTION

In the following detailed description of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be obvious to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the present invention.

Reference will now be made in detail to the preferred embodiments of the present invention, a computer system architecture having dynamic memory allocation for graphics, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with the preferred embodiments, it will be understood that they are not intended to limit the invention to these embodiments. On the contrary, the invention is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be obvious to one of ordinary skill in the art that the present invention may be practiced without these specific details. In other instances, well known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the present invention.

With reference to FIG. 2A, a computer system 200, according to the present invention, is shown. Computer system 200 includes a unified system memory 202 which is shared by various memory system clients including a CPU 206, a graphics rendering engine 208, an input/output IC 210, a graphics back end IC 212, an image processor 214, a data compression/expansion device 215, and a memory controller 204.

With reference to FIG. 2B, an exemplary computer system 201, according to the present invention, is shown. Computer system 201 includes the unified system memory 202 which is shared by various memory system clients including the CPU 206, the input/output IC 210, the graphics back end IC 212, an image processing and compression and expansion IC 216, and a graphics rendering and memory controller IC 218. The image processing and compression and expansion IC 216 includes the image processor 214 and a data compression and expansion unit 215. GRMC IC 218 includes the graphics rendering engine (rendering engine) 208 and the memory controller 204 integrated therein. The graphics rendering and memory controller IC 218 is coupled to unified system memory 202 via a high bandwidth memory data bus (HBWMD BUS) 225. In a preferred embodiment of the present invention, HBWMD BUS 225 includes a demultiplexer (SD-MUX) 220, a first BUS 222 coupled between the graphics rendering and memory controller IC 218 and SD-MUX 220, and a second bus 224 coupled between SD-MUX 220 and unified system memory 202. In the preferred embodiment of the present invention, BUS 222 includes 144 lines cycled at 133 MHz and BUS 224 includes 288 lines cycled at 66 MHz. SD-MUX 220 demultiplexes the 144 lines of BUS 222, which are cycled at 133 MHz, to double the number of lines, 288, of BUS 224, which are cycled at half the frequency, 66 MHz. CPU 206 is coupled to the graphics rendering and memory controller IC 218 by a third bus 226. In the preferred embodiment of the present invention, BUS 226 is 64 bits wide and carries signals cycled at 100 MHz. The image processing and compression and expansion IC 216 is coupled to BUS 226 by a bus 228. In the preferred embodiment of the present invention, BUS 228 is 64 bits wide and carries signals cycled at 100 MHz. The graphics back end IC 212 is coupled to the graphics rendering and memory controller IC 218 by a fourth bus 230. In the preferred embodiment of the present invention, BUS 230 is 64 bits wide and carries signals cycled at 133 MHz. The input/output IC 210 is coupled to the graphics rendering and memory controller IC 218 by a fifth bus 232. In the preferred embodiment of the present invention, BUS 232 is 32 bits wide and carries signals cycled at 133 MHz.

The input/output IC 210 of FIG. 2A contains all of the input/output interfaces including: keyboard & mouse, interval timers, serial, parallel, I2C, audio, video in & out, and fast ethernet. The input/output IC 210 also contains an interface to an external 64-bit PCI expansion bus, BUS 231, that supports five masters (two SCSI controllers and three expansion slots).

With reference to FIG. 2C, an internal circuit block diagram is shown of the graphics rendering and memory controller IC 218 according to an embodiment of the present invention. As previously mentioned, rendering engine 208 and memory controller 204 are integrated within the graphics rendering and memory controller IC 218. The graphics rendering and memory controller IC 218 also includes a CPU/IPCE interface 238, an input/output interface 240, and a GBE interface 236.

With reference to FIGS. 2A and 2B, GBE interface 232 buffers and transfers display data from unified system memory 202 to the graphics back end IC 212 in 16x32-byte
bursts. GBE interface 232 buffers and transfers video capture data from the graphics back end IC 212 to unified system memory 202 in 16x32-byte bursts. GBE interface 232 issues GBE interrupts to CPU/IPCE interface 234. BUS 228, shown in both FIG. 2A and FIG. 2B, couples GBE interface 232 to the graphics back end IC 212 (FIG. 2A). The input/output interface 236 buffers and transfers data from unified system memory 202 to the input/output IC 210 in 8x32-byte bursts. The input/output interface 236 buffers and transfers data from the input/output IC 210 to unified system memory 202 in 8x32-byte bursts. The input/output interface 236 issues the input/output IC interrupts to CPU/IPCE interface 234. BUS 230, shown in both FIG. 2A and FIG. 2B, couples the input/output interface 236 to the input/output IC 210 (FIG. 2A). A bus, BUS 224, provides coupling between CPU/IPCE interface 234 and CPU 206 and the image processing and compression and expansion IC 216.
With reference to FIG. 2A, the memory controller 204 is the interface between memory system clients (CPU 206, rendering engine 208, input/output IC 210, graphics back end IC 212, image processor 214, and data compression/expansion device 215) and the unified system memory 202. As previously mentioned, the memory controller 204 is coupled to unified system memory 202 via HBWMD BUS 225, which allows fast transfer of large amounts of data to and from unified system memory 202. Memory clients make read and write requests to unified system memory 202 through the memory controller 204. The memory controller 204 converts requests into the appropriate control sequences and passes data between memory clients and unified system memory 202. In the preferred embodiment of the present invention, the memory controller 204 contains two pipeline structures, one for commands and another for data. The request pipe has three stages: arbitration, decode, and issue/state machine. The data pipe has only one stage, ECC. Requests and data flow through the pipes in the following manner. Clients place their requests in a queue. The arbitration logic looks at all of the requests at the top of the client queues and decides which request to start through the pipe. From the arbitration stage, the request flows to the decode stage. During the decode stage, information about the request is collected and passed on to an issue/state machine stage.
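The queue-then-arbitrate-then-decode flow described above can be modeled in software for illustration. This C sketch is not the patented hardware: the round-robin arbitration policy, queue depth, and request fields are this sketch's own assumptions (the command encodings 1/3/4 follow Table 1).

```c
#include <stddef.h>

/* Illustrative software model of the request pipe's front end.
 * Queue depth, client count, and round-robin policy are assumptions. */
enum { NUM_CLIENTS = 6, QUEUE_DEPTH = 4 };
enum cmd { CMD_READ = 1, CMD_WRITE = 3, CMD_RMW = 4 };

struct request { enum cmd cmd; unsigned adr; unsigned msg; };

struct client_queue {
    struct request q[QUEUE_DEPTH];
    int head, count;
};

/* Arbitration stage: scan client queues round-robin starting after the
 * last granted client; return the winning client id, or -1 if idle. */
static int arbitrate(struct client_queue cq[], int last_grant)
{
    for (int i = 1; i <= NUM_CLIENTS; i++) {
        int c = (last_grant + i) % NUM_CLIENTS;
        if (cq[c].count > 0)
            return c;
    }
    return -1;
}

/* Pop the winning request; it would then flow to the decode and
 * issue/state machine stages. */
static struct request dequeue(struct client_queue *cq)
{
    struct request r = cq->q[cq->head];
    cq->head = (cq->head + 1) % QUEUE_DEPTH;
    cq->count--;
    return r;
}
```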
With reference to FIG. 2A, the rendering engine 208 is a 2-D and 3-D graphics coprocessor which can accelerate rasterization. In a preferred embodiment of the present invention, the rendering engine 208 is also cycled at 66 MHz and operates synchronously to the unified system memory 202. The rendering engine 208 receives rendering parameters from the CPU 206 and renders directly to frame buffers stored in the unified system memory 202 (FIG. 2A). The rendering engine 208 issues memory access requests to the memory controller 204. Since the rendering engine 208 shares the unified system memory 202 with other memory clients, the performance of the rendering engine 208 will vary as a function of the load on the unified system memory 202. The rendering engine 208 is logically partitioned into four major functional units: a host interface, a pixel pipeline, a memory transfer engine, and a memory request unit. The host interface controls reading and writing from the host to programming interface registers. The pixel pipeline implements a rasterization and rendering pipeline to a frame buffer. The memory transfer engine performs memory bandwidth byte aligned clears and copies on both linear buffers and frame buffers. The memory request unit arbitrates between requests from the pixel pipeline and queues up memory requests to be issued to the memory controller 204.
The computer system 200 includes dynamic memory allocation of virtual pixel buffers in the unified system memory 202. Pixel buffers include frame buffers, texture maps, video maps, image buffers, etc. Each pixel buffer can include multiple color buffers, a depth buffer, and a stencil buffer. In the present invention, pixel buffers are allocated in units of contiguous memory called tiles, and address translation buffers are provided for dynamic allocation of pixel buffers.
With reference to FIG. 3A, an illustration is shown of an exemplary tile 300 for dynamic allocation of pixel buffers according to the present invention. In a preferred embodiment of the present invention, each tile 300 includes 64 kilobytes of physically contiguous memory. A 64 kilobyte tile size can be comprised of 128x128 pixels for 32 bit pixels, 256x128 pixels for 16 bit pixels, or 512x128 pixels for 8 bit pixels. In the present invention, tiles begin on 64 kilobyte aligned addresses. An integer number of tiles can be allocated for each pixel buffer. For example, a 256x256 pixel buffer would require four (128x128) pixel tiles.
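As an illustrative sketch (not part of the original disclosure), the number of 128x128 tiles needed for a given buffer follows from rounding each dimension up to a whole number of tiles:

```c
#include <assert.h>

/* Illustrative helper: how many 128x128-pixel tiles cover a
 * width x height pixel buffer (32-bit pixels assumed). */
enum { TILE_DIM = 128 };

static int tiles_needed(int width_px, int height_px)
{
    int tx = (width_px  + TILE_DIM - 1) / TILE_DIM;  /* tiles across */
    int ty = (height_px + TILE_DIM - 1) / TILE_DIM;  /* tiles down   */
    return tx * ty;
}
```

A 256x256 buffer needs four tiles, and a 1024x1024 buffer needs the 64 tiles cited in the clipping example below.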
With reference to FIG. 3B, an illustration is shown of an exemplary pixel buffer 302 according to the present invention. In the computer system 200 of the present invention, translation hardware maps virtual addresses of pixel buffers 302 to physical memory locations in unified system memory 202. Each of the computational units of the computer system 200 (image processing and compression and expansion IC 216, graphics back end IC 212, the input/output IC 210, and rendering engine 208) includes translation hardware for mapping virtual addresses of pixel buffers 302 to physical memory locations in unified system memory 202. Each pixel buffer 302 is partitioned into n² tiles 300, where n is an integer. In a preferred embodiment of the present invention, n=4.
The rendering engine 208 supports a frame buffer address translation buffer (TLB) to translate frame buffer (x,y) addresses into physical memory addresses. This TLB is loaded by CPU 206 with the base physical memory addresses of the tiles which compose a color buffer and the stencil-depth buffer of a frame buffer. In a preferred embodiment of the present invention, the frame buffer TLB has enough entries to hold the tile base physical memory addresses of a 2048x2048 pixel color buffer and a 2048x2048 pixel stencil-depth buffer. Therefore, the TLB has 256 entries for color buffer tiles and 256 entries for stencil-depth buffer tiles.
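By way of illustration only, such an (x,y) lookup can be sketched in C. The assumption that each 16-bit TLB entry holds a tile's 64 KB-aligned base address shifted right by 16 bits is this sketch's own; it is consistent with 64 KB-aligned tiles and the 256-entry, 16-bit-wide frame buffer TLBs of FIG. 3C, but the patent text does not spell out the entry format.

```c
#include <stdint.h>

/* Illustrative (x,y) -> physical address lookup for a 2048x2048 buffer
 * of 4-byte pixels covered by 256 tiles of 128x128 pixels each. Each
 * 16-bit TLB entry is assumed to hold a tile base address >> 16
 * (valid because tiles are 64 KB aligned). */
enum { TILE_DIM = 128, TILES_PER_ROW = 16, BYTES_PER_PIXEL = 4 };

static uint32_t translate_xy(const uint16_t tlb[256], int x, int y)
{
    int tile_index = (y / TILE_DIM) * TILES_PER_ROW + (x / TILE_DIM);
    uint32_t tile_base = (uint32_t)tlb[tile_index] << 16;  /* 64 KB aligned */
    uint32_t offset = ((uint32_t)(y % TILE_DIM) * TILE_DIM
                       + (uint32_t)(x % TILE_DIM)) * BYTES_PER_PIXEL;
    return tile_base + offset;
}
```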
Tiles provide a convenient unit for memory allocation. By allowing tiles to be scattered throughout memory, tiling makes the amount of memory which must be contiguously allocated manageable. Additionally, tiling provides a means of reducing the amount of system memory consumed by frame buffers. Rendering to tiles which do not contain any pixels pertinent for display (invisible tiles) can be easily clipped out, and hence no memory needs to be allocated for these tiles. For example, a 1024x1024 virtual frame buffer consisting of front and back RGBA buffers and a depth buffer would consume 12 MB of memory if fully resident. However, if each 1024x1024 buffer were partitioned into 64 (128x128) tiles of which only four tiles contained non-occluded pixels, only memory for those visible tiles would need to be allocated. In this case, only 3/4 MB would be consumed.
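A rough check of this example (illustrative only; it assumes 4 bytes per pixel in each of the three buffers and four resident 64 KB tiles per buffer):

```c
/* Rough check of the tiling-savings example: a 1024x1024 frame buffer
 * with front RGBA, back RGBA, and depth buffers (4 bytes/pixel each),
 * versus keeping only four 64 KB tiles resident per buffer. */
enum { DIM = 1024, BYTES_PER_PIXEL = 4, NUM_BUFFERS = 3,
       TILE_BYTES = 64 * 1024, RESIDENT_TILES_PER_BUFFER = 4 };

static long fully_resident_bytes(void)
{
    return (long)DIM * DIM * BYTES_PER_PIXEL * NUM_BUFFERS;
}

static long tiled_resident_bytes(void)
{
    return (long)NUM_BUFFERS * RESIDENT_TILES_PER_BUFFER * TILE_BYTES;
}
```

Under these assumptions the fully resident case comes to 12 MB while the tiled case comes to 768 KB, i.e. 3/4 MB.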
In the present invention, memory system clients (e.g., CPU 206, rendering engine 208, input/output IC 210, graphics back end IC 212, image processor 214, and data
`6,104,417
`
`8
`SDRAM components and populated on the front only or the
`front and back side of the DIMM. Two DIMMS are required
`to make an external SDRAM bank. 1Mx1t3 SDRAM com-
`ponents construct a 32 Mbyte external bank, while -’lM><16
`SDRAM components construct a 128 Mbyle external bank.
`unified system memory 202 can range in sin: from 32
`Mhytes to 1 Gbyte.
`FIG. 3C shows a block diagram of an address translation
`scheme according to the present invention. FIG. 4 shows a
`block diagram of the memory controller 204 ot' the present
`invention.
`A memory client interface contains the signals listed in
`Table 1, below.
`
`TABLE 1
`
`Mcmog clicnt interface signals
`CREME
`Pin
`Name
`
`# of
`BiLs
`
`Description
`
`3
`
`'
`
`'
`
`intema
`only
`
`intema
`only
`lt11£1‘ttfl
`onlyinterrtn
`only
`ittterrta
`only
`internal
`only
`irtterrtzt
`only
`internal
`only
`int:-rna
`only
`internzt
`only
`internal
`only
`Enterna
`only
`
`internal
`only
`
`type of request -
`I — rv.-act
`3 - write
`4 - rntw
`addrms of request
`
`ntesaage sent with request
`I - valid
`El ‘ not valid
`t
`- ecc is valid
`(J - ecc not valid
`t - room in client queue
`0 - no room
`I
`- ME‘ is ready for write data
`0 ~ MC‘ not ready for write
`data
`I — valid read data
`0 — not valid read data
`E — enable client driver
`0 - disable client driver
`[cud ntcssnge sent with read
`data:
`-writ: ntessage sent with wt-rdy
`
`memory data from client
`going to unified system
`lIl.GWlDf:y'
`memory ma.-rtt from client
`going
`to unified system memory
`ll
`- write byte
`t - don't write byte
`trterrtntaslt
`in (U) is matched
`with memdamintem _in (7:0)
`and so on.
`memory data from unified
`system rnentnry going to the
`client
`
`Signal
`
`c ientreqcmd
`
`ierttreqadr
`
`ierttreqmsg
`
`* iertu:q.v
`
`ierttrcqecc
`
`ie ntresgnt
`
`ie rttrcs.wrrdy
`
`' ic rttrc5.rdr-:|_\'
`‘ ie rttte.-:.<>e
`
`c ic nlrt.-s.rdrnsg
`
`c ie titres. wrmsg
`memdnta2-
`ment
`in
`me mnta:ak_.in
`
`n1en1dnta2-
`cliertt_ out
`
`With reference to FIG. 5, at timing diagram for memory
`client requests is shown. A memory client makes a request
`to the memory controller 204 by asserting clientreq.valid
`while setting the clientreq.adr, clientreqmsg, clientreq.cmt|
`and clientreq.ecc lines to the appropriate values. if there is
`room in the queue, the request is latched into the memory
`client queue. Only two of the memory clients, the rendering
`engine 208 and the inputfoutput IC 210, use clientreq.msg.
`The message specifies which subsystem within the input!
`output IC 210 or the rendering engine 208 made the request.
`When an error occurs, this message is saved along with other
`pertinent information to aid in the debug process. For the
`rendering engine 208,
`the message is passed through the
`request pipe and returned with other pertinent infomtation to
`
`7
compression/expansion device 215) share the unified system memory 202. Since each memory system client has access to memory shared by each of the other memory system clients, there is no need for transferring data from one dedicated memory unit to another. For example, data can be received by the input/output IC 210, decompressed (or expanded) by the data compression/expansion device 215, and stored in the unified system memory 202. This data can then be accessed by the CPU 206, the rendering engine 208, the input/output IC 210, the graphics back end IC 212, or the image processor 214. As a second example, the CPU 206, the rendering engine 208, the input/output IC 210, the graphics back end IC 212, or the image processor 214 can use data generated by the CPU 206, the rendering engine 208, the input/output IC 210, the graphics back end IC 212, or the image processor 214. Each of the computational units (CPU 206, input/output IC 210, the graphics back end IC 212, the image processing and compression and expansion IC 216, the graphics rendering and memory controller IC 218, and the data compression/expansion device 215) has translation hardware for determining the physical addresses of pixel buffers, as is discussed below.
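The per-client translation step mentioned above, mapping a virtual pixel-buffer address to a physical address in unified system memory, can be sketched as a simple page-table lookup. The 4 KB page size and the table contents are assumptions for illustration; the patent's translation hardware is described later in the specification.

```c
#include <stdint.h>
#include <stddef.h>

/* A virtual pixel-buffer address is split into a virtual page number and an
 * offset; the page table supplies the physical page in unified memory. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)

static uint32_t translate(const uint32_t *page_table, size_t n_pages,
                          uint32_t vaddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;
    uint32_t offset = vaddr & (PAGE_SIZE - 1u);
    if (vpn >= n_pages)
        return UINT32_MAX;   /* no mapping for this page */
    return (page_table[vpn] << PAGE_SHIFT) | offset;
}

/* A two-page pixel buffer whose virtual pages 0 and 1 live in physical
 * pages 7 and 3 of unified system memory (arbitrary example values). */
static uint32_t demo_translate(uint32_t vaddr)
{
    static const uint32_t page_table[2] = { 7, 3 };
    return translate(page_table, 2, vaddr);
}
```

Because each computational unit performs this lookup itself, a pixel buffer can live anywhere in the shared memory rather than at a fixed dedicated address.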
There are numerous video applications for which the present invention computer system 200 provides functional advantages over prior art computer system architectures. These applications range from video conferencing to video editing. There is significant variation in the processing required for the various applications, but a few processing steps are common to all applications: capture, filtering, scaling, compression, blending, and display. In operation of computer system 200, input/output IC 210 can bring in a compressed stream of video data which can be stored into unified system memory 202. The input/output IC 210 can access the compressed data stored in unified system memory 202, via a path through the graphics rendering and memory controller IC 218. The input/output IC 210 can then decompress the accessed data and store the decompressed data into unified system memory 202. The stored image data can then be used, for example, as a texture map by rendering engine 208 for mapping the stored image onto another image. The resultant image can then be stored into a pixel buffer which has been allocated dynamically in unified system memory 202. If the resultant image is stored into a frame buffer, allocated dynamically in unified system memory 202, then the resultant image can be displayed by the graphics back end IC 212 or the imag
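The dynamic allocation of pixel and frame buffers from unified system memory, as described above, can be sketched as carving buffers out of a shared pool rather than out of a fixed dedicated frame-buffer region. The bump-pointer policy and the pool size below are assumptions for illustration only.

```c
#include <stddef.h>

/* A shared pool standing in for unified system memory 202. */
typedef struct {
    size_t capacity;  /* total unified memory available, in bytes */
    size_t used;      /* bytes already handed out */
} unified_pool_t;

/* Reserve a width x height buffer with bytes_per_pixel per pixel; returns
 * the byte offset of the new buffer, or (size_t)-1 if it does not fit. */
static size_t alloc_pixel_buffer(unified_pool_t *pool, size_t width,
                                 size_t height, size_t bytes_per_pixel)
{
    size_t bytes = width * height * bytes_per_pixel;
    if (pool->capacity - pool->used < bytes)
        return (size_t)-1;
    size_t offset = pool->used;
    pool->used += bytes;
    return offset;
}

/* Try to allocate two 640x480 16-bit buffers from a 1 MB pool: the first
 * fits at offset 0, the second does not fit. Returns 0 if both checks hold. */
static int demo_pool(void)
{
    unified_pool_t pool = { 1u << 20, 0 };
    if (alloc_pixel_buffer(&pool, 640, 480, 2) != 0)
        return 1;
    if (alloc_pixel_buffer(&pool, 640, 480, 2) != (size_t)-1)
        return 2;
    return 0;
}
```

The point of the sketch is that graphics buffers compete for the same bytes as every other memory client, so memory not needed for graphics remains available to the rest of the system.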
