US007633506B1
`
`(12) United States Patent
`Leather et al.
`
`(10) Patent No.: US 7,633,506 B1
`(45) Date of Patent: Dec. 15, 2009
`
`(54) PARALLEL PIPELINE GRAPHICS SYSTEM
`
`(75) Inventors: Mark M. Leather, Saratoga, CA (US);
`Eric Demers, Palo Alto, CA (US)
`
`(73) Assignee: ATI Technologies ULC, Markham,
`Ontario (CA)
`
`( * ) Notice: Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 709 days.
`
`(21) Appl. No.: 10/724,384
`
`(22) Filed: Nov. 26, 2003
`
`Related U.S. Application Data
`(60) Provisional application No. 60/429,976, filed on Nov.
`27, 2002.
`
`(51) Int. Cl.
`G06T 1/20 (2006.01)
`(52) U.S. Cl. ...................... 345/506; 345/505; 345/519
`(58) Field of Classification Search ............... 345/506,
`345/505, 519
`See application file for complete search history.
`
`Primary Examiner: Chante Harrison
`Assistant Examiner: Michelle K Lay
`(74) Attorney, Agent, or Firm: Vedder Price P.C.
`
`(57)
`
`ABSTRACT
`
`The present invention relates to a parallel pipeline graphics
`system. The parallel pipeline graphics system includes a
`back-end configured to receive primitives and combinations
`of primitives (i.e., geometry) and process the geometry to
`produce values to place in a frame buffer for rendering on
`screen. Unlike prior single pipeline implementations, some
`embodiments use two or four parallel pipelines, though other
`configurations having 2^n pipelines may be used. When
`geometry data is sent to the back-end, it is divided up and
`provided to one of the parallel pipelines. Each pipeline is a
`component of a raster back-end, where the display screen is
`divided into tiles and a defined portion of the screen is sent
`through a pipeline that owns that portion of the screen’s tiles.
`In one embodiment, each pipeline comprises a scan converter,
`a hierarchical-Z unit, a Z buffer logic, a rasterizer, a shader,
`and a color buffer logic.
`
`21 Claims, 15 Drawing Sheets
`
`[Representative drawing. Legible labels: Transformed Vertices;
`Rasterization Pipeline A (520); Rasterization Pipeline B]
`
`MEDIATEK, Ex. 1001, Page 1
`
`U.S. Patent    Dec. 15, 2009    Sheet 1 of 15    US 7,633,506 B1
`
`[FIGURE 1: parallel pipeline graphics system architecture]
`
`Sheet 2 of 15
`
`FIGURE 2
`
`200: FRONT-END RECEIVES GRAPHICS INSTRUCTIONS AND OUTPUTS GEOMETRY
`210: BACK-END OBTAINS GEOMETRY AS INPUT
`220: DETERMINE WHICH PIPELINES TO USE TO PROCESS THE GEOMETRY
`230: USE APPROPRIATE PIPELINES TO OPERATE ON THE GEOMETRY
`240: NUMERICAL VALUES ASSOCIATED WITH THE PIXELS THAT DEFINE THE
`GEOMETRY ARE PLACED IN A FRAME BUFFER
`
`Sheet 3 of 15
`
`[FIGURE 3: parallel pipeline graphics system architecture according
`to another embodiment]
`
`Sheet 4 of 15
`
`FIGURE 4
`
`400: FRONT-END RECEIVES GRAPHICS INSTRUCTIONS AND OUTPUTS GEOMETRY
`410: BACK-END OBTAINS GEOMETRY AS INPUT
`420: DETERMINE WHICH PIPELINES OWN WHICH PORTION OF THE GEOMETRY
`430: USE APPROPRIATE PIPELINES TO OPERATE ON THE GEOMETRY
`440: NUMERICAL VALUES ASSOCIATED WITH THE PIXELS THAT DEFINE THE
`GEOMETRY ARE PLACED IN A 256 BIT FRAME BUFFER
`
`Sheet 5 of 15
`
`[FIGURE 5: raster back-end portion of a pipeline. Legible labels:
`Transformed Vertices; Graphics Assembly 510; 515; 530; To other
`Rasterization Pipelines (as needed); Rasterization Pipeline A;
`Rasterization Pipeline B; Hierarchical Z Interface; 550; Early Z
`Interface; Buffer Logic 555; Scan Converter 540; Rasterizer
`(parameter interpolator) 560; Color Buffer Logic 590]
`
`Sheet 6 of 15
`
`[FIGURE 6: bounding box. Legible labels: Bounding box; Vertex 0;
`Vertex 2]
`
`Sheet 7 of 15
`
`[FIGURE 7: apparatus for synchronizing graphics data and state]
`
`Sheet 8 of 15
`
`[FIGURE 8: computer execution environment. Legible labels: I/O 819;
`Keyboard; Processor 813; Mass Storage 812; Mouse 811; Video Amp;
`CRT 817; Video Mem; 818; Main Memory; Comm Int 820; 801; Server 826;
`800; Network Link; Local Network 822; Internet 825; ISP 824;
`Host 823]
`
`Sheet 9 of 15
`
`[FIGURE 9: block diagram of a unified shader. Legible labels:
`Rasterizer 1110; Texture Unit; Unified Shader 1130; 1100;
`Frame Buffer 1120]
`
`Sheet 10 of 15
`
`[FIGURE 10: unified shader architecture. Legible labels: 1200;
`Rasterizer (rs); Output FIFO/Formatter; Control 1244; constant;
`inst; Texture Unit]
`
`Sheet 11 of 15
`
`FIGURE 11
`
`Reference numerals shown: 1300, 1310, 1320, 1325, 1330, 1340, 1350
`
`LEVEL 0 TEXTURE INSTRUCTIONS
`    tex t0;
`LEVEL 0 ALU INSTRUCTIONS
`    dp3 t4.r, t0, t1;
`    dp3 t4.g, t0, t2;
`    dp3 t4.b, t0, t3;
`LEVEL 1 TEXTURE INSTRUCTIONS
`    tex t4;
`LEVEL 1 ALU INSTRUCTIONS
`    dp3 t5.r, t0, t1;
`    dp3 t5.g, t0, t2;
`    dp3 t5.b, t0, t3;
`LEVEL 2 TEXTURE INSTRUCTIONS
`    tex t5;
`LEVEL 2 ALU INSTRUCTIONS
`    mad t0, t5, r0, r1;
`
`Sheet 12 of 15
`
`[FIGURE 12: control logic. Legible labels: from Rasterizer; 1400;
`to ALU's; Instruction Store; to SRAM's; Read Addr; Addr; Input
`Machine; 1410; 1450; 1430; 1445; 1435; Level 0 Tex Machine; Level 0
`ALU Machine; Level 1 Tex Machine; Level 1 ALU Machine; Level 2 Tex
`Machine; texture command; Level 2 ALU Machine; Level 3 Tex Machine;
`Level 3 ALU Machine; Arbit; 1486; Output Machine; to output
`formatter; 1485]
`
`Sheet 13 of 15
`
`[FIGURE 13: register subsystem. Legible labels: 1520; Rasterizer
`(rs); rc; 288; 288; Shader Subsystem 1510; side door; Global
`Register Load Bus; 256; Frame Buffer (cb); 1550; 1500]
`
`Sheet 14 of 15
`
`[FIGURE 14: multiple shader system]
`
`Sheet 15 of 15
`
`[FIGURE 15: ALU. Legible label: phase, cycling repeatedly through
`0, 1, 2, 3]
`
`PARALLEL PIPELINE GRAPHICS SYSTEM
`
`This application claims priority to U.S. Provisional Application
`No. 60/429,976, filed Nov. 27, 2002.
`This is a related application to a co-pending U.S. patent
`application entitled “DIVIDING WORK AMONG MULTIPLE GRAPHICS
`PIPELINES USING A SUPER-TILING TECHNIQUE”, having Ser. No.
`10/459,797, filed Jun. 12, 2003, having Leather et al. as the
`inventors, owned by the same assignee and hereby incorporated by
`reference in its entirety.
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`The present invention relates to computer graphics chips.
`Portions of the disclosure of this patent document contain
`material that is subject to copyright protection. The copyright
`owner has no objection to the facsimile reproduction by anyone
`of the patent document or the patent disclosure as it
`appears in the Patent and Trademark Office file or records, but
`otherwise reserves all copyright rights whatsoever.
`2. Background Art
`Computer systems are often used to generate and display graphics on
`an output device such as a monitor. When complex and realistic
`graphics are desired, additional components, or chips, are often
`added to the computer system to assist it with the complex
`instruction processing that it must perform to render the graphics
`to the screen. Graphics chips may be considered as having a
`front-end and a back-end. The front-end typically receives graphics
`instructions and generates “primitives” that form the basis for the
`back-end’s work. The back-end receives the primitives and performs
`the operations necessary to send the data to a frame buffer where it
`will eventually be rendered to the screen. As will be further
`described below, graphics chip back-ends are currently inadequate.
`Before further discussing this problem, an overview of a graphics
`system is provided.
`Graphics System
`Display images are made up of thousands of tiny dots, where each dot
`is one of thousands or millions of colors. These dots are known as
`picture elements, or “pixels”. Each pixel has multiple attributes
`associated with it, including a color and a texture which is
`represented by a number value stored in the computer system. A three
`dimensional display image, although displayed using a two
`dimensional array of pixels, may in fact be created by rendering of
`a plurality of graphical objects. Examples of graphical objects
`include points, lines, polygons, and three dimensional solid
`objects. Points, lines, and polygons represent rendering primitives
`which are the basis for most rendering instructions. More complex
`structures, such as three dimensional objects, are formed from a
`combination or mesh of such primitives. To display a particular
`scene, the visible primitives associated with the scene are drawn
`individually by determining those pixels that fall within the edges
`of the primitive, and obtaining the attributes of the primitive that
`correspond to each of those pixels. The obtained attributes are used
`to determine the displayed color values of applicable pixels.
`Sometimes, a three dimensional display image is formed from
`overlapping primitives or surfaces. A blending function based on an
`opacity value associated with each pixel of each primitive is used
`to blend the colors of overlapping surfaces or layers when the top
`surface is not completely opaque. The final displayed color of an
`individual pixel may thus be a blend of colors from multiple
`surfaces or layers.
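
The blending of overlapping surfaces described above can be illustrated with a short sketch. It assumes the common linear "over" blend, in which the top surface's opacity (a value from 0.0 to 1.0) weights its color against the color beneath it. The text does not specify the blend function, and the name `blend_over` is hypothetical.

```python
def blend_over(top_rgb, top_alpha, bottom_rgb):
    """Blend a partially opaque top color over a bottom color.

    Assumption: a linear "over" blend. The source text only says that an
    opacity value is used to blend the colors of overlapping surfaces.
    """
    if not 0.0 <= top_alpha <= 1.0:
        raise ValueError("top_alpha must be between 0.0 and 1.0")
    # Each channel is a weighted mix of the top and bottom colors.
    return tuple(
        top_alpha * t + (1.0 - top_alpha) * b
        for t, b in zip(top_rgb, bottom_rgb)
    )


# A half-transparent red surface over a blue one.
print(blend_over((1.0, 0.0, 0.0), 0.5, (0.0, 0.0, 1.0)))
```

A fully opaque top surface (opacity 1.0) simply replaces the color beneath it, which matches the case in which no blending is needed.
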
`
`In some cases, graphical data is rendered by executing
`instructions from an application that is drawing data to a
`display. During image rendering, three dimensional data is
`processed into a two dimensional image suitable for display.
`The three dimensional image data represents attributes such
`as color, opacity, texture, depth, and perspective information.
`The draw commands from a program drawing to the display
`may include, for example, X and Y coordinates for the vertices
`of the primitive, as well as some attribute parameters for
`the primitive (color and depth or “Z” data), and a drawing
`command. The execution of drawing commands to generate a
`display image is known as graphics processing.
`Graphics Processing Chips
`When complex graphics processing is required, such as
`using primitives as a basis for rendering instructions or
`texturing geometric patterns, graphics chips are added to the
`computer system. Graphics chips are specifically designed to
`handle the complex and tedious instruction processing that
`must be used to render the graphics to the screen. Graphics
`chips have a front-end and a back-end. The front-end typically
`receives graphics instructions and generates the primitives
`or combination of primitives that define geometric patterns.
`The primitives are then processed by the back-end where
`they might be textured, shaded, colored, or otherwise prepared
`for final output. When the primitives have been fully
`processed by the back-end, the pixels on the screen will each
`have a specific number value that defines a unique color
`attribute the pixel will have when it is drawn. This final value
`is sent to a frame buffer in the back-end, where the value is
`used at the appropriate time.
`Modern graphics processing chip back-ends are equipped
`to handle three-dimensional data, since three-dimensional
`data produces more realistic results to the screen. When
`processing three-dimensional data, memory bandwidth becomes
`a limitation on performance. The progression of graphics
`processing back-ends has been from a 32 bit system, to a 64
`bit system, and to a 128 bit system. Moving to a 256 bit
`system, where 512 bits may be processed in a single logic
`clock cycle, presents problems. In particular, the efficient
`organization and use of data “words” with a 256 bit wide
`DDR frame buffer is problematic because the granularity is
`too coarse. Increasing the width of the frame buffer to 256 bits
`requires innovations in the input and output (I/O) system used
`by the graphics processing back-end.
`
`SUMMARY OF THE INVENTION
`
`The present invention relates to a parallel array graphics
`system. In one embodiment, the parallel array graphics system
`includes a back-end configured to receive primitives and
`combinations of primitives (i.e., geometry) and process the
`geometry to produce values to place in a frame buffer for
`eventual rendering on a screen. In one embodiment, the
`graphics system includes two parallel pipelines. When data
`representing the geometry is presented to the back-end of the
`graphics chip, it is divided into data words and provided to
`one or both of the parallel pipelines.
`In some embodiments, four parallel pipelines or other pipeline
`configurations having 2^n pipelines may be used. Each
`pipeline is a component of a raster back-end, where the display
`screen is divided into tiles and a defined portion of the
`screen (i.e., one or more tiles) is sent through a pipeline that
`owns that portion of the screen’s tiles.
`In one embodiment, each parallel pipeline comprises a
`raster back-end having a scan converter to step through the
`geometric patterns passed to the back-end, a “hierarchical-Z”
`component to more precisely define the borders of the geometry,
`a “Z-buffer” for performing three-dimensional operations
`on the data, a rasterizer for computing texture addresses
`and color components for a pixel, a unified shader for combining
`multiple characteristics for a pixel and outputting a
`single value, and a color buffer logic unit for taking the
`incoming shader color and blending it into the frame buffer
`using the current frame buffer blend operations. A plurality of
`FIFO (First-In, First-Out) units are used to balance load
`among the pipelines.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`These and other features, aspects and advantages of the
`present invention will become better understood with regard
`to the following description, appended claims and accompanying
`drawings where:
`FIG. 1 is a parallel pipeline graphics system architecture
`according to an embodiment of the present invention.
`FIG. 2 is a flowchart showing the operation of a parallel
`pipeline graphics system according to an embodiment of the
`present invention.
`FIG. 3 is a parallel pipeline graphics system architecture
`according to another embodiment of the present invention.
`FIG. 4 is a flowchart showing the operation of a parallel
`pipeline graphics system according to another embodiment of
`the present invention.
`FIG. 5 is a raster back-end portion of a pipeline according
`to another embodiment of the present invention.
`FIG. 6 is a bounding box illustrating an embodiment of the
`invention.
`FIG. 7 shows an apparatus for synchronizing graphics data
`and state according to an embodiment of the present invention.
`FIG. 8 is an embodiment of a computer execution environment
`suitable for the present invention.
`FIG. 9 is a block diagram of a unified shader according to
`an embodiment of the present invention.
`FIG. 10 shows a unified shader architecture according to an
`embodiment of the present invention.
`FIG. 11 shows how shader code is partitioned according to
`an embodiment of the present invention.
`FIG. 12 shows how control logic is used according to an
`embodiment of the present invention.
`FIG. 13 shows a register subsystem according to an
`embodiment of the present invention.
`FIG. 14 shows a multiple shader system according to an
`embodiment of the present invention.
`FIG. 15 shows an ALU according to an embodiment of the
`present invention.
`
`DETAILED DESCRIPTION OF THE INVENTION
`
`The invention relates to a parallel pipeline graphics system.
`In the following description, numerous specific details are set
`forth to provide a more thorough description of embodiments
`of the invention. It will be apparent, however, to one skilled in
`the art, that the invention may be practiced without these
`specific details. In other instances, well known features have
`not been described in detail so as not to obscure the invention.
`Parallel Array Graphics System
`One embodiment of the present invention is shown in the
`block diagram of FIG. 1. Graphics processing chip 100 comprises
`a front-end 110 and a back-end 120. The front-end 110
`receives graphics instructions 115 as input and generates
`geometry 116 as output. The back-end 120 is used to process
`the geometry 116 it receives as input. For instance, the
`back-end 120 might operate by texturing, shading, scanning,
`coloring, or otherwise preparing a pixel for final output.
`When the geometry 116 has been fully processed by
`back-end 120, the pixels on the screen will each have a specific
`number value that defines a unique color attribute the pixel
`will have when it is drawn. The number values are passed to
`a frame buffer 130 where they are stored for use at the
`appropriate time, for instance, when they are rendered on display
`device 160. Back-end 120 includes two parallel pipelines,
`designated pipeline 140 and pipeline 150. When data representing
`the geometry is presented to the back-end 120 of the
`graphics chip, it is divided into data words and provided to
`one or both of the parallel pipelines 140 and 150.
`FIG. 2 provides a flowchart showing the operation of the
`architecture of FIG. 1 according to an embodiment of the
`present invention. At step 200 a graphics chip front-end
`receives graphics instructions as input and generates geometry
`as output. At step 210, a graphics chip back-end obtains
`the geometry as input. Next, it is determined which pipelines
`to use to operate on the geometry at step 220. At step 230 the
`appropriate pipelines operate on the geometry; for instance,
`the pipelines might texture, shade, scan, color, or otherwise
`prepare the geometry for final output. Then, at step 240, the
`numerical values that are associated with the pixels that
`define the geometry are put into a frame buffer. The size of the
`frame buffer may vary.
`In another embodiment, 2 or more pipelines are used and
`each pipeline is a component of a raster back-end. The display
`screen is divided into tiles and a defined portion of the screen
`(i.e., one or more tiles) is sent through a pipeline that owns
`that portion of the screen’s tiles. This embodiment is shown in
`FIG. 3. Graphics processing chip 300 comprises a front-end
`310 and a back-end 320. The front-end 310 receives graphics
`instructions 315 as input and generates geometry 316 as output.
`The back-end 320 is used to process the geometry 316 it
`receives as input. For instance, the back-end 320 might operate
`by texturing, shading, scanning, coloring, or otherwise
`preparing a pixel for final output.
`When the geometry 316 has been fully processed by
`back-end 320, the pixels on the screen will each have a specific
`number value that defines a unique color attribute the pixel
`will have when it is drawn. The number values are passed to
`a frame buffer 330 where they are stored for use at the
`appropriate time, for instance, when they are rendered on display
`device 360. Back-end 320 includes 2^n parallel pipelines,
`designated pipeline 0 through pipeline n-1. When data representing
`the geometry is presented to the back-end 320 of the
`graphics chip 300, it is analyzed by back-end 320 to determine
`which geometry (or portions of geometry) fall within a
`given tile. For instance, if pipeline 0 owns tile 0 on display
`device 360, then the geometry in tile 0 is passed to pipeline 0.
`FIG. 4 provides a flowchart showing the operation of the
`architecture of FIG. 3 according to an embodiment of the
`present invention. At step 400 a graphics chip front-end
`receives graphics instructions as input and generates geometry
`as output. At step 410, a graphics chip back-end obtains
`the geometry as input. Next, at step 420 the back-end analyzes
`the geometry to determine which pipeline owns which portion
`of the geometry; for instance, if a geometry falls within
`two tiles, then the geometry processing is divided among the
`pipelines that own those tiles. At step 430 the appropriate
`pipelines operate on the geometry; for instance, the pipelines
`might texture, shade, scan, color, or otherwise prepare the
`geometry for final output. Then, at step 440, the numerical
`values that are associated with the pixels that define the
`geometry are put into a frame buffer.
`Embodiment of a Back-End Graphics Chip
`
`In one embodiment, each parallel pipeline comprises a
`raster back-end having a scan converter to step through the
`geometric patterns passed to the back-end, a “hierarchical-Z”
`component to more precisely define the borders of the geometry,
`a “Z-buffer” for performing three-dimensional operations
`on the data, a rasterizer for computing texture addresses
`and color components for a pixel, a unified shader for combining
`multiple characteristics for a pixel and outputting a
`single value, and a color buffer logic unit for taking the
`incoming shader color and blending it into the frame buffer
`using the current frame buffer blend operations.
`In operation, graphics assembly unit 510 takes transformed
`vertex data and assembles complete graphics primitives
`(triangles or parallelograms, for instance). A set-up unit 515
`receives the data output from graphics assembly 510 and
`generates slope and initial value information for each of the
`texture address, color, or Z parameters associated with the
`primitive. The resulting set-up information is passed to 2 or
`more identical pipelines. In the current example there are two
`pipelines, pipeline 520 and pipeline 525, but the present
`invention contemplates any configuration of parallel pipelines.
`In this example, each pipeline 520 and 525 owns one
`half of the screen’s pixels. Allocation of work between the
`pipelines is made based on a repeating square pixel tile
`pattern. In one embodiment, logic 530 in the set-up unit 515
`intersects the graphics primitives with the repeating tile pattern
`such that a primitive is only sent to a pipeline if it is likely
`that it will result in the generation of covered pixels. The
`functionality of a setup unit is further described in commonly
`owned co-pending U.S. patent application entitled “Scalable
`Rasterizer Interpolator”, with Ser. No. 10/730,864, filed Dec.
`8, 2003, and is hereby fully incorporated by reference.
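
The slope and initial value generation attributed to the set-up unit can be sketched as follows. This is a minimal illustration that assumes each parameter (a color channel, a texture address, or Z) varies linearly in screen space across a triangle and ignores perspective correction; the function names `setup_parameter` and `interpolate` are hypothetical and are not taken from the referenced application.

```python
def setup_parameter(v0, v1, v2):
    """Return (initial_value, dp_dx, dp_dy) for one parameter of a triangle.

    Each vertex is (x, y, p), where p is the parameter value at that vertex.
    Assumption: p is a linear (planar) function of screen x and y.
    """
    x0, y0, p0 = v0
    x1, y1, p1 = v1
    x2, y2, p2 = v2
    dx1, dy1, dp1 = x1 - x0, y1 - y0, p1 - p0
    dx2, dy2, dp2 = x2 - x0, y2 - y0, p2 - p0
    det = dx1 * dy2 - dx2 * dy1  # twice the signed area of the triangle
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    dp_dx = (dp1 * dy2 - dp2 * dy1) / det  # slope along x
    dp_dy = (dx1 * dp2 - dx2 * dp1) / det  # slope along y
    return p0, dp_dx, dp_dy  # the initial value is taken at the first vertex


def interpolate(setup, origin, x, y):
    """Evaluate the parameter at pixel (x, y) from its set-up values."""
    initial, dp_dx, dp_dy = setup
    return initial + dp_dx * (x - origin[0]) + dp_dy * (y - origin[1])


setup = setup_parameter((0, 0, 1.0), (4, 0, 5.0), (0, 2, 3.0))
print(setup, interpolate(setup, (0, 0), 2, 1))
```

Computing the slopes once per primitive lets later stages obtain a parameter at any pixel with two multiplies and two adds, which is one plausible reason to separate set-up from per-pixel work.
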
`In one embodiment of the present invention, the set-up unit
`manages the distribution of polygons to the pipelines. As
`noted above, the display is divided into multiple tiles and each
`pipeline is responsible for a subset of the tiles. It should be
`noted that any number of square or non-square tiles can be
`used in the present invention.
`A polygon can be composed of 1, 2, or 3 vertices. Vertices
`are given by the graphics application currently executing on a
`host system. The vertices are converted from object space
`3-dimensional homogeneous coordinate system to a display
`(screen) based coordinate system. This conversion can be
`done on the host processor or in a front end section of the
`graphics chip (i.e. vertex transformation). The screen based
`coordinate system has at least X and Y coordinates for each
`vertex. The set-up unit 515 creates a bounding box based on
`the screen space X, Y coordinates of each vertex as shown in
`FIG. 6. The bounding box is then compared against a current
`tile pattern. The tiling pattern is based on the number of
`graphics pipelines currently active. For example, in a two (A
`and B) pipeline system, the upper left and lower right pixel
`tiles of a four tile quad are assigned to pipeline A and the
`upper right and lower left tiles to pipeline B (or vice versa). In
`a single pipeline system, all tiles are assigned to pipeline A. In
`one embodiment, the setup unit computes initial value (at
`vertex 0) and slopes for each of up to 42 parameters associated
`with the current graphics primitive.
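
The bounding box and the tile pattern described above can be sketched as follows. The sketch assumes that tiles are addressed by integer (tile_x, tile_y) indices and picks the first of the two assignments the text allows (upper left and lower right to pipeline A). The function names are hypothetical.

```python
def bounding_box(vertices):
    """Return (min_x, min_y, max_x, max_y) for screen-space (x, y) vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def tile_owner(tile_x, tile_y, num_pipelines=2):
    """Return the pipeline ("A" or "B") that owns a tile.

    Assumption: a checkerboard in which the tile at index (0, 0) is the
    upper left tile of a four tile quad and belongs to pipeline A.
    """
    if num_pipelines == 1:
        return "A"  # single pipeline system: every tile goes to pipeline A
    if num_pipelines == 2:
        # Upper left (0, 0) and lower right (1, 1) have an even index sum.
        return "A" if (tile_x + tile_y) % 2 == 0 else "B"
    raise NotImplementedError("only one and two pipeline patterns are sketched")


print(bounding_box([(3, 7), (20, 2), (9, 30)]))
print([tile_owner(tx, ty) for ty in range(2) for tx in range(2)])
```
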
`The bounding box’s four corners are mapped to the tile
`pattern, simply by discarding the lower bits of X and Y. The four
`corners map to the same or different tiles. If they all map to the
`same tile, then only the pipeline that is associated with that
`tile receives the polygon. If it maps to only tiles that are
`associated with only one pipeline, then again only that pipeline
`receives the polygon. In one embodiment, if it maps to
`tiles that are associated with multiple pipelines, then the
`entire polygon is sent to all pipelines.
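
The corner test described above can be sketched as follows. The sketch assumes integer pixel coordinates, 16 by 16 pixel tiles (so four low bits are discarded; the text gives no tile size), and a two pipeline checkerboard. `TILE_SHIFT`, `tile_owner`, and `pipelines_for_polygon` are hypothetical names.

```python
TILE_SHIFT = 4  # assumed: 16x16 pixel tiles; the text does not state a size


def tile_owner(tile_x, tile_y):
    """Two pipeline checkerboard: even index sum goes to A, odd to B."""
    return "A" if (tile_x + tile_y) % 2 == 0 else "B"


def pipelines_for_polygon(vertices):
    """Return the sorted list of pipelines that should receive a polygon.

    Follows the text literally: only the four bounding box corners are
    mapped to tiles. Tiles lying between the corners of a box wider than
    two tiles are not examined by this sketch.
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    corners = [
        (min(xs), min(ys)), (max(xs), min(ys)),
        (min(xs), max(ys)), (max(xs), max(ys)),
    ]
    # Discarding the low bits of X and Y yields the tile index of a corner.
    owners = {tile_owner(x >> TILE_SHIFT, y >> TILE_SHIFT) for x, y in corners}
    return sorted(owners)


print(pipelines_for_polygon([(1, 1), (10, 2), (5, 12)]))    # inside one tile
print(pipelines_for_polygon([(10, 1), (20, 2), (15, 12)]))  # straddles two tiles
```

When the result names more than one pipeline, the whole polygon is sent to each of them, and each pipeline later restricts its work to the tiles it owns.
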
`
`Each pipeline contains an input FIFO 535 used to balance
`the load over different pipelines. A scan converter 540 works
`in conjunction with Hierarchical Z interface of Z buffer logic
`555 to step through the geometry (e.g., triangle or parallelogram)
`within the bounds of the pipeline’s tile pattern. In one
`embodiment, initial stepping is performed at a coarse level.
`For each of the coarse level tiles, a minimum (i.e., closest) Z
`value is computed. This is compared with the farthest Z value
`for the tile stored in a hierarchical-Z buffer 550. If the compare
`fails, the tile is rejected. The functionality of the scan
`converter and Hierarchical Z interface is further described in
`commonly owned co-pending U.S. patent application entitled
`“Scalable Rasterizer Interpolator”, with Ser. No. 10/730,864,
`filed Dec. 8, 2003, and is hereby fully incorporated by reference.
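
The coarse rejection test described above can be sketched as follows. The sketch assumes that smaller Z values are closer to the viewer and that a tile is rejected when the primitive's closest Z is farther than the farthest Z already recorded for that tile. The dictionaries and function names are hypothetical stand-ins for the hardware buffers.

```python
def coarse_tile_passes(primitive_min_z, tile_farthest_z):
    """True if any part of the primitive could be visible in this tile.

    Assumption: smaller Z is closer, and equal depth counts as a pass.
    """
    return primitive_min_z <= tile_farthest_z


def surviving_tiles(primitive_min_z_by_tile, hier_z):
    """Keep the coarse tiles that pass the hierarchical Z compare.

    primitive_min_z_by_tile: tile id -> closest Z of the primitive in the tile
    hier_z: tile id -> farthest Z currently stored for the tile
    """
    return [
        tile
        for tile, min_z in primitive_min_z_by_tile.items()
        if coarse_tile_passes(min_z, hier_z[tile])
    ]


hier_z = {(0, 0): 0.5, (1, 0): 0.6}
print(surviving_tiles({(0, 0): 0.2, (1, 0): 0.9}, hier_z))
```

Rejecting a whole coarse tile in a single compare avoids stepping through its pixels at a fine level when nothing in the tile can be visible.
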
`The second section of the scan converter 540 works in
`conjunction with the Early Z interface of the Z buffer logic
`550 to step through the coarse tile at a fine level. In one
`embodiment, the coarse tile is subdivided into 2×2 regions
`(called “quads”). For each quad, coverage and Z (depth)
`information is computed. A single bit mode register specifies
`where Z buffering takes place. If the current Z buffering mode
`is set to “early”, each quad is passed to the Z buffer 555 where
`its Z values are compared against the values stored in the Z
`buffer at that location. Z values for those covered pixels which
`“pass” the Z compare are written back into the Z buffer, and
`a modified coverage mask describing the result of the Z
`compare test is passed back to the scan converter 540. At this
`stage, those quads for which none of the covered pixels
`passed the Z compare test are discarded. The early Z functionality
`attempts to minimize the amount of work applied by
`the unified shader and texture unit to quads which are not
`visible. The functionality of the scan converter and Early Z
`interface is further described in commonly owned co-pending
`U.S. patent application entitled “Scalable Rasterizer Interpolator”,
`with Ser. No. 10/730,864, filed Dec. 8, 2003, and is
`hereby fully incorporated by reference.
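
The early Z handling of a quad described above can be sketched as follows. The sketch assumes a "less than" depth compare with smaller Z closer to the viewer, and models the Z buffer as a dictionary keyed by pixel position. The function name `early_z_quad` and the pixel ordering within a quad are hypothetical.

```python
def early_z_quad(quad_z, coverage, z_buffer, origin):
    """Run the early Z test on one 2x2 quad.

    quad_z:   four Z values, ordered upper left, upper right, lower left,
              lower right (an ordering assumed for this sketch)
    coverage: four booleans, True where the primitive covers the pixel
    z_buffer: dict mapping (x, y) -> stored Z; updated in place on a pass
    origin:   (x, y) of the quad's upper left pixel

    Returns the modified coverage mask, or None if the quad is discarded
    because none of its covered pixels passed the compare.
    """
    ox, oy = origin
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
    mask = []
    for (dx, dy), z, covered in zip(offsets, quad_z, coverage):
        key = (ox + dx, oy + dy)
        # Assumed compare: pass when the new Z is closer than the stored Z.
        passed = covered and z < z_buffer.get(key, float("inf"))
        if passed:
            z_buffer[key] = z  # write the passing Z back into the Z buffer
        mask.append(passed)
    return mask if any(mask) else None


z_buffer = {(0, 0): 0.5, (1, 0): 0.5, (0, 1): 0.5, (1, 1): 0.5}
print(early_z_quad([0.3, 0.7, 0.4, 0.2], [True, True, False, True], z_buffer, (0, 0)))
```

A quad that returns None never reaches the shader or texture unit, which is the saving the early Z stage aims for.
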
`Rasterizer 560 computes up to multiple sets of 2D or 3D
`perspective correct texture addresses and colors for each
`quad. The time taken to transfer data for each quad depends
`on the total number of texture addresses and colors required
`by that quad.
`A unified shader 570 works in conjunction with the texture
`unit 585 and applies a programmed sequence of instructions
`to the rasterized values. These instructions may involve
`simple mathematical functions (add, multiply, etc.) and may
`also involve requests to the texture unit. A un
