(12) United States Patent
Morein et al.

(10) Patent No.: US 8,760,454 B2
(45) Date of Patent: *Jun. 24, 2014

(54) GRAPHICS PROCESSING ARCHITECTURE EMPLOYING A UNIFIED SHADER
(75) Inventors: Stephen L. Morein, Cambridge, MA (US); Laurent Lefebvre, Lachenaie (CA); Andrew E. Gruber, Arlington, MA (US); Andi Skende, Shrewsbury, MA (US)
(73) Assignee: ATI Technologies ULC, Markham, Ontario (CA)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days. This patent is subject to a terminal disclaimer.
(21) Appl. No.: 13/109,738
(22) Filed: May 17, 2011
(65) Prior Publication Data: US 2011/0216077 A1, Sep. 8, 2011
(63) Related U.S. Application Data: Continuation of application No. 12/791,597, filed on Jun. 1, 2010, now abandoned, which is a continuation of application No. 11/842,256, filed on Aug. 21, 2007, now abandoned, which is a continuation of application No. 11/117,863, filed on Apr. 29, 2005, now Pat. No. 7,327,369, which is a continuation of application No. 10/718,318, filed on Nov. 20, 2003, now Pat. No. 6,897,871.
(51) Int. Cl.: G06F 5/00 (2006.01)
(52) U.S. Cl.: USPC 345/501
(58) Field of Classification Search: CPC G06T 15/005; G06T 15/80; G06T 1/20; G06T 1/60; G09G 5/363; G06F 3/14. USPC 345/501. See application file for complete search history.
Primary Examiner: Kee M. Tung
Assistant Examiner: Frank Chen
(74) Attorney, Agent, or Firm: Faegre Baker Daniels LLP

(57) ABSTRACT
A graphics processing architecture in one example performs vertex manipulation operations and pixel manipulation operations by transmitting vertex data to a general purpose register block, and performing vertex operations on the vertex data by a processor unless the general purpose register block does not have enough available space therein to store incoming vertex data; and continues pixel calculation operations that are to be or are currently being performed by the processor based on instructions maintained in an instruction store until enough registers within the general purpose register block become available.

11 Claims, 5 Drawing Sheets
`Realtek Ex. 1001
`Case No. IPR2023-00922
`Page 1 of 11
`
`

`

`
`
`

`

[Sheet 1 of 5: drawing not reproduced]
`
`
`

`

[Sheet 2 of 5: FIG. 2A (PRIOR ART); FIG. 2B (PRIOR ART)]
`
`
`

`

[Sheet 3 of 5: FIG. 3 (PRIOR ART). Labeled blocks: graphics processor 40; vertex fetch 42; V-cache 44; vertex shader 46; vertex store 48; primitive assembly 50; rasterization engine 52; pixel shader 54; memory 55; texture cache 56; texture map 57; post raster processing]
`
`
`

`

[Sheet 4 of 5: FIG. 4A. Labeled blocks: graphics processor 60; indices; arbiter 64; multiplexer 66; unified shader 62; texture/vertex cache 68; memory data; parameter cache 70A; position cache 70B; primitive assembly 72; rasterization engine 74; render back end 76; memory controller 78; display controller; memory 82; display 84]
`
`
`

`

[Sheet 5 of 5: FIG. 4B. Labeled blocks: indices; vertex fetch 61A; vertex cache 61B. Further labels on this sheet, consistent with the unified shader described for FIG. 5: from MUX; memory fetch; constants; source B; source C; CPU 96; scalar 96A]
`
`
`

`

GRAPHICS PROCESSING ARCHITECTURE EMPLOYING A UNIFIED SHADER
`
`RELATED APPLICATIONS
`
This application is a continuation of co-pending U.S. application Ser. No. 12/791,597, filed Jun. 1, 2010, entitled "GRAPHICS PROCESSING ARCHITECTURE EMPLOYING A UNIFIED SHADER," having as inventors Steven Morein et al., owned by instant assignee and is incorporated herein by reference, which is a continuation of co-pending U.S. application Ser. No. 11/842,256, filed Aug. 21, 2007, entitled "GRAPHICS PROCESSING ARCHITECTURE EMPLOYING A UNIFIED SHADER," having as inventors Steven Morein et al., owned by instant assignee and is incorporated herein by reference, which is a continuation of U.S. application Ser. No. 11/117,863, filed Apr. 29, 2005, which has issued into U.S. Pat. No. 7,327,369, entitled "GRAPHICS PROCESSING ARCHITECTURE EMPLOYING A UNIFIED SHADER," having as inventors Steven Morein et al., and owned by instant assignee and is incorporated herein by reference, which is a continuation of U.S. application Ser. No. 10/718,318, filed on Nov. 20, 2003, which has issued into U.S. Pat. No. 6,897,871, entitled "GRAPHICS PROCESSING ARCHITECTURE EMPLOYING A UNIFIED SHADER," having as inventors Steven Morein et al., and owned by instant assignee and is incorporated herein by reference.
`
`FIELD OF THE INVENTION
`
The present invention generally relates to graphics processors and, more particularly, to a graphics processor architecture employing a single shader.
`
`BACKGROUND OF THE INVENTION
`
In computer graphics applications, complex shapes and structures are formed through the sampling, interconnection and rendering of simpler objects, referred to as primitives. An example of such a primitive is a triangle, or other suitable polygon. These primitives, in turn, are formed by the interconnection of individual pixels. Color and texture are then applied to the individual pixels that comprise the shape based on their location within the primitive and the primitive's orientation with respect to the generated shape, thereby generating the object that is rendered to a corresponding display for subsequent viewing.
The interconnection of primitives and the application of color and textures to generated shapes are generally performed by a graphics processor. Conventional graphics processors include a series of shaders that specify how, and with what corresponding attributes, a final image is drawn on a screen or other suitable display device. As illustrated in FIG. 1, a conventional shader 10 can be represented as a processing block 12 that accepts a plurality of bits of input data, such as, for example, object shape data (14) in object space (x, y, z); material properties of the object, such as color (16); texture information (18); luminance information (20); and viewing angle information (22), and provides output data (28) representing the object with texture and other appearance properties applied thereto (x, y, z).
In exemplary fashion, as illustrated in FIGS. 2A-2B, the shader accepts the vertex coordinate data representing cube 30 (FIG. 2A) as inputs and provides data representing, for example, a perspectively corrected view of the cube 30' (FIG. 2B) as an output. The corrected view may be provided, for example, by applying an appropriate transformation matrix to the data representing the initial cube 30. More specifically, the representation illustrated in FIG. 2B is provided by a vertex shader that accepts as inputs the data representing, for example, three vertices V, among others, of cube 30 and provides the corresponding angularly oriented vertices V', including any appearance attributes of corresponding cube 30'.
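The transformation step described above can be illustrated with a minimal sketch, assuming a simple perspective matrix applied to homogeneous vertex positions. The matrix values, the projection distance `D` and the helper names are hypothetical and chosen only for illustration; the patent does not specify them.

```python
# Illustrative only: a 4x4 matrix applied to homogeneous vertex positions.
# The matrix values and helper names are assumptions, not taken from the patent.

def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def transform_vertex(m, xyz):
    """Transform an (x, y, z) position and apply the perspective divide."""
    x, y, z, w = mat_vec(m, [xyz[0], xyz[1], xyz[2], 1.0])
    return (x / w, y / w, z / w)

# Simple perspective matrix: w' = z / D, so more distant points shrink.
D = 2.0  # hypothetical distance from the eye to the projection plane
PERSPECTIVE = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0 / D, 0.0],
]

# Eight corners of a unit cube placed 3 units in front of the eye.
cube = [(x, y, z + 3.0) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
cube_prime = [transform_vertex(PERSPECTIVE, v) for v in cube]
```

In this sketch the far face of the cube (z = 4) is scaled by one half, which is the kind of perspective correction FIG. 2B is said to represent.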
In addition to the vertex shader discussed above, a shader processing block that operates on the pixel level, referred to as a pixel shader, is also used when generating an object for display. Generally, the pixel shader provides the color value associated with each pixel of a rendered object. Conventionally, both the vertex shader and pixel shader are separate components that are configured to perform only a single transformation or operation. Thus, in order to perform a position and a texture transformation of an input, at least two shading operations and hence, at least two shaders, need to be employed. Conventional graphics processors require the use of both a vertex shader and a pixel shader in order to generate an object. Because both types of shaders are required, known graphics processors are relatively large in size, with most of the real estate being taken up by the vertex and pixel shaders.
In addition to the real estate penalty associated with conventional graphics processors, there is also a corresponding performance penalty associated therewith. In conventional graphics processors, the vertex shader and the pixel shader are juxtaposed in a sequential, pipelined fashion, with the vertex shader being positioned before and operating on vertex data before the pixel shader can operate on individual pixel data.
Thus, there is a need for an improved graphics processor employing a shader that is both space efficient and computationally effective.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
The present invention and the associated advantages and features thereof, will become better understood and appreciated upon review of the following detailed description of the invention, taken in conjunction with the following drawings, where like numerals represent like elements, in which:
FIG. 1 is a schematic block diagram of a conventional shader;
FIGS. 2A-2B are graphical representations of the operations performed by the shader illustrated in FIG. 1;
FIG. 3 is a schematic block diagram of a conventional graphics processor architecture;
FIG. 4A is a schematic block diagram of a graphics processor architecture according to the present invention;
FIG. 4B is a schematic block diagram of an optional input component to the graphics processor according to an alternate embodiment of the present invention; and
FIG. 5 is an exploded schematic block diagram of the unified shader employed in the graphics processor illustrated in FIG. 4A.
`
`DETAILED DESCRIPTION OF THE INVENTION
`
Briefly stated, the present invention is directed to a graphics processor that employs a unified shader that is capable of performing both the vertex operations and the pixel operations in a space saving and computationally efficient manner. In an exemplary embodiment, a graphics processor according to the present invention includes an arbiter circuit for selecting one of a plurality of inputs for processing in response to a control signal; and a shader, coupled to the arbiter, operative to process the selected one of the plurality of inputs, the shader including means for performing vertex operations and pixel operations, and wherein the shader performs one of the vertex operations or pixel operations based on the selected one of the plurality of inputs.
The shader includes a general purpose register block for storing at least the plurality of selected inputs, a sequencer for storing logical and arithmetic instructions that are used to perform vertex and pixel manipulation operations and a processor capable of executing both floating point arithmetic and logical operations on the selected inputs according to the instructions maintained in the sequencer. The shader of the present invention is referred to as a "unified" shader because it is configured to perform both vertex and pixel operations. By employing the unified shader of the present invention, the associated graphics processor is more space efficient than conventional graphics processors because the unified shader takes up less real estate than the conventional multi-shader processor architecture.
In addition, according to the present invention, the unified shader is more computationally efficient because it allows the shader to be flexibly allocated to pixels or vertices based on workload.
Referring now to FIG. 3, illustrated therein is a graphics processor incorporating a conventional pipeline architecture. As shown, the graphics processor 40 includes a vertex fetch block 42 which receives vertex information relating to a primitive to be rendered from an off-chip memory 55 on line 41. The fetched vertex data is then transmitted to a vertex cache 44 for storage on line 43. Upon request, the vertex data maintained in the vertex cache 44 is transmitted to a vertex shader 46 on line 45. As discussed above, an example of the information that is requested by and transmitted to the vertex shader 46 includes the object shape, material properties (e.g. color), texture information, and viewing angle. Generally, the vertex shader 46 is a programmable mechanism which applies a transformation position matrix to the input position information (obtained from the vertex cache 44), thereby providing data representing a perspectively corrected image of the object to be rendered, along with any texture or color coordinates thereof.
After performing the transformation operation, the data representing the transformed vertices are then provided to a vertex store 48 on line 47. The vertex store 48 then transmits the modified vertex information contained therein to a primitive assembly block 50 on line 49. The primitive assembly block 50 assembles, or converts, the input vertex information into a plurality of primitives to be subsequently processed. Suitable methods of assembling the input vertex information into primitives are known in the art and will not be discussed in greater detail here. The assembled primitives are then transmitted to a rasterization engine 52, which converts the previously assembled primitives into pixel data through a process referred to as walking. The resulting pixel data is then transmitted to a pixel shader 54 on line 53.
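The strictly sequential ordering of the conventional pipeline can be sketched as follows. This is a simplified model under stated assumptions: the vertex stage is a uniform scale standing in for a matrix transform, primitives are assembled as a triangle list, walking is modeled as a bounding-box scan with an edge-function test, and coordinates are non-negative. None of these specifics come from the patent, and all function names are hypothetical.

```python
def vertex_shader(position, scale=2.0):
    """Stand-in for the vertex stage: a uniform scale instead of a full matrix."""
    return (position[0] * scale, position[1] * scale)

def assemble_triangles(vertices):
    """Group consecutive vertices into triangles (triangle-list assumption)."""
    return [tuple(vertices[i:i + 3]) for i in range(0, len(vertices) - 2, 3)]

def edge(a, b, p):
    """Signed area test: positive when p lies to the left of edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(tri):
    """Walk the triangle's bounding box and keep pixels whose centers are inside."""
    a, b, c = tri
    xs = [int(v[0]) for v in tri]
    ys = [int(v[1]) for v in tri]
    pixels = []
    for y in range(min(ys), max(ys) + 1):
        for x in range(min(xs), max(xs) + 1):
            p = (x + 0.5, y + 0.5)
            w0, w1, w2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
            inside_ccw = w0 >= 0 and w1 >= 0 and w2 >= 0
            inside_cw = w0 <= 0 and w1 <= 0 and w2 <= 0
            if inside_ccw or inside_cw:
                pixels.append((x, y))
    return pixels

def pixel_shader(pixel):
    """Stand-in for the pixel stage: a constant color per covered pixel."""
    return (pixel, (255, 255, 255))

def conventional_pipeline(raw_vertices):
    """Run the stages in fixed order; pixel work cannot start before vertex work ends."""
    transformed = [vertex_shader(v) for v in raw_vertices]   # vertex shader 46
    shaded = []
    for tri in assemble_triangles(transformed):              # primitive assembly 50
        for pixel in rasterize(tri):                         # rasterization engine 52
            shaded.append(pixel_shader(pixel))               # pixel shader 54
    return shaded
```

The point of the sketch is the ordering: each stage is a separate unit, and the pixel stage only sees data after the vertex stage has finished with it.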
The pixel shader 54 generates the color and additional appearance attributes that are to be applied to a given pixel, and applies the appearance attributes to the respective pixels. In addition, the pixel shader 54 is capable of fetching texture data from a texture map 57 as indexed by the pixel data from the rasterization engine 52 by transmitting such information on line 55 to the texture map. The requested texture data is then transmitted back from the texture map 57 on line 57 and stored in a texture cache 56 before being routed to the pixel shader on line 58. Once the texture data has been received, the pixel shader 54 then performs specified logical or arithmetic operations on the received texture data to generate the pixel color or other appearance attribute of interest. The generated pixel appearance attribute is then combined with a base color, as provided by the rasterization engine on line 53, to thereby provide a pixel color to the pixel corresponding to the position of interest. The pixel appearance attribute present on line 59 is then transmitted to post raster processing blocks (not shown).
As described above, the conventional graphics processor 40 requires the use of two separate shaders: a vertex shader 46 and a pixel shader 54. A drawback associated with such an architecture is that the overall footprint of the graphics processor is relatively large as the two shaders take up a large amount of real estate. Another drawback associated with conventional graphics processor architectures is that they can exhibit poor computational efficiency.
Referring now to FIG. 4A, in an exemplary embodiment, the graphics processor 60 of the present invention includes a multiplexer 66 having vertex (e.g. indices) data provided at a first input thereto and interpolated pixel parameter (e.g. position) data and attribute data from a rasterization engine 74 provided at a second input. A control signal generated by an arbiter 64 is transmitted to the multiplexer 66 on line 63. The arbiter 64 determines which of the two inputs to the multiplexer 66 is transmitted to a unified shader 62 for further processing. The arbitration scheme employed by the arbiter 64 is as follows: the vertex data on the first input of the multiplexer 66 is transmitted to the unified shader 62 on line 65 if there are enough resources available in the unified shader to operate on the vertex data; otherwise, the interpolated pixel parameter data present on the second input will be passed to the unified shader 62 for further processing.
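The arbitration rule can be expressed as a small decision function. The sketch below is an assumption-laden illustration: it models "resources" as a count of free general purpose registers (the description of FIG. 5 gives sixty-four entries), and the names `ShaderState`, `arbitrate` and `vertex_registers_needed` are hypothetical.

```python
from dataclasses import dataclass

GPR_ENTRIES = 64  # register block size stated in the description of FIG. 5

@dataclass
class ShaderState:
    """Hypothetical view of the unified shader's free register count."""
    free_registers: int = GPR_ENTRIES

def arbitrate(state, vertex_registers_needed, vertex_pending, pixel_pending):
    """Return which multiplexer input to forward: 'vertex', 'pixel', or None."""
    # Vertex data is selected only when the shader has room to operate on it.
    if vertex_pending and state.free_registers >= vertex_registers_needed:
        return "vertex"
    # Otherwise the interpolated pixel parameter data is passed through.
    if pixel_pending:
        return "pixel"
    return None
```

For example, with eight free registers and a vertex batch needing four, the function returns `"vertex"`; with only two free registers it falls through to `"pixel"`.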
Referring briefly to FIG. 5, the unified shader 62 will now be described. As illustrated, the unified shader 62 includes a general purpose register block 92, a plurality of source registers, including source register A 93, source register B 95, and source register C 97, a processor (e.g. CPU) 96 and a sequencer 99. The general purpose register block 92 includes sixty-four registers, or available entries, for storing the information transmitted from the multiplexer 66 on line 65 or any other information to be maintained within the unified shader. The data present in the general purpose register block 92 is transmitted to the plurality of source registers via line 109.
The processor 96 may be comprised of a dedicated piece of hardware or can be configured as part of a general purpose computing device (i.e. personal computer). In an exemplary embodiment, the processor 96 is adapted to perform 32-bit floating point arithmetic operations as well as a complete series of logical operations on corresponding operands. As shown, the processor is logically partitioned into two sections. Section 96 is configured to execute, for example, the 32-bit floating point arithmetic operations of the unified shader. The second section, 96A, is configured to perform scalar operations (e.g. log, exponent, reciprocal square root) of the unified shader.
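The logical partition of the processor can be sketched as a dispatcher that routes each operation to one of two tables. This is a rough model only: the operation names, the choice of base 2 for the logarithm and exponent, and the use of `struct` to round results to 32-bit precision are all assumptions made for illustration.

```python
import math
import struct

def f32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

# Operations handled by section 96 in this sketch.
ARITHMETIC_OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}

# Operations handled by section 96A in this sketch.
SCALAR_OPS = {
    "log": math.log2,                      # base 2 is an assumption
    "exp": lambda x: 2.0 ** x,             # base 2 is an assumption
    "rsq": lambda x: 1.0 / math.sqrt(x),   # reciprocal square root
}

def execute(op, a, b=None):
    """Route an operation to the arithmetic or scalar section, rounding to 32 bits."""
    if op in SCALAR_OPS:
        return f32(SCALAR_OPS[op](a))
    return f32(ARITHMETIC_OPS[op](a, b))
```

Here `execute("rsq", 4.0)` yields `0.5` and `execute("mul", 1.5, 2.0)` yields `3.0`; values such as 0.1 that are not exactly representable in 32 bits come back rounded.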
The sequencer 99 includes constants block 91 and an instruction store 98. The constants block 91 contains, for example, the several transformation matrices used in connection with vertex manipulation operations. The instruction store 98 contains the necessary instructions that are executed by the processor 96 in order to perform the respective arithmetic and logic operations on the data maintained in the general purpose register block 92 as provided by the source registers 93-97. The instruction store 98 further includes memory fetch instructions that, when executed, cause the unified shader 62 to fetch texture and other types of data from memory 82 (FIG. 4A). In operation, the sequencer 99 determines whether the next instruction to be executed (from the instruction store 98) is an arithmetic or logical instruction or a memory (e.g. texture fetch) instruction. If the next instruction is a memory instruction or request, the sequencer 99 sends the request to a fetch block (not shown) which retrieves the required information from memory 82 (FIG. 4A). The retrieved information is then transmitted to the sequencer 99 through the vertex texture cache 68 (FIG. 4A) as described in greater detail below.
If the next instruction to be executed is an arithmetic or logical instruction, the sequencer 99 causes the appropriate operands to be transferred from the general purpose register block 92 into the appropriate source registers (93, 95, 97) for execution, and an appropriate signal is sent to the processor 96 on line 101 indicating what operation or series of operations are to be executed on the several operands present in the source registers. At this point, the processor 96 executes the instructions on the operands present in the source registers and provides the result on line 85. The information present on line 85 may be transmitted back to the general purpose register block 92 for storage, or transmitted to succeeding components of the graphics processor 60.
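The sequencer's two-way dispatch can be sketched as a loop over a tiny instruction list. The instruction encoding (four-field tuples), the two ALU operations and the dictionary standing in for memory 82 are hypothetical; the sketch only illustrates the split between memory instructions and arithmetic or logical instructions.

```python
# Hypothetical ALU operations available to the processor in this sketch.
ALU_OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}

def run_sequencer(instruction_store, gpr, memory):
    """Execute (op, dst, src_a, src_b) tuples against a register list.

    For a "fetch" instruction, src_a is a memory address and src_b is unused.
    """
    for op, dst, src_a, src_b in instruction_store:
        if op == "fetch":
            # Memory instruction: retrieve the value and store it in a register.
            gpr[dst] = memory[src_a]
        else:
            # ALU instruction: stage operands in source registers, then execute.
            source_a, source_b = gpr[src_a], gpr[src_b]
            gpr[dst] = ALU_OPS[op](source_a, source_b)
    return gpr

# Usage: fetch a texel-like value, then combine it with two register operands.
registers = [0.0] * 64
registers[0], registers[1] = 2.0, 3.0
program = [("fetch", 2, 100, None), ("mul", 3, 0, 1), ("add", 4, 3, 2)]
run_sequencer(program, registers, {100: 0.5})
```

Writing the result back into the register list corresponds to the option of returning the output on line 85 to the general purpose register block for storage.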
As discussed above, the instruction store 98 maintains both vertex manipulation instructions and pixel manipulation instructions. Therefore, the unified shader 62 of the present invention is able to perform both vertex and pixel operations, as well as execute memory fetch operations. As such, the unified shader 62 of the present invention is able to perform both the vertex shading and pixel shading operations on data in the context of a graphics controller based on information passed from the multiplexer. By being adapted to perform memory fetches, the unified shader of the present invention is able to perform additional processes that conventional vertex shaders cannot perform, while at the same time performing pixel operations.
The unified shader 62 has the ability to simultaneously perform vertex manipulation operations and pixel manipulation operations at various degrees of completion by being able to freely switch between such programs or instructions, maintained in the instruction store 98, very quickly. In application, vertex data to be processed is transmitted into the general purpose register block 92 from multiplexer 66. The instruction store 98 then passes the corresponding control signals to the processor 96 on line 101 to perform such vertex operations. However, if the general purpose register block 92 does not have enough available space therein to store the incoming vertex data, such information will not be transmitted as the arbitration scheme of the arbiter 64 is not satisfied. In this manner, any pixel calculation operations that are to be, or are currently being, performed by the processor 96 are continued, based on the instructions maintained in the instruction store 98, until enough registers within the general purpose register block 92 become available. Thus, through the sharing of resources within the unified shader 62, processing of image data is enhanced as there is no down time associated with the processor 96.
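The interleaving behavior can be illustrated with a toy cycle-by-cycle simulation. It rests on several assumptions not stated in the patent: every job (vertex or pixel) reserves a fixed number of registers until it completes, at most one job is admitted per cycle, and the processor issues one instruction per cycle in round-robin order among resident jobs. The function name and job encoding are hypothetical.

```python
from collections import deque

def simulate(vertex_jobs, pixel_jobs, gpr_size=64):
    """Each job is (registers_needed, instruction_count).

    Returns a list naming the kind of work executed in each cycle.
    """
    vertex_q, pixel_q = deque(vertex_jobs), deque(pixel_jobs)
    resident = deque()   # entries: [kind, registers, remaining_instructions]
    free = gpr_size
    trace = []
    while vertex_q or pixel_q or resident:
        # Arbiter: prefer vertex data when it fits, otherwise admit pixel data.
        if vertex_q and vertex_q[0][0] <= free:
            regs, count = vertex_q.popleft()
            resident.append(["vertex", regs, count])
            free -= regs
        elif pixel_q and pixel_q[0][0] <= free:
            regs, count = pixel_q.popleft()
            resident.append(["pixel", regs, count])
            free -= regs
        if not resident:
            raise ValueError("a job needs more registers than the block holds")
        # Processor: run one instruction from the oldest resident job, then rotate.
        job = resident.popleft()
        trace.append(job[0])
        job[2] -= 1
        if job[2] > 0:
            resident.append(job)
        else:
            free += job[1]   # completed job releases its registers
    return trace

trace = simulate(vertex_jobs=[(40, 3), (40, 3)], pixel_jobs=[(16, 2)])
```

In this run the second vertex job cannot be admitted while the first still holds 40 of the 64 registers, so pixel work is admitted and executed in the meantime; the trace has one entry per instruction, meaning no cycle is spent idle.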
Referring back to FIG. 4A, the graphics processor 60 further includes a cache block 70, including a parameter cache 70A and a position cache 70B which accepts the pixel based output of the unified shader 62 on line 85 and stores the respective pixel parameter and position information in the corresponding cache. The pixel information present in the cache block 70 is then transmitted to the primitive assembly block 72 on line 71. The primitive assembly block 72 is responsible for assembling the information transmitted thereto from the cache block 70 into a series of triangles, or other suitable primitives, for further processing. The assembled primitives are then transmitted on line 73 to rasterization engine block 74, where the transmitted primitives are then converted into individual pixel data information through a walking process, or any other suitable pixel generation process. The resulting pixel data from the rasterization engine block 74 is the interpolated pixel parameter data that is transmitted to the second input of the multiplexer 66 on line 75.
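One common way to produce interpolated pixel parameters is barycentric interpolation over the pixels a triangle covers. The patent does not say that the rasterization engine works this way; the sketch below swaps in that standard technique purely to show what "interpolated pixel parameter data" can look like. The function names are hypothetical and coordinates are assumed non-negative.

```python
def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def interpolate_triangle(positions, attributes):
    """Return [((x, y), value), ...] for pixel centers covered by the triangle.

    positions: three (x, y) screen-space vertices.
    attributes: one scalar parameter per vertex, in the same order.
    """
    a, b, c = positions
    area = edge(a, b, c)
    if area == 0:
        return []  # degenerate triangle covers no pixels
    xs = [int(v[0]) for v in positions]
    ys = [int(v[1]) for v in positions]
    out = []
    for y in range(min(ys), max(ys) + 1):
        for x in range(min(xs), max(xs) + 1):
            p = (x + 0.5, y + 0.5)
            # Barycentric weights: each is a sub-triangle area over the full area.
            wa = edge(b, c, p) / area
            wb = edge(c, a, p) / area
            wc = edge(a, b, p) / area
            if wa >= 0 and wb >= 0 and wc >= 0:
                value = wa * attributes[0] + wb * attributes[1] + wc * attributes[2]
                out.append(((x, y), value))
    return out

# Usage: a parameter that is 0 at two vertices and 4 at the vertex (4, 0).
samples = dict(interpolate_triangle([(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)], [0.0, 4.0, 0.0]))
```

With these inputs the interpolated value at each covered pixel equals the x coordinate of its center, so pixel (0, 0) receives 0.5.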
In those situations when vertex data is transmitted to the unified shader 62 through the multiplexer 66, the resulting vertex data generated by the processor 96 is transmitted to a render back end block 76 which converts the resulting vertex data into at least one of several formats suitable for later display on display device 84. For example, if a stained glass appearance effect is to be applied to an image, the information corresponding to such appearance effect is associated with the appropriate position data by the render back end 76. The information from the render back end 76 is then transmitted to memory 82 and a display controller line 80 via memory controller 78. Such appropriately formatted information is then transmitted on line 83 for presentation on display device 84.
Referring now to FIG. 4B, shown therein is a vertex block 61 which is used to provide the vertex information at the first input of the multiplexer 66 according to an alternate embodiment of the present invention. The vertex block 61 includes a vertex fetch block 61A which is responsible for retrieving vertex information from memory 82, if requested, and transmitting that vertex information into the vertex cache 61B. The information stored in the vertex cache 61B comprises the vertex information that is coupled to the first input of multiplexer 66.
As discussed above, the graphics processor 60 of the present invention incorporates a unified shader 62 which is capable of performing both vertex manipulation operations and pixel manipulation operations based on the instructions stored in the instruction store 98. In this fashion, the graphics processor 60 of the present invention takes up less real estate than conventional graphics processors as separate vertex shaders and pixel shaders are no longer required. In addition, as the unified shader 62 is capable of alternating between performing vertex manipulation operations and pixel manipulation operations, graphics processing efficiency is enhanced as one type of data operation is not dependent upon another type of data operation. Therefore, any performance penalties experienced as a result of dependent operations in conventional graphics processors are overcome.
The above detailed description of the present invention and the examples described therein have been presented for the purposes of illustration and description. It is therefore contemplated that the present invention cover any and all modifications, variations and equivalents that fall within the spirit and scope of the basic underlying principles disclosed and claimed herein.
What is claimed is:
1. A method carried out by a unified shader comprising:
  performing vertex manipulation operations and pixel manipulation operations by transmitting vertex data to a general purpose register block, and performing vertex operations on the vertex data by a processor within the unified shader unless the general purpose register block does not have enough available space therein to store incoming vertex data; and
  continuing pixel calculation operations that are to be or are currently being performed by the processor based on instructions maintained in an instruction store until enough registers within the general purpose register block become available.
2. A unified shader, comprising:
  a general purpose register block for maintaining data;
  a processor unit;
  a sequencer, coupled to the general purpose register block and the processor unit, the sequencer maintaining instructions operative to cause the processor unit to execute vertex calculation and pixel calculation operations on selected data maintained in the general purpose register block; and
  wherein the processor unit executes instructions that generate a pixel color in response to selected data from the general purpose register block and generates vertex position and appearance data in response to selected data from the general purpose register block.
3. A unified shader comprising:
  a processor unit operative to perform vertex calculation operations and pixel calculation operations; and
  shared resources, operatively coupled to the processor unit;
  the processor unit operative to use the shared resources for either vertex data or pixel information and operative to perform pixel calculation operations until enough shared resources become available and then use the shared resources to perform vertex calculation operations.
4. A unified shader comprising:
  a processor unit operative to perform vertex calculation operations and pixel calculation operations; and
  shared resources, operatively coupled to the processor unit;
  the processor unit operative to use the shared resources for either vertex data or pixel information and operative to perform vertex calculation operations until enough shared resources become available and then use the shared resources to perform pixel calculation operations.
5. A unified shader comprising:
  a processor unit;
  a sequencer coupled to the processor unit, the sequencer maintaining instructions operative to cause the processor unit to execute vertex calculation and pixel calculation operations on sel
