`US005394524A
`[11] Patent Number: 5,394,524
`[45] Date of Patent: Feb. 28, 1995
`
`FOREIGN PATENT DOCUMENTS
`WO88/09539 12/1988 WIPO .......... G06F 15/72
`WO89/01664 2/1989 WIPO .......... G06F 12/02
`
`
`
`Primary Examiner: Mark R. Powell
`Assistant Examiner: Kee M. Tung
`Attorney, Agent, or Firm: William A. Kinnaman; Duke
`W. Yee; Andrew J. Dillon
`[57]
`ABSTRACT
`In a graphics subsystem, a highly interactive two-di-
`mensional (2D) data stream and a computationally in-
`tensive three-dimensional (3D) data stream are pro-
`cessed concurrently in such a manner that processing of
`the 2D data stream is not held up by processing of the
`3D data stream. A 3D geometry subsystem having a
`parallel pipeline architecture is used to process the 3D
`data stream, while a 2D subsystem concurrently processes
`the 2D data stream in parallel with the 3D subsystem.
`A reordering device couples the processed 2D
`and 3D data streams to a common raster subsystem. The
`reordering device, which contains an internal buffer,
`reorders any order-dependent elements of the 3D data
`stream appearing at the output of the 3D geometry
`subsystem in an order different from the order in which
`they were supplied to the input end. The reordering
`device prioritizes the 2D data stream relative to the 3D
`data stream so that elements of the 2D data stream
`arriving from the 2D subsystem are passed to the raster
`subsystem almost immediately, without having to wait
`for elements of the 3D data stream.
`
`16 Claims, 6 Drawing Sheets
`
`United States Patent [19]
`DiNicola et al.
`
`[54] METHOD AND APPARATUS FOR PROCESSING TWO GRAPHICS DATA STREAMS IN PARALLEL
`
`[75] Inventors: Paul D. DiNicola, Hurley; Joseph C. Kantz, Saugerties; Omar M. Rahim, Kingston; David A. Rice, New Paltz; Edward M. Ruddick, Woodstock, all of N.Y.
`
`[73] Assignee: International Business Machines Corporation, Armonk, N.Y.
`
`[21] Appl. No.: 983,455
`
`[22] Filed: Nov. 30, 1992
`
`Related U.S. Application Data
`[63] Continuation-in-part of Ser. No. 926,724, Aug. 7, 1992, Pat. No. 5,315,701.
`
`[51] Int. Cl.5: G06F 3/14
`[52] U.S. Cl.: 395/163
`[58] Field of Search (partly illegible in the scan): 395/650, 5/119, 1 , 141, 162-164, 345/24, 112, 133, 204, 214; 364/200 MS File, 900 MS File, 228
`
`[56] References Cited
`U.S. PATENT DOCUMENTS
`4,550,386 10/1985 Hirosawa et al.
`4,737,921 4/1988 Goldwasser et al. .......... 395/163
`4,987,550 1/1991 Leonard et al. .......... 395/150
`5,045,995 9/1991 Levinthal et al. .......... 364/200
`5,136,593 8/1992 Rice .......... 364/DIG. 1
`
`[Front-page drawing: block diagram excerpt. Legend: address/data bus; communications path; to system bus. Labeled blocks: raster subsystem; reordering device. Reference numerals as they appear in the scan: 330, 325, 328.]
`
`U.S. Patent    Feb. 28, 1995    Sheet 1 of 6    5,394,524
`
`[Drawing sheet 1: figure not recoverable from the scan; visible reference numeral: 52.]
`
`U.S. Patent    Feb. 28, 1995    Sheet 2 of 6    5,394,524
`
`[Drawing sheet 2: figure text not recoverable from the scan.]
`
`U.S. Patent    Feb. 28, 1995    Sheet 3 of 6    5,394,524
`
`[Drawing sheet 3: figure text not recoverable from the scan.]
`
`U.S. Patent    Feb. 28, 1995    Sheet 4 of 6    5,394,524
`
`[Drawing sheet 4, labels recoverable from the scan:]
`3D PORT FIFO (input at top), entries from top to bottom: END TAG; DATA; DATA; DATA; SEQ. NO.; DATA; DATA; DATA; SEQ. NO.
`2D PORT FIFO, entries from top to bottom: END TAG, followed by DATA entries.
`State diagram states: PROCESS 2D PORT; PROCESS SELECTED 3D PORT.
`State diagram transition labels: DATA IN 2D PORT; NEW SEQ. NO. OR END TAG; DATA IN SELECTED 3D PORT AND NOT IN 2D PORT.
`
`U.S. Patent    Feb. 28, 1995    Sheet 5 of 6    5,394,524
`
`[FIG. 7A flowchart, labels recoverable from the scan, in scan order:]
`[first word illegible] CURRENT SEQUENCE NUMBER (CUR_SEQ_NUM)
`702
`CHECK BOTTOM OF EACH ENABLED FIFO
`IS THERE DATA* AT THE BOTTOM OF ANY FIFO?
`*: DATA AS OPPOSED TO SEQUENCE NUMBERS, END TAGS, OR EMPTY FIFOS
`708
`NO
`712
`SELECT LOWEST SEQUENCE NUMBER FROM BOTTOM WORDS OF FIFOS
`YES, WAIT FOR NEW SEQ NUMBER
`ARE ANY FIFOS EMPTY?
`NO
`724
`ERROR CONDITION
`726
`END
`YES (ORDER INDEPENDENT WORK)
`NO (ORDER DEPENDENT WORK)
`LOWEST SEQUENCE NUMBER [comparison illegible] CUR_SEQ_NUM+1?
`720
`INCREMENT CUR_SEQ_NUM
`FIG. 7A
`
`U.S. Patent    Feb. 28, 1995    Sheet 6 of 6    5,394,524
`
`[Drawing sheet 6, flowchart blocks recoverable from the scan:]
`MOVE DATA FROM FIFO TO OUTPUT UNTIL A NEW SEQUENCE NUMBER OR END TAG IS ENCOUNTERED, OR UNTIL THE FIFO IS EMPTY.
`MOVE DATA FROM CP FIFO TO OUTPUT UNTIL AN END TAG IS ENCOUNTERED
`DISCARD SEQUENCE NUMBER (THIS STARTS TRANSFER OF DATA FROM THIS FIFO)
`
`METHOD AND APPARATUS FOR PROCESSING
`TWO GRAPHICS DATA STREAMS IN PARALLEL
`
`REFERENCE TO RELATED APPLICATION
`
`This application is a continuation-in-part of applica-
`tion Ser. No. 07/926,724, filed Aug. 7, 1992, now U.S.
`Pat. No. 5,315,701, entitled “A Method and System for
`Processing Graphics Data Streams Utilizing Scalable
`Processing Nodes”.
`
`BACKGROUND OF THE INVENTION
`1. Field of the Invention
`The present invention relates in general to a method
`and system for improved graphical computation and in
`particular to a method and system for utilizing graphical
`computation to process a data stream. Still more partic-
`ularly, the present invention relates to a method and
`system of graphical computation to efficiently process a
`graphics data stream.
`2. Description of the Related Art
`Data processing systems such as personal computers
`and workstations are commonly utilized to run comput-
`er-aided design (CAD) applications, computer-aided
`manufacturing (CAM) applications, and computer-
`aided software engineering (CASE) tools. Engineers,
`scientists, technicians, and others employ these applica-
`tions daily. These applications involve complex calcula-
`tions, such as finite element analysis, to model stress in
`structures. Other applications include chemical or mo-
`lecular modelling applications. CAD/CAM/CASE
`applications are normally graphics intensive in terms of
`the information relayed to the user. Other data process-
`ing system users may employ other graphics intensive
`applications such as desktop publishing applications.
`Ideally, such systems should be able to process two
`graphics data streams in parallel and interleave the re-
`sulting drawing information without mutual interfer-
`ence. One of the data streams might consist of two-di-
`mensional (2D) drawing primitives and window manip-
`ulation commands, while the other might be primarily
`three-dimensional (3D) drawing primitives and attri-
`butes. The 3D data stream processing should be ex-
`tremely high performance, while the 2D processing
`should be very low latency. In addition, the time re-
`quired to swap between these two data streams should
`be minimal. The system should be able to use current
`processor technology. Overall, the system should pro-
`vide consistent high-performance, low-latency 2D pro-
`cessing in conjunction with providing a scalable range
`of 3D processing.
`Systems which are currently on the market providing
`2D and 3D data stream support process these data
`streams sequentially, i.e., by time-multiplexing them on
`a single processor or processor complex. They process
`one data stream for a period of time, then they process
`the second for a period of time, and then they return to
`the first. This approach is an unacceptable solution since
`intermixing a data stream which is computationally
`intensive with one that is highly interactive generally
`degrades both. The computationally intensive one (3D)
`does not get as much processor time as it might, and the
`interactive one (2D) must wait for the 3D data stream to
`be processed before getting an opportunity to display
`the interactive information that the user is waiting for.
`Currently available systems require large amounts of
`context information to be swapped in order to switch
`from processing 3D information to processing 2D infor-
`mation and back.
`Fast 3D graphics running alongside (or within) an
`interactive 2D windowed environment (e.g., X Windows)
`requires a system
`which can process these two data streams efficiently
`without mutual interference. However, the traditional
`approach of time-slicing between the two types of data
`streams can cause serious performance problems, as
`noted above.
`
`SUMMARY OF THE INVENTION
`
`In general, the present invention contemplates a scalable
`parallel pipeline graphics system with separate
`processor complexes for the 2D data stream (the con-
`trol processor) and for the 3D data stream (attribute and
`node processors). The 3D subsystem is optimized to
`provide extremely high floating-point performance,
`which is required for 3D graphics. The 2D subsystem
`has less processing capacity, but has faster, more direct
`access to the raster subsystem that is used to actually
`modify the pixels seen on the screen.
`In accordance with the present invention, a compos-
`ite graphics data stream comprising a highly interactive
`2D data stream and a computationally intensive 3D data
`stream is partitioned into its constituent 2D and 3D
`streams, which are sent to separate 2D and 3D subsys-
`tems operating in parallel with one another. The pro-
`cessed 2D and 3D data streams are coupled to a com-
`mon raster subsystem by a reordering device or priorit-
`izer, which prioritizes the 2D data stream relative to the
`3D stream so that elements of the 2D data stream arriv-
`ing from the 2D subsystem are passed to the raster
`subsystem almost immediately, without having to wait
`for elements of the 3D data stream.
`
`Preferably, the 3D subsystem comprises a parallel
`pipeline system having a plurality of processing nodes,
`each of which contains a processor pipeline. Segments
`of the 3D data stream are distributed to the various
`processing nodes in such a manner as to balance the
`workload among the nodes. To maintain the relative
`sequence of 3D primitives that must be processed by the
`raster subsystem in a given order (and are therefore
`order dependent), the 3D segments are assigned sequence
`numbers as they are distributed to the processing
`nodes. Successively dispatched order-independent
`segments are assigned the same sequence number, while
`order-dependent segments are assigned successively
`increasing sequence numbers. In addition to the se-
`quence numbers, end tags are sent to the processing
`nodes to indicate hiatuses in the incoming 3D data
`stream.
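
The numbering rule just described can be illustrated with a minimal sketch. It assumes one reading of the text: each order-dependent segment receives a fresh sequence number, and a run of consecutive order-independent segments shares one number. The names `WorkGroup` and `assign_sequence_numbers` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class WorkGroup:
    """Hypothetical stand-in for one segment of the 3D data stream."""
    name: str
    order_dependent: bool


def assign_sequence_numbers(
    groups: Iterable[WorkGroup], start: int = 0
) -> List[Tuple[int, WorkGroup]]:
    """Tag each group with a sequence number.

    Assumed rule: consecutive order-independent groups share a number;
    every other group gets the next higher number.
    """
    tagged: List[Tuple[int, WorkGroup]] = []
    seq = start
    first = True
    prev_independent = False
    for group in groups:
        shares_number = (
            not first and prev_independent and not group.order_dependent
        )
        if not first and not shares_number:
            seq += 1
        tagged.append((seq, group))
        prev_independent = not group.order_dependent
        first = False
    return tagged


stream = [
    WorkGroup("A", True),
    WorkGroup("B", False),
    WorkGroup("C", False),
    WorkGroup("D", True),
    WorkGroup("E", True),
]
numbers = [seq for seq, _ in assign_sequence_numbers(stream)]
```

Under this reading, groups B and C may reach the raster subsystem in either order, while A, D, and E keep their relative positions.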
`
`Segments of the 2D data stream that are sent to the
`2D subsystem are not assigned sequence numbers; al-
`though the processing of these segments is generally
`order dependent, they necessarily retain their original
`order since, unlike the 3D subsystem, the 2D subsystem
`does not have parallel processing channels. On the
`other hand, as in the 3D subsystem, end tags are sent to
`the 2D subsystem to indicate hiatuses in the incoming
`2D data stream.
`The prioritizer interposed between the 2D and 3D
`subsystems and the raster subsystem has a 2D port for
`the 2D subsystem and a 3D port for each processing
`node of the 3D subsystem. Each port has associated
`with it a FIFO for buffering incoming data pending its
`further processing. In general, the prioritizer processes
`the 3D data (by dispatching it to the raster subsystem) in
`order of sequence number, so that order-dependent
`primitives maintain their original sequence. The priorit-
`izer services each of the 3D ports in turn in recirculat-
`ing fashion, servicing a given port until it encounters
`either a new sequence number or an end tag indicating
`a temporarily empty port. Before proceeding to service
`the next 3D port, however, the prioritizer checks the
`2D port to determine whether it is empty. If not, the
`prioritizer services the 2D port until it encounters an
`end tag (indicating a gap in the 2D data stream), at
`which time it switches to the next 3D port.
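
The servicing discipline described above can be sketched in software, although the preferred embodiment is hardware. The sketch below models one recirculating pass over the 3D ports, with the 2D port checked before each 3D port. It deliberately ignores the selection of ports by sequence number and shows only the 2D-priority interleaving. The entry encodings (`END`, `("SEQ", n)`) and all function names are assumptions made for illustration.

```python
from collections import deque

END = "END"  # hypothetical end-tag marker


def is_seq(entry):
    """True for a sequence-number entry, encoded here as ("SEQ", n)."""
    return isinstance(entry, tuple) and len(entry) == 2 and entry[0] == "SEQ"


def drain_2d(fifo_2d, out):
    """Pass 2D entries to the output until an end tag or an empty FIFO."""
    while fifo_2d:
        entry = fifo_2d.popleft()
        if entry == END:
            break
        out.append(("2D", entry))


def service_3d_port(port, fifo, out):
    """Service one 3D port until a new sequence number or an end tag."""
    started = False
    while fifo:
        entry = fifo[0]
        if entry == END:
            fifo.popleft()  # consume the end tag and stop
            break
        if is_seq(entry):
            if started:
                break  # new sequence number: leave it for a later pass
            fifo.popleft()  # discard the leading sequence number
            started = True
            continue
        out.append((f"3D[{port}]", fifo.popleft()))
        started = True


def one_pass(fifo_2d, fifos_3d):
    """One pass over all 3D ports, checking the 2D port before each."""
    out = []
    for port, fifo in enumerate(fifos_3d):
        drain_2d(fifo_2d, out)
        service_3d_port(port, fifo, out)
    return out


fifo_2d = deque(["menu", END, "window", END])
fifos_3d = [
    deque([("SEQ", 0), "a", "b", ("SEQ", 2), "c", END]),
    deque([("SEQ", 1), "x", END]),
]
result = one_pass(fifo_2d, fifos_3d)
```

In this run the pending 2D work ("menu", then "window") is emitted ahead of each 3D port's data, which is the low-latency behavior the text attributes to the prioritizer.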
`The primary advantage of this system over the prior
`art is that it allows the 2D and 3D data streams to be
`processed concurrently and interleaved in such a way
`that the 2D data stream is not forced to wait for large
`amounts of 3D data to be processed before it can be
`processed.
`As an example, in some systems, if a computationally
`intense piece of 3D work is given to the system to do
`(such as a NURBS surface or a high quality factor cir-
`cle), all 2D work on the system must stop while the 3D
`computations are completed. The 3D work may take
`many seconds or even minutes to complete. During this
`time, if the user wants to pop up a menu or open a new
`window, he will find that the system will not respond to
`the request until the 3D work is done. This is very
`disconcerting to the user and may even lead him to
`believe that the system is dead. In the present system, by
`contrast, the 3D output is temporarily interrupted while
`the 2D work goes on, so the menu or window appears
`almost as quickly as if the 3D work were not going on.
`Furthermore, the 3D output only is affected. The 3D
`processing continues with the output being buffered
`until the prioritizer again selects the 3D subsystem.
`Note that 3D processing is never halted.
`The separate 2D subsystem, with a direct, prioritized
`path into the raster subsystem via the prioritizer, provides
`consistent high-performance, low-latency
`processing for a 2D (e.g., X Windows) data stream in
`conjunction with a 3D subsystem which is indepen-
`dently scalable to meet a range of processing needs.
`An additional advantage of this system is the reduc-
`tion in the amount of data which must be saved and
`restored when switching between the 3D and 2D pro-
`cessing. In current systems, since a single processor or
`processor complex is processing both data streams, it
`must completely save the state of the process in order to
`switch from one to the other; in the case of a 3D pro-
`cess, this is typically a large amount of data. In the
`present system, this is unnecessary, since the state of
`each process is maintained on independent processors in
`the 2D and 3D subsystems.
`The above as well as additional objects, features, and
`advantages of the present invention will become appar-
`ent in the following detailed written description.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 depicts a pictorial representation of a com-
`puter system in which the present invention may be
`implemented in accordance with a preferred embodi-
`ment of the present invention;
`FIG. 2 is a block diagram of selected components in
`a personal computer in which a preferred embodiment
`of the present invention may be implemented;
`FIG. 3 depicts a block diagram of a graphics subsys-
`tem constructed in accordance with a preferred em-
`bodiment of the present invention;
`FIG. 4 is a block diagram of the FIFO associated
`with the 2D port of the reordering device shown in
`FIG. 3;
`FIG. 5 is a block diagram of the FIFO associated
`with each 3D port of the reordering device shown in
`FIG. 3;
`FIG. 6 is a state diagram illustrating how the reorder-
`ing device interleaves servicing of its 2D and 3D ports;
`and
`FIG. 7 depicts a high level flowchart of a method and
`system for recombining processed Work Groups.
`
`DESCRIPTION OF THE PREFERRED
`EMBODIMENT
`
`With reference now to the figures and in particular
`with reference to FIG. 1, there is depicted a pictorial
`representation of a computer system in which the pres-
`ent invention may be implemented in accordance with a
`preferred embodiment of the present invention. A com-
`puter 50 is depicted which includes a system unit 52, a
`video display terminal 54, a keyboard 56, and a mouse
`58. Computer 50 may be implemented utilizing any
`suitable computer such as an IBM PS/2® personal
`computer or an IBM RISC System/6000® worksta-
`tion, both products of International Business Machines
`Corporation. (RISC System/6000 and PS/2 are regis-
`tered trademarks of International Business Machines
`Corporation.) A preferred embodiment of the present
`invention may be implemented in other types of data
`processing systems, for example, host-attached graphics
`systems such as the IBM 5080 and 6090 graphics sys-
`tems or minicomputers.
`Referring now to FIG. 2, there is depicted a block
`diagram of selected components in computer 50 in
`which a preferred embodiment of the present invention
`may be implemented. System unit 52 preferably in-
`cludes a system bus 60 for interconnecting and estab-
`lishing communication between various components in
`system unit 52. Microprocessor 62 is connected to sys-
`tem bus 60 and may also have numeric coprocessor 64
`connected to it. DMA controller 66 is also connected to
`system bus 60 and allows various devices to appropriate
`cycles from microprocessor 62 during large I/O transfers.
`
`Read only memory (ROM) 68, which is mapped into the
`address space of microprocessor 62, and random access
`memory (RAM) 70 are also connected to system bus 60.
`ROM 68 contains the
`power-on self test (POST) and the Basic Input/Output
`System (BIOS) which control hardware operations,
`such as those involving disk drives and the keyboard.
`CMOS RAM 72 is attached to system bus 60 and con-
`tains system configuration information.
`Also connected to system bus 60 are memory control-
`ler 74, bus controller 76, and interrupt controller 78
`which serve to aid in the control of data flow through
`system bus 60 between various peripherals, adapters,
`and devices. System unit 52 also contains various input/output
`(I/O) controllers such as keyboard and mouse
`controller 80, video controller 82, parallel controller 84,
`serial controller 86, and diskette controller 88. Key-
`board and mouse controller 80 provides a hardware
`interface for keyboard 90 and mouse 92. Video control-
`ler 82 provides a hardware interface for video display
`terminal 94. Parallel controller 84 provides a hardware
`interface for devices such as printer 96. Serial controller
`86 provides a hardware interface for devices such as a
`modem 98. Diskette controller 88 provides a hardware
`interface for floppy disk unit 100. Expansion cards may
`also be added to system bus 60, such as disk controller
`102, which provides a hardware interface for hard disk
`unit 104. Empty slots 106 are provided so that other
`peripherals, adapters, and devices may be added to
`system unit 52. A preferred embodiment of the present
`invention may be added to system unit 52 in the form of
`a graphics adapter placed into empty slots 106.
`Those skilled in the art will appreciate that the hard-
`ware depicted in FIG. 2 may vary for specific applica-
`tions. For example, other peripheral devices such as:
`optical disk media, audio adapters, or chip program-
`ming devices such as a PAL or EPROM programming
`device, and the like may also be utilized in addition to or
`in place of the hardware already depicted.
`In accordance with a preferred embodiment of the
`present invention, processors may be arranged in paral-
`lel pipelines to form processing nodes. These processing
`nodes are utilized to perform the bulk of the graphics
`computations for a data processing system. The proces-
`sors receive data from input communications paths and
`perform required computations, such as transforma-
`tions, clipping, lighting, etc. Each processor in a pro-
`cessing node passes intermediate data to the following
`processor to allow it to continue the calculations. This
`allows the computations to be spread among the proces-
`sors within a processing node. Each processor may
`have its own memory, and the communications paths
`are designed to allow data movement to occur without
`impacting the ability of the processors to access their
`code and data memory in accordance with a preferred
`embodiment of the present invention.
`FIG. 3 is a block diagram of a graphics subsystem 300
`constructed in accordance with a preferred embodi-
`ment of the present invention. Graphics subsystem 300,
`which is contained within video controller 82 (FIG. 2),
`includes a 2D subsystem 301 and a 3D subsystem 303.
`The 3D subsystem 303 is in turn formed of a plurality of
`processing pipelines or nodes 305, as described below.
`Graphics subsystem 300 receives interleaved 2D and
`3D graphics data streams through a bus interface 302,
`which is coupled to the system bus 60 of the host system
`52 utilizing presently available techniques well known
`to those skilled in the art. The 3D graphics data stream
`may be divided up or partitioned into Work Elements.
`A Work Element (WE) may be (1) a drawing primitive,
`which is a command to draw, e.g., a line, a polygon, a
`triangle, or text; (2) an attribute primitive, which is a
`command to change an attribute (also called an attribute
`change), e.g., color or line style; or (3) a context primitive,
`which is context information for an area of display or a
`window. Both the 2D and the 3D graphics data streams
`may be stored in a work element RAM 304.
`An attribute processor (AP) 306 performs prepro-
`cessing of the incoming 2D and 3D data streams (such
`as graphics attribute processing) and dispatches work to
`the 3D processing nodes 305 or to the 2D subsystem
`301, as appropriate. Attribute processor 306 may be
`either a suitably programmed general-purpose proces-
`sor or a special-purpose logic circuit.
`Attribute processor 306 reads work from an input
`FIFO, memory or other input path and moves work
`groups to the appropriate processing node 305. This
`processor is also responsible for operations such as in-
`cluding a sequence number with the work groups so
`that the work groups may be reordered after processing
`by the processing nodes 305. Also, for some graphics
`data streams, the processor may perform display list
`processing and non-drawing processing.
`Attribute processor 306 is utilized to parse or parti-
`tion the 3D data stream into multiple segments in accor-
`dance with a preferred embodiment of the present in-
`vention. Each segment is also called a work group
`(WG), and each work group may contain one or more
`work elements. The number of work elements in a work
`group may be determined by various factors such as the
`amount of processing time that it takes to process a
`work group versus the amount of processing time it
`takes to group work elements into a work group. Attri-
`bute processor 306 is coupled to a RAM 308, which is
`employed to store various instructions or data utilized
`by attribute processor 306. Additionally, attribute pro-
`cessor 306 may move data by utilizing other devices
`such as DMA controllers, processors, or with internal
`features within the attribute processor itself. Attribute
`processor 306 may perform graphics processing and
`supply current attribute data to the processing nodes
`305 along with the work to be done.
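
As an illustration of the partitioning step, the sketch below splits a sequence of work elements into work groups of bounded size. The fixed size limit is an assumption standing in for the processing-time trade-off mentioned above, which the text does not quantify; the function name and parameter are hypothetical.

```python
from typing import List, Sequence


def partition_into_work_groups(
    work_elements: Sequence[str], max_elements_per_group: int
) -> List[List[str]]:
    """Split work elements into consecutive groups of at most the given size."""
    if max_elements_per_group < 1:
        raise ValueError("max_elements_per_group must be at least 1")
    return [
        list(work_elements[i:i + max_elements_per_group])
        for i in range(0, len(work_elements), max_elements_per_group)
    ]


elements = ["line", "polygon", "set color", "triangle", "text"]
work_groups = partition_into_work_groups(elements, 2)
```

A larger limit lowers the per-group grouping overhead, while a smaller one gives the dispatcher finer units to spread across the processing nodes.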
`A video RAM (VRAM) 310 stores attribute informa-
`tion, in the form of processed attribute primitives, from
`the data streams along with font information and other
`context-related data in accordance with a preferred
`embodiment of the present invention. Attribute proces-
`sor 306 copies attribute data from the graphics data
`streams into VRAM 310. A shared RAM 312 is utilized
`to store font and context data. Both VRAM 310 and
`shared RAM 312 are shared memory areas utilized for
`storing globally accessed data, such as graphics context
`information, fonts, and attribute data. This type of mem-
`ory may be accessible by all of the processors, but is
`accessed relatively infrequently. As a result, contention
`for bus access to this type of memory has minimal im-
`pact on performance.
`Attribute processor 306 distributes work groups to
`the processing nodes 305 through communications
`paths 313. Communications paths 313 are utilized for
`passing data between the various processors in accor-
`dance with a preferred embodiment of the present in-
`vention. These communications paths may be memory
`ports, or any type of hardware well known to those
`skilled in the art that provides a data path to another
`processor.
`Although not necessary for an understanding of the
`present invention, further details of the operation of
`attribute processor 306 and other elements of the graph-
`ics subsystem 300 may be found in the above-identified
`copending application Ser. No. 07/926,724, the specifi-
`cation of which is incorporated herein by reference.
`Each of the processing nodes 305 includes a first
`processor 314 coupled to a RAM 316 and a second
`processor 318 coupled to a RAM 320. Processor 314
`and processor 318 are serially coupled to each other.
`Processors 314 and 318 are TMS320C40 processors
`manufactured by Texas Instruments Incorporated in
`accordance with a preferred embodiment of the present
`invention. Information on programming and utilizing
`TMS320C40 processors may be found in the TMS320C4x
`User's Guide, available from Texas Instruments Incorporated.
`RAM 316 and RAM 320 are utilized to store
`instructions and data for processor 314 and processor
`318 respectively.
`The number of processing nodes 305 may vary in
`accordance with a preferred embodiment of the present
`invention. Although the depicted embodiment shows
`only two processors per processing node 305, it is con-
`templated that other numbers of processors may be
`utilized in each processing node. Additionally, if more
`than one processor is in a processing node 305, it is not
`necessary that all of the processors in the processing
`node be of the same type or make.
`Processing nodes 305 are separated by bus transceiv-
`ers 321a, 321b, and 321c, which are well known in the
`art. These bus transceivers control access to VRAM
`310 and Shared RAM 312 by the processing nodes 305.
`Closing the bus transceivers creates a single bus, while
`opening the bus transceivers creates two buses. When
`the bus transceivers are all open, node processors 318
`have access to shared RAM 312, while node processors
`314 have access to VRAM 310. Closing all of the bus
`transceivers results in all of the processors in the pro-
`cessing nodes 305 being able to access both shared
`RAM 312 and VRAM 310. Although only three bus
`transceivers and one shared RAM and one VRAM are
`shown in the depicted embodiment, other numbers of
`bus transceivers, and various numbers and types of
`RAM may be utilized in accordance with a preferred
`embodiment of the present invention.
`As work groups are processed within the processing
`nodes 305, the processed work groups are sent from the
`processing nodes, via a bus 324, to a reordering device
`322 en route to a raster subsystem 326.
`Reordering device 322 combines the processed 3D
`data from the processing nodes 305 into a single 3D data
`stream for transmission to the raster subsystem 326.
`Reordering device 322 also merges the processed 3D
`data stream from 3D subsystem 303 with the processed
`2D data stream from 2D subsystem 301 to form a single
`combined data stream for the raster subsystem 326. In
`this particular embodiment, reordering device 322 is an
`application-specific integrated circuit (ASIC). How-
`ever, reordering device 322 may also be a processor or
`other specialized logic circuit.
`As noted above, processed work groups are recom-
`bined to produce a processed graphics data stream,
`which is sent to raster subsystem 326, which may be a
`specialized ASIC or a processor, for display of a pixel
`image on video display terminal 94 (FIG. 2). The reor-
`dering or recombining of the processed work groups is
`accomplished by assigning a tag or sequence number to
`each work group in accordance with a preferred em-
`bodiment of the present invention. Reordering device
`322 utilizes the synchronization tags to determine the
`order in which to place work groups to produce a data
`stream.
`
`In some cases, the order in which work groups are
`placed may be extremely important, and in other cases,
`the order of work groups may be unimportant. As a
`result, in addition to dividing up a graphics data stream
`into segments, attribute processor 306 may be utilized to
`determine the order in which the segments are reor-
`dered or reassembled at reordering device 322 in accor-
`dance with a preferred embodiment of the present in-
`vention. Furthermore, attribute processor 306 deter-
`mines whether or not the order of a work group is
`important and assigns synchronization tags or sequence
`numbers to each work group to reflect this in accor-
`dance with a preferred embodiment of the present in-
`vention. This determination may be dependent on vari-
`ous factors such as the type of graphics data stream
`being processed or their drawing locations on the
`screen. These synchronization tags or sequence num-
`bers are utilized by reordering device 322 to determine
`the order in which to send processed graphics data to
`raster subsystem 326 in accordance with a preferred
`embodiment of the present invention. Work groups
`which do not require any temporal order may be as-
`signed the same synchronization tag or sequence num-
`ber. Reordering device 322 passes these primitives to
`raster subsystem 326 as it encounters them; it will not
`force one to be drawn before another. When order-
`dependent primitives are encountered, attribute proces-
`sor 306 assigns successive sequence numbers that are
`then used by reordering device 322 to output the primi-
`tives in the correct order to raster subsystem 326. The
`disclosed system thus allows those primitives that can
`be drawn without regard to order to be drawn at will,
`while those that must be drawn sequentially are drawn
`sequentially.
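
The effect of reordering by sequence number can be shown with a small offline sketch. It assumes that each processing node emits its work groups in non-decreasing sequence order and represents each processed group as a hypothetical `(sequence_number, payload)` tuple. The reordering device in the patent works incrementally on FIFOs rather than on complete lists, so this is only a model of the resulting order, not of the hardware.

```python
import heapq
from typing import Iterable, List, Tuple

Group = Tuple[int, str]  # (sequence number, payload); layout is illustrative


def reorder(node_outputs: Iterable[List[Group]]) -> List[str]:
    """Merge per-node outputs into one stream ordered by sequence number.

    Groups that share a sequence number (order-independent work) are
    emitted in whatever order the merge encounters them.
    """
    merged = heapq.merge(*node_outputs, key=lambda group: group[0])
    return [payload for _, payload in merged]


node0 = [(0, "A"), (1, "B1"), (3, "D")]
node1 = [(1, "B2"), (2, "C")]
ordered = reorder([node0, node1])
```

Here B1 and B2 carry the same sequence number, so neither is forced ahead of the other, while A, C, and D are emitted in their numbered order.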
`Reordering device 322 has a 2D port 323 for receiv-
`ing the processed 2D data stream from 2D subsystem
`301 and a 3D port 325 for receiving processed 3D data
`from each processing node 305 of 3D subsystem 303;
`each 3D port 325 is associated with a particular node
`305 of the 3D subsystem 303.
`Referring now to FIG. 4, associated with the 2D port
`323 of reordering device 322 is a FIFO 400 for receiv-
`ing and storing the 2D data stream from 2D subsystem
`301 while awaiting dispatching to raster subsystem 326.
`Incoming elements are added to the top of the occupied
`area of FIFO 400 as shown in the figure, while outgoing
`elements are removed from the bottom of the occupied
`FIFO area as shown in the same figure. Any suitable
`means known in the art, such as pointers to an address-
`able memory, may be used to realize FIFO 400. At a
`given instant in time, FIFO 400 might contain a plural-
`ity of data entries 402, constituting elements of the pro-
`cessed data stream from 2D subsystem 301, and an
`“end” tag 404 at the top of the occupied buffer area
`indicating a gap in the 2D data stream. In the embodi-
`ment shown, end tag 404 is added to the stream either
`by attribute processor 306 or by the 2D subsystem pro-
`cessor (to be described) when it detects a gap in the 2D
`data stream.
`In a similar manner, referring now to FIG. 5, associ-
`ated with each 3D port 325 of reordering device 322 is
`a FIFO 500 for receiving and storing 3D data from the
`corresponding node 305 of 3D subsystem 303 while
`awaiting dispatching to raster subsystem 326. As with
`the 2D FIFO 400, incoming elements of the 3D data
`stream are added to the top of the occupied area of each
`FIFO 500 as shown in FIG. 5, while outgoing elements
`are removed from the bottom of the occupied FIFO
`area as shown in the same figure. Any suitable means
`known in the art, such as pointers to an addressable
`memory, may be used to realize FIFO 500. At a given
`instant in time, each 3D FIFO 500 might contain a
`plurality of groups of data entries 504 constituting ele-
`ments of the processed 3D data stream from the corre-
`sponding processing node 305 of 3D subsystem 303,
`with each group of entries being preceded by a se-
`quence number 502 for that group. An “end” tag 506 at
`the top of the occupied buffer area indicates a gap in the
`3D data stream. In the embodiment shown, attribute
`processor 306 adds sequence numbers 502 to the por-
`tions of the 3D data stream that it distributes to the
`processing nodes 305 to indicate the order in which the
`primitives are to be recombined for processing by raster
`subsystem 326. Attribute processor 306 adds end tag 506
`to each distributed portion of the 3D data stream when
`it detects a gap in the 3D data stream.
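One way to picture how sequence numbers 502 allow the per-node FIFOs 500 to be recombined is sketched below. It is a simplified model under stated assumptions, not the disclosed circuit: the function name `recombine` is hypothetical, each sequence number is assumed to appear exactly once across all FIFOs, numbering is assumed to be consecutive, and end tags are left out.

```python
from collections import deque


def recombine(node_fifos, first_seq=0):
    """Drain per-node FIFOs of (sequence_number, entries) groups in sequence order.

    Assumes each sequence number appears exactly once across all FIFOs and
    that numbering is consecutive from first_seq; both are simplifications.
    """
    output = []
    next_seq = first_seq
    progressed = True
    while progressed:
        progressed = False
        for fifo in node_fifos:
            # Only the group at the bottom (oldest end) of each FIFO is visible.
            if fifo and fifo[0][0] == next_seq:
                _, entries = fifo.popleft()
                output.extend(entries)
                next_seq += 1
                progressed = True
    return output


# Two hypothetical processing nodes that finished their work out of order.
node_a = deque([(0, ["tri0"]), (3, ["tri3"])])
node_b = deque([(1, ["tri1a", "tri1b"]), (2, ["tri2"])])
ordered = recombine([node_a, node_b])
```

Here the groups are emitted as 0, 1, 2, 3 regardless of which node holds them. If the group with the next expected number has not yet arrived, the loop simply stops, which loosely corresponds to the reordering device waiting on that node.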
`Reordering device 322 recombines 3D data arriving
`from the various nodes 305 in such a manner as to en-
`sure that the data reaches raster subsystem 326 in the
`correct order. Reordering device 322 receives the se-
`quence number 502 of each primitive, as indicated
`above, and selects the next sequential primitive to draw.
`The sequence number is incremented after each order-
`dependent primitive (or set of order-independent primi-
`tives) is passed to the raster subsystem 326. FIFOs 400
`and 500 contain sufficient buffering capability to allow
`the 2D subsystem 301 and the processing nodes 305 to
`write their output to the reordering device 322 and
`continue processing, even if their output data is not
`currently selected. Preferably, reordering device 322
`also allows data t
