Copyright 1993 by Canadian Information Processing Society

Papers are reproduced here from camera-ready copy prepared by the authors.

Permission is granted to quote short excerpts and to reproduce figures and tables from these proceedings, provided that the source of such material is fully acknowledged.

ISSN 0713-5424
ISBN 0-9695338-2-9

Conference sponsored by Canadian Human-Computer Communications Society (CHCCS); in cooperation with the Natural Sciences and Engineering Research Council of Canada and the National Research Council of Canada.

Membership information for the CHCCS is available from:

Canadian Information Processing Society (CIPS)
430 King Street West, Suite 106
Toronto, Ontario, Canada
M5V 1L5
Telephone: (416) 593-4040
Fax: (416) 593-5184

Additional copies of these proceedings are available from:

In Canada
Canadian Information Processing Society (CIPS)
(address above)

In the United States
Morgan Kaufmann Publishers
Order Fulfillment Center
P.O. Box 50490
Palo Alto, CA 94303
U.S.A.
Telephone: (415) 965-4081
Fax: (415) 578-0672

All other countries
Morgan Kaufmann Publishers
(address above)

Published by Canadian Information Processing Society

Printed in Canada by New Horizon Printing (Summerland, B.C.) Ltd.

Front Cover
A frame from an animation created by Michaela Zambrans of the Graphics Research Lab, Simon Fraser University, using LifeForms and Vertigo software.
`
`
`
`
`
`
Proceedings

Graphics Interface '93

Toronto, Ontario

19-21 May 1993
`
`
`
`
A Simple, Flexible, Parallel Graphics Architecture

John Amanatides †, Edward Szurkowski ‡

† Dept. of Computer Science, York University
North York, Ontario, Canada M3J 1P3
amana@cs.yorku.ca

‡ Computer Systems Research Lab
AT&T Bell Labs, Murray Hill, NJ, USA 07974
`
ABSTRACT

Traditional graphics hardware architectures, with their emphasis on the graphics pipeline, are becoming less useful. As graphics algorithms evolve and grow more capable, it becomes much harder to implement them in silicon. By using general-purpose hardware technology effectively, one can build powerful graphics hardware that is very flexible, yet inexpensive. In this paper we would like to discuss one such architecture that allows for both traditional interactive graphics (polygon scan conversion) as well as more advanced graphics (ray tracing and radiosity).

KEYWORDS

computer graphics hardware

INTRODUCTION

The graphics pipeline has had a long history in computer graphics [1]. Consisting of a front end that performs simple, repetitive floating point calculations on short vectors (for transformations, clipping, perspective) and a back end that scan converts primitives into pixels and determines visibility, it easily became a candidate for hardware acceleration [2, 3, 4]. However, as graphics algorithms evolve and become more capable the utility of this specialized hardware is reduced. The required functionality can no longer be incorporated easily in hardware and instead must increasingly be performed in software on the host system.

VLSI technology is squeezing more and more onto a chip. However, designing a special-purpose chip is getting harder as more functionality is added. This is especially true of custom graphics chips. Most effort in semiconductor houses is now being put into creating more powerful microprocessors and ever larger DRAMs and VRAMs.

In this paper we want to explore an architecture that tries to take advantage of the growing power of VLSI by concentrating on these general-purpose products and using them effectively. We wanted an architecture that minimized graphics hardware. The result, we believe, is an inexpensive, powerful and flexible system.

As well, we will describe a design that utilizes this architecture. The origin of this design came from our experiences with the AT&T Pixel Machine. We wanted to produce a low-cost next-generation machine. Let us first review the architecture of the Pixel Machine.

THE PIXEL MACHINE

The Pixel Machine (PXM) was designed as a programmable computer subsystem with pipeline and parallel processing closely coupled to a display system [5]. Built by Pixel Machines Corp, a subsidiary of AT&T, it was launched in 1987. An important design goal was flexibility. Graphics algorithms were not hardwired into the design. Instead, digital signal processors (DSP32) were the basic building blocks (nodes) of the PXM. This allowed for a lot of flexibility as new functionality could be programmed in afterwards. In fact, most of the algorithms were written in C with only the critical sections written in assembler. This resulted in a product that was ideal for research and development.

The PXM consisted of a large box (containing up to 20 VME boards) which was connected to the host computer via a series of registers in the memory address space of the host computer. Data and commands reaching the PXM would first be sent through a pipeline board consisting of nine DSP32 pipe nodes. For interactive graphics, the pipe nodes would do transformations, clipping and lighting calculations for the various graphics primitives. A second pipeline board could be added to the PXM to increase performance.

Next, the primitives would be broadcast to 16-64 (depending on options) DSP32 pixel nodes whose job it was to rasterize the primitives. Each pixel node contained an interleaved portion of the frame buffer (every eighth pixel of every eighth row in the 64 processor version). As well, it had 32KB of SRAM (for program storage) as well as 512KB of VRAM (double-buffered frame buffer, texture map, accumulation buffer) and 256KB of DRAM (z-buffer). Finally, the pixel nodes were each connected to their four neighbours via a serial link.

For ray tracing and image processing, the pipe nodes were mostly unused; the work was done by the pixel nodes. When running ray tracing programs we found that we could get two orders of magnitude performance improvement over the workstations that the PXMs were connected to (in our case, a Sun 3/260).

Unfortunately, the PXM was expensive. A fully configured machine could cost up to $150K; it was not something every scientist could have in their office. It had several other limitations. The DSPs had a limited address space. This fit into their original purpose (signal processing typically has small code) but we quickly ran into limitations when writing graphics programs with sophisticated shading. More memory was needed for both programs and data. The two-dimensional interleaved design of the PXM had drawbacks. Scan converting small polygons meant that each processor only drew a few pixels per polygon. However, every processor had to do all the edge setup calculations, and amortizing this over only a few pixels is expensive. When performing interactive graphics the bottlenecks in the system would change depending on the types of rendering involved. If there was texture mapping, the pixel nodes were the bottleneck. If only simple shading was performed, the pipe nodes were the bottleneck.

DESIGN GOALS

In the summer of 1989, after extensively using the PXM for a variety of graphics research projects at AT&T Bell Labs, we decided to design a similar machine that we could afford to place in each of our offices. As well, Pixel Machines' next-generation machine was ambitious and late and we wanted to offer them an option. The stress was on low cost, flexibility and power.

Like the PXM we wanted the new machine to be flexible; we wanted something to perform both interactive graphics and ray tracing fast. Our interactive performance goal was over 100K independent polygons/sec.† Our ray tracing goal was a performance level over one order of magnitude faster than the workstations that it would be connected to.

† Performance for independent polygons was important as some of our graphics research on BSP trees generated them.

It had to be cheap; the target was a list price of $25K. Ideally, it would be a single board that would fit into our workstations.

It needed lots of memory; we had already run out of memory for programs and needed more for textures and models.

Finally, we wanted to explore a minimalist design; Gordon Bell has said that "the cheapest, fastest and most reliable components of a computer system are those that aren't there." We wanted to see if we could apply this to graphics hardware.

DIFFERENT ARCHITECTURES

Figure 1 illustrates the typical graphics subsystem [4, 5].

Figure 1

The host traverses the model data base and sends the graphics primitives to the graphics pipeline. The front end, which performs a series of geometric operations, is typically implemented as a pipeline of floating-point ALUs (labeled G). (The FIFOs typically found between stages are left out of the diagrams.) Afterwards, the transformed/shaded primitives are sent to the back end which performs rasterization and visibility operations. Because interactive rasterization requires a great deal of pixel
`
throughput, parallel access to the frame buffer is required; the back end is typically implemented with multiple rasterization processors (labeled R), each being responsible for a portion of the frame buffer.

A variant approach, illustrated in figure 2, is increasingly being explored [6, 7, 8].

Figure 2

Here, the front end, instead of being a pipeline of very simple processors, has become populated with more powerful processors, each capable of performing all the required front-end operations on a single primitive. Graphics primitives are distributed in a round-robin manner to each front end processor, processed, and then broadcast to the back end. Because the processor is more powerful, better shading algorithms can be utilized. Also, more powerful graphics primitives can be dealt with (splines, for example) and tessellation can occur further down the pipe. As well, there is less data movement amongst the various stages in the pipe (analysis of the PXM indicated that the pipe nodes spent a significant amount of time on this). Several processors are typically required to keep performance up and care must be taken to make sure primitives stay in priority order.

The general outline of our new minimalist architecture is found in figure 3. Here, the functions of the front and back ends are collapsed onto the same processors. In this case, the host distributes primitives in a round-robin manner to each of the processors. Each processor performs the front end tasks of transforming/clipping/lighting/perspective for the primitives. It then broadcasts the results to the other processors to perform the back end operations of rasterization. Because the primitives are distributed in this manner, each processor does the same back-end operations as in previous designs but only 1/nth of the front-end operations (assuming there are n processors).

Figure 3

There are several advantages to this approach. First, the design is auto load balancing. For example, if there is a lot of texture mapping, the processors spend most of their time on this. If most of the primitives are simple, then the processors spend the majority of their time on the front end computation. Alternately, if the computation is ray tracing, no front-end processors are wasted idling. Second, the approach tends to be simpler, requiring powerful general purpose processors instead of special-purpose graphics hardware. Because VLSI technology is making these single-chip microprocessors increasingly available this results in a smaller, simpler board layout. Finally, it is a more flexible approach. Since the hardware pipeline is not optimized for a particular algorithm, there is a lot of freedom to change or enhance capabilities. For example, higher order primitives can be used (NURBS). If necessary, the model can be distributed amongst the processors (assuming that they have the memory) without introducing inefficient asymmetries.

THE PXMjr DESIGN

As mentioned earlier, we wanted to produce an inexpensive version of the PXM and began in the summer of 1989. We gave the new design the name Pixel Machine Junior (PXMjr). Like the PXM the PXMjr would be a peripheral; this would keep the design simple and would allow it to be connected to a variety of workstations. This would also result in a fast turnaround time from design to production.
`
`
`
`
We also decided that to keep it simple and small, we preferred fewer, more powerful processors. Thus we could have more memory per processor without incurring high costs.

The new design would require more memory for each processor; a lot of memory is needed as textures and geometric models, if they are stored locally, are space intensive. As well, program space needs to be larger as each processor does more work. To keep costs down, we wanted to eliminate the need for SRAM, and use DRAM instead. DRAM also has the advantage that a lot more memory can be placed per unit area. Unfortunately, DRAM is slower.

To get back the speed, we wanted the processor to have an on-chip cache. Because of typical DRAM chip layouts and bus widths (64 bits), we would need at least 16 chips (nibble data paths). This would result in either 2 MB (1Mb chips) or 8 MB (4Mb chips) per processor. A similar number of VRAM chips would be required for the distributed frame buffer. At the time 4Mb VRAM chips weren't available; otherwise we would have considered just using VRAM. We felt that the simplified design and halved chip count may have been worth it.
`
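The chip count and memory sizes quoted for a 64-bit bus built from nibble-wide DRAM chips follow from simple arithmetic. The sketch below reproduces them; the constant and function names are illustrative assumptions, and only the 64-bit bus width, the 4-bit data path, and the 1Mb/4Mb chip sizes come from the text.

```python
# Worked check of the DRAM sizing; names are illustrative, not from the paper.
BUS_WIDTH_BITS = 64   # data bus width quoted in the text
CHIP_DATA_BITS = 4    # "nibble" data path per DRAM chip


def chips_needed(bus_bits: int = BUS_WIDTH_BITS, chip_bits: int = CHIP_DATA_BITS) -> int:
    """Minimum number of chips needed to fill the data bus."""
    return bus_bits // chip_bits


def bank_megabytes(chip_megabits: int, chips: int) -> float:
    """Total capacity in MB of a bank of identical chips (8 bits per byte)."""
    return chip_megabits * chips / 8


if __name__ == "__main__":
    n = chips_needed()
    print(n, bank_megabytes(1, n), bank_megabytes(4, n))
```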
After exploring several DSP microprocessors, we decided on the Intel i860. It is a powerful general-purpose microprocessor optimized for graphics. According to Intel engineers we could expect 20-25K poly/sec from each processor. It has both a powerful floating-point unit (80 MFlops peak) and on-chip caches. We realized that this was a change from the DSP32s in the PXM but since most of our software was written in C, we felt that we could make this change without a big penalty.

The processors would be connected to each other and the host via a bidirectional FIFO connected to a message passing bus. These FIFOs would help keep each processor busy and help perform the necessary multicasting without processor intervention. The message bus has the capability of both high and low priority packets (the need for which we will describe later).

Like the PXM, the host sees a set of registers in its address space which implement a bidirectional FIFO and control the PXMjr. It can tell if its input FIFO is empty, its output FIFO is full and can read and write from the FIFOs. It can also reset the PXMjr and there is a mechanism for syncing with the processors (to synchronize the completion of each frame). The host interface was kept simple so that it could easily be adapted to multiple host types.

A general outline of the new design is found in figure 4. It consisted of 8 i860 processors, each with 2 MB of VRAM and either 2 or 8 MB of DRAM.

Figure 4 (block diagram; recoverable labels: Host Bus, Message Bus, DRAM)

THE FRAME BUFFER

There is a virtual 2Kx2Kx32 frame buffer distributed amongst the 8 processors in a column interleaved manner. The video rate is programmable from RS-170A (NTSC resolution) to 1280x1024 non-interlaced.

Figure 5 (frame buffer layout; recoverable dimensions: 1280 and 768 across, 1024 and 2048 down, 2048 overall width)
`
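As a rough illustration of column interleaving, the sketch below maps a pixel column to a processor and checks that each processor's share of the virtual 2Kx2Kx32 frame buffer is 2 MB. The paper does not give the exact mapping, so assigning column x to processor `x mod 8` is an assumption, as are all function names.

```python
# Sketch of a column-interleaved frame buffer; the modulo mapping is an assumption.
NUM_PROCESSORS = 8
WIDTH = HEIGHT = 2048      # virtual 2Kx2K frame buffer
BYTES_PER_PIXEL = 4        # 32 bits per pixel


def owner(x: int) -> int:
    """Processor assumed to hold column x (adjacent columns go to different processors)."""
    return x % NUM_PROCESSORS


def local_column(x: int) -> int:
    """Index of column x within its owner's slice of the frame buffer."""
    return x // NUM_PROCESSORS


def vram_bytes_per_processor() -> int:
    """Share of the virtual frame buffer stored on each processor."""
    return WIDTH * HEIGHT * BYTES_PER_PIXEL // NUM_PROCESSORS


def pixels_per_processor(x_start: int, x_end: int) -> dict[int, int]:
    """How many pixels of the horizontal span [x_start, x_end) each processor draws."""
    counts = {p: 0 for p in range(NUM_PROCESSORS)}
    for x in range(x_start, x_end):
        counts[owner(x)] += 1
    return counts
```

Under this assumed mapping a 16-pixel span gives every processor exactly two pixels, which is the sense in which the rasterization work is spread evenly across the processors.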
As it is column interleaved, the frame buffer reduces the pixel rate from the individual VRAMs by a factor of 8. Also, it was found that in the original PXM there was significant wasted computation because the edge set-up costs
`
`
`
for polygon scan conversion were not amortized over enough pixels. By interleaving in only one dimension, amortization is increased. Finally, it simplifies antialiasing.

Each processor has 2 MB of VRAM (16x1Mb nibble-mode chips). The 2Kx2K allows for double buffering at 1280x1024 resolution. As well, the extra 768 pixels in each scan line can store a 16 bit z-buffer (see figure 5). By careful optimization we can use the VRAM serial buffer to reset the frame buffer to the background colour and reset the z-buffer during vertical blanking. This is done in conjunction with the i860 executing a very tight loop to generate the appropriate addresses on the local bus.

THE BUS

The message bus is a conservative design. Originally designed to match the VME bus, and then the S-bus, it is 32 bits wide; running at 10 MHz it provides a raw transfer rate of 40 MB/sec.

The bus transfers fixed-length packets of 32 32-bit words. The size was chosen so that triangles and quadrilaterals would fit into one packet and a constant-size packet simplified the design. The first word in the packet indicates the destination. With 1 bit per processor, it allows for multicasting. There are two types of packets: high and low priority packets. Low priority packets get onto the bus only if no high-priority packets are waiting to be sent in any of the other FIFOs. As well, there is a fair scheduling policy in which a second packet from any FIFO cannot get onto the bus until all other FIFOs are first given a chance to send theirs.

Running at 10 MHz, 2 time slots are required for arbitration, so each 32-word packet occupies 34 slots. This results in a maximum flow rate of 294K packets per second. Since each polygon would have to traverse the bus twice, the maximum number of polygons that can be handled is about 145K. This fits well with the 100K poly/second design goal.

The FIFOs can be implemented with two 1Kx18 bidirectional FIFO chips [9]. These chips can be programmed during reset so that they raise signals at various levels of filling. (This will be used to detect if a full packet is ready and in the deadlock prevention scheme described below.) There is room for 32 packets in each FIFO. This allows for considerable incoming work.

INTERACTIVE GRAPHICS

To help understand our design let us describe what would happen when rendering polygons interactively.

The host busy-waits until its output FIFO has space and then starts sending packets containing polygons. These are sent in a round-robin fashion to each of the processors (the i860s). The host continues until it completes traversing the model and then waits until the processors are finished.

Each processor waits for packets in its input FIFO. Packets containing polygons come in two flavors: geometry messages (GM) and rendering messages (RM). The GMs come from the host and their polygons are transformed, clipped, shaded and then sent out as RMs via the output FIFO to the other processors. When a processor receives a RM it scan converts the polygon and draws it in its frame buffer.

Since processors complete their tasks of converting GMs into RMs at different rates it is possible that RMs arrive at a processor in the wrong priority order. The processor has to sort this out; it may have to store up to 7 RMs (this is as far off as the ordering can get).

DEADLOCK

Since every processor can write into every other processor's input FIFO at the same time there is a remote possibility of deadlock. Consider the following scenario: The host is so fast that it fills all the input FIFOs of all the processors with GM packets. Each processor converts the GM into a RM and puts it in its output FIFO. Their output FIFOs start to fill. What happens if a processor's output FIFO becomes full? If the input FIFO has a RM it is consumed (good). However, if the next packet is a GM it must eventually be sent out and the processor blocks because there is no more room in its output FIFO. If something like this happens at several processors deadlock occurs.
`
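The deadlock scenario from the DEADLOCK section can be made concrete with a toy model. This is only a sketch under simplifying assumptions: the `Processor` class, its fields, and the `deadlocked` test are hypothetical, and the condition checked is a sufficient one (every processor stalled on a GM with a full output FIFO while every input FIFO is also full), not a full model of the hardware.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Processor:
    """Toy processor with bounded input and output FIFOs (hypothetical model)."""
    capacity: int                                  # packets per FIFO (32 in the paper)
    inbox: deque = field(default_factory=deque)    # holds "GM" or "RM" packets
    outbox: deque = field(default_factory=deque)   # holds RMs awaiting the bus

    def blocked(self) -> bool:
        """True when the next packet is a GM but its RM has nowhere to go."""
        return (bool(self.inbox)
                and self.inbox[0] == "GM"
                and len(self.outbox) >= self.capacity)


def deadlocked(processors) -> bool:
    """Sufficient condition: every processor is blocked and no input FIFO has room."""
    return all(p.blocked() and len(p.inbox) >= p.capacity for p in processors)
```

In this model a single RM at the head of any input FIFO is enough to break the cycle, because consuming it frees an input slot into which another processor's output FIFO can drain.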
The best way to handle deadlock is to design it out in the first place. In our design we can guarantee that no deadlock will occur if the output FIFO at each processor is not full. We have to make sure that this never happens. RM packets are consumed; they will not cause trouble. It is the GM packets that must be taken care of.

As was mentioned earlier, packets on the bus have two priorities, low and high. Low priority packets cannot get onto the bus if high priority packets are waiting. If we map GMs to low priority packets and RMs to high priority packets then the solution is in sight.
`
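The two bus rules, high priority before low and no FIFO sending twice before the others have had a chance, can be sketched as a small arbiter. This is an illustrative model only: the `BusArbiter` class and its methods are hypothetical, it looks only at the packet at the head of each FIFO, and it applies the GM-to-low, RM-to-high mapping proposed in the text.

```python
from collections import deque

HIGH, LOW = "RM", "GM"   # mapping from the text: RMs high priority, GMs low


class BusArbiter:
    """Toy arbiter: high priority first, round-robin among equals (hypothetical)."""

    def __init__(self, num_fifos: int):
        self.fifos = [deque() for _ in range(num_fifos)]
        self.last = num_fifos - 1      # index of the FIFO that sent most recently

    def enqueue(self, fifo: int, kind: str) -> None:
        """Queue a packet of the given kind on one output FIFO."""
        self.fifos[fifo].append(kind)

    def next_sender(self):
        """Return (fifo_index, packet_kind) for the next transfer, or None if idle."""
        n = len(self.fifos)
        # Start scanning just after the last sender, so that FIFO is considered last.
        order = [(self.last + 1 + i) % n for i in range(n)]
        for wanted in (HIGH, LOW):     # low priority is tried only if no high is waiting
            for i in order:
                if self.fifos[i] and self.fifos[i][0] == wanted:
                    self.last = i
                    return i, self.fifos[i].popleft()
        return None
```

With a GM waiting in FIFO 0 and RMs waiting in FIFOs 1 and 2, the GM is granted the bus only after all the RMs have been sent.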
`
`
`
The worst possible scenario is when every input queue is full of GMs. As soon as one of these GMs turns into a RM it shuts down further introduction of GMs from the host. The current GMs are slowly turned into RMs and wait in the output FIFOs. As long as the output FIFOs are at least as big as the input FIFOs then each processor can do useful work. Whenever space appears at any input FIFO it is the RMs that are delivered to it. Eventually, all the GMs will be turned into RMs and will be delivered into their appropriate input FIFOs. Only when all the output FIFOs are empty and the processors are working on the remaining RMs in their input FIFOs can the host begin to deliver GMs again. What will eventually happen is that each processor will oscillate between working on RMs and GMs; in both cases, it will be doing useful work. The long FIFOs (room for 32 packets) make sure that the processors are kept busy.

PACKAGING

The preliminary design called for a single 9U VME board (why pay for cabinet/power supplies?). But because of changing environments (fewer people were purchasing large-chassis workstations) a "pizza box" design was chosen. The case, 16"x16"x3", was designed to fit under a SPARCstation, with a ribbon cable extending into an S-Bus slot in the SPARCstation. (Because of the simple host interface, adapters were contemplated for other machines.) In the chassis there would be room for one large circuit board, along with power supply and fan. The preliminary design also called for the possibility of a daughter board with 8 more processors but power dissipation and complexity problems stopped this at an early stage. There is analog RGB and sync out and genlock in.

CONCLUSION

In this paper we have introduced an architecture that allows for simple, flexible yet powerful graphics hardware for both interactive graphics (polygon scan conversion) and more advanced graphics (ray tracing, radiosity). This is accomplished by not dedicating hardware to specific tasks but allowing processors to both transform and render polygons. As well, a design consisting of eight i860s with local memory and a double-buffered 1280x1024 frame buffer was outlined. This design was finished in the spring of 1990 and a small prototype was built. However, by this time the first author had left AT&T and Pixel Machines Inc. decided not to continue with development.

We would like to thank Don Mitchell and Bruce Naylor for many valuable suggestions.

References

1. J.D. Foley, A. Van Dam, S.K. Feiner, and J.F. Hughes, Computer Graphics: Principles and Practice, Addison-Wesley Publishing Co, Reading, Mass, 1990.
2. T. Myer and I. Sutherland, "On the Design of Display Processors," Comm. of the ACM, vol. 11(6), pp. 410-414, June 1968.
3. J. Clark, "The Geometry Engine: A VLSI Geometry System for Graphics," Computer Graphics, vol. 16(3), pp. 127-133, July 1982.
4. K. Akeley and T. Jermoluk, "High-Performance Polygon Rendering," Computer Graphics, vol. 22(4), pp. 239-246, August 1988.
5. M. Potmesil and E.M. Hoffert, "The Pixel Machine: A Parallel Image Computer," Computer Graphics, vol. 23(3), pp. 69-78, July 1989.
6. J.G. Torborg, "A Parallel Processor Architecture for Graphics Arithmetic Operations," Computer Graphics, vol. 21(4), pp. 197-204, July 1987.
7. D. Kirk and D. Voorhies, "The Rendering Architecture of the DN10000VS," Computer Graphics, vol. 24(4), pp. 299-307, August 1990.
8. M. Mehl and H. Joseph, "GRACE: The Graphics Coprocessor Engine of the EuroWorkStation (EWS)," Eurographics '90: Proceedings of the Graphics and Interaction in Esprit Sessions, September 1990.
9. IDT72521, 1Kx18-Bit CMOS BiFIFO, Preliminary Data Sheet, Integrated Device Technology, Inc, January 1989.
`
`
`
`