`Schilling et al.
`
`111111111111111111111111111111111111111111111111111111111111111111111111111
`US006236405Bl
`US 6,236,405 Bl
`May 22,2001
`
`(10) Patent No.:
`(45) Date of Patent:
`
`(54) SYSTEM AND METHOD FOR MAPPING
`TEXTURES ONTO SURFACES OF
`COMPUTER-GENERATED OBJECTS
`
`(75)
`
`Inventors: Andreas Schilling, Gomaringen;
`Guenter Knittel, Tuebingen, both of
`(DE)
`
`(73) Assignee: S3 Graphics Co., Ltd., Cayman Islands
`(KN)
`
`F. C. Crow, "Summed- Area Tables for Texture Mapping",
`Proceedings of SIGGRAPH '84, Computer Graphics, vol.
`18, No. 3, Jul. 1984, pp. 207-212.
`
`A. Glassner, "Adaptive Precision in Texture Mapping",
`Proceedings of SIGGRAPH '86, Computer Graphics, vol.
`20, No.4, Aug. 1986, pp. 297-306.
`
`G. Campbell, et al., "Two Bit/Pixel Full Color Encoding",
`SIGGRAPH '86 Conference Proceedings, Computer Graph(cid:173)
`ics, vol. 20, No. 4, Aug. 1986, pp. 215-223.
`
`( *) Nolice:
`
`Subject to any disclaimer, Lbe term of this
`patent is extended or adjusted under 35
`U.S.C. l54(b) by 0 days.
`
`M. F. Deeroig, et al., "FBIV\M: A New Form of Memory
`Optimized for 3D Graphics", Proceedings of SIGGRAPH
`'94, Jul. 1994, pp. 167-174.
`
`(21) Appl. No.: 08/884,044
`.Jun. 27, 1997
`
`(22) Filed:
`
`Related U.S. Application Oata
`(60) Provisional application No. 60/020,935, fi led on Jul. 1,
`1996.
`Int. C l.7
`...................... ... ... G06T 11/40; H04N 5/217
`(51)
`(52) U.S. Cl . ............................................. 345/ 430; 348/620
`(58) Field of Search ............................ 345/429, 430-432,
`345/426, 428; 348/620
`
`(56)
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`4/1986 Campbell et at. ................... 340n03
`4,580,134
`5,019,908 • 5/ 1991 Su ........................................ 348!559
`5,461,712 • 10/ 1995 Chelstowski et al. ............... 345/514
`5,606,650 * 2/1997 Kell y ct al. .......................... 345/430
`5,651,104
`7/ 1997 Cosman ............................... 395!128
`5,831,624 • 11/1998 Tarolli et al ......................... 345/430
`5,903,276 • 5/1999 Shiraishi .............................. 345/431
`
`OlliER PUBLICATIONS
`
`P. Heckbert, ·'Color Image Quantization for Frame Buffe r
`Display", Proceedings of SIGGRAPH
`'82, Computer
`Graphics, vol. 16, No.3, Jul. 1982, pp. 297-307.
`L. Williams, " Pyramidal Parametrics", Proceedings of SIG(cid:173)
`GRAPH '83, Computer Graphics, vol.17, No.3, Jul. 1983,
`pp. l-11.
`
`(List continued on next page.)
`
`Primary Examiner-William A Cucblinski, Jr.
`Assislanl Examiner-Tbu Nguyen
`(74) Allorney, Agenl, or Firm-Carr & Ferrell LLP
`
`(57)
`
`ABSTRACT
`
`A first texture mapping uni t generates texture coordinates
`and associated Red, Blue, Greco (RGB) values in response
`to coordinates received from a rasterizer. Tbe first texture
`mapping unit makes usc of compressed texture mipmaps to
`reduce memory storage and baodwidtb requirements. Tbe
`compressed texture maps may be generated by a compres(cid:173)
`sion system employing principles of Block Truncation Cod(cid:173)
`ing (BTC) and Color Cell Compression (CCC). A second
`texture mapping unit generates texture coordinates and
`associated RGB values in response to coordinates received
`from a rasterizer. The second texture mapping unit includes
`a memory organization allowing two mipmap levels to be
`retrieved in a single access, and 8-pon Color Lookup Table
`(CLUT), a trilinear interpolator and a vjdeo port. A footprin t
`assembly system maps textures onto surfaces by approxi(cid:173)
`mating the projection of a pixel onto a texture by a number
`of square mipmapped texels. Tbe second texture mapping
`urut also performs environment mapping, reflectance map(cid:173)
`ping and detail maps.
`
`2 Claims, 11 Drawing S heets
`
`From
`Source
`
`'
`
`Tonext
`stage
`
`From
`Source
`
`0001
`
`Volkswagen 1009
`
`
`
`US 6,236,405 Bl
`Page 2
`
`OTIIER PUBLICATIONS
`G. Kaine!, et al., "GRAMMY: High Performance Graphics
`Using Graphics Memories", Proceedings of the lnternatiooal
`Workshop on lligh Performance Computing for Compmer
`Graphics and Visualization, Swansea, Jul. 1995, pp. 33-48.
`W. Strasser, et aJ.,
`"High Performance Graphics
`Arcbitectures"S'" International Conference of Computer
`
`Graphics and Visualization, St. Petersburg, Russia, JuJ.,
`1995.
`A. Schilling, et al., "TEXTRAM-A Smart Memory for
`Tecturing", IEEEE Computer Graphics a Applications, vol.
`16, No.3, May 1996, pp. 32-41.
`
`* cited by examiner
`
`0002
`
`
`
`U.S. Patent
`
`May 22,2001
`
`Sheet 1 of 11
`
`US 6,236,405 Bl
`
`•
`
`•
`
`a,b
`c,d
`
`112
`~
`a,b
`c,d
`
`~
`106
`
`a lb
`c d
`
`110
`~
`a~b
`c d
`
`~
`102
`
`v 104
`
`FIG. 1
`
`0003
`
`
`
`U.S. Patent
`
`May 22,2001
`
`Sheet 2 of 11
`
`US 6,236,405 Bl
`
`0 1 0 1
`2 3 2 3
`0 1 0 1
`2 3 2 3
`
`4
`
`6
`
`5
`
`7
`
`LeveiO
`
`Level1
`
`0
`
`2
`
`1
`
`3
`
`FIG. 2
`
`Level2
`
`510 "\.
`
`color a
`
`512
`
`502J
`
`LUT
`256 X 24 bit
`
`508
`
`FIG. 5
`
`0004
`
`
`
`U.S. Patent
`
`May 22,2001
`
`Sheet 3 of 11
`
`US 6,236,405 Bl
`
`304
`~
`
`CPU
`
`302
`'1
`
`Rasterizer
`
`312
`'-,
`
`Frame
`Buffer
`
`308 !
`
`j
`
`\_ 310
`
`306 !
`
`Texturing
`Unit
`
`FIG. 3
`
`y
`
`z
`
`z
`
`z
`
`X w w
`
`y
`
`y
`
`z
`
`X w
`
`X w
`
`y
`
`X
`
`\_ 404
`
`color b
`
`color a
`
`\_ 402
`
`0
`
`0
`
`0
`
`0
`
`1
`
`0
`
`0
`
`0
`
`1
`
`1
`
`1
`
`0
`
`1
`
`1
`
`1
`
`1
`
`\_ 406
`
`FIG. 4
`
`0005
`
`
`
`U.S. Patent
`
`May 22,2001
`
`Sheet 4 of 11
`
`US 6,236,405 Bl
`
`DRAM
`1M X 16
`or 256k x 16
`
`1\_6
`04
`
`Act
`Out
`
`RAS CAS
`CE WE
`
`11
`
`16
`
`+
`
`+
`I
`
`~08
`
`lr---
`
`IL 624
`
`,-.
`
`61(
`
`,-...J
`
`~ 2HII
`
`color b
`color a
`4x4 Texets: 0: color a 1: color b
`I I I I I I I 111111 I I I I I I II I
`I I I I I I I
`I
`fa
`~07
`612 1\_6
`02
`.fa
`CLUT
`I
`256 x 24 bit (or 512 x 24 bft)
`~2. bll 614
`r-
`
`DRAM
`Control
`
`([
`616
`
`special Cache 4 x 64 bit
`
`4
`
`,......_;
`
`,._922
`
`&L Color Extract = Bilinear
`b 618
`
`RGB
`'Z4
`
`
`
`Rast erizer
`
`624 ~._
`
`63P
`
`64t§L_ (Multiplexer) -
`-
`1
`
`Interpolation
`I
`
`Rasterizer
`
`FIG. 6
`
`0006
`
`
`
`U.S. Patent
`
`May 22,2001
`
`Sheet 5 of 11
`
`US 6,236,405 Bl
`
`A
`
`c
`
`A
`
`c
`
`B
`
`0
`
`B
`
`0
`
`A
`
`c
`
`A
`
`c
`
`B
`
`D
`
`B
`
`D
`
`FIG. 7
`
`FIG. 10
`
`0007
`
`
`
`U.S. Patent
`
`May 22,2001
`
`Sheet 6 of 11
`
`US 6,236,405 Bl
`
`f 804
`+-
`Address and Control Unit including DRFs
`
`r 812
`
`I
`
`810
`
`7
`
`~0 810 810
`Bank ( Bank
`4
`l 5
`
`~0
`
`2
`
`3
`
`.-~ P-Reg I
`
`S-Reg~
`
`I
`
`ro
`-ro
`0
`~
`
`Address
`~0
`
`810
`,./
`
`0
`
`1
`
`822 ~~ Bank ( Bank
`
`I S-Reg
`
`826
`
`r- 828
`
`"
`'-r+f P-Reg.J::;_ '-r+fP-Reg
`...... P-Reg
`24
`x24
`x24
`S-Reg,
`I S-Reg
`
`814
`,./
`
`,JlUUt CLUT
`'I I
`
`830
`
`816./ Mipmap
`Generation
`Unit
`
`- 832
`
`808-J
`Video Port
`
`~i-lin. Inter-
`polator
`
`818
`Output and
`r- Combination Stage
`
`v 82o
`r-
`
`v 806
`
`P1xel Port
`
`/
`
`802
`
`FIG. 8
`
`0008
`
`
`
`U.S. Patent
`
`May 22,2001
`
`Sheet 7 of 11
`
`US 6,236,405 Bl
`
`u, v-coordinates
`
`910
`
`914
`
`u-Offset
`Map
`
`v-Offset
`Map
`
`902
`
`Detail
`Map
`
`906
`
`FIG. 9(a)
`
`Access
`Offset Maps
`
`4
`lJ90
`
`Access
`Detail Map
`
`8
`lJ90
`
`Access Texture Map
`and Interpolate Detail J 912
`Color
`
`Interpolate Texture
`Color and Store Detail J 916
`Color in Combine Unit
`
`Combine Detail and
`Texture Color
`
`920
`
`J
`
`~
`
`Pixel Color
`
`0009
`
`FIG. 9(b)
`
`
`
`U.S. Patent
`
`May 22,2001
`
`Sheet 8 of 11
`
`US 6,236,405 Bl
`
`u r v
`
`0
`
`0
`
`16
`
`32
`
`48
`
`8,8
`
`•
`
`8,8
`
`•
`
`40,16
`
`FIG. 11 (a)
`
`Tile Structure
`
`FIG. 11(b)
`
`Mortar Structure
`
`0010
`
`
`
`U.S. Patent
`
`May 22,2001
`
`Sheet 9 of 11
`
`US 6,236,405 Bl
`
`1204
`
`FIG. 12
`
`FIG. 13
`
`0011
`
`
`
`U.S. Patent
`
`May 22,2001
`
`Sheet 10 of 11
`
`US 6,236,405 Bl
`
`From
`Source
`
`To level 0
`
`1404
`From level 0
`
`1406
`
`FIG. 14(a}
`
`1410
`
`0 u..
`u..
`
`To next
`stage
`
`To level1
`From
`Source
`
`From lev·el
`A.-1
`
`1416
`
`0
`lL
`lL
`
`To level 'k
`
`FIG. 14(b)
`
`1418
`
`To next stage
`
`0012
`
`
`
`U.S. Patent
`
`May 22, 2001
`
`Sheet 11 of 11
`
`US 6,236,405 Bl
`
`I\
`
`\
`
`~
`f\
`~
`\ 1\
`f\
`I\
`\
`
`~
`f\
`~
`\ ~
`~
`1\
`\
`
`1\
`
`\
`
`00 + ,.._
`
`C/)
`C/)
`<l.l
`8
`<{
`
`1\
`
`I\
`
`I\
`
`\
`
`1\
`
`~
`
`~
`
`\
`
`.
`(!)
`LL
`
`\
`
`\
`
`\
`
`\
`\
`I\
`[\
`\
`
`f\
`[\
`[\
`\ 1\
`I\
`I\
`\
`
`co
`+ 1.{')
`
`C/)
`C/)
`
`~
`:i.
`
`1--
`
`1--
`
`•
`•
`
`J
`~· J
`
`•
`•
`•
`
`.
`(!)
`u..
`
`J
`
`J
`
`1--
`1--
`
`1--
`
`1--
`
`1--
`
`1-
`1--
`
`C/)
`I.J...
`0::
`0
`
`C/) a.
`ro
`E
`a.
`~
`
`0013
`
`
`
`US 6,236,405 Bl
`
`2
`respect to systems where cost is of concern, there is a need
`for an efficient compression scheme which reduces the
`amount of data required to be stored and accessed by a
`texture mapping system.
`
`1
`SYSTEM AND METHOD FOR MAPPlNG
`TEXTURES ONTO SURFACES OF
`COM I•UTER-GENERATED OBJECTS
`
`RELAfED APPLICATIONS
`
`'£his application claims priority to U.S. Provisional Patent
`Application 60/020,935, filed on Jul. 1, 1996.
`
`H ELD OF TilE INVEN"nON
`
`'£his invention relates generally to the field of computer
`graphics and more particularly to the generation and pro(cid:173)
`cessing of textures in computerized graphical images.
`
`13ACKGROUND
`
`SUMMARY
`Embodiments of the present invention advantageously
`enhance computer generated images by texture mapping in
`a rapid and efficient manner. Ln a principle aspect, the
`tO present invention advantageously provides a footprint
`assembly system which provides significant image enhance(cid:173)
`ment at high rendering speeds. Embodiments employing tbc
`principles of the footprint as.<;embly system described herein
`advantageously provide significant image enhancement in
`15 an eiiicient manner by approximating tbe projection of a
`pixel on a texture by a number N of square mipmapped
`texels.
`In another aspect, the present invention provides a data
`compression system to reduce memory storage and band(cid:173)
`width requirements in a simple, yet fast manner, and to
`therefore reduce system costs. Still other aspects of the
`present invention provide novel hardware architectures for
`texture mapping. In one embodiment, a hardware texturing
`unit is provided to operate on compressed textures. In certain
`embodiments the textures may be compressed by way of the
`aforementioned novel compression system. Such an
`embodiment advantageously decreases system cost by
`reducing the amount of memory storage and bandwidth
`required to store texture maps. In a second embodiment, a
`30 hardware texturing unit is provided to operate on uncom(cid:173)
`pressed textures. Such a unit advantageously incorporates
`certain interpolation techniques to provide high image qual(cid:173)
`ity in systems where high image quality is required.
`The hardware units described herein benefit from the
`35 integration of arithmetic units and large memory arrays on
`the same chip. This allows exploitation of tbc enormous
`transfer rates internal to a chip and provides an elegant
`solution to the memory access bottleneck of high-quality
`texture mapping. In addition to achieving higher tex turing
`40 speed at lower system costs, such hardware units incorporate
`new functionality such as detail mapping and footprint
`assembly to produce higher quality images at still real-time
`rendering speed. Other functions which may be integrated
`into certain units include environment and video mapping.
`45 Such hardware units may consequenUy take tbe form of
`extremely versatile texturing coprocessors.
`These and other features and advantages of the present
`invention may be better understood by considering tbe
`following detailed description of certaio preferred embodi(cid:173)
`so ments. In the course of this description, reference will
`frequently be made to the attached drawings.
`
`BRIEF DESCRJPTION OF THE DRAWINGS
`FIG. 1 is a bigb-level illustration of tbe principles of
`mipmapping.
`FIG. 2 is a high-level illustration of a preferred memory
`organization in a mipmapping system.
`FIG. 3 is a schematic high level block diagram of a
`60 graphics controller with a texturing unit.
`FIG. 4 is a high-level illustration sbowing the principles
`of a preferred compression system.
`FIG. 5 is a schematic illustration o[ a preferred decoder
`employed in the decompression system of FIG. 3.
`FIG. 6 is a schematic illustration of a textu re mapping
`system which operates in accordance with the principles
`shown in FIG. 4.
`
`Mapping textures onto surfaces of computer-generated
`objects is a technique which greatly improves the realism of
`their appearance. For instance, many surfaces are charac(cid:173)
`teri7.ccl by surface roughness which in a digitized image
`manifests itself in the form o f local variations in brightness 20
`from one pixel to tbc ncxl. Unfortunately, altering pixels in
`comptller generated images to generate surface textures in
`such images imposes high computational demands and, even
`worse, tremendous memory bandwidth requirements on the
`graphics system. Tight cost constraints imposed upon the 25
`design of most products in conjunction with ever increasing
`user expectations make the design of a powerful texture
`mapping unit a difficult task.
`In the present specification, we usc the term ··texture" as
`a synonym for any image or structure to be mapped onto an
`object, unles.s explicitly stated otherwise. During the raster(cid:173)
`ization process, mapping images (textures) onto objects can
`be COn!>idcrcd as the problem of determining a screen pixel's
`projection on tbc image (referred to herein as the pixel's
`··footprint") and computing an average value wbicb best
`approximates the correct pixel color. In real-time
`environments, where several tens of millions of pixels per
`second arc issued by fast rasterizing units, hardware
`expenses for image mapping become substantial and algo(cid:173)
`rithms must therefore be chosen and adapted very carefu lly.
`One approach is to create a set of prcfiltcred images,
`which are selected according to the level of detail (the size
`of the foo tprint), a od used to interpolate the final pixel color.
`The most common metbod is to organize these maps as a
`rnipmap as proposed by L. Williams, ·'Pyramidal
`Paramerrics··, Proceedings of SIGGRAPH '83, Computer
`Graphics, vol. 17, no. 3, July 1983, pp. 1-11. In a mipmap,
`the original image is denoted as level 0. In Ievei l, each entry
`holds an averaged value and represents the area of 2x2
`texels. As used herein the term ··texel" (texture element)
`refers to a picture element (pixel) of the texture. This is
`continued until tbe top-level is reached, wbicb has only one
`entry holding the average color of the eotire texture. Thus,
`in a square mipmap, level n bas one fourth tbe size of level 55
`n-1.
`Mipmapping in a traditional implementation either
`requires a parallel memory system or sequential accesses to
`the texture buOcr and is therefore either expensive or slow.
`One way to reduce data traffic is image compression. lis
`application to texture mapping, however, is difficult since
`the decompression must be done at pixel frequency.
`There is accordingly a need for a texture mapping system
`which implements mipmapping in a rapid and/or a cost
`efficient manner. There is a further need for a tex ture 65
`mapping system which provides significant image enhance(cid:173)
`ment at high rendering speeds. Moreover, particularly with
`
`0014
`
`
`
`US 6,236,405 Bl
`
`3
`FIG. 7 is a schematic illustration of cache mapping for the
`system of FIG. 6.
`FIG. 8 is a schematic illustration of ano ther embodiment
`of a texture mapping system.
`FIGS. 9(a) and 9(b) are, respectively, schematic and flow
`diagrams of an embodiment employing detail maps in
`accordance with the principles of the invention.
`FIG . .10 illustrates an image used in conjunction with the
`explanation of FIGS. 9(a) and 9(b)
`FIGS. ll(a) and ll(b) illustrate further detai ls of the
`image of FIG. 10.
`FIG. U illustrates an approximation of the projection of
`a pixel on a tex ture map using a sequence of mipmap
`operations.
`FIG. 13 illustrates details of various aspects of determi(cid:173)
`nation of a pixel footprint.
`FIGS. 14(a) and 14(b) illustrate an embodiment of cir(cid:173)
`cuitry of real-ti me mipmap generation.
`FIG. 15 is a schematic illustration of data accesses
`employed in a volume rendering embodiment.
`FIG. 16 is a schematic illustration of map linking.
`
`DETAILED DESCRIPTION
`
`l. Overview
`
`The fo llowing detailed description starts wi th an expla(cid:173)
`nation in section 2 of mipmapping. Section 3 provides an
`explanation of a preferred compression scheme for reduction
`of data storage and bandwidth requirements. Section 4
`provides a description of a hardware unit for texture map(cid:173)
`ping which is particularly well suited to low cost systems.
`Section 5 describes details of a hardware texture mapping
`unit, referred to as TEXRAM, which is particularly well
`suited for applications where high image quality is at a
`premium. Section 6 describes four different configurations
`of the TEXRAM chip for texture mapping. Sections 7, 8 and
`9 describe respectively, environment mapping, reflectance
`mapping and detail maps, which are add itional functions
`which may be performed by the TEXRAM chip. Finally,
`section 10 describes a footprint assembly system which
`efficiently provides significant image enhancement in a
`number of texture mapping architectures including those
`described in sections 4 and 5 herein.
`
`2. Mipmapping
`
`35
`
`4
`used to model objects in a 3-dimensional space. Such a
`rasterizer renders the pixels within the triangles defined by
`the received coordinates and also interpolates the z-(depth)
`value and the color (RGB) value for each pixel coordinate
`5 generated. An example of such a ras terizer is the
`3-dimensional graphics engine contained in the ViRGE line
`of graphics controllers sold by S3 Incorporated, Santa Clara,
`Calif.
`M ipmapping is a reasonable canclidate for a hardware
`10 implementation due to its regular access function. If the
`memory is designed to deliver all eight texels for a tri-linear
`interpolation in a single access, texturing can potentially
`keep up wi th fast rasterizcr units. This is accomplished by
`having eight independent memory banks and a conllict-free
`15 address distribution as shown in FIG. 2. As seen in FIG. 2,
`each 2x2 group of texels in Level 0 is assigned to memory
`banks 0, 1, 2 and 3 and each 2x2 group of texels in Levell
`is assigned to memory banks 4, 5, 6 and 7. Such an
`arrangement allows two mipmap levels to be retrieved in a
`20 single access. Furthermore, to reduce data traffic between
`the rasterizer unit and the texture system, all address calcu(cid:173)
`lations concerning the eight bank add resses as well as the
`tri-linear imerpolation should be performed locally.
`The ideal solution is a highly-integrated memory device
`25 which incorporates aU needed ari thmetic units for fast
`mipmapping. FIG. 3 s hows a schematic block diagram of a
`3-dimensional graphics system which incorporates a texture
`memory in accordance with the princip les of the invention.
`ln FIG. 3, rasterizer 302 receives da ta from CPU 304 in the
`30 form of three-dimensional coordinates (x,y,z) for each ver(cid:173)
`tex of the triangles comprising a modeled object. Also
`received for each vertex are texture coordinates (u,v) which
`define the location of the vertex on the texture.
`The rasterizer 302 computes tell.1ure coordinates (u,v) for
`each pixel. Texturing unit 306 receives the texture coordi(cid:173)
`nates (u,v) of a pixel from rasterizer 302 over line 308 and
`retrieves a plurality of texels from the texture memory and
`interpolates the pixel's texture color (RGB) from the texel
`40 values. T he term "line'' as used herein is intended to refer
`generally to functional coupling of signals between logical
`blocks. As such, tbe term .. line" may refer to a single
`physical s ignal, or to a plural ity of signals such as a bus.
`Rasterizer 302 receives the pixel's texture color from tex-
`45 turing unit 306 over line 310. The fina l pixel color (RGB)
`together with the z-valuc is stored in frame buffer 3U at
`address (x,y). Data stored in frame buffer 3U may be
`subsequently used by rasterizer 302 for further operations in
`addition to being converted to analog form for display on a
`visual display unit (not shown) such as a Cathode Ray Thbe
`(CR'J) or Liquid Crystal Display (LCD). A description of
`two preferred embodiments of texturing unit 306 is provided
`below in Sections 4 and 5.
`Mipmapping in a traditional implementation either
`requires a para llel memory system or sequential acces.ses to
`tbe tex ture bu lfer and is therefore either expensive or slow.
`lo Section 4 below, a hardware architecture for the process(cid:173)
`ing of compressed textures is presented, which integrates
`texture mapping units together with a small texture cache on
`a chip. By means of this texture compression, the off-chip
`bandwidth for updating the on-chip cache is reduced, so tha t
`standard off-the-shelf DRAM devices can be used.
`In Section 5 below, a hardware architecture based on tbe
`combination of arithmetical-logical units and large memory
`arrays for texture mapping is presented. Such units have also
`been shown to provide a quantum leap io performance in
`otber areas such as the Z-Buffer. See, for example, M. F.
`
`50
`
`As no ted above in the '·Background" portion, use of
`mipmaps to organize a set of prefiliered images for use in
`texture mapping was proposed by L. Williams, "Pyramidal
`Parametrics'', Proceedings of SIGGRAPTI '83, Computer
`Graphics, vol. 17, no. 3, July 1983, pp. 1- 11. FIG. 1 of the
`drawings shows a schematic representation of a mipmap. In
`a mipmap, the original image 102 which is comprised of a
`plurality of texels, such as seen at 104, is denoted as level 0. ss
`In level 1, seen at 106, each entry holds an averaged value
`and represents the area of 2x2texels of level 0. For example,
`in FIG. 1, texel112 in Ievell holds an averaged value of the
`four texels seen at 110 in level 0 designated individua lly as
`''a", " b", "c", and ''d" . This is continued until the top-level 60
`108 (level 4 in the example shown in FIG. 1) is reached,
`which has only one entry holding the average color of the
`emire texture. Thus, in a square mipmap, level n bas one
`fourth the size of level n-1.
`As used herein, the term " rasterizer'' refers to a graphics 65
`engine which accepts vertex coordinates, together wi th
`associated color values, which define vertices of triangles
`
`0015
`
`
`
`US 6,236,405 Bl
`
`5
`Deering et al., "FBRAM: A Ne1v Form of Memory Optimized
`for 3D Graphics··, Proceedings of SIGGRAPH '94, July
`1994, pp. 167-74, and G. Kniuel and A. Schilling, "Eiinli(cid:173)
`rwting !he Z-Bujfer Bonleneck'', Proceedings of the Euro(cid:173)
`pean Design and Test Conference, Paris, France, Mar. 6-9,
`1995, pp. 12-16. A graphics pipeline based on enhanced
`memories for high performance in low-cost systems is
`described in a paper by G. Knillel, A. Schilling and W.
`Strasser, "GRAMt\.fY: High Performance Graphics Using
`Graphics Memories", in: lligb Performance Computing for
`Computer Graphics and Visualisation, Springer-Verlag,
`London, 1996.
`An additional disadvantage of traditiona l mipmapping
`steiDS from the assumption of a square footprint. The bard(cid:173)
`ware architectures presented in Sections 4 and 5 below
`efficiently alleviate the deficiencies arising from tbe square
`footprints by means of a new illtering operation (herein
`referred to as "footprint assembly" as described in Section
`10 below). The fil tering can take advantage of the extremely
`high bandwidth, which is available on-chip.
`
`3. Compression
`
`6
`register 510 represent, individually, o ne of the 16 pixels
`shown at 404. Once a 16-texel-block (32 bits) is retrieved
`from memory and stored in register 510, the texel to be
`decompressed is selected by its 4-bit address 512 within the
`s block. Multiplexer 504 feeds the corresponding decision bit
`via line 518 to the select-input of multiplexer 506, which
`passes one of the color quantities ·'a" or "b" to the address
`inputs of look-up table 508. In the embodiment of FIG. 5,
`look-up table 508 contains 256 locations addressable by the
`10 8-bit input, each location storing a unique 24-bit color. The
`output of the decoder 502 is a 24-bit RGB output shown at
`520. The decoder fulfi lls the listed requirements for texture
`mapping in a nearly ideal way.
`The compression of the texture maps can and preferably
`tS shou ld be performed in advance. This not only saves space
`on the storage media for the description of the scene, but also
`allows more complex algorithms to be used for Lbe quanti(cid:173)
`zation due to the olf-line compression. For grey scale images
`the first three sample moments can be preserved by choosing
`20 appropriate threshold and output levels as described by Delp
`ct al. in Lbe article referred to above. [o the Campbell et al.
`article referred to above, only the luminance of the input
`colors is used for clustering. if there are difterent colors with
`similar luminance in the same block, this method will fail.
`25 The quantization witb the minimum mean square error can
`only be found by exhaustive trial, resulting in long com-
`pression times.
`For a quicker approximate solution, embodi ments of the
`present invention advantageously split the input values into
`two clusters by a plane perpendicular to the axis with the
`minimum .. moment of inertia". For that purpose the tensor
`of inertia from the individual colors
`
`30
`
`Compression of textures is often preferable as it saves
`memory costs and reduces memory bandwidth requi re(cid:173)
`ments. However, several requirements bavc to be fulfilled.
`The most important one is, that the local decompression of
`the stored texture is possible and can be performed very
`quickly. Commonly used image compression schemes such
`as JPEG, arc not well su ited to texture mapping since they
`do not fuliiiJ two es.sential requirements for texture mapping:
`(1) the decompression bas to be simple and very fast, and,
`(2) random access to texels must be possible.
`Block Truncation Coding (BTC) was introduced by Delp
`ct. al. in an article cntiiled "Image Compression Using Block 35
`Truncation Coding",
`I EEE Transactions on
`Communications, vol. COM-27, no. 9, September 1979, pp.
`1335-42. Color Cell Compression (CCC) is an extremely
`useful image compression technique for texture mapping.
`CCC which is a modification and application of BT C to 40
`color images was introduced by G. Campbell et al. in an
`art icle entitled "Two Bit/Pixel Full Color Encoding", SIG(cid:173)
`GRAPI-1 '86 Conference Proceedings, Computer Graphics,
`vol. 20, no. 4, August 1986, pp. 215-23.
`FIG. 4 shows the principle of the BTC/CCC. The main 45
`idea of BTC/CCC is to use a local 1-bit quantizer on 4x4
`pixel blocks such as seen at 404. The compressed data for
`such a block thus consists of only two colors (seen at 402)
`and 16 bits (seen at 406 and hereafter referred to as "decision
`bits") that ind icate, which one of the two colors is assigned so
`to each o( the 16 pixels. For example, as shown in FIG. 4,
`the four color values (w, x, y and z) seen at 404 are reduced
`to two color values (a,b) at 402, represented by 0 and 1,
`respectively at 406. For further data reduction, the 24-bit
`colors can be globally quantized to 8 bits, using a standard ss
`quantizer such as the lleckbert quantizer described by P.
`Heckbert, "Color Image Quamization for Frame Buffer
`Display ", Proceedings of SIGG RAPH '82, Compute r
`Graphics, vol. 16, no. 3, July 1982.
`A preferred decoder for decoding of a CCC-encoded 60
`image is shown in FIG. 5. The decoder 502 consists of
`multiplexers 504 and 506 and a lookup table 508. A 3~bit
`quantity is s tored in register 510. This 32-bit quantity
`consists of two 8-bit quantities of color data ("color a" and
`"color b", such as s bown at 402 in FIG. 4), each 8-bit 65
`quantity representative of one of the two colors generated
`for tbe 16 pixel block 404. The o ther 16 bits stored in
`
`0016
`
`is calculated as
`
`16
`
`e., = _l)1jll2
`6;A - xj, xft
`
`j=l
`
`(I)
`
`where i, k€{ R, G, B} and b1k=l for i=k 0 else.
`The eigenvector with the sma llest eigenvalue is then calcu(cid:173)
`lated using standard methods. Multiplication of the indi(cid:173)
`vidual colors with that eigenvector reduces tbc clustering
`problem to one dimension and allows tlhe colors to be sorted
`according to their distance to a plane parallel to the cu tting
`plane. The quantization threshold is set to the mean distance.
`In this way the mean color is in the cutting plane.
`CCC in conjunction with footprint assembly (described in
`Section 10) gives beHer image quality at higher speed and
`lower cost than traditional texture mapping.
`
`4. Texture Mapping Unit
`
`FIG. 6 shows an embodiment of a single chip implemen(cid:173)
`tation of an integrated texture mapping unit 602 which
`operates in conjunction with a conventional Dynamic Ran(cid:173)
`dom Access Memory (DRAM) 604. In a preferred
`embodiment, unit 602 and DRAM 604 are integrated on a
`single chip. The unit 602 receives da ta £rom a rasterizer 302
`such as shown in FIG. 3. DRAM control unit 607 receives
`texture addresses from the rasterizcr 302. Operation of the
`DRAM 604 is controlled by DRAM controluoit 607, which
`generates memory addres.scs for the DRAM in accordance
`with the output of rasterizer 302 and w hich generates other
`
`
`
`US 6,236,405 Bl
`
`8
`alternative ways. By way of example, the capacity of
`register 608 may be increased to allow pipelining of data
`from the DRAM 6~ into the unit 602.
`The visual quality of a compression scheme can in the end
`only be judged by human observers. Compared to original
`textures, still images as well as animations show noticeable
`but no disturbing artifacts for compressed textures. Mip(cid:173)
`mapping with bi-linear interpolation generally exhibits
`severe artifacL<;, independent of tbe use of image compres(cid:173)
`sion. Footprint assembly retains texture details even for
`objects looked <It from a small angle. For cost-efficient
`systems, we propose to combine bilinear interpolation, CCC
`and footprint assembly, a way to improve texture mapping
`performance s igniJicantly.
`
`tO
`
`7
`signals required to operate the DRAM such as row and
`column address strobes and chip and write enable signals. In
`the embodiment of FIG. 6, DRAM 604 provides 32 bits of
`data, 16-bits at a time. One 16-bit quantity represents two,
`8-bit quantities of color data, such as explained above in 5
`connection with FIGS. 4 and 5, and one 16-bit quantity
`represents 16 bits of decision data, such as also explained
`above in connection with FIGS. 4 and 5. The 32-bit quantity
`outputted by the DRAM 604 is stored in a register 608.
`Multiplexer 6 10 performs a 2: L multiplexing function to
`sequentially select the two 8-bit color quantities stored in
`register 608 for output to Color Look'llp Table (CLUl) 6ll.
`C'LUT 6 12 may take a variety of configurations including
`256x24 or 5 L2x24, depending on the number of inputs.
`CLUT 6 12 outputs a 48-bit quantity to a cache 614. 15
`Pre ferably, the cache 614 is a small, four-entry cache of 32
`bytes which is suDicient to s tore lour 4x4 texel blocks or a
`tol:ll o f 64 texel<>.
`The cache 6 14 is a direct mapped cache in which each
`memory address is assigned one unique cache cell (A
`through D) in an interleaved way as s hown in FIG. 7.1n FIG.
`7, each block (A, B, C, or D) represents one emry in the
`cache 6 14. This unambiguous way of addressing allows a
`simple address calculation logic to be used. A simple form
`of addres.<. look-ahead speeds up the operation of the cache:
`neighbors of the current block are preloaded if the DRAM
`interface is idle. The three neighbors adjacent to the quad(cid:173)
`rant of the point addressed by the current u- and v-values
`may be selected. This decision can be made using only two
`bits, one bit of the u- and the other of the v-address. A more
`sophisticated look ahead could use the direction of consecu(cid:173)
`tive accesses within a seaolinc, which could be calculated
`from the two previous access points. The rasterizer should
`provide a flag to indicate a seaolioe change, which could be
`used to disable the prefetch and look ahead calculation for
`that cycle.
`Each of the 64-bit entries in the cache holds two, 24 bit
`quantities of true-color data ("·color a" and "color b" such as
`shown at 402 in FIG. 4) and a 16-bit quantity of decision
`data The 16-bit quantity of decision data is wrilten to cache
`via line 624 from register 608, and the two 24-bit quantities
`o f color data arc written to the cache from color look-up
`table 6 12. The cache write accesses are controlled by
`DRAM control unit 607 via line 616. Tbe cached is con(cid:173)
`nected to color extract unit 618 via four 64-bit lines. Color
`extract unit 618 selects data from the four received 64-bit
`quantities for use by bilinear interpolator 622 in accordance
`with pixel center coordinates receiv