`IMPROVING PERFORMANCE USING THE RAMP DRIVER I 257
`
`
// init shared hardware device
CSharedHardware *pGrfx = new CSharedHardware();
// init direct draw
if (!pGrfx->InitDirectDraw()) return FALSE;
// init d3d
if (!pGrfx->InitDirect3D(USE_RAMP))
    return FALSE;

// set up cooperative level
if (!pGrfx->SetCoopLevel(hwnd, DDSCL_NORMAL)) return FALSE;
`
`Do revisit the code in Section 14.4.2 if you need to refresh your memory on
`how to enumerate and select a driver.
`
16.3.2 Using the Ramp Driver: The First Try
Once we’ve loaded the new driver, can’t we just go ahead and create the objects
CSurface3D and CTriangle3D or CTriangleTex as usual and run them using
the Ramp driver instead of the RGB driver? The answer is no.
`
When we first tried running our simple triangle with the Ramp driver, we
`saw our background being painted correctly, but our CTriangle3D (shaded
`triangle) was drawn as a black triangle. When we tried using CTriangleTex
`(texture mapped triangle), our application crashed, right in the middle of
`the Ramp driver rendering module, with no clue as to what was wrong.
`
`The key lies in remembering how the Ramp model operates—through
`lookup tables (see section 15.3.4). The Ramp driver uses only the Blue
`component of a color specification and then accesses “a lookup table” to
`interpret the final result. The Ramp driver builds lookup tables from mate-
`rial definitions. If no lookup table has been built (because, say, no material
`was created), then the rendering module crashes. Solution: create materials.
`
16.3.3 Creating Materials for the Ramp Driver

We have repeated the definition of the D3DMATERIAL structure here for easy
reference:
`
258 I CHAPTER 16 UNDERSTANDING AND ENHANCING DIRECT3D PERFORMANCE
`
typedef struct _D3DMATERIAL {
    DWORD            dwSize;
    D3DCOLORVALUE    dcvDiffuse;    // Four different color components.
    D3DCOLORVALUE    dcvAmbient;
    D3DCOLORVALUE    dcvSpecular;
    D3DCOLORVALUE    dcvEmissive;
    D3DVALUE         dvPower;       // Specify sharpness of specular reflection.
    D3DTEXTUREHANDLE hTexture;      // Combine a texture with specified coloring.
    DWORD            dwRampSize;    // Shading gradient of colors in Ramp/Mono model.
} D3DMATERIAL, *LPD3DMATERIAL;
`
`The Ramp driver builds lookup tables based on the specifications in a
`material structure:
`
For materials with no specularity, the driver builds a linear “color ramp”
ranging from the ambient color to the maximum diffuse color.
For materials with specularity, the driver builds a two-stage color ramp;
the first stage ranges from the ambient color to the maximum diffuse color,
and the second stage ranges from the maximum diffuse color to the
maximum specular color. The gradient of the specular ramp is not linear,
and it is controlled by the dvPower field.
`
`For materials with textures, the Ramp driver builds a color ramp for each
`color in the texture.
`
`The Ramp driver references the dwRampSize field to determine the size
`of the ramp built for each color.
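
To make the ramp rules above concrete, here is a minimal sketch of how such lookup tables could be built. It is not the Ramp driver's actual code: the Color struct is a hypothetical stand-in for D3DCOLORVALUE, the helper names are ours, and the use of pow() with the power value for the specular stage is only an assumption about what a nonlinear gradient might look like.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for D3DCOLORVALUE: one float per channel, 0.0 to 1.0.
struct Color { float r, g, b; };

// Linear blend between two colors; t = 0 gives a, t = 1 gives b.
static Color Lerp(const Color &a, const Color &b, float t)
{
    return { a.r + (b.r - a.r) * t,
             a.g + (b.g - a.g) * t,
             a.b + (b.b - a.b) * t };
}

// One-stage ramp: rampSize shades running from ambient to diffuse.
std::vector<Color> BuildLinearRamp(const Color &ambient, const Color &diffuse,
                                   int rampSize)
{
    assert(rampSize >= 2);
    std::vector<Color> ramp;
    ramp.reserve(rampSize);
    for (int i = 0; i < rampSize; ++i)
        ramp.push_back(Lerp(ambient, diffuse, float(i) / float(rampSize - 1)));
    return ramp;
}

// Two-stage ramp: the first half runs linearly from ambient to diffuse, the
// second half runs from diffuse to specular along a curve shaped by `power`
// (assumed here to be a simple pow() falloff).
std::vector<Color> BuildSpecularRamp(const Color &ambient, const Color &diffuse,
                                     const Color &specular, float power,
                                     int rampSize)
{
    assert(rampSize >= 4);
    const int half = rampSize / 2;
    std::vector<Color> ramp = BuildLinearRamp(ambient, diffuse, half);
    const int rest = rampSize - half;
    for (int i = 1; i <= rest; ++i) {
        const float t = std::pow(float(i) / float(rest), power);
        ramp.push_back(Lerp(diffuse, specular, t));
    }
    return ramp;
}
```

Calling BuildLinearRamp({0,0,0}, {1,0,0}, 16) would yield sixteen shades running from black to full red, which mirrors the sixteen-entry red ramp described in the text.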
`
`For example, the following code sequence builds a color ramp with sixteen
`shades of red:
`
m_MaterialDesc.dcvDiffuse.dvR = D3DVALUE(1.00);
m_MaterialDesc.dcvDiffuse.dvG = D3DVALUE(0.00);
m_MaterialDesc.dcvDiffuse.dvB = D3DVALUE(0.00);
m_MaterialDesc.hTexture = NULL;
m_MaterialDesc.dwRampSize = 16;
`
`In this next example, the Ramp driver builds a color ramp with eight shades
`for each color in the associated texture:
`
m_MaterialDesc.dcvDiffuse.dvR = D3DVALUE(1.00);
m_MaterialDesc.dcvDiffuse.dvG = D3DVALUE(1.00);
m_MaterialDesc.dcvDiffuse.dvB = D3DVALUE(1.00);
m_MaterialDesc.hTexture = hTexture;
m_MaterialDesc.dwRampSize = 8;
`
`Rendering a Triangle with the Ramp Driver
`
Now that we’ve taken a look at how the Ramp driver builds its lookup
tables, let’s create a CTriangleRamp to render a shaded triangle with the
Ramp driver.
`
BOOL CTriangleRamp::Init(LPDIRECT3D pD3D, LPDIRECTDRAWPALETTE pPalette,
                         UINT nRes)
{

Create material to “set” palette entries for color.

    pD3D->CreateMaterial(&m_pMaterialFns, NULL);
    m_MaterialDesc.dcvDiffuse.dvR = D3DVALUE(1.00);
    m_MaterialDesc.dcvDiffuse.dvG = D3DVALUE(1.00);
    m_MaterialDesc.dcvDiffuse.dvB = D3DVALUE(1.00);
    m_MaterialDesc.hTexture = NULL;
    m_MaterialDesc.dwRampSize = 16;
    m_pMaterialFns->SetMaterial(&m_MaterialDesc);
    m_pMaterialFns->GetHandle(p3dFns, &m_hMaterial);

The texture handle is NULL, the maximum diffuse color is WHITE, and the
ramp size is 16. The Ramp driver will create fourteen shades of gray
between BLACK and WHITE.
`
`Standard code to allocate system memory space for an Execute Buffer
#define nTRIS 1
#define nVERTS nTRIS*3
m_sztEx = sizeof(D3DTLVERTEX) * nVERTS;
m_sztEx += sizeof(D3DINSTRUCTION) * 5;
m_sztEx += sizeof(D3DSTATE) * 2;
m_sztEx += sizeof(D3DPROCESSVERTICES);
m_sztEx += sizeof(D3DTRIANGLE) * nTRIS;
m_pSysExBuffer = new BYTE [m_sztEx];
memset(m_pSysExBuffer, 0, m_sztEx);
`
`Use standard code to initialize vertices and then override the colors.
`
D3DTLVERTEX *aVerts = (D3DTLVERTEX *)m_pSysExBuffer;
setupVertices(nTRIS, aVerts);
int i;
for (i=0; i<nTRIS; i++)
{
    aVerts[0].color = RGBA_MAKE(000, 000, 255,
    aVerts[1].color = RGBA_MAKE(000, 000, 128,
    aVerts[2].color = RGBA_MAKE(000, 000, 000,
    aVerts += 3;
`
`The Ramp driver uses
`only the blue compo-
`nent and ignores the red
`and green components.
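
As a rough illustration of why only the blue byte matters, the sketch below packs a color and maps its blue component onto a ramp entry. MakeRgba is our own helper written to use the same bit layout as the RGBA_MAKE macro (alpha in the top byte, blue in the bottom byte); RampIndexFromColor and its rounding rule are assumptions made for illustration, not the driver's documented behavior.

```cpp
#include <cassert>
#include <cstdint>

// Pack four 8-bit channels: alpha in bits 24-31, red 16-23, green 8-15,
// blue 0-7. (Hypothetical helper mirroring the RGBA_MAKE layout.)
constexpr std::uint32_t MakeRgba(unsigned r, unsigned g, unsigned b, unsigned a)
{
    return (a << 24) | (r << 16) | (g << 8) | b;
}

// Map the blue component (0..255) onto a ramp entry (0..rampSize-1).
// Red and green are ignored, as the text describes for the Ramp driver.
// Rounding to the nearest entry is an assumption.
int RampIndexFromColor(std::uint32_t color, int rampSize)
{
    assert(rampSize >= 1);
    const unsigned blue = color & 0xFFu;
    return int((blue * unsigned(rampSize - 1) + 127u) / 255u);
}
```

With a sixteen-entry ramp, a blue value of 255 selects the brightest entry (index 15), 128 lands near the middle (index 8), and 0 selects the darkest entry regardless of what the red and green bytes contain.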
`
The notable addition when setting up instructions for a triangle with the Ramp
model is the D3DOP_STATELIGHT opcode with its D3DLIGHTSTATE_MATERIAL
operand. (We’re using the OP_STATE_LIGHT macro.) Any materials that
we’ve created have merely instructed the Ramp driver on how we want its
lookup tables built. We use the D3DOP_STATELIGHT instruction in an Execute
Buffer to instruct the Ramp driver to use a specific material for all future
rendering.
`
`
The D3DOP_STATELIGHT instruction seems to turn off the render state. The default
state is inoperative, and triangles will not be rendered unless you reset the render
state. The D3DOP_STATERENDER specification must follow the D3DOP_STATELIGHT
instruction, as render states set before the Light state become inoperative. You
may want to set all render states that concern you and not assume the value of
any state.
`
`
`
Set up instructions in Execute Buffer.

DWORD dwStart = sizeof(D3DTLVERTEX) * nVERTS;
LPVOID lpTmp = (LPVOID)(m_pSysExBuffer + dwStart);
OP_STATE_LIGHT(1, lpTmp);
STATE_DATA(D3DLIGHTSTATE_MATERIAL, m_hMaterial, lpTmp);
OP_STATE_RENDER(1, lpTmp);
STATE_DATA(D3DRENDERSTATE_SHADEMODE, D3DSHADE_GOURAUD, lpTmp);
OP_PROCESS_VERTICES(1, lpTmp);
PROCESSVERTICES_DATA(D3DPROCESSVERTICES_COPY, 0, nVERTS, lpTmp);
OP_TRIANGLE_LIST(nTRIS, lpTmp);
for (i=0; i<nTRIS; i++)
{
    ((LPD3DTRIANGLE)lpTmp)->v1 = i*3+0;
    ((LPD3DTRIANGLE)lpTmp)->v2 = i*3+1;
    ((LPD3DTRIANGLE)lpTmp)->v3 = i*3+2;
    ((LPD3DTRIANGLE)lpTmp)->wFlags = 0;
    lpTmp = ((char*)lpTmp) + sizeof(D3DTRIANGLE);
}
OP_EXIT(lpTmp);
DWORD dwLength = (LPBYTE)lpTmp - m_pSysExBuffer - dwStart;

Tell the renderer that we want it to use our material to render all future triangles.
Note that we reset the render state to Gouraud, even though this is the default
state.
`
We are now ready to render our triangle with the model. The Ramp model
only seems to set palette colors once an instruction stream has been
executed. Our code currently sets the palette on every EndScene. You may want
to execute an instruction stream with just the D3DOP_STATELIGHT instruction
to update the palette during an initialization stage.
`
`OPTIMIZING TEXTURE MAPPING I 261
`
`How Does the Ramp Driver Perform?
`
`Table 16-8 compares the performance of the RGB and the Ramp color
`model drivers. We’ve shown results for various rendering options using
`Scene 2 (16 X 5000) from our previous tests.
`
TABLE 16-8 Comparing the Direct3D RGB and Ramp Color Model Drivers

Rendering Option           RGB Driver           Ramp Driver
Gouraud                    55.3 milliseconds    4.3 milliseconds
Flat Shaded                55.3 milliseconds    1.8 milliseconds
Gouraud and Specular       60.3 milliseconds    4.3 milliseconds
Gouraud and Dither         55.3 milliseconds    20.6 milliseconds
Texture Map and Gouraud    62.5 milliseconds    16.7 milliseconds
Texture Map Copy Mode      14.4 milliseconds    14.9 milliseconds
`
Wow!

Look at the speed of the Flat Shaded, Gouraud, and Gouraud and Specular
options. Now we’re really screaming along!
`
`The performance of Gouraud and Dither is not too shabby either. You
`may not want to use it on all your triangles, but at this performance level,
`you could use it on some.
`
The only “disappointment” is that the performance of texture mapping
in Copy mode has not improved. It would have been great if we could use
texture mapping widely, but at this performance level you probably
would want to limit its use.
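
To put the table's figures in perspective, it can help to convert milliseconds into frame rates and into a share of a per-frame time budget. The following is a small arithmetic sketch only; the function names are ours, and the figures cover just the rendering of the test scene, not everything a real application does in a frame.

```cpp
#include <cassert>

// Frames per second achievable if rendering one frame costs msPerFrame.
double FramesPerSecond(double msPerFrame)
{
    assert(msPerFrame > 0.0);
    return 1000.0 / msPerFrame;
}

// Fraction of the per-frame time budget consumed at a target frame rate.
// At 30 fps the budget is 1000 / 30, or about 33.3 milliseconds.
double BudgetFraction(double msPerFrame, double targetFps)
{
    assert(msPerFrame > 0.0 && targetFps > 0.0);
    return msPerFrame / (1000.0 / targetFps);
}
```

By this arithmetic, 55.3 milliseconds works out to roughly 18 frames per second, 4.3 milliseconds to more than 230, and 16.7 milliseconds to about half of a 30 fps budget.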
`
16.4 Optimizing Texture Mapping
Before we close, we’d like to include some advice from the Direct3D
documentation on optimizing texture mapping:

• Texture mapping performance is heavily influenced by cache behavior.
  Keep textures small; the smaller the textures are, the better chance they
  have of being maintained in the main CPU’s secondary cache.
`
• Do not change the textures on a per-primitive basis. Try to keep polygons
  grouped in order of the textures they use.

• Use square textures whenever possible. Textures whose dimensions are
  256 X 256 are the fastest. If your application uses four 128 X 128 textures,
  for example, try to ensure that all the textures use the same palette, and
  place them into one 256 X 256 texture. This technique also reduces the
  amount of texture swapping required. Of course, you should not use
  256 X 256 textures unless your application requires that much texturing,
  because, as already mentioned, textures should be kept as small as possible.
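
If you do pack four 128 X 128 textures into one 256 X 256 texture, each triangle's texture coordinates have to be remapped into the quadrant that holds its image. The sketch below shows one way to do that. The UV struct, the AtlasUV name, and the quadrant layout (slot 0 top-left, 1 top-right, 2 bottom-left, 3 bottom-right) are all our own assumptions, and the sketch ignores filtering bleed at quadrant edges.

```cpp
#include <cassert>

// Texture coordinates in the range 0.0 to 1.0.
struct UV { float u, v; };

// Remap coordinates that address one small texture so that they address
// the quadrant `slot` (0..3) of a combined texture twice as wide and tall.
UV AtlasUV(int slot, UV local)
{
    assert(slot >= 0 && slot < 4);
    const float offsetU = (slot % 2) * 0.5f;   // left or right half
    const float offsetV = (slot / 2) * 0.5f;   // top or bottom half
    return { offsetU + local.u * 0.5f, offsetV + local.v * 0.5f };
}
```

For example, the center of the texture stored in slot 1 (local coordinates 0.5, 0.5) maps to (0.75, 0.25) in the combined texture.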
`
Well, we’ve come to the end of this road. Cheers, and may all your 3D
applications really sizzle.
`
`WHAT HAVE
`YOU LEARNED?
`
`We measured the performance of our simple RGB color model triangle, both its inner
`workings and its various rendering options. We tried some optimizations and found that
`the returns were decent for long Execute Buffers, but overall performance was still far from
`stellar.
`
Next we learned how to use the Ramp color model driver, including using materials and
D3DOP_STATELIGHT to direct the driver to create its lookup tables. And we were rewarded
with a dramatic improvement in performance.

We've spent sufficient time on Direct3D’s Immediate mode. In the next chapter we will
cover mixing our 3D results with 2D and video.
`
CHAPTER 17

Mixing 3D with Sprites,
Backgrounds, and Videos

WHY READ
THIS CHAPTER?
`
You might as well ask, “Why would I need to mix other graphics media types with 3D?”
Well, here are some scenarios that might prompt mixing:

• You could create your application to be entirely 3D based. But 3D modeling and
  rendering is performance intensive. Drawing some objects with faster 2D mechanisms
  may bring an improvement in performance.
• You have your own object types, with their own rendering code, and you want to
  intermingle these objects in a 3D model.
• Say you have designed a 3D exploratorium within which you have real-life characters
  communicating with the Explorer. You have motion video footage of these characters,
  and you'd like to transparently overlay the video in your 3D world.

In short, you may want to mix media types because of performance advantages and/or
because you want to add richness. In this chapter, first you'll learn how to mix a 3D object
within a 2D world, and then you'll learn how to use a video as a texture map within a 3D
world.
`
17.1 Mixing a 3D Object on a 2D Background
We’ve already seen how mixing works in Part II, where we mixed a sprite on
top of a background. In fact, over the course of Part II, we looked at a
variety of options for mixing: using GDI, DirectDraw, and RDX.

In Part II we mixed a sprite on top of a background by:

• creating a CSurface from among the various options;
`
`
264 I CHAPTER 17 MIXING 3D WITH SPRITES, BACKGROUNDS, AND VIDEOS
`
• creating a CBackground from among any options suitable to the CSurface
  and attaching the CBackground to the CSurface;

• creating a CSprite from among any options suitable to the CSurface and
  attaching the CSprite to the CSurface; and

• Blting the CBackground first, Blting the CSprite on top of the
  CBackground, and then refreshing the screen with the mixed image.
`
17.1.1 Our 3D Surface Is Also a 2D Surface

But wait! Let’s think about where we are. We got access to a 3D surface in
the first place by “querying” for 3D capabilities. As long as we retained
access to the original 2D surface (that is, as long as we did not call
IDirectDrawSurface::Release()), we can still use its innate 2D-ness.
`
So to mix a 3D object on top of a 2D background, we could

• create a CSurface suitable to be “extended” for 3D capabilities and then
  “extend” the 2D surface to a 3D surface while retaining access to the
  original 2D surface.

• create a 2D background from options suited to the 2D surface and then
  attach this background to the dual 2D-3D surface.

• create a 3D triangle from available render styles and then attach the 3D
  triangle to the dual CSurface.

• Blt the background first as usual, Blt the 3D sprite on top of the
  background, and then refresh the screen with the mixed image.

Here’s the 3D version of the FollowMouse() method that handles dual
surfaces:
`
`
`
long CSurface3d::FollowMouse(CPoint &point, int nTime)
{

Pre Scene Init: Set up to use 3D driver and clear Z-Buffer (if any).

    m_p3dFns->BeginScene();
    if (m_bIsZEnabled)
    {
        D3DRECT drDst;
        drDst.x1 = 0;
        drDst.y1 = 0;
        drDst.x2 = m_dwWidth;
        drDst.y2 = m_dwHeight;
        #define nRECTS 1
        m_p3dViewport->Clear(nRECTS, &drDst, D3DCLEAR_ZBUFFER);
    }
`
`MIXING A 3D OBJECT ON A 2D BACKGROUND I 265
`
`
`
Set up BLTPARAMS structure for dual-surface usage.

    BLTPARAMS xDst;
    xDst.pddsDesc = &m_SurfDesc;
    xDst.pddsFns = m_p2dFns;
    xDst.p3dFns = m_p3dFns;
    xDst.p3dViewport = m_p3dViewport;

Blt background to dual surface. Blt either 2D or 3D background based on Init.
2D background may need surface to be locked.

    if (m_nNeedLock & BKGLOCK)
        m_p2dFns->Lock(NULL, &m_SurfDesc, DDLOCK_WAIT, NULL);
    if (m_pBackground != NULL)
    {
        RECT rSrc = {0, 0, m_dwWidth, m_dwHeight};
        POINT ptDst = {0, 0};
        m_pBackground->Blt(&xDst, &ptDst, &rSrc);
    }
    if (m_nNeedLock & BKGLOCK)
        m_p2dFns->Unlock(NULL);

Blt attached 3D Triangle.

    if (m_pTriangle != NULL)
        m_pTriangle->Blt(&xDst, &point);

Scene End Stage: End Scene, refresh screen, and return.

    m_p3dFns->EndScene();
    // offset dst rect accounting for client area
    long lRight = m_ptZeroZero.x + m_dwWidth;
    long lBottom = m_ptZeroZero.y + m_dwHeight;
    RECT rDst = {m_ptZeroZero.x, m_ptZeroZero.y, lRight, lBottom};
    RECT rSrc = {0, 0, m_dwWidth, m_dwHeight};
    // set palette and refresh screen
    gpPrimary->SetPalette(gpPalette);
    gpPrimary->Blt(&rDst, m_p2dFns, &rSrc, DDBLT_WAIT, NULL);
    // return
    return TRUE;
}
`
Notice the code added to pass both the 2D and the 3D descriptor to the
object renderers in the BLTPARAMS structure. Also notice the code added to
lock and unlock the 2D surface for most 2D background rendering (a
hardware-accelerated 2D background would not need a Lock/Unlock).
`
`
`
Some hardware 3D devices may not allow 2D functions to be invoked between
BeginScene() and EndScene(). These devices will set the
DDCAPS2_NO2DDURING3DSCENE flag in the 2D caps structure (hwCaps.dwCaps2).
If this flag is set, you will need to modify the FollowMouse code to render a
2D background before BeginScene(), but render a 3D background after
BeginScene(). We found that the HEL drivers do not impose this restriction,
so we have not built this check into our current example.
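
The check described in this note could be sketched as follows. This is not part of the book's sample code: the constant is a local stand-in whose value is an assumption (real code should take DDCAPS2_NO2DDURING3DSCENE from ddraw.h), and the enum and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for the DirectDraw caps bit; the value is assumed here.
// In real code, use DDCAPS2_NO2DDURING3DSCENE from ddraw.h instead.
constexpr std::uint32_t kNo2DDuring3DScene = 0x00000002u;

enum class BackgroundKind { TwoD, ThreeD };
enum class DrawPoint { BeforeBeginScene, AfterBeginScene };

// Decide where the background must be drawn relative to BeginScene().
// A 3D background always belongs inside the scene; a 2D background must
// move ahead of BeginScene() only when the hardware sets the caps bit.
DrawPoint BackgroundDrawPoint(std::uint32_t caps2, BackgroundKind kind)
{
    const bool restricted = (caps2 & kNo2DDuring3DScene) != 0;
    if (kind == BackgroundKind::TwoD && restricted)
        return DrawPoint::BeforeBeginScene;
    return DrawPoint::AfterBeginScene;
}
```

A FollowMouse-style method could consult such a helper once, with the dwCaps2 value obtained from the hardware caps, and then Blt the background either before or after its BeginScene() call.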
`
`
`
`
`
`17.1.2 Measuring Background Performance
`Table 17-1 compares the performance of Bltting a sprite with both 2D and
`3D rendering paths.
`
TABLE 17-1 Comparing 2D and 3D Backgrounds

CBackgroundCCode    7.1 milliseconds
CBackgroundP5       6.8 milliseconds
CBackgroundTex      46.5 milliseconds
CBackground3D       3.8 milliseconds*

* CBackground3D fills the background with a constant color, whereas all other options
transfer an image to the screen. Therefore the comparison of CBackground3D with
the other options is not a true apples-to-apples comparison. The figure is shown for
reference.
`
CBackgroundTex is an implementation of a texture-mapped 3D background
object. You implement a texture-mapped 3D background by loading
a texture object and setting its handle in the background material
structure. Check the source code for the Timing Application on our
Internet site. (Note that unlike triangle textures, a background texture need not
be sized using powers of two.)
`
`A CBackgroundTex is texture mapped to the surface and is not merely Bltted
`to the surface. The implication is that if the source and destination sizes dif-
`fer, the source is stretched (or shrunk) to fit the destination rectangle. Texture
`mapping is much costlier, as the results of our measurements demonstrate.
`
If all you need is a simple Blt of a background image, then as the
performance results indicate, using 2D backgrounds behind 3D objects offers
significant performance boosts over using texture mapping.

17.2 Mixing in Sprites
`
`Hey, can’t we add sprites to our dual surface just like we did with back-
`grounds? Technically, yes. But our code lets us have only one active sprite at
`a time. If we wanted to have more than one sprite, we would need to main-
`tain some form of list (or array) of sprites and draw all the active sprites
`within our Refresh functions.
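
A sprite list of the kind described above might look like the following sketch. The Sprite struct, its depth field, and the DrawOrder function are hypothetical and only illustrate the bookkeeping: skip inactive sprites and draw the rest from back to front.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sprite record; a larger depth means farther from the viewer.
struct Sprite {
    std::string name;
    int depth;
    bool active;
};

// Return the names of the active sprites in the order they should be
// drawn: farthest first, so nearer sprites are painted over farther ones.
std::vector<std::string> DrawOrder(std::vector<Sprite> sprites)
{
    sprites.erase(std::remove_if(sprites.begin(), sprites.end(),
                                 [](const Sprite &s) { return !s.active; }),
                  sprites.end());
    std::stable_sort(sprites.begin(), sprites.end(),
                     [](const Sprite &a, const Sprite &b) {
                         return a.depth > b.depth;
                     });
    std::vector<std::string> order;
    for (const Sprite &s : sprites)
        order.push_back(s.name);
    return order;
}
```

A Refresh function would then walk this order and Blt each sprite in turn onto the dual surface.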
`
`
`MIXING IN SPRITES I 267
`
`
`
Since the Intel RDX library provides code to manage lists of sprites and
draw them in back-to-front order, let’s just use RDX to mix sprites in. If
you’ve forgotten, or haven’t had a chance to play with RDX yet, do take a
quick trip through Chapter 8.
`
17.2.1 Using RDX to Mix in Sprites

The RDX programming model allows us to

• create a surface of a specified size and pixel depth;
• create mixable objects (such as sprites, backgrounds, grids, and AV
  objects) and connect them to the surface;
• manipulate attributes of the objects (such as draw order, position,
  transparency, and visibility); and
• mix and render all visible objects attached to a surface by invoking a single
  srfDraw() function provided by the surface object.
`
Typically you attach the surface to a window using srfSetDestWindow(), and
the window is automatically refreshed by srfDraw(). RDX also has a
srfSetDestMemory() function that we can use to specify that the output of
srfDraw() be sent to a memory buffer that we provide. Let’s use
srfSetDestMemory() to have RDX output its data into our dual surface:
`
long CSurface3d::FollowMouse(CPoint &point, int nTime)
{
    // pre scene init
    m_p3dFns->BeginScene();
    if (m_bIsZEnabled)
    {
        D3DRECT drDst;
        drDst.x1 = 0; drDst.y1 = 0;
        drDst.x2 = m_dwWidth; drDst.y2 = m_dwHeight;
        m_p3dViewport->Clear(1, &drDst, D3DCLEAR_ZBUFFER);
    }

    // setup BLTPARAMS struct for dual-surface usage
    BLTPARAMS xDst;
    xDst.pddsDesc = &m_SurfDesc;
    xDst.pddsFns = m_p2dFns;
    xDst.p3dFns = m_p3dFns;
    xDst.p3dViewport = m_p3dViewport;

    // Blt either 2D or 3D background to Dual-Surface
    if (m_nNeedLock & BKGLOCK)
        m_p2dFns->Lock(NULL, &m_SurfDesc, DDLOCK_WAIT, NULL);
    if (m_pBackground != NULL)
    {
        RECT rSrc = {0, 0, m_dwWidth, m_dwHeight};
        POINT ptDst = {0, 0};
        m_pBackground->Blt(&xDst, &ptDst, &rSrc);
    }
    if (m_nNeedLock & BKGLOCK) m_p2dFns->Unlock(NULL);
`
`
`
`
Draw RDX objects by invoking srfDraw on the Dual Surface’s m_hSurf member.

    if (m_bIsRdx)
    {
        m_p2dFns->Lock(NULL, &m_SurfDesc, DDLOCK_WAIT, NULL);
        srfSetDestMemory(m_hSurf, m_SurfDesc.lpSurface, m_SurfDesc.lPitch);
        srfDraw(m_hSurf);
        m_p2dFns->Unlock(NULL);
    }

Lock the DirectDraw surface and pass its data pointer to RDX using
srfSetDestMemory(). Then draw all objects using srfDraw(). RDX draws its
objects directly onto the surface with or without transparency.

    // Blt 3D triangle
    if (m_pTriangle != NULL)
        m_pTriangle->Blt(&xDst, &point);
    // SceneEnd
    m_p3dFns->EndScene();
    // offset dst rect accounting for client area
    long lRight = m_ptZeroZero.x + m_dwWidth;
    long lBottom = m_ptZeroZero.y + m_dwHeight;
    RECT rDst = {m_ptZeroZero.x, m_ptZeroZero.y, lRight, lBottom};
    RECT rSrc = {0, 0, m_dwWidth, m_dwHeight};
    // set palette and refresh screen
    gpPrimary->SetPalette(gpPalette);
    gpPrimary->Blt(&rDst, m_p2dFns, &rSrc, DDBLT_WAIT, NULL);
    return TRUE;
}
`
`
In the new FollowMouse() method that we have outlined above, we are
drawing our background first and then mixing in the RDX output (a
composite of all the RDX objects). Finally we add in our 3D object on top of the
RDX and background combo.
`
`
`You’ll probably point out that if we’re using RDX, we can have our back-
`ground be an RDX background (CBackgroundRDX) and not have to worry
`about any CBackground code either. That is true. Very astute of you! In
`fact, Table 17-2 has measurements of mixing the various 2D and 3D objects
`(the sprite measurements were for sixteen sprites of about 5,000 pixels each,
`and the background measurements were for a background of 734 X 475
`pixels).
`
TABLE 17-2 Measuring Mixed 2D and 3D Objects

CBackgroundCCode

CBackgroundP5

CBackgroundRDX

CBackgroundTex

CBackground3D

7.1 milliseconds
6.8 milliseconds

CSpriteRDX

46.5 milliseconds

3.8 milliseconds

CTriangleTex (Ramp/CopyMode)
CTriangle3D (Ramp/Flat)

1.1 milliseconds
16.2 milliseconds

2.2 milliseconds
`
Following are some observations based on the results:

• The MMX technology optimizations that RDX has used for background
  drawing make CBackgroundRDX run at the speed of color filling. Wow!
  There is a clear benefit to mixing 2D and 3D.

• With Ramp mode triangles being rendered in the low-millisecond
  speeds, our Execute Buffer overhead starts becoming important again.
  These tests were performed with only sixteen triangles. It becomes
  worthwhile to invest in code for long Execute Buffers when you are
  rendering many small triangles with the Ramp mode driver.

• Flat-shaded Ramp mode triangles compare well with spriting. However,
  texture mapping at 16 milliseconds (half the 30 fps budget) still takes
  quite a bit of time. A judicious mix of shaded and texture-mapped
  triangles would be the way to go. And, of course, using 2D sprites wherever
  possible is also a good way to go.
`
`Adding RDX Objects at Front and Back
`
`What if you want to add RDX objects behind and ahead of the 3D object?
`Well, RDX lets you create multiple surfaces. So you can solve this issue by
`creating two RDX surfaces and retaining one as the “behind” surface and
`
`
the other as the “ahead” surface. All objects attached to the behind surface
using objSetDestination() will get drawn behind the 3D object. And all
objects attached to the ahead surface will get drawn on top of the 3D object.
This is a simple extra credit exercise. Go on! Try it for yourself.
`
17.3 Mixing in Video
Mixing in video is a little more complex than mixing sprites or
backgrounds. The following factors need to be considered:

• Video files are actually a series of images that need to be displayed
  sequentially. To mix 3D on top of video, we would need to mix our 3D
  image whenever a new video frame is drawn, lest we “lose” sight of our
  3D object.

• A video file is recorded at a specific frame rate. Playback of frames in the
  video must be synchronized to a timer, so that they can be displayed at
  the recorded frame rate.

• Video files are usually recorded in high-color resolutions to capture the
  broad range of colors in natural situations. Video codecs prefer to choose
  their own palettes, since they reduce the color range for palettized
  displays. They typically produce very poor quality if they are forced to use a
  specified palette.
`
`The issues of synchronized drawing and timed playback are dealt with in
`detail in Chapter 10. We will use the same code to mix our 3D sprite on top
`of a video object.
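
The timing side of this can be sketched in a few lines. The code below is not the Chapter 10 code; the function names are ours, and it assumes a constant frame rate and a millisecond clock supplied by the caller.

```cpp
#include <cassert>

// Index of the video frame that should be on screen after elapsedMs of
// playback at a constant frame rate, clamped to the last frame.
long long FrameForTime(long long elapsedMs, long long framesPerSecond,
                       long long frameCount)
{
    assert(elapsedMs >= 0 && framesPerSecond > 0 && frameCount > 0);
    const long long frame = elapsedMs * framesPerSecond / 1000;
    return frame < frameCount ? frame : frameCount - 1;
}

// True when a new video frame is due. That is also the moment the 3D
// object has to be mixed again on top of the freshly drawn frame.
bool NeedsRemix(long long lastDrawnFrame, long long elapsedMs,
                long long framesPerSecond, long long frameCount)
{
    return FrameForTime(elapsedMs, framesPerSecond, frameCount)
           != lastDrawnFrame;
}
```

At 15 frames per second, for instance, frame 0 stays on screen through 66 milliseconds and frame 1 becomes due at 67 milliseconds, so a timer callback would redraw the video frame and re-mix the 3D object at that point.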
`
`Handling Palettes
`
`We do need to add some code to handle palettes. Our 2D objects use colors
`only from the system palette, and there is no palette conflict between 2D
`and Video objects. But Direct3D uses more than the system palette. Let’s
`look at the code needed to manage palettes among these media.
`
There is no fast and high-quality solution to sharing palettes. Our code
shows you how to communicate palettes amongst video and 3D objects.
Since video codecs don’t like palettes to be forced on them, we have written
our code to tell Direct3D to use the video object’s palette.
`
`Following is the code that takes a palette from a Video object file and uses
`this palette with Direct3D surfaces. Note that Direct3D expects the palette
`to be set on the 2D surface before any 3D functionality is requested.
`
`
`MIXING IN VIDEO I 271
`
`
`
`
`
Create a palette object.

LOGPALETTE *plogPalette;
PBYTE pTmp = new BYTE [sizeof (LOGPALETTE) + sizeof (PALETTEENTRY)*256];
plogPalette = (LOGPALETTE *)pTmp;
plogPalette->palVersion = 0x300;

Use RDX to talk to a video object and get its palette.

plogPalette->palNumEntries = 256;
if (!rdxGetVideoPalette(plogPalette))
    return FALSE;

Change palette entry flags to not allow D3D to change any of them.

PALETTEENTRY *pPal = (PALETTEENTRY *)(pTmp + sizeof(LOGPALETTE));
for (int i=0; i < 256; i++) pPal[i].peFlags = D3DPAL_READONLY;
`
`Querying for a palette from a video file takes a lot of steps. RDX simplifies
`these steps. So our code uses RDX to talk to the video codec. Refer back to
`Section 10.2 for an explanation of how to manage video with RDX. For
`quick reference, we’ve included here the essential code to query a palette
`from an AVI file using RDX:
`
`
`
BOOL rdxGetVideoPalette::GetPalette(LOGPALETTE *plogPalette, LPSTR *pFile)
{
    // first, create a hFile object and load our AVI file
    err = hfilCreate(&m_hFil);
    macExitIfRdxError(err, FALSE);
    err = hfilLoad(m_hFil, pFile);
    macExitIfRdxError(err, FALSE);

    // create the AV object and initialize it with the video file
    err = avCreate(&m_hAv);
    macExitIfRdxError(err, FALSE);
    err = avAddVideoTrack(m_hAv, m_hFil, 0, &m_hVid);
    macExitIfRdxError(err, FALSE);

    // get the palette from the video object
    err = vidGetPalette(m_hVid, plogPalette);
    macExitIfRdxError(err, FALSE);
    return TRUE;
}
`
17.3.2 Using Video as a Texture Map

After seeing how to mix 3D and video on a DirectDraw surface, it is a fairly
simple extrapolation to use video as a texture map. We merely modify
our previous code to provide the Texture Map Address when we call
srfSetDestMemory().
`
`
`
`
`Run the demo for this chapter on the CD and check the Texture Mapped
`Video option.
`
`WHAT HAVE
`YOU LEARNED?
`
By the end of the chapter, you should have learned how to

• mix a 3D object on a 2D background using Direct3D and DirectDraw;
• mix a 3D object with RDX sprites and a background (Direct3D, DirectDraw, or RDX);
• mix a 3D object on top of a video file where the video file is played through RDX and
  can be either VFW or ActiveMovie based; and
• make the simple modification needed to use video as a texture map source (that is, if
  you perused the source code on the CD).
`
`You've reached the end of our 3D coverage. We hope you have learned a lot.
`
PART VI

Processors and
Performance Optimization

WE'D LIKE TO EXTEND AN ACKNOWLEDGEMENT TO FRANK BINNS, SHUKY ERLICH, BRUCE BARTTLET, JULIE A. BRAJENOVICH, K. SRIDHARAN,
RICK MANGOLD, BOB FABER, BEV BACKMAYER, DEBBIE MARR, BOB REESE, TOM WALSH, MICKEY GUTTMAN, BENNY EITAN, KOBY
GOTTLIEB, ODED LEMPEL, AND DAVID BISTRY
`
Chapter 18   The Pentium Processor Family
• Basic processor terms
• Overview of Pentium and Pentium Pro processors
• Overview of MMX technology
• Identifying processor models and features using CPUID

Chapter 19   The Pentium Processor
• Detailed overview of processor components
• Instruction pipelining
• Integer pairing and scheduling rules
• Address Generation Interlock (AGI)
• Branch prediction
• How to optimize the sprite sample

Chapter 20   The Pentium Processor with MMX Technology
• MMX architecture
• Instruction set and data types
• EMMS usage guideline
• Saturation versus wraparound
• MMX pairing and scheduling rules
• Optimizing the MMX sprite
• Using scheduling rules to optimize the sprite
`
`
`
`
`274 I
`
`PROCESSORS AND PERFORMANCE OPTIMIZATION
`
Chapter 21   VTune and Other Performance Optimization Tools
• VTune's coverage of pairing and scheduling rules
• Static and dynamic analysis
• Hot-spot monitor and time-based and event-based sampling
• VTune usage hints
• Read Time Stamp Counter (RDTSC)
• Using the PMonitor library

Chapter 22   The Pentium II Processor
• Architectural overview and new features
• Pentium II performance counters
• MMX pairing rules
• Detailed component description including event counters for each unit
• Write Combining memory type to speed graphics performance
• Branch mispredictions, partial stalls, and the 4:1:1 decoder template

Chapter 23   Knowing Your Data and Optimizing Memory
• Overview of memory subsystem
• Differences between Pentium and Pentium Pro member processors
• Cache differences, DCU splits, partial memory stalls
• MMX stack alignment
• Accessing cached memory
• Writing to video memory
`
`When it comes to developing multimedia applications, you'll quickly realize that you're
`dealing with a huge amount of data—most of which is typically used once or twice and
`then thrown away. Unlike database, word processing, or transaction—based applications,
`multimedia applications must quickly display a sequence of pictures to give the illusion of
`motion; they must pump audio data in real time to play uninterrupted sound sequences;
or they must render a 3D model to give the illusion of a 3D world. There are lots of
`calculations to make, lots of data to move around. In order to get smooth motion video,
`audio, and 3D, you still have to fine-tune your applications for the platform they are
`running on.
`
`We decided to include this section because we believe that multimedia applications and
`processor optimization go hand in hand—at least for now. Some developers think optimi-
`zation is an art; some think it's a science. We think it's a mix of both.
`
First we cover the Pentium family processors, their architecture, and how they work with
code and data. We optimize our sprite sample for each of the processors we cover: the
Pentium processor, the Pentium processor with MMX technology, and the Pentium II.
`
`
When you think about optimizing multimedia applications, don't just think about applying
the optimization rules of the processor (pairing, AGIs, register contentions, and so forth);
`you should first think about your data access pattern. Optimizing for the processor is most
`useful when you access the data in the L1 cache. From our experience, you should not
`try to squeeze every cycle out of your code; you need only focus on the code segments
`that are called very often and those that consume most of the CPU cycles.
`
`Rather than telling you how to optimize your code, we'll tell you how we go about opti-
`mizing ours. Once the code is written, we typically use Intel's VTune to figure out how to
`schedule instructions for optimal pairing and how the code behaves when it runs on the
`PC—that's the science part. Since VTune does not kn