`
`SAMSUNG 1010
`
`
`
DIRECTX®, RDX, RSX,
AND MMX™ TECHNOLOGY
`
`
`
DIRECTX®, RDX, RSX,
AND MMX™ TECHNOLOGY

A JUMPSTART GUIDE TO
HIGH PERFORMANCE APIs

Rohan Coelho
and
Maher Hawash

ADDISON-WESLEY DEVELOPERS PRESS
An imprint of Addison Wesley Longman, Inc.

Reading, Massachusetts - Harlow, England - Menlo Park, California
Berkeley, California - Don Mills, Ontario - Sydney
Bonn - Amsterdam - Tokyo - Mexico City
`
`
`
`Many of the designations used by manufacturers and sellers to distinguish their prod-
`ucts are claimed as trademarks. Where those designations appear in this book, and
`Addison-Wesley was aware of a trademark claim, the designations have been printed in
`initial capital letters or all capital letters.
`
MMX™ Technology, Pentium®, Pentium® II, Pentium® Pro, and Pentium® with MMX
technology are registered trademarks of Intel Corporation.
`* Other brands and names are the property of their respective owners.
`
`The authors and publisher have taken care in preparation of this book, but make no
`expressed or implied warranty of any kind and assume no responsibility for errors or
`omissions. No liability is assumed for incidental or consequential damages in connec-
`tion with or arising out of the use of the information or programs contained herein.
`
Library of Congress Cataloging-in-Publication Data

Coelho, Rohan.
DirectX®, RDX, RSX, and MMX™ technology : a jumpstart guide to high
performance APIs / Rohan Coelho and Maher Hawash.
p. cm.
Includes index.
ISBN 0-201-30944-0
1. Multimedia systems. 2. DirectX. 3. Intel Realistic display mixer.
4. RSX (Computer file : Digital Equipment Corporation) 5. MMX technology.
I. Hawash, Maher. II. Title.
QA76.575.C64 1998
006.7768--dc21
97-33102
CIP
`
`Copyright © 1998 by Intel Corporation
`
`All rights reserved. No part of this publication may be reproduced, stored in a retrieval
`system, or transmitted, in any form or by any means, electronic, mechanical, photo-
`copying, recording, or otherwise, without the prior written permission of the publisher.
`Printed in the United States of America. Published simultaneously in Canada.
`
`Sponsoring Editor: Mary Treseler
`Project Manager: John Fuller
`Cover design: Chris Norum
`Text design: Vicki Hochstedler
`Set in 11-point Minion by Octal Publishing
`
1 2 3 4 5 6 7 8 9-MA-0100999897
`
`First printing, December 1997
`
`Addison-Wesley books are available for bulk purchases by corporations, institutions,
`and other organizations. For more information please contact the Corporate, Govern-
`ment, and Special Sales Department at (800) 238-9682.
`
Find us on the World-Wide Web at:

http://www.awl.com
`
`
`
To my parents, Mofeed and Sameeha;
To my wife Lisa, my son Jared, and baby on its way;
To my nephew Ahmad and the rest of my family;
I dedicate this book. --Maher

To my immediate family: Dad, Mom, Gail, Carmen, and Sarah;
To my extended family, blood relatives and others;
And to several others, significant but unnamed;
Thanks for touching my life. --Rohan

Special thanks to
Emilie Lengel and Gerald Holzhammer
For believing in us.
`
`
`
Contents

Preface
Introduction: Organization and Conventions
`
PART I
SURVEYING MULTIMEDIA

CHAPTER 1
OVERVIEW OF MEDIA ON THE PC

1.1 Background
1.2 Graphics Device Independence
1.3 Motion Video under Windows
1.4 Multimedia Gaming under Windows 95
1.5 3D Video Architectures on the PC
1.6 Audio Architectures on the PC

CHAPTER 2
PROCESSOR ARCHITECTURE OVERVIEW

2.1 Processor Architecture
2.2 System Overview
`
PART II
ANIMATED GRAPHICS, SPRITES, AND BACKGROUNDS

CHAPTER 3
SIMPLE SPRITES IN GDI

3.1 Graphics Device Interface (GDI) Overview
3.2 Animation Objects
    3.2.1 Sprites
    3.2.2 Backgrounds
3.3 Transparent Blts with GDI
3.4 Drawing a Sprite Using GDI
3.5 Backgrounds
3.6 Demo Time
3.7 How Fast Does GDI Draw Sprites and Backgrounds?

CHAPTER 4
SPRITES WITH DIRECTDRAW PRIMARY SURFACES

4.1 Introduction to Microsoft’s DirectDraw
4.2 Features of DirectDraw
4.3 Before You Get Overly Excited
4.4 Instantiating a DirectDraw Object
4.5 Querying and Creating a Primary Surface
4.6 Implementing a Simple Sprite Class
4.7 Drawing a Sprite on the DirectDraw Primary Surface
4.8 Demo Time
4.9 Redrawing Backgrounds on a DirectDraw Primary Surface
4.10 How Fast Can We Draw Sprites and Backgrounds?
4.11 Compositing Objects on a DirectDraw Primary Surface

CHAPTER 5
HARDWARE ACCELERATION VIA DIRECTDRAW

5.1 Creating an Offscreen Surface
5.2 Drawing a Sprite on the DirectDraw Offscreen Surface
5.3 Demo Time
5.4 How Fast Is OffScreen Surface Drawing?
5.5 Finding Hardware Acceleration
5.6 Setting Up for Hardware Acceleration
5.7 How Fast Is CSurfaceVidMem Drawing?
5.8 Accelerating Offscreen to Primary Transfers by Page Flips
    5.8.1 What Is Graphics Page Flipping?
    5.8.2 DirectDraw Page Flipping Model
    5.8.3 Does the Hardware Support Page Flipping?
    5.8.4 Setting Up DirectDraw to Use Page Flipping
    5.8.5 “Rendering” Flippable Surfaces
5.9 How Fast Is CSurfaceBackBuffer Drawing?
5.10 Hardware Acceleration to Blt Sprites
5.11 How Fast Is CSpriteGrfx (and CBackgroundGrfx) Drawing?

CHAPTER 6
RDX: HIGH-PERFORMANCE MIXING WITH A HIGH-LEVEL API

6.1 Introduction to Intel’s RDX Animation Library
    6.1.1 Features of RDX
    6.1.2 Before You Get Overly Excited
6.2 Using RDX
    6.2.1 Generic Objects with RDX
    6.2.2 The Programming Model
6.3 Working with RDX
    6.3.1 Creating an RDX Surface
    6.3.2 An RDX Sprite Class
    6.3.3 Drawing the RDX Sprite
6.4 Demo Time
    6.4.1 How Fast Does CSurfaceRdX Draw?
6.5 Hardware Acceleration with RDX
    6.5.1 Full Screen Mode with RDX
    6.5.2 How Fast Does CSurfaceRdX Draw in Full Screen Mode?
    6.5.3 Accelerating Objects with RDX
`
PART III
MAKING THE MEDIA MIX

CHAPTER 7
VIDEO UNDER WINDOWS

7.1 Concepts of Motion Video
7.2 Capturing and Compressing Video
7.3 Windows Multimedia Architectures
7.4 Overview of Video Codecs

CHAPTER 8
DIRECTSHOW FILTERS

8.1 DirectShow Components
8.2 What’s a Filter Graph?
8.3 Understanding the Mighty Filter
8.4 An Overview on the Samples
8.5 Creating a Source Filter
    8.5.1 The Source Filter Class
    8.5.2 Create an Instance of the Source Filter
    8.5.3 The Source Stream Class
    8.5.4 The Connection Process
    8.5.5 Starting and Stopping
    8.5.6 Moving the Data
8.6 Creating a Transform Filter
8.7 Creating a Rendering Filter
8.8 Adding Your Own Interface
8.9 Adding Property Pages to Filters
    8.9.1 Adding the Property Interface to the Filter
    8.9.2 Implementing the Property Page Interface
8.10 Adding a Filter to the Registry
    8.10.1 Using a Registry File Is Not Recommended
    8.10.2 Using Filter Self-Registration Is Recommended

CHAPTER 9
DIRECTSHOW APPLICATIONS

9.1 DirectShow Mechanisms for Working on Filter Graphs
9.2 COM: Automatic Construction of Filter Graphs
9.3 COM: Manual Construction of Filter Graphs
    9.3.1 Adding Source Filters
    9.3.2 Adding Transform and Rendering Filters
    9.3.3 Connecting the Two Pins
9.4 COM: Accessing Custom Interfaces
9.5 COM: Showing Filter Property Pages
9.6 Creating Events under DirectShow
9.7 ActiveX: A Simple Way to Control DirectShow
    9.7.1 Playing a File Using the ActiveX Interface
    9.7.2 Controlling the ActiveX Control from Your Application
9.8 ActiveX: Handling Events

CHAPTER 10
MIXING SPRITES, BACKGROUNDS, AND VIDEOS

10.1 Introduction to Mixing
    10.1.1 Mixing Sprites with Video
    10.1.2 Mixing Animation with Video
10.2 Mixing with RDX
    10.2.1 Playing Video with the RDX DirectShow Interface
    10.2.2 Mixing a Sprite on Top of Video
    10.2.3 Mixing Video on Video

CHAPTER 11
STREAMING DOWN THE SUPERHIGHWAY WITH REALMEDIA

11.1 Overview of RealMedia
11.2 The RealMedia Plug-in Architecture
11.3 Data Flows: Server to Client
11.4 Data Management Objects
    11.4.1 IRMABuffer: Dynamic Memory Allocation Object
    11.4.2 IRMAValues: Indexed List Object
    11.4.3 IRMAPacket: Packet Transport Object
11.5 RealMedia Asynchronous Interfaces
11.6 Common Requirements for All Plug-ins
11.7 Building a File-Format Plug-in
    11.7.1 Initializing the File-Format Plug-in
    11.7.2 File and Stream Headers
    11.7.3 Let the Streaming Begin!
11.8 Building a Rendering Plug-in
11.9 RealMedia Audio Services
    11.9.1 Playing a Simple Pulse Coded Modulation (PCM) Audio File
    11.9.2 Pump Up the Volume
`
PART IV
PLAYING AND MIXING SOUND WITH DIRECTSOUND AND RSX 3D

CHAPTER 12
AUDIO MIXING WITH DIRECTSOUND

12.1 Overview of Audio under Windows 95
12.2 DirectSound Features
12.3 DirectSound Architecture
12.4 Playing a WAV File Using DirectSound
    12.4.1 Initializing DirectSound
    12.4.2 DirectSound Structures
    12.4.3 Creating Sound Buffers
    12.4.4 Playing the Sound
    12.4.5 Demo Time
    12.4.6 Mixing Two WAV Files
12.5 Controlling the Primary Sound Buffer
    12.5.1 Initializing to Get Control of the Output Format
    12.5.2 Creating a Primary DirectSound Buffer
    12.5.3 Demo Time

CHAPTER 13
REALISTIC 3D SOUND EXPERIENCE: RSX 3D

13.1 RSX 3D Features
13.2 Creating an RSX 3D Object
13.3 Play One WAV File
13.4 Play Another WAV File
13.5 Mixing Many WAV Files
13.6 RSX Goes 3D--True 3D Sound
13.7 Setting Up 3D Sound with RSX 3D
13.8 Adding Special Sound Effects with RSX 3D
    13.8.1 The Doppler Effect
    13.8.2 The Reverberation Effect
13.9 Audio Streaming in RSX 3D
`
PART V
WELCOME TO THE THIRD DIMENSION

CHAPTER 14
AN INTRODUCTION TO DIRECT3D

14.1 Some Background on 3D on the PC
14.2 Introduction to Direct3D
    14.2.1 A Taste of Direct3D’s Retained Mode
    14.2.2 Direct3D’s Immediate Mode
    14.2.3 Before You Get Overly Excited
14.3 Inside Direct3D
    14.3.1 Direct3D and DirectDraw
    14.3.2 Direct3D Rendering Engine
14.4 Revving Up Direct3D
    14.4.1 The Starting Point: IDirect3D Object
    14.4.2 Enumerating IDirect3DDevices
    14.4.3 Creating an IDirect3DDevice
    14.4.4 Preparing a Palette
    14.4.5 Extending the Surface for 3D
    14.4.6 Mapping from a 3D Model to the 2D Surface Using Viewports
    14.4.7 Talking to 3D Devices Through Execute Buffers
    14.4.8 Execute Operations
    14.4.9 Operations Used to Render a Simple Triangle
    14.4.10 Executing the Execute Buffers
    14.4.11 Seeing Results from 3D Devices
14.5 Demo Time

CHAPTER 15
EMBELLISHING OUR TRIANGLE WITH BACKGROUNDS, SHADING, AND TEXTURES

15.1 Continuing Our Look into Direct3D
15.2 Repainting the Background Using Direct3D
    15.2.1 Looking at Direct3D Materials
    15.2.2 Creating a Direct3D Background
    15.2.3 Bltting a Direct3D Background
15.3 Controlling Shading Options
    15.3.1 Looking at Some Render States and Their Default Values
    15.3.2 Coloring a Pixel in Direct3D
    15.3.3 Shading with the RGB Color Model
    15.3.4 Shading with the Ramp Color Model
    15.3.5 Changing Default Render States
15.4 Texture Mapping with Direct3D
    15.4.1 Creating a Texture Map
    15.4.2 Setting Up Triangle Vertices for Texture Mapping
    15.4.3 Setting Up Render Operations for Texture Mapping
    15.4.4 Handling “Lit” Texture Maps
15.5 Z-Buffering with Direct3D
    15.5.1 Why Bother with Z-Buffering?
    15.5.2 Setting Up for Z-Buffering

CHAPTER 16
UNDERSTANDING AND ENHANCING DIRECT3D PERFORMANCE

16.1 How Fast Does Our Triangle Run?
    16.1.1 Stages of Rendering Our Triangle
    16.1.2 Measuring the Rendering Stages of Our Triangle
    16.1.3 Trimming Some Fat from the Rendering Stages
16.2 Measuring Shading Options
    16.2.1 Measuring the Performance of Shading Options in Our Triangle
    16.2.2 Measuring the Performance of Texture-Mapping in Our Triangle
    16.2.3 Adding a Z-Buffer to the Recipe
    16.2.4 Getting Perspective: Comparing 3D (RGB Mode) to 2D
16.3 Improving Performance Using the Ramp Driver
    16.3.1 Loading the Ramp Color Model Driver
    16.3.2 Using the Ramp Driver--The First Try
    16.3.3 Creating Materials for the Ramp Driver
    16.3.4 Rendering a Triangle with the Ramp Driver
    16.3.5 How Does the Ramp Driver Perform?
16.4 Optimizing Texture Mapping

CHAPTER 17
MIXING 3D WITH SPRITES, BACKGROUNDS, AND VIDEOS

17.1 Mixing a 3D Object on a 2D Background
    17.1.1 Our 3D Surface Is Also a 2D Surface
    17.1.2 Measuring Background Performance
17.2 Mixing in Sprites
    17.2.1 Using RDX to Mix in Sprites
    17.2.2 Adding RDX Objects at Front and Back
17.3 Mixing in Video
    17.3.1 Handling Palettes
    17.3.2 Using Video as a Texture Map
`
PART VI
PROCESSORS AND PERFORMANCE OPTIMIZATION

CHAPTER 18
THE PENTIUM PROCESSOR FAMILY

18.1 Basic Concepts and Terms
18.2 The Pentium Processors
    18.2.1 The Pentium Processor
    18.2.2 The Pentium Pro Processor
    18.2.3 The Pentium Processor with MMX Technology
    18.2.4 The Pentium II Processor
18.3 Identifying Processor Models

CHAPTER 19
THE PENTIUM PROCESSOR

19.1 Architectural Overview
19.2 Instruction and Data L1 Caches
    19.2.1 Operational Overview
    19.2.2 Performance Considerations
19.3 Instruction Prefetch
    19.3.1 Operational Overview
    19.3.2 Performance Considerations
19.4 Branch Prediction and the Branch Target Buffer
    19.4.1 Operational Overview
    19.4.2 A Closer Look at the BTB
    19.4.3 Performance Considerations
19.5 Dual Pipelined Execution
    19.5.1 Operational Overview
    19.5.2 Performance Considerations
    19.5.3 Pentium Integer Pairing Rules
    19.5.4 Address Generation Interlock (AGI)
19.6 Write Buffers
    19.6.1 Operational Overview
    19.6.2 Performance Considerations
19.7 Revisiting Our Sprite Sample
    19.7.1 Overview of the Assembly Version of CSprite
    19.7.2 Analyzing the Performance of Our Sprite Sample
    19.7.3 Do I Really Need to Schedule My Code?

CHAPTER 20
THE PENTIUM PROCESSOR WITH MMX TECHNOLOGY

20.1 A Look at MMX Technology
20.2 SIMD
20.3 Architectural Overview
    20.3.1 The Pool of Four Write Buffers
    20.3.2 MMX Uses Floating-Point Registers
    20.3.3 EMMS to the Rescue: How to Mix MMX and FP Instructions
20.4 MMX Technology Data Types
20.5 The MMX Instruction Set
20.6 Using MMX Technology to Render Our Sprite Sample
20.7 MMX Technology Optimization Rules and Penalties
    20.7.1 MMX Exceptions to General Pentium Rules
    20.7.2 MMX Instruction Pairing Rules
    20.7.3 MMX Instruction Scheduling Rules
20.8 Performance Analysis of Our Sprite
    20.8.1 MMX versus Integer Implementation of the Sprite

CHAPTER 21
VTUNE AND OTHER PERFORMANCE OPTIMIZATION TOOLS

21.1 Overview of Performance Counters
21.2 Introducing VTune
    21.2.1 VTune Static Analysis
    21.2.2 VTune Dynamic Analysis
    21.2.3 Systemwide Monitoring--Time- and Event-Based Sampling
21.3 Read Time Stamp Counter
21.4 The PMonitor Event Counter Library

CHAPTER 22
THE PENTIUM II PROCESSOR

22.1 Architectural Overview
    22.1.1 The Life Cycle of an Instruction on the Pentium II
    22.1.2 Comparing the Pentium II with the Pentium Pro Processor
    22.1.3 Comparing the Pentium II with the Pentium with MMX Technology Processor
22.2 Instruction and Data Caches
    22.2.1 Operational Overview
    22.2.2 Performance Considerations
22.3 Instruction Fetch Unit
    22.3.1 Operational Overview
    22.3.2 Performance Considerations
    22.3.3 Fetch Performance with Event Counters
22.4 Branch Prediction and the Branch Target Buffer
    22.4.1 Operational Overview
    22.4.2 Performance Considerations
    22.4.3 Branch Performance with Event Counters
22.5 Instruction Decoders
    22.5.1 Operational Overview
    22.5.2 Performance Considerations
22.6 Register Alias Table Unit
    22.6.1 Operational Overview
    22.6.2 Performance Considerations
22.7 Reorder Buffer and Execution Units
    22.7.1 Operational Overview
    22.7.2 Performance Considerations
22.8 Retirement Unit
22.9 Rendering Our Sprite on the Pentium II
22.10 Speed Up Graphics Writes with Write Combining
    22.10.1 Operational Overview
    22.10.2 Performance Considerations

CHAPTER 23
MEMORY OPTIMIZATION: KNOW YOUR DATA

23.1 Overview of the Memory Subsystem
    23.1.1 Architectural Overview
    23.1.2 Memory Pages and Memory Access Patterns
    23.1.3 Memory Timing
    23.1.4 Performance Considerations
23.2 Architectural Differences among the Pentium and Pentium Pro Processors
    23.2.1 Architectural Cache Differences
    23.2.2 Write Buffer Differences
    23.2.3 Data Controlled Unit Splits on the Pentium Pro Processor
    23.2.4 Partial Memory Stalls
23.3 Maximizing Aligned Data and MMX Stack Accesses
    23.3.1 The Pitfalls of Unaligned MMX Stack Access
23.4 Accessing Cached Memory
23.5 Writing to Video Memory
    23.5.1 Using Aligned Accesses to Video Memory
    23.5.2 Spacing Out Writes to Video Memory with Write Buffers
`
EPILOGUE: THE FINALE

E.1 The Spiral Continues
    E.1.1 The Hardware Spiral
    E.1.2 The Software Spiral
E.2 Remote Multimedia (a.k.a. Internet Multimedia)
    E.2.1 Internet Languages
    E.2.2 Multimedia on the Internet
    E.2.3 Evolving Hardware for the Internet
    E.2.4 Multimedia Conferencing
E.3 Better, Faster, Cheaper 3D
    E.3.1 3D Hardware Spiral
    E.3.2 3D Software Spiral
    E.3.3 3D Scalability
    E.3.4 Emerging Application Areas
E.4 Multimedia in the Home
E.5 Demo Time
E.6 Some Web Sites for Further Reading

INDEX

CD-ROM LICENSE AGREEMENT NOTICE
`
`
`
`Why Read This Book?
`
`There's Lots of New Stuff to Learn
`
In the past few years, the pace of technology growth has been exhilarating.
Microsoft launched Windows 95. Intel debuted the Pentium, Pentium Pro,
and MMX technology processors. Netscape burst the Internet pipe with a
new class of applications and architectures. These companies and others
paraded out a slew of new multimedia architectures. And you’ve never
before felt so lost in space.
`
Maybe you’re familiar with programming for Windows 95 and now want to
deliver Windows 95 multimedia applications, and you’re wondering where
to start. Or maybe you’ve programmed multimedia for DOS/Windows 3.1,
and now you’re scrambling to learn Windows 95, learn the new computing
environment, and then learn to deliver high-performance multimedia in
this environment.
`
Well, several new architectures have been introduced to help you deliver
high-performance multimedia under Windows 9x,¹ such as DirectDraw,
DirectSound*, Direct3D*, DirectShow*, RealMedia*, Realistic Sound
Experience (3D RSX), Realistic Display Mixer (RDX), and so forth. But now
you’ve got to learn these new architectures, and you’ve got this steep
learning curve on your hands.

1. Windows 9x stands for both Windows 95 and the upcoming Windows 98.
`
On the hardware frontier, the power of personal computers has increased at
a dramatic pace, both in processor and peripheral power. The Pentium,
Pentium Pro, Pentium II, and MMX technology processors, the accelerated
graphics port (AGP) bus, and the various graphics hardware accelerators are
recent hardware advancements that affect multimedia performance. Surely
your applications would sizzle if you mastered these advancements. But
mastering these advancements only increases the learning curve.
`
And, of course, the Internet adds yet another dimension to the puzzle. The
new programming space includes Internet browsers and their plug-ins;
programming languages such as Java, HTML, and VRML; Internet
architectures such as ActiveX and RealMedia; and a huge list of
applications such as Internet Phones and Chat Worlds. More to learn, more
to wade through, more time to spend.
`
`Lightening the Learning Burden
As multimedia developers, we constantly investigate, evaluate, or learn
these new technologies. Our typical sources are technical reference manuals
and sample applications. With so many recent products, we’ve got a huge
quantity of material to wade through. When time is precious, as it
invariably is, just getting started can be an overwhelming problem. Spending
time getting started eats away at the time allocated for finishing touches
and product testing. And overall quality suffers when we’ve spent too much
time just getting up to speed.
`
Wouldn’t it be nice if there were a simple way to just get started? To grasp
the bare essentials and leave the esoteric stuff for on-the-job training (those
need-to-know moments)? To steer clear of performance pitfalls? Well, do
we have a deal for you. We, the authors, have been involved in various
aspects of multimedia development on the PC for five long years. Through
our employment at Intel and through our relationships with Microsoft and
other key players, we’ve had the privilege to influence the architectures of
processors, peripherals, platforms, and software components toward the
betterment of multimedia on the PC. During that time, we’ve done our fair
share of defining, reviewing, and implementing numerous multimedia
architectures, both software and hardware.
`
With this book, we hope to use our internal vantage point to give you a
jump start to high-performance multimedia development for Windows 9x.
We’d like to help you cut to the chase; focus on the bare necessities; stick to
the essentials; and jump-start a variety of offerings. What’s more, we're
hoping to take you a step beyond getting started, to extracting performance.
`
We hope to provide you with a quick start to a wide spectrum of multimedia
advancements for Windows 9x. We hope to answer questions like
Where do I start? What do I really need? How little can I get away with? How
do I get it to run faster?
`
A dose of caution: there’s more than one way to get jump-started and more
than one way to extract performance. We’ll share our experiences with you,
show you “a” way. We hope you’ll come away with some tricks, of course,
but more important, we hope you’ll come away with a thought process, an
approach.

We’ve tried to maintain a light flavor. We hope you’ll have some fun along
the way.
`
`
`
`
`INTRODUCTION
`
`
`
`Organization
`and Conventions
`
WHY READ
THIS CHAPTER?
`
`Since we're talking about the organization of the chapters, it's only appropriate to note that
`all chapters start with the question above: "Why Read This Chapter?" Our purpose is to
`present you with a summary of what we intend to cover in the chapter. We recommend
`that you read the segment to see if what you will get is what you want.
`
`This chapter shows you how we arranged the book, to help you get the most benefit out
`of it. In the following pages, we
`
`describe who we wrote the book for,
`
`show you how we present our material,
`outline the organization of the book, providing overviews of each chapter,
`show some conventions we use to highlight information, and
`list the tools that you'll need when working with the companion CD.
`
I.1 About the Book
`
`When we started to outline the material for this book, we quickly recog-
`nized that we would be covering a lot of ground. We struggled with what to
`present and what to ignore. We asked ourselves, “What kind of a book
would we have wanted when we started doing whatever we started?”

I.1.1 Where We're Coming From
`Because of our roles at Intel, we’ve had the good fortune to work on Win-
`dows multimedia architectures right from their infancy. In our work we
`applied both our architectural and our CPU optimization skills, and we used
`them across a wide range of multimedia avenues.
`
Of late, we’d been called upon to help a number of software companies
with their multimedia problems. Intel funded and continues to fund these
software activities, in the interest of encouraging overall PC sales by
promoting new uses for the PC; and in the interest of boosting demand for
newer, higher-performance PCs, by promoting CPU-intensive applications.
`
To address multimedia performance issues, we would typically optimize
critical sections of the assembly code. However, when the performance
bottlenecks were at the system level, we would have to demonstrate the use
of (or even develop) appropriate Windows multimedia architectures.
`
`And this led us to think that we could write a book to offer the same thing
`to a larger audience, to help others get started on a number of different
`multimedia architectures, to help others extract a lot of performance from
`the PC multimedia architecture.
`
I.1.2 Where We're Not Venturing
We can’t claim to be The Experts in PC multimedia. The field is too big, and
there are too many excellent software engineers out there for us to presume
such a status. Nonetheless we feel we’ve been down some paths before and
can share that experience with you, to get you started.

We didn’t want to delve deeply into the gory details of any single
architecture; that’s what the reference documents are for. Instead, we decided
it would be better to get you started with the architectures, and we’re sure
that your application needs will steer your further learning.

On the flip side, with the breadth of architectures we wanted to cover, we
knew we would have to skip basic concepts to do the architectures any
justice. So we’ve presumed some prerequisite knowledge and targeted the book
to reasonably experienced programmers. We also narrowed our selections to
focus on recent/emerging advancements so as to avoid merely putting a
fresh spin on previously published information.
`
I.1.3 Who Should Read This Book
`
`OK, so who did we think we could help? It was clear to us that our readers
`would
`
`already know how to program under Windows,
`
`understand multimedia concepts and terminology,
`
`be familiar with programming with C, C++, and for some sections, even
`assembly language (Intel Architecture), and
`
appreciate, or even prefer, a hands-on learning approach (like to learn by
being pointed in the right direction and then be free to find their own
way around).
`
I.2 Chapter Organization
`
`Armed with a clearer picture of our identity and our readers, we were able
`to outline our approach. On the one hand, we wanted to get our readers
`started quickly on the latest multimedia architectures. On the other hand,
`we wanted to show them how to extract high performance on Intel Archi-
`tecture multimedia PCs. Ergo, we have decided to provide simple samples!
`
We have partitioned the book into six major parts. Each part focuses on a
specific area of multimedia, with its chapters sequentially building on each
other. We specifically tried to use the same or similar samples within each
part. There are a total of twenty-three chapters in the book. We
concentrated on making each chapter brief, less than thirty pages each, so
that wordiness wouldn’t dilute our subject matter. We deliberately chose the
compact format to improve retention (make it less likely for readers to
forget what was said before).
`
`Let’s take a closer look at what we cover in each of the parts/chapters.
`
`Part I: Surveying Multimedia
`
Chapter 1 Overview of Media on the PC. This chapter gives just a small
overview of current multimedia architectures on the PC. We give a brief
pass on the Graphics Device Interface (GDI), DirectDraw, DirectSound,
Direct3D, DirectShow, Realistic Display Mixer (RDX), and Realistic Sound
Experience (3D RSX).

Chapter 2 Processor Architecture Overview. Here we approach media
from a hardware perspective. We give a high-level architectural overview of
the Pentium, Pentium Pro, the Pentium processor with MMX technology,
and the Pentium II processors. We also touch on the system point of view
and why it is essential to optimize for the system as well as for the processor.
`
`Part II: Sprites, Backgrounds, and Primary Surfaces
`
Chapter 3 Simple Sprites in GDI. This chapter introduces the concept of
transparent sprites and backgrounds under Windows. We show you how to
draw backgrounds and transparent sprites using GDI.

Chapter 4 Sprites with DirectDraw Primary Surfaces. We take our sprite
to the next level with a DirectDraw Primary surface. We show you how to
create a Primary surface to get direct access to the display screen. We then
rewrite the sprite to be drawn onto a Primary surface and compare its
performance with the GDI implementation.

Chapter 5 Hardware Acceleration via DirectDraw. Here we show you how
to implement our beloved sprite using hardware Bltters on graphics
adapters. We then show you how to use Page Flipping hardware to minimize
the cost of double-buffering incurred in the Primary surface implementation.
Finally, we compare the performance gain of this implementation with the
Primary surface implementation.

Chapter 6 RDX: High-Performance Mixing with a High-Level API. Realistic
Display Mixer (RDX) provides a high-level mixing interface without
sacrificing performance. RDX uses hardware acceleration if available;
otherwise it uses assembly code tuned for various processor flavors. We show
you how to implement sprites with RDX, and we compare the performance of
this implementation to GDI and DirectDraw implementations.
`
`Part III: Making the Media Mix
`
Chapter 7 Video under Windows. This chapter introduces current
multimedia architectures under Windows, including Multimedia Command
Interface (MCI), Video for Windows (VFW), QuickTime for Windows
(QTW), and ActiveMovie.

Chapter 8 DirectShow Filters. We start with an overview of the DirectShow
filter graph architecture and show you how to use the graph editor to
manipulate filters. We then show you how to build source, transform, and
rendering filters, and explain how the connection mechanism works. Next
we discuss filter registration, custom interfaces, and filter property pages.

Chapter 9 DirectShow Applications. Building on the previous chapter, we
show you how to use filters from an application. We show you how to build
a filter graph directly using the DirectShow COM interface and the
DirectShow control interface. We then show you how to access custom
interfaces and property pages.

Chapter 10 Mixing Sprites, Backgrounds, and Videos. In this chapter we
show you how to use RDX to access DirectShow filters. We also explain how
simple it can be to overlay a sprite on top of a video and even a video on top
of another video.

Chapter 11 Streaming down the Superhighway with RealMedia. In this
chapter we look at the latest architecture from RealNetworks, which is a
cross-platform architecture. We’ll show you how to build custom File-Format
and Rendering plug-ins, which allow you to stream custom data types
over the Internet. We’ll also show you how to use RealMedia audio services.
`
Part IV: Playing and Mixing Sound with DirectSound
and RSX 3D

Chapter 12 Audio Mixing with DirectSound. We start the chapter with an
overview of Microsoft’s DirectSound. Then we show you how to play a
simple WAV file. We then teach you how to mix two sound files and how to
control the format of the final output after mixing.

Chapter 13 Realistic 3D Sound Experience: RSX 3D. RSX provides a
high-level programming model optimized for the Intel Architecture. We
start the chapter with an overview of Intel’s RSX 3D audio, and then we
show you how to play one or more WAV files with it. We then give you an
overview of RSX’s 3D sound model and show you how to achieve a realistic
sound experience with it.
`
`Part V: Welcome to the Third Dimension
`
Chapter 14 An Introduction to Direct3D. We kick off our 3D section with
background on 3D on the PC and an overview of Microsoft’s Direct3D.
Then we discuss Direct3D’s modes and its Immediate mode architecture.
The main purpose of this chapter is to give you the bare minimum code
needed to render a triangle in Direct3D’s Immediate mode.
`
`Chapter 15 Embellishing Our Triangle with Backgrounds, Shading, and
`Textures. In this chapter we add some bells and whistles to the default triangle
`we helped you c