`digest of papers
`Technologies for the Information Superhighway
`Forty-First IEEE Computer Society International Conference
Sponsored by The IEEE Computer Society
`February 25-28, 1996
`Santa Clara, California
`IEEE Computer Society Press
`Los Alamitos, California
`IEEE Computer Society Press
`10662 Los Vaqueros Circle
`P.O. Box 3014
`Los Alamitos, CA 90720-1264
`Copyright © 1996 by The Institute of Electrical and Electronics Engineers, Inc.
`All rights reserved.
Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries may
`photocopy beyond the limits of US copyright law, for private use of patrons, those articles in this volume
`that carry a code at the bottom of the first page, provided that the per-copy fee indicated in the code is paid
`through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.
`Other copying, reprint, or republication requests should be addressed to: IEEE Copyrights Manager, IEEE
`Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331.
`The papers in this book comprise the proceedings of the meeting mentioned on the cover and title page. They
`reflect the authors' opinions and, in the interests of timely dissemination, are published as presented and
`without change. Their inclusion in this publication does not necessarily constitute endorsement by the
editors, the IEEE Computer Society Press, or the Institute of Electrical and Electronics Engineers, Inc.
`IEEE Computer Society Press Order Number PR07414
`ISBN 0-8186-7414-8
`ISSN 1063-6390
`IEEE Order Plan Catalog Number 96CB35911
Order Plan ISBN 0-8186-7415-6
`Microfiche ISBN 0-8186-7416-4
`Additional copies may be ordered from:
`IEEE Computer Society Press
`Customer Service Center
10662 Los Vaqueros Circle
`P.O. Box 3014
`Los Alamitos, CA 90720-1264
Tel: +1-714-821-8380
`Fax: +1-714-821-4641
`Email: cs.books@computer.org
`IEEE Service Center
`445 Hoes Lane
`P.O. Box 1331
`Piscataway, NJ 08855-1331
`Tel: +1-908-981-1393
Fax: +1-908-981-9667
mis.custserv@computer.org
`IEEE Computer Society
13, Avenue de l'Aquilon
`B-1200 Brussels
`Tel: +32-2-770-2198
`Fax: +32-2-770-8505
euro.ofc@computer.org
`IEEE Computer Society
`Ooshima Building
`2-19-1 Minami-Aoyama
`Minato-ku, Tokyo 107
Tel: +81-3-3408-3118
`Fax: +81-3-3408-3553
`Editorial production by Mary E. Kavanaugh
`Cover by Joseph Daigle
`Printed in the United States of America by KNI
`The Institute of Electrical and Electronics Engineers, Inc.
Proceedings of COMPCON '96
`Table of Contents
`Message from the General Chair .................................................... .......... ................................ xi
`Message from the Program Chair ..................... ................................ ........................ ........ ..... xiii
`Organizing Committees .. ................. ... .... .. .. ................. ............... ..................... ........................ .. xiv
`Session 1 : Wireless Interconnects
Chair: John Barr - Motorola
CDPD and Emerging Digital Cellular Systems
`T. Melanchuk, P. Dupont, and S. Backer
Wireless Network Extension Using Mobile IP
R.L. Geiger, J.D. Solomon, and K.J. Crisler
The Bay Area Research Wireless Access Network (BARWAN) ............................................... 15
`R.H. Katz, E.A. Brewer, E. Amir, H. Balakrishnan, A. Fox, S. Gribble,
`T. Hodes, D. Jiang, G.T. Nguyen, V. Padmanabhan, and M. Stemm
Session 2: ATM Networks
`Chair: Anujan Varma- University of California, Santa Cruz
Performance of Explicit Rate Flow Control in ATM Networks ........................................... 22
`L. G. Roberts
`MPEG-2 Over ATM: System Design Issues ........ ................................... ......... ................... ...... 26
`S. Varma
`FAST: A Simulation Testbed for ATM Networks .................................................................... 32
`D. Stiliadis and A. Varma
`Session 3: Broadband Interactive Data Services
`Chair: Ilja Bedner- Hewlett-Packard
`HP BIDS- Broadband Interactive Data Solution ................ ......................................... ........... 39
`I. Bedner and A. Ranous
Design Considerations for a Hybrid Fiber Coax High-Speed Data Access Network ................... 45
`D. Picker
`Session 4: Agent Languages
`Chair: Adam Hertz- General Magic
`Mobile Telescript Agents and the Web ................................... ................................................. 52
P. Domel
`Mobile Agent Security and Telescript ....................................................... ............................... . 58
`J. Tardo and L. Valente
`Session 5: World Wide Web
`Chair: Robert Hagmann - Oracle
`People, Places, and Things: The Next Generation Web ............................................................ 65
J. Gwertzman and M. Seltzer
`An Internet Difference Engine and its Applications .... ...................................... .. ..... ...... ........ .... 71
`T. Ball and F. Douglis
Don't Get Caught in the Web: A Fieldguide to Searching the Net .................................... 77
`W.R. Tuthill
`Session 6: World Wide Web Servers
`Chair: Winfried Wilcke - HAL Computer Systems
`A Scalable and Highly Available Web Server ..... ........ .. ....... ... ..... ...... .............. .................. ..... ... 85
`D.M. Dias, W. Kish, R. Mukherjee, and R. Tewari
`Session 7: Performance Characterization and Analysis
`Co-Chairs: Nasr Ullah and Marianne Hsiung- Motorola
`The Capture, Characterization, and Performance Analysis of Macintosh® Traces ... ................... 94
`S. McMahon
`A Measurement Study of Memory Transaction Characteristics on a
`PowerPC-Based Macintosh ...... ........ ... ... ... ..... ....................... ................ ...... ................... ......... 100
`T. Adams
Load Miss Performance Analysis Methodology Using the PowerPC™ 604 Performance
`Monitor for OLTP Workloads .... ...... ...... ............ ..... .. ... ..... .......... ....... ..... ..... ...... ............. ........ 111
`E.H. Welbon, R.S. Moore, F.E. Levine, and C.P. Roth
`Workload Effects on SMP Scaling in AIX Version 4 .................. .. .... ...... .. ........ .. ......... .. ...... .. 117
`K. Dixit, J. Van Fleet, and B. Olszewski
`Session 8: Panel - Networking Virtual Environments
`Chair: Michael Zyda - Naval Postgraduate School
Panelists: M. Zyda - "Networking Large-Scale Virtual Environments"
T. Meyer - "The Future of VRML"
M. Macedonia - "A Taxonomy for Networked Virtual Environments"
W. Katz - "Defense and Entertainment Industry Efforts in Networking Virtual Environments"
`Session 9: PowerPC Microprocessors and Systems
`Co-Chairs: Nasr Ullah- Motorola
`Kaivalya Dixit- IBM
`Design of the PowerPC 604e™ Microprocessor ...................................................................... 126
`M. Denman, P. Anderson, and M. Snyder
`The Performance and PowerPC Platform™ Specification Implementation of the
`MPC106 Chipset ....................... ..................... ......................................... ....... ....... ........ ......... 132
`C.D. Bryant, M.J. Garcia, B.K. Reynolds, L.A. Weber, and G.E. Wilson
`PowerPC Platform: A System Architecture ..... ............................. ........... .......... .......... ........... 140
`S. Bunch, R. Hochsprung, and T. Moore
Motorola PowerPC™ Migration Tools - Emulation and Translation ..................................... 145
`T. Afzal, M. Breternitz, M. Kacher, S. Menyhert, M. Ommerman, and W. Su
Session 10: PA-RISC Evolution
Chair: Ruby Lee - Stanford University
64-bit and Multimedia Extensions in the PA-RISC 2.0 Architecture ................................... 152
`R. Lee and J. Huck
`Mid-Range and High-End PA-RISC Computer Systems ...... ..... ..... ................. .... ...... ........ ... 161
`R. Elsbernd
PA 7300LC Integrates Cache for Cost/Performance .................................................. 167
`D. Hollenbeck, S.R. Undy, L. Johnson, D. Weiss, P. Tobin, and R. Carlson
`Session 11: Having it your Way - High-Code-Density, High-Integration,
`and High-Performance ARMs
`Chair: Allen Baum - Apple Computer
Thumb: Reducing the Cost of 32-bit RISC Performance in Portable and
`Consumer Applications ... ..... ................. ............... ........ ............... .......... .... ........ ... ............... ... 176
`L. Goudge and S. Segars
`ARM7100- A High-Integration, Low-Power Microcontroller for PDA Applications ..... .... ... 182
`G. Budd and G. Milne
`StrongARM: A High-Performance ARM Processor .................. ..... .. ...... ....... ........... ...... ....... 188
`R. Witek and J. Montanaro
`Session 12: MPEG2
`Chair: Vivian Shen- Hewlett-Packard
A Scalable Chip Set for MPEG2 Real-Time Encoding .................................................. 193
`A. Ngai, J. Sutton, C. Boice, and C. Gebler
`Performance Comparison of MPEG 1 and MPEG2 Video Compression Standards ................... 199
`S. Liu
`Mediaprocessing in the Compressed Domain .. .................... ..... ...... ...... .... .. ........... ...... .. ..... .... 204
`V. Bhaskaran
`Session 13: Interactive Television
`Chair: Robert Hagmann - Oracle
`A Distributed System Client/Server Architecture for Interactive Multimedia Applications ... .... 211
`S. Rege
`Dynamic Bandwidth Allocation for Interactive Video Applications over Corporate
`Networks ............. ................................................................................... ......................... ..... 219
C.J. Beckmann
The Tiger Shark File System ...................................................................... 226
`R.L. Haskin and F.B. Schmuck
Session 14: Interactive TV Settop
`Chair: Deven Kalra- Hewlett-Packard
`Interactive Television Settop Terminal Architectures ..... ..................... ....... ............................. 233
`A.N. Nair
`Multimedia Transmission Link Protocol -A Proposal for Digital Information
Transmission in HFC Cable Systems ................................................................ 239
`R-F. Chiu and R. Hutchinson
DAVID® System Software v2.0 for Interactive Digital Television Networks ......................... 241
`A. Davidson
`Session 15: Scalable Clusters
`Chair: Marco Annaratone- DEC Western Research Laboratory
`Overview of Memory Channel Network for PCI ..................................................................... 244
`R. Gillett, M. Collins, and D. Pimm
`Digital's Clusters and Scientific Parallel Applications ............ .... ......................................... .... 250
`R. Kaufmann and T. Reddin
Overview of Digital UNIX Cluster System Architecture ............................................. 254
`W.M. Cardoza, F.S. Glover, and W.E. Snaman, Jr.
`Session 16: HAL Computer Systems
`Chair: Winfried Wilcke- HAL Computer Systems
`A 9.6 GigaByte/s Throughput Plesiochronous Routing Chip ............................ ....................... 261
`A. Mu, J. Larson, R. Sastry, T. Wicki, and W. W. Wilcke
`Performance Limiting Factors in Http (Web) Server Operations ......... ... ......... ............. ............ 267
`F. Prefect, L. Doan, S. Gold, T. Wicki, and W. Wilcke
`Session 17: Exploiting New Storage and Network Technologies
`Chair: Norman J. Pass -IBM Almaden Research Center
SSA: A High-Performance Serial Interface for Unparalleled Connectivity .............................. 274
`A. Wilson
`Redundant Arrays of Independent Libraries (RAIL): A Tertiary Storage System .......... .......... 280
D.A. Ford, R.J.T. Morris, and A.E. Bell
Randomized Data Allocation for Real-Time Disk I/O ................................................ 286
`S. Berson, R.R. Muntz, and W.R. Wong
`Services and Architectures for Electronic Publishing ........................... ...................... ............. 291
D.M. Choy and R.J.T. Morris
`Session 18: Multimedia Authoring
`Chair: Michael A. Harrison - University of California, Berkeley
`Graphical Object-Oriented Multimedia Application Development: Technology
`and Market Trends ............. ................... ........... ......... ......... ....... ..... ..... .. .......... .. .... ....... ...... ..... 299
`H. Steger
Graphical Containment in Multimedia Authoring .................................................... 300
H. Epelman-Wang, S. Markowitz, and B. Roddy
`User Interfaces for Authoring Systems with Object Stores ............................... ....................... 305
`B. Roddy, S. Markowitz, and H. Epelman-Wang
`Session 19: Competing Architectures for Multimedia Processing
`Chair: Cary Kornfeld -
The Mpact™ Media Processor Redefines the Multimedia PC ........................................... 311
`P. Foley
`An Architectural Overview of the Programmable Multimedia Processor, TM-1 .... .................. 319
`S. Rathnam and G. Slavenburg
Improving Performance for Software MPEG Players .................................................. 327
`D. F. Zucker, M.J. Flynn, and R.B. Lee
`Session 20: The MicroUnity Mediaprocessor
Chair: Steve Manser - MicroUnity Systems
Architecture of a Broadband MediaProcessor ....................................................... 334
`C. Hansen
`MicroUnity Software Development Environment ............... .... ..... ................. ... ... ........... ....... ... 341
R. Hayes, G. Loyola, C. Abbott, and H. Massalin
Broadband Algorithms with the MicroUnity Mediaprocessor .......................................... 349
`C. Abbott, H. Massalin, K. Peterson, T. Karzes, L. Yamano, and G. Kellogg
`Session 21: DRAM Technologies
`Chair: S. Peter Song -
`Burst and Latency Requirements Drive EDO and BEDO DRAM Standards .... ......... ............... 356
`A. Mormann
`Synchronous DRAM Evolutionary Changes Bring Cost/Performance Advantages in
`Memory Systems ..... ................................. ......... ............. ................................ ....... .... ............. 360
`A.B. Cosoroaba
High Bandwidth RDRAM Technology Reduces System Cost .............................................. 365
`R. Crisp
`Multi-Gigabyte/sec DRAM with the MicroUnity MediaChannel™ Interface .. ........... ............ .. 378
T. Robinson, C. Hansen, B. Herndon, and G. Rosseel
`Session 22: Pentium®Pro System Architecture
`Chair: Konrad Lai - Intel
`An Overview of the Pentium®Pro Processor Bus .............. .. .......................... ... ........ ....... ..... .. 383
`N. Sarangdhar and G. Singh
`Pentium®Pro Processor Workstation/Server PCI Chipset .......................... ....... ....................... 388
`M. Bell and T. Holman
`Multiprocessor Validation of the Pentium®Pro Microprocessor .... ........... ........................ ...... . 395
`D. Marr, S. Thakkar, and R. Zucker
`Session 23: Storage Technology
`Chair: Harry S. Gill- IBM
Data Storage IC Technology ....................................................................... 402
`J. Kovacs and R. Kroesen
`Session 24: UltraSPARC and Java
Chair: Robert Garner - Sun Microsystems
UltraSPARC™: Compiling for Maximum Floating-Point Performance .................................... 408
`P. Tirumalai, D. Greenley, B. Beylin, and K. Subramanian
UltraSPARC-II™: The Advancement of UltraComputing ................................................ 417
`G. Goldman and P. Tirumalai
Java™ and HotJava: A Comprehensive Overview ...................................................... 424
`S. Shaio, A. van Hoff, and H. Jellinek
`Session 25: Desktop Color- From Eye to Paper
`Chair: Allen Baum- Apple Computer
Digital Cameras and Electronic Color Image Acquisition ........................................... 431
`1. Dalton
`Electronic Color Printing Technology ................. ............. .... ................................................... 435
`G.K. Starkweather
`ColorSync™: Synchronizing the Color Behavior of Your Devices ..... ............ ......... .............. 440
`W-L Chu and S. Swen
`Session 26: Architecture of Workflow Management Systems
`Chair: Berthold Reinwald -IBM Almaden Research Center
`Object-Oriented Workflow Technology in InConcert .. ........ .......... ................ ...... .... .............. . 446
`S.K. Sarin
Structured Workflow Management with Lotus Notes Release 4 ........................................ 451
`B. Reinwald and C. Mohan
An Architecture for Large-Scale Work Management Systems ........................................... 458
M. Beizer
`Session 27: "Toy Story"
`Chair: Darrell Long - University of California, Santa Cruz
The Making of Toy Story .......................................................................... 463
M. Henne, H. Hickel, E. Johnson, and S. Konishi
`Additional Paper: The following paper was presented as the last paper in Session 12
Single Chip MPEG2 Decoder with Integrated Transport Decoder for Set-top Box ...................... 469
J. Fandrianto
Author Index ..................................................................................... 473
`An Architectural Overview of the Programmable
Multimedia Processor, TM-1
`Selliah Rathnam, Gert Slavenburg
`Philips Semiconductors
`811 E. Arques Avenue, Sunnyvale, CA 94088
[...] in a family of programmable multimedia [...]
the Trimedia product group of Philips [...].
This "C" programmable processor [...]
high performance VLIW-CPU core with video and [...]
peripheral units designed to support the popular
multimedia applications. TM-1 is designed to concurrently
[...] video, audio, graphics, and communication [...].
The VLIW-CPU core is capable of executing a
maximum of twenty seven operations per cycle, and the
[...] execution rate is about five operations per cycle
[...] tuned applications. The audio unit easily handles
[...] audio formats including the 16-bit stereo [...]
unit is capable of processing different
[...] pixel formats with horizontal and vertical [...]
color space conversion. TM-1 applications
can range from low-cost, stand alone systems such as
video phones to programmable, multipurpose plug-in
cards for traditional computers.
TM-1 is a building-block for high-performance multimedia
applications that deal with high-quality video and
audio. TM-1 easily implements popular multimedia standards
such as MPEG-1 and MPEG-2, but its orientation
around a powerful general-purpose CPU makes it capable
of implementing a variety of multimedia algorithms,
whether open or proprietary.
More than just an integrated microprocessor with unusual
peripherals, the TM-1 microprocessor is a fluid
[Figure 1. TM-1 block diagram. Recoverable labels: I2C bus to camera, etc.; YUV 4:2:2; V.34 or ISDN front end; down and up scaling; PCI bus.]
Other Trimedia family members will have different
sets of interfaces appropriate for their intended use. For
example, a TM-1 chip for a cable-TV decoder box would
eliminate the video-in interface.
The key features of TM-1 are:
• A very powerful, general-purpose VLIW processor
core that coordinates all on-chip activities. In
addition to implementing the non-trivial parts of
multimedia algorithms, this processor runs a small
real-time operating system that is driven by interrupts
from the other units.
• DMA-driven multimedia input/output units that
operate independently and that properly format
data to make processing efficient.
• DMA-driven multimedia coprocessors that operate
independently and perform operations specific to
important multimedia algorithms.
• A high-performance bus and memory system that
provides communication between TM-1's processing
units.
Figure 1 shows a block diagram of the TM-1 chip. The
bulk of a TM-1 system consists of the TM-1 microprocessor
itself, a block of synchronous DRAM (SDRAM),
and minimal external circuitry to interface to the incoming
and/or outgoing multimedia data streams. TM-1 can
gluelessly interface to the standard PCI bus for
personal-computer-based applications; thus, TM-1 can be placed
directly on the PC mainboard or on a plug-in card.
Figure 2 shows a possible TM-1 system application. A
video-input stream, if present, might come directly from
a CCIR 601-compliant digital video camera chip in YUV
4:2:2 format; the interface is glueless in this case. A
non-standard camera chip can be connected via a video decoder
chip (such as the Philips SAA7111). A CCIR 601
output video stream is provided directly from the TM-1
to drive a dedicated video monitor. Stereo audio input
and output require external ADC and DAC support. The
operation of the video and audio interface units is highly
customizable through programmable parameters.
The glueless PCI interface allows the TM-1 to display
video via a host PC's video card and to play audio via a
host PC's sound hardware. The Image Coprocessor provides
display support for live video in an arbitrary number
of arbitrarily overlapped windows.
Finally, the V.34 interface requires only an external
modem front-end chip and phone line interface to provide
remote communication support. The modem can be
used to connect TM-1-based systems for video phone or
video conferencing applications, or it can be used for
general-purpose data communication in PC systems.
The key to understanding TM-1 operation is observing
that the CPU and peripherals are time-shared and that
`communication between units is through SDRAM mem-
Figure 2. TM-1 system connections. A minimal
TM-1 system requires few supporting components.
computer system controlled by a small real-time OS kernel
that runs on the VLIW processor core. TM-1 contains
a CPU, a high-bandwidth internal bus, and internal
bus-mastering DMA peripherals.
TM-1 is the first member of a family of chips that will
carry investments in software forward in time. Compatibility
between family members is at the source-code level;
binary compatibility between family members is not
guaranteed. All family members, however, will be able
to perform the most important multimedia functions,
such as running MPEG-2 software.
Defining software compatibility at the source-code
level gives Philips the freedom to strike the optimum balance
between cost and performance for all the chips in
the TM-1 family. Powerful compilers ensure that programmers
seldom need to resort to non-portable assembler
programming. Programmers use TM-1's powerful
low-level operations from source code; these DSP-like
operations are invoked with a familiar function-call
syntax. Trimedia also provides hand-coded and tuned
multimedia libraries which can be used to increase the
performance of the multimedia applications.
As the first member of the family, TM-1 is tailored for
use in PC-based applications. Because it is based on a
general-purpose CPU, TM-1 can serve as a multi-function
PC enhancement vehicle. Typically, a PC must deal
with multi-standard video and audio streams, and users
desire both decompression and compression, if possible.
While the CPU chips used in PCs are becoming capable
of low-resolution real-time video decompression,
high-quality video decompression, not to mention compression,
is still out of reach. Further, users demand that
their systems provide live video and audio without sacrificing
the responsiveness of the system.
TM-1 enhances a PC system to provide real-time multimedia,
and it does so with the advantages of a special-purpose,
embedded solution (low cost and chip count)
and the advantages of a general-purpose processor
(reprogrammability). For PC applications, TM-1 far
surpasses the capabilities of fixed-function multimedia
`ory. The CPU switches from one task to the next; first it
`decompresses a video frame, then it decompresses a slice
of the audio stream, then back to video, etc. As necessary,
the CPU issues commands to the peripheral units to
orchestrate their operation.
`the PCI bus for archival on local mass storage, or the host
`can transfer the compressed video over a network, such
as ISDN. The data can also be sent to a remote system using
the integrated V.34 interface to create, for example,
`a video phone or video conferencing system.
The TM-1 CPU can enlist the ICP and video-in units
to help with some of the straightforward, tedious tasks
associated with video processing. The function of these
units is programmable. For example, some video streams
are, or need to be, scaled horizontally, so these units
can handle the most common cases of horizontal down-
and up-scaling without intervention from the TM-1 CPU.
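A minimal Python sketch of the kind of horizontal scaling that could be offloaded in this way. The 2:1 averaging filter, the 1:2 linear interpolation, and both function names are assumptions made for illustration; the text does not say which filters or scaling ratios the hardware implements.

```python
def downscale_2to1(row):
    """Halve a scan line by averaging each pair of neighbouring samples."""
    # An odd trailing sample is dropped (assumed policy).
    return [(row[i] + row[i + 1] + 1) // 2 for i in range(0, len(row) - 1, 2)]


def upscale_1to2(row):
    """Double a scan line by linear interpolation between neighbours."""
    out = []
    for i, sample in enumerate(row):
        # At the right edge the last sample is repeated.
        nxt = row[i + 1] if i + 1 < len(row) else sample
        out.append(sample)
        out.append((sample + nxt + 1) // 2)
    return out
```

In a system like the one described, work of this shape would be done by the peripheral unit as the data streams through it, leaving the CPU free for compression or decompression.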
`Video Decompression in a PC
`A typical mode of operation for a TM-1 system is to
`serve as a video-decompression engine on a PCI card in
`a PC. In this case, the PC doesn't know the TM-1 has a
`powerful, general-purpose CPU; rather, the PC just treats
`the hardware on the PCI card as a "black-box" engine.
Video decompression begins when the PC operating
system hands the TM-1 a pointer to compressed video
data in the PC's memory (the details of the communication
protocol are typically handled by a software driver
installed in the PC's operating system).
The TM-1 CPU fetches data from the compressed video
stream via the PCI bus, decompresses frames from the
video stream, and places them into local SDRAM.
Decompression may be aided by the VLD (variable-length
decoder) unit, which implements Huffman decoding and
is controlled by the TM-1 CPU.
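As an illustration of what variable-length (Huffman) decoding involves, the following Python sketch walks a bit stream and emits a symbol whenever a complete code word has been seen. The code table is invented for this example and is not an MPEG table; the function name is hypothetical, and a hardware VLD would not be organized this way internally.

```python
# Hypothetical prefix-code table (NOT an MPEG table): bit string -> symbol.
CODE_TABLE = {"0": "A", "10": "B", "110": "C", "111": "D"}


def vld_decode(bits, table=CODE_TABLE):
    """Decode a string of '0'/'1' characters into symbols using a prefix code."""
    symbols = []
    current = ""
    for bit in bits:
        current += bit
        if current in table:  # a complete code word has been seen
            symbols.append(table[current])
            current = ""
    if current:
        raise ValueError("bit stream ended in the middle of a code word")
    return symbols
```

Because each symbol boundary depends on the bits before it, this loop is inherently serial, which is one reason such decoding is a natural candidate for a dedicated unit.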
`When a frame is ready for display, the TM-1 CPU
`gives the ICP (image coprocessor) a display command.
`The ICP then autonomously fetches the decompressed
`frame data from SDRAM and transfers it over the PCI
bus to the frame buffer in the PC's video display card (or
the frame buffer in PC system memory if the PC uses a
UMA (Unified Memory Architecture) frame buffer).
`The ICP accommodates arbitrary window size, position,
`and overlaps.
`Video Compression
Another typical application for TM-1 is in video compression.
In this case, uncompressed video is usually
supplied directly to the TM-1 system via the video-in
unit. A camera chip connected directly to the video-in
unit supplies YUV data in eight-bit, 4:2:2 format. The
video-in unit takes care of sampling the data from the
camera chip and demultiplexing the raw video to
SDRAM in three separate areas, one each for Y, U, and V.
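The demultiplexing step can be sketched in Python as follows. The byte order U0 Y0 V0 Y1 is an assumption (it is a common ordering for interleaved 4:2:2 data, but the text does not state which order the camera chip uses), and the function name is hypothetical.

```python
def demux_yuv422(stream):
    """Split an interleaved 4:2:2 byte stream into separate Y, U and V planes.

    Assumes the sample order U0 Y0 V0 Y1, i.e. one U and one V sample
    for every two Y samples.
    """
    if len(stream) % 4 != 0:
        raise ValueError("4:2:2 stream length must be a multiple of 4 bytes")
    y_plane = bytearray()
    u_plane = bytearray()
    v_plane = bytearray()
    for i in range(0, len(stream), 4):
        u, y0, v, y1 = stream[i:i + 4]
        y_plane += bytes((y0, y1))
        u_plane.append(u)
        v_plane.append(v)
    return bytes(y_plane), bytes(u_plane), bytes(v_plane)
```

Keeping the three components in separate areas means later processing can walk each plane with a simple linear access pattern.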
`When a complete video frame has been read from the
`camera chip by the video-in unit, it interrupts the TM-1
`CPU. The CPU compresses the video data in software
`(using a set of powerful data-parallel operations) and
writes the compressed data to a separate area of SDRAM.
`The compressed video data can now be disposed of in
`any of several ways. It can be sent to a host system over
`Since the powerful, general-purpose TM-1 CPU is
`available, the compressed data can be encrypted before
being transferred for security.
`4.1 VLIW Processor Core
The heart of TM-1 is its powerful 32-bit CPU core.
The CPU implements a 32-bit linear address space and
128 fully general-purpose 32-bit registers. The registers
are not separated into banks; any operation can use any
register for any operand.
The core uses a VLIW instruction-set architecture and
is fully general-purpose. TM-1 uses a VLIW instruction
length that allows up to five simultaneous operations to
be issued. These operations can target any five of the 27
functional units in the CPU, including integer and
floating-point arithmetic units and data-parallel DSP-like units.
[Figure 3. VLIW Processor Core and Instruction [...]. Blocks shown: Instruction Cache (32Kb); Instr. Fetch Buffer; Decompression Hardware; Issue Register (5 Ops); Register Routing and Forwarding Network; Register File (128 x 32).]
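To make the five-wide issue concrete, here is a small Python sketch of how a compiler might pack independent operations into instructions. It is a simplified illustration under stated assumptions: every operation has unit latency, each register is written once, any operation can use any issue slot, and the function name and greedy strategy are hypothetical rather than a description of the actual TM-1 compiler.

```python
ISSUE_WIDTH = 5  # operations per instruction, as stated in the text


def pack_instructions(ops, width=ISSUE_WIDTH):
    """Greedily pack operations into VLIW instructions.

    ops: list of (dest_register, [source_registers]) in program order.
    An operation may issue only after the operations producing its
    sources have issued in an earlier instruction (unit latency assumed).
    """
    producers = {dest for dest, _ in ops}
    ready = set()  # registers whose values are already available
    pending = list(ops)
    instructions = []
    while pending:
        bundle = []
        for op in pending:
            if len(bundle) == width:
                break
            _, sources = op
            # Sources not produced by any listed op are treated as inputs.
            if all(s in ready or s not in producers for s in sources):
                bundle.append(op)
        if not bundle:
            raise ValueError("cyclic dependency between operations")
        for op in bundle:
            pending.remove(op)
        ready.update(dest for dest, _ in bundle)
        instructions.append(bundle)
    return instructions
```

With seven operations of which one depends on two others, this packer produces one full five-operation instruction followed by a second instruction holding the remaining two.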
Although the processor core runs a tiny real-time operating
system to coordinate all activities in the TM-1
system, the processor core is not intended for true
general-purpose use as the only CPU in a computer system.
For example, the processor core does not implement virtual
memory address translation, an essential feature in a
general-purpose computer system.
TM-1 uses a VLIW architecture to maximize processor
throughput at the lowest possible cost. VLIW architectures
have performance exceeding that of superscalar
general-purpose CPUs without the extreme complexity
of a superscalar implementation. The hardware saved by
eliminating superscalar logic reduces cost and allows the
integration of multimedia-specific features that enhance
the power of the processor core.
The TM-1 operation set includes all traditional microprocessor
operations. In addition, multimedia-specific
operations are included that dramatically accelerate standard
video compression and decompression algorithms.
As just one of the five operations issued in a single TM-1
instruction, a single special or "custom" operation can
implement up to 11 traditional microprocessor operations.
Multimedia-specific operations combined with the
VLIW architecture result in tremendous throughput for
multimedia applications.
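One way to see how a single data-parallel operation can stand in for many scalar ones is a sum of absolute differences over four bytes packed in a 32-bit word, a common building block in motion estimation. The Python emulation below is illustrative only: the choice of operation and the name `sad_4x8` are assumptions, and the text does not say which custom operation its count of 11 refers to.

```python
def sad_4x8(a, b):
    """Sum of absolute differences over four unsigned bytes packed in 32-bit words."""
    total = 0
    for shift in (0, 8, 16, 24):
        x = (a >> shift) & 0xFF  # extract one byte lane from each word
        y = (b >> shift) & 0xFF
        total += abs(x - y)
    return total
```

Done with scalar operations, this takes four subtractions, four absolute values, and three additions to combine the results, before counting the shifts and masks needed to unpack the bytes.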
4.2 Internal "Data Highway" Bus
The internal data bus connects all internal blocks together
and provides access to internal control registers
(in each on-chip peripheral unit), external SDRAM, and
the external PCI bus. The internal bus consists of separate
32-bit data and address buses, and transactions on
the bus use a block-transfer protocol. Peripherals can be
masters or slaves on the bus.
Access to the internal bus is controlled by a central arbiter,
which has a request line from each potential bus
master. The arbiter is configurable in a number of different
modes so that the arbitration algorithm can be tailored
for different applications. Peripheral units make requests
to the arbiter for bus access, and depending on the
arbitration mode, bus bandwidth is allocated to the units
in different amounts. Each mode allocates bandwidth
differently, but each mode guarantees each unit a minimum
bandwidth and maximum service latency. All unused
bandwidth is allocated to the TM-1 CPU.
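The allocation policy can be illustrated with a slot-table arbiter, sketched below in Python. This is an assumed mechanism chosen because it yields the stated guarantees; the text does not describe the actual arbitration algorithm, and the unit names, the slot table, and the function name are hypothetical.

```python
def arbitrate(slot_table, requests):
    """Return the bus owner for each slot of one arbitration round.

    slot_table: list of unit names, one per slot; a unit appearing k times
                in an n-slot table is guaranteed k/n of the bandwidth and
                waits at most until its next slot comes around.
    requests:   dict mapping unit name -> number of pending transfers.
    Slots whose owner has nothing pending are handed to the CPU.
    """
    pending = dict(requests)
    grants = []
    for owner in slot_table:
        if pending.get(owner, 0) > 0:
            pending[owner] -= 1
            grants.append(owner)
        else:
            grants.append("cpu")  # unused bandwidth goes to the CPU
    return grants
```

In this sketch, a different arbitration mode would simply be a different slot table, giving each unit a different guaranteed share.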
The bus allocation mechanism is one of the features of
TM-1 that makes it a true real-time system instead of just
a highly integrated microprocessor with unusual peripherals.
`4.3 Memory and Cache Units
TM-1's memory hierarchy satisfies the low cost and
`high bandwidth requirement of multimedia markets.
`Since multimedia video streams can require relatively
`large temporary storage, a significant amount of DRAM
`is re