Proceedings of IPPS ’96
`The 10th International Parallel Processing Symposium
`April 15-19, 1996
`Honolulu, Hawaii
`Sponsored by
`The IEEE Computer Society Technical Committee on Parallel Processing
`In cooperation with
`The Association for Computing Machinery SIGARCH
`IEEE Computer Society Press
`Los Alamitos, California
`IEEE Computer Society Press
`10662 Los Vaqueros Circle
`P.O. Box 3014
`Los Alamitos, CA 90720-1264
`Copyright © 1996 by The Institute of Electrical and Electronics Engineers, Inc.
`All rights reserved.
`Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries may
`photocopy beyond the limits of US copyright law, for private use of patrons, those articles in this volume
`that carry a code at the bottom of the first page, provided that the per-copy fee indicated in the code is paid
`through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.
`Other copying, reprint, or republication requests should be addressed to: IEEE Copyrights Manager, IEEE
`Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331.
`The papers in this book comprise the proceedings of the meeting mentioned on the cover and title page. They
`reflect the authors’ opinions and, in the interests of timely dissemination, are published as presented and
`without change. Their inclusion in this publication does not necessarily constitute endorsement by the
`editors, the IEEE Computer Society Press, or the Institute of Electrical and Electronics Engineers, Inc.
`IEEE Computer Society Press Order Number PRO7255
`ISBN 0-8186-7255-2
`IEEE Order Plan Catalog Number 96TB100038
`Microfiche ISBN 0-8186-7257-9
`Additional copies may be ordered from:
`IEEE Computer Society Press
`Customer Service Center
`10662 Los Vaqueros Circle
`P.O. Box 3014
`Los Alamitos, CA 90720-1264
`Tel: +1-714-821-8380
`Fax: +1-714-821-4641
`IEEE Service Center
`445 Hoes Lane
`P.O. Box 1331
`Piscataway, NJ 08855-1331
`Tel: +1-908-981-1393
`Fax: +1-908-981-9667
`IEEE Computer Society
`13, Avenue de l’Aquilon
`B-1200 Brussels
`BELGIUM
`Tel: +32-2-770-2198
`Fax: +32-2-770-8505
`IEEE Computer Society
`Ooshima Building
`2-19-1 Minami-Aoyama
`Minato-ku, Tokyo 107
`Tel: +81-3-3408-3118
`Fax: +81-3-3408-3553
`Editorial production by Mary E. Kavanaugh
`Cover design by Mike Nomura
`Printed in the United States of America by KNI Inc.
`The Institute of Electrical and Electronics Engineers, Inc.
`Proceedings of IPPS ’96
`Table of Contents
`Message from the General Chair ..............................................
`Message from the Program Chair .......................................................................................... xvii
`Message from the Steering Committee Chair ....................................................................
`Program and Organizing Committees ..............................
`Reviewers .....
`Keynote Address - “Can Multithreaded Programming Save Massively Parallel Computing?”
`Speaker: Charles E. Leiserson - Massachusetts Institute of Technology ...............................
`Session 1 - Compiler Optimization
`Chair: Prith Banerjee - University of Illinois, Urbana
`Eliminating Stale Data References through Array Data-Flow Analysis ..................
`L. Choi and P-C. Yew
`Commutativity Analysis: A Technique for Automatically Parallelizing Pointer-Based
`Computations ........................................................................
`M. Rinard and P. Diniz
`Profiling Dependence Vectors for Loop Parallelization .....................................
`S-Y. Tseng, C-T. King, and C-Y. Tang
`A Method for Register Allocation to Loops in Multiple Register File Architectures ..
`D. J. Kolson, A. Nicolau, N. Dutt, and K. Kennedy
`Affine-by-Statement Transformations of Imperfectly Nested Loops .........................................
`J. Xue
`The Combined Effectiveness of Unimodular Transformations, Tiling, and
`Software Prefetching ............................................................... 39
`R.H. Saavedra, W. Mao, D. Park, J. Chame, and S. Moon
`Session 2 - Scientific/Engineering Applications
`Chair: José D.P. Rolim - University of Geneva
`Ocean Circulation on the Intel Paragon: Modeling and Implementation ...................................
`K-C. Leung, I. Ahmad, and H-M. Hsu
`Implementation of an Automatic Semi-Fluid Motion Analysis Algorithm on a Massively
`Parallel Computer ............................................................................................ 55
`K. Palaniappan, M. Faisal, C. Kambhamettu, and A.F. Hasler
`NAS Experiences of Porting CM Fortran Codes to HPF on IBM SP2 and SGI Power
`Challenge .............................................................................................................
`S. Saini
`Dynamic Alignment and Distribution of Irregularly Coupled Data Arrays for Scalable
`Parallelization of Particle-in-Cell Problems ............................................................................
`W-K. Liao, C-W. Ou, and S. Ranka
`A Hierarchical Parallel Processing System for the Multipass-Rendering Method ......................
`H. Kobayashi, H. Yamauchi, Y. Toh, and T. Nakamura
`Performance Modeling and Composition: A Case Study in Cell Simulation ............................
`S.G. Steinberg, J. Yang, and K. Yelick
`Session 3 - Distributed Memory Systems
`Chair: Behrooz Shirazi - University of Texas, Arlington
`A Study of High-Performance Communication Mechanism for Multicomputer Systems .......... 76
`H. Murayama, S. Yoshizawa, T. Aimoto, H. Inouchi, S. Murase, T. Hayashi,
`and H. Iwamoto
`A TeraFLOP Supercomputer in 1996: The ASCI TFLOP System ............................................ 84
`T.G. Mattson, D. Scott, and S. Wheat
`Experience with Parallel Computing on the AN2 Network .......................................................
`D.J. Scales, M. Burrows, and C.A. Thekkath
`Achieving a Balanced Low-Cost Architecture for Mass Storage Management through
`Multiple Fast Ethernet Channels on the Beowulf Parallel Workstation .................................... 104
`T. Sterling, D.J. Becker, D. Savarese, M.R. Berry, and C. Reschke
`Exploiting the Capabilities of Communications Co-Processors .............................................. 109
`K. E. Schauser, C. J. Scheiman, J.M. Ferguson, and P. Z. Kolano
`Effects of Multithreading on Data and Workload Distribution for Distributed-Memory
`Multiprocessors ...................................................................................................................... 116
`A. Sohn, M. Sato, N. Yoo, and J-L. Gaudiot
`Session 4 - Shared Memory Systems
`Chair: Rudolf G. Hackenberg - Technische Universität München
`Formal Verification of Delayed Consistency Protocols ............................................................ 124
`F. Pong and M. Dubois
`Dag-Consistent Distributed Shared Memory ........................................................................... 132
`R.D. Blumofe, M. Frigo, C.F. Joerg, C.E. Leiserson, and K.H. Randall
`Categorizing Network Traffic in Update-Based Protocols on Scalable Multiprocessors .......... 142
`R. Bianchini, T.J. LeBlanc, and J.E. Veenstra
`Implementing the Data Diffusion Machine Using Crossbar Routers ........................................ 152
`H.L. Muller, P.W.A. Stallard, and D.H.D. Warren
`A Memory Controller for Improved Performance of Streamed Computations on
`Symmetric Multiprocessors ................................................................................................. 159
`S.A. McKee and W.A. Wulf
`Kiloprocessor Extensions to SCI ............................................................................................. 166
`S. Kaxiras
`Session 5 - Algorithms
`Chair: Joseph JáJá - University of Maryland
`Approximate Compaction and Padded-Sorting on Exclusive Write PRAMs ............................
`M. Kutylowski and T. Wierzbicki
`A Parallel Solution to the Extended Set Union Problem with Unlimited Backtracking
`M.C. Pinotti, V.A. Crupi, and S.K. Das
`A Parallel Algorithm for Minimization of Finite Automata ..................................................... 187
`B. Ravikumar and X. Xiong
`A Randomized Algorithm for Voronoi Diagram of Line Segments on Coarse-Grained
`Multiprocessors ...................................................................................................................... 192
`X. Deng and B. Zhu
`Self-Timed Resynchronization: A Post-Optimization for Static Multiprocessor Schedules ..... 199
`S.S. Bhattacharyya, S. Sriram, and E.A. Lee
`Constructing the Spanners of Graphs in Parallel .......................
`W. Liang and R.P. Brent
`Session 6 - Programming Languages
`Chair: Gul Agha - University of Illinois, Urbana
`Converse: An Interoperable Framework for Parallel Programming
`L.V. Kalé, M. Bhandarkar, N. Jagathesan, S. Krishnan, and J. Yelon
`Dome: Parallel Programming in a Distributed Computing Environment ................................
`J.N.C. Arabe, A. Beguelin, B. Lowekamp, E. Seligman, M. Starkey,
`and P. Stephan
`Nested Parallel Call Optimization ......................................................... 225
`E. Pontelli and G. Gupta
`The Parallel Break Construct, or How to Kill an Activity Tree ................................................ 230
`Y.I. Friedman, D.G. Feitelson, and I. Exman
`Optimizing COOP Languages: Study of a Protein Dynamics Program ................................... 235
`X. Zhang, V. Karamcheti, T. Ng, and A.A. Chien
`Support for Extensibility and Reusability in a Concurrent Object-Oriented Programming
`Language ....................................................................................................................... 241
`R. Pandey and J.C. Browne
`Session 7 - Communication I
`Chair: Cho-Li Wang - University of Hong Kong
`Modeling the Communication Performance of the IBM SP2 ................................................... 249
`G.A. Abandah and E.S. Davidson
`Adaptive Source Routing in Multistage Interconnection Networks .........................................
`Y. Aydogan, C.B. Stunkel, C. Aykanat, and B. Abali
`The Effects of Network Contention on Processor Allocation Strategies ..................................
`S.Q. Moore and L.M. Ni
`ServerNet Deadlock Avoidance and Fractahedral Topologies ................................................
`R. Horst
`Analysis of Memory Interference in Buffered Multiprocessor Systems in Presence of
`Hot Spots and Favorite Memories ........................................................................................... 281
`S. K. Das and S. K. Sen
`Benefits of Processor Clustering in Designing Large Parallel Systems: When and How? ....... 286
`D. Basak, D. K. Panda, and M. Banikazemi
`Session 8 - Implementation of Primitive Operations
`Chair: Gregory Plaxton - University of Texas, Austin
`Practical Parallel Algorithms for Dynamic Data Redistribution, Median Finding, and
`Selection .............................................................................................................................. 292
`D.A. Bader and J. JáJá
`Parallel Implementation of Borůvka’s Minimum Spanning Tree Algorithm ............................
`S. Chung and A. Condon
`Practical Algorithms for Selection on Coarse-Grained Parallel Computers .............................
`I. Al-furaih, S. Aluru, S. Goil, and S. Ranka
`Parallel Multilevel Graph Partitioning
`G. Karypis and V. Kumar
`PACK/UNPACK on Coarse-Grained Distributed Memory Parallel Machines
`S. Bae and S. Ranka
`Random Seeking: A General, Efficient, and Informed Randomized Scheme for Dynamic
`Load Balancing ......................................................................................................................
`N.R. Mahapatra and S. Dutt
`Session 9 - Resource Allocation and Management
`Chair: Rafael H. Saavedra - University of Southern California
`Resource Placement in Torus-Based Networks ....................................................................... 327
`M.M. Bae and B. Bose
`Simultaneous Compression of Makespan and Number of Processors Using CRP ................... 332
`Y. Ge and D. Y. Y. Yun
`Implementation of Scalable Blocking Locks Using an Adaptive Thread Scheduler ................. 339
`B. Mukherjee and K. Schwan
`Hector: Automated Task Allocation for MPI ..........................................
`S.H. Russ, B. Flachs, J. Robinson, and B. Heckel
`An Adaptive Approach to Data Placement .............................................................................. 349
`D.K. Lowenthal and G.R. Andrews
`Complete Parallelization of Computations: Integration of Data Partitioning and Functional
`Parallelism for Dynamic Data Structures ...
`D. Banerjee and J. C. Browne
`Keynote Address - “MPPs versus Clusters”
`Speaker: Charles L. Seitz - Myricom, Inc. ............................................................................ 362
`Session 10 - Communication II
`Chair: Louise Moser - University of California, Santa Barbara
`Generating Realignment-Based Communication for HPF Programs .......................................
`T. Kamachi, K. Kusano, K. Suehiro, Y. Seo, M. Tamura, and S. Sakon
`Software Support for Virtual Memory-Mapped Communication ............................................
`C. Dubnicki, L. Iftode, E. W. Felten, and K. Li
`How to Optimize Residual Communications? ........................................................................
`M. Dion, C. Randriamaro, and Y. Robert
`A Comparative Study of Methods for Time-Deterministic Message Delivery in a
`Multiprocessor Architecture ...................................................................................................
`J. Jonsson and J. Vasell
`ECO: Efficient Collective Operations for Communication on Heterogeneous Networks ........ 399
`B.B. Lowekamp and A. Beguelin
`Software Techniques for Improving MPP Bulk-Transfer Performance .................................... 406
`E.A. Brewer, P. Gauthier, A. Fox, and A. Schuett
`Session 11 - Algorithms: Implementation
`Chair: Mikhail Atallah - Purdue University
`Parallel Algorithms for Image Enhancement and Segmentation by Region Growing with
`an Experimental Study .................................................................................................... 414
`D.A. Bader, J. JáJá, D. Harwood, and L.S. Davis
`The Chessboard Distance Transform and the Medial Axis Transform are Interchangeable ...... 424
`Y-H. Lee and S-J. Horng
`Parallel Algorithms for Image Processing: Practical Algorithms with Experiments ................ 429
`A. Bäumker and W. Dittrich
`Study of Scalable Declustering Algorithms for Parallel Grid Files ........... 434
`B. Moon, A. Acharya, and J. Saltz
`A Parallel Algorithm for Text Inference ........... 441
`S.M. Harabagiu and D.I. Moldovan
`A Direct Block-Five-Diagonal System Solver for the VLSI Parallel Model ............................
`M. Vajteršic
`Session 12 - Performance Evaluation and Prediction
`Chair: John Gustafson - Ames Laboratory
`Efficient Execution of Parallel Applications in Multiprogrammed Multiprocessor
`Systems .................................................................................................................................
`K. K. Yue and D. J. Lilja
`The Relation of Scalability and Execution Time ................................................................. 457
`X-H. Sun
`Maximizing Speedup through Self-Tuning of Processor Allocation ........................................
`T. D. Nguyen, R. Vaswani, and J. Zahorjan
`Profiling Optimized Code: A Profiling System for an HPF Compiler ..................................... 469
`S. Kaneshiro and T. Shindo
`Toward Symbolic Performance Prediction of Parallel Programs .............................................
`T. Fahringer
`Performance Prediction with Benchmaps .............................................................................
`S. Toledo
`Industrial Track - Invited Presentations
`Organizer: John K. Antonio - Texas Tech University
`Session-I: Parallel Architectures - Implementation, Programming, and
`Chair: John K. Antonio - Texas Tech University
`Cray Research, Inc.:
`Communication Latency and Bandwidth on the Cray Research T3E ......................... 487
`F. W. Chism
`IBM System/390 Division:
`Overview of IBM System/390 Parallel Sysplex: A Commercial Parallel Processing
`System ..................................................................................................................... 488
`J.M. Nick, J-Y. Chung, and N.S. Bowen
`Litton Guidance and Control Systems, Inc.:
`Implementing Parallel Processing in a Rugged Embeddable Environment ....... 496
`A.L. Smeyne
`Mercury Computer Systems, Inc.:
`Planned Direct Transfers: A Programming Model for Real-Time Applications ........... 502
`G. Vichniac, B. Isenstein, C. Lund, and A. Pool
`Session-II: Networking and Distributed Computing
`Chair: Richard C. Metzger - Rome Laboratory
`Centre for Development of Advanced Computing:
`DS-Link over Fiber: A High-speed Interconnect for Cluster Computing .................. 507
`Y. Abhyankar, A. Degwekar, and A. Karandikar
`Electronics and Telecommunications Research Institute:
`A Multiprocessor Server with a New Highly Pipelined Bus .................... 512
`W-J. Hahn, A. Ki, K-W. Rim, and S-W. Kim
`Tandem Computers Incorporated:
`Performance Modeling of ServerNet™ Topologies ................................................... 518
`B. Horst, D. Avresky, R. Wilkinson, D. Jewett, W. Watson, L. Young, and C. Cunningham
`Virtual Computer Corporation:
`Distributed Virtual Computing ...........................................................................................
`J. Schewel, M. Thornburg, and S. Casselman
`Session 13 - Synchronization, Virtual Memory, and Runtime System Support
`Chair: Francine Berman - University of California, San Diego
`CoCheck: Checkpointing and Process Migration for MPI .......................... 526
`G. Stellner
`Tulip: A Portable Run-Time System for Object-Parallel Systems ............................ 532
`P. Beckman and D. Gannon
`A Virtual Memory Model for Parallel Supercomputers .................... 537
`V. L. M. Reis and I. D. Scherson
`A Partitioning Programming Environment for a Novel Parallel Architecture ......
`R. Hartenstein, J. Becker, M. Herz, R. Kress, and U. Nageldinger
`An Integrated Synchronization and Consistency Protocol for the Implementation of a
`High-Level Parallel Programming Language ..........................................................................
`M. C. Rinard
`Implementation and Evaluation of Prefetching in the Intel Paragon Parallel File System ........ 554
`M. Arunachalam, A. Choudhary, and B. Rullman
`Session 14 - Arrays and Hypercubes
`Chair: Oscar Ibarra - University of California, Santa Barbara
`Routing a Permutation in the Hypercube by Two Sets of Edge-Disjoint Paths ......................... 561
`Q-P. Gu and H. Tamaki
`Determining Asynchronous Acyclic Pipeline Execution Times .................................. 568
`V. Donaldson and J. Ferrante
`Distributing Tokens on a Hypercube without Error Accumulation ..........................................
`B.S. Chlebus, J.D.P. Rolim, and G. Slutzki
`On Some Global Operations in Faulty SIMD Hypercubes ............................. 579
`A. Sengupta and C.S. Raghavendra
`An Improved Approximation Algorithm for Scheduling Task Trees on Linear Arrays ............ 584
`H.K. Tadepalli and E.L. Lloyd
`Mapping Linear Recurrences onto Systolic Arrays ................................ 591
`L. Kazerouni, B. Rajan, and R.K. Shyamasundar
`Session 15 - Mathematical Methods
`Chair: Dan I. Moldovan - Southern Methodist University
`Jacobi-like Algorithms for Eigenvalue Decomposition of a Real Normal Matrix Using
`Real Arithmetic ..................................................................................................................... 593
`B. B. Zhou and R. P. Brent
`An Element-Based Concurrent Partitioner for Unstructured Finite Element Meshes ............... 601
`H.Q. Ding and R.D. Ferraro
`Analysis of the Numerical Effects of Parallelism on a Parallel Genetic Algorithm ................... 606
`W.E. Hart, S. Baden, R.K. Belew, and S. Kohn
`Compiling MATLAB Programs to ScaLAPACK Exploiting Task and Data Parallelism ......... 613
`S. Ramaswamy, E.W. Hodges IV, and P. Banerjee
`Mapping Techniques for Parallel Evaluation of Chains of Recurrences ..................................
`E.V. Zima, K.R. Vadivelu, and T.L. Casavant
`Performance of Asynchronous Linear Iterations with Random Delays ....................................
`A.C. Moga and M. Dubois
`Panel - “For a Massive Number of Massively Parallel Machines:
`What are the Target Applications, Who are the Target Users, and
`What New R&D is Needed to Hit the Target?”
`Moderator: Howard Jay Siegel - Purdue University .......................................................... 631
`Panelists: William Farmer - Integrated Computing Engines, Inc.
`Richard Freund - NRaD
`Mark Furtney - Cray Research, Inc.
`Paul Messina - Caltech
`Lionel M. Ni - National Science Foundation
`Charles L. Seitz - Myricom, Inc.
`Marc Snir - IBM T. J. Watson Research Center
`Keynote Address - “Clusters for Commercial Computing: An Invisible
`Speaker: Gregory F. Pfister - IBM Server Group, Austin
`Session 16 - Interconnection Networks
`Chair: D.K. Panda - Ohio State University
`Generic Methodologies for Deadlock-Free Routing ...............................................................
`H. Park and D.P. Agrawal
`Partitionability of the Multistage Interconnection Networks ...................................................
`Y. Chang
`On Embedding Various Networks into the Hypercube Using Matrix Transformations ............ 650
`M. Hamdi and S. W. Song
`Optimal Subcube Fault Tolerance in a Circuit-Switched Hypercube .......................................
`B.A. Izadi and F. Özgüner
`Fault-Tolerant Ring Embedding in Star Graphs .....................................................................
`Y-C. Tseng, S-H. Chang, and J-P. Sheu
`An Optical Interconnect Model for k-ary n-cube Wormhole Networks .................................... 666
`M. Raksapatcharawong and T.M. Pinkston
`Session 17 - Bus-Based Algorithms
`Chair: Sartaj Sahni - University of Florida
`Fault-Tolerant Multiple Bus Networks for Fan-In Algorithms ................................................
`R. Vaidyanathan and S. Nadella
`Coping with Sparse Inputs on Enhanced Meshes - Semigroup Computation with
`COMMON CRCW Buses .........................................................................................................
`P. Damaschke
`An Optimal Algorithm for the Angle-Restricted All Nearest Neighbor Problem on the
`Reconfigurable Mesh ............................................................................................................. 687
`K. Nakano and S. Olariu
`Parallel Algorithms Using Unreliable Broadcasts ..................................................................
`J. Matthews and C. Martel
`Efficient Algorithms for the Hough Transform on Arrays with Reconfigurable
`Optical Buses ......................................................................................................................... 697
`S. Pavel and S.G. Akl
`Integer and Floating Point Matrix-Vector Multiplication on the Reconfigurable Mesh ............ 702
`J.L. Trahan, C-M. Lu, and R. Vaidyanathan
`Session 18 - Image and Radar Processing
`Chair: D. Martinez - MIT Lincoln Laboratory
`Some Image Processing Algorithms on a RAP with Wider Bus Networks ..............................
`S-S. Lee, S-J. Horng, H-R. Tsai, and Y-H. Lee
`Parallel Synthetic Aperture Radar Processing on Workstation Networks ................................
`P.G. Meisl, M.R. Ito, and I.G. Cumming
`The Evolution of a Massively Parallel Vision System for Real-Time Automotive Image
`A. Broggi
`2D Object Recognition on a Reconfigurable Mesh .................................................................. 729
`C. Guerra
`Space-Time Adaptive Processing on the Mesh Synchronous Processor ............ 734
`J.S. McMahon and K. Teitelbaum
`An Experimental Study of Input/Output Characteristics of NASA Earth and Space
`Sciences Applications ......................................................... 741
`M.R. Berry and T.A. El-Ghazawi
`Session 19 - Special-Purpose Applications
`Chair: Kang G. Shin - University of Michigan, Ann Arbor
`Bitonic Sorting on Beneš Networks ....................................
`B.M. Gocal and K.E. Batcher
`Designing Adaptable Real-Time Fault-Tolerant Parallel Systems ...........................................
`C.E. Morón
`Improving Memory Performance for Indirect Accesses on SIMD Computers .........................
`J.D. Allen and D.E. Schimmel
`A New Approach to Pipeline FFT Processor ........................
`S. He and M. Torkelson
`Implementation of a SliM Array Processor ..........
`H. M. Chang, M.H. Sunwoo, and T-H. Cho
`Temporal Characterization of Demands for Data Movement on Parallel Programs .....
`B. Rodriguez, H. Jordan, and G. Alaghband
`Session 20 - Communication III
`Chair: Jean-Luc Gaudiot - University of Southern California
`Broadcasting Multiple Messages in the Multiport Model ........................................................
`A. Bar-Noy and C-T. Ho
`The Necessary Conditions for Clos-Type Nonblocking Multicast Networks ...........................
`Y. Yang and G.M. Masson
`A Class of Interconnection Networks for Multicasting ...........................................................
`Y. Yang
`Performance Prediction of PVM Programs .............................................................................
`M.R. Steed and M.J. Clement
`Algorithms for All-to-All Personalized Exchange in 2D and 3D Tori ...........................
`Y-J. Suh and S. Yalamanchili
`Generalized Theory for Deadlock-Free Adaptive Wormhole Routing and its Application
`to Disha Concurrent .............................................................................................................. 815
`A.K. Venkatramani, T.M. Pinkston, and J. Duato
`Session 21 - Clusters and Domain Decomposition
`Chair: Susamma Barua - California State University, Fullerton
`Efficient Run-Time Support for Irregular Task Computations with Mixed Granularities ......... 823
`C. Fu and T. Yang
`A New Technique for 3-D Domain Decomposition on Multicomputers which Reduces
`Message-Passing ............................................................ 831
`J. Gil and A. Wagner
`Application Load Imbalance on Parallel Processors ...............................................................
`V. Govindan and M.A. Franklin
`Native ATM Application Programmer Interface Testbed for Cluster-Based Computing
`P. W. Dowd, T.M. Carrozzi, F.A. Pellegrino, and A.X. Chen
`SWEB: Towards a Scalable World Wide Web Server on Multicomputers .............................. 850
`D. Andresen, T. Yang, V. Holmedahl, and O.H. Ibarra
`Parallel Implementations of Irregular Problems Using High-Level Actor Language ............... 857
`R.B. Panwar, W. Kim, and G.A. Agha
`Additional Papers .................................................................................................................
`Author Index ..................................................................................................................... 899
