ELSEVIER SCIENCE PUBLISHERS B.V.
Sara Burgerhartstraat 25
P.O. Box 211, 1000 AE Amsterdam, The Netherlands

Keywords are chosen from the ACM Computing Reviews Classification System, ©1991, with permission.
Details of the full classification system are available from
ACM, 11 West 42nd St., New York, NY 10036, USA

ISBN: 0 444 81481 7
ISSN: 0926-549X

© 1993 IFIP. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science Publishers B.V., Copyright & Permissions Department, P.O. Box 521, 1000 AM Amsterdam, The Netherlands.

Special regulations for readers in the U.S.A. - This publication has been registered with the Copyright Clearance Center Inc. (CCC), Salem, Massachusetts. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the U.S.A. All other copyright questions, including photocopying outside of the U.S.A., should be referred to the publisher, Elsevier Science Publishers B.V., unless otherwise specified.

No responsibility is assumed by the publisher or by IFIP for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein.

pp. 119-134, 199-218, 267-281, 367-381: Copyright not transferred

This book is printed on acid-free paper.

Printed in The Netherlands

A High Speed Data Link Control Protocol   81
Ahmed N. Tantawy, IBM Res. Div., T.J. Watson Research Center, USA,
Hanafy Meleis, DEC, Reading, UK

Session C: Parallel Implementation and Transport Protocols   101
Chair: Guy Pujolle, Universite P. et M. Curie, France

Parallel TCP/IP for Multiprocessor Workstations   103
Kurt Maly, S. Khanna, A. Mukkamala, C.M. Overstreet, A. Yerraballi,
E.C. Foudriat, B. Madan, Old Dominion University, USA

TCP/IP on the Parallel Protocol Engine   119
Erich Rütsche, Matthias Kaiserswerth,
IBM Research Division, Zurich Research Laboratory, Switzerland

A High-Speed Protocol Parallel Implementation: Design and Analysis   135
Thomas F. La Porta, AT&T Bell Laboratories, USA,
Mischa Schwartz, Columbia University, New York, USA

Session D: Multimedia Communication Systems   151
Chair: Radu Popescu-Zeletin, GMD FOKUS, Germany

Orchestration Services for Distributed Multimedia Synchronisation   153
Andrew Campbell, Geoff Coulson, Francisco Garcia, David Hutchison,
Lancaster University, UK

Towards an Integrated Quality of Service Architecture (QoS-A) for
Distributed Multimedia Communications   169
Helmut Leopold, Alcatel ELIN Research, Austria,
Andrew Campbell, David Hutchison, Lancaster University, UK,
Niklaus Singer, Alcatel ELIN Research, Austria

JVTOS - A Reference Model for a New Multimedia Service   183
Gabriel Dermler, University of Stuttgart, Germany,
Konrad Froitzheim, University of Ulm, Germany

Experiences with the Heidelberg Multimedia Communication System:
Multicast, Rate Enforcement and Performance   199
Andreas Cramer, Manny Farber, Brian McKellar, Ralf Steinmetz,
IBM European Networking Center, Germany

Session E: QoS Semantics and Management   219
Chair: Martina Zitterbart, IBM Res. Div., Watson Research Center, USA

Client-Network Interactions in Quality of Service Communication
Environments   221
Domenico Ferrari, Jean Ramaekers, Giorgio Ventre, International
Computer Science Institute, USA

The OSI 95 Connection-mode Transport Service: The Enhanced QoS   235
Andre Danthine, Yves Baguette, Guy Leduc, Luc Leonard,
University of Liege, Belgium

QoS: From Definition to Management   253
Noemie Simoni, Simon Znaty, TELECOM Paris, France

Session F: Evaluation of High Speed Communication Systems   265
Chair: Otto Spaniol, Technical University Aachen, Germany

ISO OSI FTAM and High Speed File Transfer: No Contradiction   267
Martin Bever, Ulrich Schaffer, Claus Schottmüller,
IBM European Networking Center, Germany

Analysis of a Delay Based Congestion Avoidance Algorithm   283
Walid Dabbous, INRIA, France

Performance Issues in Designing Network Interfaces: A Case Study   299
K.K. Ramakrishnan, Digital Equipment Corporation, USA

Session G: High Performance Protocol Mechanisms   315
Chair: Craig Partridge, BBN, USA

Multicast Provision for High Speed Networks   317
A.G. Waters, University of Essex, UK

Transport Layer Multicast: An Enhancement for XTP Bucket
Error Control   333
Harry Santoso, MASI, Universite P. et M. Curie, France,
Serge Fdida, MASI, Universite Rene Descartes, France

A Performance Study of the XTP Error Control   351
Arne A. Nilsson, Meejeong Lee,
North Carolina State University, USA

Session H: Protocol Implementation   365
Chair: Samir Tohme, E.N.S.T., France

ADAPTIVE: An Object-Oriented Framework for Flexible and Adaptive
Communication Protocols   367
Donald F. Box, Douglas C. Schmidt, Tatsuya Suda,
University of California, Irvine, USA

HIPOD: An Architecture for High-Speed Protocol Implementations   383
A.S. Krishnakumar, J.G. Kneuer, A.J. Shaw, AT&T Bell Laboratories, USA

Parallel Transport System Design   397
Torsten Braun, University of Karlsruhe, Germany,
Martina Zitterbart, IBM Res. Div., T.J. Watson Research Center, USA

Session I: Network Interconnection   413
Chair: Augusto Casaca, INESC, Portugal

A Rate-based Congestion Avoidance Scheme for Interconnected
DQDB Metropolitan Area Networks   415
Nen-Fu Huang, Chiung-Shien Wu, Chung-Ching Chiou,
National Tsing Hua University, Rep. of China

Interconnection of LANs/802.6 Customer Premises Equipments (CPEs)
via SMDS on Top of ATM: a case description   431
W. Rozenblad, B. Li, R. Peschi,
Alcatel Bell Telephone, Research Centre, Belgium

Architectures for Interworking between B-ISDN and Frame Relay   443
J. Vozmediano, J. Berrocal, J. Vinyes,
ETSI Telecomunicacion, Spain

Author Index   455

High Performance Networking, IV (C-14)
A. Danthine and O. Spaniol (Editors)
Elsevier Science Publishers B.V. (North-Holland)
© 1993 IFIP.

TCP/IP on the Parallel Protocol Engine

Erich Rütsche and Matthias Kaiserswerth

IBM Research Division, Zurich Research Laboratory
Säumerstrasse 4, 8803 Rüschlikon, Switzerland

Abstract

In this paper, a parallel implementation of the TCP/IP protocol suite on the Parallel Protocol Engine (PPE), a multiprocessor-based communication subsystem, is described. The execution times of the various protocol functions are used to analyze the system's performance in two scenarios. In the first scenario we execute the test application on the PPE; in the second we evaluate the potential performance of our TCP/IP implementation when it is driven by an application on the workstation. For the second scenario, the end-to-end performance of our implementation on a four-processor PPE system is more than 3300 TCP segments per second.

Keyword Codes: C.1.2; C.2.2; D.1.3
Keywords: Multiple Data Stream Architectures (Multiprocessors); Network Protocols;
Concurrent Programming

1. INTRODUCTION

Progress in high-speed networking technologies such as fiber optics has shifted the bottleneck in communications from the limited bandwidth of the transmission media to protocol processing and the operating system overhead in the workstation. So-called lightweight protocols and protocol offload to programmable adapters are two approaches proposed to cope with this problem. Protocols such as the Xpress Transfer Protocol (XTP)¹ [PEI 92] and VMTP [Cheriton 88] try to simplify the control mechanisms and packet structures such that the protocol implementation becomes less complex and can possibly be done in hardware. We took the second approach in building the Parallel Protocol Engine (PPE) [Kaiserswerth 92], a multiprocessor-based communication adapter, upon which protocol processing can be offloaded from a host system. The Nectar CAB [Arnould 89] and the VMP Network Adapter Board [Kanakia 88] are other programmable adapters, each based on a single protocol processor. The XTP chipset [Chesson 87] is a very specialized set of RISC processors designed to execute the XTP protocol. Our objective was to investigate and exploit parallelism in many different protocols. Therefore we decided to develop a general purpose communication subsystem capable of supporting standard protocols efficiently in software.

¹ Xpress Transfer Protocol and XTP are registered trademarks of XTP Forum.

In this paper our goal is to demonstrate that a careful implementation of a standard transport protocol stack on a general-purpose multiprocessor architecture allows efficient use of the bandwidth available in today's high-speed networks. As an example, we chose to implement the TCP/IP protocol suite on our 4-processor prototype of the PPE.

We implemented the socket interface and a test application directly on the PPE to facilitate our performance measurements. In this test scenario we analyze the performance of TCP/IP and the socket layer. We also examined a second scenario to understand how our implementation would perform when integrated into a workstation, where protocol processing up to the transport layer is performed on the PPE and applications can access the transport service via the socket interface on the workstation.

In Section 2 our hardware platform, the PPE, is presented. Section 3 introduces TCP/IP. In the following section we explain our approach to parallel protocol implementation. Section 5 presents the results and discusses the impact of the hardware and software architecture on performance. The last section gives the conclusion and an outlook on our future work.

2. THE PARALLEL PROTOCOL ENGINE

The PPE is to be presented only briefly here. It is described in greater detail in [Wicki 90] and [Kaiserswerth 91, 92]. We will first concentrate on the hardware and then present the programming environment.

The PPE is a hybrid shared-memory/message-passing multiprocessor. Message passing is used for synchronization, whereas shared memory is used to store service primitives and protocol frames. Figure 1 shows the architecture of the PPE and its use as a communication subsystem.

The PPE uses two separate memories, one for transmitting, one for receiving data. Both of these memories are mapped into the address space of the workstation. In our implementation, four T425 transputers [INMOS 89] are used as protocol processors. On each side of the adapter, two T425s have access to the shared memory. Each processor uses private memory to store its program and local data. We decided against using a single shared memory for storing both inbound and outbound protocol data, although this would make the adapter more flexible and facilitate programming, for the following reason. High-speed network interfaces work in a synchronous fashion, with data being clocked in and out of memory, possibly at the same time, at the transmission speed of the physical network. Splitting the adapter into separate receive and transmit parts accommodates simultaneous transmission and reception and only requires memory with half the speed of that required for a single-memory solution. This architecture results in significant cost savings, especially when transmission speeds exceed 100 Mb/s.

The network interface has read access to the transmit side and write access to the receive side of the adapter. We emulate a physical network by means of an 8-bit wide parallel interface, which allows a point-to-point connection between two PPE systems operating with a bidirectional transmission rate of up to 120 Mb/s. The transputer links are used exclusively for signalling and control message transfer within the PPE and to and from the host system.

The programming language which best describes the transputer's programming model is OCCAM [Pountain 88]. It is based on the theory of Communicating Sequential Processes (CSP) developed by Hoare [Hoare 78]. The structuring elements are processes that communicate and synchronize via messages. Message transfer is unbuffered; communicating processes must reach

4.1 IP and ICMP
Because IP is a datagram protocol, the normal flow of data through IP in an end-system requires no interaction between the receiving and transmitting part. Routing information and exception handling, however, require a data exchange. The handling of exception and control messages is the function of ICMP. We therefore partitioned IP into two independent processes, icmp_demux and ip_demux. To guarantee the timely handling of incoming packets, we dedicated a separate process on the receive side of the PPE to the handling of the physical network interface.

The routing table is shared between both processes on the transmit and receive side of the PPE. An RPC is used if icmp_demux needs to send out an ICMP message.
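
As an illustration of this partitioning, the receive-side dispatch can be pictured as a small loop that classifies each datagram by the protocol field of its IP header. The following is only a sketch under assumed helper names (queue_get, verify_ip_header, forward_to_*); it does not reproduce the actual PPE code.

```c
/* Illustrative sketch of the ip_demux dispatch loop; queue_get(),
 * verify_ip_header(), the forward_to_*() channels and the pkt layout are
 * assumed helpers for this example, not the actual PPE code. */
#include <stdint.h>

#define PROTO_ICMP 1               /* IPv4 protocol numbers */
#define PROTO_TCP  6

struct pkt {
    uint8_t  *data;                /* start of the IP datagram in shared receive memory */
    uint32_t  len;
};

extern struct pkt *queue_get(void);                /* pointer handed over by the interrupt process */
extern int  verify_ip_header(const uint8_t *ip);   /* version, length and header checksum */
extern void forward_to_tcp_recv(struct pkt *p);    /* local channel to tcp_recv */
extern void forward_to_icmp_demux(struct pkt *p);  /* local channel to icmp_demux */
extern void drop_packet(struct pkt *p);

void ip_demux_loop(void)
{
    for (;;) {
        struct pkt *p = queue_get();       /* blocks until the driver forwards a pointer */
        uint8_t *ip = p->data;

        if (!verify_ip_header(ip)) {       /* bad header: discard */
            drop_packet(p);
            continue;
        }
        switch (ip[9]) {                   /* protocol field of the IPv4 header */
        case PROTO_TCP:  forward_to_tcp_recv(p);   break;
        case PROTO_ICMP: forward_to_icmp_demux(p); break;
        default:         drop_packet(p);           break;
        }
    }
}
```
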
4.2 TCP
Splitting the PPE hardware into a separate send and receive side had more impact on how we had to deal with TCP, the socket layer, and the application layer than it had on IP.

We decided to split the finite state machine (FSM) responsible for implementing a TCP connection into two separate FSMs once the connection is in the data phase. The actions of these FSMs are implemented on the receive side through two processes, rtask and tcp_recv. On the transmit side one process, xtask, implements the FSM. Owing to the duplex nature of TCP and the piggybacking of control information in data packets, these processes need to share the protocol's send and receive state variables maintained in the transmission control block (TCB).

tcp_recv demultiplexes incoming TCP segments, locates the appropriate TCB and executes the required action for the FSM state. Header prediction is used to speed up packet handling for packets arriving consecutively on the same connection. Correctly received segments are appended to the receive queue and the application process waiting on this connection is then woken up to move the data to its own buffers. When the received data exceeds the acknowledgement threshold, which is specified as a percentage of the advertised receive window, tcp_recv makes an RPC to the transmit side to generate an acknowledgement. The acknowledgement is sent as a separate packet, unless this information can be piggybacked onto an outgoing data segment.
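
The acknowledgement-threshold policy just described might look roughly as follows; the TCB fields and the rpc_send_ack helper are names assumed for illustration, not the interfaces of our implementation.

```c
/* Sketch of the receive-side acknowledgement policy described above.
 * The TCB fields and rpc_send_ack() are assumed names for illustration. */
#include <stdint.h>

struct tcb {
    uint32_t rcv_nxt;        /* next sequence number expected */
    uint32_t last_acked;     /* highest sequence number acknowledged so far */
    uint32_t rcv_window;     /* advertised receive window in bytes */
    uint32_t ack_percent;    /* threshold as a percentage of the window */
};

/* RPC to the transmit side: emit an ACK, piggybacked if possible. */
extern void rpc_send_ack(struct tcb *tp);

void maybe_acknowledge(struct tcb *tp)
{
    uint32_t unacked   = tp->rcv_nxt - tp->last_acked;           /* bytes received but not yet ACKed */
    uint32_t threshold = tp->rcv_window / 100 * tp->ack_percent;

    if (unacked >= threshold) {
        rpc_send_ack(tp);                 /* ask the transmit side to generate the ACK */
        tp->last_acked = tp->rcv_nxt;
    }
}
```
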
rtask is driven by two timers, one responsible for delayed acknowledgements, the other for keep-alive messages. In steady state data transmission, rtask should never generate an acknowledgement, as tcp_recv already generates acknowledgements while data are received. Only when the timer runs out and new unacknowledged data have been received since the last acknowledgement will rtask generate an acknowledgement. Similarly, keep-alive messages are also sent only when no activity has taken place on a connection for some time. Again, both acknowledgements and keep-alive messages are generated via RPCs to the transmit side.

On the transmit side the process xtask manages the transmit queue and the retransmission timers. To send data, xtask creates the TCP header and fills in the necessary information from the TCB, such as addresses and sequence numbers for the data and acknowledgements. The header and a pointer to the data are then passed to the IP process (procedure ip_send), which embeds this information into an IP datagram.
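
A simplified view of this send path (what the tcp_snd_data procedure of Table 1 does) is sketched below; the structures and helper signatures are assumptions made for the example and may differ from the actual code.

```c
/* Sketch of the transmit path: xtask fills a TCP header from the TCB and hands
 * the header plus a pointer to the data to ip_send, which builds the IP datagram.
 * All structures and helper signatures are assumptions made for this example. */
#include <stdint.h>
#include <stddef.h>

struct tcp_hdr {
    uint16_t src_port, dst_port;
    uint32_t seq, ack;
    uint16_t flags, window;
    uint16_t checksum, urgent;
};

struct tcb {
    uint16_t local_port, remote_port;
    uint32_t snd_nxt;                 /* next send sequence number */
    uint32_t rcv_nxt;                 /* acknowledgement to piggyback */
    uint16_t rcv_window;
};

/* Wraps the segment into an IP datagram and queues it for the driver. */
extern void ip_send(struct tcb *tp, struct tcp_hdr *hdr, const uint8_t *data, size_t len);

void send_segment(struct tcb *tp, const uint8_t *data, size_t len)
{
    struct tcp_hdr hdr;

    hdr.src_port = tp->local_port;
    hdr.dst_port = tp->remote_port;
    hdr.seq      = tp->snd_nxt;       /* sequence number for this segment */
    hdr.ack      = tp->rcv_nxt;       /* piggybacked acknowledgement */
    hdr.window   = tp->rcv_window;
    hdr.flags    = 0x10;              /* ACK flag set */
    hdr.checksum = 0;                 /* TCP checksumming omitted, cf. Section 5 */
    hdr.urgent   = 0;

    ip_send(tp, &hdr, data, len);     /* only a pointer to the data is passed on */
    tp->snd_nxt += (uint32_t)len;     /* advance the send sequence number */
}
```
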
4.3 Socket Layer and Application
To facilitate our experiments with TCP/IP, we decided as a first step to implement the entire socket layer as well as the test application on the PPE. A detailed description of the interactions be-

written only from the receive side (e.g., the updated transmit window), and the other written only from the transmit side (e.g., the last send sequence number).

Since we do not have a locking protocol for accessing shared data structures, it is possible that for a brief period after the local update and before the remote update has been propagated, the same field in the shared data structure contains two different values. Because of the properties of TCP and the way we have split the protocol onto the transmit and receive side of the PPE, this inconsistency will only be of importance if it is the reason for the protocol state to change. As an example consider the following: assume the retransmission timer (it is also maintained in the TCB) in xtask expires and, because the acknowledgement field in the TCB does not indicate reception of an acknowledgement, xtask decides to retransmit the unacknowledged TCP segments. On the receive side, however, an acknowledgement has been received in the meantime which makes this retransmission unnecessary⁴. To avoid this problem, before actually going to a retransmit state, xtask will reread the acknowledgement field, now however with the value on the receive side, to make sure that a retransmission is warranted. Reading a remote field is similar to writing; a message with the address and size of the variable is sent to the remote peek_poke process, which then returns the value of that field.
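
This reread of the remote acknowledgement field can be sketched as follows, with peek_remote standing in for the message exchange with the peek_poke process; all names here are assumptions for illustration.

```c
/* Sketch of the retransmission guard: before retransmitting, xtask rereads the
 * acknowledgement field with the value held on the receive side of the PPE.
 * peek_remote() and the TCB fields are assumed interfaces for illustration. */
#include <stdint.h>

struct tcb {
    uint32_t snd_una;        /* oldest unacknowledged sequence number (local copy) */
    uint32_t snd_nxt;        /* next sequence number to be sent */
};

/* Sends <address, size> to the remote peek_poke process and returns the value. */
extern uint32_t peek_remote(const void *remote_addr, uint32_t size);
extern void retransmit_from(struct tcb *tp, uint32_t seq);

void retransmit_timeout(struct tcb *tp, const uint32_t *remote_snd_una)
{
    if (tp->snd_una != tp->snd_nxt) {          /* local copy suggests data are outstanding */
        /* The receive side may have seen the ACK already: reread the remote value. */
        uint32_t acked = peek_remote(remote_snd_una, (uint32_t)sizeof *remote_snd_una);

        if (acked != tp->snd_nxt)
            retransmit_from(tp, acked);        /* a retransmission really is warranted */

        tp->snd_una = acked;                   /* refresh the local copy */
    }
}
```
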
RPCs from the receive to the transmit side have been implemented as follows: any process on the receive side can format an RPC message, which is then sent via a dedicated transputer link to the rpc_process. This process will then execute the remote procedure, or, in the case of transmission requests, pass the request via a local (internal) channel to the appropriate write process, one of which exists for each TCP connection. Return values are sent, again via a dedicated transputer link, back to the receive side to rpc_demux, which forwards these values over a local channel to the process that had initiated the RPC. Upon receiving the return value, the caller becomes ready again and can continue its execution.

4.5 Example
Sending a TCP data segment: The normal data flow is shown in Figure 3. The send data are in a remotely allocated buffer on the transmit side. The application creates a socket and establishes a TCP connection. The socket send call causes an RPC to the remote write process which in turn copies the data into the TCP send buffer. xtask then controls the transmission and eventual retransmissions of the data. The send procedure builds the TCP segment and forwards the pointer to the segment and the associated control block to ip_send. Here the IP header is placed in front of the TCP segment and then the packet is sent to the network. The data is copied twice: first from the application buffer to the send queue in shared memory and from there to the network.

Receiving a TCP data segment: Upon receipt the data is also copied twice: first from the network to the receive queue and from there to the application buffer. The interrupt handler process serves the physical interface and forwards pointers to received datagrams to ip_demux, which checks the header and forwards the packet depending on its type to tcp_recv or icmp_demux.

tcp_recv analyzes the TCP header and calls the appropriate handler function for a given protocol state. To send an acknowledgement or a control packet, tcp_recv uses RPCs to the transmit side. Correctly received segments are appended to the receive queue. rtask wakes up the application process which is blocked in the socket receive procedure. This procedure then fills the user buffer with data from the receive queue.

⁴ Note: the logic of the protocol would allow for a retransmission in any case.
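
From the application's point of view, the data path just described is driven through ordinary socket calls. The fragment below shows the intended usage pattern, assuming the PPE socket layer offers the familiar BSD-style calls; the destination address and the data are placeholders.

```c
/* Usage sketch of the socket interface on the PPE (familiar BSD-style calls
 * assumed); the destination address and the data are placeholders. */
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int send_test_data(const char *buf, size_t len, uint32_t dst_ip, uint16_t dst_port)
{
    struct sockaddr_in peer;
    int s = socket(AF_INET, SOCK_STREAM, 0);        /* creates the socket and its TCB */
    if (s < 0)
        return -1;

    memset(&peer, 0, sizeof peer);
    peer.sin_family      = AF_INET;
    peer.sin_port        = htons(dst_port);
    peer.sin_addr.s_addr = htonl(dst_ip);

    if (connect(s, (struct sockaddr *)&peer, sizeof peer) < 0) {   /* TCP handshake */
        close(s);
        return -1;
    }

    /* send() copies buf into the TCP send queue in shared memory (first copy);
     * xtask later transmits the segment to the network (second copy). */
    if (send(s, buf, len, 0) < 0) {
        close(s);
        return -1;
    }
    return close(s);
}
```
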
the possible performance in case the socket-based application programming interface (API) were implemented on the workstation. The socket layer would then be split into two parts. The upper half resides in the workstation. Calls to the API result in control flows to and from the lower half of the socket layer, which runs on the PPE. Copying data to and from the TCP layer must be done by the workstation processor, because the current PPE only functions as a bus slave. Therefore the copy operations in the socket layer can be combined with the copy between the workstation and the PPE. In this scenario we measure the throughput between the lower half of the socket layer on two PPEs. The results of scenario 2 provide an upper bound for the expected performance of such an integrated system. As such they are valid if one manages - as shown for our implementation of the ISO 8802.2 Logical Link Control protocol [Kaiserswerth 91] - to fully overlap the copy operations and the exchange of control between the workstation and the PPE with the protocol execution on the PPE.

We did not implement TCP checksumming, because it should really be done in hardware [Lumley 92]. To do software checksum calculation on the transputer would cost 3 µs per 16-bit word. We did, however, implement IP header checksumming.
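
The IP header checksum we did implement is the standard Internet one's-complement sum (RFC 1071); a straightforward C version is sketched below for reference (the transputer implementation may be coded differently).

```c
/* Standard Internet checksum (RFC 1071) over 'len' bytes of header data.
 * A plain C sketch; the transputer implementation may be coded differently. */
#include <stdint.h>
#include <stddef.h>

uint16_t internet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {                           /* sum consecutive 16-bit words */
        sum += (uint32_t)((data[0] << 8) | data[1]);
        data += 2;
        len  -= 2;
    }
    if (len == 1)                               /* odd trailing byte */
        sum += (uint32_t)(data[0] << 8);

    while (sum >> 16)                           /* fold the carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;                      /* one's complement of the sum */
}
```
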
The Zählmonitor 4 (ZM4) [Dauphin 91] monitoring and tracing system was used to record execution traces of the PPE subsystem. ZM4 allows gathering of trace events from multiple processors. These events are timestamped with a global clock operating with a resolution of 100 ns. A powerful toolset [Mohr 91] provides trace analysis and visualization.

5.2 Measurements
Because we wanted to see the effects of pipelining and parallel execution of the protocol, we measured the time spent in the various parts of the device driver, IP, TCP and the socket layer. To judge the performance of our implementation we measured the number of TCP segments the implementation can handle per second. Given the segment size, the expected maximum throughput can easily be calculated.
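
For example, at the more than 3300 segments per second reported for the second scenario with 4096-byte segments, the corresponding data rate is roughly 3300 × 4096 bytes × 8 bits ≈ 108 Mb/s.
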
                                             µs/Segment    µs/32-bit word
Process (Procedure) on Receiver
  tcp_recv                                      235             -
  user_task (socket recv/copy)                   31            0.545
  ip_intrsvc                                      9             -
  ip_demux                                       23             -
Process (Procedure) on Transmitter
  write                                          30            0.545
  tcp_snd_data                                  147             -
  ip_send                                        23             -
  driver_send                                    17            0.27
Access to Shared Memory (poke call)              18.6          2.4

Table 1: Measured Execution Times

Table 1 lists the execution times of the major processes of our implementation. We used segments of 4096 bytes in these measurements. The times are reported for the first test scenario. The execution times per segment are approximately 40% lower for the second scenario because of reduced contention for accesses to the shared memory. The times per 32-bit word for user_task and write

The prototype PPE interface to the workstation (IBM RISC System/6000) allows a copy throughput of only 33 Mb/s⁷. If the application were to be executed on the workstation, all copying would be done from the workstation's processor, and if we assume code similar to the second test scenario running on the PPE, then the limited copy throughput rather than the protocol processing will be the bottleneck and we should expect the performance of the integrated system to be around 30 Mb/s.

6. CONCLUSIONS

Our measurements show that a full implementation of TCP/IP on the PPE can cope with data rates in the range of 100 Mb/s. The throughput is much higher than the bandwidth of our hardware interface to the workstation.

It turns out, however, that using a total of four processors, two for IP and two for TCP, offers only very little improvement over a two-processor solution, because of the vastly different processing requirements in the two protocol layers. For full duplex traffic, however, the split onto a receiver and a transmitter processor improves protocol performance by a factor of 1.7. Partitioning protocols to obtain even load and linear speedup is a hard problem, in particular for protocols which clearly were not designed with parallel execution in mind. [Zitterbart 91], for example, reports even poorer speedup factors. With an 8-transputer implementation of the OSI CLNP she only achieves a performance increase of 3.73 over the single-processor version.

Having used a DOS implementation of TCP/IP as the basis for our parallel implementation was a sound decision. Our implementation runs efficiently when one compares it with other transport protocol implementations. For example, Zitterbart describes a parallel implementation of OSI TP4 written for a system of 8 transputers which was able to process 460 PDUs/s [Zitterbart 91]. In [Braun 91] a parallel implementation of XTP is described; there the performance is 1330 PDUs/s.

Once new, faster processors, such as the 100 MIPS T9000 transputer, become available, the gains for pipelined execution of protocols will have to be reevaluated. While the T9000 will be 10 times as fast as the T425, the delays for interprocessor communication will not have shrunk by the same factor. Therefore the relative overhead for pipelining the protocol execution within a layer and even between layers will grow. We claim, however, that the parallel execution of transmit and receive functions is still a suitable form of parallelism to increase protocol throughput. Distributed shared memory, implemented with transputer links, easily allows protocol state information to be shared between the two sides of the adapter and impacts the performance of the transport protocol much less than expected. First evaluations of a new architecture, which is based on two T9000s supported by dedicated hardware for checksumming and extraction of header information, indicate a performance of over 30000 TCP segments/s.

⁷ The reason why this interface is so slow is that the clocks on the workstation and the PPE run asynchronously. When arbitrating an access from the Micro Channel to the shared memory on the PPE, we are forced to use the Micro Channel's Asynchronous Extended cycle [IBM 90] of at least 300 ns. This cycle then may even need to be extended by up to 487 ns to match it with the appropriate access cycle of the PPE shared memory. In a new design for the Micro Channel interface this problem would be addressed by buffering in the interface, which would allow write-behind and read-ahead. For consecutive accesses, the arbitration cycle for the next word access to the shared memory could then be overlapped with the current word access cycle, thus being able to use regular Micro Channel cycles of 200 ns and consequently increasing the throughput to more than 50 Mb/s. A busmaster interface using the Micro Channel's streaming mode would allow even higher throughput.

Our measurements are in line with Clark's observation [Clark 89] that the actual protocol processing is not the reason for poor protocol performance. In the PPE, buffer copying and management cost twice as much as the protocol processing. The second scenario shows how throughput can be tripled if the user data were copied by the workstation processor overlapped with the protocol execution on the PPE. In a future design of the PPE, we will concentrate on improving the interface to the shared memory for the protocol processors⁸ and the workstation.

We also plan to work on the design of efficient software interfaces between our subsystem and the host system. As can be seen from results published for the Nectar CAB and our own work, crossing the software interface between the host processor and the communication subsystem is a costly operation. Many researchers who advocate the offloading of protocol functions into a dedicated subsystem ignore this issue. For our TCP/IP implementation only a host API based on sockets will be acceptable, as this interface has become the de-facto standard. These sockets must be lightweight enough to provide efficient pipelined execution between the communication subsystem and the host processor to exploit the full power of the PPE.

⁸ In the PPE a shared memory cycle of the transputer is twice a local memory cycle.

7. REFERENCES

[Arnould 89] Arnould, E. A., Bitz, F. J., Cooper, E. C., Kung, H. T., Sansom, R. D., Steenkiste, P. A., The Design of Nectar: A Network Backplane for Heterogeneous Multicomputers, Proceedings of ASPLOS-III, pp. 205-216, April 1989.

[Braun 91] Braun, T., Zitterbart, M., A Parallel Implementation of XTP on Transputers, Proc. 16th Annual Conf. on Local Computer Networks, Minneapolis, Oct. 1991.

[Chesson 87] Chesson, G., The Protocol Engine Project, Unix Review, Vol. 5, No. 9, Sept. 1987, pp. 70-77.

[Cheriton 88] Cheriton, D.R., VMTP: Versatile Message Transaction Protocol - Protocol Specification, Network Working Group, Request For Comments, RFC 1045, February 1988.

[Clark 89] Clark, D., Lambert, M.L., Romkey, J., Salwen, H., An Analysis of the TCP Processing Overhead, IEEE Communications Magazine, Vol. 27, No. 6 (June 1989), pp. 23-29.

[Clark 90] Clark, D., Tennenhouse, D., Architectural Considerations for a New Generation of Protocols, Proceedings of the SIGCOMM '90 Symposium, Sept. 1990, pp. 200-208.

[Dauphin 91] Dauphin, P., Hofmann, R., Klar, R., Mohr, B., Quick, A., Siegle, M., Sötz, F., ZM4/SIMPLE: A General Approach to Performance-Measurement and -Evaluation of Distributed Systems, Technical Report 1/91, Erlangen, January 1991.

[Hoare 78] Hoare, C.A.R., Communicating Sequential Processes, Communications of the ACM, Vol. 21, No. 8, August 1978, pp. 666-677.

[IBM 90] IBM RISC System/6000 POWERstation and POWERserver Hardware Technical Reference - Micro Channel Architecture, 1990.

[INMOS 89] Inmos Limited, The Transputer Databook, First Ed. 1989, Document No. 72 TRN 20300, pp. 23-43 and 113-179.

[Kaiserswerth 91] Kaiserswerth, M., A Parallel Implementation of the ISO 8802.2-2 LLC Protocol, IEEE Tricomm '91 - Communications for Distributed Applications and Systems, Chapel Hill NC, April 17-19, 1991.

[Kaiserswerth 92] Kaiserswerth, M., The Parallel Protocol Engine, IBM Research Report, RZ 2298 (#77818), March 1992.

[Kanakia 88] Kanakia, H., Cheriton, D.R., The VMP Network Adapter Board (NAB): High Performance Network Communication on Multiprocessors, ACM SIGCOMM '88, pp. 175-187.

[Lumley 92] Lumley, J., A High-Throughput Network Interface to a RISC Workstation, Proceedings of the IEEE Workshop on the Architecture and Implementation of High Performance Communication Subsystems, Tucson, AZ, Feb. 17-19, 1992.

[LS-C 89] Logical Systems, Transputer Toolset, Version 88.4, Feb. 1989.

[Mohr 91] Mohr, B., SIMPLE: A Performance Evaluation Tool Environment for Parallel and Distributed Systems, in A. Bode, Editor, Distributed Memory Computing, 2nd European Conference, EDMCC2, pp. 80-89, Munich, Germany, April 1991, Springer Verlag Berlin LNCS 487.

[PEI 92] Protocol Engines Incorporated, XTP Protocol Definition, Revision 3.6, Edited by Protocol Engines, Mountain View, CA, January 11, 1992.

[Pountain 88] Pountain, D., May, D., A Tutorial on OCCAM2, BSP Professional Books, London 1988.

[UM 90] IBM Corporation, University of Maryland, Network Communications Package, Milford 1990.

[Wicki 90] Wicki, T., A Multiprocessor-Based Controller Architecture for High-Speed Communication Protocol Processing, Doctoral Thesis, IBM Research Report, RZ 2053 (#72078), Vol. 6, 1990.

[Zitterbart 91] Zitterbart, M., Funktionsbezogene Parallelität in transportorientierten Kommunikationsprotokollen, Dissertation, VDI-Reihe 10 Nr. 183, Düsseldorf: VDI-Verlag 1991.