Declaration from University of California, Berkeley

RE: Y.H. Thia and C.M. Woodside, “A Reduced Operation Protocol Engine (ROPE) for a multiple-layer bypass architecture,” Protocol for High Speed Networks IV, 1st Edition (TJ Press Ltd. 1995), pages 224-239.

I, Lisa Rowlison de Ortiz, declare:

1. I am the Head of Catalog & Metadata Services at University of California, Berkeley (“UC Berkeley”) library. I am familiar with the UC Berkeley library system, including the library catalog and policies and procedures regarding the receipt, indexing, and availability of books and periodicals.

2. According to UC Berkeley Library policies and procedures, Library items are indexed in the library catalog and are made freely available to the faculty and student body of UC Berkeley as well as to the general public.

3. The UC Berkeley library holds a copy of a chapter by Y.H. Thia and C.M. Woodside, “A Reduced Operation Protocol Engine (ROPE) for a multiple-layer bypass architecture,” published in the book Protocol for High Speed Networks IV, 1st Edition (TJ Press Ltd. 1995), pages 224-239. (“Thia”).

4. When a monograph is received and cataloged by the UC Berkeley Library, the date of cataloging is set and retained in the catalog record. The catalog date (“Cat Date”) for Thia is February 26, 1996 (see Exhibit A). Furthermore, after the volume is labeled and sent to its shelving location the date of receipt by this shelving location is stored in an internal note. This information shows that the volume was received by the Engineering Library on March 20, 1996. Id. The volume would have been available to the public within a few days of that date.

DELL Ex.1064.001
I declare under the penalty of perjury that I understand that willful false statements and the like are punishable by fine or imprisonment, or both under Section 1001 of Title 18 of the United States Code and may jeopardize the validity of the application or any patent issuing thereon. I declare that all statements made of my knowledge are true, and that all statements made on information and belief are believed to be true.

Executed on January 27, 2017, in Berkeley, California.

Lisa Rowlison de Ortiz

DELL Ex.1064.002
Protocols for High Speed Networks IV

Edited by
Gerald Neufeld and Mabo Ito
Department of Computer Science
University of British Columbia
Vancouver
Canada

Sponsored by IFIP WG6.1/WG6.4 in cooperation with IEEE ComSoc
Published by Chapman & Hall on behalf of the International Federation for Information Processing (IFIP)

CHAPMAN & HALL
London • Glasgow • Weinheim • New York • Tokyo • Melbourne • Madras

DELL Ex.1064.003
vi Contents

PART FIVE Protocols
11 The design of BTOP - an ATM bulk transfer protocol  171
   L. Casey
12 High performance presentation and transport mechanisms for integrated communication subsystems  189
   W.S. Dabbous
13 PATROCLOS: a flexible and high-performance transport subsystem  205
   T. Braun
14 A reduced operation protocol engine (ROPE) for a multiple-layer bypass architecture  224
   Y.H. Thia and C.M. Woodside

PART SIX Implementation and Performance
15 Deadlock situations in TCP over ATM  243
   K. Moldeklev and P. Gunningberg
16 A guaranteed-rate channel allocation scheme and its application to delivery-on-demand of continuous media data  260
   T. Kameda, J. Ting and D. Fracchia
17 A hybrid deposit model for low overhead communication in high speed LANs  276
   R.B. Osborne

PART SEVEN Posters
18 A multimedia document distribution system over DQDB MANs  295
   L. Orozco-Barbosa and M. Soto
19 From SDL specifications to optimized parallel protocol implementations  308
   S. Leue and P. Oechslin
20 Partial-frame retransmission scheme for data communication error recovery in B-ISDN  328
   I. Inoue and N. Morita
21 Protocol parallelization  349
   Joseph D. Touch

Index of contributors  361
Keyword index  362

PREFACE

Welcome to the fourth IFIP workshop on protocols for high speed networks in Vancouver. This workshop follows three very successful workshops held in Zurich (1989), Palo Alto (1990) and Stockholm (1993) respectively. We received a large number of papers in response to our call for contributions. This year, forty papers were received of which sixteen were presented as full papers and four were presented as poster papers. Although we received many excellent papers the program committee decided to keep the number of full presentations low in order to accommodate more discussion in keeping with the format of a workshop.

Many people have contributed to the success of this workshop including the members of the program committee who, with the additional reviewers, helped make the selection of the papers. We are thankful to all the authors of the papers that were submitted. We also thank several organizations which have contributed financially to this workshop, specially NSERC, ASI, CICSR, UBC, MPR Teltech and Newbridge Networks.

Mabo Ito
Gerald Neufeld
Departments of Electrical Engineering and Computer Science
University of British Columbia

DELL Ex.1064.004
14
A Reduced Operation Protocol Engine (ROPE) for a multiple-layer bypass architecture

Y.H. Thia(*)¹ and C.M. Woodside(**)
(*) Newbridge Networks Inc., Ottawa, Canada and
(**) Dept. of Systems and Computer Engineering, Carleton University, Ottawa, Canada

Abstract - The Reduced Operation Protocol Engine (ROPE) presented here offloads critical functions of a multiple-layer protocol stack, based on the "bypass concept" of a fast path for data transfer. The motivation for identifying this separate processing path is that it involves only a small subset of the complete protocol, which can then be implemented in hardware. Multiple-layer bypass also eliminates some inter-layer operations such as queue and buffer management, context switching and movement of data across layers, all of which are a significant overhead. ROPE is intended to support high-speed bulk data transfer. The paper describes the design of a ROPE chip for the OSI Session and Transport layer protocols, using VHDL. The design is practical in terms of chip complexity and area, using current gate array technology, and simulation shows that it can support a data rate approaching 1 gigabit per second, in a connection attached to an end-system.

Keyword codes: C.2.2, B.4.1
Keywords: Network Protocols, Data Communications Devices

1 Introduction

The advent of Fibre Optic technology, which offers high bandwidth and low bit error rates, has shifted the performance bottleneck from the communications channel to the communications processing in the end-points of the system [26]. Other trends such as improved quality-of-service guarantees will reinforce this effect. The heavy processing load is due to a combination of operating system overhead, protocol complexity, and per-octet processing on the data stream. To alleviate the end-system bottleneck one may consider new protocols [10], improved software implementation of existing protocols [5,35], parallel processing techniques [14,21,38], special protocol structures [15,30] and hardware assist [22] by offloading all or part of the protocol functions to an adaptor. This paper takes the latter approach.

The key problems associated with offboard processing include:

• Partitioning the functionality between the host and the adaptor is difficult and may easily lead to a complex additional protocol between the two parts, which may cancel out or offset the potential gain from offloading. For example, the buffer management task [36] may be offloaded, but this leaves the problem of control for accessing it within the full protocol logic.

¹ This research was done while Dr. Thia was at Carleton University.

225 ROPE for a multiple-layer bypass architecture

• Non-protocol-specific processing is a large part of the total load, as shown in [35]. Examples include interrupt handling, context switching and data copying at layer boundaries in deeply layered protocol stacks.

• The choice of hardware for the adaptor depends on the complexity of the functions it supports. In [2,22] where the transport protocol layer is offloaded or in [7] where the full protocol stack can be offloaded, general purpose microprocessors are used. Probably because of the complexity of existing protocols, VLSI [24] implementation above the data link layer has been disappointing so far. In [8], dedicated VLSI chips are used to support TCP checksums. Also, some newer lightweight transport protocols are specially designed for VLSI implementation [1,3].

• There is a tradeoff between performance, flexibility and cost. If the key functions of the frequently executed portion of the protocol remain relatively stable, there can be significant advantage in providing hardware support for these functions leaving the other tasks in the host software for flexibility.

• As host processing speed continues to outpace memory bandwidth and as the network bandwidth approaches the processor memory bandwidth, it is important to keep data movement on the workstations down to the minimum [4,9,28].

This paper presents a feasibility study for a new approach to hardware assistance. It combines the relatively simple operations needed for data transfer across multiple layers and provides a hardware "fast path" for them, which will be efficient for bulk data transfer. It is based on the "protocol bypass concept" [37] which is a generalization of Jacobson's "Header Prediction" algorithm [20] for TCP/IP. Bypass solves the problems identified above, which may limit the use of offboard processing, by implementing an entire service through all layers for certain cases. This simplifies the interface between the host and the adaptor chip and minimizes their interaction, which is supported by an access test, some DMA processing and a simple command protocol. The chip design based on bypassing is called ROPE, for Reduced Operation Protocol Engine. The contribution of this paper is to define the host/chip interface and the chip operation, and to report on a VHDL-based feasibility study of the chip design. It appears to be feasible to support an end-system single-connection data rate approaching 1 Gbps.

The next section introduces the bypass concept, its architecture and implementation. Section 3 analyzes the key protocol processing overheads and discusses the requirements for a bypass VLSI implementation. Sections 4, 5 and 6 describe a design study of a ROPE chip using the industry standard hardware description language, VHDL, with conclusions in Section 7.

2 The Bypass Concept

A bypass adds an additional path for certain operations, with minimal changes to the original software. Conformance to the protocol is maintained by doing all the other operations through the normal "heavyweight" path. A bypass path can be provided for send, for receive, or for both together, and is compatible with other end-systems implemented without a bypass.

DELL Ex.1064.005
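As a rough illustration of the bypass idea, and not code taken from the paper, the sketch below routes each PDU either to a fast path or to the normal "heavyweight" path. The `Pdu` fields, the `is_bypassable` predicate and both handler functions are hypothetical names chosen for this example; a real bypass test would inspect actual protocol header fields.

```python
from dataclasses import dataclass


@dataclass
class Pdu:
    kind: str                 # illustrative label, e.g. "DATA" or "RELEASE"
    in_data_transfer: bool    # True while the connection is in its data transfer phase
    payload: bytes = b""


def is_bypassable(pdu: Pdu) -> bool:
    """Hypothetical bypass test: only data PDUs in the data transfer phase qualify."""
    return pdu.kind == "DATA" and pdu.in_data_transfer


def fast_path(pdu: Pdu) -> tuple:
    # Stand-in for the reduced set of operations performed on the bypass path.
    return ("bypass", len(pdu.payload))


def heavyweight_path(pdu: Pdu) -> tuple:
    # Stand-in for the unchanged, full protocol software.
    return ("heavyweight", len(pdu.payload))


def dispatch(pdu: Pdu) -> tuple:
    """Send the PDU down the fast path when the test passes, otherwise the normal path."""
    return fast_path(pdu) if is_bypassable(pdu) else heavyweight_path(pdu)
```

Because every PDU that fails the test still goes through the unchanged software, the peer end-system cannot tell whether a bypass is present, which is the compatibility property stated above.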

226 Part Five Protocols

[Figure 1 Bypass Architecture]

2.1 Bypass Architecture

Figure 1 illustrates the architecture of a bypass implementation for any standard protocol. The standard protocol stack (SPS) is the processing path taken by all PDUs during a connection without the bypass. The SPS may refer to a single layer or to multiple adjoining layers of a layered protocol stack. The bypass has 4 key components:
• Send Bypass Test;
• Receive Bypass Test;
• Bypass Stack;
• Shared data for access by the two tests, the Bypass stack and the SPS.

The send bypass test identifies outgoing packets that are data packets in the data transfer phase. The receive bypass test matches the incoming PDU headers with a template that identifies the predicted bypassable headers. The bypass stack performs all the relevant protocol processing in the data transfer phase. The shared data are used to maintain state consistency between the SPS and the bypass stack, including window flow control parameters and connection identifiers. Whenever there is a change in the processing path between the SPS and the bypass stack, checks are performed to ensure that there are no outstanding packets in the current path, i.e. "no in-transit PDUs", before the change is made. A more detailed discussion on this is presented in another paper [33].

227

2.2 Efficient logic for the bypass test

The "no-in-transit PDU" test can often be avoided. At the beginning of data transfer on a new connection, it is automatically satisfied. It holds as long as no packet fails a bypass test, and it is sufficient to maintain a flag to indicate this. Once a packet fails, and goes to the SPS, then a full "no-in-transit PDU" test must be performed for each packet until the test succeeds, after which control can go back to the flag. Token management and synchronization points of the session layer [19,18] which are mapped by equivalent application and presentation service primitives can be used as synchronization points within the bypass architecture, as it switches between the SPS and the bypass stack. Since they are only inserted periodically in bulk data transfer and can be controlled by the application process, these overheads are not excessive.

2.3 Multiple-layer bypass

A bypass for multiple layers instead of just one gives additional gains by avoiding:

• Overhead of encoding and decoding the interface control information passed between layers;

• Executing the full general protocol logic for the layers to decide how to manipulate the data;

• Queueing of data at layer boundaries.

The advantage is increased further in cases where some layers, like the network and application layers, have been further subdivided into sublayers.

A multiple-layer bypass path is a concatenation of processing procedures performed by the adjacent layers when they are simultaneously in the data transfer phase. Meanwhile, the separate layers in the SPS path handle the other phases.

In summary, the separation of the bypass path offers the following advantages:

• The processing path of data PDUs can be optimized;
• The number of possible PDU formats in the bypass path is reduced to data transfer PDUs;
• The finite state machine of the protocol is now reduced to only the "OPEN" state, for as long as processing remains in the bypass path. The state of the system does not change during the entire data transfer phase and the protocol processing is reduced to ensuring reliable transfer of data across the communications network.

3 Design Considerations for a Hardware Bypass

3.1 Factors affecting system performance

This section summarizes the major factors affecting throughput performance in a deeply layered stack on an end system. It follows the description by Heatley and Stokesberry [16]. Protocol procedures can be characterized as per-octet, per-packet or per-group-of-packets operations. The per-octet operations take an average time A per octet (e.g., checksum), and the per-packet procedures take time B per packet (e.g. address decoding and multiplexing). Per-group-of-packets operations include for example transmission of acknowledgments, whose frequency is implementation-dependent, and their timing will be aggregated and included in parameter B.

DELL Ex.1064.006
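To make the roles of A and B concrete, the sketch below assumes a simple additive cost model: a user message of x octets, carried in PDUs of at most M octets, costs A·x for the per-octet work plus B for each of its ⌈x/M⌉ packets. The function names and the numeric values in the usage note are illustrative assumptions, not figures from the paper.

```python
import math


def processing_time(x: int, A: float, B: float, M: int) -> float:
    """Protocol processing time (seconds) for a message of x octets.

    A is the per-octet time, B the per-packet time (with per-group costs
    folded in), and M the maximum PDU size in octets.
    """
    if x <= 0 or M <= 0:
        raise ValueError("x and M must be positive")
    packets = math.ceil(x / M)          # number of PDUs needed for the message
    return A * x + B * packets


def throughput_bound(x: int, A: float, B: float, M: int) -> float:
    """Upper bound on throughput (octets per second) from processing alone."""
    return x / processing_time(x, A, B, M)


def bulk_throughput_bound(A: float, B: float, M: int) -> float:
    """Limit of throughput_bound as the message size x grows large."""
    return 1.0 / (A + B / M)
```

With made-up values such as `A = 0.01`, `B = 2.0` and `M = 1000`, `throughput_bound(1500, A, B, M)` is lower than `bulk_throughput_bound(A, B, M)` because the second, half-empty packet still pays the full per-packet cost. Under this model a larger M shrinks only the B/M term, so the per-octet time A remains as a floor on the cost per octet.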

228 Part Five Protocols

The throughput bound imposed by protocol processing alone, $\lambda_m$, in octets per second, is then given by the equation:

\[ \frac{1}{\lambda_m} = A + \frac{B}{x}\left\lceil \frac{x}{M} \right\rceil \qquad (3.1.1) \]

where $x$ is the size of the user message in octets and $M$ is the maximum PDU size. In bulk data transfer, as $x$ becomes large,

\[ \frac{1}{\lambda_m} = A + \frac{B}{M} \qquad (3.1.2) \]

The protocol processing load on an end system is typically shared between the host and the network adaptor. As the raw data bit rate supported by optical networks approaches the main memory bandwidth of the end system, the cost of moving data and of per-octet processing limits the effective throughput presented to the application process, especially for bulk data transfer. The data portion of a PDU may be physically moved for the following reasons:

• Copying between the adaptor buffer and the host system memory;
• Crossing protection domains (address spaces), e.g. at the user/kernel boundary. This problem becomes more pronounced in microkernels which treats a protocol task as a server process outside the kernel domain [11];
• Per-octet processing like presentation conversion and checksum routines.

Hardware implementation is particularly efficient for per-octet operations.

3.2 Requirements of a bypass VLSI implementation

The problems associated with separate or offboard processing, which were discussed in the introduction, are addressed by the VLSI design as follows:

• A clean separation of functionality requiring only a simple protocol to communicate between the host and adaptor is desired, and is provided by a bypass. Its particular set of functions are complete in themselves and have a focussed interface with the host software at the packet entry point. There is relatively infrequent switching between the SPS and the bypass stack;

• Reduced non-protocol-specific processing overhead. For example the processing of acknowledgment packets is dominated by interrupt handling, typically a few hundred instructions, rather than by the protocol processing itself. Our approach removes acknowledgment handling altogether from the host. Also, the bypass system can be extended to incorporate multiple-layer stacks and remove overhead that way;

• VLSI implementation complexity: only the data transfer functions are implemented.
• Tradeoff between performance, flexibility and cost. The host software processes non-data-transfer packets which are typically small but require more flexible and more complex processing. They are also only per-packet, so they have less impact overall. They can be efficiently handled by the host given the projected increase in host processing speed [17]. The operations implemented in VLSI are understood to have less need of flexibility.

229

[Table 1 Bypassable versus Non-bypassable functions. The table body is illegible in the scan; legible row labels include Encoding, Compression, Token management, Checksum (Optional), Timer Management, Header Construction and Buffer Management.]

• Remove operations that are inefficient on the host. It is important to look at both the processing requirements and data movement of the Application PDUs. The critical resource is often the host bus/memory path, and caching is used to increase its efficiency. However long traverses through the data for per-octet operations like checksum and encoding interfere with efficient caching, so it is particularly appropriate to offload them.

Table 1 identifies procedures which are strong candidates for implementation in the bypass chip, and those which are better handled by the host, during the data transfer phase. Besides these, the send bypass test is done on the host and the receive bypass test is done on the Network Interface Adaptor.

4 VLSI implementation of a Reduced Operation Protocol Engine (ROPE) chip using VHDL

The VHSIC Hardware Description Language (VHDL) [6,27] was used to model and synthesize the chip. VHDL is an industry standard language which can be used to represent all levels of abstraction, from logic gates to the system level. By utilizing VHDL as a specification tool, it is possible to begin simulation and debugging of complex systems

DELL Ex.1064.007
230 Part Five Protocols

before details regarding the implementation are fully specified. It also offers the potential of an automatic path from the protocol specification to VLSI implementation, in which any modifications to the specification can be easily propagated to the gate level design.

4.2 Architectural Description

Figure 2 shows the block diagram of the system. The host processor and NIA components provide logical interfaces for simulation of behaviour, but they insert no timing delays. For modeling purposes they were described as being infinitely fast, either as a source or as a sink. This places the maximum stress on the ROPE chip. The architectural considerations involved in the chip design can be summarized as follows:

• Movement of data across the host bus interface are minimized by using an on-chip DMA for fast block data transfer to/from the host system memory.

• On-chip dual-ported memory is used, rather than the host memory, to avoid critical constraints on bus access latency and throughput.

• The control registers of the bypass chip are I/O mapped to the host processor. This enables the host processor to configure the bypass chip directly.

The presentation module shown in the Figure was allowed for in the data structures but was not fully designed.

[Figure 2 Block diagram of VLSI bypass system]

231

4.3 First Design: Design Steps

Figure 3 shows the steps followed in this study. There were three stages, a behavioural model, a structural or RTL model, and a gate level design. These gave us two kinds of feasibility check, that the logic we specified will execute the protocol within the environment we envisage, and that the design is technically feasible, for instance in a reasonable chip area.

A VHDL behavioral model for the system was initially written and tested. It includes the bypass chip, the host processor with a simplified bus/memory architecture, and a vestigial high-speed network interface adapter which acts simply as an infinite source/sink for data packets. The first design implemented the Basic Combined Subset (BCS) of the session layer and the Transport protocol class 2 (TP2) protocols. Only the logic for the data transfer phase is included in this model, as it is during this phase that the bypass chip is active. A host processor submodel generates a simple test sequence that initiates bypass connections and a series of bypassable data packets. Non-bypassable packets are also inserted to simulate for example the arrival of a connection release PDU. The test sequence allowed us to verify the following:

• Window flow control logic.
• Generation of the header field including the sequence number.
• Generation of acknowledgment packets on receive.
• Synchronization between the SPS and the bypass stack.

Next, the behavioural model of the bypass chip was manually converted to a structural (RTL) model for synthesis, leaving the other components as behavioral constructs. A behavioral description has no implied architecture in its representation, while in RTL level description has a definite architecture and clocking scheme, and characterizes the system in terms of registers, switches (multiplexors), and operations. An initial assignment of operations into clock cycles is also made. These descriptions, like behavioral descriptions, are technology-independent (silicon library independent). In a VHDL RTL level description, not all of the functionality of VHDL can be used because some of the language features do not map into the RTL model. For example a WAIT FOR statement has no meaning in hardware, but is useful in behavioral simulation. Features that can be easily mapped to hardware include IF THEN ELSE statements and signal assignments. The same test sequence was again generated by the host processor, and its results were compared with the original model to ensure behavioral consistency.

The structural model was then passed through the SYNOPSYS synthesis tool [31] for gate level generation with the 0.8 µm BiCMOS macro library from Texas Instruments [32]. The timing information generated by this process was back-annotated to the structural model to obtain the simulation throughput results presented in section 5. This was sufficient for our present purposes, to estimate the space complexity and timing of the chip, and the final step of generating a chip layout for fabrication and fault analysis was not performed. As the model was too large for the SYNOPSYS package we were using, it was divided into 3 submodels. The dual-ported SRAM was not synthesized as its gate counts and performance characteristics can be easily extracted from data books.

The second design, with additional functionality, is described in the next section.

DELL Ex.1064.008
232 Part Five Protocols

[Figure 3 Design flow diagram. Legible label: "Timings back-annotated to obtain throughput results".]

4.4 Behavioural description

The sequence of operation in the bypass system is summarized as follows:

1) Four high level procedures are made available to the host processor to control the bypass chip, namely: BYPASS_START, BYPASS_DMA, BYPASS_SYNC and BYPASS_RESTART. On receiving the first bypassable PDU, the BYPASS_START procedure is called. This procedure sets up a bypassable connection by sending information like its initial window flow control parameters and the DST_REF field to the bypass chip. These are stored in a process control block for the particular connection in fixed on-chip memory locations and are also accessible by the host (I/O mapped). The maximum number of connections allowed for simultaneous bypassing will be equal to the number of these control blocks allocated in the bypass chip (5 in this study, see Figure 4).

2) For subsequent bypassable packets, the host processor initiates the BYPASS_DMA procedure which checks for free buffer space in the bypass chip and programs the DMA by sending the starting address pointer where the PDU is located, and its total length. The destination address is supplied by the bypass chip. Arbitration for the host processor bus between the host and DMA is provided by the DMAreq and DMAack lines. DMA transfers the PDU into the internal dual-ported SRAM (Static RAM). Buffers are pre-allocated in fixed sizes and are accessed by a simple round robin scheme using a set of buffer pointers.

3) The protocol engine polls the status field of a packet to check if there are any data to be processed. If the status is FILLED, protocol processing of the bypass stack can proceed. The dual-ported structure allows protocol processing to proceed concurrently with any DMA transfer to/from the host processor. A precomputed header template is used to construct the header field very quickly.

4) Processed in-sequence packets within the transmit window are passed to the network interface adapter, which in this study acted as an instantaneous sink.

233

5) Whenever the host processor encounters a switch in the processing path, i.e. from the bypass stack to the SPS, it will issue a BYPASS_SYNC procedure. This procedure will flush the bypass chip of any "in-transit PDU" for that particular connection and return any updated information, for example window control parameters, from the bypass chip to the host in order to maintain state consistency between the two paths. This may occur when it receives, for example, a connection release primitive or a session control primitive of the session layer (see section 6) during the data transfer phase.

6) Whenever the host processor wishes to re-enter the bypass path after a switch in the processing path, the host will issue a BYPASS_RESTART procedure which will pass only those data that were updated in the standard protocol stack, like window flow control parameters, back to the bypass chip. Parameters like the DST-REF field which is not changed for the duration of the connection need not be updated.

4.5 Second Design, including major procedures for Transport Class 4 (Implemented)

This section describes extensions to the first design, which only supports Session BCS and TP2 functionality, to include some common TP4 functionality. Procedures for checksum, retransmission on timeout and resequencing were implemented. Extensions to the Session layer functionality and procedures for presentation layer conversion were not implemented, but are also discussed in section 6.

4.5.1 OSI Checksum

The transport protocol class 4 checksum algorithm [12] was implemented. It is often difficult to perform it on the fly at the sender end as the two-byte checksum field is placed in the variable part of the header. However, with the bypass system, the header structure and the position of the checksum field are known in advance. Also, calculation of the checksum can now be simplified further by precomputing the partial checksum of the header fields.

4.5.2 Timers

During the data transfer phase TP4 uses a Retransmission timer (T1), a Window timer (W) and an Inactivity timer (I). Only the retransmission timer with one interval per connection and the window timer were implemented here.

Timer management in software is an expensive process due to software interrupt handling overhead and update processing of the timer queue [34], but can be easily handled by VLSI implementation. On-chip timers are very efficient and can be executed concurrently with other protocol processes. The only overhead is in starting and stopping the timers. Once it is started, a timer is an autonomous process until an interrupt signal is activated. A separate area of on-chip memory is reserved to store the state information of the timer list.

4.5.3 Retransmission and Resequencing

At the receiver end, out-of-sequence PDUs outside the flow-control window will be discarded. Otherwise, a PDU is buffered for resequencing. Duplicate TPDUs can be detected

DELL Ex.1064.009
234 Part Five Protocols

easily because the sequence number matches that of a previously received TPDU. At the sender end, if timer T1 expires, the transport entity can retransmit either the first TPDU, or all TPDUs (Go-back-N) waiting for acknowledgment. In this design the Go-back-N retransmission strategy was used. For a large window, the on-chip buffer may not be sufficient to hold the unacknowledged data packets for retransmission or to buffer data packets for resequencing, and slower external memory would be needed.

[Figure 4 Organization of internal bypass chip memory. The memory is 32 bits wide and contains, in order: DMA Start Address; DMA Length; Bypass chip FULL; Host Tag; Block Address; Control Block 1 through Control Block N, each holding Sequence Number, Lower Window, Upper Window, DST-REF field, Class, Format Type, Options, Presentation Context_ID, Header Pointer and Data Pointer; then, per buffer, STATUS, Block Address Pointer, Reserved, Protocol Header and Reserved Buffer Space for Data.
Host Tag: This tag is set on receipt of a host command, e.g. BYPASS_START, BYPASS_DMA, BYPASS_SYNC or BYPASS_RESTART.
STATUS: Indicates the status of the buffer, e.g. EMPTY, FILLING, FILLED or CLOSED.]

235

5 Results

Table 2 Throughput performance and gate count of bypass VLSI chip

Procedures | Combinational Area (Equivalent NAND2 gates) | Non-Combinational Area (Equivalent NAND2 gates) | Total Area (Equivalent NAND2 gates) | Throughput performance with 1 Kbyte packet length (Mbps)
Session (BCS) / Transport Class 2 | 6318 | 2929 | 9247 | 2,362.8
Dual Ported SRAM (4 Kbyte) | N/A | N/A | Approximately 41,984 | N/A
Session (BCS) / Transport Class 4 with the addition of the Checksum Procedure with timer circuitries | 8334 | 4227 | 12561 | 313.3

The timing information obtained from the netlist was back-annotated to the structural model to obtain throughput performance results. The operating parameters and assumptions made in this study are:

• A chip clock rate of 66 MHz.
• 1 Kbyte data packets
• Window size of 64. An acknowledgment packet is sent for every 20 packets received.
• The host bus/memory subsystem and network interface adapter were assumed to be infinite sinks/source of data packets.
• One thousand packets were processed for each iteration.
• 4 Kbyte of internal dual ported SRAM.

Table 2 shows the throughput performance of the ROPE chip. The throughput value includes the time taken to move the data packet out from the internal memory of the bypass chip to the network interface adapter, but not the data copy operation from the host memory. Hence the throughput result includes just one copy operation of the data packet. The total gate count for the bypass chip with Session (BCS), TP2 and 4 Kbyte internal dual ported SRAM is 51,231 equivalent NAND2 gates. With the additional TP4 procedures like checksum and timer circuitries, the total gate count increased to 54,545 gates. Texas Instruments offers a

DELL Ex.1064.010
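The reported figures can be cross-checked with a little arithmetic. The sketch below assumes that 1 Kbyte means 1024 octets and that Mbps means 10^6 bits per second, neither of which is stated explicitly in the text. The constant and function names are chosen for this example.

```python
CLOCK_HZ = 66e6              # chip clock rate from the operating parameters
PACKET_BITS = 1024 * 8       # assumption: 1 Kbyte = 1024 octets

SRAM_GATES = 41_984          # approximate gate count of the 4 Kbyte SRAM
TOTAL_GATES_TP2 = 51_231     # Session (BCS) + TP2 + SRAM
TOTAL_GATES_TP4 = 54_545     # with the added TP4 checksum and timer circuitry


def packet_time_s(throughput_mbps: float) -> float:
    """Time to process one packet at the given throughput (assumes Mbps = 1e6 bit/s)."""
    return PACKET_BITS / (throughput_mbps * 1e6)


def cycles_per_packet(throughput_mbps: float) -> float:
    """Clock cycles spent per packet at the 66 MHz clock."""
    return packet_time_s(throughput_mbps) * CLOCK_HZ


# Logic-only gate counts, obtained by subtracting the SRAM from each total.
logic_gates_tp2 = TOTAL_GATES_TP2 - SRAM_GATES
logic_gates_tp4 = TOTAL_GATES_TP4 - SRAM_GATES
```

Under these assumptions the first design spends a little under 230 clock cycles on each 1 Kbyte packet, while the second design spends several times that, which is consistent with the per-octet checksum dominating once TP4 functions are added. The subtraction also shows that the SRAM, not the protocol logic, accounts for most of the gate count.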

Presentation processing can definitely be bypass ... it consists only of the Presentation data encoding/decoding functions. Substantial performance gains could result if the presentation conversions are simple and are used consistently, although the inflexibility of a hardware version is an evident weakness. One possible application of ROPE with hardwired presentation conversion is in video servers with the proposed encoding standards such as MPEG [13].

7 Summary

It can be concluded from this study that it is feasible to implement the bypass stack (at least for the transport and session layers) in VLSI and that the performance would be at least an order of magnitude higher than software protocol processing. The bypass system offloads the critical protocol functions and the associated non-protocol-specific functions onto a "Reduced Operation Protocol Engine" (ROPE). The gate count for the bypass chip can easily fit into a commercially available gate array Integrated Circuit. Per-octet operations are particularly efficient when performed on the chip. The host processor is relieved of a significant proportion of protocol processing and can concentrate on the application processing. The speed of communication processing in the host system can now match the transmission bandwidth of high-speed networks, e.g. ATM technology, thereby increasing the application-to-application throughput performance. (In an ATM system we assume that the segmentation and reassembly or SAR operation would also be in hardware, since it is done frequently.) The host processor is also relieved of acknowledgment processing. An existing implementation of the OSI stack can be adapted for bypassing with only a small modification of the original software, thus providing an easy migration path for current systems.

The scope of functions included in a bypass may be narrowly defined, or more extended. A bypass does not include fast connection setup but also does not interfere with it. There is no segmentation/reassembly within the bypass path, but we do not see this as a major restriction, as research suggests that fragmentation of PDUs should be restricted only to the lower layers and should occur only once in the protocol stack [23]. The Segmentation and Reassembly sublayer of the ATM adaptation layer is a good place for such functions [25].

In the first design, the bypass chip with a 66 MHz clock can support a throughput rate of 2.3 Gbps (Session BCS and TP2) for 1 Kbyte TPDU packets using current 0.8 µm BiCMOS technology. In the second design, extended to include the TP4 checksum, retransmission on timeout and resequencing procedures, the throughput decreased to 313.3 Mbps. Further advances in speed can be obtained in proportion to technology improvements, making the approach viable for some considerable time to come.

Acknowledgments

Bell-Northern Research provided the VHDL tools used in this study, and Dr. Simon Curry, Hemi Thakar, Dr. Parviz Yousefpour, Bernard Doray, Mike Majid and Mustapha Bourahla helped with their use. This research was supported by the Ontario government program of Centers of Excellence, through the Telecom Software Methods Project of TRIO, the Telecommunications Research Institute of Ontario.

References

[1] Balraj T.S. and Yemini Y., "Putting the Transport Layer on VLSI - the PROMPT protocol chip". IFIP, Stockholm, May 13-15, 1992.

[2] Beach B., "UltraNet: An Architecture for Gigabit Networking," in Proc. 15th Conference on Local Computer Networks, Minnesota, Oct. 1990.

[3] Chesson G., "XTP/PE Design Considerations," in Proc. IFIP Workshop Protocols for High-Speed Networks, Zurich, May 9-11, pp. 27-33, 1989.

[4] Clark D. and Tennenhouse D., "Architectural Considerations for a New Generation of Protocols," in ACM SIGCOMM 1990.

[5] Clark D., Jacobson V., Romkey J., and Salwen H., "An analysis of TCP processing overhead," in IEEE Commun. Mag., vol. 27, pp. 23-29, June 1989.

[6] Coelho D.R., "The VHDL Handbook," Kluwer Academic Publishers, 1989.

[7] Cooper E.C., Steenkiste P.A., Sansom R.D. and Zill B.D., "Protocol Implementation on the Nectar Communication Processor," in ACM SIGCOMM '90, 1990.

DELL Ex.1064.011
