throbber
SquidandICP:Past,Present,andFuture
`DuaneWessels
`wessels@nlanr.net
`August, 
`Abstract
`HierarchicalWebcachinghasbecomeprevalentthroughouttheInternet.ThepopularSquidsoftware
`hasplayedanimportantrole.WepresentashorthistoryofSquidandtheInternetCacheProtocol,and
`o(cid:11)erourthoughtsregardingitsfutureevolution.
`
`Introduction
`EversincetheWorld-WideWebrosetopopularityaround ,muche(cid:11)orthasfocusedonreducinglatency
`experiencedbyusers.Sur(cid:12)ngtheWebcanbeslowformanyreasons.Serversystemsbecomeslowwhen
`overloaded,especiallywhenhotspotssuddenlyappear.Congestioncanalsooccuratnetworkexchange
`pointsoracrosslinks,andisespeciallyprevalentacrosstrans-oceaniclinksthatoftencostmillionsofdollars
`permonth.
`CachinghasprovenausefultechniqueforimprovingenduserexperiencedlatencyontheWeb.Thefun-
`damentalconceptistheintermediatestorageofcopiesofpopularWebdocumentsclosetotheendusers.
`Cachingise(cid:11)ectivebecausemanyWebdocumentsarerequested(much)morethanonce[ ].Webbrowsers
`havelocaldiskcachesbecauseindividualsoftenbrowsethesamepagesrepeatedly.Additionally,thereis
`likelyoverlapinthesetofdocumentsrequestedbyalargegroupofusers.Theseuserscanbene(cid:12)tfroma
`sharednetworkcache.
`Takingtheproxycachingconceptonestepfurther,wemightliketoconnectsetsofcachestogetherina
`hierarchy.AgroupofWebcachescanbene(cid:12)tfromsharinganothercacheinthesamewaythatagroupof
`Webclientsbene(cid:12)tfromsharingacache.Inasimplecachehierarchy,asetofchildcachesshareacommon
`parentcache.Childcachesforwardtotheirparentsrequestsforobjectstheydonothave.
`TheInternetCacheProtocol(ICP)providessupportforinformedselectionofanext-hopcache,including
`implicitindicationsofnetworkcongestion.AlthoughthediagnosticfunctionalityofICPmaybeuseful,
`simplecachetopologiesdonotreallyrequireICPtooperate;otherapproachescanhelprecognizeanddeal
`withnetworkfailures.ICPisespeciallysuitedtoenablingsiblingrelationships(i.e.queriesbetweenpairs
`ofchildcaches).Agroupofsiblingcachese(cid:11)ectivelyactsasalarge,distributedcache,aparticularlyuseful
`con(cid:12)gurationwhennoorganizationiswillingtooperateaparentcacheforalargecommunitywithnoexplicit
`revenuestreamforit.
`Thedi(cid:11)erencebetweensiblingandparentrelationshipsisintheirroleduringcachemisses:parentscanhelp
`resolvemisses,butsiblingsmustnot.Cachesshouldnotforwardrequeststosiblingcachesunlesstheyknow
`thesiblinghasthedesiredobject.Withoutamechanismtoascertainwhetheracachehasagivendocument,
`siblingrelationshipswouldnotevenbepossible.ICPprovidesthisfunctionality.
`InthisarticleweassumethereaderisalreadyfamiliarwithHTTP/ .[]andatleastawareofHTTP/ . [ ].
`ForadditionalinformationonICP,pleaseseeourInternetDrafts(soontobeRFCs)[,]andotherwork[].
`
`
`GUEST TEK EXHIBIT 1015
`Guest Tek v. Nomadix, IPR2019-01191
`
`

`

` Squid
`. History
`Squidwasdevelopedasafollow-ontoonecomponentoftheHarvestproject[],thelattere(cid:11)ortoriginally
`fundedin bytheU.S.AdvancedProjectsResearchAgency(ARPA).Harvestfocusedondeveloping
`toolsfore(cid:11)ectiveuseofInternetinformation,withemphasisonindexingandsearchingtheWeb.Theproject
`teamdevelopedamulti-facetedarchitecturefore(cid:14)cientcataloginganddisseminationofpopularinformation.
`OnearchitecturalcomponentaimedatminimizingenduserresponsetimeduringWebnavigationwasan
`objectcache,aplatformstrategicallyplacedwithinthenetworktopologythatwouldstorecopiesofpopular
`documentstosaveusersfromhavingtowaitfortransmissionofthedocumentfromtheoriginalsource.
`TheobjectcachecomponentwasnamedHarvestcached.TheotherHarvestcomponents(thegathererand
`thebroker)werequitepopularaswell,andarestillbeingusedinanumberofplaces.Infact,agroupof
`volunteershasemergedtocontinuedevelopmentofthatcode.
`Atthattime,theonlyotherWebcaching(sometimesalsocalledproxy)softwareinwidespreadusewasthe
`CERNHTTPDaemon[].TheHarvestcachehadtwofeaturesthatmadeitanattractivealternativeto
`CERN'sproxy.First,whileCERNusedaseparateprocessforeachrequest,Harvestservicedalmostall
`proxyrequestsinasingleprocess.TheexceptionisFTPtransfers,whicharehandledinanexternalprocess
`duetothecomplexityoftheFTPprotocol.Second,theHarvestcachesoftwaresupportedthenotionof
`cachehierarchies.
`Thesingle-processnatureofHarvestwasattractivebecausemanycacheoperatorssawtheirmachinesslow
`toacrawlduringbusyperiods.Theoperatingsystemwouldspendalotoftimeandresourcesforking
`andschedulingnewprocesses,apricepaidforthesimplicityoftheCERNserverdesign.Harvesttookthis
`tradeo(cid:11)intheoppositedirection.Asasingleprocess,ithastoworryaboutmanyoperatingsystemlevel
`tasks,inparticularmemoryand(cid:12)ledescriptormanagement.
`Note,fortheremainderofthisarticle,unlessstatedotherwise,Harvestreferstothecachecomponentofthe
`Harvestresearchproject.Itdoesnotinanywayrefertothecommercialsoftware,whichisalsoknownas
`NetCache.
`Inlate andearly ,manymembersoftheHarvestprojectmovedontoindustryjobs,bringingthe
`projecttoaprematureclose.Atthesametime,Hans-WernerBraun,kcla(cid:11)y,andMichaelSchwartzreceived
`U.S.NationalScienceFoundationgrantfundingfortheirproposaltodeployaprototypeWebcachingsystem
`withintheU.S.InMarchof IwenttoworkforkandHans-Werneronthisproject.
`Shortlythereafter,PeterDanzig,primarytechnicalarchitectfortheoriginalHarvestcachesoftwareitself,
`formedacompanytosellacommercialversionoftheHarvestcache.InordertosupporttherecentlyNSF-
`fundedcachedeploymentandresearchproject,weneededtohaveopenandpubliclyavailableWebcache
`software.SinceIhadalreadyworkedforayearontheHarvestproject,itmadethemostsensetotakethe
`lastpre-commercialversionofHarvestandenhanceittosupportresearchinvestigations.Toavoidconfusion
`betweenthesetwoderivationsoftheHarvestsoftware,wenamedoursoftwareSquid.Wehavespentthelast
`yearincorporatingcodecontributions,suggestions,newfeaturerequests,patches,errorreportsandother
`feedbackintotheSquidsoftware.
`Afterafewmonthsofbetatesting,the(cid:12)rsto(cid:14)cialversionofSquid( ..)wasreleasedinJulyof .
`Someofthemoresigni(cid:12)cantadditionalfeaturesandenhancementsaredescribedbelow.
`. . SupportforIf-Modi(cid:12)ed-Since
`TheHarvestresearchcodedoesnotsupportIf-Modi(cid:12)ed-Since(IMS)requests.IMSrequestsareusedwhen
`clientsorcacheshaveacopyofanobjectandwanttomakesureitisthemostup-to-dateversion.Harvest
`simplyignoredtheIMSheaderwhichcauseduseragentstoalwaysreceivefullreplies.Theprimaryreason
`thatHarvestdidnotsupportIMSisbecausethelastmodi(cid:12)cationtimeisnotkeptasmetadatainmemory.
`Toimplementthis,Squidneedstoparsethereplyheadersastheobjectwasreadinfromdisk.Muchofthe
`initialcodewaswrittenbyHenrikNordstrom.
`
`
`

`

`InSquid .,IMSisonlysupportedbetweenSquidandtheclient.SquidforwardsIMSheadersforcache
`misses,andprovidestheclientwiththecorrespondingreply,butdoesnotupdateitsownstatewithinfor-
`mationfromanyNotModi(cid:12)edreplies.Squid .alsoneverinitiatesanIMSrequestonitsown.
`. .PublicandPrivateobjects
`Harvest .treatsallobjectsitknowsaboutaspublic.Multipleclientscansimultaneouslyreceivedataas
`anobjectisbeingretrieved.Althoughoftenahugesavingsforlargeobjects,undercertainconditionsthis
`approachmightallowasecondclienttoreceivesomethingthatshouldhavebeengivenonlytothe(cid:12)rst.
`Forexample,considerarequestthatincludessomeauthenticationinformation.Becausethisdatarequires
`authentication,otherclientsshouldnothaveaccesstoitfromthecache.Insteadtheyshouldseparately
`authenticatethemselveswiththeoriginserver.
`Theimplementationdecisionsallowingthisvulnerabilityare:
`(cid:15)Asinglehashtableisusedtoindexallobjectsinthecache,includingpendingobjects.
`(cid:15)ThehashtablekeyissimplytheURLforGETrequests.
`OnereasonforkeyingononlytheURListhatthecacheusestheURLextractedfromanICPreplytolook
`upthependingrequestandcontinuethethreadofexecution.Duringthetimeintervalbetweensendingthe
`ICPqueriesandreceivingthereplies,additionalclientscouldattachthemselvestothereplystream.This
`isaproblembecausethereplyheadersmayindicatethattheobjectshouldonlybegiventothe(cid:12)rstclient.
`Harvestcouldlikelyhaveavoidedthisproblembyusingaseparatehashtableforpendingrequests,or
`alternativelybyusingtheReqnum(cid:12)eldoftheICPmessage.Infact,HarvestalwayssettheReqnum(cid:12)eldto
`zeroinICPreplies.
`SquidimplementstheReqnum(cid:12)eldproperlyandusesittosupportbothprivateandpublicobjects.Private
`objectsareaccessibleonlytotheclientoriginatingtherequest,andpublicobjectsareavailabletoallclients.
`Squidindexesobjectsinthestoragehashtablewithakeythatincludesanintegernumberprependedto
`theURLstring.SquidplacesthisnumberintotheReqnum(cid:12)eldofoutgoingICPquerymessages,anda
`peermustusethesameReqnumvalueinitsreply.ThistechniqueallowsSquidtouseacachekeysothat
`pendingrequestscanbelocatedfromtheICPreplies,butnotbynewclients.Allrequestsstartoutas
`private,andremainsoduringthepeerselectionstage.UponreceiptoftheHTTPreplyheaders,theobject
`willbecomepublic,unlessthereplyindicatesotherwise.Onlypublicobjectsremaininthecache;private
`onesareremovedimmediatelyaftertransfer.IfSquidiscon(cid:12)guredwithanoldHarvestpeer(whichsetsthe
`Reqnum(cid:12)eldtozero),thentheprivateobjectfeaturemustbedisabled,becauseitwillbeunabletolocate
`thependingrequestsfromHarvest'sICPreplies.
`. . Metadatareloadinginthebackground
`Harvest .doesnotserveanyrequestsuntilithasrebuiltthecachemetadatafromtheswaplog(cid:12)le.
`Shuttingthecachedowncleanlywillallowamuchfasterreloaduponrestartingthecache,similartothe
`bene(cid:12)toffsckonUnix(cid:12)lesystems.Dependingonthecachesize,afastreloadcouldtakeanywherefrom
`onetotenminutes.Aslowreloadcouldtakehours.Ineithercase,itisunrealistictodenyservicetoend
`userswhilethecacherebuilds.
`HenrikNordstromimplementedabackgroundprocessingtasktoreloadtheswaplogwhilethecachewas
`servingrequests.Duringthisphase,Squidgivesselect()azerotimeoutvalue.Uponatimeout,Squid
`processesasmallnumberoflinesfromtheswaplog.
`. .NewAccessControlscheme
`WithHarvest .,theonlyaccesscontrol(ACL)mechanismistheclient'sIPaddress.Whileworkingon
`Squid,itquicklybecameapparentthatadministratorswantedarich,(cid:13)exiblesetofaccesscontrols.We
`
`
`

`

`implementedthefollowingadditionalaccesscontrols:
`(cid:15)Requestmethod
`(cid:15)Accessprotocol
`(cid:15)Destinationdomainname
`(cid:15)DestinationIPaddress
`(cid:15)Portnumber
`(cid:15)RegularexpressionmatchingtheURL-path.
`(cid:15)Timeofday
`However,Squidneverpostponesrequeststomakeaccesscontroldecisions.Speci(cid:12)cally,thedestinationIP
`addressisonlyavailableiftheURLhostnameisalreadypresentintheIPcache.SquidneverdelaystheACL
`decisiontowaitforaDNSlookuptocomplete.IftheURLcontainsanIPaddressinsteadofahostname,
`Squid .doesnotmakeareverseDNSlookupandcomparetheansweragainstthedestinationdomain
`name.
`. Present
`The(cid:12)rstversionofSquid . wasreleasedinDecember .Themostrecent( . . )wasreleasedJuly
` , .Wedescribeitsnewfeaturesbelow.
`.. RefreshRulesandIf-Modi(cid:12)ed-Since
`Squid .removesobjectswhentheyexpire.Squid . keepsexpiredobjectsondiskandissuesIf-Modi(cid:12)ed-
`Sincerequestsforthem.Thiscanresultinasigni(cid:12)cantbandwidthsavingssincetheoriginserverdoesn't
`needtoretransmittheentireobject.Unfortunately,IMSdoeslesstoreducelatencythanbandwidth,because
`theoriginservermuststillbecontacted.
`..URLRedirector
`SeveralSquidusersinthecommunityrequestedtheabilitytorewriterequestedURLsorperformHTTP
`redirection.RatherthanimplementthisfeatureinSquiditself,whereanelegantsolutionsatisfyingeveryone
`wouldbedi(cid:14)cult,wechosetomakeitanexternalprocess.Ifcon(cid:12)guredtouseredirection,Squidsends
`everyrequesttooneoftheredirectorprocesses.Theredirectorprogrammustbesupplied(i.e.written)by
`theadministrator.AredirectorprocessreadsURLsonstdin,possiblychangestherequest,thenwritesthe
`resulttostdout.Squidreadstheredirectoroutputandmodi(cid:12)estherequestdatastructures.
`Onepossibleapplicationoftheredirectorfeatureisadditionalaccesscontrols.Iftheredirectorprogramhas
`accesstoalistofforbiddenURLs,itcanredirecttherequesttoapagedescribingexactlywhytheoriginal
`requestcouldnotbeallowed.
`.. ReverseIPLookups,clienthostnameACLs.
`Squid . includessupportforaccesscontrolsbasedonclientdomainname,previouslyimpossiblebecause
`SquiddidnotdoreverseDNSlookups.MuchliketheIPcache,Squidcachesreverselookupresultsin
`theFQDNcache.Wenotethatthedomainlookupswillincursomeadditionaldelayandwethusdonot
`necessarilyrecommendtheiruse.
`
`
`

`

`..Cachedirectorystructurechanges
`ThepreviousversionofSquidused subdirectoriesundereachcachedir.Forlargecaches(e.g.,more
`than ^objects)thisdesignwouldinvolvethousandsof(cid:12)lespersubdirectory.Suchlargedirectoriescan
`potentiallycauseunnecessarydelaysduring(cid:12)lesystemoperations.
`ApatchfromMarkTreacymodi(cid:12)edSquidtouseatwo-leveldirectorystructure,defaultingto (cid:12)rst-level
`directoriesandsecond-leveldirectories.Inthiscon(cid:12)guration,amillionobjectswouldleadtoamore
`sensibleobjectspersubdirectory.
`..NetworkProbeDatabase
`BothHarvestandSquidallowrequeststoberoutedbasedonthehostnamesfromURLs.However,thereare
`afewproblemswiththisapproach.First,itrequiresalotofmanualcon(cid:12)guration.Thecachecon(cid:12)guration
`(cid:12)lemustlisteachdomainwitheachpeer;itisonlypracticaltolisttop-level-domains(TLDs).Inaddition,
`domainrestrictionsdon'tworkforURLswithIPaddressesinsteadoffullyquali(cid:12)eddomainnames.
`Thebiggestproblemisthatthedomainnamesareunrelatedtonetworktopology,apartfromtherough
`mappingprovidedbythetwo-lettercountrycodeTLDs.ThesenationalTLDscanframeonlyaverycoarse
`Webroutingsystem,andinternational(top-level)domains,e.g.,.com,.net,areevenworsebecausethey
`havenobearingatalltotopology.Forthesereasons,anyroutingschemebasedondomainnamesis
`inauspicious;likeanyotherroutingscheme,e(cid:11)ectivecache-levelroutingneedsknowledgeoftheunderlying
`networktopology.
`Squid . hasanoptionalfeaturetoacquiretopologydatawithICMP.Whenthequeryicmpoptionis
`enabled,Squidbuildsupatableofhopcountsandround-triptimesfortheoriginserversitencounters.
`SquidaggregatesthisdatabyIPnetworkundertheassumptionthattwohostsonthesamelocalnetwork
`willhavesimilarvalues.Viaanexternalprocess,SquidcachessendandreceiveICMPechorequeststoserver
`hostsatarateofnomorethanonceper(cid:12)veminutes.Squidthenincludestheresultsofthesenetworkprobe
`measurementsinICPreplymessages.ThecachecollectingICPrepliesusesthenetworkmeasurementsto
`selectthebestpeer,ideallytheonemosttowardtheoriginserver.
`Asacachecollectsnetworkmeasurementsfromitspeers,itaddsthemtoitslocaltable,learningovertime
`whichpeersaregoodchoicesforwhichsources.Thecachewillbeclosertosomesourcesthananyofits
`peersare;fortheseitcansimplyfetchdirectlyandavoidtheICPquerying.Thisapproachcomplicatesthe
`peerselectionalgorithm,sinceinsteadofrememberingasinglebestparent,Squidmustnowrememberalist
`ofpossibleparentsuntilallICPrepliesarrive.
`..Round-RobinIP
`Squid'sIPcachestoresalladdressesreturnedbythegethostbyname()functioncall.However,Squid .
`onlyusesthe(cid:12)rstaddress.PopularWebsitessuchasMicrosoftrequiremanyservers,soanaddresslookup
`onwww.microsoft.comreturns {IPaddresses.Iftheserveratthe(cid:12)rstaddressfails,Squid .would
`nevertryanyoftheotheraddresses.Thisbehavioris(cid:12)xedinversion . .Itcyclesthroughallavailable
`addressesanddeletesaddressesforwhichTCPconnectionsfail.
`..Neighborfeatures(neighbortypedomain,missaccess).
`Whenbuildingameshofcaches,havingthesamepeeringrelationship(parentvs.sibling)forallrequests
`mightnotbedesirable.Consider,forexample,apairofupper-levelcachesintwodi(cid:11)erentcountries.Country
`AmightwanttoforwardrequestsforcountryBrequeststothecacherunningincountryB,invice-versa.In
`otherwords,cacheBwouldbeaparentforBdomains,andcacheAwouldbeaparentforAdomains.But
`whataboutotherrequests?Thecacheadministratorsmightwanttoleveragethefactthattheyarealready
`peeringwitheachother.
`WhensomecacheoperatorsinEuropewerebuildingahigh-levelcachemesh,theywerethwartedbythe
`
`
`

`

`factthattheparentorsiblingrelationshipappledtoallrequests.Forexample,theGermancachewanted
`toforwardrequestsfor.ukrequeststoisneighborcacheintheU.K.Atthesametime,theGermancache
`wantedtousetheU.K.cacheasasiblingforallotherrequests.Squid .didnotsupporttreatingapeer
`asaparentforsomerequests,andasasiblingforothers.Squid . supportsthisfunctionality.Byusing
`thetheneighbortypedomaincon(cid:12)gurationoption,Squidrelationshipscandependsonthedomainnameof
`requestedtheURL.
`NotethatanICPquerydoesnotincludeanyparentorsiblingdesignation,sothereceiverhasnoindication
`ofhowthepeercacheiscon(cid:12)guredtouseit.Thisissuebecomesimportantwhenacacheiswillingtoserve
`cachehitstoanyone,butonlyhandlecachemissesforitsowncustomers.Inotherwords,whethertoallow
`therequestdependsoniftheresultisahitoramiss.Tosupportthisfunctionality,Squidacquiredthe
`missaccessfeatureinOctoberof .
`Inadditiontobeingawkwardtoimplement,missaccessbringsitsowncomplication:itrequiresthatthe
`ICPreplybeanextremelyaccuratepredictionoftheresultofasubsequentHTTPrequest.Thisprediction
`ischallengingifnotimpossiblesincetheICPrequestcannotconveythefullHTTPrequest.Additionally,
`therearemoretypesofHTTPrequestresultsthanthereareforICP.TheICPqueryreplywilleitherbea
`hitormiss,butanHTTPrequestmightresultinaNotModi(cid:12)edreplyfromtheoriginserver.Suchareply
`isnotstrictlyahitsincethepeerneededtoforwardaconditionalrequesttothesource.Atthesametime,
`itsnotstrictlyamisseithersincethelocalobjectdataisstillvalid,andtheNot-Modi(cid:12)edreplyrequired
`fromtheoriginserverisquitesmall.
`..TheNOVMversion
`PerhapsthemostcommoncomplaintaboutSquidisitsmemoryconsumption.Thetwoprimaryusesof
`memoryareforobjectmetadata(approximately bytesperobject)andin-transitrequests.Squidstores
`objectsin-transitfullyinmemoryuntilthetransfercompletes,atwhichpointSquideitherwritesthemto
`diskorpurgesthemfrommemory.Thisdesign(cid:13)awrequiressigni(cid:12)cantmemoryforforacachewithmany
`activeclientsandmanylargeobjects.
`Toalleviatethisproblem,wedevelopedabranchversionofSquid . thatdidnotretainin-transitobjectdata
`inmemory.Instead,thisNOVMversionwritesallobjectstodiskasitretrievesthem,whilecacheclients
`simultaneouslyreadfromthedisk(cid:12)les.Thisapproachessentiallydoublesthenumberof(cid:12)ledescriptors
`requiredbecausetheswap(cid:12)leisopenbothforreadingandwritingasSquiddownloadstheobject.Inother
`words,theNOVMversiontradesmemoryfor(cid:12)ledescriptors.Forcachemachineswithoutenoughphysical
`memory,theNOVMversionisaviablealternative.
`.. MulticastICP
`InspiredbyideasandcodefromMartinHamilton,Squid . supportssendingICPqueriestomulticast
`addresses.Thebandwidthsavingsofmulticastcanbesigni(cid:12)cantforlargecachemeshes.WhiletheICP
`queriescanbemulticasted,ICPrepliesarealwaysunicastforseveralreasons:
`(cid:15)MulticastingICPreplieswouldnotreducethenumberofpacketssent.
`(cid:15)Itpreventsothergroupmembersfromreceivingunexpectedreplies.
`(cid:15)Thereplyshouldfollowunicastroutingpathstoindicate(unicast)connectivitybetweenthereceiver
`andthesendersincethesubsequentHTTPrequestwillbeunicastrouted.
`Withoutmulticast,thecacheadministratormustspecifyeverymemberofacachemeshineverycache's
`con(cid:12)guration(cid:12)le.Withmulticast,newmemberscouldsimplyvoluntarilyjointhemulticastgroup.Although
`itistemptingtoexploretheuseofmulticastasawaytosimplifycachecon(cid:12)guration,italsointroduces
`securityissues.sincejoiningamulticastgrouprequiresnospecialprivilegeorauthentication,andtrustinga
`givenmulticastgroupisnottypicallyasafeassumption.Forthisreason,Squidstillrequiresthatalltrusted
`neighborcachesbespeci(cid:12)edinthecon(cid:12)guration(cid:12)le.
`
`

`

`. TheFutureofSquid
`WehavebegunworkonSquidversion .,inwhichweplantosupportthefollowingnewfeatures.Those
`whichwehavenotyetimplementedweidentifywithanasterisk(*).
`. . Independentswapdirectories
`Squidversions .and . assumethatallcachedirdirectoriesareofthesamecapacity.Evenworse,the
`(cid:12)le-number-to-pathnamealgorithmchangeswhenthenumberofcachedirdirectorieschanges.Thismeans
`thatifyouloseonedisk,youe(cid:11)ectivelyloseyourentirecache.
`Version .makeseachcachedirindependent,i.e.,eachdirectorycanbeadi(cid:11)erentsize,andhaveadi(cid:11)erent
`numberof(cid:12)rst-andsecond-levelswapdirectories.Inaddition,onecancon(cid:12)gureswapdirectoriestoprevent
`anynewobjectsbeingwrittentothatdisk(`object-read-only')tofacilitateplanninginadvancefordirectory
`removal.Eachswapdirectoryhasaseparateswaplog(renamedtoswap.state)forfullindependence.
`Becauseallswapdirectoriesareequalinversion . ,Squidallocatedswap(cid:12)lesinaround-robinfashion.
`Forversion .,Squidallocatesnewswap(cid:12)lestotheswapdirectorythathasthegreatestpercentageoffree
`space.
`. .AsynchronousI/O
`Wheneverpossible,squidusesnon-blockingI/O,whichreallyonlyworksforsocketsandpipes.MostUnix
`systemsexecutediskoperationsimmediately.Thatis,aread()orwrite()neverreturnsEWOULDBLOCK
`fordisk(cid:12)les.Othersystemcallssuchasopen(),close(),andunlink()alwayscompletebeforereturning,
`andalthoughgenerallyveryfast,theydoblockthecallingprocessforshortamountoftime.Whendealing
`withahighrateofdiskoperations,suchblockingcanadverselya(cid:11)ectcacheperformance.
`Onewaytoreducethisburdenistoexecutethediskoperationsinseparateprocesses.Squid . hasan
`externalprocesstounlinkcache(cid:12)les.Earlyon,wetriedusingtheAIOlibraryfunctionsavailableonIrix
`andSolaris.However,thesefunctionsdidnotintegratewellwithSquid'slowerleveldiskI/Oroutines.
`Normally,Squidwouldwritealinkedlistqueueofblockstoadisk(cid:12)le,eachblockusingitsownwrite()
`call.Usinganaiowrite()insteado(cid:11)eredlittlegain,andtherewerealsominorincompatibilitiesbetween
`IrixandSolaris.
`StewartForsterprovideduswithasigni(cid:12)cantpatchthatimplementsportableande(cid:14)cientasynchronous
`I/O.Ratherthanrelyingonthesemi-portableaiowrite()andrelatedfunctions,Stewart'scodeneedsonly
`aPOSIXthreadslibrary(pthreads).Wenowalsocombinemultiplewritebu(cid:11)ersintoasinglecontiguous
`bu(cid:11)erformoree(cid:14)cientoutput.Inadditiontoreadingandwriting,Stewart'scodealsoimplementsopen,
`close,andunlinkasynchronously.
`. . Backgroundvalidation
`StewartForsteralsocontributedcodetoimplementbackgroundvalidationofaslowcacherebuild.IfSquid
`shutsdownwithoutwritingacleanswaplog(cid:12)le,thesubsequentrestartmuststat()everycache(cid:12)leto
`verifyitsexistenceandcorrectsize.Squidversion . ,madethestat()callsasitreadthelog(cid:12)le,soit
`couldpotentiallytakemanyhourstofullyreadthelog(cid:12)le.
`Stewart'spatchallowsSquidtoreadthelog(cid:12)leatfull-speedwhilemarkingobjectsfordelayedvalidation.
`Afterreadingtheentirelog(cid:12)le,Squidbeginsthevalidationofallobjectsviastat().IfSquidreceivesa
`requestduringthisperiodforanobjectithasnotyetvalidated,itvalidatesthatobjectimmediatelybefore
`satisfyingtherequest.
`
`
`

`

`. .HTTP/ . persistentandpipelinedconnections
`TheWebiscurrentlyinatransitionbetweenHTTP/ .andHTTP/ . .TheApachedevelopmentgrouphas
`releasedanHTTP/ . compliantversionoftheirHTTPserver.Ingeneralwe(cid:12)ndthatHTTP/ . featuresare
`implementedincrementally,andSquiditselfislaggingbehindontheHTTP/ . front.Wehaveimplemented
`persistentandpipelinedclientconnections,butasofAugust havenotyetimplementedpersistent
`connectionstoserversorneighborcaches.TheWorld-WideWebConsortiumhasreportedobservingboth
`reducedlatencyandreducednumberofpacketstransmittedwithpipelinedrequests[ ].
`. .FTPrequestshandledinternally
`WeoriginallybelievedthattheFTPprotocolwastoocomplextoimplementinsidethecacheitselfandso
`usedanexternalmodule,ftpget,whichwasuglyanddi(cid:14)culttounderstand.Squid .nolongerusesa
`separateprocess,theFTPfunctionalityisfullyintegratedintothemainlinecode.Besides(cid:12)ndingiteasier
`thanwehadthought,wewerealsomotivatedtointegratebecauseoftheincreasingnumberofcon(cid:12)guration
`optionsSquidneededtopasstoftpget.Additionally,thischangeshouldmakeitmucheasiertoimplement
`FTPPUToperations,althoughwehavenotdonesoyet.
`. .Replacedlocaldomainandinside(cid:12)rewall.
`ThelocaldomainoptionforcesSquid . toforwardcertainrequestsdirectlytotheoriginserver.Similarly,
`theinside(cid:12)rewalloptionpreventsSquidfromeverforwardingcertainrequeststooriginservers.Minor
`problemswiththesetwooptionsinclude:
`(cid:15)TheyuseasselectioncriteriaonlytheURLhostnameorIPaddress;theycannotdi(cid:11)erentiatebased
`ontheaccessprotocol(e.g.FTPvs.HTTP).
`(cid:15)Thenamescanbemisleading.Sometimespeoplewanttheircachetonevermakedirectconnections,
`butarenotbehinda(cid:12)rewall.
`Squid .hasre-implementedthesefeaturesasoptionsalwaysdirectandneverdirect,whichuseSquid's
`ACLelementsforselectioncriteria.
`. .Customizableerrormessages
`Squidcacheadministratorsoftencomplainedthaterrormessageswerehard-codedintoSquid'sexecutable,
`particularlyproblematicfornon-Englishspeakingadministrators.Squidnowsupportsfullycon(cid:12)gurable
`errormessages,usingprintf-styletokenstosubstituterelevantpartsofarequestintothemessage.
`. .Morerobustrequestforwarding*
`ApotentiallyfrustratingaspectofSquid . 'srequestforwardingisitsselectionofasinglelocationtoforward
`arequest.Itmightbetheoriginserver,orperhapsaneighborcache.Iftherequestcannotbeforwardedto
`thatdestination(e.g.duetoconnectionfailure)theuserreceivesanerrormessage.Squid . doesnothave
`aconceptofbackupparent.ItdoesnotretryfailedTCPconnectionattempts.
`Squid .willuseamorerobustscheme,wheretheadministratorcanspecifyanorderedlistofoptionsfor
`forwardingcachemisses.Forexample,onemightwanttospecifyaschemeto:
` .QuerysiblingsusingICPandselect(cid:12)rstICPHIT.
`.Selectprimaryparentcache,withasecondconnectiontimeout
` .Selectbackupparentcache,witha secondconnectiontimeout
`.Forwardtotheoriginserver
`
`
`

`

`. . Moremetadatawrittentoswaplog
`Wenowwriteadditional(cid:12)eldstotheswaplog(swap.state).Inversion . thelast-referencetimewaslost
`whenSquidreloadedthemetadatafromdisk.Actually,Squiddoesnotwritethelast-referencetimetothe
`swaplog;ratheritsetsthelast-referencetimetothelast-validtimeuponrestart.Version .logsseparately
`thelast-referencetime,thereferencecount,andtheStoreEntry->(cid:13)agsvalues.Thereferencecountkeepsthe
`replacementalgorithmconsistentacrossrestart.The(cid:13)agssupporttheHTTP/ . Must-Revalidatefeature.
`. . MIMEchanges
`Inearly thesquid-usersmailinglistcarriedagreatdebateaboutcontenttypes,inspiredbyftpget's
`needtoincludeavalidContent-Typeheaderforthe(cid:12)lesitretrieved.Ftpgetdeterminescontenttypeusing
`theURL(cid:12)lenamesu(cid:14)x,whichifunknownornon-existent(asinthecaseofREADME),defaultedtoeither
`text/plainorapplication/octet-stream.Thedefaultcontenttypeissetatcompile-timewitha#definemacro.
`Ifthedefaultweretext/plain,theuser-agentwoulddisplaythecontentinthebrowserwindow,undesirable
`whentheobjectwasabinarydata(cid:12)leorsomethingelseungracefullyviewable.Adefaultofapplication/octet-
`streamwouldcauseSquidtosavethe(cid:12)letodisk,alsonotwhatyouwantforREADME(cid:12)les.
`ThisproblemstemsfromtheneedtogatewaydatafromanFTPservertoanHTTPclient.TheMIME
`content-typeisafundamentalfeatureofHTTP.Ontheotherhand,FTPhasnoconceptofcontenttypes,
`soSquidhastoguess.Version .determinesthecontent-typefromatableofregularexpressionsloadedat
`run-time,allowingadministratorstocustomizetheircachetomeettheirneeds.
`. . Callbackdataregistry
`BothHarvestandSquidhavebeenpronetonumerousmemoryaccessbugsovertheyears.Infact,both
`packageshavespecialsignal-handlingcodetodetectBusErrorsandSegmentationViolations,andmailthe
`administratorwitharequesttogenerateastacktraceandsendittothedevelopers.Insomecasesthese
`errorsare,inretrospect,quiteobviousandeasyto(cid:12)x.Thereisanotherclass,however,whichisabitmore
`subtle.WhenSquidneedstoperformanoperationthatwouldnormallyblocktheprocess,itregistersacallback
`functiontotriggerwhentheoperationcompletes.Thecallbackfunctionalwaystakessomecallbackdataas
`anargument.Forexample,sincetheoriginalHarvestimplementation,DNSlookupshavealwaysbeenmade
`byexternaldnsserverprocesses.Whenthelookupcompleted,thecallbackwasmadesothattherequest
`processingwouldthencontinue.
`Thisdesignassumesthecallbackdatamemorywillstillbevalidwhenthecallbackoccurs.Ifthememory
`hasbeenfreedpriortothecallback,thecallbackfunctionwilllikelywriteintomemoryallocatedtosome
`otherdatastructure.Then,whenthetrashedmemoryisnextaccessed(perhapsquitesometimelater),a
`fatalSegmentationViolationmayoccur.Becausethecauseande(cid:11)ectofsuchbugsmaybeseparatedby
`minutesorhoursintime,theyareoftendi(cid:14)culttodetect.
`Squid . hasevolvednumerouswaystodealwiththisproblem,insomecasesbylockingdatastructureswith
`counters,inothersbyusingunregisterfunctionstodisablecallbacksbeforetheoperationcompletes.Each
`datastructurepotentiallyusedascallbackdataneedsitsownimplementationtopreventmemoryaccess
`errors.
`Squid .includesagenericcallbackdataregistry,towhichcallbackdatapointersareadded,thenlocked
`whiletheblockingoperationisinprogress,unlockedwhenitcompletes,anddeletedwhenmemoryisfreed.
`Deleteoperationsthatoccurwhilethememoryisstilllockedwillbedelayeduntilthelockcountreaches
`zero.Beforethecallbackfunctioniscalled,Squidveri(cid:12)esthecallbackdatamemorywiththeregistry;if
`markedinvalid,Squidunlocksandfreesit,andnevermakesthecallback.
`
`
`

`

`VERSION
`
`PACKET LENGTH
`
`REQUEST NUMBER
`
`AUTHORIZATION
`
`0
`
`31
`
`SENDER HOST ADDRESS
`
`OPCODEFigure :TheoriginalICPmessageformat.
`. . Limitsonresourceutilization*
`Resource-consciousadministratorshavealsoaskedfortheabilitytolimittheresources(mostlybandwidth)
`consumedbysomesubsetofcacheclients.Forexample,ancache-bearingISPmightwanttoallowunlimited
`usagetousersonnetworkA,butlimittheusersofnetworkBtoKB/seceach.Unfortunatelywehave
`notyetcomeupwithagood,generalframeworkforlimitingnetworkresources.
`
`ICP
` . History
`TheHarvestresearchcodeusedthetheICPmessageformatshowninFigure .
`NotethatICPdoesnothavea(cid:12)eldfortherequestmethod;thepayloadconsistsofonlytheURL.Theearly
`HarvestdesignersassumedthatonlyGETrequestswouldbecacheable,totherewouldbenoneedtouse
`ICPforotherrequestmethods.ThelackofarequestmethodintheICPmessageisnotaseverede(cid:12)ciency,
`sinceinpracticenon-GETrequestsareindeednotcached.
`ItwouldbenicetohavetherequestmethodintheICPmessagewhenwetrytouseICPtoselectan
`appropriatenext-hopcache.ConsideranIntranetcachewithmultipleparentcacheslocatedonthe(cid:12)rewall
`border,oneofwhichitmustselecttoforwardtherequestsincetheinternalcachecannotdirectlycontactthe
`originserver.ForGETrequestswecanuseICP,butforaPOSTrequestwemustuseothercriteria.Normally
`oneoftheparentcacheswillbestaticallydesignatedasthenext-hopcacheforallnon-GETrequests.
`TheoriginalICPdesignhadopcodesforsupportingmultiplexedobjecttransportoverTCP:ICPSEND,
`ICPSENDA,ICPDATABEG,ICPDATA,andICPDATAEND.TheSENDandSENDAopcodessupport
`objectrequests,similartoHTTP'sGET.ThethreeDATAopcodessupporttransmissionofobjectdataback
`totherequester[ ].Intheorytheseopcodesserveasingle,persistentTCPconnectionbetweenneighbor
`caches.ObjectswouldbemultiplexedbysendingtheminsmallchunkswithanICPheadertoidentifythem.
`TheremayhavebeenaveryearlyimplementationoftheseopcodesinHarvest.However,aroundthetime
`ofHarvest .,thesefeatureswereremovedinordertomakethecodesimpler. Sincethen,ICPsupports
`onlycachequerying,whileHTTPhandlesallobjecttransport.
`HarvestneverimplementedAuthorizationinICP,sothetheeight-octectAuthorization(cid:12)eldremainedun-
`used.Infact,nospeci(cid:12)cauthorizationtechniqueswerediscussed;the(cid:12)eldwasonlyaplaceholderforfuture
`work.Thisunusedspaceturnedouttobeveryusefulforfutureversions,butnotforauthorization.
`HarvestalwayssettheReqnum(cid:12)eldtozero.Intheory,this(cid:12)eldwouldbeauniquenumberforeveryICP
`query,andrepliescopythevalueofReqnumintothereply.Squidutilizesthis(cid:12)eldtosupportprivateand
`publicobjects.ThefactthatHarvest .alwayssetReqnumtozerocausedsomeincompatibilitiesinthe
`twoICPimplementations.Toaccommodate,Squiddisablesprivateobjectsifit(cid:12)ndsaHarvestneighbor
`thatfailstosettheReqnum(cid:12)eldproperly.
` ThisexplainswhythereisagapbetweenSquid'sdefaultHTTPport( )andtheICPport( ).OriginallyHarvest
`usedport  forthisnon-standardcache-to-cachecommunicationchannel.
`
`
`

`

`OPCODE
`
`VERSION
`
`PACKET LENGTH
`
`REQUEST NUMBER
`
`OPTIONS
`
`OPTION DATA
`
`0
`
`31
`
`SENDER HOST ADDRESS
`
`Figure:TheICPmessageformatusedbySquid .
`ThedesignoftheHarvestresearchcodemadeICPveryeasytoimplement.RecallthatHarvestalways
`removedexpiredobjects,soifanobjectwasfoundinthecache,itwasknowntobevalid.BecauseHarvest
`didnotsupportIMSrequests,theICPheaderhasnocorrespondingIMS(cid:12)eld.
` . Present
`InSquid,theICPmessageformatisessentiallythesameasithasalwaysbeen.AsshowninFigure,the
`unusedauthorization(cid:12)eldo(cid:11)erssupportfornewandoptionalfeatures.Optionsisabit(cid:12)eldofoption(cid:13)ags,
`andOptionDataisusedtosendadditionalinformation.Thereareonlytwooptionspresentlyde(cid:12)ned.
` .TheHITOBJoptionindicatessupportfortheHITOBJopcode(describedbelow).
`.TheSRCRTToptionsupportsthenetworkprobedatabasedescribedearlier.Whenthisoptionbitis
`setinanICPquery,thereplyshouldincludethepeer'sestimatednetworkRTTtotheoriginserver,
`ifavailable.TheRTTestimateisplacedinthelowerhalfoftheOptionData(cid:12)eld.
` .. ICPHITOBJ
`TheHITOBJoptionwasinspiredbythepresenceofmanysmallWebobjectsamenabletopiggybackingatop
`ICPreplies,avoidingthethree-wayhandshakeandotherTCPoverheadofthesubsequentHTTPrequest.
`UsingHITOBJo(cid:11)ersstrongpotentialforbandwidthsavings.
`OneproblemwithHITOBJisthelargerUDPpacketsize,increasingthelikelihoodo

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket