`whichSearchEnginestoQuery
`AdeleE.Howe
`DanielDreilinger
`ComputerScienceDept. MITMediaLaboratory
`ColoradoStateUniversity Cambridge,MA
`FortCollins,CO
`howe@cs.colostate.edu
`daniel@media.mit.edu
`January,
`Abstract
`SearchenginesareamongthemostsuccessfulapplicationsontheWebtoday.So
`manysearchengineshavebeencreatedthatitisdi(cid:14)cultforuserstoknowwhere
`theyare,howtousethemandwhattopicstheybestaddress.Meta-searchengines
`reducetheuserburdenbydispatchingqueriestomultiplesearchenginesinparallel.
`TheSavvySearchmeta-searchengineisdesignedtoe(cid:14)cientlyqueryothersearch
`enginesbycarefullyselectingthosesearchengineslikelytoreturnusefulresults
`andbyrespondingto(cid:13)uctuatingloaddemandsontheWeb.SavvySearchlearnsto
`identifywhichsearchenginesaremostappropriateforparticularqueries,reasons
`aboutresourcedemandsandrepresentsaniterativeparallelsearchstrategyasa
`simpleplan.
` TheApplication:Meta-SearchontheWeb
`Companies,institutionsandindividualsmusthaveapresenceontheWeb;eachare
`vyingfortheattentionofmillionsofpeople.Nottoosurprisinglythen,themostsuccessful
`applicationsontheWebtodatearesearchengines:toolsthatassistusersin(cid:12)nding
`informationonspeci(cid:12)ctopics.
`Avarietyofsearchenginesareavailable,fromgeneral,robotbased(e.g.,AltaVista
`[MonierandBurrows,],WebCrawler[Pinkerton, ])totopicorareaspeci(cid:12)c(e.g.,
`FTPSearch[Eggeetal., ],DejaNews[Madere, ]).Eachemploysdi(cid:11)erentalgo-
`rithmsforcollecting,indexingandsearchinglinks;thus,eachreturnsdi(cid:11)erentresultsfor
`similarqueries.Empiricalresultsindicatethatnosinglesearchengineislikelytoreturn
`
`
`Petitioner Apple Inc. - Exhibit 1007, p. 1
`
`
`
`morethan%oftherelevantresults[SelbergandEtzioni, a].To(cid:12)ndwhatthey
`desire,usersmayneedtoqueryseveralsearchengines;meta-searchenginesautomate
`thisprocessbysimultaneouslysubmittingasinglequerytomultiplesearchengines.
`Thesimplestmeta-searchenginesareformsthatallowtheusertoindicatewhich
`searchenginesshouldbecontacted(e.g.,All-In-One[Cross, ],METASearch[Services, ]).
`ProFusion[ofKansasDesignLab,,Gauchetal., ]givestheuserthechoiceofselect-
`ingsearchenginesthemselvesorlettingProFusionselectthreeofsixrobotbasedsearch
`enginesusinghandbuiltrules.MetaCrawler[SelbergandEtzioni, a,SelbergandEtzioni, b]
`signi(cid:12)cantlyenhancestheoutputbydownloadingandanalyzingthelinksreturnedby
`thesearchenginestopruneoutunavailableandirrelevantlinks.
`Meta-searchenginesreducetheburdenontheuser.Theymakeavailablesearchen-
`ginesthatmayhavebeenunknowntotheuser.Theyhandlethesimultaneoussubmission
`ofqueries;somedirectthequerytoappropriateenginesandsomepost-processtheresults
`aswell.Theyprovideasingleinterface(withthedownsidethattheymaynotsupport
`allthefeaturesofthetargetsearchengines).
`Unfortunately,meta-searchcanleadtothe\tragedyofthecommons"problemfrom
`economicsinwhichanindividual'sbestinterestsruncountertosociety's.
`Individual
`usersappeartobebestservedbysimultaneouslysearchingeverypossiblesearchengine
`ontheWebfordesiredinformation.Yet,theprocessmaywasteWebresources:network
`loadandsearchenginecomputation.
`Webelievethatameta-searchsystemcanbeagoodWebcitizen[Eichmann, ]by
`targetingthosesearchengineslikelytoreturnusefulresultsandrespondingtochanging
`loaddemandsontheWeb.Toprovidethisfunctionality,weincorporatedsimpleAItech-
`niquesinameta-searchengine.Ourmeta-searchenginelearnstoidentifywhichsearch
`enginesaremostappropriateforparticularqueries,reasonsaboutresourcedemandsand
`representsaniterativeparallelsearchstrategyasasimpleplan.
` OurMeta-SearchSystem:SavvySearch
`SavvySearchisourmeta-searchsystem[Dreilinger, ,DreilingerandHowe, ],
`availableathttp://guaraldi.cs.colostate.edu: /.Itrunson(cid:12)vemachines(three
`SUNSPARCStationsandtwoIBMRS s)atColoradoStateUniversity.Thesystem
`was(cid:12)rstmadeavailableinMarch andhasundergoneseveralrevisionssincethe
`originaldesign.Atpresent,twoversionsofthesystemareavailable:theonedescribed
`hereandanexperimentalinterfacethatwillbementionedinSection .
`SavvySearchisdesignedtobalancetwopotentiallycon(cid:13)ictinggoals:maximizing
`thelikelihoodofreturninggoodlinksandminimizingcomputationalandWebresource
`consumption.Thekeytocompromiseisknowingwhichsearchenginestocontactfor
`speci(cid:12)cqueriesatparticulartimes.SavvySearchtrackslongtermperformanceofsearch
`enginesonspeci(cid:12)cquerytermstodeterminewhichareappropriateandmonitorsrecent
`
`
`Petitioner Apple Inc. - Exhibit 1007, p. 2
`
`
`
`performanceofsearchenginestodeterminewhetheritisevenworthtryingtocontact
`them.Inthissection,wedescribeSavvySearchfromauser'sperspective.Wefollowa
`runningexampleindicatingwhatauserseesandwhatgoesonbehindthescenesin
`processingasearchrequest.
`. SubmittingaQuery
`To(cid:12)ndoutabout\arti(cid:12)cialintelligenceconferences",weenterthequery,selectthe
`\integrateresults"option,andclickonthe\SavvySearch!"button,asshowninanimage
`oftheinterfaceinFigure .Thesearchform,thequeryinterfacetoSavvySearch,asks
`theusertospecifyasetofkeywords(queryterms)andoptionsforthesearch.Users
`typicallyentertwoqueryterms.
`Theoptionscoverthetreatmentoftheterms,thedisplayofresultsandtheinterface
`language.Querytermsmaybecombinedwithlogical\and"(allquerytermsmust
`beincludedindocuments),\or"(anyquerytermshouldbepresent)orasanordered
`phrase.Threeaspectsoftheresultsdisplaycanbevaried:thenumberoflinksreturned,
`theformatofthelinksdescriptionandthetiming.Bydefault, linksaredisplayed
`withtheURLsanddescriptionswhenavailable,andtheresultsofeachsearchengineare
`listedseparatelyastheyarrive.Alternatively,wecouldchangethenumberoflinksupto
` ,returnlessormoredescriptionofthelinksandinterleavetheresultsoftheseparate
`searchengines.Theinterfaceisalsoavailablein di(cid:11)erentlanguages .
`. ProcessingaQuery
`Whenausersubmitsthequery,SavvySearchmustmaketwodecisions:howmany
`searchenginestocontactsimultaneouslyandinwhatorderthesearchenginesshould
`becontacted.The(cid:12)rstrequiresreasoningabouttheavailableresourcesandthesecond
`aboutrankingthesearchengines.
`.. ResourceReasoning
`Eachsearchenginequeriedexpendsnetworkandlocalcomputationalresources.Thus,
`modifyingconcurrency(numberofsearchenginesqueriedinparallel)isthebestwayto
`moderateresourceconsumption.Concurrencyisafunctionof:
`NetworkLoadEstimateswhicharedeterminedfromalookuptablecreatedfrom
`observationsofthenetworkloadatthistimeofdayinthepast,
`LocalCPULoadwhichiscomputedusingtheUNIXuptimecommand.
` Wethanktheuserswhotranslatedtheinterfaceforus.
`
`
`Petitioner Apple Inc. - Exhibit 1007, p. 3
`
`
`
`Figure :UserinterfacetoSavvySearchforenteringaquery
`
`
`Petitioner Apple Inc. - Exhibit 1007, p. 4
`
`
`
`Concurrencyhasabasevalueoftwo;uptotwoadditionalunitsareaddedperload
`estimateforperiodsoflowload.Thus,themaximumconcurrencyvalueissix.
`.. RankingSearchEngines
`SavvySearchincludesbothlargerobot-basedsearchenginesandsmallspecialized
`searchenginesinitsset.Thelargesearchenginesarelikelytoreturnlinksforanyquery,
`buttheselinksmaynotbeasappropriateaslinksreturnedbyaspecializedsearchengine
`foraqueryinitsarea.
`Thepurposeofrankingistodeterminewhichsearchenginesaremostworthwhileto
`contactforagivenquery.Searchenginesarerankedbasedon:
`(cid:15)learnedassociationsbetweensearchenginesandqueryterms(storedinameta-
`index)and
`(cid:15)recentdataonsearchengineperformance.
`TheMeta-Index:ACompendiumofSearchExperienceThemeta-indexmain-
`tainsassociationsbetweenindividualqueriesterms(simpli(cid:12)edbystemmingandcase
`stripping)andsearchenginesase(cid:11)ectivenessvalues.Highpositivevaluesindicateexcel-
`lentperformanceofasearchengineonqueriescontainingaspeci(cid:12)cterm;highnegative
`valuesindicateextremelypoorperformance.
`Thee(cid:11)ectivenessvaluesarederivedfromtwotypesofobservationsoftheresultsof
`users'searches.Weusedobservations(passivemeasures)becauseweobtainedalowrate
`ofresponsetorequestsforuserfeedback,aswellassomequestionableresponses.For
`eachsearch,wecollecttwotypesofinformation:
`NoResults:searchenginefailedtoreturnlinks,
`Visits:numberoflinksexploredbytheuser.
`Noresultsreducescon(cid:12)dencethatthesearchengineisappropriatefortheparticular
`query;Visitsindicatesthattheuserfoundsomereturnedlinkstobeinterestingandso
`increasescon(cid:12)dence.
`SavvySearchemploysasimpleweightadjustmentschemeforlearninge(cid:11)ectiveness
`values.Noresultsandvisitsaretreatedasnegativeandpositivereinforcement,respec-
`tively,amortizedbythenumberoftermsinthequery.Thus,ifasearchenginereturned
`nothingfortheexamplequery,thee(cid:11)ectivenessvaluesfor\arti(cid:12)cial",\intelligence"and
`\conferences"wouldeachbereducedby .Althoughsimple,thisschemeprovedtobe
`quitee(cid:11)ective(seeSection forabriefdescriptionofourevaluationofthelearning).
`
`
`Petitioner Apple Inc. - Exhibit 1007, p. 5
`
`
`
`TrackingRecentPerformanceSearchenginesoccasionallyareinaccessibleorslow
`torespond.Theirnetworkconnectionsmaybeatfaultortheenginesthemselvesmay
`beexperiencingproblemsorupgrades.SavvySearchmonitorsrecentperformanceby
`recordingthenumberoflinksreturned(hits)andtheresponsetimeforthelast(cid:12)ve
`queriessubmittedtoeachsearchengine.Givencurrentusageofthesystem,(cid:12)vequeries
`correspondstoaboutsecondsforthelarge,generalsearchenginesandabout(cid:12)fteen
`minutesfortheinfrequentlyusedsearchengines.
`CalculatingRankfromExperiencesForagivenquery(q),asearchengine's(s)
`rank(Rq;s)isitsqueryscorereducedbyapenaltyforrecentpoorperformanceonhits
`(h)andresponsetime(r):
`Rq;s=Qq;s(cid:0)(Ps;h+Ps;r)
`Eachtermisnormalizedtobetween and .
`TheequationforQq;sisbasedonacommonapproachfrominformationretrieval
`calledTermFrequencytimesInverseDocumentFrequency[Wittenetal., ]. The
`queryscore,Qq;s,sumsthemeta-indexvaluesforthetermsinthequeryweightedby
`ubiquityofthequerytermandthesearchengine:
`Qq;s=XtqMt;s(cid:3)ItpTs
`Mt;sistheweightfromthemeta-indexofthetermtforsearchengines.
`Itisthe
`inverseserverfrequencyoftheterm,whichestimatestheubiquityofthequeryterm;
`frequentlyoccurring,commontermsprovidelittleinformationfordistinguishingsearch
`engineperformanceandsoarediscounted.Tssumstheabsolutevaluesofallmeta-index
`valuesforsearchengines;thistermestimatesthefrequencyofoveralluseofthesearch
`engine.Becausethemostgeneralsearchenginesarelikelytobeusedmorefrequently
`andarelikelytoreturnsomethingforanyquery,theirweightswilltendtogrowlarger
`andmorequicklythanthespecializedsearchengines.Tsmitigatesthetendencytowards
`theirdominance,allowingmorespecializedenginestobeselectedwhenappropriate.
`Thepenalties(Ps;handPs;r)areaccruedonlyifthresholdsonminimumnumberof
`hitsandmaximumresponsetimeareexceeded.Thethresholdshavebeensetsomewhat
`arbitrarilytoaverageof hitandresponsetimeof secondsrecently.Oncethethresh-
`oldsarepassed,thepenaltiesincreasequadraticallyuptotheworstpossiblevalues:zero
`forhitsandtimeoutofsecondsforresponsetime.Detailsoftheseequationsare
`availablein[Dreilinger, ].
`. DispatchingaQuery
`Thedegreeofparallelismdetermineshowmanysearchenginestoquery;therank
`orderdetermineswhichshouldbequeried.SavvySearchdispatchesthequeryinparallel
`
`
`Petitioner Apple Inc. - Exhibit 1007, p. 6
`
`
`
`toeachoftheindicatedsearchengines.Foreachsearchengine,aspecializedinterface
`agentformatsthequeryaccordingtotheinterfaceforthesearchengineandsubmits
`it.Theinterfaceagentwaitsapresetamountoftimeforaresponse,handleserrors
`thatmightoccur,parsestheresultintoauniformformatandforwardsittoanother
`componentfordisplay.
`. PresentingResults
`Forourexamplequery,thethreesearchenginescontacted(WebCrawler,Lycosand
`YellowPages)returned di(cid:11)erentlinkswithsimilarnames.The(cid:12)rstnineareshown
`inanimageoftheresultpageinFigure.Theseresultswereintegrated;SavvySearch
`waiteduntilallresultshadbeenreceivedandthenconstructedasinglerankedlist.
`Resultsareintegratedbynormalizingthescoresreturnedbysearchenginestobetween
` and . andsummingthemforeachlink;linksforsearchenginesthatdidnotreturn
`scoreswerearbitrarilyassignedascoreof ..Duplicatelinksarelistedwiththenames
`ofallthesearchenginesthatreturnedthem.
`Thebottomofeachresultspagedisplaysthesearchplan,asshowninFigure .The
`querywasissuedduringatimeofmoderatelyhighdemand;thus,eachstepincludesonly
`threesearchengines.Searchplanscanincludebetweentwoandsixsearchenginesper
`step.Thepresentversionincludes searchengines;thenumber(cid:13)uctuatesassearch
`enginesappear,disappearandmerge.
`Tosupplementtheresultscollectedsofar,theusercanexecuteanotherstepinthe
`searchplanbyclickingontheonethatlooksthemostpromising.Theorderingrepresents
`SavvySearch'sbestguessaboutwhichsearchengineswillreturnthebestresults;theuser
`caneasilyoverrideitbyselectingadi(cid:11)erentstep.Inonelongtermstudy,wefoundthat
`usersexecuted .stepsintheirsearchplansonaverage.
` RetrospectiveandProspectiveViewsoftheSavvy-
`SearchProject
`TheSavvySearchprojecthasbeenquitesuccessful.Currently,SavvySearchprocesses
`over , querieseachdayand,basedonelectronicmail,hasattractedalargewell-
`satis(cid:12)eduserbase.
`SavvySearch,asdescribedhere,istheresultofalmosttwoyearsofworkanda
`seriesofstudies[DreilingerandHowe,toappear].Thepresentdesignoftheresource
`reasoningandlearningalgorithmresulted,inpart,fromtheresultsofthesestudies
`[Dreilinger, ].Westudiedthee(cid:11)ectsofthelearningbystartingfromaminimalmeta-
`index(compiledfromdaysworthofdata)andallowingittoaccumulateexperienceover
`adayperiod.Wefoundthatoursimplelearningalgorithmimprovedperformanceon
`bothvisitsandnoresultsoverall(includingallsearchenginesandqueryterms):visits
`
`
`Petitioner Apple Inc. - Exhibit 1007, p. 7
`
`
`
`Figure:SavvySearchdisplayofresultsofarti(cid:12)cialintelligenceconferencesquery,inter-
`leaveddisplay
`
`
`Petitioner Apple Inc. - Exhibit 1007, p. 8
`
`
`
`Figure :Searchplanforarti(cid:12)cialintelligenceconferencesquery,whichallowsusersto
`continuesearchifcurrentresultsareinadequate
`
`
`Petitioner Apple Inc. - Exhibit 1007, p. 9
`
`
`
`averaged. inthe(cid:12)rst(cid:12)vedaysand.inthelast(cid:12)vedays;noresultsaveraged.
`inthe(cid:12)rst(cid:12)vedaysand. inthelast.Thus,thislearningalgorithmleadstobetter
`selectionofgoodsearchenginesthanpruningofpoor.
`Inafollowupstudy,weexaminedhowmuchknowledgewasneededtosigni(cid:12)cantly
`improveperformanceusinglearning.Wefoundthatasfewas usagesofaqueryterm
`resultsinahalvingofnoresultsfrom. to. .Whilevisitsincreasesmoreslowlywith
`learning,anincreasefrom. to.isobtainedwithjust examples.Bytheend
`ofthestudy,themostusedterm( usages)hadvisitsof.andnoresultsof. .
`Giventhelevelofusageofmostsearchengines(millionsofqueriesperday),meta-search
`enginesshouldhavelittledi(cid:14)cultyincollectingenoughdatatosigni(cid:12)cantlyimprove
`performancethroughlearning.
`AsWebusageandusersbecomesmoresophisticated,meta-searchwillneedtobe
`abletobepersonalizedandembeddedinothersystems.Meta-searchcurrentlytakesa
`\one-size-(cid:12)ts-all"approachinwhichtheknowledgeunderlyingqueryprocessingisshared
`byallusers.AnewexperimentalversionofSavvySearch(availableviathemainWeb
`page)takesa(cid:12)rststeptowardpersonalizationbydividingsearchesintoeightcategories;
`eachcategorytranslatestoasetofrulesforcreatingasearchplanforthattypeofsearch.
`Ideally,usersthemselvescouldcreateandstoretheirownstereotypicalsearchplansora
`systemcouldinferthem.
`Justasprimarysearchenginesarenowaresourceformeta-searchengines,someta-
`searchenginesshouldbecomeameansofachievinghigherlevelgoals,ratherthananend
`initself.Intelligentagenttechnologypromisestoalleviatethetediumandfrustrationof
`mundanetasksandnavigatevastinformationspaces.Meta-searchshouldbeainforma-
`tiongatheringtoolforhelpinghumanusersandtheirintelligentagents(cid:12)ndwhatthey
`needontheWeb.
`TheWebwillcontinuetogrow.Thus,searchtoolswillcontinuetobecriticalfor
`managingtheinformationdeluge.Tokeeppacewiththeexpansion,thenextgeneration
`mustincludefarmoresophisticatedAItechniquesthanthecurrent,whileretainingsome
`ofthebene(cid:12)tsofthecurrentsystems:beeasytouse,requirelittlefeedbackfromthe
`usersandbemindfulofsharedresources.
` Acknowledgments
`TheexperimentsandequipmentweresupportedinpartbyARPA-AFOSRcontract
`F - -C- ,NSFResearchInitiationAward#IRI- andNSFCareerAward
`#IRI- .WegratefullyacknowledgetheassistanceofLawrenceLivermoreLabora-
`toryandtheComputerScienceDepartmentatColoradoStateUniversityforproviding
`machinestorunSavvySearchandWayneTrzynaandhissta(cid:11)forpatientlykeepingthe
`machinesrunning.
`
`
`Petitioner Apple Inc. - Exhibit 1007, p. 10
`
`
`
`References
`[Cross, ]WilliamCross.All-In-OneSearchPage.http://www.albany.net/allinone/,
` .
`[DreilingerandHowe, ]DanielDreilingerandAdeleHowe.Aninformationgather-
`ingagentforqueryingwebsearchengines.Technicalreport,ColoradoStateUniversity,
` .TR - .
`[DreilingerandHowe,toappear]DanielDreilingerandAdeleE.Howe.Experiences
`withselectingsearchenginesusingmeta-search.ACMTransactionsonInformation
`Systems,toappear.
`[Dreilinger, ]DanielDreilinger.Descriptionandevaluationofameta-searchagent.
`Master'sthesis,ComputerScienceDept.,ColoradoStateUniversity,Spring .
`[Eggeetal., ]TorEgge,HugoEideGunnarsen,andStigS(cid:26)therBakken.FTP
`Searchv . .http://ftpsearch.unit.no/ftpsearch, .
`[Eichmann, ]DavidEichmann.
`Ethicalwebagents.
`InElectronicPro-
`ceedingsof
`theSecondWorldWideWebConference' : Mosaicandthe
`Web,
` .
`http://www.ncsa.uiuc.edu/SDG/IT /Proceedings/Agents/
`eich-
`mann.ethical/ethics.html.
`[Gauchetal., ]SusanGauch,GuijunWang,andMarioGomez.Profusion:Intel-
`ligentfusionfrommultiple,di(cid:11)erentsearchengines.JournalofUniversalComputer
`Science,( ),Sept. .
`[Madere, ]Steve
`Madere.
`DejaNews
`research
`service.
`http://dejanews .dejanews.com/, .
`[MonierandBurrows,]LoiusMonierandMikeBurrows.AltaVistaSearchEngine.
`http://www.altavista.digital.com/.
`[ofKansasDesignLab,]UniversityofKansasDesignLab.
`ProFusionmeta-search.
`http://www.designlab.ukans.edu/profusion/.
`[Pinkerton, ]BrianPinkerton.WebCrawlerHomePage.http://webcrawler.com/,
` .
`[SelbergandEtzioni, a]ErikSelbergandOrenEtzioni.Multi-servicesearchand
`comparisonusingtheMetaCrawler.
`InProceedingsofthethInternationalWorld
`WideWebConference,December .
`[SelbergandEtzioni, b]ErikSelbergandOrenEtzioni.TheMetaCrawlerWWW
`SearchEngine.http://metacrawler.cs.washington.edu: /home.html, .
`
`
`Petitioner Apple Inc. - Exhibit 1007, p. 11
`
`
`
`search.
`META
`In(cid:12)NET
`Services.
`[Services, ]In(cid:12)NET
`http://members.gnn.com/in(cid:12)net/meta.htm, .
`[Wittenetal., ]IanH.Witten,AlistairMo(cid:11)at,andTimothyC.Bell.ManagingGi-
`gabytes:CompressingandIndexingDocumentsandImages.VonNostrandReinhold,
`NewYork, .
`
`
`
`Petitioner Apple Inc. - Exhibit 1007, p. 12