`[11] Patent Number:
`[45] Date of Patent: Nov. 16, 1999
`Dec., et al., “HTML & CGI”, Samsnet, 1995, pp. 432—466.
`Primary Examiner—Thomas G. Black
`Assistant Examiner—David Yiuk Jury
`Attorney, Agent, or Firm—Coudert Brothers
`Inventor: Allen Hobbs, 26 E. 10th St., Apt. 8C,
`New York, NY. 10003
`Appl. No.: 08/871,773
`Filed: Jun. 9, 1997
`Int. Cl.6: G06F 17/00
`U.S. Cl.: 707/4; 707/201; 707/101; 707/501; 707/10
`Field of Search: 707/1-206, 501-513
`References Cited
`5,367,621  11/1994  Cohen et al.      395/154
`           4/1995   Oren et al.       395/600
`5,455,945  10/1995  VanderDrift       395/600
`           4/1996   Miller            395/600
`           5/1996   Weiner            364/41901
`           6/1996   Meske et al.      395/600
`           8/1996   May et al.        395/600
`           8/1996   Jacob             364/47901
`           9/1996   Matsunaga et al.  395/600
`5,577,241  11/1996  Spencer           395/605
`5,586,260  12/1996  Hu                395/200.2
`           5/1997   Oren et al.       395/602
`           7/1997   Hekmatpour
`           6/1998   Logan et al.
`           7/1998   Logan et al.
`           7/1998   Anderson et al.
`Apparatus and method are disclosed for selecting multimedia
`information, such as video, audio, graphics and text
`residing on a plurality of Data Warehouses, relational
`database management systems (RDMS) or object-oriented
`database systems (ODBA) connected to the Internet or other
`network, and for linking the multimedia information across
`the Internet, or other network, to any phrase, word, sentence
`and paragraph of text; or numbers; or maps, charts, and
`tables; or still pictures and/or graphics; or moving pictures
`and/or graphics; or audio elements contained in documents
`on an Internet or intranet web site so that any viewer of a
`web site, or other network resource, can directly access
`updated information in the Data Warehouse or a database in
`real time. The apparatus and method each: (i) stores a
`plurality of predetermined authentication procedures (such
`as user names and passwords) to gain admittance to Data
`Warehouses or databases; (ii) stores the Universal Resource
`Locators of intranet and Internet addresses of a plurality of
`expert-predetermined optimum databases or Data Warehouses
`containing text, audio, video and graphic information, or
`multimedia information relating to the information on the
`web site or other network resource; (iii) stores a plurality
`of expert-predetermined optimum queries for use in the
`search engines of each of the pre-selected databases, each
`query representing a discrete searchable concept as
`expressed by a word, phrase, sentence or paragraph of text,
`or any other media such as audio and video on a web site,
`or other network resource; and (iv) presents to the user the
`results of a search of the Data Warehouse or database
`through a graphical user interface (GUI) which coordinates
`and correlates viewer selection criteria with the expert
`optimum remote database selection and queries.
`61 Claims, 17 Drawing Sheets
`U.S. Patent
`Nov. 16, 1999
`Sheet 1 of 17
`1. Field of the Invention
`The present invention relates to information retrieval, and
`the application and deployment architecture for such
`information retrieval. Specifically, the present invention
`concerns a multi-tier client/server model for record
`retrieval wherein optimum record retrieval from a database
`is achieved based on embedded expert judgments linked to
`words, phrases, sentences and paragraphs of text; or numbers;
`or maps, charts, and tables (including spread sheets); or
`still pictures and/or graphics; or moving pictures and/or
`graphics; or audio elements (hereinafter sometimes
`collectively referred to as the “links” or “Linked Terms,”
`or when any one of the aforementioned elements is used
`singly, as the “link” or “Linked Term”), contained in
`documents on a network resource, such as a web site, and
`incorporating an intuitive graphical user interface (GUI) to
`correlate through a plurality of frames the retrieved records
`with records from one remote database or a large collection
`of remote databases maintained by one company, called a Data
`Warehouse, plus a means to select various databases or Data
`Warehouses and a comprehensive selectable index of the
`linked embedded
`2. Background Information
`“Pull” Technology
`A conventional information retrieval system includes a
`database of records, a processor for executing searches on
`the records, and application software that controls how the
`retrieval system, such as a database management system
`(DBMS), accepts the search queries, manages the search,
`and handles the search results. Generally, the database
`includes records such as text documents, financial or court
`records, medical files, personnel records, graphical data,
`technical information, audio and video files or various
`combinations of such data. Typically, a user enters a
`password and client billing information, and then initiates
`the search by finding the appropriate database or groups of
`databases to search and formulating a proper query that is
`sent to the DBMS. This process is known as searching by
`pull technology. To effectively search and retrieve records
`from the database, the DBMS typically offers a limited
`variety of search operations, or query models, specifically
`designed to operate on the underlying records in the
`database. The query models are coordinated and executed by
`an application generally referred to as a search engine. For
`example, a document database, such as a database of court
`opinions, may be organized with each court opinion as a
`record with fields for the title of the case, jurisdiction,
`court and body text. A simple search engine may support a
`full text searching query model for all the text fields,
`individual field searching, such as searching by court or
`jurisdiction, and various Boolean search operations such as
`and, or, and not. More sophisticated search engines may
`support the following query models:
`1. nested Boolean or natural language searches;
`2. grammatical connectors that search for terms in a
`grammatical relationship such as within the same sentence or
`paragraph (e.g., “/s”, “/p”, etc.);
`3. proximity connectors that require search terms to appear
`within a specified number of terms of each other (e.g.,
`4. exclusion terms (“BUT NOT”);
`5. weighted keyword terms;
`6. wildcards;
`7. specification of the order in which the database processes
`the search request (e.g., grouping words in parenthetical
`8. restriction of the search to certain fields, and formulation
`of a restricted search such as by date, subject, jurisdiction,
`title, etc.; and
`9. combination of the fields of search.
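Several of the query models listed above can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions: the two sample records, the field names, and the helper functions (`full_text`, `field_match`, `within`, `same_sentence`, `search`) are hypothetical and are not taken from any actual DBMS or search engine described in this document.

```python
import re

# Hypothetical records, modeled on the court-opinion example in the text.
RECORDS = [
    {"title": "Smith v. Jones", "jurisdiction": "NY", "court": "Supreme Court",
     "body": "The lease was terminated. The tenant appealed the order."},
    {"title": "Doe v. Roe", "jurisdiction": "CA", "court": "Court of Appeal",
     "body": "The tenant paid rent late, and the lease remained in force."},
]


def tokens(text):
    """Return the lowercase word tokens of a text field."""
    return re.findall(r"[a-z0-9]+", text.lower())


def full_text(record, term):
    """Full-text query: the term appears in any field of the record."""
    return any(term.lower() in tokens(value) for value in record.values())


def field_match(record, field, term):
    """Field-restricted query: the term appears in one named field."""
    return term.lower() in tokens(record[field])


def within(record, a, b, distance, field="body"):
    """Proximity connector: a and b occur within `distance` words of each other."""
    words = tokens(record[field])
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)


def same_sentence(record, a, b, field="body"):
    """Grammatical connector in the spirit of "/s": both terms in one sentence."""
    sentences = re.split(r"[.!?]", record[field])
    return any(a.lower() in tokens(s) and b.lower() in tokens(s) for s in sentences)


def search(predicate):
    """Return the titles of all records that satisfy the predicate."""
    return [r["title"] for r in RECORDS if predicate(r)]
```

A Boolean query with an exclusion term is then an ordinary combination of predicates, for example `search(lambda r: full_text(r, "lease") and not field_match(r, "jurisdiction", "CA"))`. The proximity and same-sentence connectors can return different records for the same pair of terms, because proximity counts words across sentence boundaries.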
`In addition, large commercial database providers typically
`have thousands of individual databases. These large
`commercial database providers are Data Warehouses, which
`comprise an architecture and process where data are
`extracted from external information providers, then
`formatted, aggregated, and integrated into a read-only
`database that is optimized for decision making. Users
`subscribe to the Data Warehouses by monthly or yearly
`subscription, and then typically pay stratified levels of
`hourly charges for access to certain databases, or groups of
`databases.
`Drawbacks of Pull Technology
`One limitation of existing information retrieval systems,
`especially among the commercial Data Warehouses, is the
`burden on the user to first enter client and billing
`information and passwords to gain access and initiate the
`search, and then formulate the search query. Typically, the
`subscription-based commercial database services provide
`password administration and extensive catalogues, both in
`print and on-line, describing the content and scope of the
`databases offered, and in some cases, live assistance by
`telephone by reference librarians who assist the user to
`find the proper databases. However, the user must remember
`the password, and spend time finding the proper database by
`catalogue, on-line access, or phone, or else incur more
`expensive hourly charges searching through single databases
`or groups of databases for the appropriate database content
`and scope.
`A second limitation of pull technology is the formulation
`of the search query. To use the more powerful commercial
`Data Warehouses effectively, a user must be trained to use
`all of the aforementioned query models, and have sufficient
`knowledge of the topic to choose the appropriate keywords
`or natural language terms. The complexity of the search
`process compels the commercial Data Warehouses to offer
`training and keyword help to their subscribers by multiple
`publications that describe search tips; interactive
`software-based training modules; account representatives
`who visit the user and train him or her; and customer
`service and reference librarians available by phone.
`A third limitation of pull technology concerns how it is
`employed on the World Wide Web area of the Internet
`(“WWW”) by such search engines as THE ELECTRIC
`WWWWORM, and YAHOO!, just to name a few. These
`search engines’ query models are beginning to approach the
`sophistication and complexity of those of the commercial
`database companies, but unlike the commercial databases,
`they offer minimal customer support. Another drawback of
`the Internet search engines, well documented in the
`computer business and popular press, is that their search
`engine algorithms cause multiple irrelevant responses to a
`query. Other drawbacks of Internet search engines employing
`pull technology include:
`1. The great majority of the Internet search engines have no
`control over the records in their database. Unlike the
`commercial Data Warehouses, which have an ongoing
`relationship with the content provider (usually by a license
`agreement), and which carefully screen, cleanse and format
`the information provided by their information providers,
`many Internet search engines sweep through the WWW
`periodically and automatically, and catalogue web sites as
`records in their databases. They also permit any web
`publisher to submit his or her web site as a record entry
`with little or no prior screening.
`2. As a result of little or no screening, and absolutely no
`contact with the information provider, Internet search
`engines often provide search results that have multiple
`“dead ends,” the result of links which are often moved or
`deleted after the search engines have catalogued them.
`Moreover, the web sites’ authors can sometimes manipulate
`the words on their site and cause the Internet search
`engines to list their web sites higher on the search
`engine’s relevancy lists than other web sites.
`3. The search engines’ databases include only a fraction of
`the Internet’s content, and even then, the content may be
`from dubious sources, or sources which are not updated
`4. Where the web sites include embedded search terms in
`links in documents to existing Internet search engines or
`current awareness “news” databases, since the words are
`linked to the free Internet search engines discussed above,
`the information retrieved, for reasons explained above, is
`not reliable and users often receive multiple irrelevant
`responses. Words linked to the current awareness databases
`receive more useful information, but there is no GUI
`correlating and synchronizing the records of multiple
`databases. Typically, those web sites pass authentication
`information by the QUERY_STRING environment variable.
`Once the authentication argument is placed on the command
`line by the browser, the viewer can see all passwords and
`usernames in it.
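The exposure described in item 4 can be shown with a minimal sketch. It assumes a hypothetical search endpoint (`db.example.com`) and hypothetical parameter names (`user`, `pass`, `q`); none of these come from the document. The point is only that anything sent in the query string of a GET request, which a CGI program receives through the QUERY_STRING environment variable, is also readable by anyone who can see the URL.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(base, username, password, query):
    """Build a GET URL that carries credentials in the query string (the risky pattern)."""
    return base + "?" + urlencode({"user": username, "pass": password, "q": query})


def visible_credentials(url):
    """Return what anyone who can read the URL (location bar, logs) can recover."""
    params = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in params.items() if key in ("user", "pass")}


# Hypothetical host, account, and query, used only for illustration.
url = build_search_url(
    "http://db.example.com/cgi-bin/search", "jsmith", "s3cret", "bond yields"
)
print(url)
print(visible_credentials(url))
```

The text after the `?` in the printed URL is exactly what the server-side program would read from QUERY_STRING, so the user name and password travel in plain view alongside the search terms.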
`The considerable logistical and practical drawbacks of
`pull technology are illustrated in the following example of
`an investment banker who is responsible for buying bonds
`for an institutional investor, such as a bank or an insurance
`company. This hypothetical investment banker, based on an
`actual person, will be used at different points throughout
`this patent application to illustrate and support the novelty
`and unobviousness of the present invention.
`Every week, this investment banker must go before a
`board of executives at his bank and provide them with a list
`of bonds that he has examined and analyzed and recommends
`that the bank buy. In order to do his due diligence
`he must cover in his report five areas of research concerning
`the bond: 1) compare the bond price to other bond prices (the
`Bond Comparables); 2) obtain historical data concerning the
`bond and the company issuing the bond (the Historical
`Data); 3) obtain the Securities and Exchange Commission
`filings, such as 10K’s and 10Q’s, for the company issuing the
`bond (the SEC Filings); 4) obtain specific information from a
`wide variety of publications concerning the industry in which
`the company operates (the Industry Data); and 5) obtain
`information concerning the historical and anticipated
`performance of the company’s stock (the Stock Data).
`Furthermore, he has to read various newsletters and white
`papers issued by investment banks desiring to sell the bonds
`to him, and which analyze the bonds using the same criteria
`mentioned above. In order to collect the data, this
`investment banker must log on and enter password and billing
`information; find the appropriate databases; and formulate
`the search and obtain the results in three to five different
`Data Warehouses, each of which is organized differently
`from the others and has different methods to enter search
`queries, and different query models. While pull technology
`satisfies the demands for the breadth and depth of the search
`(since the user can formulate his or her own queries, and
`make unlimited selections of databases to search), it is
`time consuming, cumbersome and expensive because the user
`must find the appropriate query formulation and database or
`databases within which to run the query, sometimes even in
`different Data Warehouses.
`“Push” Technology
`In response to the flood of information facing the typical
`Internet user under the pull model, the complexity of the
`query statements, and the well documented inability of the
`Internet search engines to locate and deliver relevant
`content, software companies developed software agents to
`push information to users. The push model is also known as
`“webcasting.”
`Under push, computers sift through large volumes of
`information, filtering, retrieving and then ranking in order
`of importance articles of current interest. The user fills
`out a “profile” (also called a “channel”) that defines a
`predefined area of interest or activates a filter. This, in
`turn, causes the webcast search engine to search its own
`databases, or the databases of others, for content matching
`the profile or the filters submitted by the user. The user,
`in order to access the channels and have the content
`“pushed” to him or her, must download special client
`software which acts either independently of, or in
`conjunction with, the user’s browser. Alternatively, a user
`can access a dynamically generated web page on the
`webcaster’s server that lists the found articles.
`(An example of a dynamically generated web page is
`“Newspage Direct” by Individual, Inc.)
`One early version of the Internet push model, developed
`by Pointcast Inc., clogged the network behind a company’s
`firewall when large numbers of its employees’ software
`agents pulled information from Pointcast’s servers on the
`Internet at or near the same time. Pointcast later
`alleviated this problem by providing remote servers that
`could operate behind a company’s firewall and request and
`collect (or cache) information at once or at predetermined
`times from the Pointcast servers on the Internet. These
`intermediate servers then pushed the information to
`employees, which effectively centralized the distribution of
`information in the Information Services (IS) department.
`As mentioned above, all push technology requires that
`users compile a “profile” to detail their interests. The prior
`art of delivering the information obtained by the search
`engine pursuant to the profile is divided into three broad
`categories: offline browsers; e-mail delivered content
`providers; and information channels.
`The offline browsers typically operate by requiring a user
`to complete a profile with predetermined categories; they
`then automatically search the Internet for the information
`specified in the profile and download the materials to the
`user’s hard drive for viewing at a later time when the user
`is off the Internet. This first category of products
`includes: Freeloader by Freeloader, Inc.; Smart Delivery by
`FirstFloor, Inc.; WebEx by Traveling Software, Inc.;
`WebRetriever by Folio Inc.; and Web Whacker by ForeFront
`Group, Inc.
`The second category of push products delivers the results
`of searches performed pursuant to the user’s profile directly
`to the user’s e-mail box, and includes: Netscape’s Inbox
`Direct and Microsoft Mail.
`The third category of push products arranges the
`predetermined categories into “channels” and uses filters to
`allow users to customize their news deliveries from a broad
`range of proprietary news sources. It is claimed that the
`results of the searches are pushed or “broadcast” in real
`time to the viewer. Examples of this type of service include:
`BackWeb by BackWeb, Inc.; Headliner by Lanacom, Inc.; Incisa
`by Wayfarer, Inc.; Intermind by Intermind, Inc.; Pointcast by
`Pointcast, Inc.; and Marimba by Marimba, Inc. However,
`since the retrieved data is first cached on the service
`provider’s server (e.g., Pointcast’s server), and then again
`on the company’s servers behind the firewall, the results of
`the search are not really “broadcast in real time.”
`There is a fourth category of push products which do not
`fall neatly into any of the above three categories of delivery.
`Citizen 1 by Citizen 1 Software, Inc., is a human-organized
`hierarchical listing of free Internet search engines. The user
`can then select a number of databases which fall under that
`category, and run several simultaneous queries in the
`databases. Digital Bindery by Digital Bindery Company allows
`users to “subscribe” to web pages as they browse. Once a
`subscriber, the user will automatically receive via e-mail any
`updates to the web pages to which the user subscribed.
`Webcasting attempts to eliminate the inefficiencies of pull
`technology, namely the time consuming and unproductive
`hunt for information through Internet search engines.
`Instead of an open ended search through many databases
`linked to the web by various search engines, as is done under
`the pull model, push substitutes one central secure database
`which has collected either the content itself, or the links to
`the content. However, in spite of the name “push,” the
`information provider does not drive the distribution of data.
`Instead, a client (in a client/server arrangement) contacts the
`information provider and requests the information. The
`client then downloads the information in the background,
`giving the impression that it is broadcast, when in fact, it is
`only automatically downloaded at a predetermined time.
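The point that “push” is in practice a scheduled pull can be made concrete with a small sketch. The classes below (`InformationProvider`, `PushClient`) and their methods are hypothetical names chosen for illustration, not part of any product named in this document. The provider never initiates contact; content reaches the client only when the client's own poll runs, which a timer would trigger at predetermined times.

```python
class InformationProvider:
    """Hypothetical server: it stores articles and only answers requests."""

    def __init__(self):
        self.articles = []        # list of (channel, headline) in publication order
        self.requests_served = 0  # every delivery is the result of a client request

    def publish(self, channel, headline):
        self.articles.append((channel, headline))

    def fetch(self, profile, since):
        """Return headlines published after index `since` that match the profile."""
        self.requests_served += 1
        new_items = self.articles[since:]
        matching = [headline for channel, headline in new_items if channel in profile]
        return matching, len(self.articles)


class PushClient:
    """Hypothetical client software holding the user's profile of channels."""

    def __init__(self, provider, profile):
        self.provider = provider
        self.profile = set(profile)
        self.cursor = 0   # how far into the provider's article list we have read
        self.inbox = []

    def poll(self):
        """One background download; a timer would call this at predetermined times."""
        items, self.cursor = self.provider.fetch(self.profile, self.cursor)
        self.inbox.extend(items)
        return items


provider = InformationProvider()
client = PushClient(provider, ["bonds"])
provider.publish("bonds", "Treasury yields rise")
provider.publish("sports", "Local team wins")
inbox_before_poll = list(client.inbox)  # still empty: nothing was broadcast
first_poll = client.poll()              # the client requests, then downloads
```

In this sketch the published article sits on the provider until `poll()` runs, which matches the description above of a client that contacts the information provider and downloads in the background.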
`Shortcomings of “Push” Technology
`“Push” may be a satisfactory method for serving information
`to knowledge workers who depend on a constant stream of
`updated factual information served in narrow categories.
`Examples of these kinds of workers would be sales
`representatives who must find new prospects, staff in field
`offices who must be aware of sudden price changes,
`information managers who must distribute software upgrades
`and marketing professionals who must be aware of the new
`products released by the competition.
`However, there is a category of knowledge workers
`whose information needs are not properly satisfied by push
`technology. The hypothetical investment banker discussed
`above is an example of such a knowledge worker. These
`knowledge workers cannot use “filters” and “profiles” to
`provide the most relevant information since the information
`they need cannot easily fit into categories, but rather spans
`categories. These knowledge workers use information to
`solve problems that are rarely alike. They need information
`to solve a problem, but they do not know what they need day
`to day.