`
(19) World Intellectual Property Organization
International Bureau

(43) International Publication Date: 16 September 2004 (16.09.2004)

PCT

(10) International Publication Number: WO 2004/079485 A2

(51) International Patent Classification: G06F

(21) International Application Number: PCT/GB2004/000959

(22) International Filing Date: 5 March 2004 (05.03.2004)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data: 0305145.5, 6 March 2003 (06.03.2003), GB

(71) Applicant (for all designated States except US): IMPERIAL COLLEGE
INNOVATIONS LTD [GB/GB]; Sherfield Building, Imperial College, London
SW7 2AZ (GB).

(72) Inventor; and
(75) Inventor/Applicant (for US only): WHITWELL, Martyn [GB/GB]; 42 Vale
Road, Mansfield Woodhouse, Nottinghamshire NG19 8EA (GB).

(74) Agents: GWILYM, Roberts et al.; Kilburn & Strode, 20 Red Lion Street,
London WC1R 4PJ (GB).

(81) Designated States (unless otherwise indicated, for every kind of
national protection available): AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR,
BW, BY, BZ, CA, CH, CN, CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, EG, ES, FI,
GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK,
LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ, NA, NI, NO, NZ, OM, PG,
PH, PL, PT, RO, RU, SC, SD, SE, SG, SK, SL, SY, TJ, TM, TN, TR, TT, TZ, UA,
UG, US, UZ, VC, VN, YU, ZA, ZM, ZW.

(84) Designated States (unless otherwise indicated, for every kind of
regional protection available): ARIPO (BW, GH, GM, KE, LS, MW, MZ, SD, SL,
SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European
(AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HU, IE, IT, LU, MC,
NL, PL, PT, RO, SE, SI, SK, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ,
GW, ML, MR, NE, SN, TD, TG).

Published: without international search report and to be republished upon
receipt of that report

(54) Title: IMPROVEMENTS IN INTERNET SITE ARCHITECTURE
`
`
`
`
(57) Abstract: An apparatus for sending to a user data related to a page of
information (17), such as a web page, uses a controller (12) and a search
module (14) with an index (15) for containing data relating to the pages of
information. When a page of information is updated, the controller controls
the search module to update the index to reflect the fact that the page has
been updated.
`
`PROVI-1016 - Page 1
`
`
For two-letter codes and other abbreviations, refer to the "Guidance Notes on
Codes and Abbreviations" appearing at the beginning of each regular issue of
the PCT Gazette.
`
`
IMPROVEMENTS IN INTERNET SITE ARCHITECTURE
`
The present invention relates to developments in the architecture of Internet
sites. The present invention also relates to the manner in which data may be
handled, retrieved from storage and prepared for sending to a user.
`
The Internet is a vast network of computers and computer servers, all of which
store information which may be downloaded or retrieved by a user. To use the
Internet, a user with a computer (or other Internet-enabled device, such as a
palm device or a mobile phone with a WAP browser) uses an Internet connection
to browse online through the documents and files available on the Internet. He
may also choose to download various files, such as text, graphic, video or
music files.
`
Internet sites created and/or maintained by business and non-business users
alike often contain “pages” of information which include large files, the
content of these files being in many of the available formats mentioned above.
For example, a page may comprise any or all of the following: text, audio,
video or other graphics (GIF or equivalent files), Java applets etc. Often, the
software applications which manage these Internet sites are required, when a
user makes a request, to perform intensive processing in order to retrieve the
necessary data from the site in question and send the information to the user.
With the advent of greater system requirements and the increasing complexity of
the information which is available on the Internet, systems presently in use
are often cumbersome and slow to respond to the heavy processing required of
them.
`
`
As a means of browsing the Internet, a user may typically employ a web browser
which allows access to various programs and functions for searching the
Internet. One of the preferred methods of searching the Internet is to employ a
search engine, which allows for the searching of pages of interest in a number
of ways, for example by “key-word search”.
`
One particular context in which search engines are known is their use within a
specific web site or network, i.e. one having a defined and finite set of
pages, in order to allow key-word searches. Periodically (for example every few
weeks, or possibly more frequently), a search engine will deploy a program,
such as a “spider” program, to visit and read every page, or representative
pages, on the web site or sites that the web site owner or administrator
designates as searchable, using the hypertext links on each page to discover
and read other pages related to the page in question. The spider program can
create an index in the search engine, sometimes called a “catalogue”, from the
pages that are being read, populating the index with data related to the
information the spider program has read from the target page. When a user
enters a key-word search entry, the search engine program compares it to
entries in the index which it already holds and returns results to the user in
the form of exact or close matches. A problem with this system is that the
spider program trawls the web only periodically, meaning that data entries in
its index are often outdated, and sometimes entirely wrong.
`
The invention is set out in the appended claims.
`
Because the index is updated dynamically and hence is continuously up to date,
the search engine will always direct the user to a page of information which is
up to date.
`
In a preferred embodiment, the invention further comprises a cache for storing
data related to a plurality of pages of information. Caching servers and
systems are used to store information temporarily or permanently, depending on
the type of memory used for the cache; a common choice for this purpose is very
fast volatile memory, in which web pages requested by a user are stored
temporarily. Because the cache is dynamically updated according to the
preferred aspect of the invention, a second user requesting the same page will
be able to retrieve the desired page from cache memory rather than being
directed to the slower forms of storage which are usually employed as the
primary storage device. An advantage of the present arrangement is that all
pages are kept updated in cache; there is then a reduced processing burden on
the server system in providing the page in readable format to the user.
`
In embodiments of the invention, the controller is operable to request that at
least one of the search module and the cache read said updated page to extract
the relevant information. The controller instructs the search module and/or the
cache to scan the updated web page to extract the relevant information (which
will be in the form of data for the search engine index, or information in page
format for the cache).
`
In an alternative arrangement, the controller is operable to send data related
to said updated page to at least one of the searching means and caching means
in order that they may extract the relevant information themselves. This
further reduces the processing required of the controller; such processing is
“delegated” to the search engine or caching device.
`
In embodiments of the invention, the search module and the cache are operable
to be updated simultaneously in response to the request from the controller.
The efficiency of the system may be further enhanced by having both the search
engine and the cache retrieve the relevant information in response to a single
command from the controller.
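The single-command, simultaneous update described above can be sketched as follows. This is only an illustrative sketch: the function names (`update_index`, `update_cache`, `notify_update`), the dictionary-based index and cache, and the use of a thread pool are assumptions made for the example and are not taken from the specification.

```python
from concurrent.futures import ThreadPoolExecutor


def update_index(index, page_id, text):
    # Hypothetical search-module update: store the page's distinct words.
    index[page_id] = sorted(set(text.lower().split()))
    return "index"


def update_cache(cache, page_id, text):
    # Hypothetical cache update: store the page body ready for serving.
    cache[page_id] = text
    return "cache"


def notify_update(index, cache, page_id, text):
    """One controller command that triggers both updates concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(update_index, index, page_id, text),
            pool.submit(update_cache, cache, page_id, text),
        ]
        # Wait for both components to finish before reporting back.
        return [future.result() for future in futures]


index, cache = {}, {}
done = notify_update(index, cache, "home", "Welcome welcome visitors")
```

In this sketch the controller does no extraction work itself; it only dispatches the same notification to both components and waits for them to complete.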
`
In embodiments of the invention, the controller is operable to communicate
with at least one of said search module and said cache in a cross-platform
protocol. Cross-platform protocols such as Web Services, which are relatively
new developments, allow Remote Procedure Calls (RPCs) between different
computer systems.
`
In embodiments of the invention, the controller is operable to communicate
with said search module and said cache over a network. As a result, the various
components of the system may be remote from one another, connected over a
network. Any network, such as a Wide Area Network (WAN), a Local Area Network
(LAN) or indeed a wireless network, may be utilised in providing such a
network.
`
In embodiments of the invention, the cache is operable to send data related to
a page of information to a user. Once the data has been sent to the cache, the
page may be sent onward from cache to a user who, being connected to the
Internet, could be in any part of the world. He might also be in the same
organisation as the web administrator. Aspects of the present invention lend
themselves equally well to many or all types of networks.
`
In embodiments of the invention, the system further comprises a store for
storing data related to a plurality of pages of information. Typical storage
means such as a computer hard disk or server may be employed for this task.
`
In embodiments of the invention, the store comprises a database (for example,
a flat-file or relational database) or any other type of storage program or
software suitable for this purpose, e.g. spreadsheets, proprietary binary
files, simple text files, or XML files (discussed further below).
`
Alternatively, the store could comprise a content management system.
Proprietary content management systems such as the Microsoft Content
Management Server may be employed for this purpose.
`
In embodiments of the invention, the cache comprises at least two caching
servers, wherein said caching servers are connected to said storage means.
Multiple caching servers allow fast, easy and efficient servicing of a
multiplicity of users, as may be required by popular Internet sites which
attract many users every day, often simultaneously. The caching servers may be
used to serve the various users; for example, if a page is retrieved from the
store for the purposes of sending to a user, it may be stored in cache for a
further user to retrieve without placing an undue burden on the processing
system. This may remove the necessity to have multiple databases or content
management systems for the purposes of serving numerous users, together with
the problems inherent in such an arrangement (synchronizing data between the
various data stores etc.).
`
In embodiments of the invention, said cache is operable to process the data
prior to sending said data to a user. The caching server or system may be
operable to provide secondary processing on the data before it is sent to a
user; for example, if the user has limited rights, the caching server may note
this and withhold all or part of the requested page.
`
In embodiments of the invention, said cache is operable to perform a
re-organisation of the data, for example a filtering of data, prior to sending
it to a user. For example, if the user has limited rights, the caching server
may note this and withhold all or part of the requested page.
`
In embodiments of the invention, said data related to a page of information
comprises data in XML format. eXtensible Mark-up Language (XML) is an industry
standard means of storing data. This format allows many heterogeneous computer
platforms to communicate with one another.
`
It will be appreciated that features of one aspect of the invention may be
applied to features of another aspect of the invention.
`
Embodiments of the present invention will now be described, by way of example
only, and with reference to the accompanying drawings in which:

Figure 1 is a block diagram illustrating an embodiment of the present
invention;
Figure 2 is a block diagram illustrating a preferred embodiment of the present
invention;
Figure 3 shows the system data flows in a preferred embodiment of the present
invention;
Figure 4 shows a flow of information between the various components of a
preferred embodiment of the invention;
Figure 5 illustrates the software architecture of a preferred embodiment of
the invention;
Figure 6 shows a flow chart of the authentication system employed by the
invention; and
Figure 7 illustrates the process by which further caching servers may be added
to the system.
`
Referring now to Figure 1, the system 10 of the present invention is
illustrated. In the system 10, a controller 12 communicates with a search
device 14 and pages of information 17 as shown. The pages of information
comprise data in any format: text, graphics, audio, video etc. The search
device 14 further comprises an index 15 which, when in use, is populated with
data relating to the pages of information 17. When one or more pages of
information are updated, say by a user such as a web administrator, the
controller 12 is notified of this and instructs the search device 14 to obtain
data related to the updated page(s).

To do this, the search device employs a program (such as the spider program
described above) to read the updated page and extract the required data. When
doing this, the search device communicates with the pages of information either
directly or via the controller. Alternatively, the controller 12 instructs the
page of information to be sent to the search device 14, possibly routing the
page through itself.
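A minimal sketch of the Figure 1 arrangement may help to contrast it with periodic crawling: the index is refreshed at the moment the controller learns of a change. The class names, method names and the simple word-to-pages index structure below are hypothetical choices for illustration and do not come from the specification.

```python
class SearchModule:
    """Hypothetical search device (14) holding a keyword index (15)."""

    def __init__(self):
        self.index = {}  # word -> set of page identifiers

    def reindex(self, page_id, text):
        # Drop stale entries for this page, then add its current words.
        for pages in self.index.values():
            pages.discard(page_id)
        for word in text.lower().split():
            self.index.setdefault(word, set()).add(page_id)

    def search(self, word):
        return sorted(self.index.get(word.lower(), set()))


class Controller:
    """Hypothetical controller (12) that is notified when a page changes."""

    def __init__(self, pages, search_module):
        self.pages = pages
        self.search_module = search_module

    def page_updated(self, page_id, new_text):
        # Record the new content and immediately instruct the search module.
        self.pages[page_id] = new_text
        self.search_module.reindex(page_id, new_text)


pages = {}
search = SearchModule()
controller = Controller(pages, search)
controller.page_updated("home", "Welcome to the college")
controller.page_updated("home", "Welcome to the institute")
```

After the second update, a search for the old wording no longer returns the page, which is the behaviour a periodically run spider cannot guarantee between crawls.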
`
Another embodiment of the invention is illustrated in Figure 2. The system,
again generally referred to by reference 10, comprises three main systems: a
controller 12, a search engine 14, and a caching system 16 coupled to a content
store/database 18 where the pages of information 17 of Figure 1 may be stored.
The content store/database 18 comprises a proprietary Content Management
System, for example as provided by Microsoft. Alternatively, the content
store/database 18 is another data storage and management tool such as a
database, spreadsheet or other suitable example as discussed above. The
controller 12, the search engine 14 and the caching system 16 all communicate
with the outside world, preferably via the Internet 20. Alternatively, the
system is put into use in other types of network such as a local area network
and/or Intranet.
`
In use, the search engine 14 (which is provided either remote from the system,
i.e. connected through the Internet, or locally to the controller and content
store) conducts a search of the content store/database 18 through the
controller 12. The search engine contains an index of data related to the
information in page format stored in the content store/database 18. When a user
local to the system updates a page, the controller 12 is notified of this and
sends out a request to the search engine to retrieve updated information
related to the updated page.
`
This may be done in various ways. In a first example of updating the
information, the controller sends a request to the search engine to read the
updated page. The search engine utilises the spider or an equivalent type of
program for this purpose. The spider program then reads the updated page,
either directly or via the controller, and extracts the relevant information.
In a second example, the controller downloads the updated page from the content
store/database 18 and sends the page to the search engine 14 for it to perform
the necessary processing locally.
`
`
Additionally, or simultaneously, the cache 16 is similarly updated. However,
the cache 16 will not store the information in an index format; the content
store/database arranges the requested information into page format suitable for
sending to a user to view on a browser or by other means; for example, the
information may be stored in XML format and translated by XSL before sending to
a user (as discussed in more detail below). The information is stored in cache
16 in this format. It will be appreciated that alternatively the information is
stored in the cache in a manner more suited to the user who will be requesting
the information. The cache 16 then sends the requested page to the user in a
suitable format. The cache device may perform some further processing on the
page prior to sending it to a user; for example, if the user has restricted
rights, the cache device can filter or block some or all of the page content
before sending it to the user. In embodiments of the invention, this is
effected by the use of XSL (which is discussed further below).
`
A second or further user may then request the same information as the first
user has done. Instead of instructing the same heavy processing again at the
content store/database 18, the controller first checks to see if the requested
page is resident in cache 16. If the requested page is resident in cache, the
controller sends a signal to the cache to send the requested page information
to the user.
`
The controller communicates with both the search device 14 and the cache 16 by
means of a cross-platform development such as Web Services. Web Services is a
conceptual way of communicating which allows procedure calls (known as RPCs)
between different computer systems. These effectively allow different processes
running on different computers to communicate with one another. They utilise
the Simple Object Access Protocol (SOAP) to provide a standardised message
format, exchanging messages in XML format. The controller exposes certain
commands to the rest of the system (for example, cache, search engine) as web
services. This means that these commands (functions, routines or blocks of
code, examples of which might be a command to retrieve a web page, or a command
to check a username and password) can be executed (“called”) by using a web
service methodology. The system may utilise web services to synchronise the
caching system and search engine system with the controller.
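To make the idea of a command carried in a SOAP message more concrete, the sketch below builds and parses a small SOAP 1.1 envelope using Python's standard `xml.etree.ElementTree` module. The command name `PageUpdated`, its `url` parameter and the `urn:example:controller` namespace are hypothetical; the specification does not define a message schema, and a real deployment would normally use a web services toolkit rather than hand-built XML.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:controller"  # hypothetical application namespace


def build_request(command, **params):
    """Wrap a command and its parameters in a SOAP envelope (as a string)."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{command}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(message):
    """Recover the command name and parameters on the receiving side."""
    envelope = ET.fromstring(message)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]
    command = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return command, params


message = build_request("PageUpdated", url="/news/index.xml")
command, params = parse_request(message)
```

Because the message is plain XML, the sender and receiver need not share an operating system or programming language, which is the cross-platform property the paragraph above relies on.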
`
When a change is made to a document, the controller immediately alerts the
caching systems and search engine that the document has changed and requires
re-indexing. As a result, the search engine and the cache are always up to
date. The controller communicates with the content store/database 18 in the
same manner, or by other suitable means, such as Internal Calls, whereby one
function/routine (a block of code) executes (“calls”) another function/routine.
`
Referring now to Figure 3, the system data flows of an embodiment of the
present invention are now discussed. The embodiment comprises an authoring
server, generally depicted 22, and a rendering server, generally depicted 24.
The authoring server itself comprises the following components: the search
engine 14, the content store or database 18, SQL (Structured Query Language)
Server 19 and various web pages or databases 26 for viewing by a user. SQL,
which will be discussed further below, is responsible for handling web page
resources. At the core of the system, the data which flows between the various
components of the system comprises XML data 28, which is communicated to and
from the search engine 14 and the cache 16 in a cross-platform protocol as
discussed above. As a result, content can be generated in a wide variety of
formats (web pages, WAP pages for mobile phones, special pages for talking
browsers, PDF files etc); content can be imported from other XML sources;
content can be exported to other XML consumers etc. XML is also used to
communicate with the controller 12, although, alternatively, XSL translates the
XML data into the relevant format prior to sending from the cache. XSL
(eXtensible Style-sheet Language) is an industry standard language to convert
XML into other formats, including XML. It is at this point that the filtration
of data may take place depending upon the user’s rights for access to the
system; data may be filtered out as it is translated from XML. The
communication which takes place between the search engine 14, the SQL server 19
and the content store/database 18 comprises data in XML format. By enabling
such an arrangement, XML is then at the “heart” of the system, allowing the
actual content of the data to be distinct from the way in which it is
presented. This provides “future-proofing” of the system, meaning that
virtually any system which can handle XML data may be later bolted on.
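The following sketch illustrates how content held as XML can be filtered by access rights at the moment it is translated into a presentation format. It is written in plain Python as a stand-in for an XSL stylesheet, which is what the description actually contemplates; the `access` attribute, the element names and the rights labels are all assumptions invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical page content; the "access" attribute is an assumed convention.
PAGE_XML = """
<page title="Staff news">
  <section access="public">Term starts on Monday.</section>
  <section access="staff">Payroll runs on Friday.</section>
</page>
"""


def render_html(page_xml, user_rights):
    """Translate the XML page to HTML, omitting sections the user may not see."""
    page = ET.fromstring(page_xml)
    parts = [f"<h1>{escape(page.get('title', ''))}</h1>"]
    for section in page.findall("section"):
        # Filtering happens during translation, not in the stored content.
        if section.get("access", "public") in user_rights:
            parts.append(f"<p>{escape(section.text or '')}</p>")
    return "\n".join(parts)


public_view = render_html(PAGE_XML, {"public"})
staff_view = render_html(PAGE_XML, {"public", "staff"})
```

The stored XML is identical for both users; only the translation step differs, which reflects the separation of content from presentation described above.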
`
In use, the data comprising XML data 28 is communicated by the cross-platform
protocol to the cache 16, which is part of the rendering server 24. The
controller 12 communicates with the cache 16, again by the cross-platform
protocol discussed above. The controller sends a request to the cache for a web
page either by specifying the URL (Uniform Resource Locator) of the web page,
or by communicating in the cross-platform protocol. The page requested from
cache may be in a number of formats: XML, HTML, PDF, XHTML (eXtensible
Hypertext Mark-up Language), or other suitable format. The controller also
communicates with the authentication system 40 which will be described further
below. A user inputs the request at 28 for the controller to receive a web
page; this is done by the user specifying the URL of the web page. The
controller returns the page after the various processing in, for example,
XHTML format at 30.

It will be appreciated that the individual components (Cache, SQL, XML,
Content Management System etc) are well known to the skilled reader and do not
require detailed discussion here.
`
`
Turning now to Figure 4, the system process is further illustrated. The process
begins at Start 50. The user requests the retrieval of a page from the system
at process step 52. If the page is already in cache, as is queried at process
step 54, the system retrieves the page from cache and sends it to the user at
process step 62. If the page is not in cache, a request is sent to the
controller via web services at process step 56. The system interfaces with the
content store/database and SQL database in the process at 56 by means of a
request and data return to the system. The retrieved web page is, in
embodiments of the invention, in XML format, and is then stored in cache at 58.
This may need translating to other formats for viewing by the user. The system
translates the page received via Web Services using XSL at process step 60. The
cache system 16 then sends the page to the user for rendering at the user’s
browser (or other means) at process step 62. The process ends at 64.
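The Figure 4 flow can be sketched roughly as below. The function names, the dictionary-based cache and store, and the string replacement standing in for XSL translation are illustrative assumptions. As a further simplifying assumption, the sketch keeps XML in the cache and translates on every request, including cache hits; the description leaves open whether a cached page is stored before or after translation.

```python
def fetch_from_store(url, store):
    # Stand-in for step 56: the controller returns the page as XML.
    return store[url]


def translate(xml_page):
    # Stand-in for the XSL translation at step 60 (not real XSLT).
    return xml_page.replace("<page>", "<html><body>").replace(
        "</page>", "</body></html>"
    )


def handle_request(url, cache, store, log):
    if url in cache:  # step 54: is the page already in cache?
        log.append("cache hit")
    else:
        log.append("controller request")  # step 56
        cache[url] = fetch_from_store(url, store)  # step 58: store in cache
    return translate(cache[url])  # step 60, then sent to the user at step 62


store = {"/about": "<page>About us</page>"}
cache, log = {}, []
first = handle_request("/about", cache, store, log)
second = handle_request("/about", cache, store, log)
```

The second request is served without touching the store, which is where the reduction in processing load comes from.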
`
Referring now to Figure 5, the software architecture is illustrated.
`
A user views retrieved web pages at the client browser 70, which may be
provided separately. In embodiments of the present invention, the user is an
authorised user of the website, thereby having various rights in order to
author data on the web site and/or, for example, perform administrative tasks.
For reasons of security, in embodiments of the present invention, there is
provided a further authentication facility 72 through which an authorised user
will log in in order to gain full access to the web site. The authentication
system 72 communicates with the rights and configuration system 74 to process
the request to author data on the site.
`
`The other various components of the system are illustrated in the architecture
`including rendering 22, authoring 24, cache 16, Search Engine 14, Rights and
`Configuration 74, SQL 15 and content store/database 18.
`
With reference now to Figure 6, the authentication system procedure for
allowing authoring access to the web site content store will be discussed
further.
`
The process begins at process step 80. A user will be prompted for his user
name and password over a secured channel, which may be made over the Internet
(or Intranet or LAN or other suitable network connection). The connection is
made using any suitable secure protocol such as Secure Sockets Layer (SSL), a
preferred implementation of which is HTTPS (Hyper Text Transfer Protocol Secure
Sockets). This authentication request is made at process step 82 as shown in
Figure 5 after which, at process step 84, there is an attempt to authenticate
the user’s rights via the active directory. Should the user enter an invalid
user name or incorrect password, he will be returned to the log in screen
through process loop 83 until a valid username and password are entered.
`
`
Caller identification (CID) is retrieved from the relevant database, for
example a personnel database, at process step 86. If the caller identification
has been authenticated successfully, but there is no entry for that user in the
people database at 86, the user is deemed not to exist, and it is assumed that
the user is to have a standard view of the web site at process step 88. After
this process, or if the caller identification is valid at 86, an authentication
token is stored in a cookie at process step 89, and the user is granted full
access to the web site as his rights allow. The process ends at process step
90.
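The authentication steps above might be sketched as follows. The in-memory `DIRECTORY` and `PEOPLE` tables, the rights labels and the `log_in` function are hypothetical stand-ins for the directory service and personnel database; plain-text password comparison is used only to keep the example short and would not be acceptable in a real system.

```python
import secrets

# Hypothetical credential directory and personnel database.
DIRECTORY = {"alice": "s3cret", "bob": "hunter2"}
PEOPLE = {"alice": {"rights": {"author", "admin"}}}


def log_in(username, password):
    """Return a session (token plus rights), or None to show the login screen again."""
    # Step 84: reject unknown users and wrong passwords (loop 83 retries).
    if username not in DIRECTORY or DIRECTORY[username] != password:
        return None
    # Step 86: look the caller up in the personnel database.
    person = PEOPLE.get(username)
    # Step 88: an authenticated user with no entry gets the standard view.
    rights = person["rights"] if person else {"standard"}
    # Step 89: the token would be stored in a cookie by the caller.
    return {"token": secrets.token_hex(16), "rights": rights}


rejected = log_in("alice", "wrong")
standard = log_in("bob", "hunter2")
author = log_in("alice", "s3cret")
```

Note that the sketch distinguishes two outcomes for a valid login: full rights where a personnel entry exists, and a standard view where it does not.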
`
`
Referring now to Figure 7, it will be seen that in another aspect of the
invention additional caching servers allow multiple users to connect to the
content store/database 18. As shown, multiple cache devices 16a, 16b, 16c,
which may comprise any of various types of memory (fast, volatile memory,
computer hard disks, etc.), are connected to the content store/database 18.
Multiple Internet or Intranet users 18 can request pages from the content
store/database 18 via any of the cache devices shown. This eases the burden on
the processing requirements of the system considerably; if the required pages
of information from the content store/database are stored in cache in a
suitable format for a user to download and view, the processing required by the
content store/database 18 to collate the information related to the required
page occurs only once.
`
Such an arrangement becomes a particularly powerful tool if the cache servers
are kept continuously up to date as described above.
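A rough sketch of the multi-cache arrangement follows. It assumes, purely for illustration, that the store collates a page once and pushes the result to every caching server, and that requests are spread across the servers in round-robin order; neither the push mechanism nor the distribution policy is specified in the description, and the class and function names are invented.

```python
from itertools import cycle


class ContentStore:
    """Hypothetical content store (18) that counts how often it collates a page."""

    def __init__(self, pages):
        self.pages = pages
        self.collations = 0

    def collate(self, url):
        self.collations += 1
        return f"<page>{self.pages[url]}</page>"


class CacheServer:
    """Hypothetical caching server (16a, 16b, 16c)."""

    def __init__(self, name):
        self.name = name
        self.pages = {}

    def serve(self, url):
        return self.pages[url]


def publish(url, store, caches):
    # Collate once, then push the same result to every caching server.
    page = store.collate(url)
    for cache in caches:
        cache.pages[url] = page


store = ContentStore({"/home": "Welcome"})
caches = [CacheServer(name) for name in ("16a", "16b", "16c")]
publish("/home", store, caches)

# Six user requests spread over the three caching servers in turn.
rotation = cycle(caches)
responses = [next(rotation).serve("/home") for _ in range(6)]
```

However many users arrive, the store's collation count stays at one per published page, while serving capacity grows with the number of caching servers.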
`
It will also be appreciated that further caching devices could be added to the
illustrated arrangement. Further content store/databases might also be
provided, but it is anticipated that such additions will not be frequent; the
addition of extra caching devices would hopefully obviate this latter
arrangement.
`
`It will be understood that the present invention has been described above purely
`by way of example, and modifications of detail can be made within the scope of
`the invention.
`
`Each feature disclosed in the description, and the claims and drawings may be
`provided independently or in any appropriate combination.
`
`
`CLAIMS
`
1. Apparatus for sending to a user data related to a page of information,
said apparatus comprising:

a controller; and

a search module, said search module comprising an index for containing data
relating to a plurality of pages of information;

wherein, when a page of information is updated, the controller is arranged to
control the search module to update data in said index relating to said updated
page of information.

2. Apparatus according to claim 1, further comprising a cache for storing data
related to a plurality of pages of information, wherein said controller is
arranged to control the cache to update data relating to said updated page of
information.

3. Apparatus according to claim 1 or claim 2, wherein the controller controls
at least one of the search module and the cache to read said updated page to
update the data.

4. Apparatus according to claim 1 or claim 2, wherein the controller is
arranged to send data related to said updated page to at least one of the
search module and cache to update the data.

5. Apparatus according to any of claims 2 to 4, wherein the search module and
the cache are controlled to be updated simultaneously.
`
`
6. Apparatus according to any of claims 2 to 5, wherein said controller is
arranged to communicate with at least one of said search module and said cache
in a cross-platform protocol.

7. Apparatus according to claim 6, wherein said controller is arranged to
communicate with said search module and said cache over a network.

8. Apparatus according to any of claims 2 to 7, wherein said cache is arranged
to send data related to a page of information to a user.

9. Apparatus according to any preceding claim, further comprising a store for
storing data related to a plurality of pages of information.

10. Apparatus according to claim 9, wherein the store comprises a content
management system.

11. Apparatus according to claim 9, wherein the store comprises a database.

12. Apparatus according to any of claims 9 to 11, wherein the cache comprises
at least two caching servers, and wherein said caching servers are connected to
said store.
`
13. Apparatus for sending to a user data related to a page of information,
said apparatus comprising:

a cache comprising at least two caching servers; and

a store for storing data related to a plurality of pages of information;

wherein said cache is operable to store data received from said store, and to
send data to a user in response to a request from said user.
`
14. Apparatus according to claim 13, wherein said cache is arranged to process
the data prior to sending said data to a user.

15. Apparatus according to claim 14, wherein said cache is arranged to perform
a re-organisation of the data, for example a filtering of data, prior to
sending it to a user.

16. Apparatus according to any preceding claim, wherein said data related to a
page of information comprises data in XML format.

17. Apparatus for sending to a user data related to a page of information, said
apparatus comprising a controller and a store for storing data related to a
plurality of pages of information, wherein said data comprises data in XML
format.

18. Apparatus according to claim 16 or claim 17, further comprising core
interface means utilising XSL for translating said XML data into other formats.
`
19. A method of making data related to a page of information available to a
user, said method comprising:

receiving notification that a page of information has been updated; and

in response to said notification, updating an index in a search device with
data related to said updated page of information.
`
`
20. A method according to claim 19, further comprising updating a caching
device with data related to said upda