Mighdoll et al.

US005918013A
(11) Patent Number: 5,918,013
(45) Date of Patent: Jun. 29, 1999

(54) METHOD OF TRANSCODING DOCUMENTS IN A NETWORK ENVIRONMENT USING A PROXY SERVER

(75) Inventors: Lee S. Mighdoll, San Francisco; Bruce A. Leak, Palo Alto; Stephen G. Perlman, Mountain View; Phillip Y. Goldman, Los Altos, all of Calif.
(73) Assignee: WebTV Networks, Inc., Mountain View, Calif.

(21) Appl. No.: 08/656,924
(22) Filed: Jun. 3, 1996
(51) Int. Cl.6: G06F 13/00; G06F 13/14; G06F 17/00
(52) U.S. Cl.: 395/200.47; 395/200.33; [further classes illegible]; 395/187.01
(58) Field of Search: 395/200.36, 200.47-2005, 200.56-200.59, 500, 200.76-200.79, 185.01, 185.05, 187.01, 584-685

(56) References Cited

OTHER PUBLICATIONS
Farrow, Rik, “Securing the Web”: fire walls, proxy servers, and data driven attacks, InfoWorld, Jun. 19, 1995, Vol. 7, No. 25, pp. 103–104.
Administrator's Guide, Netscape Proxy Server Version 2.0, Netscape Communications Corporation, pp. 19-20, 1996.
Chankhunthod, Anawat et al., “A Hierarchical Internet Object Cache,” 1996 USENIX Technical Conference (6 pages).
Matt Rosoff, Review: “Gateway Destination PC,” c/net inc., 2 pages, Feb. 19, 1996.
Robert Seidman, Article: What Larry and Lou Know (That You Don't), c/net inc., 2 pages, Jan. 29, 1996.
Susan Stellin, Article: “The $500 Web Box: Less is More?” c/net inc., 2 pages, 1996.

Primary Examiner: Parshotan S. Lall
Assistant Examiner: Bharat Barot
Attorney, Agent, or Firm: Workman, Nydegger, Seeley

(57) ABSTRACT
A method of providing a document to a client coupled to a server is provided. The server provides a number of Internet services to the client, including functioning as a caching proxy on behalf of the client for purposes of accessing the World Wide Web. The proxying server includes a persistent document database, which stores various attributes of all documents previously retrieved in response to a request from a client. When a Web document is retrieved from a remote server in response to a request from the client, the database is consulted and the stored information relating to the requested document is used by the server in transcoding the document. The document is transcoded for various purposes, including to circumvent bugs or quirks found in the document, to size the document for display on a television set, to improve transmission efficiency of the document, and to reduce latency. The transcoder makes use of the document database to perform these functions. The document database is also used for prefetching previously requested documents and images and for reducing latency when downloading images to the client.
`
U.S. PATENT DOCUMENTS
4,575,579 3/1986 Simon et al. ... 178/4
4,852,151 7/1989 Dittakavi et al. ... 379/97
4,922,523 5/1990 Hashimoto ... 379/96
4,975,944 12/1990 Cho ... 379/209
4,995,074 2/1991 Goldman et al. ... 379/97
5,005,011 4/1991 Perlman et al. ... 340/728
5,095,494 3/1992 Takahashi et al. ... 375/10
[entry illegible in source]
5,263,084 11/1993 [name illegible] ... 379/215
5,287,401 2/1994 Lin ... 379/98
5,299,307 3/1994 Young ... 395/161
5,325,423 6/1994 Lewis ... 379/90
5,329,619 7/1994 Page et al. ... 395/200.33
5,341,293 8/1994 Vertelney et al. ... 364/419.17
5,369,688 11/1994 Tsukamoto et al. ... 379/100
5,410,541 4/1995 Hotto ... 370/76
(List continued on next page.)
`
`13 Claims, 12 Drawing Sheets
`
`
`
[Representative drawing: the flow diagram of FIG. 6]
`
`
`
U.S. PATENT DOCUMENTS
5,425,092 6/1995 Quirk ... 379/215
5,469,540 11/1995 Powers, III et al. ... 395/158
5,488,411 1/1996 Lewis ... 348/8
5,490,208 2/1996 Remillard ... 379/96
5,530,852 6/1996 Meske, Jr. et al. ... 395/200.36
5,538,255 7/1996 Barker ... 463/41
5,558,339 9/1996 Perlman ... 463/42
5,561,709 10/1996 Remillard ... 379/96
5,564,001 10/1996 Lewis ... 395/154
5,572,643 11/1996 Judson ... 395/200.48
5,586,257 12/1996 Perlman ... 463/42
5,586,260 12/1996 Hu ... 395/200.33
5,612,730 3/1997 Lewis ... 348/8
5,623,600 4/1997 Ji et al. ... 395/187.01
5,654,886 8/1997 Zereski, Jr. et al. ... 702/3
5,657,390 8/1997 Elgamal et al. ... 380/49
5,657,450 8/1997 Rao et al. ... 395/200.3
5,678,041 10/1997 Baker et al. ... 395/200.59
5,802,367 9/1998 Held ... 395/685
`
`
`
`
`
`U.S. Patent
`
`Jun. 29, 1999
`
`Sheet 1 of 12
`
`5,918,013
`
`
`
`
`
`
`
`
[FIG. 1: block diagram. Labels: REMOTE SERVERS 4, INTERNET 3, WEBTV SERVER 5, connections 29, WEBTV CLIENTS 1]
`
`
`
[Sheet 2 of 12: FIG. 2]
`
`
`
[Sheet 3 of 12: FIG. 3, block diagram. Labels: TO OTHER WEBTV SERVERS, TO INTERNET]
`
`
`
[Sheet 4 of 12: FIG. 4A, block diagram. Labels: WEBTV CLIENT 1, PROXY CACHE, TRANSCODER 66, WEBTV SERVICE]
[Sheet 4 of 12: FIG. 4B, block diagram. Labels: DOCUMENT DATABASE, USER DATABASE, WEBTV SERVER 5]
`
`
`
[Sheet 5 of 12: FIG. 5, flow diagram.
501: RECEIVE DOCUMENT REQUEST FROM CLIENT
502: ACCESS DATABASE FOR INFORMATION RELATING TO REQUESTED DOCUMENT
503: RETRIEVE AND/OR TRANSCODE DOCUMENT ACCORDING TO INFORMATION IN DATABASE AND DOWNLOAD TO CLIENT]
`
`
`
[Sheet 6 of 12: FIG. 6, flow diagram.
Decision boxes: DOCUMENT PREVIOUSLY RETRIEVED?; DOCUMENT STILL VALID?; DOCUMENT IN CACHE?
Process boxes: RETRIEVE DOCUMENT FROM REMOTE SERVER; TRANSCODE DOCUMENT BASED ON DIAGNOSTIC INFORMATION IN DATABASE AND SAVE TO CACHE; DOWNLOAD TRANSCODED DOCUMENT TO CLIENT; RETRIEVE DOCUMENT FROM REMOTE SERVER; ANALYZE DOCUMENT FOR BUGS; SAVE DIAGNOSTIC INFORMATION TO DATABASE; TRANSCODE DOCUMENT AND SAVE TO CACHE; DOWNLOAD TRANSCODED DOCUMENT TO CLIENT; DOWNLOAD DOCUMENT IN CACHE TO CLIENT]
`
`
`
[Sheet 7 of 12: FIG. 7, flow diagram.
Decision box: DOCUMENT REFERENCING IMAGE PREVIOUSLY RETRIEVED?
Process boxes: STANDARD ROUTINE FOR INITIAL RETRIEVAL; DETERMINE SIZE OF IMAGE FROM DATABASE; TRANSCODE DOCUMENT TO INITIALLY DISPLAY BLANK AREAS ENVELOPING IMAGE; DOWNLOAD DOCUMENT TO CLIENT WHILE RETRIEVING IMAGES; DOWNLOAD IMAGES TO CLIENT, REPLACING BLANK AREAS WITH IMAGES]
`
`
`
[Sheet 8 of 12: FIG. 8, flow diagram.
Decision boxes: ELAPSED TIME SINCE DOCUMENT X LAST RETRIEVED compared against T; RETRIEVED DOCUMENT X SAME AS CACHED DOCUMENT X?
Process boxes: RETRIEVE DOCUMENT X; SAVE DATE, TIME, AND CHANGE STATUS TO DATABASE]
`
`
`
[Sheet 9 of 12: FIG. 9, flow diagram.
Decision boxes: HAVE A REDIRECT ADDRESS STORED IN DATABASE?; REMOTE SERVER RESPOND WITH A REDIRECT?
Process boxes: RECEIVE LOGICAL ADDRESS OF DOCUMENT FROM CLIENT; ACCESS DIFFERENT REMOTE SERVER USING REDIRECT ADDRESS; ACCESS REMOTE SERVER USING ADDRESS PROVIDED BY CLIENT; SAVE REDIRECT ADDRESS TO DATABASE; RETRIEVE DOCUMENT AND DOWNLOAD TO CLIENT]
`
`
`
`
`
`
`
`
`
`
`
[Sheet 10 of 12: FIG. 10 (PRIOR ART), block diagram. Labels: DATABASE 71, SERVICE A, SERVICE B, SERVICE C, CLIENT 75]
`
`
`
[Sheet 11 of 12: FIG. 11, block diagram. Labels: USER DATABASE, CLIENT]
`
`
`
[Sheet 12 of 12: FIG. 12, flow diagram.
Process boxes: GATHER USER INFORMATION AND GENERATE TICKET; GENERATE LIST OF AVAILABLE SERVICES; DOWNLOAD TICKET AND LIST OF SERVICES TO CLIENT]
`
`
`
METHOD OF TRANSCODING DOCUMENTS
IN A NETWORK ENVIRONMENT USING A
PROXY SERVER
`
`FIELD OF THE INVENTION
The present invention pertains to the field of client-server computer networking. More particularly, the present invention relates to a method and apparatus for providing proxying and document transcoding in a server in a computer network.
`
`BACKGROUND OF THE INVENTION
The number of people using personal computers has increased substantially in recent years, and along with this increase has come an explosion in the use of the Internet. One particular aspect of the Internet which has gained widespread use is the World-Wide Web (“the Web”). The Web is a collection of formatted hypertext pages located on numerous computers around the world that are logically connected by the Internet. Advances in network technology and software providing user interfaces to the Web (“Web browsers”) have made the Web accessible to a large segment of the population. However, despite the growth in the development and use of the Web, many people are still unable to take advantage of this important resource.
Access to the Web has been limited thus far mostly to people who have access to a personal computer. However, many people cannot afford the cost of even a relatively inexpensive personal computer, while others are either unable or unwilling to learn the basic computer skills that are required to access the Web. Furthermore, Web browsers in the prior art generally do not provide the degree of user-friendliness desired by some people, and many computer novices do not have the patience to learn how to use the software. Therefore, it would be desirable to provide an inexpensive means by which a person can access the Web without the use of a personal computer. In particular, it would be desirable for a person to be able to access Web pages using an ordinary television set and a remote control, so that the person feels more as if he or she is simply changing television channels, rather than utilizing a complex computer network.
Prior art Web technology also has other significant limitations which can make a person's experience unpleasant when browsing the Web. Web documents are commonly written in HTML (Hypertext Mark-up Language). HTML documents sometimes contain bugs (errors) or have features that are not recognized by certain Web browsers. These bugs or quirks in a document can cause a Web browser to fail. Thus, what is needed is a means for reducing the frequency with which client systems fail due to bugs or quirks in HTML documents.
Another problem associated with browsing the Web is latency. People commonly experience long, frustrating delays when browsing the Web. It is not unusual for a person to have to wait minutes after selecting a hypertext link for a Web page to be completely downloaded to his computer and displayed on his computer screen. There are many possible causes for latency, such as heavy communications traffic on the Internet and slow response of remote servers. Latency can also be caused by Web pages including images. One reason for this effect is that, when an HTML document references an image, it takes time to retrieve the image itself after the referencing document has been retrieved. Another reason is that, in the prior art, if the referencing document does not specify the size of the image, the client system generally cannot display the Web page until the image itself has been retrieved. Numerous other sources of latency exist with respect to the Web. Therefore, what is needed is a means for reducing such latency, to eliminate some of the frustration which typically has been associated with browsing the Web.
Security is another concern associated with the Internet. Internet service providers (ISPs) generally maintain certain information about each customer in a database. This information may include information which a customer may not wish to become publicly known, such as Social Security numbers and credit card numbers. Maintaining the confidentiality of this information in a system that is connected to an expensive publicly-accessible computer network like the Internet can be problematic. Further, the problem can be aggravated by the fact that an ISP often provides numerous different services, each of which has access to this database. Allowing access to the database by many different entities creates many opportunities for security breaches to occur. Therefore, what is needed is a way to improve the security of confidential customer information in a server system coupled to the Internet.
`
`SUMMARY OF THE INVENTION
A method is described of providing a document to a client coupled to a server. The server functions as a proxy on behalf of the client for purposes of accessing a remote server. In the method, a document is retrieved from the remote server in response to a request from the client. The document includes data to be used by the client in generating a display. The proxying server alters the data in the document to form a transcoded document. The transcoded document is then transmitted to the client.
Other features of the present invention will be apparent from the accompanying drawings and from the detailed description which follows.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
FIG. 1 illustrates several clients connected to a proxying server in a network.
FIG. 2 illustrates a client according to the present invention.
FIG. 3 is a block diagram of a server according to the present invention.
FIG. 4A illustrates a server including a proxy cache and a transcoder.
FIG. 4B illustrates databases used in a server according to the present invention.
FIG. 5 is a flow diagram illustrating a routine for transcoding a document retrieved from a remote server using data stored in a persistent database.
FIG. 6 is a flow diagram illustrating a routine for transcoding an HTML document for purposes of eliminating bugs or undesirable features.
FIG. 7 is a flow diagram illustrating a routine for reducing latency when downloading a document referencing an image to a client.
FIG. 8 is a flow diagram illustrating a routine for updating documents stored in the proxy cache using data stored in a persistent database.
FIG. 9 is a flow diagram illustrating a routine used by a server for retrieving documents from another remote server.
FIG. 10 is a block diagram of a prior art server system showing a relationship between various services and a database.
FIG. 11 is a block diagram of a server system according to the present invention showing a relationship between various services and a user database.
FIG. 12 is a flow diagram illustrating a routine used by a server for regulating access to various services provided by the server.
`
`DETAILED DESCRIPTION
A method and apparatus are described for providing proxying and transcoding of documents in a network. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be evident, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.
The present invention includes various steps, which will be described below. The steps can be embodied in machine-executable instructions, which can be used to cause a general-purpose or special-purpose processor programmed with the instructions to perform the steps. Alternatively, the steps of the present invention might be performed by specific hardware components that contain hardwired logic for performing the steps, or by any combination of programmed computer components and custom hardware components.
`I. System Overview
The present invention is included in a system, known as WebTV™, for providing a user with access to the Internet. A user of a WebTV™ client generally accesses a WebTV™ server via a direct-dial telephone (POTS, for “plain old telephone service”), ISDN (Integrated Services Digital Network), or other similar connection, in order to browse the Web, send and receive electronic mail (e-mail), and use various other WebTV™ network services. The WebTV™ network services are provided by WebTV™ servers using software residing within the WebTV™ servers in conjunction with software residing within a WebTV™ client.
FIG. 1 illustrates a basic configuration of the WebTV™ network according to one embodiment. A number of WebTV™ clients 1 are coupled to a modem pool 2 via direct-dial, bi-directional data connections 29, which may be telephone (POTS, i.e., “plain old telephone service”), ISDN (Integrated Services Digital Network), or any other similar type of connection. The modem pool 2 is coupled typically through a router, such as that conventionally known in the art, to a number of remote servers 4 via a conventional network infrastructure 3, such as the Internet. The WebTV™ system also includes a WebTV™ server 5, which specifically supports the WebTV™ clients 1. The WebTV™ clients 1 each have a connection to the WebTV™ server 5 either directly or through the modem pool 2 and the Internet 3. Note that the modem pool 2 is a conventional modem pool, such as those found today throughout the world providing access to the Internet and private networks.
Note that in this description, in order to facilitate explanation, the WebTV™ server 5 is generally discussed as if it were a single device, and functions provided by the WebTV™ services are generally discussed as being performed by such single device. However, the WebTV™ server 5 may actually comprise multiple physical and logical devices connected in a distributed architecture, and the various functions discussed below which are provided by the WebTV™ services may actually be distributed among multiple WebTV™ server devices.
`II. Client System
FIG. 2 illustrates a WebTV™ client 1. The WebTV™ client 1 includes an electronics unit 10 (hereinafter referred to as “the WebTV™ box 10”), an ordinary television set 12, and a remote control 11. In an alternative embodiment of the present invention, the WebTV™ box 10 is built into the television set 12 as an integral unit. The WebTV™ box 10 includes hardware and software for providing the user with a graphical user interface, by which the user can access the WebTV™ network services, browse the Web, send e-mail, and otherwise access the Internet.
The WebTV™ client 1 uses the television set 12 as a display device. The WebTV™ box 10 is coupled to the television set 12 by a video link 6. The video link 6 is an RF (radio frequency), S-video, composite video, or other equivalent form of video link. In the preferred embodiment, the client 1 includes both a standard modem and an ISDN modem, such that the communication link 29 between the WebTV™ box 10 and the server 5 can be either a telephone (POTS) connection 29a or an ISDN connection 29b. The WebTV™ box 10 receives power through a power line 7.
Remote control 11 is operated by the user in order to control the WebTV™ client 1 in browsing the Web, sending e-mail, and performing other Internet-related functions. The WebTV™ box 10 receives commands from remote control 11 via an infrared (IR) communication link. In alternative embodiments, the link between the remote control 11 and the WebTV™ box 10 may be RF or any equivalent mode of transmission.
`
`III. Server System
The WebTV™ server 5 generally includes one or more computer systems generally having the architecture illustrated in FIG. 3. It should be noted that the illustrated architecture is only exemplary; the present invention is not constrained to this particular architecture. The illustrated architecture includes a central processing unit (CPU) 50, random access memory (RAM) 51, read-only memory (ROM) 52, a mass storage device 53, a modem 54, a network interface card (NIC) 55, and various other input/output (I/O) devices 56. Mass storage device 53 includes a magnetic, optical, or other equivalent storage medium. I/O devices 56 may include any or all of devices such as a display monitor, keyboard, cursor control device, etc. Modem 54 is used to communicate data to and from remote servers 4 via the Internet.
As noted above, the WebTV™ server 5 may actually comprise multiple physical and logical devices connected in a distributed architecture. Accordingly, NIC 55 is used to provide data communication with other devices that are part of the WebTV™ services. Modem 54 may also be used to communicate with other devices that are part of the WebTV™ services and which are not located in close geographic proximity to the illustrated device.
According to the present invention, the WebTV™ server 5 acts as a proxy in providing the WebTV™ client 1 with access to the Web and other WebTV™ services. More specifically, WebTV™ server 5 functions as a “caching proxy.” FIG. 4A illustrates the caching feature of the WebTV™ server 5. In FIG. 4A, the WebTV™ server 5 is functionally located between the WebTV™ client 1 and the Internet infrastructure 3. The WebTV™ server 5 includes a proxy cache 65 which is functionally coupled to the WebTV™ client 1. The proxy cache 65 is used for temporary storage of Web documents, images, and other information which is frequently used by either the WebTV™ client 1 or the WebTV™ server 5.
A document transcoder 66 is functionally coupled between the proxy cache 65 and the Internet infrastructure 3. The document transcoder 66 includes software which is used to automatically revise the code of Web documents retrieved from the remote servers 4, for purposes which are described below.
The WebTV™ service provides a document database 61 and a user database 62, as illustrated in FIG. 4B. The user database 62 contains information that is used to control certain features relating to access privileges and capabilities of the user of the client 1. This information is used to regulate initial access to the WebTV™ service, as well as to regulate access to the individual services provided by the WebTV™ system, as will be described below. The document database 61 is a persistent database which stores certain diagnostic and historical information about each document and image retrieved by the server 5, as is now described.
`
`A. Document Database
The basic purpose of the document database 61 is that, after a document has once been retrieved by the server 5, the stored information can be used by the server 5 to speed up processing and downloading of that document in response to all future requests for that document. In addition, the transcoding functions and various other functions of the WebTV™ service are facilitated by making use of the information stored in the document database 61, as will be described below.
Referring now to FIG. 5, the server 5 initially receives a document request from a client 1 (step 501). The document request will generally result from the user of the client 1 activating a hypertext anchor (link) on a Web page. The act of activating a hypertext anchor may consist of clicking on underlined text in a displayed Web page using a mouse, for example. The document request will typically (but not always) include the URL (Uniform Resource Locator) or other address of the selected anchor. Upon receiving the document request, the server 5 optionally accesses the document database 61 to retrieve stored information relating to the requested document (step 502). It should be noted that the document database 61 is not necessarily accessed in every case. The information retrieved from the document database 61 is used by the server 5 for determining, among other things, how long a requested document has been cached and/or whether the document is still valid. The criteria for determining validity of the stored document are discussed below.
The server 5 retrieves the document from the cache 65 if the stored document is valid; otherwise, the server 5 retrieves the document from the appropriate remote server 4 (step 503). The server 5 automatically transcodes the document as necessary based on the information stored in the document database 61 (step 503). The transcoding functions are discussed further below.
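The routine of FIG. 5 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the `DocumentInfo` record, the `handle_request` function, and the use of plain dictionaries for the database and cache are hypothetical names chosen here, and the validity criteria are reduced to a single stored flag.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class DocumentInfo:
    """Hypothetical stand-in for what the document database stores per URL."""
    still_valid: bool  # outcome of whatever validity criteria apply


def handle_request(
    url: str,
    database: Dict[str, DocumentInfo],
    cache: Dict[str, str],
    fetch_remote: Callable[[str], str],
    transcode: Callable[[str, Optional[DocumentInfo]], str],
) -> str:
    # Step 502: optionally consult the persistent document database.
    info = database.get(url)

    # Step 503: serve the cached copy only when it is known to be valid.
    if info is not None and info.still_valid and url in cache:
        return cache[url]

    # Otherwise retrieve from the remote server, transcode using the stored
    # information (which may be absent), and save the result to the cache.
    document = transcode(fetch_remote(url), info)
    cache[url] = document
    return document
```

Passing `fetch_remote` and `transcode` in as callables keeps the sketch independent of any particular network or HTML library.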
The document database 61 includes certain historical and diagnostic information for every Web page that is accessed at any time by a WebTV™ client 1. As is well known, a Web page may correspond to a document written in a language such as HTML (Hypertext Mark-Up Language), VRML (Virtual Reality Modelling Language), or another suitable language. Alternatively, a Web page may represent an image, or a document which references one or more images. According to the present invention, once a document or image is retrieved by the WebTV™ server 5 from a remote server 4 for the first time, detailed information on this document or image is stored permanently in the document database 61. More specifically, for every Web page that is retrieved from a remote server 4, any or all of the following data are stored in the document database 61:
1) information identifying bugs (errors) or quirks in the Web page, or undesirable effects caused when the Web page is displayed by a client 1;
2) relevant bug-finding algorithms;
3) the date and time the Web page was last retrieved;
4) the date and time the Web page was most recently altered by the author;
5) a checksum for determining whether the Web page has been altered;
6) the size of the Web page (in terms of memory);
7) the type of Web page (e.g., HTML document, image, etc.);
8) a list of hypertext anchors (links) in the Web page and corresponding URLs;
9) a list of the most popular anchors based on the number of “hits” (requests from a client 1);
10) a list of related Web pages which can be prefetched;
11) whether the Web page has been redirected to another remote server 4;
12) a redirect address (if appropriate);
13) whether the redirect (if any) is temporary or permanent, and if permanent, the duration of the redirect;
14) if the Web page is an image, the size of the image in terms of both physical dimensions and memory space;
15) the sizes of in-line images (images displayed in text) referenced by the document defining the Web page;
16) the size of the largest image referenced by the document;
17) information identifying any image maps in the Web page;
18) whether to resize any images corresponding to the Web page;
19) an indication of any forms or tables in the Web page;
20) any unknown protocols;
21) any links to “dead” Web pages (i.e., pages which are no longer active);
22) the latency and throughput of the remote server 4 on which the Web page is located;
23) the character set of the document;
24) the vendor of the remote server 4 on which the Web page is located;
25) the geographic location of the remote server 4 on which the Web page is located;
26) the number of other Web pages which reference the subject Web page;
27) the compression algorithm used by the image or document;
28) the compression algorithm chosen by the transcoder;
29) a value indicating the popularity of the Web page based on the number of hits by clients; and
30) a value indicating the popularity of other Web pages which reference the subject Web page.
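A few of the listed items can be pictured as one record per Web page. The sketch below is an assumption-laden illustration, not the stored format: the `DocumentRecord` field names are hypothetical, only a subset of the thirty items is shown, and SHA-256 is chosen here for the checksum of item 5 because the text does not name an algorithm.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class DocumentRecord:
    """Illustrative subset of the per-page data listed above."""
    url: str
    last_retrieved: datetime                              # item 3
    checksum: str                                         # item 5
    size_bytes: int                                       # item 6
    page_type: str                                        # item 7, e.g. "html"
    known_bugs: List[str] = field(default_factory=list)   # item 1
    anchors: List[str] = field(default_factory=list)      # item 8
    redirect_address: Optional[str] = None                # item 12
    hits: int = 0                                         # item 29


def checksum_of(body: bytes) -> str:
    # SHA-256 is an assumption; the text does not specify a checksum.
    return hashlib.sha256(body).hexdigest()


def has_changed(record: DocumentRecord, new_body: bytes) -> bool:
    """Return True when a freshly retrieved body differs from the recorded one."""
    return checksum_of(new_body) != record.checksum
```

Comparing checksums lets the server detect an altered page without keeping the earlier body for a byte-by-byte comparison.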
`B. Transcoding
As mentioned above, the WebTV™ services provide a transcoder 66, which is used to rewrite certain portions of the code in an HTML document for various purposes. These purposes include: (1) correcting bugs in documents; (2) correcting undesirable effects which occur when a document is displayed by the client 1; (3) improving the efficiency of transmission of documents from the server 5 to the client 1; (4) matching hardware decompression technology within the client 1; (5) resizing images to fit on the television set 12; (6) converting documents into other formats to provide compatibility; (7) reducing latency experienced by a client 1 when displaying a Web page with in-line images (images displayed in text); and (8) altering documents to fit into smaller memory spaces.
There are three transcoding modes used by the transcoder 66: (1) streaming, (2) buffered, and (3) deferred. Streaming transcoding refers to the transcoding of documents on a line-by-line basis as they are retrieved from a remote server 4 and downloaded to the client 1 (i.e., transcoding “on the fly”). Some documents, however, must first be buffered in the WebTV™ server 5 before transcoding and downloading them to the client 1. A document may need to be buffered before transmitting it to the client 1 if the type of changes to be made can only be made after the entire document has been retrieved from the remote server 4. Because the process of retrieving and downloading a document to the client 1 increases latency and decreases throughput, it is not desirable to buffer all documents. Therefore, the transcoder 66 accesses and uses information in the document database 61 relating to the requested document to first determine whether a requested document must be buffered for purposes of transcoding, before the document is retrieved from the remote server 4.
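The difference between the streaming and buffered modes, and the decision between them, might be sketched as follows. The function names, the dictionary standing in for the document database, and the choice to default unknown documents to streaming are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator


def choose_mode(url: str, needs_whole_document: Dict[str, bool]) -> str:
    """Pick a mode before retrieval, from information recorded earlier."""
    # Defaulting unknown documents to streaming is an assumption of this sketch.
    return "buffered" if needs_whole_document.get(url, False) else "streaming"


def transcode_streaming(
    lines: Iterable[str], fix_line: Callable[[str], str]
) -> Iterator[str]:
    """Rewrite each line as it arrives, so forwarding can begin at once."""
    for line in lines:
        yield fix_line(line)


def transcode_buffered(
    lines: Iterable[str], fix_document: Callable[[str], str]
) -> Iterator[str]:
    """Collect the whole document first, then rewrite it in a single pass."""
    whole = "".join(lines)
    yield from fix_document(whole).splitlines(keepends=True)
```

Because `transcode_streaming` is a generator, the first rewritten line is available before the last line has been retrieved, whereas `transcode_buffered` produces nothing until the input is exhausted.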
In the deferred mode, transcoding is deferred until after a requested document has been downloaded to a client 1. The deferred mode therefore reduces latency experienced by the client 1 in receiving the document. Transcoding may be performed immediately after downloading or any time thereafter. For example, it may be convenient to perform transcoding during periods of low usage of WebTV™ services, such as at night. This mode is useful for certain types of transcoding which are not mandatory.
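One way to picture the deferred mode is a queue of documents that have already been served and still await transcoding. The `DeferredTranscoder` class below is a hypothetical sketch; in particular, it assumes that the transcoded result replaces the cached copy so that later requests benefit, which the text above does not state explicitly.

```python
from collections import deque
from typing import Callable, Deque, Dict


class DeferredTranscoder:
    """Serve the document untouched now; transcode the cached copy later."""

    def __init__(self, transcode: Callable[[str], str]) -> None:
        self._transcode = transcode
        self._pending: Deque[str] = deque()

    def serve(self, url: str, cache: Dict[str, str]) -> str:
        # The client receives the document immediately; work is only queued.
        self._pending.append(url)
        return cache[url]

    def run_pending(self, cache: Dict[str, str]) -> int:
        """Drain the queue, e.g. from a job scheduled for a quiet period."""
        done = 0
        while self._pending:
            url = self._pending.popleft()
            if url in cache:
                cache[url] = self._transcode(cache[url])
                done += 1
        return done
```

A scheduler could call `run_pending` at night, matching the low-usage example given above.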
1. Transcoding for Bugs and Quirks
One characteristic of some prior art Web browsers is that they may experience failures (“crashes”) because of bugs or unexpected features (“quirks”) that are present in a Web document. Alternatively, quirks in a document may cause an undesirable result, even though the client does not crash. Therefore, the transcoding feature of the present invention provides a means for correcting certain bugs and quirks in a Web document. To be corrected by the transcoder 66, bugs and quirks must be identifiable by software running on the server 5. Consequently, the transcoder 66 will generally only correct conditions which have been previously discovered, such as those discovered during testing or reported by users. Once a bug or quirk is discovered, however, algorithms are added to the transcoder 66 to both detect the bug or quirk in the future in any Web document and to automatically correct it.
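The pairing of a detection algorithm with a correction algorithm can be sketched as a list of rules to which new entries are appended as bugs are discovered. Everything named here is hypothetical, and the single example rule (an anchor tag that is opened but never closed, fixed naively by appending closing tags) is chosen only to make the sketch runnable; it is not taken from the text.

```python
import re
from typing import Callable, List, NamedTuple, Tuple


class BugRule(NamedTuple):
    name: str
    detect: Callable[[str], bool]
    correct: Callable[[str], str]


def _anchor_imbalance(html: str) -> int:
    """Count <a> tags that were opened but never closed."""
    opened = len(re.findall(r"<a\b", html, flags=re.IGNORECASE))
    closed = len(re.findall(r"</a\s*>", html, flags=re.IGNORECASE))
    return max(opened - closed, 0)


# Illustrative rule only; a real rule set would grow as bugs are reported.
RULES: List[BugRule] = [
    BugRule(
        name="unclosed-anchor",
        detect=lambda html: _anchor_imbalance(html) > 0,
        # Naive correction: append the missing closing tags at the end.
        correct=lambda html: html + "</a>" * _anchor_imbalance(html),
    )
]


def transcode_for_bugs(
    html: str, rules: List[BugRule] = RULES
) -> Tuple[str, List[str]]:
    """Apply every rule whose detector fires; return fixed HTML and rule names."""
    found: List[str] = []
    for rule in rules:
        if rule.detect(html):
            html = rule.correct(html)
            found.append(rule.name)
    return html, found
```

The returned list of rule names is the kind of diagnostic information that could be saved to the document database for use on later requests.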
`
There are countless possibilities of bugs or quirks which
might be encountered in a