United States Patent [19]
Freivald et al.

[54] CHANGE-DETECTION TOOL INDICATING DEGREE AND LOCATION OF CHANGE OF
     INTERNET DOCUMENTS BY COMPARISON OF CYCLIC-REDUNDANCY-CHECK (CRC)
     SIGNATURES

[75] Inventors: Matthew P. Freivald, Sunnyvale; Mark S. Richards, San Jose;
     Alan C. Noble, Santa Cruz, all of Calif.

[73] Assignee: Netmind Services, Inc., Campbell, Calif.

[21] Appl. No.: 08/783,625

[22] Filed: Jan. 14, 1997

[51] Int. Cl.6: H04L 12/00
[52] U.S. Cl.: 395/200.48; 707/513
[58] Field of Search: 395/200.48, 400.49; 707/10, 203, 511, 513

[56] References Cited

U.S. PATENT DOCUMENTS

5,388,255    2/1995    Pytlik et al.    395/600
5,630,116    5/1997    Takaya et al.    395/617
5,813,007    9/1998    Nielsen          707/10

Primary Examiner: Lance Leonard Barry
Attorney, Agent, or Firm: Stuart T. Auvinen
`
US005898836A
[11] Patent Number: 5,898,836
[45] Date of Patent: Apr. 27, 1999
`
[57] ABSTRACT

A change-detection web server automatically checks web-page documents for
recent changes. The server retrieves and compares documents one or more
times a week. The user is notified by electronic mail when a change is
detected. The user registers a web-page document by submitting his e-mail
address and the uniform-resource locator (URL) of the desired document. The
document is fetched and the user can select text on the page of interest.
Non-selected text is ignored; only changes in the selected text are reported
back to the user. Thus changes to less relevant parts of the document are
ignored. The document is divided into sections bounded by hyper-text
markup-language (HTML) tags. A checksum is generated and stored for each
HTML-bound section. Storage requirements are reduced since only checksums
are stored rather than the original documents. During periodic comparisons a
fresh copy of the document is retrieved, divided into HTML-bound sections
and checksums generated for each section. The freshly-generated checksums
are compared to the archived checksums. Sections with non-matching checksums
are highlighted as changed, and the percentage of changed sections is
reported. The user-defined selection is also stored as a checksum and
compared to a freshly-generated checksum. Changed checksums outside the
user-defined selection do not generate a change notification. Re-ordering of
sections does not generate a change notification when the checksums
otherwise match. Thus format and layout changes do not generate change
notifications, and the frequency of notices to user is reduced.
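
The section-level comparison summarized in the abstract might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the tag-splitting regular expression, the use of CRC-32 from Python's `zlib` as the checksum, and the names `section_checksums` and `compare` are all choices made here for brevity.

```python
import re
import zlib

# Assumption: any "<...>" run counts as an HTML tag that bounds a section.
TAG = re.compile(r"<[^>]+>")

def section_checksums(html: str) -> list[int]:
    """Split the page on HTML tags and checksum each non-empty text section."""
    sections = (s.strip() for s in TAG.split(html))
    return [zlib.crc32(s.encode("utf-8")) for s in sections if s]

def compare(archived: list[int], fresh: list[int]) -> tuple[list[int], float]:
    """Return indexes of archived sections with no matching fresh checksum,
    plus the percentage of archived sections that changed."""
    fresh_set = set(fresh)  # set membership ignores section order
    changed = [i for i, crc in enumerate(archived) if crc not in fresh_set]
    percent = 100.0 * len(changed) / len(archived) if archived else 0.0
    return changed, percent

old = "<h1>Title</h1><p>Alpha</p><p>Beta</p><p>Gamma</p>"
reordered = "<h1>Title</h1><p>Beta</p><p>Alpha</p><p>Gamma</p>"
edited = "<h1>Title</h1><p>Alpha</p><p>Beta!</p><p>Gamma</p>"

archived = section_checksums(old)
print(compare(archived, section_checksums(reordered)))  # no sections changed
print(compare(archived, section_checksums(edited)))     # one of four changed
```

Because only the checksums are archived, the original text of a changed section cannot be shown; the comparison can only say which sections differ and how many.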
`
`19 Claims, 10 Drawing Sheets
`
`
`Oracle Exhibit 1005, pg 1
`
`

`

U.S. Patent    Apr. 27, 1999    Sheet 1 of 10    5,898,836

[FIG. 1: block diagram of change-detection tool web server 20, containing a
minder, a database, and responder 24, shown with a source document server
(WWW server) and a user client (WWW browser).]
`
`

`

[Sheet 2 of 10. FIG. 2, "USER SETUP": user client 14 (WWW browser) sends a
URL, sections, and an e-mail address to responder 24 of change-detection
tool web server 20 and receives a confirmation. Also shown: minder 22, the
database, source document server 12 (WWW server), and Internet 10.]
`
`

`

[Sheet 3 of 10. FIG. 3, "WEEKLY COMPARE": the minder of change-detection
tool web server 20 sends a URL to source document server 12 (WWW server),
receives the document, and uses the old CRC's from the database. User client
14 (WWW browser) and responder 24 are also shown.]
`
`

`

[Sheet 4 of 10. FIG. 4, "REPORT CHANGE": minder 22 of change-detection tool
web server 20 sends an e-mail notice to user client 14 (WWW browser). Source
document server 12 (WWW server) and responder 24 are also shown.]
`
`

`

[Sheet 5 of 10. FIG. 5: responder operation on an unstructured file. A
selection in the source document, identified by START and LEN1, passes
through a CRC step to produce CRC1. Record 40 holds the URL, the e-mail
address, LEN1, CRC1, LEN2, and CRC2.]
`
`

`

[Sheet 6 of 10. FIG. 6: minder operation on an unstructured file. The new
source document is fetched using the URL from record 40 (URL, e-mail
address, LEN1, LEN2, CRC1, CRC2). A CRC of the selection is computed and
tested (Y/N). On a match the exact string has been found (no change);
otherwise the next character is tried. When no more characters remain the
string was not found and an e-mail change notice is sent.]
`
`

`

[Sheet 7 of 10. FIG. 7: an HTML document divided into tag-bounded sections,
with one checksum per section.]

Section    HTML-bound text                         Checksum
1          <TAG1> TEXT IN SECTION 1... </TAG1>     CRC1
2          <TAG2> TEXT IN SECTION 2... </TAG2>     CRC2
3          <TAG3> TEXT IN SECTION 3... </TAG3>     CRC3
4          <TAG4> TEXT IN SECTION 4... </TAG4>     CRC4
...
`
`

`

[Sheet 8 of 10. FIG. 8: responder operation on an HTML file. The source
document passes through a parser/divider and a per-section CRC step. The
user's selection of sections sets a per-section enable/disable flag. Record
40' holds the URL, the e-mail address, CRC1 to CRC4 for sections 1 to 4, and
enable/disable field 52.]
`
`

`

[Sheet 9 of 10. FIG. 9: minder operation on an HTML file. The new source
document, fetched using the URL, passes through a parser/divider to produce
new CRC's NCRC1 to NCRC4 for sections 1 to 4 (table 80, with per-section
field 82). Comparison step 76 tests them against CRC1 to CRC4 and
enable/disable field 52 in record 40'. A match means the exact section was
found (no change).]
`
`

`

[Sheet 10 of 10. FIG. 10: alternate embodiment. An HTML document with
sections bounded by <TAG1>, <TAG2>, and <TAG3>, and a user selection that
spans part of the document. Table 90 lists section number, CRC (CRC1 to
CRC4), and an enable/disable flag per section. Table 92 lists, for user
selection #1, the starting section number (2), the length (LEN1), and the
CRC (CRC1A).]
`
`

`

CHANGE-DETECTION TOOL INDICATING DEGREE AND LOCATION OF CHANGE OF INTERNET
DOCUMENTS BY COMPARISON OF CYCLIC-REDUNDANCY-CHECK (CRC) SIGNATURES

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to software retrieval tools for networks, and more
particularly to a change-detection and highlighting tool for the Internet.

2. Description of the Related Art

Today's society is sometimes referred to as an information society.
Technology has increased the ease of generating and disseminating
information. The widespread acceptance of the global network known as the
Internet allows huge amounts of information to be instantly transmitted to
persons around the world.

Explosive growth is occurring in the part of the Internet known as the
World-Wide Web, or simply the "web". The web is a collection of millions of
files or "web pages" of text, graphics, and other media which are connected
by hyper-links to other web pages. These may physically reside on a computer
system anywhere on the Internet, on a computer in the next room or on the
other side of the world.

These hyper-links often appear in the browser as a graphical icon or as
colored, underlined text. A hyper-link contains a link to another web page.
Using a mouse to click on the hyper-link initiates a process which locates
and retrieves the linked web page, regardless of the physical location of
that page. Hovering a mouse over a hyper-link or clicking on the link often
displays in a corner of the browser a locator for the linked web page. This
locator is known as a Uniform Resource Locator, or URL.

The vast amount of information available on the Internet has created an
overload of information which the casual user cannot digest. Internet search
tools or search engines allow users to find desired information by searching
for keywords through an index of the millions of documents posted on the
Internet. Search engines such as Excite of Mountain View, Calif. and Digital
Equipment's "ALTAVISTA" help users quickly sift through huge amounts of
information to find the desired information.

A characteristic of the Internet is that it is relatively easy to change or
update information. The user may wish to know when updates are made to the
desired information he found with a search. For example, the information
found may describe a bug fix or other revision in a software program.
Initially a crude work-around or even just a notice of the bug may be posted
on the Internet. Later, this posting may be updated with a more robust fix
or other useful information. The information could also be a list of phone
numbers or other contact information, or it could be a product list or a
competitor's web site, advertising, or press releases.

The user could frequently re-access the information on the Internet to see
if changes have occurred, but this is time-consuming. Frequently
re-accessing the information is tedious, particularly when the information
is contained in a long document, or when many documents must be checked for
changes.

Software tools have been developed to automate the task of detecting updates
to information on the Internet. Early tools such as America Online's News
Profiles allow users to specify keywords which are periodically searched for
in a news database. News articles containing the specified keywords are sent
to the user by electronic mail (e-mail).

These automated software tools are sometimes known as "netbots", network
robots which automatically perform some task for a user. Netbots allow users
to better manage the information on the Internet and reduce the amount of
information that a user must read. Filtering down the amount of information
is critical to making good use of the overwhelming amount of information
available on the Internet.

More recent change-detection tools allow users to register a document or web
page on the Internet and be notified when any change to that document
occurs. The user "registers" a document by specifying the URL of the
document, and providing the user's e-mail address. The change-detection tool
stores a local copy of the document together with the user's e-mail address.
Once every day or week the change-detection tool accesses the source
document at the specified URL, and compares the retrieved source document to
the local copy of the document. If a difference between the older local copy
and the just-retrieved source document is detected, then a message is sent
to the user's e-mail address, perhaps with a copy of the new document or a
copy of the changes.

The document-change tool could store an actual copy of the entire document
at the tool's web site for comparison. However, storing the whole document
at the document-change tool's web site is expensive because large amounts of
storage are needed. For example, if 500,000 documents were registered, and
each document averages 50 Kbytes, then 25 GigaBytes of storage are needed to
store copies of the registered documents.

Instead of storing the entire document, the revision date or time-stamp of
the document could be stored. U.S. Pat. No. 5,388,255 shows a database which
compares time stamps to determine when data has changed. Since the
time-stamp is much smaller than the entire document, storage space is
reduced at the tool's web site.

The inventors have a change-detection tool which stores a checksum or CRC of
the document rather than the time-stamp or the entire document. When the
document is initially registered, a checksum is generated for the entire
source document. This checksum is stored at the tool's web site. Each week
when the source document is retrieved, another checksum is generated and
compared to the stored checksum. If the stored checksum matches the
newly-generated checksum, then no change is detected. When the checksums do
not match, then the user is notified of a change by e-mail. The user can
optionally have a copy of the new document attached to the e-mail
notification.

Such a change-detection tool called a "URL-minder" has been available for
free public use at the inventor's web site, www.netmind.com, for more than a
year before the filing date of the present application. Over 150,000
documents or URL's are registered at that site for 1.4 million users.

MINOR CHANGES NOT FILTERED OUT

While such a change-detection tool is useful, the existing tool has several
drawbacks. Since minor changes are frequently made to Internet documents,
users are notified of many insignificant changes. The users can quickly
become irritated with frequent e-mail notices of the minor, irrelevant
changes. Statistics taken for the URL-minder tool in May, 1996, showed that
over 100,000 change notices were e-mailed in just four days to the 500,000
registered users. Internet documents change every few weeks on the average.
Thus a user with a few dozen registered documents receives notices almost
every day. This is an undesirably high frequency of notices for many users.
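
The storage figures quoted in the background can be checked with a few lines of arithmetic. The sketch below is illustrative only. The document count and average size are the figures given in the text. The 8-byte checksum size is an assumption borrowed from the 8-byte CRC mentioned later in the description, and decimal units (1 Kbyte = 1,000 bytes) are also assumed.

```python
# Figures quoted in the background section.
REGISTERED_DOCS = 500_000
AVG_DOC_BYTES = 50_000   # 50 Kbytes, assuming decimal units

# Assumption: one 8-byte checksum archived per registered document.
CHECKSUM_BYTES = 8

def storage_bytes(doc_count: int, bytes_per_doc: int) -> int:
    """Total archive size when each registered document costs bytes_per_doc."""
    return doc_count * bytes_per_doc

full_copy_total = storage_bytes(REGISTERED_DOCS, AVG_DOC_BYTES)
checksum_total = storage_bytes(REGISTERED_DOCS, CHECKSUM_BYTES)

print(f"full copies: {full_copy_total / 1e9:.0f} GB")  # 25 GB
print(f"checksums:   {checksum_total / 1e6:.0f} MB")   # 4 MB
```

Under these assumptions the checksum archive is several thousand times smaller than an archive of full copies, at the cost of no longer being able to show what changed.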
`
`

`

LOCATION OF CHANGE DESIRABLE

When the entire document is stored rather than a checksum, the location of
the change in the document can be found and highlighted to the user since
the original document is available for comparison. However, when a single
checksum is stored for each registered document, the changes within that
document cannot be determined or identified. Thus the user is left to
determine the location of the change within the document, and the relevance
of that change.

With the existing URL-minder which stores only checksums, when a change is
detected, the user is simply notified that there was a change. The user can
optionally receive a copy of the changed document, but the changes are not
highlighted. Thus the user must re-read the entire document to determine
what the change was. Often the changes are minor and even hard to detect,
such as a spelling change of a word, or a date change. Sometimes the order
or arrangement of text has changed but not the content. These minor changes
are not always significant to the user.

Thus the user is plagued with frequent notices of minor changes, and the
user must re-read the entire document to determine what the change was.
Having to re-read the documents increases the burden on the user, which is
the opposite of the intent of an automated tool or netbot.

LONG, COMPLEX DOCUMENTS COMMON

The change-detection tool allows a user to register a document by specifying
the uniform-resource-locator (URL) of that document. A unique URL is
specified for each web page on the Internet's world-wide-web. Other
information sometimes embedded in the URL includes passwords or search text
that the user types in, or name and address information typed in. Internet
documents are usually web pages containing several individual files such as
for graphics, text, and motion video and sound. Sometimes these files
include small programs such as CGI (common gateway interface) scripts. Thus
the documents registered are fairly complex and often lengthy.

Often the user is only interested in a small part of a document, rather than
the whole document. A user might be interested only in one contact or phone
number on a list of hundreds of phone numbers for an office, or only one
product line in a long list of products. It is desirable to allow the user
to specify only the portion of a document or web page which is of interest.

What is desired is a storage-efficient change-detection tool which detects
when changes occur to a registered document on the Internet. It is desired
that minor changes to the document be filtered by the change-detection tool
to reduce the number of change notifications sent to the user. It is also
desired to give the user an indication of how significant the change is. It
is desired to allow the user to identify relevant portions of a document so
that the user is not notified of changes to other portions of the document.
It is further desired to reduce storage requirements for the
change-detection tool by storing a condensed checksum or signature of the
registered document rather than storing the entire document.

SUMMARY OF THE INVENTION

A change-detection web server has a network connection for transmitting and
receiving packets from a remote client and a remote document server. A
responder is coupled to the network connection. The responder communicates
with the remote client to register a document for change detection by
receiving from the remote client a uniform-resource-locator (URL)
identifying the document. The responder fetches the document from the remote
document server and generates an original checksum for a checked portion of
the document. The checked portion is less than the entire document.

A database is coupled to the responder. It receives the URL and the original
checksum from the responder when the document is registered by the remote
client. The database stores a plurality of records each containing a URL and
a checksum for a registered document. A periodic minder is coupled to the
database and the network connection. It periodically re-fetches the document
from the remote document server by transmitting the URL from the database to
the network connection. The periodic minder receives a fresh copy of the
document from the remote document server. The periodic minder generates a
fresh checksum of a portion of the fresh copy of the document and compares
the fresh checksum to the original checksum. A detected change is signaled
to the remote client when the fresh checksum does not match the original
checksum.

Thus a change in the document is detected by comparing a checksum for the
checked portion of the document. Changes in portions of the document outside
the checked portion are not signaled to the remote client.

In further aspects the database does not store the document. The database
stores a checksum for the document. Thus storage requirements for the
database are reduced by archiving checksums and not entire documents.

In other aspects of the invention a selection means is coupled to the
responder. It receives a selection from the remote client. The selection
identifies boundaries of the checked portion of the document. A parsing
means is coupled to the periodic minder. It parses the fresh copy and
generates checksums for a plurality of portions of the fresh copy. A compare
means is coupled to the parsing means. It signals a match when any of the
checksums generated by the parsing means matches the original checksum from
the database. Thus a change in the document is detected when the match is
not signaled by the compare means. The parsing means generates a plurality
of checksums for the plurality of portions of the fresh copy.

In still further aspects of the invention a length field indicates a size of
the checked portion. The length field is written by the selection means. The
parsing means generates each checksum for portions having the size of the
checked portion. Thus the size of the checked portion is stored and used by
the parsing means.

In further aspects the document is a hyper-text markup-language (HTML)
document containing HTML tags. The HTML tags indicate formatting, layout,
and hyper-links specifying URLs of other servers. The change-detection web
server also has divider means coupled to the responder, for dividing the
document into portions bound by the HTML tags. A checksum means generates
original checksums. An original checksum is generated for each portion bound
by HTML tags. The database stores the original checksums for the portions
bound by the HTML tags. The periodic minder also has a second divider means
which divides the fresh copy of the document into portions bound by the HTML
tags. A second checksum means generates fresh checksums for portions of the
fresh copy bound by HTML tags in the fresh copy of the document. A compare
means receives the fresh checksums of the fresh copy from the second
checksum means. It compares the fresh checksums to the original checksums
from the database. A report means signals a
`
`

`

change in the document when an original checksum for the document has no
matching fresh checksum. Thus checksums are generated and stored for
portions of the document bound by the HTML tags.

In further aspects the report means has a mailer means coupled to the
network connection. It sends a change notification message to the remote
client when the change is signaled. The responder receives an
electronic-mail address from the remote client and stores the
electronic-mail address of the remote client in the database. The mailer
means reads the electronic-mail address from the database. The change
notification message is sent to the remote client as an electronic-mail
message addressed to the electronic-mail address. Thus the remote client is
notified of the change by electronic mail.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of a change detection tool on a server on the Internet.
FIG. 2 shows a user registering a web page document for change detection.
FIG. 3 shows a periodic comparison of a registered web page document to
determine if the document has changed.
FIG. 4 shows a document-change notice being generated and sent to the user.
FIG. 5 illustrates the operation of responder 24 of FIG. 1 when the
registered document is an arbitrary, unstructured file.
FIG. 6 illustrates operation of minder 22 of FIG. 1 when the registered
document has an arbitrary, unstructured format.
FIG. 7 is a diagram of an HTML document and a table of checksums for the
HTML-delineated sections.
FIG. 8 illustrates the operation of responder 24 of FIG. 1 when the
registered document is an HTML file.
FIG. 9 illustrates the operation of minder 22 of FIG. 1 when an HTML
document is checked for recent changes.
FIG. 10 is a diagram illustrating an alternate embodiment which archives
separate checksums for HTML-defined sections and checksums for user-defined
sections.

DETAILED DESCRIPTION

The present invention relates to an improvement in Internet-document
change-detection tools. The following description is presented to enable one
of ordinary skill in the art to make and use the invention as provided in
the context of a particular application and its requirements. Various
modifications to the preferred embodiment will be apparent to those with
skill in the art, and the general principles defined herein may be applied
to other embodiments. Therefore, the present invention is not intended to be
limited to the particular embodiments shown and described, but is to be
accorded the widest scope consistent with the principles and novel features
herein disclosed.

OVERVIEW OF CHANGE-DETECTION WEB SERVER

FIG. 1 is a diagram of a change detection tool on a server on the Internet.
The user operates client 14 from a remote site on Internet 10. The user
typically is operating a browser application, such as Netscape's Navigator
or Microsoft's Internet Explorer. Client 14 communicates through Internet 10
by sending and receiving TCP/IP packets to establish connections with remote
servers, typically using the hyper-text transfer protocol (http) of the
world-wide web.

Client 14 retrieves web pages or files from document server 12 through
Internet 10. These web pages are identified by a unique URL (uniform
resource locator) which specifies a document file containing the text and
graphics of a desired web page. Often additional files are retrieved when a
document is retrieved. The "document" returned from document server 12 to
client 14 is thus a composite document composed of several files of text,
graphics, and perhaps sound or animation. The physical appearance of the web
page on the user's browser on client 14 is specified by layout information
embedded in non-displayed tags, as is well-known for HTML (hyper-text markup
language) documents. Often these HTML documents contain tags with URL's that
specify other web pages, perhaps on other web servers which may be
physically located in different cities or countries. These tags create
hyper-links to these other web servers allowing the user to quickly jump to
other servers. These hyper-links form a complex web of linked servers across
the world; hence the name "world-wide web".

The user may frequently retrieve files from remote document server 12. Often
the same file is retrieved. The user may only be interested in differences
in the file, or learning when the file is updated, such as when a new
product or service is announced. The inventors have developed a software
tool which automatically retrieves files and compares the retrieved files to
an archived checksum of the file to determine if a change in the file has
occurred. When a change is detected, the user is notified by an electronic
mail message (e-mail). A copy of the new file may be attached to the e-mail
notification, allowing the user to review the changes.

Rather than archive the source files from remote document server 12, the
invention archives a checksum or CRC of the source files. These CRC's and
the e-mail address of the user are stored in database 16 of change-detection
server 20. Comparison is made of the stored or archived CRC of the document
and a fresh CRC of the currently-available document. The CRC is a condensed
signature or fingerprint of the document. Any change to the document changes
the CRC. Aliasing of CRC's can be reduced to a very small probability by
using sufficiently large CRC's, such as an 8-byte CRC. With an 8-byte CRC it
is extremely improbable that a change to a document results in the same CRC
being generated. If an identical CRC is generated, then the user is not
notified of any change.

Change-detection server 20 performs three basic functions:
1. Register (set up) a web page document for change detection.
2. Periodically re-fetch the document and compare for changes.
3. E-mail a change notice to the registered user if a change is detected.

Change-detection server 20 contains three basic components. Database 16
stores the archive of CRC's for registered web-page documents. The URL
identifying the web page and the user's e-mail address are also stored with
the archived CRC's. Responder 24 communicates with the user at client 14 to
set up or register a web page document for change detection. Minder 22
periodically fetches registered documents from document server 12 through
Internet 10. Minder 22 compares the archived CRC's in database 16 to new
CRC's of the fetched documents to determine if a change has occurred. When a
change is detected, minder 22 sends a notice to the user at client 14 that
the document has changed.
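
As a rough illustration of the archived-versus-fresh CRC comparison described for change-detection server 20, the sketch below computes an 8-byte (64-bit) CRC of a document and compares it with a stored value. The text does not say which CRC polynomial is used. The ECMA-182 polynomial and the names `crc64` and `has_changed` are assumptions made here for illustration. Under the idealized assumption that checksums of unrelated documents are uniformly distributed, two different documents alias with probability about 2^-64, roughly 5 x 10^-20.

```python
# Assumption: the ECMA-182 generator polynomial; the patent names none.
POLY = 0x42F0E1EBA9EA3693
MASK = (1 << 64) - 1  # keep the running value within 8 bytes

def crc64(data: bytes) -> int:
    """Bitwise, most-significant-bit-first CRC-64 with a zero initial value."""
    crc = 0
    for byte in data:
        crc ^= byte << 56            # fold the next byte into the top bits
        for _ in range(8):
            if crc & (1 << 63):      # top bit set: shift and subtract POLY
                crc = ((crc << 1) ^ POLY) & MASK
            else:
                crc = (crc << 1) & MASK
    return crc

def has_changed(archived_crc: int, fresh_document: bytes) -> bool:
    """True when the fresh copy's CRC differs from the archived CRC."""
    return crc64(fresh_document) != archived_crc

archived = crc64(b"price: 10")
print(has_changed(archived, b"price: 10"))  # False: same content
print(has_changed(archived, b"price: 11"))  # True: content differs
```

A table-driven CRC would be faster on long documents; the bit-at-a-time loop is used here only because it is easier to follow.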
`
`

`

OVERVIEW OF OPERATION: FIGS. 2, 3, 4
file. The user initiates registration of a document by providing the URL
identifying the document and the user's e-mail address. These can be
provided by typing or pasting them into fields on a registration web page at
change-detection server 20.

Change-detection server 20 uses the URL to fetch a copy of source document
30 from document server 12 of FIG. 1. Source document 30 could be any one of
millions of documents on the thousands of web servers connected to the
Internet. Source document 30 is displayed to the user, allowing the user to
select portions of source document 30 for registration. The user can select
portions of source document 30 by dragging a highlight with a mouse over the
text to be selected. Alternately, the user can select whole paragraphs by
triple-clicking anywhere inside these sections, or a single word or numeric
value by double-clicking on the word. Changes which occur in unselected
portions of source document 30 do not generate change notifications.
The selection information from the user is encoded as a string of length
LEN1, with a starting location START. Parser 32 reads characters from source
document 30 one at a time until the first character in the string at the
starting location START is found. START can simply be an offset in bytes or
in characters from the beginning of the file to the beginning of the user's
selection. Characters following START are sent from parser 32 to CRC
generator 34 until the number of characters indicated by LEN1 is reached,
indicating that the end of the selection has been reached. CRC generator 34
calculates the cyclic-redundancy-check (CRC) of these characters selected by
the user from source document 30. Methods of generating CRC's and other
checksums are well-known in the art and any of several methods can be used.

The CRC is typically generated by exclusive-ORing bits from a current
character with a running checksum to generate a new checksum, which is then
exclusive-ORed with bits from the next character. The final value of the
running checksum, CRC1, is written to record 40 in database 16 of FIG. 1.
The URL and the e-mail address from the user are also written to record 40.
The length of the selection, LEN1, is also written to record 40, but the
starting location is not. The starting location can change when changes are
made to the web page document in the non-selected region before the
selection, such as in a document header. Thus the starting location can
change even when the selection has not changed, and changes in the header
should be ignored.

The user may make several selections on the same source document 30, and
each selection has its length and CRC stored in record 40. For example, the
second user-selection stores LEN2 and CRC2 in record 40.
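
The registration steps above can be sketched in a few lines. This is a minimal illustration under assumptions, not the patented code: `Record`, `register_selection`, and `selection_unchanged` are hypothetical names, CRC-32 from `zlib` stands in for the CRC, and offsets are counted in characters. The last function illustrates why the starting location need not be stored: a later check can slide a window of the stored length across a fresh copy and look for any position whose checksum matches.

```python
import zlib
from dataclasses import dataclass, field

def crc(text: str) -> int:
    """Checksum of a string (CRC-32 used as a stand-in)."""
    return zlib.crc32(text.encode("utf-8"))

@dataclass
class Record:
    """Hypothetical layout for record 40; no starting offsets are kept."""
    url: str
    email: str
    selections: list[tuple[int, int]] = field(default_factory=list)  # (LEN, CRC)

def register_selection(record: Record, document: str, start: int, length: int) -> None:
    """Checksum document[start:start+length]; store only its length and CRC."""
    selected = document[start:start + length]
    record.selections.append((len(selected), crc(selected)))

def selection_unchanged(fresh: str, length: int, archived_crc: int) -> bool:
    """Slide a window of the stored length over the fresh copy looking for a match."""
    return any(
        crc(fresh[i:i + length]) == archived_crc
        for i in range(len(fresh) - length + 1)
    )

doc = "HEADER v1\nPhone: 555-0100\nFooter"
rec = Record("http://example.com/contacts", "user@example.com")  # placeholder values
register_selection(rec, doc, doc.index("Phone"), len("Phone: 555-0100"))
length, archived = rec.selections[0]

# A longer header moves the selection but does not change it.
print(selection_unchanged("HEADER version 2\nPhone: 555-0100\nFooter", length, archived))
# Editing the selected text itself leaves no matching window.
print(selection_unchanged("HEADER v1\nPhone: 555-0199\nFooter", length, archived))
```

Recomputing the checksum at every offset costs time proportional to document length times selection length, which is acceptable for a sketch but would call for an incremental checksum in practice.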
FIG. 6 illustrates operation of minder 22 of FIG. 1 when the registered
document has an arbitrary, unstructured format. The minder performs
change-detection on each of the thousands of documents having their URL's
registered. Checking is preferably performed once for all users registering
the same URL since this saves re-fetching documents for different users.
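
One way to check each URL once for all of its registered users is to group registrations by URL before fetching. The sketch below is an assumption about how that grouping might look. The names `group_by_url` and `run_minder`, and the `fetch` callback, are hypothetical and stand in for whatever retrieval mechanism the minder uses.

```python
from collections import defaultdict

def group_by_url(registrations):
    """Map each URL to the e-mail addresses registered for it."""
    by_url = defaultdict(list)
    for url, email in registrations:
        by_url[url].append(email)
    return dict(by_url)

def run_minder(registrations, fetch):
    """Fetch each distinct URL once; return {url: (document, [emails])}."""
    return {
        url: (fetch(url), emails)
        for url, emails in group_by_url(registrations).items()
    }

# Placeholder registrations: two users share the first URL.
registrations = [
    ("http://example.com/a", "ann@example.com"),
    ("http://example.com/b", "bob@example.com"),
    ("http://example.com/a", "cat@example.com"),
]

fetched = []
def fake_fetch(url):
    """Stand-in for a network fetch that records which URLs were requested."""
    fetched.append(url)
    return f"contents of {url}"

results = run_minder(registrations, fake_fetch)
print(len(fetched))  # 2 fetches for 3 registrations
```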
`The minder begins by reading record 40 from database 16
`of FIG. 1. The URL in record 40 is used to access the remote
`document server on the Internet and retrieve a fresh document
`copy 30' of source document 30 which was registered
`as described for FIG. 5. Fresh document copy 30' is parsed
`by parser 42 and each successive character of document
`copy 30' is sent to CRC generator 44 until the stored length
`LEN1 is reached. A new CRC for this string from document
`
`
`FIG. 2 shows a user registering a web page document for
`change detection. The user on client 14 registers a web page
`document by specifying the URL which identifies the web
`page. A portion of the URL is translated into an IP address
`of a server by a domain-name server. The user also sends his
`e-mail address to responder 24. Responder 24 fetches the
`web page and displays the page to the user. The user then
`selects which portions of the web page document are to be
`compared for changes. The user can select paragraphs of text
`by dragging a highlight across the text. Responder 24 then
`stores the location of the selected text and generates one or
`more CRCs for the selected text. Responder 24 then stores the
`CRC(s), URL, and e-mail address in database 16. A confirmation
`that the web page document has been registered is
`finally sent to the user on client 14.
`FIG. 3 shows a periodic comparison of a registered web
`page document to determine if the document has changed.
`Each registered document is compared for changes on a
`periodic basis which depends on the number of registered
`documents and the speed of operation of change-detection
`server 20. Typically each document is compared every few
`days, although more frequent comparisons are possible.
`Minder 22 reads the URL of the registered document from
`database 16. Minder 22 automatically fetches from document
`server 12 a fresh copy of the web-page document
`pointed to by the URL. Client 14 is not involved in this
`transaction. Occasionally the URL is deleted or does not
`respond, and a change is then signaled indicating that the
`URL co
