United States Patent [19]
Freivald et al.

[11] Patent Number: 6,012,087
[45] Date of Patent: Jan. 4, 2000

US006012087A

[54] UNIQUE-CHANGE DETECTION OF DYNAMIC WEB PAGES USING HISTORY TABLES OF SIGNATURES

[75] Inventors: Matthew P. Freivald, Sunnyvale; Alan C. Noble, Santa Cruz, both of Calif.

[73] Assignee: NetMind Technologies, Inc., Campbell, Calif.

[21] Appl. No.: 09/081,991

[22] Filed: May 20, 1998

Related U.S. Application Data

[63] Continuation-in-part of application No. 08/783,625, Jan. 14, 1997.

[51] Int. Cl. ........ H04L 12/00
[52] U.S. Cl. ........ 709/218; 709/229; 709/201
[58] Field of Search ........ 709/218, 229, 224, 226, 201; 395/705; 380/21, 25, 29, 4; 705/8

[56] References Cited

U.S. PATENT DOCUMENTS

5,109,486     4/1992   Seymour ........ 709/224
5,204,897     4/1993   Wyman ........ 380/4
5,249,261     9/1993   Natarajan ........ 706/46
[illegible]   11/199[?]  Wyman ........ 380/4
[illegible]   8/199[?]   Wyman ........ 705/8
5,574,906     11/1996  Morris ........ 707/1
5,596,750     1/1997   Li et al. ........ 709/101
5,666,502     9/1997   Capps ........ 345/352
5,671,282     9/1997   Wolff et al. ........ 380/25
5,715,453     2/1998   Stewart ........ 707/104
5,835,726     11/1998  Shwed et al. ........ 709/229
5,898,836     4/1999   Freivald ........ 709/218
5,923,880     7/1999   Rose et al. ........ 395/705

Primary Examiner: Zarni Maung
Assistant Examiner: Khanh Quang Dinh
Attorney, Agent, or Firm: Stuart T. Auvinen

[57] ABSTRACT
`
`An improved change-detection tool detects only relevant
`changes within Internet web pages on the world-wide-web.
`Changes back to an earlier version of a web page are not
`relevant and do not cause the user to be notified. Only
`changes to a new, unique version of the web page generate
`a user notification. After the user finishes registering the web
`page by specifying the URL and the user’s e-mail address,
`the change-detection tool periodically retrieves the web-
`page at the specified URL and generates a checksum or
`signature to determine when to send a notification to the
`user. Signatures from several older versions of the web page
`are stored in a history table. When a new signature for a
`re-fetched page matches the most-recent signature at the top
of the stack in the history table, no change has occurred.
When the new signature matches any of the older signatures
in the history table, the detected change is not unique and
notification is not made even though a change has occurred.
When the new signature matches one of the older, not-most-
recent signatures in the history table, the signature is moved
into a permanent history table. Signatures in the permanent
history table are for recurring versions of the web page and
are likely to appear again. Error pages displayed when a web
server is down for routine maintenance can be screened out
using the history table. The frequency of notifications is
tracked. When too many notifications are being sent for a
web page, the last-modified header is used rather than
signature-matching to reduce the frequency of notifications.
`
`20 Claims, 14 Drawing Sheets
`
[Front-page drawing: PERIODIC MINDER flowchart with the same steps as FIG. 8A: read URL from DB; fetch doc at URL; generate new signature for doc; read all sigs from history table in DB; do any sigs in history table match? Yes: no change, next URL. No: read LAST_MOD from DB.]

Clouding Exhibit 2001, pg. 1

`

U.S. Patent    Jan. 4, 2000    Sheet 1 of 14    6,012,087

[FIG. 1: web page titled "37 C.F.R. RULES" showing 37 C.F.R. 1.8 ("APPLICANT SHALL ..."), 37 C.F.R. 1.62 ("A CONTINUATION ..."), and 37 C.F.R. 1.136 ("AN EXTENSION OF ..."). DOC SIGNATURE = 5A7.]

[FIG. 2: the same web page after an update, showing 37 C.F.R. 1.8, 37 C.F.R. 1.62, and 37 C.F.R. 1.136 with the labels {MODIFIED RULE} and DELETED RULE. DOC SIGNATURE = D6F. CHANGE DETECTED.]
`
`

`

[Sheet 2 of 14. FIG. 3: error page reading "SERVER IS TEMPORARILY UNAVAILABLE FOR ROUTINE MAINTENANCE. SORRY FOR THE INCONVENIENCE." DOC SIGNATURE = EB9. CHANGE DETECTED IS NOT RELEVANT.]

[FIG. 4: dynamic web page with headers <HTML>, <CONTENT_LEN = 37,428>, <LAST_MODIFIED = 3.15.98 13:42>, <END_HTML>, followed by stock-quote lines for HWP and INTC.]
`
`

`

[Sheet 3 of 14. Block diagram corresponding to FIG. 5: a source document (WWW server), a client user (WWW browser), and the change-detection tool web server 20 containing the minder, the database, and the responder.]
`
`

`

[Sheet 4 of 14. FIG. 6: database record with the fields E-MAIL ADDR, URL (WWW ADDR), LAST-MOD, and SIGNATURE HISTORY TABLE. Reference numerals 32, 34, 36, 38, 40.]
`
`

`

[Sheet 5 of 14. FIGS. 7A-7D (only the FIG. 7D label is legible): history-table states as new signatures arrive. Legible labels: NEW SIG = D6F, SIG = EB9, NOTIFICATION, NO CHANGE NOTIFICATION.]
`
`

`

[Sheet 6 of 14. FIG. 8A: PERIODIC MINDER flowchart. Steps in order: read URL from DB; fetch doc at URL; generate new signature for doc; read all sigs from history table in DB; do any sigs in history table match? Yes: no change, next URL. No: read LAST_MOD from DB. Reference numerals 60, 62, 64, 66, 67, 68, 69.]
`
`

`

[Sheet 7 of 14. FIG. 8B: flowchart continuation. Legible labels: does doc have LAST_MOD header?; is LAST_MOD from doc same as in DB?; NOTIFY; fetch doc at URL again; re-generate sig for doc re-fetched; any sigs from history table match?; false detect, ignore. Reference numerals 70, 72, 78, 79, 80.]
`
`

`

[Sheet 8 of 14. FIG. 9: NOTIFY flowchart. Steps in order: add new sig to history table; read e-mail addr from DB; send notification message to e-mail addr. Reference numerals 82, 84, 86.]
`
`

`

[Sheet 9 of 14. FIG. 10: temporary history table 50 and permanent history table 52.]

[FIG. 11: temporary and permanent history tables before and after a new signature arrives (NEW SIG = EB9, NO CHANGE NOTIFICATION). Reference numerals 50', 52'.]
`
`

`

[Sheet 10 of 14. FIG. 12: DUAL-TABLE ADD-ON flowchart. Legible labels: sig matches in permanent history table?; sig matches most-recent in temp table?; CONTINUE; remove matching sig from temp table; write new sig to perm table. Reference numerals 130, 132, 134, 136.]
`
`

`

[Sheet 11 of 14. FIG. 13: change-detection record with the fields E-MAIL ADDR, URL (WWW ADDR), LAST_MOD, SIGNATURE HISTORY TABLE, # DETECTS, IGNORE SIG, and PERM SIGS (EB9). Reference numerals 32, 34, 36, 38, 52, 54, 56.]
`
`

`

[Sheet 12 of 14. FIG. 14: FREQUENCY CHECK flowchart. Legible labels: read # detects from DB; is # detects > threshold?; does doc have LAST_MOD?; set IGNORE_SIG flag in DB for URL; clear # detects in DB; next URL record. Reference numerals 90, 91, 92, 94, 96.]
`
`

`

[Sheet 13 of 14. FIG. 15: PERIODIC MINDER flowchart using signatures and last-modified headers. Legible labels: read URL from DB; fetch doc at URL; is IGNORE_SIG flag set?; generate new signature for doc; read all sigs from history table in DB; do any sigs in history table match?; read LAST_MOD from DB; is LAST_MOD from doc same as in DB?; NOTIFY; NO CHANGE. Reference numerals 60, 62, 64, 66, 67, 68, 69, 75, 77, 80.]
`
`

`

[Sheet 14 of 14. FIG. 16: CONTENT-LENGTH REFETCHING flowchart. Legible labels: is there content-length for doc in header?; does header len match tag?; CONTINUE; refetch web-page doc. Reference numeral 100.]
`
`

`

`UNIQUE-CHANGE DETECTION OF
`DYNAMIC WEB PAGES USING HISTORY
`TABLES OF SIGNATURES
`
`RELATED APPLICATION
`
`This application is a continuation-in-part of the applica-
`tion for “Change-Detection Tool Indicating Degree and
`Location of Change of Internet Documents by Comparison
`of CRC Signatures”, U.S. Ser. No. 08/783,625, filed Jan. 14,
1997, now U.S. Pat. No. 5,898,836.
`
`FIELD OF THE INVENTION
`
This invention relates to software retrieval tools for
networks, and more particularly to improved accuracy for a
change-detection tool for the Internet.
`
`BACKGROUND OF THE INVENTION
`
`Fast, inexpensive distribution of information has been
`promoted by the widespread acceptance of the Internet and
`especially the world-wide-web (www). This information can
`be easily updated or changed. However, users may not be
`aware of the changes. Unless the user frequently re-reads the
`information, many days or weeks may pass before users
`realize that the information has changed.
`Documents on the web are known as web pages. These
`web pages are frequently changed. Users often wish to know
`when changes are made to certain web pages. The parent
`application disclosed a change-detection tool that allows
`users to register web pages. Each registered web page is
`periodically fetched and compared to a stored checksum or
`signature for the registered page to determine if a change has
`occurred. When a change is detected, the user is notified by
`e-mail. The change-detection tool of the parent application
allows users to select portions of a web-page document for
`change detection while other portions are ignored.
`Such a change-detection tool as described in detail in the
`parent application is indeed useful and has gained popularity
`with Internet users, as several hundred thousand web pages
`have been registered. For example, patent professionals can
`register the federal regulations and procedures (37 C.F.R.
and the M.P.E.P.) posted at the PTO's web site and be
`notified when any changes are made. The change-detection
`tool is currently free for public use at the www.netmind.com
`web site.
`
FIG. 1 illustrates a web page registered for change detection.
This web page contains a copy of one or more sections of the
Code of Federal Regulations, specifically the patent office
regulations at 37 C.F.R. § 1.x. A patent attorney registers this
`web page that contains a copy of the patent rules at 37 C.F.R.
`§ 1.8 to 1.136. The rules may be located on one large web
`page, or spread across many web pages that are each
`registered.
`The user registers this page by using a user-interface for
`the change-detection tool. The user enters his e-mail address
`and the URL for the web page. The change-detection tool
`fetches a copy of this page and generates a signature. The
`signature is a highly-condensed data word that is produced
`by using a cyclical-redundancy-check (CRC) or other algo-
`rithm that produces unique outputs. For the initial page of
`FIG. 1, the signature 5A7 (hex) is generated and stored in a
`database with the user’s e-mail address and the web page’s
`URL.
`
`The change-detection tool periodically fetches this web
`page to see if a change has occurred. A new signature is
`generated for the re-fetched page, and the new signature is
`compared with the old signature stored in the database. A
`mismatch indicates that a change is detected.
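
The fetch-and-compare cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool's actual code: it assumes CRC-32 from the standard `zlib` module as the signature algorithm (the text says only that a CRC or other algorithm is used), and the names `signature` and `has_changed` are hypothetical. The three-digit signatures in the figures are shortened for illustration; CRC-32 yields eight hex digits.

```python
import zlib


def signature(page: bytes) -> str:
    """Condense a fetched page into an 8-hex-digit CRC-32 signature."""
    return format(zlib.crc32(page) & 0xFFFFFFFF, "08X")


def has_changed(stored_signature: str, fresh_page: bytes) -> bool:
    """A mismatch between the stored and fresh signatures means a change."""
    return signature(fresh_page) != stored_signature


# Registration: store only the signature, not the page itself.
registered_page = b"37 C.F.R. 1.8 APPLICANT SHALL ..."
stored = signature(registered_page)

# Later re-fetches: an identical page matches, an edited page does not.
unchanged = has_changed(stored, registered_page)
edited = has_changed(stored, registered_page.replace(b"1.8", b"1.9"))
```

Note that a CRC is not collision-free, so two different pages can occasionally share a signature; a longer checksum reduces that risk.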
`FIG. 2 shows an updated web page that has a different
`signature that triggers a change notification. Occasionally,
`the patent regulations are updated. Web pages containing a
`copy of these regulations are eventually updated to reflect
`the changed rules. For example, FIG. 2 shows that rule 37
`C.F.R. § 1.62 has been deleted while rule 37 C.F.R. § 1.136
`has been updated, as they were in late 1997.
`The change detection tool re-fetches each registered page
`every few hours or days. Once the rules on the web page are
`updated, a different signature is generated for the updated
`web page. In FIG. 2, the new signature of D6F is generated,
`which does not match the old signature of 5A7 stored in the
`change-detection tool’s database. Thus a change is detected.
`The new signature is stored in the database and the patent
`attorney user is notified by e-mail.
`The user is notified within a few days after the web page
`is updated, allowing the patent attorney to rest easy, not
`having to frequently surf over to the rules page to see if any
`changes have been made.
`False Change Detections—FIG. 3
`The change-detection tool is only useful when it saves
`time and effort for the user. One problem is that false
`notifications can be made, annoying the user with changes
`that are not relevant. The inventors have discovered that the
`world-wide-web itself can trigger false change detections.
`These false detections should be filtered out.
`
`FIG. 3 shows a false change detection caused by a
`non-relevant change in an Internet server. Web pages are
`stored on computer servers. These servers are sometimes
disconnected from the Internet for maintenance such as
program or hardware updates, or because of security threats
such as hacker attacks.
`
`The web server containing the web page with the 37
`C.F.R. patent rules is disconnected from the Internet for
`maintenance. Often such maintenance occurs during low-
`usage times such as weekend nights. Most users do not
`notice that the web pages are offline during these hours.
`Unfortunately, automated software programs such as the
`change-detection tool continue to operate during these
`times, and may perform more fetching during off hours since
`network response times decrease. The change-detection tool
`may find that the web page is not available.
`When no connection can be made with the server, the
`change-detection tool can simply skip the web page until a
`later time. Since TCP/IP packets are not returned from the
`server, the change-detection tool can easily determine that
`the page is not available due to a network problem. The
`change-detection tool does not notify the user, but instead
`tries again later.
`Completely disconnecting servers from the Internet is
`frowned upon since users do not know what is causing the
`errors. Thus many web sites use another server to return a
`message page to the user when the server is down for
`maintenance. This message or error page lets the user know
`that the web page is only temporarily unavailable and the
`user should try back later.
`The error page of FIG. 3 is returned when a user tries to
`retrieve the web page containing the 37 C.F.R. patent rules.
`This same error page is returned to change-detection soft-
`ware trying to fetch the web page. However, since no packet
or network error is signaled, the change-detection tool
`assumes that the error page is the registered web page and
`generates a new signature. The new signature for the error
`page is EB9, which does not match the old signature (D6F)
`that was stored in the database after the last change was
`detected.
`
`The change-detection tool then generates a change notice
`that is emailed to the user. The next day when the patent
`attorney reads the change notice, he browses over to the web
`page. By now the server is back up, showing the same web
`page as in FIG. 2. Although the user reads the web page
`carefully, he cannot find any changes.
A few days later, the change-detection tool again retrieves
the web page and generates the new signature. Since this
new signature does not match the error page's signature that
was stored, another change notice is generated. The user
again looks at the web page but finds no changes. At this
point, after receiving two false change notices, the user
cancels his change-detection service to avoid getting the
false notifications.
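
The sequence just described can be replayed in a short sketch. It assumes a tool that keeps only the last signature per page; the function name `count_notices_single_signature` is hypothetical, and the signatures are the illustrative values from FIGS. 2 and 3.

```python
from typing import Iterable


def count_notices_single_signature(stored_sig: str, fetched_sigs: Iterable[str]) -> int:
    """Count change notices sent by a tool that stores only the last signature."""
    notices = 0
    for sig in fetched_sigs:
        if sig != stored_sig:
            notices += 1      # any mismatch is reported as a change
            stored_sig = sig  # the new signature overwrites the old one
    return notices


# D6F is the real page; EB9 is the temporary maintenance error page.
# The page goes down once and comes back, yet two notices are sent.
false_notices = count_notices_single_signature("D6F", ["EB9", "D6F"])
```

Neither notice corresponds to new content: the first reports the error page and the second reports the return of a page the user has already seen.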
`HTML Headers—FIG. 4
`
`FIG. 4 shows a dynamic web page with HTML headers.
A content-length HTML header <CONTENT_LEN> specifies
the length of the web-page document in bytes. A
last-modified header <LAST_MODIFIED> contains a date
`and time of the last modification of the web page. Dynamic
`content 15 is frequently updated, often by a database or
`search-engine server. Stock quotes are an example of
`dynamic content that appears in a dynamic frame. Dynamic
`images or JAVA programs are often used as dynamic con-
`tent.
`
`Some change-detection software relies solely on the last-
`modified header in the HTTP response from a Web server.
`For example, Microsoft Internet Explorer 4.0 has a feature
`called “Subscriptions” under the “Favorites” menu, which
`detects changes in web pages. This feature relies on the
`last-modified header to determine when a web page has
`changed. Unfortunately, many web pages do not return a
`last-modified header, and Internet Explorer generates false
`change notifications each time it checks a web page lacking
`the last-modified header.
Not all documents contain a last-modified header. The
last-modified header may or may not reflect changes in
dynamic content 15. Some web servers update the last-
modified header only when the static content changes. Thus
change notifications are not generated when the dynamic
content changes. This may be undesirable when the dynamic
content is what the user desires to have checked. For
example, when the user wants to search newsgroups for the
appearance of a specific product or company name, the
result of the search is dynamic content. If the web server
does not return a Last-Modified header, the user is notified
by an unsophisticated change-detection tool every time the
search result is checked. If the web server returns a Last-
Modified header based only on the static content, the user is
not notified when the results of the search (the dynamic
content) change.
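
Both failure modes of a header-only detector can be seen in a small sketch. This is a hypothetical policy written for illustration, not the behavior of any particular browser or product; the function name and the header strings are assumptions.

```python
from typing import Optional


def notify_on_last_modified(stored: Optional[str], header: Optional[str]) -> bool:
    """Naive policy that looks only at the Last-Modified header.

    A missing header is treated as a change on every check; otherwise a
    change is reported only when the header value differs from the stored one.
    """
    if header is None:
        return True
    return header != stored


# Failure mode 1: the server sends no header, so every check notifies.
always_notified = notify_on_last_modified(None, None)

# Failure mode 2: the header tracks static content only, so a change in the
# dynamic search results goes unreported because the header is unchanged.
missed_change = not notify_on_last_modified("3.15.98 13:42", "3.15.98 13:42")
```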
`The last-modified header may also be updated when the
HTML headers are changed, but not the visible document.
`This can also cause false changes to be reported. Even if the
`change detection tool is intelligent enough to analyze the
`content for changes, rather than relying solely on the Last-
`Modified header, false changes can be reported when the
`server returns only a portion of the web page due to some
`kind of error. The inventors, with the benefit of the experi-
`ence involved in running a change detection tool for hun-
`dreds of thousands of different documents on the Internet,
`have recognized these problems. Without this level of expe-
`rience these problems are not easily recognized.
What is desired is an improved automated change-detection
tool that detects when changes occur to a registered
document on the Internet. It is desired that the user not
have to check the web page to see if any changes have
occurred. A change-detection tool adapted to filter out false
change notifications is desired. A change-detection tool that
does not report changes that are not relevant to the user is
desirable. Identification of temporary error pages is desirable
so that they are not reported to the user. A more
sophisticated and more robust change-detection tool is
desired.
`
`SUMMARY OF THE INVENTION
`
`A change-detection web server detects unique changes in
`web pages. A network connection transmits and receives
`packets from a remote client and a remote web-page server.
`A responder is coupled to the network connection. It com-
`municates with the remote client. The responder registers a
`web page for change detection by receiving from the remote
`client a uniform-resource-locator (URL) identifying the web
`page. The responder fetches the web page from the remote
`web-page server.
`A database is coupled to the responder. It receives the
`URL from the responder when the web page is registered by
`the remote client. The database stores a plurality of records
`each containing a URL.
A history table in each of the records in the database stores
`a most-recent signature and a plurality of older-version
`signatures for a registered web page identified by the URL.
`The older-version signatures are condensed checksums for
`earlier versions of the registered web page previously
`fetched by the change-detection web server. The most-recent
`signature is a condensed checksum for a most-recently-
`fetched copy of the registered web page. A periodic minder
`is coupled to the database and the network connection. It
`periodically re-fetches the web page from the remote web-
`page server by transmitting the URL from the database to the
`network connection. The periodic minder receives a fresh
`copy of the web page from the remote web-page server. The
`periodic minder generates a new signature from the fresh
`copy of the web page. The periodic minder notifies the
`remote client of a unique change when the new signature
`does not match the most-recent signature and does not match
`any of the older-version signatures in the record.
`Thus the unique change in the web page is detected by
`comparing the new signature to the most-recent signature
`and to older-version signatures for the web page. Changes in
`the web page that are not unique but match an earlier version
`of the web page do not notify the remote client.
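
The three possible outcomes of a re-fetch can be written as a small decision function. This is a sketch of the comparison as summarized above; the function names and the outcome labels are hypothetical.

```python
from typing import Sequence


def classify_fetch(new_sig: str, most_recent_sig: str, older_sigs: Sequence[str]) -> str:
    """Classify a freshly generated signature against the stored history."""
    if new_sig == most_recent_sig:
        return "no change"
    if new_sig in older_sigs:
        return "non-unique change"  # the page reverted to an earlier version
    return "unique change"


def should_notify(new_sig: str, most_recent_sig: str, older_sigs: Sequence[str]) -> bool:
    """Only a unique change triggers a notification to the remote client."""
    return classify_fetch(new_sig, most_recent_sig, older_sigs) == "unique change"


same_page = classify_fetch("D6F", "D6F", ["5A7"])
reverted_page = classify_fetch("5A7", "D6F", ["5A7"])
new_page = classify_fetch("EB9", "D6F", ["5A7"])
```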
`In further aspects the database does not store the web
`page. The database stores the most-recent signature and
`earlier-version signatures for the web page. Thus storage
`requirements for the database are reduced by archiving the
`most-recent signature and not entire web pages.
`In still further aspects a permanent history table stores
`new signatures that match one of the older-version signa-
`tures. Thus older-version signatures that are matched are
`copied to the permanent history table.
`In other aspects the history table is a temporary history
`table organized as a first-in-first-out stack. A least-recent
`signature in the history table is replaced by a new signature
`when notification is made. Thus signatures in the permanent
`history table are not deleted by new signatures written to the
`temporary history table.
`In further aspects the older-version signatures are stored
`in both the permanent history table and the history table. The
`periodic minder compares the new signature to older-version
`signatures from both the history table and from the perma-
`nent history table.
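
One possible reading of the two-table arrangement is sketched below. It is an assumption-laden illustration, not the claimed implementation: the class name `SignatureHistory`, the default temporary-table size of three, and the use of `collections.deque` and a `set` are all choices made here. A bounded deque discards its least-recent entry when a new signature is pushed, while signatures promoted to the permanent set are never evicted.

```python
from collections import deque


class SignatureHistory:
    """Temporary first-in-first-out history plus a permanent table."""

    def __init__(self, temp_size: int = 3) -> None:
        self.temp = deque(maxlen=temp_size)  # newest signature at index 0
        self.perm = set()                    # recurring signatures, never evicted

    def check(self, new_sig: str) -> bool:
        """Return True when the signature is unique and a notice should be sent."""
        if new_sig in self.perm:
            return False                     # recurring version, already known
        if self.temp and new_sig == self.temp[0]:
            return False                     # same as most-recent: no change
        if new_sig in self.temp:
            self.temp.remove(new_sig)        # matched an older version:
            self.perm.add(new_sig)           # promote it to the permanent table
            return False
        self.temp.appendleft(new_sig)        # unique: oldest entry drops off if full
        return True


history = SignatureHistory()
results = [history.check(s) for s in ["5A7", "D6F", "EB9", "D6F", "EB9", "D6F"]]
```

In this run the first three signatures are each new and produce notices; the later alternation between `D6F` and `EB9` produces none.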
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 illustrates a web page registered for change detec-
`tion.
`
`FIG. 2 shows an updated web page that has a different
`signature that triggers a change notification.
`FIG. 3 shows a false change detection caused by a
`non-relevant change in an Internet server.
`FIG. 4 shows a dynamic web page with HTML headers.
`FIG. 5 is a diagram of a change detection tool on a server
`on the Internet.
`
`FIG. 6 shows a record with a history table of past
`signatures in the database for the change-detection web
`server.
`
`FIGS. 7A—7D illustrate how a history table of signatures
`solves the error-page problem of FIGS. 1—3.
`FIGS. 8A, 8B are a flowchart for the periodic minder
`using history tables and last-modified headers to avoid
`non-relevant change notifications.
`FIG. 9 is a flowchart of notification once a unique change
`is detected.
`
`FIG. 10 shows a history table with both temporary and
`permanent signatures.
`FIG. 11 illustrates how the permanent history table is
`loaded for detected changes when any of the older signatures
`in the temporary history table are matched.
`FIG. 12 shows a modification for loading the permanent
`history table when a non-unique change is detected.
`FIG. 13 shows a change-detection record that tracks a
`number of times that change is detected for a registered web
`page.
`
`FIG. 14 is a flowchart for a frequency-check routine that
`stops signature comparison when too many changes are
`being detected for a web page.
`FIG. 15 is a flowchart for change detection that uses
`signatures and last-modified headers.
`FIG. 16 shows re-fetching when the content length is
`incorrect.
`
`DETAILED DESCRIPTION
`
The present invention relates to an improvement in
change-detection software tools. The following description
`is presented to enable one of ordinary skill in the art to make
`and use the invention as provided in the context of a
`particular application and its requirements. Various modifi-
`cations to the preferred embodiment will be apparent to
`those with skill in the art, and the general principles defined
`herein may be applied to other embodiments. Therefore, the
`present invention is not intended to be limited to the par-
`ticular embodiments shown and described, but
is to be accorded the widest scope consistent with the principles and
`novel features herein disclosed.
`
`Overview of Change-detection Web Server—FIG. 5
`FIG. 5 is a diagram of a change detection tool on a server
`on the Internet. The user operates client 14 from a remote
`site on Internet 10. The user typically is operating a browser
`application, such as Netscape’s Navigator or Microsoft’s
`Internet Explorer, or a browser mini-application such as an
`Internet toolbar in a larger program. Client 14 communicates
`through Internet 10 by sending and receiving TCP/IP pack-
`ets to establish connections with remote servers, typically
`using the hypertext transfer protocol (HTTP) of the world-
`wide web.
`
Client 14 retrieves web pages or files from document
`server 12 through Internet 10. These web pages are identi-
`fied by a unique URL (uniform resource locator) which
`specifies a document file containing the text and graphics of
`a desired web page. Often additional files are retrieved when
a document is retrieved. The “document” returned from
`document server 12 to client 14 is thus a composite docu-
`ment composed of several files of text, graphics, and perhaps
`sound or animation. The physical appearance of the web
`page on the user’s browser on client 14 is specified by layout
`information embedded in non-displayed headers, as is well-
`known for HTML (hyper-text markup language) documents.
Often these HTML documents contain headers with URL's
that specify other web pages, perhaps on other web servers
`which may be physically located in different cities or coun-
`tries. These headers create hyper-links to these other web
`servers allowing the user to quickly jump to other servers.
`These hyper-links form a complex web of linked servers
`across the world; hence the name “world-wide web”.
`The user may frequently retrieve files from remote docu-
`ment server 12. Often the same file is retrieved. The user
may only be interested in differences in the file, or learning
`when the file is updated, such as when a new product or
`service is announced. The inventors have developed a soft-
`ware tool that automatically retrieves files and compares the
`retrieved files to an archived signature of the file to deter-
`mine if a change in the file has occurred. When a change is
`detected, the user is notified by an electronic mail message
(e-mail). A copy of the new file may be attached to the e-mail
`notification, allowing the user to review the changes.
`Rather than archive the source files from remote docu-
`ment server 12, the invention archives a checksum CRC or
`signature of the source files. These signatures and the e-mail
`address of the user are stored in database 16 of change-
`detection server 20. Comparison is made of the stored or
`archived signature of the document and a fresh signature of
`the currently-available document. The signature is a con-
`densed checksum or fingerprint of the document. Any
`change to the document changes the signature.
`Change-detection server 20 performs three basic func-
`tions:
`
`1. Register (setup) a web page document for change
`detection.
`
`2. Periodically re-fetch the document and compare for
changes.
`3. E-mail a change notice to the registered user if a change
`is detected.
`
`Change-detection server 20 contains three basic compo-
`nents. Database 16 stores the archive of signatures for
`registered web-page documents. The URL identifying the
`web page and the user’s e-mail address are also stored with
`the archived signature. Responder 24 communicates with
`the user at client 14 to setup or register a web page document
`for change detection. Minder 22 periodically fetches regis-
`tered documents from document server 12 through Internet
`10. Minder 22 compares the archived signature in database
`16 to a new signature of the fetched document to determine
`if a change has occurred. When a change is detected, minder
22 sends a notice to the user at client 14 that the document
has changed.
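
The division of work among the database, responder, and minder can be sketched as a single small class. This is a hypothetical skeleton for illustration: the class and method names, the injected `fetch` and `send_mail` callables, the example URL and e-mail address, and the use of CRC-32 are all assumptions, and it keeps only one archived signature per page, as in the basic tool.

```python
import zlib
from typing import Callable, Dict, List, Tuple


def signature(doc: bytes) -> str:
    """Condensed checksum of a document (CRC-32 is assumed here)."""
    return format(zlib.crc32(doc) & 0xFFFFFFFF, "08X")


class ChangeDetectionServer:
    def __init__(self, fetch: Callable[[str], bytes],
                 send_mail: Callable[[str, str], None]) -> None:
        self.fetch = fetch
        self.send_mail = send_mail
        # Database: URL -> (user e-mail address, archived signature).
        self.database: Dict[str, Tuple[str, str]] = {}

    def register(self, url: str, email: str) -> None:
        """Responder: set up a web page for change detection."""
        self.database[url] = (email, signature(self.fetch(url)))

    def mind(self) -> None:
        """Minder: re-fetch every registered page and notify on a mismatch."""
        for url, (email, archived) in list(self.database.items()):
            fresh = signature(self.fetch(url))
            if fresh != archived:
                self.database[url] = (email, fresh)
                self.send_mail(email, url)


# Stand-ins for the document server and the mail system.
pages = {"http://www.example.com/rules": b"v1"}
sent: List[Tuple[str, str]] = []
server = ChangeDetectionServer(lambda url: pages[url],
                               lambda email, url: sent.append((email, url)))
server.register("http://www.example.com/rules", "user@example.com")
server.mind()                                    # page unchanged: no mail
pages["http://www.example.com/rules"] = b"v2"
server.mind()                                    # page changed: one mail
```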
`Change-Detection of Web Pages
`This change-detection tool is disclosed in the co-pending
`parent application, “Change-Detection Tool Indicating
`Degree and Location of Change of Internet Documents by
`Comparison of CRC Signatures”, U.S. Ser. No. 08/783,625,
`filed Jan. 14, 1997, hereby incorporated by reference. A
`basic change-detection tool without the improved methods
`using the signature history tables has been available for free
public use at the inventors' web site, www.netmind.com, for
`more than a year before the filing date of the present
`application. The existing “URL-minder” has over 700,000
`documents or URL’s registered for 3.8 million users.
`
`Unique-content, not Mere Change, is Detected
`The inventors have realized that change detection must be
`accurate to be useful. False change detections must be
`avoided and non-relevant changes ignored. Often, the user
`does not want to be notified of all changes, but rather only
`for new content. Thus the inventors notify the user when
`“unique” content is detected; not when a mere “change” to
`old content is detected.
`Rather than just store the last signature, the inventors use
`a table of several older signatures. When any of the older
`signatures match the web page, the content is not unique
`even if it has changed since the last check. The web page
`may have reverted back to an older version.
`Previous change-detection tools generate notifications for
`any change, including changes back to an older version.
With the improvement, the user is not notified for the
`older-version change, even though the web page has
`changed. It is likely that the user has already seen the older
`version of the web page. Only unique web pages that are
`unlike any previous versions cause the user to be notified.
`Thus the improved invention is not a “change”-detection
`tool, but a “Unique-content” tool.
`Database Records Include History Table of Signatures—
`FIG. 6
`
`FIG. 6 shows a record with a history table of past
`signatures in the database for the change-detection web
`server. Database 16 of FIG. 5 contains many such records,
`one for each web page or URL. Multiple e-mail addresses
`can be stored for each web page by using a relational
`(multi-table) database, with a separate table linking e-mail
`addresses to registered web pages.
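One way to picture the relational (multi-table) arrangement is the sketch below, which uses Python's built-in `sqlite3` module. The table names `pages` and `registrations`, the column names, and the helper functions are all assumptions made for illustration; the text only says that a separate table links e-mail addresses to registered web pages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (
        page_id       INTEGER PRIMARY KEY,
        url           TEXT UNIQUE NOT NULL,
        length        INTEGER,
        last_modified TEXT
    );
    -- Linking table: one row per (page, e-mail address) registration.
    CREATE TABLE registrations (
        page_id INTEGER NOT NULL REFERENCES pages(page_id),
        email   TEXT NOT NULL,
        PRIMARY KEY (page_id, email)
    );
""")

def register(url: str, email: str) -> None:
    """Register an e-mail address for a URL, creating the page row if needed."""
    conn.execute("INSERT OR IGNORE INTO pages (url) VALUES (?)", (url,))
    (page_id,) = conn.execute(
        "SELECT page_id FROM pages WHERE url = ?", (url,)).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO registrations VALUES (?, ?)", (page_id, email))

def emails_for(url: str) -> list[str]:
    """Return every e-mail address registered for a URL, sorted."""
    rows = conn.execute(
        "SELECT r.email FROM registrations r "
        "JOIN pages p ON r.page_id = p.page_id "
        "WHERE p.url = ? ORDER BY r.email", (url,))
    return [email for (email,) in rows]

register("http://example.com/news", "bob@example.com")
register("http://example.com/news", "alice@example.com")
register("http://example.com/jobs", "alice@example.com")
print(emails_for("http://example.com/news"))
```

With this layout the page is stored once however many users register for it, so a single fetch and signature comparison can serve all of them.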
`Each record has one or more e-mail addresses 32. When a
`unique change is detected, a notification message is sent to
`each e-mail address 32. URL 36 is the world-wide-web address
`that is used to locate the web page. This URL is translated
`to an IP address of a server machine by Internet directories
`when the page is fetched. Length field 34 stores the length
`of the web page and can be used to ensure that the entire web
`page has been fetched.
`Last-modified field 38 contains a copy of the last-
`modified header from the web server for the particular
`web-page. Although the change-detection tool is primarily
`signature-based, improved detection results when the last-
`modified header in the newly-fetched document is compared
`to last-modified field 38.
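The passage does not spell out how the header comparison is combined with the signatures. One plausible reading, assumed here purely for illustration, is that an identical last-modified header is taken as a hint that the page is unchanged, while a missing or different header falls through to the signature comparison. The function name `should_compare_signatures` is hypothetical.

```python
from typing import Optional

def should_compare_signatures(stored_last_modified: Optional[str],
                              fetched_last_modified: Optional[str]) -> bool:
    """Decide whether the signature comparison is still needed.

    Assumption: when both header values are present and identical, the
    page is treated as unchanged and the signature step is skipped.
    """
    if stored_last_modified is None or fetched_last_modified is None:
        return True  # header unavailable: rely on signatures alone
    return stored_last_modified != fetched_last_modified

stored = "Tue, 19 May 1998 08:00:00 GMT"
same = should_compare_signatures(stored, stored)
newer = should_compare_signatures(stored, "Wed, 20 May 1998 09:30:00 GMT")
missing = should_compare_signatures(stored, None)
print(same, newer, missing)
```

Because many servers omit the header or update it without changing the content, the signature comparison remains the deciding step in this sketch.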
`
`Rather than store one signature for the most-recent ver-
`sion of the web page, a table of signatures for many older
`versions of the web page is stored. History table 40 contains
`signatures for the three most-recent versions of the web
`page. Signature 2B9 (hex) is the most-recent signature for
`the web page, and the change-detection tool of the parent
`application stores only this signature, or multiple signatures
`for each section of this one most-recent version of the web
`page.
`History table 40 also stores signature D6F, for the next-
`to-last version of the web page, and signature 5A7 for the
`next earlier version of the web page. Thus three signatures
`for the last three versions of the web page are stored in
`history table 40. If a newly-fetched web page changes to
`either of the two earlier versions, a notification is not made,
`even though a change occurred.
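The FIG. 6 example can be walked through with a fixed-size table. The sketch below is not taken from the patent: it assumes that the table is left untouched when a fetched page matches a stored signature, and that a new signature pushes out the oldest one. The function name `check` is hypothetical.

```python
from collections import deque

# Signatures from FIG. 6, most recent first: 2B9, D6F, 5A7 (hex).
history = deque([0x2B9, 0xD6F, 0x5A7], maxlen=3)

def check(new_signature: int) -> bool:
    """Return True when the user should be notified."""
    if new_signature in history:
        # Unchanged page, or a revert to an earlier version: stay silent.
        return False
    # Unique content: record it; the oldest entry drops off the far end.
    history.appendleft(new_signature)
    return True

# Revert to D6F, revert to 5A7, then a signature never seen before.
results = [check(0xD6F), check(0x5A7), check(0x123)]
print(results)  # [False, False, True]
```

Under these assumptions the table now holds 123, 2B9 and D6F. Signature 5A7 has been evicted, so a later revert to that version would again produce a notification; a larger table makes such repeats less likely.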
`The number of signatures stored in history table 40 can
`vary; the three signatures of FIG. 6 are just for illustration.
`The size of history table 40 does not have to be fixed; it can
`vary under software control according to available storage in
`the database. The size of history table 40 could be adjusted
`to store all signatures in the last month or year rather than a
`fixed number of signatures.
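A time-based table size could look like the following sketch, in which each signature is stored with the date it was first seen and entries older than a retention window are dropped. The `(timestamp, signature)` pairing, the function name `prune`, and the 30-day window are assumptions for illustration only.

```python
from datetime import datetime, timedelta

def prune(history, now, max_age):
    """Keep only (timestamp, signature) entries no older than max_age."""
    cutoff = now - max_age
    return [(seen, sig) for (seen, sig) in history if seen >= cutoff]

history = [
    (datetime(1998, 5, 15), 0x2B9),
    (datetime(1998, 5, 1), 0xD6F),
    (datetime(1998, 3, 1), 0x5A7),
]

# Retain roughly the last month of signatures as of May 20, 1998.
recent = prune(history, now=datetime(1998, 5, 20), max_age=timedelta(days=30))
print([hex(sig) for (_, sig) in recent])
```

Here the two May entries survive and the March entry is discarded. Widening `max_age` to a year would keep all three, at the cost of more storage per record.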
`
`History
