United States Patent [19]
Kurtzman, II

US006044376A
[11] Patent Number: 6,044,376
[45] Date of Patent: Mar. 28, 2000

[54] CONTENT STREAM ANALYSIS
[75] Inventor: Stephen J. Kurtzman, II, San Jose, Calif.
[73] Assignee: IMGIS, Inc., Cupertino, Calif.
[21] Appl. No.: 08/847,778
[22] Filed: Apr. 24, 1997
[51] Int. Cl.7: G06F 17/30
[52] U.S. Cl.: 707/102; 707/10; 707/100; 707/103; 345/327
[58] Field of Search: 707/4, 10, 100, 102, 5, 6, 14; 345/327; 705/10; 395/226; 455/42

[56] References Cited

U.S. PATENT DOCUMENTS
5,278,980    1/1994   Pedersen et al.       707/4
5,692,132    11/1997  Hogan                 395/227
5,696,965    12/1997  Dedrick               707/10
5,710,887    1/1998   Chelliah et al.       395/226
5,712,979    1/1998   Graber et al.         395/200.11
5,717,860    2/1998   Graber et al.         395/200.12
5,717,923    2/1998   Dedrick               707/102
5,721,827    2/1998   Logan et al.          395/200.47
5,724,424    3/1998   Gifford               380/24
5,727,156    3/1998   Herr-Hoyman et al.    395/200.49
5,737,619    4/1998   Judson                395/761
5,740,549    4/1998   Reilly                395/214
5,751,956    5/1998   Kirsch                395/200.33
5,757,917    5/1998   Rose et al.           380/25
5,781,904    7/1998   Oren et al.           707/100
5,794,235    8/1998   Chess                 707/5
5,802,515    9/1998   Adar et al.           707/5
5,835,087    11/1998  Herz et al.           345/327
5,848,396    12/1998  Gerace                705/10
5,948,061    9/1999   Merriman et al.       709/219

FOREIGN PATENT DOCUMENTS
0 749 081 A1   12/1996  European Pat. Off.   G06F 17/60
WO 97/21183    6/1997   WIPO                 G06F 151/00

OTHER PUBLICATIONS
Article by Ellis Booker entitled: “Seeing a Gap, A Palo Alto Startup Will Debut Advertising Server for the Net” published by Web Week, Vol. 2, Issue 2 in Feb. 1996.
Article by Bob Metcalfe entitled: “From the Ether”, InfoWorld, V18 issue 3, Aug. 12, 1996.
NetGravity Announces AdServer 2.0, Oct. 14, 1996, published at http://www.netgravity.com
Article titled “Internet access: Internet marketing revolution begins in the U.S. this Sep.”: Hyper Net Offering on Dec. 1996 EDGE: Work-group Computer Report, V7 N316.
Article by Youji Kohda et al. entitled: “Ubiquitous advertising on the WWW: Merging advertisement on the browser” published by Computer Networks and ISDN Systems, Vol. 28, No. 11, May 1996, pp. 1493-1499.
Declaration of Dwight Allen Merriman submitted with 37 CFR 1.131 petition during prosecution of U.S. Patent No. 5,948,061.

Primary Examiner: Wayne Amsbury
Assistant Examiner: Thuy Pardo
Attorney, Agent, or Firm: Fenwick & West LLP

[57] ABSTRACT
Content stream analysis is a user profiling technique that generates a user profile based on the content files selected and viewed by a user. This user profile can then be used to help select an advertisement or other media presentation to be shown to the user.

10 Claims, 5 Drawing Sheets

[Front-page drawing: flowchart with the steps CREATE AD FEATURE VECTORS; CREATE CONTENT FEATURE VECTORS; DETERMINE SIMILARITY MEASURES; MULTIPLY SIMILARITY MEASURES BY DECAY FACTOR; SUM SIMILARITY MEASURES.]

U.S. Patent    Mar. 28, 2000    Sheet 1 of 5    6,044,376
[Sheet 1 of 5: drawing text not recoverable.]

[Sheet 2 of 5] FIG. 2. USER SYSTEM: USER 200; DISPLAY 202; INPUT DEVICES 204; APPLICATION MEMORY (BROWSER) 206; CPU 208; MODEM/NETWORK 210; WORKING MEMORY. WWW 220. WEBSITE SYSTEM: WEBSITE SERVER 110; WEBSITE CORPUS 230; AFFINITY SERVER 100; AD BANK 120; WORKING MEMORY 240; APPLICATION MEMORY 242; AFFINITY SERVER INSTRUCTIONS 246.

[Sheet 3 of 5] FIG. 3: WEBSITE SERVER 110; AFFINITY SERVER INSTRUCTIONS 246; HTTP, TCP/IP 330; SOCKET 340; AFFINITY 100; CONTENT STREAM ANALYSIS 170; NETWORK HARDWARE 350; USER 200. FIG. 4: USER 200; CURRENT PAGE 410; DYNAMICALLY GENERATED PAGE 430; WEBSITE SERVER 110; AFFINITY SERVER 100; NEW PAGE 420.

[Sheet 4 of 5: drawing text not recoverable.]

[Sheet 5 of 5] FIG. 7. Left branch: CONVERT ADS INTO INDIVIDUAL WORDS (702); DISCARD HTML FORMATTING TAGS (704); DISCARD STOP WORDS (706); APPLY STEMMING PROCEDURE (708); DETERMINE FREQUENCIES OF EACH WORD/WORD-STEM (710); CREATE MULTI-DIMENSIONAL VECTORS, FREQUENCY PAIRS (712). Right branch: RECEIVE SITE CORPUS (720); CONVERT CONTENT FILES INTO INDIVIDUAL WORDS (722); DISCARD HTML FORMATTING TAGS (724); DISCARD STOP WORDS (726); APPLY STEMMING PROCEDURE (728); DETERMINE # OF FILES EACH WORD/WORD-STEM OCCURS (730). Decision: MODIFY AD VECTOR? If so: USE WORD FREQUENCY STATISTICS TO MODIFY AD FEATURE VECTOR.

CONTENT STREAM ANALYSIS

BACKGROUND OF THE INVENTION

1. Field of the Invention
This invention relates to a method of selecting an advertisement to be shown to a user based on the content files selected and viewed by a user. More particularly, this invention relates to determining an affinity measure between an advertisement and a set of content files.
2. Background of the Invention
Product advertisements in media such as newspapers and television have the advantage of reaching many people. At the same time, these forms of advertisement are indiscriminate and may reach many people who are not interested in the product advertised.
An advertisement is more effective when it can be targeted to a specific market that is more likely to be interested in the product advertised. For example, advertisements for fishing equipment will be more effective when placed in a fishing magazine.
On the World-Wide Web (WWW), advertisers can target specific markets with more discrimination than in other media. The manner in which content is presented on the WWW means that advertisers can reach increasingly well-defined segments of the market. For example, a high percentage of people who access a stock quotes WWW page may be interested in a stock broker. A stock broker who places an advertisement on this WWW page may reach a smaller group of people, but a much higher percentage of this group will be potential customers. This is in stark contrast to other media such as newspaper and television, in which the target market may only be a small percentage of the total market reached.
Other media, including emerging and developing technologies such as on-demand television, will also give advertisers similar ability to target specific markets.
To take advantage of this ability to target specific markets on the WWW, advertisers often estimate a user's interests using a variety of profiling techniques. These profiling techniques can help an advertiser to select an advertisement to present to the user. Current profiling techniques use a combination of demographic, geographic, psychographic, collaborative filtering, digital identification, and hypertext transfer protocol (HTTP) information. However, these current techniques have met with only limited success.
What is needed is a more sophisticated profiling technique for generating a more useful user profile. This more useful user profile would be valuable in selecting an advertisement to be shown to the user.

OBJECTS AND SUMMARY OF THE INVENTION

Accordingly, an object of the invention is to provide a more sophisticated profiling technique for generating a more useful user profile.
A further object of the invention is to use this user profile to help select an advertisement or other media presentation to be shown to the user.
These and other objects of the invention are achieved by using the actual content files accessed and viewed by the user. These content files may be used alone or in combination with the other elements known in the prior art to help select an advertisement or other media presentation to be shown to the user. This selection process is performed by an affinity server.
First, the affinity server receives both the content files and the available advertisements. Second, the advertisements are compactly represented as advertisement feature vectors. In one example, advertisement feature vectors are multi-dimensional vectors comprised of individual words mapped to their frequency of occurrence. The advertisement feature vectors may be modified by weighting the importance of each word in the context of the website corpus.
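As a rough illustration of weighting an advertisement feature vector against the website corpus, the sketch below scales each word frequency by a generic inverse document frequency, log(total files / files containing the word). This generic weighting is an assumption standing in for the measure given in the detailed description, and the names `weight_by_corpus`, `doc_freq`, and `total_files` are hypothetical.

```python
import math

def weight_by_corpus(ad_vector, doc_freq, total_files):
    """Scale each word frequency by log(total_files / files containing the word).

    Assumption: a generic inverse document frequency, not the patent's formula.
    Words that occur in no corpus file (document frequency 0) get weight 0.
    """
    weighted = {}
    for word, freq in ad_vector.items():
        f = doc_freq.get(word, 0)
        weighted[word] = freq * math.log(total_files / f) if f > 0 else 0.0
    return weighted

# Hypothetical data: word -> frequency in the ad, word -> number of corpus files.
ad_vector = {"fish": 3, "rod": 1, "home": 4}
doc_freq = {"fish": 10, "home": 1000}      # "rod" never occurs in the corpus
weighted = weight_by_corpus(ad_vector, doc_freq, total_files=1000)
```

A word such as "home" that appears in every file of this hypothetical corpus ends up with weight zero, while the rarer "fish" keeps most of its influence.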
Next, a content stream including a sequence of one or more pages selected and viewed by the user and including content data is also compactly represented in a sequence of content feature vectors.
Lastly, the affinity is calculated. This is done by calculating similarity measures between each advertisement and the content stream. An affinity measure is obtained by combining the similarities. This affinity measure is then used to help select an advertisement to be shown to a user.
The method described by this invention can also be applied to user-feedback media other than the WWW, such as broadcast television or interactive television. For example, content streams can be created from the television program content, such as reflected in closed caption text, length of time viewed, and how recently the show was viewed. These content streams can then be used in the method described above to select a commercial to be shown to the viewer. The method described can also target material other than advertising, such as entertainment, education, and instructional materials.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows a conceptual view of content stream analysis.
FIG. 2 shows a schematic of a user and a computer connected to a website server which contains the content stream analysis capability.
FIG. 3 shows a schematic of how the content stream is directed.
FIG. 4 shows a schematic of how content stream analysis is performed for a dynamically generated page.
FIG. 5 shows a flowchart of content stream analysis.
FIG. 6 shows a flowchart of determining an affinity measure.
FIG. 7 shows a flowchart of creating an advertisement feature vector.
FIG. 8 shows a sample advertisement feature vector.
FIG. 9 shows a flowchart of creating a content feature vector.

DETAILED DESCRIPTION

Referring to the figures, FIG. 1 is a conceptual diagram placing content stream analysis in the context of its environment. Requests for advertisements are received by the website server 110. The website server 110 sends these requests to the affinity server 100.
The affinity server 100 receives requests and selects an advertisement. The affinity server 100 has access to an advertisement bank 120. The advertisement bank 120 contains advertisements selected and controlled by the advertisement manager 130.
The affinity server 100 uses a combination of procedures to select an advertisement, including sponsorship categories 140, ad inventory 150, and user profiling 160.
Sponsorship categories 140 include page, keyword, and floating advertisements. Page sponsorship is an advertisement anchored to a location on a particular page, typically in a prominent position. Keyword sponsorship refers to showing an advertisement in response to keywords the user has entered to perform a search or other query. Floating advertisements are not anchored, and may appear anywhere on the page.
Ad inventory 150 uses impression, freshness, time/day, and sequence techniques. Impression refers to the number of times an advertisement is shown to all users. Freshness refers to the number of times an advertisement is shown to a particular user, and how soon the advertisement may be shown again and how many times the advertisement may be shown without losing effectiveness. Time/day techniques refer to selecting an advertisement based on the time and day, e.g. showing a fast food advertisement immediately before lunch time. Sequence techniques refer to showing a sequence of advertisements which form a unified presentation, e.g. a first brand-awareness advertisement, a second product-specific advertisement, and a final where-to-buy advertisement.
User profiling 160 uses content stream analysis 170, as well as demographic, geographic, psychographic, digital identification, and HTTP information. Content stream analysis 170 refers to the particular pages selected and viewed by the user. Demographic information refers to factors such as income, gender, age, and race. Geographic information refers to where the user lives. Psychographic information refers to user responses to a questionnaire. Digital identification information refers to user domain, browser, operating system, and hardware information. HTTP information refers to transfer protocol information.
FIG. 2 shows a display 202, input devices 204, and a browser 206, all of which allow a user 200 to interact with a CPU 208. The CPU 208 is connected through a modem or network connection 210 to the WWW 220. The WWW 220 allows user 200 to send instructions through browser 206 to the website server 110.
The website server 110 controls a website corpus 230, made up of numerous website files. The website server 110 uses a working memory 240 and an application memory 242. The application memory 242 contains the instructions 246 to use the affinity server 100.
The website server 110 receives instructions from the user 200 through the WWW 220. The user 200 instructs the website server 110 to access the website corpus 230 and retrieve and transmit specific website files. These specific files selected and viewed by the user 200 are recorded by the affinity server 100. The content stream to be analyzed includes the specific files selected and viewed by the user.
FIG. 3 shows one example of how the content stream is directed. After receiving instructions, the website server 110 uses instructions 246 to send the files 320 through the protocol stack 330 and network hardware 350 to the user 200. Preferably at the same time, the website server 110 also sends the files 320 through a socket 340 to the affinity server 100, where content stream analysis 170 is performed.
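The dual delivery described for FIG. 3 can be sketched as a function that writes the same file bytes to two connections. This is a minimal sketch, assuming both the user and the affinity server are reachable over ordinary stream sockets; the function name `send_content` and the in-process socket pairs used for the demonstration are hypothetical and stand in for real network connections.

```python
import socket

def send_content(content: bytes, user_conn: socket.socket,
                 affinity_conn: socket.socket) -> None:
    """Send the same file bytes to the user and to the affinity server."""
    user_conn.sendall(content)       # normal delivery path to the user
    affinity_conn.sendall(content)   # copy used for content stream analysis

# Demonstration: connected socket pairs stand in for the two connections.
server_to_user, user_end = socket.socketpair()
server_to_affinity, affinity_end = socket.socketpair()

page = b"<html><body>Stock quotes</body></html>"
send_content(page, server_to_user, server_to_affinity)

user_copy = user_end.recv(4096)
affinity_copy = affinity_end.recv(4096)

for s in (server_to_user, user_end, server_to_affinity, affinity_end):
    s.close()
```

Sending the copy from the server side means the affinity server sees exactly the files the user received, without any change to the user's browser.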
FIG. 4 shows how a page may be dynamically generated using content stream analysis. The user 200 views a current page 410, which contains links to other pages. When the user decides to follow a link leading to another page, the website server 110 retrieves the new page 420 and sends it to the affinity server 100. The affinity server 100 then selects an advertisement. This advertisement is sent back to the website server 110, where it is associated with the new page 420 and sent to the user 200, where the advertisement and the new page 420 comprise a dynamically generated page 430.
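One way to picture the last step of FIG. 4 is a helper that merges the selected advertisement into the new page before it is sent. The sketch below is an assumption about how the two could be combined (inserting the advertisement markup just before the closing body tag); the source does not say where on the page the advertisement is placed, and `build_dynamic_page` is a hypothetical name.

```python
def build_dynamic_page(new_page_html: str, ad_html: str) -> str:
    """Combine a retrieved page and a selected advertisement into one page.

    Assumption: the ad is placed just before </body>; if the page has no
    closing body tag, the ad is simply appended.
    """
    marker = "</body>"
    idx = new_page_html.lower().rfind(marker)
    if idx == -1:
        return new_page_html + ad_html
    return new_page_html[:idx] + ad_html + new_page_html[idx:]

page = "<html><body><p>Quotes</p></body></html>"
ad = "<div>Ad</div>"
dynamic_page = build_dynamic_page(page, ad)
```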
FIG. 5 is a flowchart of content stream analysis 170, which involves: (1) receiving a group of advertisements from an advertisement bank (block 510); (2) receiving a content stream (block 520); (3) determining an affinity measure between each advertisement and the content stream (block 530); and (4) selecting and presenting an advertisement to the user, based wholly or partially upon these affinity measures (block 540).
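The four steps of FIG. 5 can be sketched as a single selection function. This is a minimal sketch, assuming selection is based wholly on the affinity measure (the text also allows partial use alongside other criteria); the affinity function is passed in as a parameter, and the word-overlap measure used in the example is a placeholder, not the measure the patent defines. All names are hypothetical.

```python
def select_advertisement(ads, content_stream, affinity):
    """Return the id of the advertisement with the highest affinity.

    ads: dict mapping ad id -> ad data
    content_stream: sequence of content items viewed by the user
    affinity: callable(ad_data, content_stream) -> number
    """
    return max(ads, key=lambda ad_id: affinity(ads[ad_id], content_stream))

def overlap_affinity(ad_words, content_stream):
    # Placeholder measure: count of words shared between the ad and each page.
    return sum(len(ad_words & page_words) for page_words in content_stream)

ads = {"broker": {"stock", "trade"}, "tackle": {"fish", "rod"}}
content_stream = [{"stock", "quote"}, {"trade", "stock"}]
chosen = select_advertisement(ads, content_stream, overlap_affinity)
```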
FIG. 6 shows the determination of an affinity measure between an advertisement and a content stream (block 610). This involves: (1) creating an advertisement feature vector for each advertisement (block 620); (2) creating a content feature vector for each content file in the content stream (block 630); (3) determining a similarity measure between the advertisement feature vector and the content feature vectors (block 640); (4) multiplying the similarity measures by a decay factor (block 66); and (5) summing the similarity measures (block 650).
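Steps (3) through (5) can be sketched as follows. The similarity is computed as a dot product over shared words, which matches the description given later in the text. The decay schedule is an assumption: each similarity is multiplied by alpha raised to the page's position in the stream, with position 0 being the most recent page, and alpha = 0.5 is an arbitrary illustrative value. The function names are hypothetical.

```python
def similarity(content_vector, ad_vector):
    """Dot product of two sparse vectors stored as word -> weight dicts."""
    return sum(freq * ad_vector.get(word, 0.0)
               for word, freq in content_vector.items())

def affinity(content_stream, ad_vector, alpha=0.5):
    """Sum of decayed similarities; content_stream[0] is the most recent page.

    Assumption: the i-th most recent page is discounted by alpha ** i.
    """
    return sum((alpha ** i) * similarity(v, ad_vector)
               for i, v in enumerate(content_stream))

stream = [{"stock": 2}, {"stock": 4}]   # most recent page first
ad = {"stock": 1, "broker": 2}
score = affinity(stream, ad)            # 1 * 2 + 0.5 * 4
```

With a decay factor below 1, an older page contributes less than a recent page with the same similarity, so the measure tracks the user's current interests.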
FIG. 7 shows the creation of an advertisement feature vector (block 610). First, an advertisement is converted into individual words (block 702). Text data may be parsed into their individual words, while voice data may require automated voice recognition and transcription to be converted into their individual words.
Words which are deemed insignificant for discerning the content of the advertisement are discarded. Discarded words include formatting codes, such as those which occur inside hypertext markup language (HTML) formatting tags, e.g. <title> and <bold> (block 704). The HTML standard is available at the World Wide Web Consortium website (http://www.w3.org/pub/WWW/) and is incorporated by reference. Discarded words also include stop words, e.g. articles, prepositions, and common adjectives, adverbs, and verbs (block 706). Words which are deemed particularly significant may be given extra weight, e.g. words labeled by the HTML <meta keyword> or <title> tags.
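Blocks 702 through 706 can be sketched with Python's standard `html.parser` module. The parser passes only the text between tags to `handle_data`, so tag names and attributes are dropped automatically. The stop-word list here is a tiny illustrative set, not the list used by the patent, and the extra weighting of title and meta keyword text is not shown. `TextExtractor` and `significant_words` are hypothetical names.

```python
import re
from html.parser import HTMLParser

# Illustrative stop words only; a real list would be much longer.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "is", "to"}

class TextExtractor(HTMLParser):
    """Collects lower-cased words from text nodes, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.words = []

    def handle_data(self, data):
        self.words.extend(re.findall(r"[a-z]+", data.lower()))

def significant_words(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return [w for w in parser.words if w not in STOP_WORDS]

words = significant_words("<h1>Fishing Gear</h1><b>The best rods for the lake</b>")
```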
Next, the individual words are passed through a stemming procedure to obtain words and word-stems (block 708). This is done to map all words with a common meaning to the same word. For example, a stemming procedure might map the words nation, national, and nationally to the stem “nati.” The book “Information Retrieval” by William Frakes and Ricardo Baeza-Yates, eds., Prentice Hall, 1992, is incorporated by reference as one example of a stemming procedure.
The stemming procedure used is a modified version of the procedure found in Frakes, et al. This modified version adds new rules for inferring suffixes, and also contains a word prefix processing scheme. The modified version recognizes when a word begins with a common prefix, and removes the prefix before the stemming process is applied. After the stemming process is complete, the prefix is added back on to the word. This improves the accuracy of the stemming process, as words that incorrectly stem to the same word under the original procedure no longer do so.
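The prefix handling can be sketched as a wrapper around any stemmer: strip a recognized prefix, stem the remainder, then reattach the prefix. The suffix rules below are a toy stand-in and are not the procedure from Frakes et al. or the patent's modified version; the prefix list, the length thresholds, and the function names are all assumptions made for illustration.

```python
# Hypothetical prefix and suffix lists, for illustration only.
PREFIXES = ("inter", "over", "under", "non", "pre", "un", "re")
SUFFIXES = ("ally", "al", "ing", "ed", "s")

def toy_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stem_with_prefix(word):
    """Remove a common prefix, stem the remainder, then add the prefix back."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 4:
            return prefix + toy_stem(word[len(prefix):])
    return toy_stem(word)

stems = [stem_with_prefix(w) for w in ("national", "nationally", "international")]
```

Because the prefix is restored after stemming, "international" keeps a stem distinct from "national" while still sharing the same suffix rules.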
After the stemming procedure, the frequencies of each word and word-stem are determined (block 710). Finally, these frequencies are paired with the words and word-stems to create a multi-dimensional vector (block 712). This multi-dimensional vector is known as an advertisement feature vector.
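Blocks 710 and 712 amount to counting each stem and storing the counts as a word-to-frequency mapping. A minimal sketch, assuming the stems have already been produced by the earlier steps and that a plain dictionary is an acceptable representation of the multi-dimensional vector (`feature_vector` is a hypothetical name):

```python
from collections import Counter

def feature_vector(stems):
    """Map each word or word-stem to its number of occurrences."""
    return dict(Counter(stems))

vector = feature_vector(["fish", "rod", "fish", "lake"])
```

Storing only the words that actually occur keeps the vector sparse, which matters when the vocabulary of a whole site corpus is large.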
The advertisement feature vector may be modified using an inverse, logarithmic, document-frequency measure derived from word frequency statistics (block 714). One embodiment of the document-frequency measure is the following:
0, if f = 0
[expression not recoverable from the source text], for f > 0
where,
n is the number of occurrences of a particular word within the advertisement
m is the maximum number of words in the advertisement
d is the total number of files in the site corpus
f is the number of files in the site corpus which contain the particular word
To obtain the word frequency statistics, the site corpus is received (block 720) and each individual content file in the site corpus is converted into individual words (block 722). Insignificant words such as formatting tags (block 724) and stop words (block 726) are discarded. The individual words are then passed through a stemming procedure to obtain words and word-stems (block 728). The number of files in which each word/word-stem occurs is determined, producing the word frequency statistics (block 730). These word frequency statistics are then used to modify the advertisement feature vector (block 732).
FIG. 8 shows a sample advertisement feature vector. The word/word-stems 810 are mapped to their corresponding frequency values 820.
FIG. 9 shows the creation of content feature vectors from the content files in the content stream (block 620). Each content file in the content stream is converted into individual words (block 910). Insignificant words such as HTML formatting tags (block 920) and stop words (block 930) are discarded. The individual words are then passed through a stemming procedure to obtain words and word-stems (block 940). The words and word-stems are counted to determine their frequencies (block 950). These frequencies are paired with the words and word-stems to create a multi-dimensional vector for each content file in the content stream (block 960).
The similarity measure is the dot vector product of an advertisement feature vector and a content feature vector. Mathematically, let A = (v0, v1, ..., vn) represent the content stream, where v0 represents the most recent content feature vector in the content stream and vn represents the oldest content feature vector in the content stream. Let W be an advertisement feature vector. The similarity measure of v to W is denoted Sim(v, W). The affinity measure of A to W is denoted Aff(A, W) and is calculated by:
[equation not recoverable from the source text]
where α is the decay factor, for example [value not recoverable from the source text].
Although the methods here have been described using WWW files as an example, they could just as easily be applied to television programs and other forms of user feedback media. With the advent and development of interactive television and automated voice recognition and transcription systems, the methods described here could be easily applied to television programs and help determine what kind of commercials will be shown to the user.
What is claimed is:
1. A method of selecting an advertisement from a file of advertisements having a target consumer, comprising the steps of:
receiving content data representing content having particular characteristics;
receiving advertisement data representing advertisements in the file;
creating a content data structure which indicates features of the content having particular characteristics;
creating an advertisement data structure which indicates features of the advertisements in the file;
determining similarity measures between the content data structure and the advertisement data structure by calculating dot vector products between the content data structure and the advertisement data structure and multiplying the dot vector products by a decay factor;
determining affinity measures between the content data and the advertisement data in response to the similarity measures; and
presenting to the consumer an advertisement from the file in response to the affinity measures.
2. The method of claim 1, wherein content data includes WWW files.
3. The method of claim 1, wherein content data includes television programs.
4. The method of claim 1, wherein creating a content data structure which indicates features of the content having particular characteristics comprises the steps of:
converting the content data into individual words;
applying a stemming procedure to the individual words to obtain words and word-stems;
determining frequencies of particular words and word-stems; and
creating a multi-dimensional vector comprised of the words and word-stems mapped to their respective frequencies.
5. The method of claim 4, further comprising the steps of:
discarding stop words; and
discarding words which occur inside HTML formatting tags, except for those which occur inside a meta keyword tag.
6. The method of claim 1, wherein creating an advertisement data structure which indicates features of the advertisements in the file comprises the steps of:
converting the advertisement data into individual words;
applying a stemming procedure to the individual words to obtain words and word-stems;
determining frequencies of particular words and word-stems; and
creating a multi-dimensional vector comprised of the words and word-stems mapped to their respective frequencies.
7. The method of claim 6, further comprising the steps of:
discarding stop words;
discarding words which occur inside HTML formatting tags, except for those which occur inside a meta keyword tag.
8. The method of claim 6, further comprising the steps of:
determining word frequency statistics for a content available at a site;
modifying the advertisement data structure using an inverse, logarithmic, document-frequency measure derived from the word frequency statistics.
9. The method of claim 8, wherein determining word frequency statistics for the site corpus comprises the steps of:
converting the content available at a site into individual words;
applying a stemming procedure to the individual words to obtain words and word-stems; and
determining frequencies of particular words and word-stems.
10. The method of claim 1, wherein presenting to the user an advertisement from the file in response to the affinity measures comprises the steps of:
retrieving the advertisement;
retrieving a content page;
combining the advertisement and the content page;
transmitting the advertisement and the content page to the user.