`
(12) United States Patent
`Rassool et al.
`
(10) Patent No.: US 7,043,473 B1
(45) Date of Patent: May 9, 2006
`
`(54) MEDIA TRACKING SYSTEM AND METHOD
`
(75) Inventors: Reza Rassool, Stevenson Ranch, CA
(US); William P. Worzel, Milan, MI
(US); Brian Baker, Bellevue, WA (US)
`
`(73) Assignee: Widevine Technologies, Inc., Seattle,
`WA (US)
`
(*) Notice: Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 425 days.
`
`(21) Appl. No.: 09/988,824
`
(22) Filed: Nov. 20, 2001
(Under 37 CFR 1.47)
`
`Related U.S. Application Data
`
`
(60) Provisional application No. 60/252,415, filed on Nov.
22, 2000.
`
(51) Int. Cl.
G06F 7/00 (2006.01)
(52) U.S. Cl. ......................... 707/6; 707/10; 707/104.1;
713/179
(58) Field of Classification Search ............. 707/104.1,
707/10, 6, 1, 2, 3; 713/150, 168, 176, 179-180
See application file for complete search history.
`
`
`Primary Examiner-Safet Metjahic
`Assistant Examiner-Brian Goddard
(74) Attorney, Agent, or Firm-Jamie L. Wiegand; Darby &
Darby
`
(57) ABSTRACT

A method for identifying a media file transmitted over a
network includes creating a plurality of known media file
identifiers, each for a respective one of a plurality of known
media files, using an identifier generating algorithm, storing
the known media file identifiers in a database, creating a
media file identifier for an unknown media file with the
identifier generating algorithm and comparing the media file
identifier for the unknown media file with known media file
identifiers in order to produce an identification of the
unknown media file.
`
`46 Claims, 9 Drawing Sheets
`
`
`EX1016
`Roku V. Media Chain
`U.S. Patent No. 9,715,581
`
`
`
`
`
`
[Drawing sheet 1 of 9: FIG. 1, FIG. 2 and FIG. 3 (grid plotted on u and v axes). Drawings not reproduced.]
`
`
[Drawing sheets 2 to 5 of 9. Drawings not reproduced.]
`
`
[FIG. 10. Drawing not reproduced. Recoverable labels: Hint; Exclusion Information; Internal Reporting; Coordination Record; Reference Record; Internet Request; Internet Response; Media to Protect.]
`
`
[FIG. 11. Drawing not reproduced. Recoverable labels: Search Record; History Record; URL Ready; Coordination Record; Internet Request; Internet Response; URL; Robot Policy; Robot Policies; Robot Status; URL Syntax Status; Page Status; Trap Status; Exclusion Status; Current URL.]
`
`
[Drawing sheet 8 of 9: FIG. 12. Drawing not reproduced.]
`
`
`
[Drawing sheet 9 of 9: FIG. 13 and FIG. 14. Drawings not reproduced.]
`
`
`
`MEDIA TRACKING SYSTEM AND METHOD
`
The present application claims priority to U.S. provisional
application of Rassool et al., Ser. No. 60/252,415, filed Nov.
22, 2000, the entirety of which is hereby incorporated into
the present application by reference.
`
`FIELD OF THE INVENTION
`
`The present invention is directed to a media tracking
`system and method for searching machine readable data to
`locate media files and generate unique identifiers for the
`media files located by the tracking system.
`
`BACKGROUND OF THE INVENTION
`
Modern technology has made possible the digital encoding
of audio and image information. Examples of audio
information that may be digitally encoded include music,
speech and sound effects. Image information may be roughly
divided into still images and moving images (hereafter
"video"). Image information that may be digitally encoded
includes photographs, paintings, trademarks, product logos,
designs, drawings and so on. Video information that may be
digitally encoded includes movies, news and entertainment
television programming, home videos and the like. Audio
and image information may be digitally encoded when
initially recorded or may be initially recorded in an analog
format and later converted to a digital format. Digital
information is relatively easy to store, retrieve, manipulate,
reproduce and distribute. Digitized information can, for
example, be easily stored and accessed on a personal computer,
and computers can be interconnected to one another
for data transmission by a network. A network may interconnect,
for example, computers located in a particular
building or in a particular geographical area. The Internet is
a well-known example of a worldwide computer network.
Computer networks such as the Internet may be conceptually
thought of as a software interface that facilitates the
storage of, the search for and the transfer of information on
the Internet. The advent of the Internet has made possible the
creation of a large number of linked commercial and
non-commercial network presences, e.g., web-sites and
cross-links maintained on host computers connected to the
Internet. Each network presence may include, for example,
digitized audio and image information (e.g., stored in
"media files") and hypermedia-based documents that Internet
users can readily access using a software program
frequently referred to as a "browser". The Internet may
also be conceptually thought of as, in essence, a collection
of linked databases distributed throughout the network that
are accessible to all users (or some users, in the case of, for
example, password protected network presences) of the
network.
A large body of audio and image information is currently
available on the Internet and this body of information is
constantly changing as, for example, new network presences,
e.g., web sites, come into existence and as new
files are added to the existing network presences. While the
abundance of currently available audio and image information
and the ease of duplicating and transmitting the same
provide enormous potential benefits, this abundance also
gives rise to several problems. For example, the usefulness
of this information is limited because there is often no way
to locate media files that have a particular media content.
Furthermore, the ease of copying and distributing media
files has also greatly exacerbated the problem of media
piracy. There is, for example, a growing level of piracy of
copyright protected media files on the Internet. Copyrighted
material and other proprietary material is being replicated
and distributed over the Internet without permission, for
both personal and commercial use.
Pirated media files stored on network presences can be
downloaded as streaming media content for viewing and/or
listening (i.e., "playing") in real-time or may be downloaded
and stored on the computer of the person accessing the
pirating network presence for playing and/or for further
copying and redistribution at a later time. Network presences
offering pirated media may be commercial or non-commercial
(sometimes called "free" sites). Recently, Internet services
such as Napster™ and Gnutella™ have arisen that
facilitate peer-to-peer protocols to enable transfer of copied
media files between individual Internet users. Therefore,
media content owners are increasingly concerned about piracy of
their intellectual property over the Internet.
Moreover, managing and tracking the distribution of
media files becomes increasingly difficult with the large
number of mechanisms for disseminating the media and the
increasing number of pathways that the dissemination may
follow.
`
`SUMMARY
`
`Thus, there is a need for a media tracking system that can
`be used to search data, locate media files and identify their
`contents, particularly on the Internet. To meet this need, the
`present invention provides a media tracking system that may
`include a set of media identification software tools that can
`provide a wide range of media identification services for a
`wide range of purposes in many environments. The media
`identification tools provide the tracking system with the
`ability to analyze the content of media files and generate
`unique identifiers or "fingerprints" for identifying the file
`contents. The software tools included in the media tracking
`system can use these unique identifiers to perform a wide
`range of services. For example, the media tracking system
`can be used to catalog or index a database of media files or
`can be used to search for media files containing particular
`media content. This tracking system may be used to identify
`the source of media files, how they reached their present
storage location and any associated pathways of
dissemination.
`The tracking system may also be used to "police" a
`collection of data to determine if unauthorized duplication of
`protected media files has occurred. This intellectual property
`protection service can be performed by the tracking system
`in many environments including, for example, a database
`stored on a single computer or on a network such as the
`Internet.
`Various objects, features and advantages of the present
`invention will become apparent from the following detailed
description, the accompanying drawings, and the appended
`claims.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a black and white reproduction of a full color
`digital image;
`FIG. 2 is a black and white reproduction of a red color
`channel portion of the image of FIG. 1;
FIG. 3 is an initialized 16x24 warp grid plotted on a u-v
coordinate system;
FIG. 4 shows the initialized warp grid of FIG. 3 superimposed
on the image of FIG. 2;
FIG. 5 shows three potential connection patterns for a grid
point of a warp grid;
FIG. 6 shows final and intermediate positions of the grid
points of FIG. 4 after three iterations of a warp grid
algorithm;
FIG. 7 shows the positions of the grid points of FIG. 4
after two hundred and fifty iterations of a warp grid
algorithm;
FIG. 8 shows the equilibrium positions of the grid points
of FIG. 4 after execution of a warp grid algorithm;
FIG. 9 is a schematic diagram in Yourdon notation
showing a media tracking system in an operating environment
thereof;
FIG. 10 is a decomposed view of the media tracking
system of FIG. 9 in Yourdon notation showing a database of
the system, a plurality of processes performed by the system
and indicating data flows therebetween;
FIG. 11 is a decomposed view of a plurality of processes
performed by an Internet crawler of the media tracking
system in Yourdon notation and indicating data flow
therebetween;
FIG. 12 is a state transition diagram decomposed from
FIG. 11 of the Internet crawler of the system;
FIG. 13 is a schematic diagram in Yourdon notation
showing a plurality of processes for generating image vectors
from a series of images of a video; and
FIG. 14 is a schematic diagram in Yourdon notation
showing a plurality of processes for generating a series of
image vectors from a series of frames of a video, for storing
selected series of image vectors in a database and for
querying the database with other series of image vectors.
`
`DETAILED DESCRIPTION OF THE
`INVENTION
`
When a media tracking system designed in accordance
with various embodiments of the invention is used to track
intellectual property, e.g., to track dissemination of media
for licensing royalties, to police copyright violations, etc., in
the Internet environment, the tracking system can be used to
generate a unique identifier for each media file to be protected
by performing an identifier generating algorithm on
the media contents of each media file to be tracked, e.g., for
protection of the contents of that tracked media file. These
known media file identifiers may be stored in a database. The
media tracking system may provide a highly distributed
robot or "crawler" that is adapted to search or "crawl" the
Internet to locate network presences, e.g., web sites,
that contain media files. Once these network presences
are located, media files located on these network
presences are analyzed to generate media file identifiers for
those media files. The media file identifiers of these
unknown media files are checked against the known media
file identifiers of the media files to be traced (hereafter
"tracked media files") in the media tracking system database.
The media tracking system may generate a report
identifying the network presences storing specified media
files. This information may be used to determine an amount
of royalties owed by the network presence(s), track dissemination
or police intellectual property rights. For example,
supposing tracked media files are offered for playing and/or
downloading on network presences without the permission
of the media owner, the media tracking system can generate
a report identifying the network presences and the media
files stored there without permission.
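By way of illustration only, the crawler's step of locating media files on a network presence might be sketched as follows. The sketch is not the crawler of the described system. It assumes that media files can be recognized by the file extension of links in an already retrieved page, it uses extensions of formats named later in this description, and the names MEDIA_EXTENSIONS, MediaLinkParser and find_media_links are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Illustrative set of extensions (assumption): formats named in the description.
MEDIA_EXTENSIONS = {".wav", ".mp3", ".rm", ".ram", ".mpg", ".qt", ".viv"}


class MediaLinkParser(HTMLParser):
    """Collects href/src targets whose path ends in a media extension."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.media_urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page address.
                url = urljoin(self.base_url, value)
                path = urlparse(url).path.lower()
                if any(path.endswith(ext) for ext in MEDIA_EXTENSIONS):
                    self.media_urls.append(url)


def find_media_links(base_url, html_text):
    """Return absolute URLs of candidate media files linked from one page."""
    parser = MediaLinkParser(base_url)
    parser.feed(html_text)
    return parser.media_urls
```

Each URL returned would then be a candidate unknown media file to be retrieved and passed to the identifier generating algorithm.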
In accordance with at least one embodiment of the invention,
the need expressed above is met by a method for
identifying a video or audio file resident on a network, the
method comprising creating a plurality of known media file
identifiers, each for a respective one of a plurality of known
media files, using an identifier generating algorithm. Next,
known media file identifiers are stored in a database.
Subsequently, a media file identifier is created for an unknown
media file with the identifier generating algorithm. Next, a
comparison is performed between the media file identifier
for the unknown media file and the known media file identifiers
in order to produce an identification of the unknown media
file.
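The four steps of this method can be sketched as below. The sketch is a minimal illustration under stated assumptions: a SHA-256 digest stands in for the identifier generating algorithm (it recognizes only byte-identical copies, whereas the identifiers described in this document are derived from media content), an in-memory dictionary stands in for the database, and all function names are hypothetical.

```python
import hashlib


def generate_identifier(media_bytes):
    """Stand-in identifier generating algorithm (assumption): SHA-256 digest."""
    return hashlib.sha256(media_bytes).hexdigest()


def build_database(known_files):
    """Steps 1 and 2: create and store an identifier for each known media file.

    known_files maps a title to the file's bytes; the result maps
    identifier -> title.
    """
    return {generate_identifier(data): title for title, data in known_files.items()}


def identify(database, unknown_bytes):
    """Steps 3 and 4: fingerprint the unknown file and compare.

    Returns the title of the matching known file, or None if there is no match.
    """
    return database.get(generate_identifier(unknown_bytes))
```

The same algorithm is applied to known and unknown files, which is what makes the comparison in the last step meaningful.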
In accordance with at least one embodiment of the invention,
the need expressed above is met by an apparatus for
identifying a video or audio file resident on a network. The
apparatus may include a module for creating a plurality of
known media file identifiers, each for a respective one of a
plurality of known media files, using an identifier generating
algorithm. The apparatus may also include a database storing
the known media file identifiers as well as a module for
creating a media file identifier for an unknown media file
with the identifier generating algorithm. Additionally, the
apparatus may include a comparison module that compares
the media file identifier for the unknown media file with
known media file identifiers in order to produce an
identification of the unknown media file.
Various embodiments of the invention are directed to a
media tracking system that includes a set of media recognition
utilities that can be used for identifying media content
contained in media files. The ability of the media tracking
systems designed in accordance with the embodiments of
the invention to recognize media content within media files
provides the tracking system with the ability to perform a
wide range of tasks in a wide range of environments. This
media identifying capability can be utilized, for example, to
catalog and index media files in, for example, the environment
of an individual computer or in a computer network
environment, to locate media files that contain particular
media content, or to police copyright violations of media
files. The tracking system is discussed below in more detail
in the context of a particular application in a particular
environment (i.e., a tracking system for tracking dissemination
of media) to illustrate one use of the media recognition
capabilities of the system. It should be understood, however,
that this description is merely meant as one illustration of
potential utility and is not intended to limit the scope of the
invention to tracking media rights in a particular environment.
The tracking system, when configured to track media
files, utilizes a set of software tools that include an Internet
crawler and a set of media recognition utilities. Media files,
in the context of this description, include data files in various
formats that contain audio and/or image information. Audio
files may be, for example, computer files that include
computer code, which encodes audio information such as
music, speech, sound effects and the like. Audio file formats
currently popular and frequently encountered on the Internet
include "wave" files (*.wav), MP3 files (*.mp3), liquid
audio files (*.lqt), and Real Audio™ files (*.rm, *.ram).
Image files include still images and "moving" images
(hereafter referred to generally as "video"). Still images may
include, for example, textual files, photographs, drawings,
paintings, trademarks, logos, designs and so on. Video files
may include, for example, computer files, which include
computer code encoding a series of images that may be
viewed in rapid succession to create the illusion of motion.
Video file formats frequently encountered on the Internet
include MPEG (*.mpg) files, QuickTime (*.qt) files, Vivo
(*.viv) files and Real Video (*.rm). Some of these file
formats (Real Audio™ and Real Video™, for example) can
be downloaded as streaming audio and/or video that is
played in real-time. Other file formats (for example, *.mp3)
are typically downloaded in their entirety and stored locally
for playing and/or for further redistribution at a future time
after downloading. Such audio and video files are described
in greater detail below.
Generally, each known media file to be tracked is analyzed
by an identifier generating algorithm that forms part of
the media recognition utilities of the tracking system. The
identifier generating process results in the creation of a
media file identifier for the particular known media file
(which may also be referred to as a "known" media file
identifier). The media file identifiers of the known media
files to be tracked may be stored in a database for later
comparison with identifiers generated by the identifier
generating algorithm from unknown media files to be analyzed.
The tracking system may further provide a highly distributed
Internet robot or "crawler" that may be adapted to
search or "crawl" the Internet to locate and search network
presences that contain media files. Each unknown media file
(meaning that the contents of the media file are unknown)
discovered on a target network presence (these files being
referred to hereafter as "target media files") by the crawler
is then analyzed by the same identifier generating algorithm
used to generate the known media file identifiers from the
media content of the tracked media files. The unknown
identifier of the target file may be checked against the known
media file identifiers of the tracked media files in the
database. If an unknown media file identifier of a target
media file is identical to or sufficiently resembles the known
media file identifier of a tracked media file, then a "match"
may be said to have occurred between the two media file
identifiers. If the media file identifiers of two files match, a
conclusion may be made that the target media file contains
tracked media content. If tracked media content is offered
for playing or downloading on network presences that are
not authorized by the media owner, the system may generate
a report identifying the network presence and, e.g., listing
the files stored there without permission, the files stored
there with permission, and the number of files stored with or
without permission. Moreover, the system may generate a
report identifying network presences that offer the tracked
media content, regardless of whether their possession of the
content was authorized by the media content owner.
Various identifier generating algorithms may be used to
generate media file identifiers. During the explanation of
these algorithms, it should be appreciated that the terms
"fingerprint", "media file identifier" and "identifier" are
synonymous. Examples of the various algorithms used to
generate media file identifiers are explained below.
Subsequently, a detailed explanation of the various media tracking
systems and methods is provided.
As a preliminary matter, it should be understood that all
of the software described herein, including the file recognition
software and the Internet crawler, can be encoded for
execution on a general-purpose computer, specific purpose
computer, or on a network of general and/or specific purpose
computers. Generally, a media protection process designed
in accordance with at least one embodiment of the invention
may include various operations. For example, a database
may be created, which contains one (or more) known media
file identifiers for each known media file to be tracked.
Additionally, one or more identifiers may be generated from
the media content of a target file suspected of being an
unauthorized duplicate of a tracked media file. Additionally,
the content included in the created database may be queried
for matches between the known file identifiers and the
unknown media file identifiers. An exemplary identifier
generating algorithm for use with video files will be
considered first.

EXAMPLE 1

Identifier Utilizing Word Count for Video Files

"Video files", as used herein, includes any file that
includes a series of digitally encoded images for sequential
play. This may include, for example, commercial feature
films, music videos, webcast news and entertainment
programming and so on. Although several identifier generating
algorithms for video files are contemplated, at least one
embodiment of the invention may use an identifier generating
technique for video files based on a word count
technique.
Typically, video files include computer coding for displaying
a sequence or series of images which are viewed
(e.g., using a personal computer or similar device) in rapid
succession to produce the illusion of motion. Some of the
exemplary video file content recognition procedures
explained below utilize methods that perform calculations
on selected individual images or pictures (or groups of
images or pictures) in the sequence. It should be understood
that the terms "image", "frame" and "picture" are synonymous;
therefore, these terms are used interchangeably. As
explained below, in at least one embodiment of the invention,
a video file may be fingerprinted by an algorithm that
generates a data set for each image or for selected images
(e.g., every nth image) of each video file, each data
set representing the number of words used to encode each
selected image (or group of images). The word count
calculated for each of the selected images may be stored as
an "identifier" for the particular video file or this data may
be further processed or compressed to generate a word count
identifier. The file identifiers may be stored in a database of
the tracking system and may be associated with metadata
(e.g., title, date of creation, name of the owner, and so on)
which may be related to the media file from which the
fingerprint was derived.
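The word count technique just described might be sketched as follows. This is a minimal illustration and not the patented implementation. It assumes that each frame is already available as its encoded bytes, that a word is 32 bits, and that two identifiers "sufficiently resemble" one another when every pair of counts agrees within a relative tolerance; the 5% default and all function names are hypothetical.

```python
WORD_BITS = 32  # example word size; other sizes are possible


def words_in(encoded_frame):
    """Number of words needed to hold one frame's encoded bytes."""
    bits = len(encoded_frame) * 8
    return (bits + WORD_BITS - 1) // WORD_BITS  # ceiling division


def word_count_identifier(encoded_frames, n=1):
    """Word counts for every nth frame, starting with the first frame."""
    return [words_in(frame) for frame in encoded_frames[::n]]


def make_record(encoded_frames, n=1, **metadata):
    """Pair an identifier with metadata such as title or owner."""
    return {"identifier": word_count_identifier(encoded_frames, n),
            "metadata": metadata}


def resembles(id_a, id_b, tolerance=0.05):
    """True if both identifiers have equal length and each pair of counts
    differs by at most `tolerance` times the larger count (assumed rule)."""
    if len(id_a) != len(id_b):
        return False
    return all(abs(a - b) <= tolerance * max(a, b) for a, b in zip(id_a, id_b))
```

Sampling every nth frame shortens the identifier at the cost of discarding the counts of the skipped frames.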
Generally, each image of a digitally encoded video may
be viewed on a screen of a display device (such as a
computer monitor) as an array of discrete picture elements
called "pixels". Several attributes are defined for each pixel
in a particular image. Data stored in a video file defines the
attributes of each pixel for each frame. Pixel attributes
include color, intensity and brightness. The data in each
video file may be stored in binary form (that is, as a series
of bits or 0's and 1's). Binary data may be grouped into units
called "words". A word may contain 32 bits of data, for
example. As will become apparent, a measure of the number
of words used to encode images (or groups of images) of a
video can be related to the complexity of each image and
may be used to generate an identifier for each video.
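The relation between word count and image complexity can be illustrated with a small experiment. The sketch below is an assumption-laden illustration: zlib compression stands in for a video encoder (real video codecs work differently), a frame is modeled as a flat list of 8-bit pixel values, and the name encoded_words is hypothetical.

```python
import random
import zlib


def encoded_words(pixels, word_bits=32):
    """Words needed to hold the zlib-compressed pixel bytes.

    zlib is only a stand-in encoder used to show the trend.
    """
    compressed = zlib.compress(bytes(pixels))
    bits = len(compressed) * 8
    return (bits + word_bits - 1) // word_bits


# A uniform grey frame (low complexity) and a random-noise frame
# (high complexity), each with 64 x 64 pixels.
flat_frame = [128] * (64 * 64)
rng = random.Random(0)
noisy_frame = [rng.randrange(256) for _ in range(64 * 64)]

flat_words = encoded_words(flat_frame)
noisy_words = encoded_words(noisy_frame)
```

Under these assumptions the uniform frame needs far fewer words than the noisy one, which is the property that lets a sequence of per-frame word counts serve as an identifier.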
Before the word count technique is described in more
detail, it should be appreciated that the present invention
contemplates the use of a fingerprinting process in which the
identifier generating algorithm in effect analyzes the level of
complexity of selected frames (or selected groups of frames
or pictures) of a video. (The distinction b