`(12) Patent Application Publication (10) Pub. No.: US 2002/0052885 A1
`Levy (43) Pub. Date: May 2, 2002
`
`(54) USING EMBEDDED DATA WITH FILE SHARING
`
`(76) Inventor: Kenneth L. Levy, Stevenson, WA (US)
`
`Correspondence Address:
`DIGIMARC CORPORATION
`19801 SW 72ND AVENUE
`SUITE 100
`TUALATIN, OR 97062 (US)
`
`(21) Appl. No.: 09/952,384
`
`(22) Filed: Sep. 11, 2001
`
`Related U.S. Application Data
`
`(63) Continuation-in-part of application No. 09/620,019,
`filed on Jul. 20, 2000. Non-provisional of provisional
`application No. 60/232,163, filed on Sep. 11, 2000.
`Non-provisional of provisional application No.
`60/257,822, filed on Dec. 21, 2000.
`
`(30) Foreign Application Priority Data
`
`Jul. 20, 2001 PCT/US01/22953
`
`Publication Classification
`
`(51) Int. Cl.7 G06F 7/00
`(52) U.S. Cl. 707/200; 707/104.1
`
`(57) ABSTRACT
`
`Peer-to-peer file sharing is increasing in popularity on the
`Internet, faster than any product known in history. Although
`file-sharing can enable massive piracy, it has many
`advantages for distribution of information, including
`scalability. Alternatively, file-sharing can be sabotaged with
`falsified files and used to distribute viruses. To this end, a
`solution that maintains the scalability of file-sharing and
`promotes reliability is proposed. The solution involves
`embedding data within the file or content and using the data
`to identify the content, demonstrate its completeness and lack
`of viruses, and verify the file can be shared. The embedded
`data can be checked when the file is registered with the
`database for sharing, and before or while the file is being
`uploaded and/or downloaded. Ideally, the embedded data is
`added at the time of creation for the file. The embedded data
`may include a watermark and be linked to other copy
`management systems, such as those proposed in DVD and SDMI.
`Finally, the embedded data can be used to enable purchases
`of files that owners do not have rights to share.
`
`[Front-page drawing; legible labels: File Registry, Local Files, ID, Address]
`
`Patent Owner ContentGuard Holdings, Inc. - Exhibit 2009, p. 1
`
`
`
`[Drawing sheet 1 of 4. Legible labels: Local Files; Embed Data; Detect and Read Embedded Data; ID, CCI, file share flag, etc.; Act on Embedded Data (e.g., inhibit transfer, connect to rights server, etc.)]
`
`
`
`
`[Drawing sheet 2 of 4 (partly illegible). Legible labels: Compliant Ripper or Marker; Embed Data; Read Embedded Data; Router; File Sharing; Register; Search for Songs; Download Songs]
`
`
`
`
`[Drawing sheet 3 of 4, Fig. 5 (reconstructed from partly illegible labels)]
`
`ID Format: Copyright (1-3 bits) | Date (16 bits) | Unique Song ID (24-32 bits) | Retail Channel (12-16 bits)
`
`Database Format: ID | Song Info (Song, Artist, [illegible], Label, Date) | Connected URLs (Song-writer, Artist, Label, Retail channel, Default)
`
`
`
`
`[Drawing sheet 4 of 4, Fig. 6. Legible labels: Compliant Ripper; Watermark ID database; CDDB / Gracenote; "CDDB (now Gracenote) determines CD and song titles from the CD Table of Contents (TOC)"; Watermark ID Format (field names partly illegible)]
`
`
`USING EMBEDDED DATA WITH FILE SHARING
`
`RELATED APPLICATION DATA
`
`[0001] This patent application is a continuation-in-part of
`U.S. patent application Ser. No. 09/620,019, filed Jul. 20,
`2000. This application also claims priority to U.S.
`Provisional Patent Application No. 60/232,163, filed Sep. 11,
`2000, and No. 60/257,822, filed Dec. 21, 2000. This patent
`application also claims priority to PCT Application
`PCT/US01/22953, filed Jul. 20, 2001. These patent applications
`are hereby incorporated by reference.
`
`[0002] This application also relates to Utility patent
`application Ser. No. 09/404,291 filed Sep. 23, 1999 by Kenneth
`L. Levy, and Ser. No. 09/404,292 filed Sep. 23, 1999 by
`Kenneth L. Levy, which are incorporated herein by reference.
`
`TECHNICAL FIELD
`
`[0003] The invention relates to file sharing systems for
`computer networks such as the Internet, and specifically
`relates to using embedded data in files to enhance such
`systems.
`
`BACKGROUND AND SUMMARY
`
`[0004] With the explosive growth of the Internet,
`file-sharing programs have evolved. One popular file sharing
`program is known as Napster, with a user base that has
`grown to between 10 and 20 million users in 1 year. This is
`one of the fastest growing products today. Currently, scores
`of music files can be found from Napster's database of
`current online users, and downloaded from another user's
`computer, in a data transfer scheme known as peer-to-peer
`file sharing. File-sharing is easily extended to all content,
`such as done with Scour.com.
`
`[0005] In the Napster system, web site servers store a
`database of directories of the digital music libraries on the
`hard drives of thousands of registered users. The digital files
`of the songs themselves remain on the users' hard drives. If
`a user wants a particular song title, he logs onto the Napster
`web site and types in a search query for the title. Client
`software on the user's computer connects to the Napster
`server and receives a list of active users who have the
`requested file on their computer. In response to selecting a
`handle name, the client software opens a link between the
`user's computer and the computer of the selected user, and
`the client software executing on the two computers transfers
`the requested file.
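
The centralized arrangement described in this paragraph can be sketched as follows. This is a minimal illustration only, not Napster's actual protocol or software; the class name `CentralRegistry` and its methods are hypothetical names introduced here.

```python
from collections import defaultdict


class CentralRegistry:
    """Hypothetical central index: song title -> handles of users sharing it."""

    def __init__(self):
        self._index = defaultdict(set)   # lowercased title -> set of user handles
        self._online = set()             # handles of users currently logged on

    def register(self, handle, titles):
        """Record the titles a user shares; the files stay on the user's drive."""
        self._online.add(handle)
        for title in titles:
            self._index[title.lower()].add(handle)

    def logoff(self, handle):
        """Mark a user as no longer active."""
        self._online.discard(handle)

    def search(self, title):
        """Return the active users who have the requested title."""
        return sorted(self._index.get(title.lower(), set()) & self._online)


registry = CentralRegistry()
registry.register("alice", ["Help"])
registry.register("bob", ["Help", "Yesterday"])
registry.logoff("alice")
print(registry.search("help"))  # only users still online are listed
```

Only the directory lives on the server in this sketch; the transfer itself would happen directly between the two users' computers.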
`
`[0006] Many new file-sharing systems are evolving in
`which the database is dynamic and not stored on a central
`server. One example of software with a dynamic database is
`known as Gnutella. Initially, when a user logs on to the
`Gnutella network, the user downloads client software from
`a Gnutella website. Next, the user types in the Internet
`address of an established Gnutella user (e.g., from a listing
`available at the web site). The client software then transmits
`a signal on the network that informs other computers in the
`Gnutella file sharing network of its network address and
`connection status. Once a link with the other computer is
`secure, the other computer informs other computers of the
`Gnutella network that it has encountered in previous
`sessions of the user's presence (e.g., address and connection
`status).
`
`[0007] After this initial session, the client software stores
`the addresses of other computers that it has encountered in
`the Gnutella network. When the client software is loaded, it
`recalls these addresses and attempts to reconnect with the
`other computers located at these addresses in the Gnutella
`network. The Gnutella software enables users to exchange
`many types of files. It enables users to issue a search request
`for files containing a desired text string. In response, the
`Gnutella clients connected with the user's computer search
`their respective hard drives for files satisfying the query. The
`client on the user's computer receives the results (e.g., files
`and corresponding addresses) and displays a list of them. By
`clicking on a file item in the user interface, the user instructs
`the client software to transfer the selected file.
`
`[0008] In another file sharing system known as FreeNet,
`the identity of the person downloading and uploading the
`files can be kept secret. Alternatively, the files could be
`stored on a central server, but uploaded by users such that
`the central server does not know the origin or true content of
`the files.
`
`[0009] Unfortunately, the file-sharing methodology also
`allows massive piracy of any content, such as text, music,
`video, software, and so on. However, due to the scalability
`and freedom of distribution with file-sharing, it provides a
`powerful tool to share information. As such, there is a need
`for technology that facilitates and enhances authorized file
`sharing while respecting copyrights.
`
`[0010] A few examples of the benefits of file-sharing
`follow. A file sharing system allows unknown artists to
`obtain inexpensive and worldwide distribution of their
`creative works, such as songs, images, writings, etc. As files
`become more popular, they appear on more of the users'
`computers, thus inherently providing scalability. In other
`words, there are more places from which to download the
`file and most likely several files exist in close proximity to
`the downloading computer, thus improving efficiency. In
`addition, anonymous file-sharing, like FreeNet, fosters
`political debate in places around the world where such debate
`might trigger reprisals from the government.
`
`[0011] Current attempts to curb unauthorized file sharing
`include enforcement of copyright laws and use of files with
`content bombs. The current legal enforcement efforts allege
`that uses of file sharing systems violate copyright laws.
`Content bombs involve placing files that appear to be the
`correct content, but contain alternative content or viruses.
`For example, an MP3 file can have the middle replaced with
`someone saying "do not copy songs" instead of the desired
`music. Neither of these solutions will help the Internet grow
`and improve the quality of life, worldwide.
`
`[0012] Current copy management systems allow copying,
`but block rendering on equipment if the person does not
`have rights, where rendering only refers to reading a text file,
`seeing an image, watching a movie, listening to an audio file,
`smelling a smell file, or executing a software program.
`Although this can limit piracy within a file-sharing system,
`it does not improve the system for the user. In fact, this
`rendering based method of copy protection detracts from the
`system. This detraction stems from the fact that current copy
`control systems are implemented on the user's computer at
`the time of importing into the secure system, rendering, or
`moving to a portable rendering device or media, as described
`in the Secure Digital Music Initiative's specifications
`version 1 (available at http://www.sdmi.org, and incorporated
`by reference). In other words, current copy control systems
`do not check rights at the time of copying or transfer
`between computers. For example, the user downloads the
`protected file, and then finds out that he/she cannot render
`the file (i.e., play the song). In addition, the user does not
`know if the file is the correct file or complete until after
`downloading and attempting to render the file. More
`specifically, the file is encrypted by a key related to a unique
`identifier within the user's computer; thus, after copying to
`a new computer, the file cannot be decrypted. In addition,
`watermarks can only be used after the file has been
`decrypted, or designed to screen open (i.e., decrypted)
`content for importation into the user's secure management
`system after the file has been copied to their computer.
`
`[0013] Another approach would be to use a database
`lookup to determine whether the content is allowed to be
`shared. For example, whether music in the MP3 file format
`can be shared can be determined from its ID3 song title
`tag. However, this solution does not scale. Specifically,
`every downloaded file needs to access and search this central
`database, and this database's access does not improve as the
`file becomes more popular. In addition, the approach can be
`bypassed by changing the file's title tag or filename,
`although this makes searching more difficult.
`
`[0014] A desirable solution includes embedding data
`throughout the content in which the embedded data has any
`of the following roles. The embedded data can have an
`identifier that has many uses, such as identifying the file as
`the content that the user desires, allowing the file to be
`tracked for forensic or accounting purposes, and connecting
`the user back to the owner and/or creator of the file. The
`embedded data can be analyzed in terms of continuity
`throughout the file to quickly demonstrate that the file is
`complete and not modified by undesirable content or
`viruses. An additional role is to identify the content as
`something that is allowed to be shared, or used to determine
`the level or type of sharing allowed, such as for subscription
`users only.
`
`[0015] The embedded data may exist in the header or
`footer of the file, throughout the file as an out-of-band signal,
`such as within a frame header, or embedded in the content
`while being minimally perceived, most importantly without
`disturbing its function, also known as a watermark.
`
`[0016] In the utilization of this embedded data, the
`computer from which the content is to be downloaded (i.e.,
`the uploading computer) can check to make sure the content is
`appropriate to be uploaded when the files (e.g., music files)
`on this computer are added to the central database and/or
`when the content is requested. Similarly, the downloading
`computer can also check that the requested content is
`appropriate before, after or during the downloading process.
`An appropriate file can be defined as any of the following:
`the content is allowed to be shared, i.e., it is not copyright
`material, the file is the correct content, and that the content
`is complete and does not contain any viruses.
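
The check described in this paragraph can be sketched as below. This is an illustrative reading, not the applicant's implementation: the `EmbeddedData` record and the `is_appropriate` function are hypothetical, and the sketch assumes some decoder has already read the embedded data from the file. It applies all three criteria together, which is one possible policy.

```python
from dataclasses import dataclass


@dataclass
class EmbeddedData:
    """Hypothetical result of reading a file's embedded data."""
    content_id: int        # identifier carried in the embedded data
    share_allowed: bool    # True if the owner permits sharing
    segments_found: int    # segments in which embedded data was detected
    segments_total: int    # segments the file is expected to contain


def is_appropriate(data: EmbeddedData, requested_id: int) -> bool:
    """Apply the criteria from the text: shareable, correct content, complete."""
    shareable = data.share_allowed
    correct = data.content_id == requested_id
    complete = data.segments_total > 0 and data.segments_found == data.segments_total
    return shareable and correct and complete
```

Either the uploading or the downloading computer could run such a check, before, during, or after the transfer.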
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`[0017] FIG. 1 is an overview of a peer-to-peer file sharing
`system demonstrating locations at which embedded data can
`be used to control file-sharing.
`
`[0018] FIG. 2 is a flowchart of an embedding process.
`
`[0019] FIG. 3 is a flowchart of a detecting process.
`
`[0020] FIG. 4 is a diagram of a file sharing system using
`embedded data.
`
`[0021] FIG. 5 is a diagram of an embedded data format
`and corresponding database format.
`
`[0022] FIG. 6 is a diagram illustrating an arrangement for
`generating a unique ID based on content.
`
`DETAILED DESCRIPTION
`
`[0023] The following sections describe systems and
`methods for using auxiliary data embedded in files to enhance
`file sharing systems. FIG. 1 depicts an example of a file
`sharing system for a computer network like the Internet. The
`solution described below uses data embedded in a file to
`identify a file as having content desired for downloading, to
`verify that the content of the file is complete and free of
`viruses, and to allow the file to be shared among users'
`computers at the user's share level. In many applications, an
`embedding process encodes auxiliary data in the file during
`creation, but it may also be embedded at a later time. For
`example, the file may be embedded (or re-embedded) as part of
`a file transfer process or electronic transaction where a user
`is granted usage rights for the file.
`
`[0024] FIG. 2 depicts an embedding process for adding
`auxiliary data to files in a file sharing system. A data
`embedding process 200 (e.g., steganographic encoder, file
`header encoder, data frame header encoder, etc.) embeds
`auxiliary data 202 in a file 204 to create a data file 206
`including the embedded data 202. The file may then be
`distributed in a file sharing system comprising a number of
`computers or other devices in communication with each
`other via a network. The auxiliary data embedded in the file
`is used to manage file sharing operations, and to enhance the
`user's experience.
`
`[0025] Types of Embedded Data
`
`[0026] The embedded data can be placed in the header or
`footer of the file, throughout the file such as within frame
`headers, or hidden in the content itself using steganographic
`encoding technology such as digital watermarking. The file
`may contain any combination of text, audio, video, images
`and software, in compressed or uncompressed format.
`
`[0027] Auxiliary data used to manage sharing of a file may
`be embedded in headers and footers of the file for each type.
`When the data is to be embedded throughout the file, the file
`can be broken into frames of known size, with a header for
`each frame including space for embedded data. For MPEG
`compressed audio and video, these frames already exist. The
`embedded data can be hidden in copyright, private or
`auxiliary bits. The data embedded in frame headers can be
`modified by the audio in any frame and/or encrypted
`(defined as dynamic locking in patent application Ser. No.
`09/404,291, already incorporated by reference) to improve
`its robustness to duplication in another content file, a content
`bomb, or virus.
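
The idea of making frame-header data depend on the audio in the frame can be illustrated with the sketch below. It is not the dynamic locking method of Ser. No. 09/404,291, which is not reproduced in this document; the XOR-with-digest scheme, the use of SHA-256, and the function names `lock_payload` and `unlock_payload` are assumptions chosen only to show why header bits copied into different content stop decoding correctly.

```python
import hashlib


def lock_payload(payload: bytes, frame_audio: bytes) -> bytes:
    """XOR the payload with a digest of the frame's audio (illustrative only)."""
    mask = hashlib.sha256(frame_audio).digest()
    return bytes(p ^ mask[i % len(mask)] for i, p in enumerate(payload))


def unlock_payload(locked: bytes, frame_audio: bytes) -> bytes:
    """XOR is its own inverse, so unlocking reuses the same operation."""
    return lock_payload(locked, frame_audio)


frame = b"\x01\x02\x03\x04" * 64           # stand-in for one frame of audio data
payload = b"\x2a\x00\x01"                  # stand-in for the embedded header bits
locked = lock_payload(payload, frame)

# With the original audio, the payload is recovered.
recovered = unlock_payload(locked, frame)
# Copied into a frame with different audio, the same bits decode differently.
mismatched = unlock_payload(locked, b"other audio")
```

A real system would likely also encrypt the result with a secret key, since the digest alone can be recomputed by anyone holding the audio.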
`
`[0028] With respect to watermarking, there are many
`known techniques for embedding data within software,
`image, audio, video, and text in the state of the art, and new
`techniques will evolve, especially for software. Examples of
`steganographic encoding and decoding technologies are
`described in U.S. Pat. No. 5,862,260, and in co-pending
`patent application Ser. No. 09/503,881, filed Feb. 14, 2000.
`The watermark may exist only in one place in the content,
`several places in the content, or continuously throughout the
`content. For example, in an audio file, the watermark may be
`repeated in temporal segments of the audio track. In a still
`image, the watermark may be repeated in spatial segments
`of the image. In video, the watermark may be repeated in
`temporal or spatial segments of the video signal.
`
`[0029] Roles of Embedded Data
`[0030] The embedded data may include an identifier (ID)
`that serves as an index to an entry in a searchable database
`that describes or otherwise identifies the content of the file.
`For example, the database can include elements, where each
`element comprises an ID, song title, album (or CD) title,
`release year, and artist name. This database can be indexed
`by any of these elements, thus improving automated
`searching capabilities. Specifically, rather than needing to
`search for "Help and Beatles", "The Beatles-Help!", and so on, a
`unique ID can be used in a search query to identify The
`Beatles' song Help, and different IDs may be used for
`different releases.
`
`[0031] The user, via an automated search program, only
`needs to submit a search query including that ID. When
`searching, the user may be presented with a drop down menu
`of titles of files from the database that satisfy the search
`query. The search program automatically knows the ID from
`the database so that the correct file can be found and
`downloaded from a computer at an address associated with
`that file in the database. In addition, these IDs could help
`music be searched by year, which is desirable to many
`people who want to hear music from their high school or
`college days.
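
A database keyed by ID, as described in the two paragraphs above, might look like the following sketch. The records, ID values, and function names are hypothetical and serve only to show lookup by ID and search by year.

```python
# Hypothetical records; the ID values are made up for illustration.
RECORDS = [
    {"id": 1001, "title": "Help!", "album": "Help!", "year": 1965,
     "artist": "The Beatles"},
    {"id": 1002, "title": "Help!", "album": "1962-1966", "year": 1973,
     "artist": "The Beatles"},
]

# Index the records by ID so one query term identifies one release.
BY_ID = {rec["id"]: rec for rec in RECORDS}


def find_by_id(content_id):
    """Return the record for an ID, or None if the ID is unknown."""
    return BY_ID.get(content_id)


def find_by_year(year):
    """Return the IDs of all records released in the given year."""
    return [rec["id"] for rec in RECORDS if rec["year"] == year]
```

Two releases of the same song carry different IDs here, so a query by ID avoids the ambiguity of free-text titles.
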
`[0032] In addition to facilitating automated searches for
`content in files, the ID may also be used to track these files.
`For example, the file transfer system can add the ID of a file
`to an event log when the file is transferred (e.g., downloaded,
`uploaded, etc.). The specific components of the file transfer
`system involved in the event logging process may vary with
`the implementation. Also, the time at which the event is
`triggered and logged may vary.
`[0033] The client system responsible for sending a file
`may issue and log an event, and either store the log locally,
`and/or send it to a central or distributed database for
`communication to other systems. The client system that receives
`the file may perform similar event logging actions.
`Additionally, if a server system is involved in a file transfer, it
`may also perform similar event logging actions. For
`example, the server may transfer the file, or facilitate the
`transfer between two clients, and as part of this operation,
`log an event of the operation including the file ID, the type
`of event, etc. In distributed systems where no central server
`is involved, the event logs can be stored on computers in the
`file sharing network (or a subset of the computers), and
`composite event logs can be compiled by having the
`computers broadcast their event logs to each other. Each
`computer, in this approach, could maintain a copy of the event
`log, which is synchronized upon each broadcast operation.
`The log could be used to account for all file transfers, and be
`used to properly pay the rights holders.
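
The composite event log for the distributed case could be compiled along the lines of the sketch below. The tuple layout `(timestamp, peer, event_type, file_id)` and the function names are assumptions made for illustration; the text does not specify a log format.

```python
def merge_logs(*logs):
    """Union the event logs of several peers, dropping duplicate events.

    Each event is a tuple: (timestamp, peer, event_type, file_id).
    """
    merged = set()
    for log in logs:
        merged.update(log)
    return sorted(merged)   # tuples sort by timestamp first


def downloads_per_file(log):
    """Count download events per file ID, e.g. as a basis for paying rights holders."""
    counts = {}
    for _, _, event_type, file_id in log:
        if event_type == "download":
            counts[file_id] = counts.get(file_id, 0) + 1
    return counts


peer_a = [(100, "a", "upload", 1001)]
peer_b = [(100, "a", "upload", 1001), (105, "b", "download", 1001)]
composite = merge_logs(peer_a, peer_b)
```

Because the merge is a set union, peers that broadcast in any order end up with the same composite log.
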
`[0034] Another use for the embedded data when it
`contains a unique ID, such as unique to the retailer, song, artist
`and/or rights holder, is to link the consumer to more
`information, such as information about the retailer, song, artist
`and/or rights holder. The ID could be used to link to the
`retailer's web site, where the consumer can find additional
`songs in the same genre, year and by the same artist. Or, the
`ID could be used to link to the artist's web site where the
`consumer finds additional information about the artist and
`song, and can locate other songs by the artist. Or, the ID
`could be used to link back to the rights owner, such as the
`record label where the consumer can find additional
`information and music.
`
`[0035] This connected content link could be displayed by
`the file sharing application during the downloading process.
`This provides the user with benefits of not wasting time
`during the downloading process, and gaining access to more
`music and information. The file sharing company can use
`this process to increase the revenues generated from the file
`sharing system through deals with the companies who gain
`access to the user via the connected content links.
`
`[0036] The unique ID could be generated from the
`content, such as done with CDDB, which generates an ID from
`a CD's table of contents (TOC), and then steganographically
`embedded into the content. Alternatively, the unique ID may
`not be embedded but inherently linked to the content via a
`hash or fingerprint function that turns some or all of the
`content into a few bits of data. The number of bits allowed
`determines the likelihood that different files transform into
`the same bit pattern. However, even with as few as 32
`bits, this is unlikely. In addition, this is less likely if the hash
`function prioritizes parts of the data that are most
`perceptually relevant. This process is sometimes referred to as
`fingerprinting.
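
A minimal sketch of deriving a short ID from content, together with a rough estimate of how often two files would collide, is given below. SHA-256 is an assumed stand-in for the unspecified hash or fingerprint function, and it does not prioritize perceptually relevant data as the text suggests a fingerprint might. The estimate uses the standard birthday approximation.

```python
import hashlib
import math


def fingerprint(content: bytes, bits: int = 32) -> int:
    """Reduce content to a `bits`-bit identifier (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(content).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def collision_probability(n_files: int, bits: int = 32) -> float:
    """Birthday approximation: chance that any two of n_files share an ID."""
    exponent = -n_files * (n_files - 1) / (2.0 * 2 ** bits)
    return -math.expm1(exponent)   # 1 - exp(exponent), accurate for tiny values
```

For any one pair of files a 32-bit ID collides with probability about 2^-32, which matches the text's "unlikely"; the approximation also shows that the chance of some collision grows with the size of the catalog.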
`
`[0037] The embedded data, when continuously embedded
`throughout the content, can improve the reliability of the
`content by, for example, demonstrating that the content is
`complete and has no viruses. One way to make the embedded
`data continuous is to insert it in periodically spaced
`frame headers, or steganographically encode it at locations
`spread throughout the file.
`
`[0038] A person trying to sabotage the file-sharing system
`can try to replicate the embedded data through a content
`bomb (such as audio repetitively saying "do not copy") or
`virus to fool the system. Thus, the harder it is to duplicate the
`embedded data, the more reliable the system is. When trying
`to resist duplication, it is advantageous to encrypt the
`embedded data payload, thus making it harder to duplicate.
`In addition, the embedded data payload can be modified by
`the content to improve resistance to duplication. Finally, the
`embedded data can be modified by the content and then
`encrypted for more secure applications. The above three
`robustness methods are labeled dynamic locking and
`disclosed in patent application Ser. No. 09/404,291, already
`incorporated by reference. When the embedded data is a
`watermark, meaning that it is steganographically embedded
`within the content and not just as auxiliary data in each
`frame, it is usually inherently robust to duplication because
`many watermarks use secret keys that are required to detect
`the watermark and read the information carried in it. One
`form of key is a pseudo-random noise (PN) sequence used
`as a carrier to embed, detect, and read the watermark. In
`particular, a spreading function is used to modulate the PN
`sequence with the watermark message. The resulting signal
`is then embedded into the host data (e.g., perceptual or
`transform domain data) using an embedding function. The
`embedding function modifies the host signal such that it
`makes subtle changes corresponding to the message signal.
`Preferably, these changes are statistically imperceptible to
`humans yet discernible in an automated steganographic
`decoding process. Encryption and changing the watermark
`message or PN sequence adaptively based on the content can
`improve the robustness of the watermark to duplication.
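
The keyed PN-sequence carrier described above can be illustrated with a toy spread-spectrum sketch. It is not the embedding method of any cited patent: the additive embedding, the chip length, the strength value, and all function names are assumptions, and a practical embedder would shape the signal perceptually.

```python
import random


def pn_sequence(key: int, length: int) -> list:
    """Pseudo-random +/-1 carrier derived from a secret key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed(host, bits, key, chip_len=64, strength=2.0):
    """Spread each message bit over chip_len samples and add it to the host."""
    n = chip_len * len(bits)
    if len(host) < n:
        raise ValueError("host signal too short for message")
    pn = pn_sequence(key, n)
    out = list(host)
    for i, chip in enumerate(pn):
        symbol = 1 if bits[i // chip_len] else -1
        out[i] += strength * symbol * chip
    return out


def detect(signal, n_bits, key, chip_len=64):
    """Correlate each segment with the carrier; the sign of the sum gives the bit."""
    pn = pn_sequence(key, chip_len * n_bits)
    bits = []
    for b in range(n_bits):
        segment = range(b * chip_len, (b + 1) * chip_len)
        corr = sum(signal[i] * pn[i] for i in segment)
        bits.append(1 if corr > 0 else 0)
    return bits


noise = random.Random(0)
host = [noise.gauss(0.0, 1.0) for _ in range(512)]   # stand-in for host data
message = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed(host, message, key=1234)
decoded = detect(marked, len(message), key=1234)
```

Without the key, a reader cannot regenerate the carrier, so the correlation step yields essentially random bits; this is the sense in which the key resists duplication.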
`
`[0039] Alternatively, if the embedded data is generated
`from the content, the embedded data is inherently linked to
`the content and is difficult to duplicate in a virus or content
`bomb. For example, pseudo-randomly chosen frames can be
`hashed into a few data bits that can be embedded in other
`pseudo-randomly chosen frames. Thus, without knowledge
`of the pseudo-random sequence (i.e., key) used to choose the
`frames and the hash function, the hacker cannot duplicate
`the embedded data.
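
The frame-hashing example in this paragraph might be sketched as follows. The sketch keeps the check bits in a dictionary keyed by target frame index as a stand-in for actually embedding them in those frames; the four-byte truncated SHA-256 and the function names are assumptions.

```python
import hashlib
import random


def choose_frames(key: int, n_frames: int, count: int):
    """Use the key to pick `count` source frames and `count` distinct target frames."""
    rng = random.Random(key)
    picked = rng.sample(range(n_frames), 2 * count)
    return picked[:count], picked[count:]


def frame_check_bits(frames, key, count=2):
    """Hash each chosen source frame to 4 bytes destined for a target frame."""
    sources, targets = choose_frames(key, len(frames), count)
    return {t: hashlib.sha256(frames[s]).digest()[:4]
            for s, t in zip(sources, targets)}


def verify(frames, embedded, key, count=2):
    """Recompute the hashes; a mismatch means a source frame was altered."""
    return frame_check_bits(frames, key, count) == embedded


frames = [bytes([i]) * 16 for i in range(10)]       # stand-in frame data
embedded = frame_check_bits(frames, key=42)

tampered = list(frames)
tampered[choose_frames(42, 10, 2)[0][0]] = b"virus"  # overwrite one source frame
```

Someone who does not know the key cannot tell which frames feed the hash or where the result belongs, which is the property the paragraph relies on.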
`
`[0040] Importantly, header and footer structures should be
`of known size or protected so a hacker cannot slip a virus
`into the header or footer.
`
`[0041] The embedded data can also demonstrate that the
`file is allowed to be shared, which means its owner has
`authorized copying (i.e. sharing) rights. The watermark
`message may include standard copy control information
`such as two message bits to encode copy permission states
`of "no more copy," "copy once" and "copy freely." In
`addition, only one bit can be used, thus indicating whether
`or not sharing is allowed.
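
The two-bit and one-bit encodings can be sketched as below. The numeric code assigned to each state is an assumption made for illustration; the text names the states but does not give their bit values.

```python
from enum import IntEnum


class CopyState(IntEnum):
    """Copy permission states; the two-bit codes are assumed, not from a standard."""
    COPY_FREELY = 0b00
    COPY_ONCE = 0b01
    NO_MORE_COPY = 0b10


def decode_copy_bits(bits: int) -> CopyState:
    """Map two message bits to a state; the unused code 0b11 raises ValueError."""
    return CopyState(bits & 0b11)


def sharing_allowed(share_bit: int) -> bool:
    """One-bit variant: 1 means sharing is allowed, 0 means it is not."""
    return bool(share_bit & 1)
```

Two bits give four codes for three states, so one code is left unassigned in this sketch.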
`
`[0042] The copyright can be linked to other copy
`management systems. For example, according to the
`DVD-Audio specification (available at http://www.dvdforum.org)
`and the Portable Device Specification of the Secure Digital
`Music Initiative (available at http://www.sdmi.org), audio
`may be watermarked with copy control information. This
`information may automatically be passed along if encoded
`within a watermark robust enough to survive the
`compression used in most file-sharing systems. Alternatively, the
`watermark can be read and re-embedded as embedded data,
`possibly another type of watermark (as discussed in patent
`application Ser. No. 09/404,292, already incorporated by
`reference).
`
`[0043] In addition, the copyright data can provide more
`information than just copy or not. For example, the bits may
`inform file sharing software, system or device that this file
`can be shared by subscription users, but not free users. Or,
`it can inform the level or type of subscription which allows
`sharing of the file. Specifically, subscription users who pay
`per month can share files that a free user cannot share. With
`music sharing, a popular band may allow only subscription
`users (or possibly users with an expanded subscription) to
`share their file so that they can earn revenue directly from
`the file. However, a new band may allow their song to be
`shared by all users.
`
`[0044] Embedded Data Payload
`
`[0045] The simplest form of the embedded data is a
`payload of one bit determining whether or not the file can be
`copied. A better payload is one with two bits for copy control
`and more bits, such as 32 bits, for a unique identifier that can
`be used to verify that the file contains the correct content.
`Note that demonstrating the file is complete does not depend
`upon the payload, but upon completeness of embedded data
`throughout the content. A decoding process can verify
`whether the file is complete by determining whether or not
`the embedded data is present at predetermined intervals or
`segments of the content. Finally, the payload can have a
`payload type, such as 8 bits, and then more bits, like 32 bits,
`of information that depends upon the document type and
`probably includes copy control and an identification section.
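
A payload of two copy-control bits followed by a 32-bit identifier, and the separate completeness check, could be sketched as follows. The bit ordering and the function names are assumptions; the text fixes only the field sizes.

```python
def pack_payload(copy_control: int, content_id: int) -> int:
    """Pack 2 copy-control bits above a 32-bit identifier (34 bits in total)."""
    if not 0 <= copy_control < 4:
        raise ValueError("copy_control must fit in 2 bits")
    if not 0 <= content_id < 2 ** 32:
        raise ValueError("content_id must fit in 32 bits")
    return (copy_control << 32) | content_id


def unpack_payload(payload: int):
    """Return (copy_control, content_id) from a packed payload."""
    return payload >> 32, payload & 0xFFFFFFFF


def is_complete(segment_payloads) -> bool:
    """Complete if embedded data was detected in every segment.

    Each entry is the payload read from one segment, or None if nothing
    was detected there; the payload value itself is not consulted.
    """
    return bool(segment_payloads) and all(p is not None for p in segment_payloads)
```

Completeness depends only on whether each segment carries embedded data, which mirrors the note in the text that it does not depend on the payload.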
`
`[0046] One way to verify that a file is complete without
`spreading embedded data throughout the file is to embed a
`hash of the file data at one or more selected locations within
`the file. The completeness of the file is checked by a program
`or device that recomputes the hash from the file and
`compares it with the previously computed hash which is
`embedded in the file.
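
The embedded-hash check can be sketched as below. Placing the hash at the end of the file and using SHA-256 are assumptions for illustration; the text leaves both the location and the hash function open.

```python
import hashlib

DIGEST_LEN = 32  # SHA-256 output size in bytes


def append_hash(data: bytes) -> bytes:
    """Embed a hash of the file data at one selected location (here, the end)."""
    return data + hashlib.sha256(data).digest()


def check_hash(blob: bytes) -> bool:
    """Recompute the hash over the data and compare it with the embedded copy."""
    if len(blob) < DIGEST_LEN:
        return False
    data, stored = blob[:-DIGEST_LEN], blob[-DIGEST_LEN:]
    return hashlib.sha256(data).digest() == stored


blob = append_hash(b"song data")
altered = blob[:5] + b"X" + blob[6:]   # change one byte of the file data
```

An unkeyed hash only detects accidental or naive changes; an attacker who knows the scheme can recompute it, so a keyed or encrypted variant would be needed against deliberate sabotage.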
`
`[0047] When the payload is to be continuously embedded
`with dynamic locking and it contains only a few bits, such
`as 1 bit, a system designer ca