throbber
Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 1 of 32
`
`
`
`Exhibit 26
`
`
`
`
`
`
`
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 2 of 32
`ETlQmégbfigv-OBQE-PGG-SN Document 239-8 Filed 11/12/20 PageZof 32
`
`This application is submitted in the name of inventor Erling H. Wold.
`
`AMC—Ol—OO4P
`
`SPECIFICATION
`
`METHOD AND APPARATUS FOR
`
`IDENTIFYING AN UNKNOWN WORK
`
`BACKGROUND OF THE INVENTION
`
`Field of the Invention
`
`The present invention relates to data communications. In particular, the present
`
`
`
`
`
`
`,4“?
`
`10
`
`invention relates to a novel method and apparatus for identifying an unknown work.
`
`The Prior Art
`
`Background
`
`Digital audio technology has greatly changed the landscape of music and
`
`entertainment. Rapid increases in computing power coupled with decreases in cost have
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 3 of 32
`ETR§§§6W02396PGGSN Document 239-8 Filed 11/12/20 Page3of 32
`
`AMC-Ol-OO4P
`
`made it possible for individuals to generate finished products having a quality once
`
`available only in a major studio. One consequence of modern technology is that legacy
`
`media storage standards, such as reel—to—reel tapes, are being rapidly replaced by digital
`
`storage media, such as the Digital Versatile Disk (DVD), and Digital Audio Tape (DAT).
`
`Additionally, with higher capacity hard drives standard on most personal computers,
`
`home users may now store digital files such as audio or video tracks on their home
`
`computers .
`
`Furthermore, the Internet has generated much excitement, particularly among
`
`those who see the Internet as an opportunity to develop new avenues for artistic
`
`expression and communication. The Internet has become a Virtual gallery, where artists
`
`may post their works on a Web page. Once posted, the works may be viewed by anyone
`
`having access to the Internet.
`
`
`
`
`One application of the Internet that has received considerable attention is the
`
`ability to transmit recorded music over the Internet. Once music has been digitally
`
`15
`
`encoded, the audio may be both downloaded by users for play, or broadcast (“streamed”)
`
`over the Internet. When audio is streamed, it may be listened to by Internet users in a
`
`manner much like traditional radio stations.
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 4 of 32
`ETfifiggébW-OBQE-PGG-SN Document 239-8 Filed 11/12/20 Page40f 32
`
`AMC-Ol-OO4P
`
`Given the widespread use of digital media, digital audio files, or digital video files
`
`containing audio information, may need to be identified. The need for identification of
`
`digital files may arise in a variety of situations. For example, an artist may wish to verify
`
`royalty payments or generate their own Arbitron®—like ratings by identifying how often
`
`their works are being streamed or downloaded. Additionally, users may wish to identify
`
`a particular work. The prior art has made efforts to create methods for identifying digital
`
`audio works.
`
`
`
`
`itin—3412‘.=dim.
`
`However, systems of the prior art suffer from certain disadvantages. One area of
`
`difficulty arises when a large number of reference signatures must be compared to an
`
`unknown audio recording.
`
`The simplest method for comparing an incoming audio signature (which could be
`
`from a file on the Internet, a recording of a radio or Internet radio broadcast, a recording
`
`from a cell phone, etc) to a database of reference signatures for the purpose of
`
`identification is to simply compare the incoming signature to every element of the
`
`15
`
`database. However, since it may not be known where the reference signatures might have
`
`occurred inside the incoming signature, this comparison must be done at many time
`
`locations within the incoming signature. Each individual signature-to—signature
`
`comparison at each point in time may also be done in a “brute-force” manner using
`
`U)
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 5 of 32
`ETEfigégW-OBQdPGG-SN DO0Ument239—8 Filed 11/12/20 Page50f 32
`
`AMC-Ol-OO4P
`
`techniques known in the art; essentially computing the full Euclidean distance between
`
`the entire signatures’ feature vectors. A match can then be declared when one of these
`
`comparisons yields a score or distance that is above or below some threshold,
`
`respectively.
`
`However, when an audio signature or fingerprint contains a large number of
`
`features such a brute—force search becomes too expensive computationally for real-world
`
`databases which typically have several hundred thousand to several million signatures.
`
`Many researchers have worked on methods for multi—dimensional indexing,
`
`although the greatest effort has gone into geographical (2-dimensional) or spatial (3-
`
`dimensional) data. Typically, all of these methods order the elements of the database
`
`based on their proximity to each other.
`
`For example, the elements of the database can be clustered into hyper-spheres or
`
`hyper—rectangles, or the space can be organized into a tree form by using partitioning
`
`planes. However, when the number of dimensions is large (on the order of 15 or more), it
`
`
`
`
`15
`
`can be shown mathematically that more-or-less uniformly distributed points in the space
`
`all become approximately equidistant from each other. Thus, it becomes impossible to
`
`cluster the data in a meaningful way, and comparisons can become both lengthy and
`
`inaccurate.
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 6 of 32
`Equggwm-OZPJQE-PGG-SN Document 239-8 Filed 11/12/20 Page60f 32
`
`Hence, there exists a need to provide a means for data comparison which
`
`overcomes the disadvantages of the prior art.
`
`AMC-01—004P
`
`BRIEF DESCRIPTION OF THE INVENTION
`
`A method and apparatus for identifying an unknown work is disclosed. In one
`
`5
`
`aspect, a method may includes the acts of providing a reference database having a
`
`
`
`
`
`reduced dimensionality containing signatures of sampled works; receiving a sampled
`
`work; producing a signature from the work; and reducing the dimensionality of the
`
`signature.
`
`BRIEF DESCRIPTION OF THE DRAWING FIGURES
`
`Figure 1A is a flowchart of a method according to the present invention.
`
`Figure 1B is a flowchart of another method according to the present invention.
`
`Figure 2 is a diagram of a system suitable for use with the present invention.
`
`Figure 3 is a diagram of segmentng according to the present invention.
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 7 of 32
`Efigfgfiblfifig-OZEG-PGG-SN Document 239-8 Filed 11/12/20 Page7of 32
`
`Figure 4 is a detailed diagram of segmenting according to the present invention
`
`AMC-Ol-004P
`
`showing hop size.
`
`Figure 5 is a graphical flowchart showing the creating of a segment feature vector
`
`according to the present invention.
`
`5
`
`Figure 6 is a diagram of a signature according to the present invention.
`
`
`
`Figure 7A is a flowchart of a method for preparing a reference database according
`
`to the present invention.
`
`Figure 7B is a flowchart of method for identifying an unknown work according to
`
`.1
`
`the present invention.
`
`DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
`
`Persons of ordinary skill in the art will realize that the following description of the
`
`present invention is illustrative only and not in any way limiting. Other embodiments of
`
`the invention will readily suggest themselves to such skilled persons having the benefit of
`
`15
`
`this disclosure.
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 8 of 32
`EngqsgéGggtpg-OBQE-PGG-SN Document 239-8 Filed 11/12/20 Page80f 32
`
`It is contemplated that the present invention may be embodied in various computer
`
`AMC-Ol-004P
`
`and machine-readable data structures. Furthermore, it is contemplated that data
`
`structures embodying the present invention will be transmitted across computer and
`
`machine—readable media, and through communications systems by use of standard
`
`protocols such as those used to enable the Internet and other computer networking
`
`standards.
`
`
`
`15
`
`The invention further relates to machine-readable media on which are stored
`
`embodiments of the present invention. It is contemplated that any media suitable for
`
`storing instructions related to the present invention is within the scope of the present
`
`invention. By way of example, such media may take the form of magnetic, optical, or
`
`semiconductor media.
`
`The present invention may be described through the use of flowcharts. Often, a
`
`single instance of an embodiment of the present invention will be shown. As is
`
`appreciated by those of ordinary skill in the art, however, the protocols, processes, and
`
`procedures described herein may be repeated continuously or as often as necessary to
`
`satisfy the needs described herein. Accordingly, the representation of the present
`
`invention through the use of flowcharts should not be used to limit the scope of the
`
`present invention.
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 9 of 32
`EMpggmg-OBQE-PGG-SN Document 239-8 Filed 11/12/20 Page90f 32
`
`AMC-Ol-OO4P
`
`The present invention may also be described through the use of web pages in
`
`which embodiments of the present invention may be Viewed and manipulated. It is
`
`contemplated that such web pages may be programmed with web page creation programs
`
`using languages standard in the art such as HTML or XML. It is also contemplated that
`
`the web pages described herein may be viewed and manipulated with web browsers
`
`running on operating systems standard in the art, such as the Microsoft Windows® and
`
`Macintosh® versions of Internet Explorer® and Netscape®. Furthermore, it is
`
`contemplated that the functions performed by the various web pages described herein
`
`may be implemented through the use of standard programming languages such a Java®
`
`or similar languages.
`
`The present invention will first be described in general overview. Then, each
`
`element will be described in further detail below.
`
`
`
`
`Referring now to Figure 1A, a flowchart is shown which provides a general
`
`overview of the present invention as related to the preparation of a database of reference
`
`15
`
`signatures. Two overall acts are performed to prepare a reference database in
`
`accordance with the present invention:
`
`in act 100, the present invention reduces the
`
`dimensionality of reference signatures; and the reference database is indexed in act 102.
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 10 of 32
`EfigsgflgéglgggOBQE-PGG-SN Document 239-8 Filed 11/12/20 Page 10 of 32
`
`AMC-01-004P
`
`Referring now to Figure 1B, a flowchart is shown which provides a general
`
`overview of the present invention as related to the identification of an unknown signature
`
`in accordance with the present invention. In act 104, a sampled work is received. In act
`
`106, the present invention reduces the dimensionality of the received work. In act 108,
`
`the present invention determines initial candidates. In act 110, the present invention
`
`searches for the best candidate.
`
`Prior to presenting a detailed overview of each act of FIGS. 1A and 1B, some
`
`background will first be presented.
`
`Structural embodiment of the present invention
`
`Referring now to Figure 2, a diagram of a system suitable for use with the present
`
`invention is shown. FIG. 2 includes a client system 200. It is contemplated that client
`
`system 200 may comprise a personal computer 202 including hardware and software
`
`standard in the art to run an operating system such as Microsoft Windows ®, MAC OS ®
`
`
`
`Palm OS, UNIX, or other operating systems standard in the art. Client system 200 may
`
`15
`
`further include a database 204 for storing and retrieving embodiments of the present
`
`invention. It is contemplated that database 204 may comprise hardware and software
`
`standard in the art and may be operatively coupled to PC 202. Database 204 may also be
`
`used to store and retrieve the works and segments utilized by the present invention.
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 11 of 32
`EfifiggéégggSOZBQE-PGG-SN Document 239-8 Filed 11/12/20 Page 11 of 32
`
`AMC-Ol—OO4P
`
`Client system 200 may further include an audio/video (A/V) input device 208.
`
`A/V device 208 is operatively coupled to PC 202 and is configured to provide works to
`
`the present invention which may be stored in traditional audio or video formats. It is
`
`contemplated that A/V device 208 may comprise hardware and software standard in the
`
`art configured to receive and sample audio works (including video containing audio
`
`information), and provide the sampled works to the present invention as digital audio
`
`files. Typically, the A/V input device 208 would supply raw audio samples in a format
`
`such as 16-bit stereo PCM format. A/V input device 208 provides an example of means
`
`for receiving a sampled work.
`
`It is contemplated that sampled works may be obtained over the Internet, also.
`
`Typically, streaming media over the Internet is provided by a provider, such as provider
`
`218 of FIG. 2. Provider 218 includes a streaming application server 220, configured to
`
`retrieve works from database 222 and stream the works in a formats standard in the art,
`
`
`
`
`
`such as Real®, Windows Media®, or QuickTime®. The server then provides the
`
`15
`
`streamed works to a web server 224, which then provides the streamed work to the
`
`Internet 214 through a gateway 216. Internet 214 may be any packet—based network
`
`standard in the art, such as IP, Frame Relay, or ATM.
`
`10
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 12 of 32
`EfifigfigaggggOBQE-PGG-SN Document 239-8 Filed 11/12/20 Page 12 of 32
`
`AMC-01-004P
`
`To reach the provider 218, the present invention may utilize a cable or DSL head
`
`end 212 standard in the art operatively, which is coupled to a cable modem or DSL
`
`modem 210 which is in turn coupled to the system’s network 206. The network 206 may
`
`be any network standard in the art, such as a LAN provided by a PC 202 configured to
`
`run software standard in the art.
`
`
`
`
`It is contemplated that the sampled work received by system 200 may contain
`
`audio information from a variety of sources known in the art, including, without
`
`limitation, radio, the audio portion of a television broadcast, Internet radio, the audio
`
`portion of an Internet video program or channel, streaming audio from a network audio
`
`server, audio delivered to personal digital assistants over cellular or wireless
`
`communication systems, or cable and satellite broadcasts.
`
`Additionally, it is contemplated that the present invention may be configured to
`
`receive and compare segments coming from a variety of sources either stored or in real-
`
`time. For example, it is contemplated that the present invention may compare a real-time
`
`15
`
`streaming work coming from streaming server 218 or A/V device 208 with a reference
`
`segment stored in database 204.
`
`S egmenting background
`
`11
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 13 of 32
`Efififigéc’bébfié-OZBQE-PGG-SN Document 239-8 Filed 11/12/20 Page 13 of 32
`
`It is contemplated that a wide variety of sampled works may be utilized in the
`
`present invention. However, the inventors have found the present invention especially
`
`useful with segmented works. An overview of a segmented work will now be provided.
`
`AMC—01-004P
`
`Figure 3 shows a diagram showing the segmenting of a work according to the
`
`present invention. FIG. 3 includes audio information 300 displayed along a time axis
`
`302. FIG. 3 further includes a plurality of segments 304, 306, and 308 taken of audio
`
`information 300 over some segment size T.
`
`In an exemplary non-limiting embodiment of the present invention, instantaneous
`
`values of a variety of acoustic features are computed at a low level, preferably about 100
`
`times a second. In particular, 10 MFCCS (cepstral coefficients) are computed. It is
`
`contemplated that any number of MFCCs may be computed. Preferably, 5-20 MFCCs
`
`are computed, however, as many as 30 MFCCs may be computed, depending on the need
`
`for accuracy versus speed.
`
`
`
`
`
`
`
`Segment-level features are disclosed US Patent #5,918,223 to Blum, et al., which
`
`15
`
`is assigned to the assignee of the current disclosure and incorporated by reference as
`
`though fully set forth herein. In an exemplary non-limiting embodiment of the present
`
`invention, the segment-level acoustical features comprise statistical measures as disclosed
`
`in the ‘223 patent of low-level features calculated over the length of each segment. The
`
`12
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 14 of 32
`EfiflfSSlSWOBga'PGG'SN Document 239-8 Filed 11/12/20 Page 14 of 32
`
`data structure may store other bookkeeping information as well (segment size, hop size,
`
`item ID, UPC, etc).
`
`AMC-01-004P
`
`As can be seen by inspection of FIG. 3, the segments 304, 306, and 308 may
`
`overlap in time. This amount of overlap may be represented by measuring the time
`
`between the center point of adjacent segments. This amount of time is referred to herein
`
`as the hop size of the segments, and is so designated in FIG. 3. By way of example, if the
`
`segment length T of a given segment is one second, and adjacent segments overlap by
`
`50%, the hop size would be 0.5 second.
`
`The hop size may be set during the development of the software. Additionally,
`
`the hop sizes of the reference database and the real—time signatures may be predetermined
`
`to facilitate compatibility. For example, the reference signatures in the reference
`
`database may be precomputed with a fixed hop and segment size, and thus the client
`
`applications should conform to this segment size and have a hop size which integrally
`
`
`
`divides the reference signature hop size. It is contemplated that one may experiment with
`
`15
`
`a variety of segment sizes in order to balance the tradeoff of accuracy with speed of
`
`computation for a given application.
`
`The inventors have found that by carefully choosing the hop size of the segments,
`
`the accuracy of the identification process may be significantly increased. Additionally,
`
`13
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 15 of 32
`Efigsssfléafibfié-OBQE-PGG-SN Document 239-8 Filed 11/12/20 Page 15 of 32
`
`AMC-01-004P
`
`the inventors have found that the accuracy of the identification process may be increased
`
`if the hop size of reference segments and the hop size of segments obtained in real-time
`
`are each chosen independently. The importance of the hop size of segments may be
`
`illustrated by examining the process for segmenting pre-recorded works and real—time
`
`works separately.
`
`Reference signatures
`
`
`
`
`
`
`
`Prior to attempting to identify a given work, a reference database of signatures
`
`must be created. When building a reference database, a segment length having a period
`
`of less than three seconds is preferred. In an exemplary non-limiting embodiment of the
`
`present invention, the segment lengths have a period ranging from 0.5 seconds to 3
`
`seconds. For a reference database, the inventors have found that a hop size of
`
`approximately 50% to 100% of the segment size is preferred.
`
`It is contemplated that the reference signatures may be stored on a database such
`
`as database 204 as described above. Database 204 and the discussion herein provide an
`
`15
`
`example of means for providing a plurality of reference signatures each having a segment
`
`size and a hop size.
`
`Unknown signatures
`
`l4
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 16 of 32
`Efigf85%6b§-fig-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 16 of 32
`
`The Choice of the hop size is important for the signatures of the audio to be
`
`identified, hereafter referred to as “unknown audio.”
`
`AMC-01-004P
`
`Figure 4 shows a detailed diagram of the segmentation of unknown audio
`
`according to the present invention. FIG. 4 includes unknown audio information 400
`
`displayed along a time axis 402. FIG. 4 further includes segments 404 and 406 taken of
`
`audio information 400 over some segment length T. In an exemplary non—limiting
`
`embodiment of the present invention, the segment length of unknown audio segments is
`
`chosen to range from 0.5 to 3 seconds.
`
`As can be seen by inspection of FIG. 4, the hop size of unknown audio segments is
`
`chosen to be smaller than that of reference segments. In an exemplary non-limiting
`
`embodiment of the present invention, the hop size of unknown audio segments is less
`
`than 50% of the segment size. In yet another exemplary non—limiting embodiment of the
`
`present invention, the unknown audio hop size may be 0.1 seconds.
`
`
`
`
`
`The inventors have found such a small hop size advantageous for the following
`
`15
`
`reasons. The ultimate purpose of generating unknown audio segments is to analyze and
`
`compare them with the reference segments in the database to look for matches. The
`
`inventors have found at least two major reasons why an unknown audio recording would
`
`not match its counterpart in the database. One is that the broadcast channel does not
`
`15
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 17 of 32
`Efigfisslécbébfié-OBQE-PGG-SN Document 239-8 Filed 11/12/20 Page 17 of 32
`
`produce a perfect copy of the original. For example, the work may be edited or processed
`
`or the announcer may talk over part of the work. The other reason is that larger segment
`
`boundaries may not line up in time with the original segment boundaries of the target
`
`AMC-Ol-OO4P
`
`recordings.
`
`The inventors have found that by choosing a smaller hop size, some of the
`
`segments will ultimately have time boundaries that line up with the original segments,
`
`notwithstanding the problems listed above. The segments that line up with a “Clean”
`
`segment of the work may then be used to make an accurate comparison while those that
`
`do not so line up may be ignored. The inventors have found that a hop size of 0.1 seconds
`
`seems to be the maximum that would solve this time shifting problem.
`
`As mentioned above, once a work has been segmented, the individual segments
`
`are then analyzed to produce a segment feature vector. Figure 5 is a diagram showing an
`
`overview of how the segment feature vectors may be created using the methods described
`
`in US Patent #5,918,223 to Blum, et a1. It is contemplated that a variety of analysis
`
`
`
`
`15
`
`methods may be useful in the present invention, and many different features may be used
`
`to make up the feature vector.“ The inventors have found that the pitch, brightness,
`
`bandwidth, and loudness features of the ‘223 patent to be useful in the present invention.
`
`Additionally, spectral features may be used analyzed, such as the energy in various
`
`l6
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 18 of 32
`EfigffisfléaébfisrozwfiPGG-SN Document 239-8 Filed 11/12/20 Page 18 of 32
`
`AMC—Ol-004P
`
`spectral bands. The inventors have found that the cepstral features (MFCCs) are very
`
`robust (more invariant) given the distortions typically introduced during broadcast, such
`
`as EQ, multi-band compression/limiting, and audio data compression techniques such as
`
`MP3 encoding/decoding, etc.
`
`
`
`
`In act 500, the audio segment is sampled to produce a segment. In act 502, the
`
`sampled segment is then analyzed using Fourier Transform techniques to transform the
`
`signal into the frequency domain. In act 504, mel frequency filters are applied to the
`
`transformed signal to extract the significant audible characteristics of the spectrum. In
`
`act 506, a Discrete Cosine Transform is applied which converts the signal into mel
`
`frequency cepstral coefficients (MFCCs). Finally, in act 508, the MFCCS are then
`
`averaged over a predetermined period. In an exemplary non—limiting embodiment of the
`
`present invention, this period is approximately one second. Additionally, other
`
`characteristics may be computed at this time, such as brightness or loudness. A segment
`
`feature vector is then produced which contains a list containing at least the 10 MFCCS
`
`15
`
`corresponding average.
`
`The disclosure of FIGS. 3, 4, and 5 provide examples of means for creating a
`
`signature of a sampled work having a segment size and a hop size.
`
`17
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 19 of 32
`Efiagfagégfilécys-OZBQSPGG-SN Document 239-8 Filed 11/12/20 Page 19 of 32
`
`AMC-Ol-OO4P
`
`Figure 6 is a diagram showing a complete signature 600 according to the present
`
`invention. Signature 600 includes a plurality of segment feature vectors 1 through It
`
`generated as shown and described above. Signature 600 may also include an
`
`identification portion containing a unique ID. It is contemplated that the identification
`
`portion may contain a unique identifier provided by the RIAA (Recording Industry
`
`Association of America) or some other audio authority or cataloging agency. The
`
`identification portion may also contain information such as the UPC (Universal Product
`
`Code) of the various products that contain the audio corresponding to this signature.
`
`Additionally, it is contemplated that the signature 600 may also contain information
`
`pertaining to the characteristics of the file itself, such as the hop size, segment size,
`
`number of segments, etc., which may be useful for storing and indexing.
`
`Signature 600 may then be stored in a database and used for comparisons.
`
`The following computer code in the C programming language provides an
`
`example of a database structure in memory according to the present invention:
`
`typedef struct
`{
`
`float hopSize;
`float
`segmentSize;
`MFSignature* signatures;
`} MFDatabase;
`
`/* hop size */
`/* segment size */
`/* array of signatures */
`
`l8
`
`
`
`
`15
`
`20
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 20 of 32
`gage 1'14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 20 of 32
`056099US
`
`AMC-O l -004P
`
`10
`
`
`
`The following provides an example of the structure of a segment according to the
`present invention:
`
`typedef struct
`{
`
`char* id;
`
`numSegrnents;
`long
`float* features;
`
`long
`size;
`float hopSize;
`float
`segmentSize;
`} MFSignature;
`
`/* unique ID for this audio clip */
`/"< number of segments */
`/* feature array */
`/* size of per—segment feature vector */
`
`The discussion of FIG. 6 provides an example of means for storing segments and
`
`signatures according to the present invention.
`
`A more detailed description of the operation of the present invention will now be
`
`provided.
`
`Referring now to Figure 7A, a flowchart showing one aspect of a method
`
`according to the present invention is presented.
`
`Reference database preparation
`
`19
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 21 of 32
`Efig§85%&94§fi\§02396-PGG-SN Document 239-8 Filed 11/12/20 Page 21 of 32
`
`AMC-Ol—OO4P
`
`Prior to the identification of an unknown sample, a database of reference
`
`signatures is prepared in accordance with the present invention.
`
`In an exemplary non—limiting embodiment of the present invention, a reference
`
`signature may comprise an audio signature derived from a segmentation of the original
`
`audio work as described above. In a presently preferred embodiment, reference
`
`signatures have 20 non-overlapping segments, where each segment is one second in
`
`duration, with one—second spacing from center to center, as described above. Each of
`
`these segments is represented by 10 Mel filtered cepstral coefficients (MFCCs), resulting
`
`in a feature vector of 200 dimensions. Since indexing a vector space of this
`
`dimensionality is not practical, the number of dimensions used for the initial search for
`
`possible candidates is reduced according to the present invention.
`
`Reducing the dimensionality
`
`
`
`
`
`
`
`Figure 7A is a flowchart of dimension reduction according to the present
`
`invention. The number of dimensions used for the initial search for possible candidates is
`
`15
`
`reduced, resulting in what the inventors refer to as a subspace. By having the present
`
`invention search a subspace at the outset, the efficiency of the search may be greatly
`
`increased.
`
`20
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 22 of 32
`Efifigjfléc’légpkOBQE-PGG-SN Document 239-8 Filed 11/12/20 Page 22 of 32
`
`AMC—Ol—OO4P
`
`Referring now to FIG. 7A, the present invention accomplishes two tasks to
`
`develop this subspace: (1) the present invention uses less than the total number of
`
`segments in the reference signatures in act 701; and (2) the present invention performs a
`
`principal components analysis to reduce the dimensionality in act 703.
`
`Using less segments to perform an initial search
`
`
`
`
`
`The inventors empirically have found that using data from two consecutive
`
`segments (i.e., a two-second portion of the signature) to search for approximately 500
`
`candidates is a good tradeoff between computation complexity and accuracy. The
`
`number of candidates can be altered for different applications where either speed or
`
`accuracy is more or less important.
`
`For example, the present invention may be configured to extract a predetermined
`
`percentage of candidates. In an exemplary non—limiting embodiment of the present
`
`invention, a list of candidates may comprise 2% of the size of the reference signature
`
`database when using 2 segments for the initial search. In another exemplary non-limiting
`
`15
`
`embodiment of the present invention, a list of candidates may be those reference
`
`signatures whose distances based on the initial 2 segment search are below a certain
`
`threshold.
`
`21
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 23 of 32
`Efigfissléfig-fié-OBQE-PGG-SN Document 239-8 Filed 11/12/20 Page 23 of 32
`
`AMC-Ol-OO4P
`
`As will be appreciated by those of ordinary skill in the art, the dimension
`
`reduction of the present invention may be used to perform initial search using fewer
`
`segments for data other than MFCC—based feature vectors.
`
`It is contemplated that any
`
`feature—based vector set may be used in the present invention.
`
`Furthermore, the segments used in the initial search do not have to be the same
`
`size as the segments used for the final search. Since it may be better to use as few
`
`dimensions as possible in the initial search for candidates, a smaller segment size is
`
`advantageous here. The full segment size can then be used in the final search. In an
`
`exemplary non—limiting embodiment of the current invention, the initial search may use
`
`the higher-order MFCCs (since these are the most robust) — this is a simple way to reduce
`
`the dimensionality.
`
`In the next section, we will discuss another, more sophisticated, method for
`
`reducing the segment size for the initial candidate search.
`
`
`
`Perform alternate encoding
`
`15
`
`The second step is to use an alternate encoding of the MFCC data which has the
`
`same information but with fewer features.
`
`22
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 24 of 32
`fiafi6éggg§fi§2396-PGG-SN Document 239-8 Filed 11/12/20 Page 24 of 32
`
`AMC-Ol-004P
`
`To accomplish this, the present invention first performs an eigenanalysis of N
`
`candidates to determine the principal components of the MFCCs for our typical audio
`
`data. In an exemplary non-limiting embodiment of the present invention, the present
`
`invention examines 25,000 audio signatures of 20 segments each - each taken from a
`
`different recording, which gives provides 500,000 sets of MFCCs. The inventors have
`
`found that this is enough to be a good statistical sample of the feature vectors.
`
`
`
`
`
`As is appreciated by those of ordinary skill in the art, the number examined in the
`
`present invention may be adjusted to provide a good statistical sample of different kinds
`
`of music. For example, 100 or a 1000 segments may be satisfactory.
`
`Next, a Karhunen-Loeve transformation is derived. Each set of 10 MFCCS
`
`becomes a column of a matrix A. We then compute ATA and find the 10 eigenvalues and
`
`eigenvectors of this matrix. Sorting the eigenvectors by eigenvalue (largest eigenvalue
`
`first) results in a list of orthogonal basis vectors that are the principal components of the
`
`segment data. For a database of typical music recordings, 95% of the information in the
`
`15
`
`MFCCs is contained in the first 7 components of this new basis.
`
`As is known by those having ordinary skill in the art, the Karhunen—Loeve
`
`transformation is represented by the matrix that has the all 10 of the above eigenvectors
`
`as its rows. This transformation is applied to all the segments of all the reference
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 239-8 Filed 11/12/20 Page 25 of 32
`gfigl%§l64690§/6032396-PGG-SN Document 239-8 Filed 11/12/20 Page 25 of 32
`
`AMC-Ol-004P
`
`signatures in the database as well as to all the segments of any signatures that are to be
`
`identified. This allows approximate

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket