`
`
`
`Exhibit 24
`
`
`
`
`
`
`
`
`
`
`(19) United States
`(12) Patent Application Publication (10) Pub. No.: US 2002/0133499 A1
`Ward et al.
(43) Pub. Date: Sep. 19, 2002
`
`
`(54) SYSTEM AND METHOD FOR ACOUSTIC
`FINGERPRINTING
`
`(76) Inventors: Sean Ward, Alexandria, VA (US);
`Isaac Richards, Willoughby, OH (US)
Correspondence Address:
William L. Feeney
Miles & Stockbridge, P.C.
Suite 1500
1751 Pinnacle Drive
McLean, VA 22102-3833 (US)
`(21) Appl. No.:
`09/931,859
`(22) Filed:
`Aug. 20, 2001
`Related U.S. Application Data
`
`(60) Provisional application No. 60/275,029, filed on Mar.
`13, 2001.
`
`Publication Classification
`
(51) Int. Cl. ................................................... G06F 7/00
`(52) U.S. Cl. .............................................................. 707/102
`(57)
`ABSTRACT
`
A method for quickly and accurately identifying a digital
file, specifically one that represents an audio file. The
identification can be used for tracking royalty payments to
copyright owners. A database stores features of various
audio files and a globally unique identifier (GUID) for each
file. Advantageously, the method allows a database to be
updated in the case of a new audio file by storing its features
and generating a new unique identifier for the new file. The
audio file is sampled to generate a fingerprint that uses
spectral residuals and transforms of Haar wavelets. Advan-
tageously, any label used for the work is automatically
updated if it appears to be in error.
`
[Representative drawing — Aggregation Feature Vector Match Overview (see FIG. 11).]
`Case 1:14-cv-02396-PGG-SN Document 239-6 Filed 11/12/20 Page 2 of 16
`
`
`
[FIG. 1 (Sheet 1 of 9) — Overview of Preprocessing Steps: open sound file and determine format; decompress the audio stream if compressed; load the decompressed audio stream; correct for DC offset; downmix the audio stream to 11025 Hz, 16-bit, mono; advance the audio stream until the first non-silence sample; begin signature generation.]
`
`
`
`
[FIG. 2 (Sheet 2 of 9) — Overview of Signature Creation: set the transform window size, window overlap, frame size (4500 window-size samples), and frame overlap; advance samples into the working buffer from the audio stream; for each window in the frame, compute the Haar wavelet transform and accumulate the coefficients; for each window in the frame, apply a Blackman-Harris window and compute the FFT; compute the time-domain and spectral-domain features; finalize the frame vector and save the final feature vector set.]
`
`
`
`
[FIG. 3 (Sheet 3 of 9) — Time Domain Feature Computation: compute the mean zero crossing rate; compute the mean(ABS)/mean(RMS) ratio; store the time-domain features into the frame feature vector.]
`
`
`
[FIG. 4 (Sheet 4 of 9) — Spectral Feature Generation: for each FFT window in the frame, convert each spectral band to log scale and store the band means; for each log-scaled band, compute the absolute delta from the last window's value and store the mean; after each log-scale conversion, store the sum of the second and third spectral bands, times 5, in beatStore; on loop completion, store the FFT features in the frame vector and compute beats per minute.]
[FIG. 5 — Beat Tracking: subtract the minimum value from each beatStore entry; find the maximum value and set beatmax to 80% of it; scan for isolated peaks above beatmax separated by more than 14 slots, incrementing the beat count for each; return the beat count as the BPM feature.]
`
`
`
[FIG. 6 (Sheet 5 of 9) — Frame Finalization: receive the frame feature vector; for each spectral mean, find the minimum band mean and compute the spectral residual (band mean minus minimum); compute the spectral residual sum; for each wavelet coefficient, subtract the minimum coefficient mean, zero values with absolute value less than 1, convert the rest to log scale (20*log10), and sort the coefficients in ascending order; store the final frame feature vector.]
`
`
`
`
[FIGS. 7-8 (Sheet 6 of 9) — Aggregation Match Server Components and Collection Match Server Components. Shared tables: feature vector -> identifier database; feature vector hash index; feature class -> (comparison function, max distance) tuple table. Additional aggregation-side tables: fingerprint string -> identifier database; feature vector -> feature vector cluster index; cluster ID -> string location database; cluster ID -> occurrence rate database.]
[FIG. 9 — Concatenated Vector Match Algorithm Overview: receive an incoming feature vector; if more than one feature class -> comparison function exists, compute the distance from the incoming vector to each feature class vector and load the comparison function and match distance of the class with the smallest distance, otherwise load the default comparison function; use the feature vector hash index to load the nearest-neighbor feature vector set; compute the distance from the incoming vector to each neighbor using the loaded comparison function; if a nearest neighbor exists with distance less than the loaded match distance, return the identifier for the closest neighbor; otherwise generate a new identifier, insert the (incoming vector, identifier) tuple into the vector database, and return the paired identifier.]
`
`
`
[FIG. 10 (Sheet 7 of 9) — Vector Comparison Function: receive a feature vector pair (vec1, vec2) and a comparison weight array; for the length of feature vector 1, distance += abs(vec1[i] - vec2[i]) * weight[i].]
`
`
`
`
[FIG. 11 (Sheet 8 of 9) — Aggregation Feature Vector Match Overview: receive an aggregation fingerprint; for each feature vector within the fingerprint, use the concatenation match algorithm to translate it to string signature form, consisting of (feature ID, cluster ID) tuples; load the occurrence rate of each cluster ID in the string fingerprint; subdivide the string fingerprint into at least 4 sections and, in each section, find the cluster ID with the lowest occurrence rate; load the list of string fingerprints containing those cluster IDs, and load the listed string fingerprints; for each loaded string fingerprint, use the string fingerprint comparison function to check for a match; on a match, return the matched identifier; otherwise, if the input is a single-signal fingerprint, store the incoming string fingerprint in the database, update the cluster tables, and assign a new identifier to it, else load the "No Match" identifier.]
`
`
[FIG. 12 (Sheet 9 of 9) — String Fingerprint Match Function: receive 2 string fingerprints; initialize the mismatch count to 0; starting at the cluster ID with the lowest occurrence rate, compare successive cluster IDs of both string fingerprints, incrementing the mismatch count on each mismatch and the match count otherwise; report a match or mismatch according to whether the mismatch count exceeds the mismatch threshold relative to the match count.]
`
`
`
`
SYSTEM AND METHOD FOR ACOUSTIC
`FINGERPRINTING
`
`CROSS-REFERENCE TO RELATED
`APPLICATION
[0001] The present application claims the benefit of U.S.
provisional application No. 60/275,029 filed Mar. 13, 2001.
That application is hereby incorporated by reference.
`
`FIELD OF THE INVENTION
[0002] The present invention is related to a method for the
creation of digital fingerprints that are representative of the
properties of a digital file. Specifically, the fingerprints
represent acoustic properties of an audio signal corresponding
to the file. More particularly, it is a system to allow the
creation of fingerprints that allow the recognition of audio
signals, independent of common signal distortions, such as
normalization and psycho-acoustic compression.
`
`DESCRIPTION OF THE PRIOR ART
[0003] Acoustic fingerprinting has historically been used
primarily for signal recognition purposes, in particular,
terrestrial radio monitoring systems. Since these were
primarily continuous audio sources, fingerprinting solutions
were required which dealt with the lack of delimiters
between given signals. Additionally, performance was not a
primary concern of these systems, as any given monitoring
system did not have to discriminate between hundreds of
thousands of signals, and the ability to tune the system for
speed versus robustness was not of great importance.
[0004] As a survey of the existing approaches, U.S. Pat.
No. 5,918,223 describes a system that builds sets of feature
vectors, using such features as bandwidth, pitch, brightness,
loudness, and MFCC coefficients. It has problems relating to
the cost of the match algorithm (which requires summed
differences across the entire feature vector set), as well as the
discrimination potential inherent in its feature bank. Many
common signal distortions that are encountered in compressed
audio files, such as normalization, impact those
features, making them unacceptable for a large-scale system.
Additionally, it is not tunable for speed versus robustness,
which is an important trait for certain systems.
[0005] U.S. Pat. No. 5,581,658 describes a system which
uses neural networks to identify audio content. It has advantages
in high noise situations versus feature vector based
systems, but does not scale effectively, due to the cost of
running a neural network to discriminate between hundreds
of thousands, and potentially millions, of signal patterns,
making it impractical for a large-scale system.
[0006] U.S. Pat. No. 5,210,820 describes an earlier form
of feature vector analysis, which uses a simple spectral band
analysis, with statistical measures such as variance,
moments, and kurtosis calculations applied. It proves to be
effective at recognizing audio signals after common radio-style
distortions, such as speed and volume shifts, but tends
to break down under psycho-acoustic compression schemes
such as mp3 and Ogg Vorbis, or other high noise situations.
[0007] None of these systems proves to be scalable to a
large number of fingerprints, and a large volume of recognition
requests. Additionally, none of the existing systems
are effectively able to deal with many of the common types
of signal distortion encountered with compressed files, such
as normalization, small amounts of time compression and
expansion, envelope changes, noise injection, and psycho-acoustic
compression artifacts.
`
`SUMMARY OF THE INVENTION
[0008] This system for acoustic fingerprinting consists of
two parts: the fingerprint generation component, and the
fingerprint recognition component. Fingerprints are built off
a sound stream, which may be sourced from a compressed
audio file, a CD, a radio broadcast, or any of the available
digital audio sources. Depending on whether a defined start
point exists in the audio stream, a different fingerprint
variant may be used. The recognition component can exist
on the same computer as the fingerprint component, but will
frequently be located on a central server, where multiple
fingerprint sources can access it.
[0009] Fingerprints are formed by the subdivision of an
audio stream into discrete frames, wherein acoustic features,
such as zero crossing rates, spectral residuals, and Haar
wavelet residuals, are extracted, summarized, and organized
into frame feature vectors. Depending on the robustness
requirement of an application, different frame overlap percentages
and summarization methods are supported, including
simple frame vector concatenation, statistical summary
(such as variance, mean, first derivative, and moment calculation),
and frame vector aggregation.
[0010] Fingerprint recognition is performed by a Manhattan
distance calculation between a nearest neighbor set of
feature vectors (or alternatively, via a multiresolution distance
calculation), from a reference database of feature
vectors, and a given unknown fingerprint vector. Additionally,
previously unknown fingerprints can be recognized due
to a lack of similarity with existing fingerprints, allowing the
system to intelligently index new signals as they are encountered.
Identifiers are associated with the reference database
vectors, which allows the match subsystem to return the
associated identifier when a matching reference vector is
found.
[0011] Finally, comparison functions can be described to
allow the direct comparison of fingerprint vectors, for the
purpose of defining similarity in specific feature areas, or
from a gestalt perspective. This allows the sorting of fingerprint
vectors by similarity, a useful quantity for multimedia
database systems.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
[0012] The invention will be more readily understood with
reference to the following FIGS. wherein like characters
represent like components throughout and in which:
[0013] FIG. 1 is a logic flow diagram, showing the
preprocessing stage of fingerprint generation, including
decompression, down sampling, and DC offset correction.
[0014] FIG. 2 is a logic flow diagram, giving an overview
of the fingerprint generation steps.
[0015] FIG. 3 is a logic flow diagram, giving more detail
of the time domain feature extraction step.
[0016] FIG. 4 is a logic flow diagram, giving more detail
of the spectral domain feature extraction step.
[0017] FIG. 5 is a logic flow diagram, giving more detail
of the beat tracking feature step.
[0018] FIG. 6 is a logic flow diagram, giving more detail
of the finalization step, including spectral band residual
computation, and wavelet residual computation and sorting.
[0019] FIG. 7 is a diagram of the aggregation match
server components.
[0020] FIG. 8 is a diagram of the collection match server
components.
[0021] FIG. 9 is a logic flow diagram, giving an overview
of the concatenation match server logic.
[0022] FIG. 10 is a logic flow diagram, giving more detail
of the concatenation match server comparison function.
[0023] FIG. 11 is a logic flow diagram, giving an overview
of the aggregation match server logic.
[0024] FIG. 12 is a logic flow diagram, giving more detail
of the aggregation match server string fingerprint comparison
function.
[0025] FIG. 13 is a simplified logic flow diagram of a
meta-cleansing technique of the present invention.
`
`DETAILED DESCRIPTION OF THE
`INVENTION
[0026] The ideal context of this system places the fingerprint
generation component within a database or media
playback tool. This system, upon adding unknown content,
proceeds to generate a fingerprint, which is then sent to the
fingerprint recognition component, located on a central
recognition server. The resulting identification information
can then be returned to the media playback tool, allowing,
for example, the correct identification of an unknown piece
of music, or the tracking of royalty payments by the playback
tool.
[0027] The first step in generating a fingerprint is accessing
a file. As used herein, "accessing" means opening,
downloading, copying, listening to, viewing (for example in
the case of a video file), displaying, running (for example in
the case of a software file) or otherwise using a file. Some
aspects of the present invention are applicable only to audio
files, whereas other aspects are applicable to audio files and
other types of files. The preferred embodiment, and the
description which follows, relate to a digital file representing
an audio file.
[0028] The first step of accessing a file is the opening of
a media file in block 10 of FIG. 1. The file format is
identified. Block 12 tests for compression. If the file is
compressed, block 14 decompresses the audio stream.
[0029] The decompressed audio stream is loaded at block
16. The decompressed stream is then scanned for a DC offset
error at block 18, and if one is detected, the offset is
removed. Following the DC offset correction, the audio
stream is down sampled to 11025 Hz at block 20, which also
serves as a low pass filter of the high frequency component
of the audio, and is then down mixed to a mono stream, since
the current feature banks do not rely upon phase information.
This step is performed both to speed up extraction of
acoustic features, and because more noise is introduced in
high frequency components by compression and radio
broadcast, making them less useful components from a
feature standpoint. At block 22, this audio stream is
advanced until the first non-silent sample. This 11025 Hz, 16-bit,
mono audio stream is then passed into the fingerprint
generation subsystem for the beginning of signature or
fingerprint generation at block 24.
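The portion of this preprocessing chain that the patent spells out exactly (DC-offset correction and skipping leading silence) can be sketched in Python. This is a minimal sketch, not the patent's implementation: it assumes decoding, resampling, and downmixing have already produced a list of 16-bit mono samples, and the `preprocess` name and the zero silence threshold are illustrative.

```python
def preprocess(samples, silence_threshold=0):
    """Sketch of the blocks 18 and 22 of FIG. 1: center the decoded
    11025 Hz, 16-bit mono stream on zero, then skip leading silence."""
    if not samples:
        return []
    # Correct for DC offset: subtract the mean so the waveform centers on zero.
    offset = round(sum(samples) / len(samples))
    centered = [s - offset for s in samples]
    # Advance the stream until the first non-silent sample.
    for i, s in enumerate(centered):
        if abs(s) > silence_threshold:
            return centered[i:]
    return []  # stream was entirely silence
```

The cleaned stream returned here is what would be handed to signature generation at block 24.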
[0030] Four parameters influence fingerprint generation,
specifically, frame size, frame overlap percentage, frame
vector aggregation type, and signal sample length. In different
types of applications, these can be optimized to meet
a particular need. For example, increasing the signal sample
length will audit a larger amount of a signal, which makes
the system usable for signal quality assurance, but takes
longer to generate a fingerprint. Increasing the frame size
decreases the fingerprint generation cost, reduces the data
rate of the final signature, and makes the system more robust
to small misalignment in fingerprint windows, but reduces
the overall robustness of the fingerprint. Increasing the
frame overlap percentage increases the robustness of the
fingerprint, reduces sensitivity to window misalignment, and
can remove the need to sample a fingerprint from a known
start point, when a high overlap percentage is coupled with
a collection style frame aggregation method. It has the costs
of a higher data rate for the fingerprint, longer fingerprint
generation times, and a more expensive match routine.
[0031] In the present invention, two combinations of parameters
were found to be particularly effective for different
systems. The use of a frame size of 96,000 samples, a frame
overlap percentage of 0, a concatenation frame vector aggregation
method, and a signal sample length of 288,000
samples proves very effective at quickly indexing multimedia
content, based on sampling the first 26 seconds of each
file. It is not robust against window shifting, or usable in a
system wherein that window cannot be aligned, however. In
other words, this technique works where the starting point
for the audio stream is known.
[0032] For applications where the overlap point between a
reference fingerprint and an audio stream is unknown (i.e.,
the starting point is not known), the use of 32,000 sample
frame windows, with a 75% frame overlap, a signal sample
length equal to the entire audio stream, and a collection
aggregation method is advised. The frame overlap of 75
percent means that a frame overlaps an adjacent frame by 75
percent.
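The two parameter combinations of paragraphs [0031] and [0032] can be captured as a small configuration table. The preset names and dictionary layout are hypothetical conveniences, not from the patent; the numbers are the patent's, and at the 11025 Hz rate of the preprocessing stage a 288,000-sample signal length works out to roughly the stated 26 seconds.

```python
# Hypothetical preset names; parameter values as described in [0031]-[0032].
PRESETS = {
    "indexing": {          # known start point (e.g. start of a file)
        "frame_size": 96_000,
        "frame_overlap_pct": 0,
        "aggregation": "concatenation",
        "sample_length": 288_000,
    },
    "stream": {            # unknown start point (e.g. a radio broadcast)
        "frame_size": 32_000,
        "frame_overlap_pct": 75,
        "aggregation": "collection",
        "sample_length": None,   # audit the entire audio stream
    },
}

SAMPLE_RATE = 11_025  # Hz, after the downmix of FIG. 1


def audited_seconds(preset):
    """Seconds of signal audited by a preset (None means the whole stream)."""
    n = PRESETS[preset]["sample_length"]
    return None if n is None else n / SAMPLE_RATE
```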
[0033] Turning now to the fingerprint pipeline of FIG. 2,
the audio stream is received at block 26 from the preprocessing
technique of FIG. 1. At block 28, the transform
window size is set to 64 samples, the window overlap
percentage is set (to zero in this case), and the frame size is
set to 4500 window-size samples. At block 30, the next step
is to advance frame-size samples into the working buffer.
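The framing step at block 30, combined with the full-frame test at block 32 below, might look like the following sketch, which yields successive working buffers and drops a partial final frame; the generator name and defaults are illustrative.

```python
def frames(stream, window_size=64, windows_per_frame=4500):
    """Yield successive working buffers of window_size * windows_per_frame
    samples (a partial tail frame fails the full-frame test and is dropped)."""
    frame_len = window_size * windows_per_frame
    for i in range(0, len(stream) - frame_len + 1, frame_len):
        yield stream[i:i + frame_len]
```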
[0034] Block 32 tests if a full frame was read in. If so, the
time domain features of the working frame vector are
computed at block 34 of FIG. 2. This is done using the steps
now described with reference to FIG. 3. After receiving the
audio samples at block 36, the zero crossing rate is computed
at block 38 by storing the sign of the previous sample,
and incrementing a counter each time the sign of the current
sample is not equal to the sign of the previous sample, with
zero samples ignored. The zero crossing total is then divided
by the frame window length, to compute the zero crossing
mean feature. The absolute value of each sample is also
summed into a temporary variable, which is also divided by
the frame window length to compute the sample mean value.
This is divided by the root-mean-square of the samples in the
frame window, to compute the mean/RMS ratio feature at
block 40. Additionally, the mean energy value is stored for
each block of 10624 samples within the frame. The absolute
value of the difference from block to block is then averaged
to compute the mean energy delta feature at block 42. These
features are then stored in a frame feature vector at block 44.
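The three time-domain features of FIG. 3 can be sketched directly from this description. One assumption is flagged in the code: the patent does not spell out the "mean energy" formula, so the sketch uses the mean absolute sample value per block.

```python
import math


def time_domain_features(frame, block_size=10_624):
    """Sketch of the FIG. 3 time-domain features for one frame of samples."""
    n = len(frame)
    # Zero crossing mean: count sign changes, ignoring zero samples.
    crossings, prev_sign = 0, 0
    for s in frame:
        if s == 0:
            continue
        sign = 1 if s > 0 else -1
        if prev_sign and sign != prev_sign:
            crossings += 1
        prev_sign = sign
    zero_crossing_mean = crossings / n
    # Mean absolute sample value divided by the RMS of the frame.
    abs_mean = sum(abs(s) for s in frame) / n
    rms = math.sqrt(sum(s * s for s in frame) / n)
    mean_rms_ratio = abs_mean / rms if rms else 0.0
    # Mean energy delta: average |difference| between successive block means.
    # ("Energy" taken here as the mean absolute value per block -- an assumption.)
    means = [sum(abs(s) for s in frame[i:i + block_size]) / len(frame[i:i + block_size])
             for i in range(0, n, block_size)]
    deltas = [abs(b - a) for a, b in zip(means, means[1:])]
    energy_delta = sum(deltas) / len(deltas) if deltas else 0.0
    return zero_crossing_mean, mean_rms_ratio, energy_delta
```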
[0035] Having completed the detailed explanation of block
34 of FIG. 2 as shown at FIG. 3, reference is made
back to FIG. 2, where the process continues at block 46. At
this block, a Haar wavelet transform, with a transform size of
64 samples, using 1:2 for the high pass and low pass
components of the transform, is computed across the frame
samples. Each transform is overlapped by 50%, and the
resulting coefficients are summed into a 64 point array. Each
point in the array is then divided by the number of transforms
that have been performed, and the minimum array
value is stored as the normalization value. The absolute
value of each array value minus the normalization value is
then stored in the array, any values less than 1 are set to 0,
and the final array values are converted to log space using
the equation array[i] = 20*log10(array[i]). These log scaled
values are then sorted into ascending order, to create the
wavelet domain feature bank at block 48.
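The coefficient post-processing in this paragraph (average, normalize by the minimum, threshold at 1, convert to log space, sort) is concrete enough to sketch; the wavelet transform itself is assumed done by the caller, which supplies the 64-point array of summed coefficients.

```python
import math


def wavelet_feature_bank(coef_sums, num_transforms):
    """Sketch of the paragraph [0035] wavelet post-processing.
    `coef_sums` is the array of Haar coefficients summed over all
    overlapped transforms in the frame."""
    # Average the summed coefficients over the number of transforms.
    arr = [c / num_transforms for c in coef_sums]
    # Subtract the minimum (normalization value), keeping absolute residuals.
    norm = min(arr)
    arr = [abs(c - norm) for c in arr]
    # Zero out residuals below 1, convert the rest to log space.
    arr = [0.0 if c < 1 else 20 * math.log10(c) for c in arr]
    # Sort ascending to form the wavelet-domain feature bank.
    return sorted(arr)
```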
[0036] Subsequent to the wavelet computation, a Blackman-Harris
window of 64 samples in length is applied at
block 50, and a Fast Fourier transform is computed at block
52. The resulting power bands are summed in a 32 point
array, converted to a log scale using the equation spec[i] =
log10(spec[i]/4096)+6, and then the difference from the
previous transform is summed in a companion spectral band
delta array of 32 points. This is repeated, with a 50% overlap
between each transform, across the entire frame window.
Additionally, after each transform is converted to log scale,
the sum of the second and third bands, times 5, is stored in
an array, beatStore, indexed by the transform number.
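The per-window band processing can be sketched as follows, taking the 32 summed power bands as input; the FFT itself, the Blackman-Harris windowing, and the 50% window overlap are assumed handled by the caller, and the function name is illustrative.

```python
import math


def spectral_bands(power, prev=None):
    """Sketch of the paragraph [0036] band processing for one FFT window.
    `power` is the 32-point array of summed power-spectrum bands; `prev`
    is the previous window's log-scaled bands, if any."""
    # Convert each band to log scale: spec[i] = log10(spec[i]/4096) + 6.
    log_bands = [math.log10(p / 4096) + 6 for p in power]
    # Companion delta array: |difference| from the previous window's bands.
    deltas = ([abs(b - a) for a, b in zip(prev, log_bands)]
              if prev is not None else [0.0] * len(log_bands))
    # Beat-tracking input: sum of the second and third bands, times 5.
    beat_value = (log_bands[1] + log_bands[2]) * 5
    return log_bands, deltas, beat_value
```

Repeated across the frame, the `beat_value` results fill the beatStore array consumed by the beat tracker below.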
[0037] After the calculation of the last Fourier transform,
the spectral domain features are computed at block 54. More
specifically, this corresponds to FIGS. 4 and 5. The beatStore
array is processed using the beat tracking algorithm
described in FIG. 5. The minimum value in the beatStore
array is found, and each beatStore value is adjusted such that
beatStore[i] = beatStore[i] - minimum_val. Then, the maximum
value in the beatStore array is found, and a constant,
beatmax, is declared which is 80% of the maximum value in
the beatStore array. For each value in the beatStore array
which is greater than the beatmax constant, if all the
beatStore values within +-4 array slots are less than the current
value, and it has been more than 14 slots since the last
detected beat, a beat is detected and the BPM feature is
incremented.
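This beat-tracking pass reads almost directly as code; the parameter names are illustrative, and the handling of the very first beat (no previous beat exists yet) is an assumption.

```python
def count_beats(beat_store, neighborhood=4, min_gap=14, peak_frac=0.80):
    """Sketch of the FIG. 5 beat-tracking pass over the beatStore array."""
    # Normalize: subtract the minimum value from every entry.
    lo = min(beat_store)
    vals = [v - lo for v in beat_store]
    beat_max = max(vals) * peak_frac
    beats, last_beat = 0, -min_gap - 1  # assumption: first beat always eligible
    for i, v in enumerate(vals):
        if v <= beat_max:
            continue
        # A beat: a local maximum over +-4 slots, more than 14 slots after
        # the previously detected beat.
        window = vals[max(0, i - neighborhood):i] + vals[i + 1:i + 1 + neighborhood]
        if all(w < v for w in window) and i - last_beat > min_gap:
            beats += 1
            last_beat = i
    return beats
```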
[0038] Upon completing the spectral domain calculations,
the frame finalization process described in FIG. 6 is used to
clean up the final frame feature values. First, the spectral
power band means are converted to spectral residual bands
by finding the minimum spectral band mean, and subtracting
it from each spectral band mean. Next, the sum of the spectral
residuals is stored as the spectral residual sum feature.
Finally, depending on the aggregation type, the final frame
vector, consisting of the spectral residuals, the spectral
deltas, the sorted wavelet residuals, the beats feature, the
mean/RMS ratio, the zero crossing rate, and the mean energy
delta feature, is stored. In the concatenation model, the frame
vector is concatenated with any other frame vectors to form
a final fingerprint vector. In the aggregation model, each
frame vector is stored in a final fingerprint set, where each
vector is kept separate.
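The spectral-residual step of this finalization is short enough to show whole; the function name is illustrative.

```python
def spectral_residuals(band_means):
    """Sketch of the FIG. 6 residual step: subtract the minimum spectral
    band mean from every band, then sum the residuals."""
    floor = min(band_means)
    residuals = [m - floor for m in band_means]
    return residuals, sum(residuals)
```

Subtracting the per-frame minimum removes a constant gain component, which is one reason the residuals survive distortions such as normalization.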
[0039] In the preferred system, the fingerprint resolution
component is located on a central server, although methods
using a partitioning scheme based on the fingerprint database
hash tables can also be used in a distributed system.
Depending on the type of fingerprint to be resolved, the
architecture of the server will be similar to FIG. 7 for
concatenation model fingerprints, and similar to FIG. 8 for
aggregation style fingerprints. Both models share several
data tables, such as the feature vector->identifier database,
the feature vector hash index, and the feature class->comparison
weights and match distance tuple table. Within the
concatenation system, the identifiers in the feature vector->
identifier database are unique GUIDs, which allows the
return of a unique identifier for an identified fingerprint. The
aggregation match server has several additional tables. The
cluster ID occurrence rate table shows the overall occurrence
rate of any given feature vector, for the probability
functions within the match algorithm. The feature vector
cluster table is a mapping from any feature vector to the
cluster ID which identifies all the nearest neighbor feature
vectors for a given feature vector. In the aggregation system,
a unique integer or similar value is used in place of the
GUID, since the fingerprint string database contains the
GUID for aggregation fingerprints. The fingerprint string
database consists of the identifier streams associated with a
given fingerprint, and the cluster IDs for each component
within the identifier stream. Finally, the cluster ID->string
location table consists of a mapping between every cluster
ID and all the string fingerprints that contain a given cluster
ID.
[0040] To resolve an incoming concatenation fingerprint,
the match algorithm described in FIG. 9 is used. First, a
check is performed to see if more than one feature class
exists, and if so, the incoming feature vector is compared
against each reference class vector, using the comparison
function in FIG. 10 and a default weight set. The feature
class with the shortest distance to the incoming feature
vector is used to load an associated comparison function
weight scheme and match distance. Next, using the feature
vector database hash index, which subdivides the reference
feature vector database based on the highest weighted features
in the vector, the nearest neighbor feature vector set of
the incoming feature vector is loaded. Next, each loaded
feature vector in the nearest neighbor set is compared, using
the loaded comparison weight scheme. If any of the reference
vectors have a distance less than the loaded match
threshold, the linked GUID for that reference vector is
returned as the match for the incoming feature vector. If
none of the nearest neighbor vectors are within the match
threshold, a new GUID is generated, and the incoming
feature vector is added to the reference database, allowing
the system to organically add to the reference database as
signals are encountered. Additionally, the step of re-averaging
the feature values of the matched feature vector can be
taken, which consists of multiplying each feature vector
field by the number of times it has been matched, adding the
values of the incoming feature vector, dividing by the now
incremented match count, and storing the resulting means in
the reference database entry. This helps to reduce fencepost
error, and move a reference feature vector to the center of the
spread for different quality observations of a signal, in the
event the initial observations were of an overly high or low
quality.
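The core of this match step, together with the FIG. 10 weighted Manhattan distance and the re-averaging refinement, can be sketched as below. The dictionary-based reference store and the GUID scheme are stand-ins for the hash-indexed database, and the nearest-neighbor pre-filter is assumed already applied, so this is an illustration of the decision logic only.

```python
def weighted_distance(v1, v2, weights):
    """FIG. 10 comparison function: weighted Manhattan (city-block) distance."""
    return sum(abs(a - b) * w for a, b, w in zip(v1, v2, weights))


def resolve(incoming, reference, weights, match_threshold):
    """Sketch of the FIG. 9 decision: return the identifier of the closest
    reference vector within the match threshold, or index a new signal.
    `reference` maps identifier -> vector (hypothetical in-memory stand-in)."""
    best_guid, best_dist = None, None
    for guid, vec in reference.items():
        d = weighted_distance(incoming, vec, weights)
        if best_dist is None or d < best_dist:
            best_guid, best_dist = guid, d
    if best_dist is not None and best_dist < match_threshold:
        return best_guid                       # matched an existing signal
    new_guid = "guid-%d" % (len(reference) + 1)  # hypothetical GUID scheme
    reference[new_guid] = list(incoming)         # organically index the new signal
    return new_guid


def reaverage(vec, incoming, match_count):
    """Paragraph [0040] re-averaging: fold a new observation into the
    stored reference vector's running mean."""
    return [(v * match_count + x) / (match_count + 1)
            for v, x in zip(vec, incoming)]
```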
[0041] Resolution of an aggregation fingerprint is essentially
a two level process. First, the individual feature
vectors within the aggregation fingerprint are resolved,
using essentially the same process as the concatenation
fingerprint, with the modification that instead of returning a
GUID, the individual signatures return a subsig ID and a
cluster ID, which indicates the nearest neighbor set that a
given subsig belongs to. After all the aggregated feature
vectors within the fingerprint are resolved, a string fingerprint,
consisting of an array of subsig ID and cluster ID
tuples, is formed. This format allows for the recognition of
signal patterns within a larger signal stream, as well as the
detection of a signal that has been reversed. Matching is
performed by subdividing the incoming string fingerprint
into smaller chunks, such as the subsigs which correspond to
10 seconds of a signal, looking up which cluster ID within
that window has the lowest occurrence rate in the overall
feature database, loading the reference string fingerprints
which share that cluster ID, and doing a run length match
between those loaded string fingerprints and the incoming
fingerprint. Additionally, the number of matches and mismatches
between the reference string fingerprint and the
incoming fingerprint are stored. This is used instead of
summed distances, because several consecutive mismatches
should trigger a mismatch, since that indicates a strong
difference in the signals between two fingerprints. Finally, if
the match vs. mismatch rate crosses a predefined threshold,
a match is recognized, and the GUID associated with the
matched string fingerprint is returned.
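The match/mismatch decision at the end of this process can be sketched as below. The 25% mismatch threshold is a made-up value, and both the alignment at the lowest-occurrence cluster ID and the consecutive-mismatch early exit of FIG. 12 are omitted for brevity, so this shows only the rate-based accept/reject idea.

```python
def string_match(fp_a, fp_b, mismatch_threshold=0.25):
    """Sketch of the FIG. 12 comparison: walk two cluster-ID streams in
    lockstep, counting matches and mismatches, and accept when the
    mismatch rate stays under a (hypothetical) threshold."""
    matches = mismatches = 0
    for a, b in zip(fp_a, fp_b):
        if a == b:
            matches += 1
        else:
            mismatches += 1
    total = matches + mismatches
    return total > 0 and mismatches / total < mismatch_threshold
```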
`0.042
`Additional variants on this match routine include
`Searching forwards and backwards for matches, So as to
`detect reversed signals, and accepting a continuous Stream of
`aggregation feature vectors, Storing a trailing window, Such
`as 30 Seconds of Signal, and only returning a GUID when a
`match is finally detected, advancing the Search window as
`more fingerprint SubSigs are Submitted to the Se