Case 1:14-cv-02396-PGG-MHD Document 153-15 Filed 06/28/19 Page 1 of 10
EXHIBIT N
US005874686A

United States Patent [19]                [11] Patent Number:  5,874,686
Ghias et al.                             [45] Date of Patent: Feb. 23, 1999

[54] APPARATUS AND METHOD FOR SEARCHING A MELODY

[76] Inventors: Asif U. Ghias, 2306 Sultana Dr., Yorktown Heights, N.Y. 10598; Jonathan A. Logan, 1186 Palomino Rd., Santa Barbara, Calif. 93105; David W. Chamberlin, 206 Flynn Ave., Mountain View, Calif. 94043

[21] Appl. No.: 742,260

[22] Filed: Oct. 31, 1996

Related U.S. Application Data

[60] Provisional application No. 60/008,177, Oct. 31, 1995.

[51] Int. Cl.: A63H 5/00; G04B 13/00; G10H 7/00
[52] U.S. Cl.: 84/609; 84/616
[58] Field of Search: 84/609, 616, 654

[56] References Cited

U.S. PATENT DOCUMENTS

4,688,464   8/1987   Gibson et al.
4,771,671   9/1988   Hoff, Jr. ................. 84/645
5,038,658   8/1991   Tsuruta et al.
5,088,380   2/1992   Minamitaka ................ 84/637
5,402,339   3/1995   Nakashima et al.
5,510,572   4/1996   Hayashi et al. ............ 84/609
5,619,004   4/1997   Dame

FOREIGN PATENT DOCUMENTS

405061917A   3/1993   Japan

OTHER PUBLICATIONS

Hawley, Michael J., Structure out of Sound, 1993, MIT Doctoral Thesis, Abstract.

Primary Examiner: William M. Shoop, Jr.
Assistant Examiner: Jeffrey W. Donels
Attorney, Agent, or Firm: Hodgson, Russ, Andrews, Woods & Goodyear LLP

[57] ABSTRACT

Apparatus and method for easily, efficiently, and accurately searching a melody, wherein a melody is inputted to a computer and converted into a form of a sequence of digitized representations of relative pitch differences between successive notes thereof. A melody database is searched for at least one sequence of digitized representations of relative pitch differences between successive notes which at least approximately matches the sequence of digitized representations of relative pitch differences between successive notes of the melody.

16 Claims, 3 Drawing Sheets
[Front-page drawing: reduced copy of FIG. 1, the system block diagram showing the hummed query 20, computer, wavelet pitch tracker (low-pass filter, center-clipping, cubic spline wavelet pitch tracker), autocorrelation based pitch detector, melody database, query engine, and ranked list of matching melodies.]
U.S. Patent          Feb. 23, 1999          Sheet 1 of 3          5,874,686

[FIG. 1: block diagram of the apparatus. A hummed query enters the computer, where a pitch tracker (low-pass filter, center-clipping, cubic spline wavelet tracker, autocorrelation based pitch detector) feeds a query engine that searches the melody database and outputs a ranked list of matching melodies, 26.]
U.S. Patent          Feb. 23, 1999          Sheet 2 of 3          5,874,686

[FIG. 2: a single excitation pulse 30, amplitude (0 to 0.9) vs. time (0 to 20).]

[FIG. 3: a train of excitation pulses 30, amplitude (0 to 0.9) vs. time (0 to 400).]
U.S. Patent          Feb. 23, 1999          Sheet 3 of 3          5,874,686

[FIG. 4: the formant structure 40, with peaks 42, amplitude (-2 to 2.5) vs. time (0 to 80).]

[FIG. 5: the synthesized pitch, amplitude (-4 to 5) vs. time (0 to 5).]
CROSS REFERENCE TO A RELATED APPLICATION

This application claims priority of provisional application Ser. No. 60/008,177, filed Oct. 31, 1995, which is hereby incorporated herein by reference.
BACKGROUND OF THE INVENTION

The present invention relates generally to database searching. More particularly, the present invention relates to melody searching.

Next generation databases should include image, audio, and video data in addition to traditional text and numerical data. These data types will require query methods that are more appropriate and natural to the type of the respective data. For instance, a natural way to query an image database is to retrieve images based on operations on images or sketches supplied as input. Similarly, a natural way of querying an audio database (of songs) is to hum the tune of a song, as apparently addressed in T. Kageyama and Y. Takashima, "A Melody Retrieval Method With Hummed Melody" (language: Japanese), Transactions of the Institute of Electronics, Information and Communication Engineers D-II, J77D-II(8): 1543-1551, August 1994. Such a system would be useful in any multimedia database containing musical data by providing an alternative and natural way of querying. One can also imagine widespread use of such a system in the commercial music industry, music radio and TV stations, music stores, and even for one's personal use.

It has been observed that melodic contour, defined as the sequence of relative differences in pitch between successive notes, can be used to discriminate between melodies. See Stephen Handel, Listening: An Introduction to the Perception of Auditory Events, The MIT Press, 1989, which indicates that melodic contour is one of the most important methods that listeners use to determine similarities between melodies. In Michael Jerome Hawley, Structure out of Sound, PhD thesis, MIT, September 1993, a method of querying a collection of melodic themes by searching for exact matches of sequences of relative pitches input by a MIDI keyboard is briefly discussed.

U.S. Pat. No. 5,510,572 discloses utilizing pitch differences between successive notes in classifying motion for a melody analyzer and harmonizer, wherein a search may be incidentally used to find an appropriate chord progression to, for example, harmonize music so as to accompany a singer. Other art which may be of interest includes U.S. Pat. Nos. 5,040,081; 5,146,833; 5,140,886; 4,688,464; and 5,418,322.
SUMMARY OF THE INVENTION

It is an object of the present invention to easily, efficiently, and accurately search melodies.

In order to easily, efficiently, and accurately search melodies, in accordance with the present invention, a computer means is provided which has a database of melodies each including a plurality of notes in a form of a sequence of digitized representations of relative pitch differences between successive notes; a melody is inputted to the computer means and converted into a form of a sequence of digitized representations of relative pitch differences between successive notes thereof; and the melody database is searched for at least one sequence of digitized representations of relative pitch differences between successive notes which at least approximately matches the sequence of digitized representations of relative pitch differences between successive notes of the melody.

The above and other objects, features, and advantages of the present invention will be apparent in the following detailed description of a preferred embodiment thereof when read in conjunction with the accompanying drawings, wherein the same reference numerals denote the same or similar parts throughout the several views.
`
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating an apparatus and method which embodies the present invention.

FIGS. 2 and 3 are graphs illustrating an excitation signal and a train of such excitation signals, respectively, used to create a synthesized pitch for the method and apparatus.

FIG. 4 is a graph of a formant structure which is formed from certain formant frequencies.

FIG. 5 is a graph of a synthesized pitch created by convolving the train of excitation pulses of FIG. 3 and the formant structure of FIG. 4.
`
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Referring to FIG. 1, there is illustrated generally at 10 apparatus for searching a melody, which is represented by a series of successive notes, illustrated at 12, which of course have different pitches. In other words, it may be desirable to know the identity of the particular melody or tune 12, and a database, illustrated at 14, of melodies or tunes is searched, as described hereinafter, to locate at least one melody or tune which at least approximately matches the tune 12. The database 14 is shown to be contained within a general purpose computer 16, but, alternatively, the database 14 may be located apart from the computer 16 and suitably connected thereto for communicating between the computer and database. Both of these alternatives are meant to come within the scope of the present invention.

In accordance with the present invention, the tune 12 is hummed by a person 18 into a microphone 20, and the hummed query, illustrated at 21, is suitably digitized, in accordance with principles commonly known to those of ordinary skill in the art to which this invention pertains, and the digitized signals of the hummed query 21 are fed to a pitch tracking module 22 in computer 16. The pitch tracker assembles a contour representation of the hummed melody 12, as hereinafter discussed in greater detail, which is fed to a query engine, illustrated at 24. The query engine 24 searches the melody database 14 and outputs a ranked list of approximately matching melodies, as illustrated at 26. A preselected error tolerance may be applied to the search. The query engine 24 may of course alternatively be programmed to output the single most approximate matching melody or, if desired, to output an exact matching melody. However, by searching for an approximate matching melody, as hereinafter discussed, various forms of anticipated errors may be taken into account.
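The flow just described, from hummed query 21 through pitch tracker 22 and query engine 24 to the ranked list 26, can be sketched in outline. This is an illustrative sketch only; the function names and the stand-in pitch tracker are invented here, and the contour and matching details are the ones the patent develops later.

```python
def track_pitches(audio):
    """Stand-in for pitch tracker 22: assume `audio` has already been
    reduced to a list of pitch estimates in Hz (hypothetical)."""
    return audio

def to_relative_contour(pitches):
    """Contour representation: one U/D/S symbol per successive pitch pair."""
    return "".join(
        "U" if b > a else "D" if b < a else "S"
        for a, b in zip(pitches, pitches[1:])
    )

def count_errors(key, song):
    """Fewest mismatches of the key against any window of a song contour."""
    if len(key) > len(song):
        return len(key)
    return min(
        sum(a != b for a, b in zip(key, song[i:i + len(key)]))
        for i in range(len(song) - len(key) + 1)
    )

def search_melody(audio, database, tolerance=1):
    """Query engine 24: rank songs by mismatch count, best matches first."""
    key = to_relative_contour(track_pitches(audio))
    scored = sorted(
        (count_errors(key, contour), title)
        for title, contour in database.items()
    )
    return [title for errors, title in scored if errors <= tolerance]
```

For example, `search_melody([67, 67, 67, 63, 65, 65, 65, 62], {"A": "SSDUSSD"}, 0)` returns `["A"]`.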
The database 14 of melodies may be acquired, for example, by processing public domain MIDI songs, or may otherwise be suitably acquired, and may be stored as a flat-file database. Pitch tracking may be performed in, for example, Matlab software, a product of The MathWorks, Inc. of Natick, Mass., chosen for its built-in audio processing capabilities and the ease of testing a number of algorithms within it. Hummed queries may be recorded in a variety of
formats, depending upon the platform-specific audio input capabilities of the Matlab software. For example, the format may be a 16-bit, 44 kHz WAV format on a Pentium system, or an 8-bit, 8 kHz AU format on a Sun SPARCstation. The query engine 24 may use an approximate pattern matching algorithm, described hereinafter, in order to tolerate humming errors.

User input 12 (humming) to the system 10 is converted into a sequence of relative pitch transitions, as follows. A note 12 in the input may be classified in one of three ways: a note is either the same as the previous note (S), higher than the previous note (U), or lower than the previous note (D). Thus, the input is converted into a string with a three-letter alphabet (U, D, S). For example, the introductory theme of Beethoven's 5th Symphony would be converted into the sequence: - S S D U S S D (the first note is ignored, as it has no previous pitch). For another example, the melody portion 12, as illustrated in FIG. 1 with one high note between two lower notes, would be converted into the sequence: U D.

To accomplish this conversion, a sequence of pitches in the melody must be isolated and tracked. This is not as straightforward as it sounds, however, as there is still considerable controversy over exactly what pitch is. The general concept of pitch is clear: given a note, the pitch is the frequency that most closely matches what one hears. Performing this conversion in a computer can become troublesome because some intricacies of human hearing are still not clearly understood. For instance, if one plays the 4th, 5th, and 6th harmonics of some fundamental frequency, one actually hears the fundamental frequency, not the harmonics, even though the fundamental frequency is not present. This phenomenon was first discovered by Schouten in some pioneer investigations carried out from 1938 to 1940. Schouten studied the pitch of periodic sound waves produced by an optical siren in which the fundamental of 200 Hz was cancelled completely. The pitch of the complex tone, however, was the same as that prior to the elimination of the fundamental. See R. Plomp, Aspects of Tone Sensation, Academic Press, London, 1976.

Because of the interest in tracking pitch in humming, methods were examined for automatically tracking pitch in a human voice. Before one can estimate the pitch of an acoustic signal, one must first understand how this signal is created, which requires forming a model of sound production at the source. The vibrations of the vocal cords in voiced sounds are caused as a consequence of forces that are exerted on the laryngeal walls when air flows through the glottis (the gap between the vocal cords). The terms "vocal folds" and "vocal cords" are more or less used as synonyms in the literature. Wolfgang Hess, Pitch Determination of Speech Signals, Springer-Verlag, Berlin Heidelberg, 1983, describes a model of the vocal cords as proposed by M. Hirano, "Structure and Vibratory Behavior of the Vocal Folds," in M. Sawashima and F. S. Cooper, editors, Dynamic Aspects of Speech Production, pages 13-27, University of Tokyo Press, 1976. For the purposes of the present application, though, it is sufficient to know that the glottis repeatedly opens and closes, thus providing bursts of air through the vocal tract.

The vocal tract can be modeled as a linear passive transmission system with a transfer function H(z). If one adds an additional transfer function R(z) which takes into account the radiation, the output impedance of the vocal tract can approximately be set to zero. In the neutral position, where the vocal tract can be regarded as a uniform tube, the resonances of the vocal tract occur at sound wavelengths of

    wk = 4L / (2k - 1); k = 1, 2, 3, ...

With L = 17 cm (the average value of vocal-tract length) and a sound propagation speed of c = 340 meters/sec., the frequencies of these resonances will be:

    Fk = c / wk = (2k - 1) * 500 Hz; k = 1, 2, 3, ...

The frequencies Fk are called formant frequencies.

The resulting sound that is heard is considered to be the convolution of the excitation pulse, illustrated at 30 in FIG. 2, created by the glottis, and the formant frequencies. Therefore, if one wants to model a speech signal, one starts with a train of excitation pulses 30, as shown in FIG. 3. The period T in the train of excitations 30 is 0.01 sec., making the pitch 100 Hz. For the formant frequencies, the above equation for Fk is used with k in {1, 2, 3}. This gives formant frequencies: F1 = 500 Hz, F2 = 1500 Hz, and F3 = 2500 Hz. Combining these frequencies and adding an exponential envelope produces the formant structure shown at 40 in FIG. 4. By convolving the train of excitation pulses 30 with the formant structure 40, a synthesized pitch, as shown at 50 in FIG. 5, is obtained.

Since what one hears as pitch is considered to actually be the frequency at which the bursts of air occur, one may be able to find the pitch of the segment by tracking the bursts of air.

There are at least three approaches to tracking pitch.

The Maximum Likelihood approach, which is described by James D. Wise, James R. Caprio, and Thomas W. Parks in "Maximum Likelihood Pitch Estimation," IEEE Trans. Acoust., Speech, Signal Processing, 24(5): 418-423, October 1976, is a modification of autocorrelation (hereinafter discussed) that increases the accuracy of the pitch and decreases the chances of aliasing. However, this approach is very slow due to its computational complexity. An implementation in Matlab software may typically take approximately one hour or longer to evaluate 10 seconds of audio on a 90 MHz Pentium workstation. With some optimizations, the performance may be improved to approximately 15 minutes per 10 seconds of audio, but this is still far too slow for the purposes of this invention.

The Cepstrum Analysis approach, which is discussed in A. V. Oppenheim, "A Speech Analysis-Synthesis System Based on Homomorphic Filtering," J. Acoustical Society of America, 45: 458-465, February 1969, and in Alan V. Oppenheim and Ronald W. Schafer, Discrete-Time Signal Processing, Prentice Hall, Englewood Cliffs, N.J., 1989, does not give very accurate results for humming.

In order to provide both rapid and accurate pitch tracking, a preferred pitch tracking approach is autocorrelation, which is described in L. R. Rabiner, J. J. Dubnowski, and R. W. Schafer, "Real-Time Digital Hardware Pitch Detector," IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-24(1): 2-8, February 1976. This approach isolates and tracks the peak energy levels of the signal, which is a measure of the pitch. Referring back to FIG. 4, the signal s(n), illustrated at 40, peaks where the impulses occur. Therefore, tracking the frequency of these peaks 42 should provide the pitch of the signal. In order to get the frequency of these peaks 42, autocorrelation is employed, which is defined by:
    A(l) = SUMn h(n) h(n + l)

where l is the lag and h is the input signal.

While autocorrelation is subject to aliasing (picking an integer multiple of the actual pitch) and is computationally complex, it still provides greater speed and accuracy as compared to the previously discussed approaches. The present implementation of autocorrelation was found to require approximately 45 seconds for 10 seconds of 44 kHz, 16-bit audio on a 90 MHz Pentium workstation.

In order to improve the performance, speed, and robustness of the pitch-tracking algorithm, a cubic-spline wavelet transform or other suitable wavelet transform may be used. The cubic spline wavelet peaks at discontinuities in the signal (i.e., the air impulses). One of the most significant features of the wavelet analysis is that it can be implemented in a filter bank structure and therefore computed in O(n) time.

It is preferred that the cubic spline wavelet implementation be an event-based implementation of the tracker.

The cubic spline wavelet transform is linear and shift-invariant, which are useful properties for speech signals, since they are often modelled as a linear combination of shifted and damped sinusoids.

If a signal x(t) or its derivatives have discontinuities, then the modulus of the cubic spline transform of x(t) exhibits local maxima around the points of discontinuity. This is considered to be a desirable property for a cubic spline wavelet pitch-tracker, since glottal closure causes sharp changes in the derivative of the air flow in the glottis and transients in the speech signal.

It has been shown in S. Mallat and W. L. Hwang, "Singularity Detection and Processing with Wavelets," Technical Report, Courant Institute of Mathematical Sciences, New York, March 1991, that if one chooses a wavelet function g(t) that is the first derivative of a smoothing function (a smoothing function is a function whose Fourier transform has energy concentrated in the low frequency region) P(t), then the local maxima of the transform indicate the sharp variations in the signal, whereas the local minima indicate the slow variations. Hence, the local maxima of the transform using a wavelet which is the first derivative of a smoothing function are considered to be desirable for detecting the abrupt changes or transients in the speech signal caused by glottal closure.

One of the most appealing features of the cubic spline wavelet transform is that it can be implemented in a filter bank structure. It therefore is O(n) fast.

Preferably, the autocorrelation pitch tracking approach includes low-pass filtering (for removing non-vocal frequencies) and center-clipping, as described in M. M. Sondhi, "New Methods of Pitch Extraction," IEEE Trans. Audio Electroacoust. (Special Issue on Speech Communication and Processing Part II), AU-16: 262-266, June 1968. It is considered that these will help eliminate the formant structure that generally causes difficulty for autocorrelation based pitch detectors.

This preferred form of autocorrelation takes between 20 and 45 seconds on a Sparc10 workstation to process typical sequences of hummed notes. A brute-force search of the database unsurprisingly shows linear growth with the size of the database, but remains below 4 seconds for 100 songs on a Sparc2. Therefore, the search time is currently effectively limited by the efficiency of the pitch-tracker.

In order to search the database, songs in the database 14 are preprocessed to convert the melody into a stream of the previously discussed U, D, S characters, and the converted user input (the key 23) is compared with all the songs.

The pattern-matching desirably uses a "fuzzy" search to allow for errors within the matches. These errors reflect the inaccuracies in the way people hum as well as errors in the representation of the songs themselves.

For performing the key-search within the database 14, it is considered desirable to use an efficient approximate pattern matching algorithm. By "approximate" is meant that the algorithm should be able to take into account various forms of errors.

Using the word CASABLANCA as an example, the following are examples of the types of errors which may occur in a typical pattern matching scheme.

                        In CASABLANCA    In a melody
  Transposition error   CASABALNCA       two notes reversed
  Dropout error         CAS BLANCA       a note dropped
  Duplication error     CASAABLANCA      a note duplicated
  Insertion error       CASABTLANCA      a note inserted

Several algorithms have been developed that address the problem of approximate string matching. Running times have ranged from O(mn) for the brute force algorithm to O(kn) or O(n log(m)), where "O" means "on the order of," m is the number of pitch differences in the query, and n is the size of the string (song). See Ricardo Baeza-Yates and G. H. Gonnet, "Fast String Matching with Mismatches," Information and Computation, 1992. A preferred algorithm which is considered to offer better performance in general for this purpose is that described in Ricardo A. Baeza-Yates and Chris H. Perleberg, "Fast and Practical Approximate String Matching," Combinatorial Pattern Matching, Third Annual Symposium, pages 185-192, 1992.

This algorithm addresses the problem of string matching with k mismatches. The problem consists of finding all instances of a pattern string P = p1 p2 p3 ... pm in a text string T = t1 t2 t3 ... tn such that there are at most k mismatches (characters that are not the same) for each instance of P in T. When k = 0 (no mismatches) one has the simple string matching problem, solvable in O(n) time. When k = m, every substring of T of length m qualifies as a match, since every character of P can be mismatched. Each of the errors in the above CASABLANCA example corresponds to k = 1.

The worst case for this algorithm occurs when P (the key 23) consists of m occurrences of a single distinct character, and T (the contour representation of a song) consists of n instances of that character. In this case the running time is O(mn). However, this is neither a common nor useful situation for the present invention. In the average case of an alphabet in which each character is equally likely to occur, the running time is

where r is the size of the alphabet.

The computer 16 may desirably be programmed so that, for a given query, the database 14 returns a list of songs ranked by how well they matched the query, not just one best match. The number of matches that the database 14 should retrieve depends upon the error-tolerance used during the key-search. This error-tolerance could be set in one of two possible ways: either it can be a user-definable parameter, or the database can itself determine this parameter based on, for example, heuristics that depend on the length of the key. This would give the user an opportunity to perform queries even if the user is not sure of some notes within the tune.

From the results of the query, the user can identify the song of interest. If the list is too large, the user can perform a new query on a restricted search list consisting of the songs just retrieved. This allows the user to identify sets of songs that contain similar melodies.
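The key-search with k mismatches described above can be illustrated with a deliberately brute-force sketch. This shows only the semantics of the problem; it runs in O(mn) and is not the Baeza-Yates and Perleberg algorithm, and the names are invented here.

```python
def matches_with_k_mismatches(pattern, text, k):
    """Start positions where `pattern` occurs in `text` with at most k
    mismatching characters (the k-mismatches problem)."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if sum(a != b for a, b in zip(pattern, text[i:i + m])) <= k
    ]

# A drop-out error, as in the CASABLANCA example, is one mismatch:
print(matches_with_k_mismatches("CASABLANCA", "XXCAS BLANCAXX", 1))  # [2]
```

With k = 0 this degenerates to exact string matching; with k = m every window of the text matches, mirroring the boundary cases noted above.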
An experimental evaluation of the system tested the tolerance of the system with respect to input errors, whether from mistakes in the user's humming or from problems with the pitch-tracking.

Effectiveness of the system 10 is directly related to the accuracy with which pitches that are hummed can be tracked and the accuracy of the melodic information within the database 14. Under ideal circumstances, one can achieve close to 100% accuracy tracking humming, where "ideal circumstances" means that the user places a small amount of space between each note and hits each note strongly. For this purpose, humming short notes is encouraged. Even more ideal is for the user to aspirate the notes as much as possible, perhaps going so far as to voice a vowel, as in "haaa haaa haaa." Only male voices have been experimented with.
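For readers following the autocorrelation discussion, a minimal numerical sketch of pitch estimation on a clean synthetic note is given below. It is not the patent's Matlab implementation, it omits the low-pass filtering and center-clipping described earlier, and the 50 to 500 Hz search range is an assumption made here.

```python
import numpy as np

def autocorrelation_pitch(signal, rate, fmin=50.0, fmax=500.0):
    """Estimate pitch as the lag maximizing A(l) = sum_n h(n) h(n + l),
    searched over the lags corresponding to fmin..fmax Hz."""
    h = signal - np.mean(signal)
    ac = np.correlate(h, h, mode="full")[len(h) - 1:]  # lags 0..N-1
    lo, hi = int(rate / fmax), int(rate / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return rate / lag

rate = 8000
t = np.arange(0, 0.5, 1.0 / rate)
tone = np.sin(2 * np.pi * 100.0 * t)      # a clean 100 Hz "hummed" note
print(autocorrelation_pitch(tone, rate))  # 100.0
```

Real humming would first be cleaned up as the patent describes, since formant structure can mislead a bare autocorrelation detector.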
The evaluation database contained a total of 183 songs. Each song was converted from public domain General MIDI sources. Melodies from different musical genres were included, including both classical and popular music. A few simple heuristics were used to cut down on the amount of irrelevant information from the data; e.g., MIDI channel 10 was ignored, as this is reserved for percussion in the General MIDI standard. However, the database still contained a great deal of information unrelated to the main theme of the melody. Even with this limitation, it was discovered that sequences of 10 to 12 pitch transitions were sufficient to discriminate 90% of the songs.
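As an illustration of the contour keys behind these discrimination figures, the conversion from notes to the U/D/S alphabet and an exact (k = 0) membership test can be sketched as follows; the song titles and note numbers are hypothetical.

```python
def contour(notes):
    """U/D/S string of relative pitch changes (MIDI note numbers in);
    the first note contributes no symbol."""
    return "".join(
        "U" if b > a else "D" if b < a else "S"
        for a, b in zip(notes, notes[1:])
    )

# The opening of Beethoven's 5th (G G G Eb / F F F D) gives the
# sequence S S D U S S D quoted earlier in the patent:
assert contour([67, 67, 67, 63, 65, 65, 65, 62]) == "SSDUSSD"

def matching_songs(key, songs):
    """Titles whose stored contour contains the key with no mismatches."""
    return [title for title, c in songs.items() if key in c]
```

A key of 10 to 12 transition symbols, as reported above, is simply a 10 to 12 character string passed to such a search.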
As a consequence of using a fast approximate string matching algorithm, search keys can be matched with any portion of the melody, rather than just the beginning. As the size of the database grows larger, however, this may not prove to be an advantage.

Contour representations for each song are stored in separate files. Opening and closing files may become a significant overhead. Performance may therefore be improved by packing all the songs into one file, or by using a database manager. The programming code may be modernized to make it independent of any particular database schema.

The pattern matching algorithm in the form previously described does not discriminate between the various forms of pattern matching errors discussed earlier, but only accounts for them collectively. Some forms of errors may be more common than others, depending upon the way people casually hum different tunes. For example, drop-out errors, reflected as dropped notes in tunes, are more common than transposition or duplication errors. Tuning the key-search so that it is more tolerant to drop-out errors, for example, may yield better results.
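The tuning suggested above could be sketched as a weighted alignment cost in which a dropped note is charged less than other error types. This is an illustrative extension with invented weights, not a scheme the patent specifies.

```python
def weighted_errors(key, song, w_sub=1.0, w_ins=1.0, w_drop=0.5):
    """Minimum weighted cost to align a hummed key with a song contour:
    substitutions cost w_sub, extra hummed notes w_ins, and dropped
    notes only w_drop, making the search more dropout-tolerant."""
    m, n = len(key), len(song)
    prev = [j * w_drop for j in range(n + 1)]  # empty key vs song prefix
    for i in range(1, m + 1):
        cur = [i * w_ins]                      # key prefix vs empty song
        for j in range(1, n + 1):
            sub = prev[j - 1] + (0.0 if key[i - 1] == song[j - 1] else w_sub)
            cur.append(min(sub,                 # match or substitution
                           prev[j] + w_ins,     # extra note hummed
                           cur[j - 1] + w_drop))  # note dropped by hummer
        prev = cur
    return prev[n]

# Dropping the U from SSDUSSD costs 0.5; a substitution costs a full 1.0:
print(weighted_errors("SSDSSD", "SSDUSSD"))   # 0.5
print(weighted_errors("SSXUSSD", "SSDUSSD"))  # 1.0
```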
The generation of the melodic contours of the source songs automatically from MIDI data is convenient but not optimal. More accuracy and less redundant information could be obtained by entering the melodic themes for particular songs by keyboard.

The resolution of the relative pitch differences may be desirably increased by using query alphabets of three, five, and more possible relationships between adjacent pitches. Early experiments using an alphabet of five relative pitch differences (same, higher, much higher, lower, much lower) indicated changes of this sort are promising. One drawback
of introducing more resolution is that the users must be somewhat more accurate in the intervals they actually hum.

It should be understood that, while the present invention has been described in detail herein, the invention can be embodied otherwise without departing from the principles thereof, and such other embodiments are meant to come within the scope of the present invention as defined by the appended claims.

What is claimed is:

1. Apparatus for searching a melody comprising a computer means having a database of melodies each including a plurality of notes in a form of a sequence of digitized representations of relative pitch differences between successive notes, means for inputting a melody to said computer means, means for converting the melody into a form of a sequence of digitized representations of relative pitch differences between successive notes thereof, and means for searching said melody database for at least one sequence of digitized representations of relative pitch differences between successive notes which at least approximately matches said sequence of digitized representations of relative pitch differences between successive notes of the melody.

2. Apparatus according to claim 1 wherein said input means comprises a microphone.

3. Apparatus according to claim 1 further comprising means for outputting from said computer a ranked list of approximately matching melodies.

4. Apparatus according to claim 1 wherein said converting means comprises an autocorrelation based pitch detector means.

5. Apparatus according to claim 1 wherein said converting means comprises an autocorrelation based pitch detector means having low-pass filter means for removing non-vocal frequencies and center-clipping means for eliminating unwanted formant structure.

6. Apparatus according to claim 1 wherein said converting means comprises a wavelet pitch tracker.

7. Apparatus according to claim 1 wherein said converting means comprises a cubic spline wavelet pitch tracker.

8. Apparatus according to claim 1 further comprising means for approximately matching sequences of digitized representations of relative pitch differences between successive notes.

9. A method for searching a melody comprising the steps of: (a) selecting a computer means having a database of melodies each including a plurality of notes in a form of a sequence of digitized representations of relative pitch differences between successive notes; (b) inputting a melody to the computer means; (c) converting the melody into a form of a sequence of digitized representations of relative pitch differences between successive notes thereof; and (d) searching the melody database for at least one sequence of digitized representations of relative pitch differences between successive notes which at least approximately matches the sequence of digitized representations of relative pitch differences between successive notes of the melody.

10. A method according to claim 9 wherein the step of inputting the melody comprises humming into a microphone.

11. A method according to claim 9 further comprising operating the computer to output a ranked list of approximately matching melodies.

12. A method according to claim 9 wherein the step of converting comprises applying to the sequence of representations of relative pitch changes between successive notes of the hummed melody an autocorrelation based pitch detector.
13. A method according to claim 9 wherein the step of converting comprises applying to the sequence of representations of relative pitch changes between successive notes of the hummed melody an autocorrelation based pitch detector having low pass filtering for removing non-vocal frequencies and center-clipping for eliminating unwanted formant structure.

14. A method according to claim 9 wherein the step of converting comprises applying to the sequence of representations
