`US007743092B2
`
`Wood
`
`(10) Patent No.:      US 7,743,092 B2
`(45) Date of Patent:  Jun. 22, 2010
`
`(54) METHOD FOR RECOGNIZING AND DISTRIBUTING MUSIC
`
`(76) Inventor: Lawson A. Wood, 873 N. Frederick St.,
`Arlington, VA (US) 22205
`
`( * ) Notice: Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 0 days.
`
`(21) Appl. No.: 10/649,932
`
`(22) Filed: Aug. 28, 2003
`
`(65) Prior Publication Data
`
`US 2004/0049540 A1    Mar. 11, 2004
`
`Related U.S. Application Data
`
`(63) Continuation-in-part of application No. 09/438,469,
`filed on Nov. 12, 1999, now abandoned.
`
`(51) Int. Cl.
`G10H 7/00    (2006.01)
`G10L 17/00   (2006.01)
`(52) U.S. Cl. ........ 709/203; 704/246
`(58) Field of Classification Search ........ 705/27,
`705/51, 52; 84/699; 709/203
`See application file for complete search history.
`
`(56) References Cited
`
`U.S. PATENT DOCUMENTS
`
`5,734,719 A     3/1998   Tsevdos et al.
`5,745,556 A     4/1998   Ronen
`5,794,217 A     8/1998   Allen
`5,874,686 A     2/1999   Ghias et al.
`5,918,213 A     6/1999   Bernard et al.
`6,057,884 A     5/2000   Chen et al.
`6,282,549 B1    8/2001   Hoffert et al.
`6,385,596 B1    5/2002   Wiser et al.
`
`OTHER PUBLICATIONS
`
`Rodger J. McNab et al., “The New Zealand Digital Library Melody
`Index,” D-Lib Magazine (downloaded from the Internet) May 1997
`(11 Page Printout).
`Paul Flavin, web site entitled “Play a Piano/Synthesizer/Oscilloscope,”
`http://www.frontiernet.net/~imaging/play_a_piano.html,
`copyright date 1998, 2 pages.
`
`Primary Examiner: Wen-Tai Lin
`
`(57) ABSTRACT
`
`A customer for music distributed over the internet may select
`a composition from a menu of written identifiers (such as the
`song title and singer or group) and then confirm that the
`composition is indeed the one desired by listening to a
`corrupted version of the composition. If the customer has
`forgotten the song title or the singer or other words that provide
`the identifiers, he or she may hum or otherwise vocalize a few bars
`of the desired composition, or pick the desired composition
`out on a simulated keyboard. A music-recognition system
`then locates candidates for the selected composition and
`displays identifiers for these candidates to the customer.
`
`18 Claims, 7 Drawing Sheets
`
`[Front-page drawing: flow chart with CUSTOMER and MUSIC
`DISTRIBUTION COMPANY columns, showing the steps from ADDRESS MUSIC
`DISTRIBUTION CO. through STORE SELECTION; see FIG. 3.]
`
`Google Ex. 1015
`
`
`
`[Sheet 1 of 7. FIG. 1: CUSTOMER'S HOUSE 10, MUSIC DISTRIBUTION
`COMPANY, and FINANCIAL INSTITUTION. FIG. 2: monitor screen for MUSIC
`DISTRIBUTION CO. showing OPTION A (48), OPTION B (50), and the MUSIC
`RECOGNITION OPTION.]
`
`
`
`[Sheet 2 of 7. FIG. 3: flow chart with CUSTOMER and MUSIC
`DISTRIBUTION COMPANY columns. Box labels: START; ADDRESS MUSIC
`DISTRIBUTION CO.; DOWNLOAD OPTIONS; SELECT OPTION; DOWNLOAD
`INFORMATION FOR SELECTED OPTION; INPUT INFORMATION ABOUT DESIRED
`SONG OR ALBUM; DOWNLOAD PREVIEW INFORMATION; VERIFY SELECTION AND
`PROVIDE PAYMENT INFORMATION; VERIFY PAYMENT INFORMATION; DOWNLOAD
`SELECTION; STORE SELECTION; END.]
`
`
`
`[Sheet 3 of 7. Drawing text not recoverable from the extraction.]
`[Sheet 4 of 7. Flow chart; the partially legible box labels include
`SET SNIPPET DURATION TIMER and SET BLANK INTERVAL TIMER.]
`
`
`
`[Sheet 5 of 7. Drawing text not recoverable from the extraction.]
`
`
`
`[Sheet 6 of 7. Drawing text not recoverable from the extraction.]
`
`
`
`[Sheet 7 of 7. Drawing text not recoverable from the extraction.]
`
`
`
`METHOD FOR RECOGNIZING AND
`DISTRIBUTING MUSIC
`
`CROSS REFERENCE TO RELATED
`APPLICATION
`
`This is a continuation-in-part of application Ser. No.
`09/438,469, filed Nov. 12, 1999, now abandoned, the entire
`disclosure of which is incorporated herein by reference.
`
`BACKGROUND OF THE INVENTION
`
`The present application is directed to a method for recognizing
`and distributing music, and more particularly to a method for
`recognizing a musical composition from a specimen that is
`provided by a customer (as by humming, singing, or otherwise
`vocalizing the specimen or by picking it out on a simulated piano
`or other tone generator), and for permitting a customer to preview
`a musical composition before distributing the composition to the
`customer over the internet.
`The internet (and particularly the Worldwide Web) is
`becoming an important vehicle for distributing music, usually
`in encoded form. Web sites currently (1999) exist that distribute
`music in an encoded format known as “MP3.” So-called
`“juke box” programs are also available which permit MP3
`files that have been downloaded over the internet to be stored
`and played on audio systems. Some authorities speculate that
`distribution of music over the internet will eventually replace
`conventional record shops.
`Some customers who desire to purchase a recording at a
`record shop may be familiar with the music itself, but may not
`be sure of the singer or group that produced the music, or
`possibly the title of the relevant song or album. In a music
`shop, such a customer is able to question a shopkeeper, and
`possibly hum a few bars of the musical composition for the
`shopkeeper to attempt to identify. Alternatively, music stores
`frequently permit patrons to sample recordings before buying
`them, so a customer who is not sure which recording he or she
`would like to purchase may select a few possible recordings
`and listen to them until the desired recording is located. There
`is no harm in permitting a customer to listen to as much of a
`recording as the customer would like, since the customer
`cannot legally take a recording from the shop without paying
`for it.
`Speech recognition technology is highly developed. Typically,
`features are extracted from spoken words and then
`normalized to provide patterns that are compared to patterns
`in a pattern library. When a pattern derived from a spoken
`word matches a pattern in the library sufficiently, a phoneme
`of the spoken word has been found. The features that are
`extracted from the spoken words may identify a range of
`frequencies that are present during extremely brief slices of
`time and the power at those frequencies. Sophisticated
`mathematical operations are then performed on the extracted
`features in order to generate the patterns for pattern matching.
`
`SUMMARY OF THE INVENTION
`
`An object of the invention is to facilitate distribution of
`music over the internet by permitting customers or other
`people to preview (perhaps “pre-listen” would be more accurate,
`but Applicant prefers to stick with English words) music
`before downloading it.
`Another object is to permit people to preview music in a
`manner that permits them to identify a musical composition
`for which they are searching without providing a usable
`substitute for the desired composition.
`
`A further object is to provide techniques for corrupting
`music so that it can be used for purposes of identification but
`not enjoyment.
`Yet another object is to provide a method for recognizing a
`musical composition that is hummed, sung, chanted, or otherwise
`vocalized by a customer. The specimen may be sent
`via the internet or telephone to a remote location for analysis
`and pattern matching. Alternatively, if a customer’s home
`computer is suitably equipped, the home computer can be
`used to generate a pattern locally from the customer’s specimen,
`and the pattern alone may be transmitted via the internet
`to a remote location for pattern matching. The music recognition
`can also be executed at record shops, without sending
`either the specimen of the customer’s vocalization or a pattern
`derived from the specimen to a remote location.
`An additional object of the invention is to permit a customer
`to generate a specimen for pattern matching by
`manipulating a keyboard, a simulated musical instrument
`such as a piano, or some other generator of tones.
`In accordance with one aspect of the invention, a method
`for distributing music includes the steps of sending information
`to identify a musical composition in writing to a customer
`or other person over the internet. If the customer sends a
`request for an audio preview of the composition that is identified
`in writing, a corrupted version of some or all of this
`musical composition is sent to the customer over the internet.
`If the customer then requests the musical composition without
`corruption, it is sent to the customer.
`The corrupted version of the musical composition that is
`sent to the customer for purposes of identification may
`include a short-duration snippet of the composition or a
`sequence of isolated snippets from the composition, possibly
`with superimposed noise.
`In accordance with another aspect of the invention, a musical
`composition can be recognized by extracting features
`from a specimen that has been vocalized by a person, generating
`a pattern from the extracted features, comparing this
`pattern with patterns in a pattern library, and identifying at
`least one musical composition as a result of this comparison.
`The pattern preferably includes a pitch sequence and/or a
`duration sequence. The pitch sequence may identify how
`many halftones up or down exist between a current note of the
`specimen and the previous note. The duration sequence may
`indicate the duration of one note with respect to the duration
`of the previous note, or the duration of features of the specimen
`with respect to a predetermined tempo.
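The pitch sequence and duration sequence described above can be illustrated with a minimal sketch. It assumes that the notes of the specimen have already been reduced to a list of frequencies (in Hz) and a list of durations (in seconds), that pitches follow twelve-tone equal temperament, and that each interval is rounded to the nearest halftone. The function names `pitch_sequence` and `duration_sequence` are hypothetical and do not come from the specification.

```python
import math


def pitch_sequence(frequencies_hz):
    """Halftone steps (up = positive, down = negative) between each note
    and the previous note. Rounding to the nearest halftone is an
    assumption of this sketch."""
    steps = []
    for prev, cur in zip(frequencies_hz, frequencies_hz[1:]):
        # 12 halftones per octave; an octave is a doubling of frequency.
        steps.append(round(12 * math.log2(cur / prev)))
    return steps


def duration_sequence(durations_s):
    """Duration of each note relative to the duration of the previous note."""
    return [cur / prev for prev, cur in zip(durations_s, durations_s[1:])]


# The same tune vocalized an octave higher and twice as fast
# yields the same two sequences.
low = pitch_sequence([220.0, 246.94, 220.0])
high = pitch_sequence([440.0, 493.88, 440.0])
slow = duration_sequence([1.0, 2.0, 1.0])
fast = duration_sequence([0.5, 1.0, 0.5])
```

Because both sequences are relative, they do not depend on the key or the tempo chosen by the customer, which is the property that the pattern matching relies on.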
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 schematically illustrates a system for communication
`between a customer and a music distribution company
`via the internet, along with a financial institution for facilitating
`payment for distributed music;
`FIG. 2 illustrates equipment at the customer’s location;
`FIG. 3 is a flow chart schematically illustrating communication
`between the customer and the music distribution company;
`FIG. 4 is a block diagram of a system for recognizing
`musical compositions and providing both the compositions
`themselves and corrupted versions of the compositions;
`FIG. 5 is a flow chart illustrating an example of corruption
`of music by an extraction of snippets;
`FIG. 6 is a block diagram of a feature extraction unit and a
`normalization unit of a music recognition system;
`FIG. 7 illustrates a display on a monitor for permitting a
`customer to select a key and a tempo;
`FIG. 8 illustrates an alternative arrangement for a duration
`sequence analyzer that is part of the normalization unit shown
`in FIG. 6;
`FIG. 9 illustrates a feature extraction unit and a duration
`sequence analyzer for rap music; and
`FIG. 10 illustrates a display on a monitor for permitting a
`customer to pick out a tune on a simulated piano using a
`mouse.
`
`DETAILED DESCRIPTION OF THE PREFERRED
`EMBODIMENT
`
`FIG. 1 illustrates a customer’s house 10 which can be
`connected by the public telephone system, represented by a
`telephone line 12, to a customer’s internet service provider
`14. Reference number 16 represents the internet. A music
`distribution company 18 is connected by a high-speed data
`link 20 to a company’s internet service provider 22, which can
`communicate with the customer’s internet service provider 14
`over the internet 16. The music distribution company 18 is
`also connected by a high-speed data link 24 to a financial
`institution 26, such as a bank that issues debit cards or credit
`cards or both to retail customers.
`FIG. 2 illustrates equipment located at the customer’s
`house 10. This equipment includes a computer 28 having a
`hard disk 30, a drive 32 for a removable recording medium,
`and a speaker 34. The computer 28 has a modem (not
`illustrated) for communicating with the internet service provider
`14. A monitor 36 is connected to the computer 28. Also
`connected to the computer 28 are a keyboard 38, a mouse 40,
`and a microphone 42. An audio system (not illustrated) may be
`connected to the computer 28. It will be assumed that the customer
`has installed a program which permits him to receive music files
`(possibly in encoded form, such as MP3-encoded files) and to
`store and play them.
`The customer may move the mouse 40 over a surface in the
`usual manner to control the location of a pointer 44 on the
`screen of the monitor 36. The mouse 40 has a button 46 that
`the customer can depress to signal to the computer that the
`customer has selected a region of the screen of monitor 36
`with the pointer 44, and wishes to commence a predetermined
`activity associated with the selected region of the screen (i.e.,
`“click on” something). Furthermore, depending on what is
`displayed on the screen, the operating system employed by
`computer 28 (such as Microsoft’s “Windows”™ operating
`system) may change the pointer 44 to an insertion marker (not
`illustrated) for identifying an insertion point for text entered
`via the keyboard 38.
`In FIG. 2, it is assumed that the customer has employed a
`browser program to address a server (not illustrated) at the
`company 18 and to download the music distribution company’s
`home page via the internet 16. It is also assumed that the
`company’s home page offers customers three options for
`selecting songs, albums, or other musical compositions that
`the customer wishes to purchase. Two of these options, A and
`B, are illustrated schematically in regions 48 and 50 of the
`screen. Option A, for example, might permit the customer to
`select a time period (for example, within the past year, or
`within the past five years, or by decade intervals prior to that)
`and to select a type of music (rock and roll, country and
`western, and so forth), whereupon the music distribution
`company’s server would return one or more pages with a
`menu of songs or other musical compositions available for the
`customer to select. For example, the customer might select
`1950-1960 as the time interval, and receive an alphabetized
`menu of titles of rock and roll songs (along with the identity
`of the singer or group) that were first issued in that decade and
`are available for purchase from the company 18. The customer
`might then select the song “Blue Suede Shoes” by Elvis
`Presley from this menu. Option B might also permit the
`customer to select a time period and a type of music, and then
`return an alphabetized menu of singers or groups and the titles
`of songs that they originated during the selected period. For
`example, if the customer selected rock and roll and the decade
`1950-1960, one singer on the menu would be Elvis Presley,
`and “Blue Suede Shoes” would be listed as the title of one
`song that he released during this period. Other menu options,
`including singers or groups, followed by their albums by title
`and the songs on each album, or key words in the lyrics, would
`also be possible. In each case, what would ultimately be
`displayed to the customer on monitor 36 for possible purchase
`would be a menu which identifies different pieces of
`music in writing (e.g., “Blue Suede Shoes” by Elvis Presley).
`One problem with such an approach is that some customers
`have poor memories for song titles and may not remember
`who sang a particular song, much less be able to recall the title
`of the album on which it appeared. Such customers may
`hesitate to purchase music over the internet out of concern
`that they would be wasting their money if they purchased the
`wrong song or album. Even if steps are taken to reduce this
`uncertainty, as by displaying album covers or the lyrics of
`songs, this hesitancy would naturally have an inhibiting effect
`on the sale of music over the internet.
`FIG. 3 illustrates how to avoid this problem by permitting
`the customer to audibly verify that he or she has selected the
`piece of music that he or she intended. The customer starts by
`logging on with the customer’s internet service provider 14
`and then addressing the music distribution company (step 52)
`by typing in the company’s Worldwide Web address or URL.
`The company thereupon downloads the selection options
`available to the customer (step 54). The options may be
`presented on the company’s home page, or the home page might
`be hyper-linked to one or more intervening pages before the
`customer reaches the options. In the event that the music
`distribution company offers only one option, step 54 would be
`skipped and, instead of selecting an option in step 56, the
`customer would simply be presented with the option that the
`company supports.
`In step 58, the company downloads information about the
`selected option. For example, if the customer has been given
`the option of selecting songs by title during a time period
`selected by the customer and for a type of music selected by
`the customer, an alphabetical list of titles of songs of the
`selected type and during the selected period, possibly also
`accompanied by the name of the singer or group, is downloaded
`in step 58. In step 60, the customer uses the selection
`button 46 on his mouse 40 in order to identify the desired
`song. This information about the desired song is conveyed to
`the music distribution company.
`In step 62, the company downloads preview information,
`which permits the customer to audibly verify that
`the song selected in step 60 is indeed the song that the customer
`wants to purchase. The preview information is a corrupted
`version of the selected song. The corrupted version in
`this case is a sequence of snippets of the selected song with
`blank intervals between the snippets. For example, the preview
`information might be the first ten seconds of the song,
`followed by a five-second blank interval, followed by the
`15th-25th seconds of the song, followed by another five-second
`blank interval, and so forth. Preferably, the snippets are
`also acoustically degraded. One way of doing this would be
`by limiting the frequency response of the snippets, but since
`the customer might then assume that poor quality music was
`being offered for sale, it might be better to superimpose noise
`on the snippets and possibly also on the blank intervals
`between the snippets. One type of noise would be a repeating
`ticking sound, like a metronome operating at high speed. The
`purpose of the preview information is to permit the customer
`to audibly verify the selection made at step 60 without providing
`the customer at this stage with music that would be
`enjoyable to listen to.
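The timing of the snippets in the example above can be sketched as follows. This is only an illustration under stated assumptions: the function name `preview_windows` and its parameters are hypothetical, the default values are the ten-second snippet and five-second blank interval of the example, and the superimposed ticking noise is not modeled.

```python
def preview_windows(song_length_s, snippet_s=10.0, blank_s=5.0):
    """Return the (start, end) portions of the song, in seconds, that
    would be audible in the preview. The song position keeps advancing
    during each blank interval, as in the example in the text."""
    windows = []
    start = 0.0
    while start < song_length_s:
        end = min(start + snippet_s, song_length_s)
        windows.append((start, end))
        start = end + blank_s  # skip over the blank interval
    return windows


# A 40-second song: seconds 0-10, 15-25 and 30-40 are audible.
windows = preview_windows(40.0)
```

With these defaults, two thirds of the song at most is ever heard, and never as one continuous passage.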
`After the customer has listened to the preview information
`in step 64, he or she verifies the selection, for example, by
`typing “Y” on keyboard 38. Although not shown, if the customer
`decides after listening to the preview information that
`the information about the desired song that was entered at step
`60 was incorrect, possibly indicated by typing “N” on keyboard
`38, the procedure returns to step 60. In step 64, after
`verifying the selection, the customer is also asked to provide
`payment information, as by entering a credit card number.
`This information is then conveyed to the music distribution
`company, which verifies the payment information with the
`financial institution 26 (FIG. 1) during step 66. The company
`then downloads a file containing the selection, such as an
`MP3 file, in step 68. The customer then stores the downloaded
`file on hard disk 30 or on a removable storage medium that has
`been inserted in drive 32 (step 70).
`Returning now to FIG. 2, in addition to the options 48 and
`50, the music distribution company also offers a music
`recognition option that the customer can “click on” by using the
`mouse 40 to move the pointer 44 to the designated region of
`the screen of monitor 36 and then depressing the selection
`button 46. The music distribution company then downloads a
`page (not illustrated) asking the customer to vocalize the song
`he or she wants into the microphone 42 during an interval (for
`example, 10 seconds) that is communicated on the screen (as
`by depicting a “record” light which changes from red to green
`when the interval begins, and then changes back to red when
`the interval ends). Here, the term “vocalize” is intended to
`include singing lyrics, singing with the lyrics replaced by
`dummy vocalizations (such as “da-da-da-da”), humming, and
`so forth. The result of the customer’s audibilization of the
`song that he or she wants is an audio file that is conveyed to the
`music distribution company 18 via the internet 16. This file
`will be called the “specimen” that the customer has submitted
`for analysis.
`FIG. 4 illustrates units located at the music distribution
`company 18 for analyzing the specimen. They include a
`music recognition unit 72, a music retrieval unit 76, and a
`selection preview unit 78. The music recognition unit 72
`includes a feature extraction unit 80, which may include
`hardware components; the remaining elements shown in FIG.
`4 are preferably implemented by software.
`The customer’s specimen file is input at a port 82 and
`conveyed to the feature extraction unit 80. It extracts from the
`specimen musical features which characterize the song. The
`features extracted typically include information about the
`notes in the specimen and possibly also information about the
`durations of the notes. Since the customer may not vocalize
`the specimen in the same key as the desired recording or at the
`same tempo as the desired recording, the extracted features
`are normalized by a normalization unit 84 in order to provide
`frequency-independent information about the notes in the
`specimen and, if information about the duration of the notes is
`among the features extracted by unit 80, to express the durations
`in a relative manner that is suitable for pattern matching
`instead of being expressed in terms of seconds or fractions of
`a second. The normalized features are supplied to a pattern
`matching unit 86, which compares them to patterns in a
`pattern library 88. The pattern library 88 stores normalized
`features extracted from all of the songs or other musical
`compositions that are stored in a music library 90. The pattern
`library 88 may include normalized extracted features for all of
`each song. However, since customers will typically vocalize
`the chorus of a song or possibly a limited number of other
`memorable features, it is sufficient for the pattern library to
`store only normalized features for popular portions of each
`song. This speeds up the pattern matching.
`As its name implies, the pattern matching unit 86 determines
`how closely the normalized features that have been
`extracted from the customer’s specimen match the normalized
`features stored in the pattern library. Although it is possible
`for the normalized features extracted from the specimen
`to exactly match an entry stored in the pattern library 88,
`typically a distance score is calculated between the specimen
`and each entry in the pattern library, and a particular entry is
`determined to be a candidate if the distance is smaller than a
`predetermined value. The candidates from the pattern
`library 88 that are selected by pattern matching unit 86 are
`read out of music library 90. The library 90 includes not only
`the encoded songs, but also identification information such as
`the title of the song and the name of the singer or group. The
`candidate songs and identifying information are stored in a
`musical composition and identifier memory 92.
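The candidate selection described above can be sketched as follows. The text does not say how the distance score is computed, so this sketch assumes one simple possibility: the smallest number of mismatching halftone steps over every alignment of the specimen's pitch sequence within a library entry. The names `distance` and `find_candidates`, and the representation of the pattern library as a dictionary, are hypothetical.

```python
def distance(specimen, entry):
    """Smallest count of mismatching steps over all positions at which
    the specimen can be laid over the library entry (an assumed metric)."""
    n = len(specimen)
    if n > len(entry):
        return float("inf")  # the specimen cannot fit inside the entry
    best = n
    for offset in range(len(entry) - n + 1):
        window = entry[offset:offset + n]
        mismatches = sum(1 for a, b in zip(specimen, window) if a != b)
        best = min(best, mismatches)
    return best


def find_candidates(specimen, pattern_library, max_distance):
    """An entry is a candidate if its distance is smaller than the
    predetermined value max_distance."""
    return [title for title, pattern in pattern_library.items()
            if distance(specimen, pattern) < max_distance]


# Hypothetical library of halftone-step patterns keyed by song title.
library = {"Song A": [2, 2, -4, 0, 5], "Song B": [0, 0, 7, 0, -7]}
candidates = find_candidates([2, -4, 0], library, max_distance=2)
```

A practical system would more likely use a tolerance-aware measure such as an edit distance, so that a dropped or added note does not spoil the match; the sliding mismatch count is used here only because it is easy to verify by hand.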
`The selection preview unit 78 includes a snippet extractor
`94 that receives the contents of the memory 92. The identification
`information (i.e., identifiers such as title and singer or
`group) is downloaded at step 58 (FIG. 3) and the customer
`selects one of the candidates by clicking on the identification
`information in step 60. The snippet extractor 94 thereupon
`extracts a sequence of snippets from the selected candidate,
`with blank spaces between the snippets, and a noise unit 96
`superimposes a repetitive ticking sound on the snippets and
`the blanks between them. This forms preview information,
`which is conveyed along a path 98. The preview information
`is downloaded at step 62 of FIG. 3. If the customer then
`verifies the selected candidate during step 64 and offers suitable
`payment, a file containing the musical composition itself
`is issued along a path 100 and is downloaded in step 68.
`One way to implement snippet extractor 94 is illustrated in
`a flow chart shown in FIG. 5. It is assumed in FIG. 5 that the
`customer has exercised the music recognition option and that
`one or more compositions and one or more identifiers have
`been stored in memory 92 as a result of pattern matching with
`the customer’s specimen. In step 102, the identifier or
`identifiers stored in memory 92 are transmitted to the customer for
`display on the customer’s monitor. If the customer chooses a
`selection by clicking on it with the mouse 40, information
`about the selected composition is transmitted to the music
`distribution company. In step 104, a check is made to determine
`whether the customer has made a selection. If not, a
`selection timer is set in step 106. A check is made in step 108
`to determine whether the time set by this timer has elapsed,
`and if not, the process returns to step 104. If the timer has
`timed out (Y at step 108), a notice is sent to the customer, and
`the snippet extraction process ends before it has truly gotten
`underway.
`When the customer has selected a composition (Y at step
`104), the file of the selected composition is read out of
`memory 92, beginning from where the customer’s specimen
`started. The file is sent to the customer as it is read out in step
`112. The customer begins playing the composition when he
`or she receives the file, and can probably tell rapidly whether
`the composition is the one that he or she intended in the
`specimen.
`A snippet duration timer is set in step 114. A check is made
`in step 116 to determine whether it has timed out. If not, a
`check is made at step 118 to determine whether the customer
`has signaled a desire to stop listening to this composition. The
`customer can signal such a desire by clicking on another
`identifier, thereby ending the snippet extraction procedure for
`the composition that he or she had previously selected, or by
`taking some other action that is inconsistent with a desire to
`continue listening to the previously selected composition, as
`by moving to a different page of the music distribution
`company’s Web site or leaving the Web site entirely. If the customer
`has not decided to stop listening to the selected composition,
`a check is made at step 120 to determine whether the
`selected composition has ended. If not, the process returns to
`step 116.
`After the snippet duration timer has timed out (Y at step
`116), the reading out of the file from memory 92 continues,
`but it is not sent to the customer (step 122). A blank interval
`timer is set in step 124, and a check is made at step 126 to
`determine whether it has timed out. If not, checks are made at
`steps 128 and 130 to determine whether the customer has
`indicated a desire to stop listening to this composition or
`whether the composition has ended. After the blank interval
`timer has timed out (Y at step 126), the process returns to step
`112, and the customer then has an opportunity to begin
`listening to the next snippet.
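The alternation between the snippet duration timer and the blank interval timer can be sketched as a small loop. This is a simplified model under stated assumptions, not the flow chart itself: time is counted in discrete ticks rather than with real timers, the file is a sequence of equal-length chunks, and the name `extract_snippets` and the `stop_requested` callback are hypothetical.

```python
def extract_snippets(chunks, snippet_ticks, blank_ticks,
                     stop_requested=lambda tick: False):
    """Return the (tick, chunk) pairs that would be sent to the customer.

    Reading out of the file continues on every tick, but chunks are only
    sent while the snippet duration timer is running (steps 112-116);
    they are withheld while the blank interval timer runs (steps 122-126).
    """
    if snippet_ticks < 1 or blank_ticks < 1:
        raise ValueError("timer lengths must be at least one tick")
    sent = []
    sending = True
    timer = snippet_ticks
    for tick, chunk in enumerate(chunks):  # loop ends when the composition ends
        if stop_requested(tick):           # customer no longer wants to listen
            break
        if sending:
            sent.append((tick, chunk))
        timer -= 1
        if timer == 0:                     # current timer has timed out
            sending = not sending
            timer = snippet_ticks if sending else blank_ticks
    return sent


# Ten chunks, three-tick snippets separated by two-tick blanks.
sent = extract_snippets(list("abcdefghij"), snippet_ticks=3, blank_ticks=2)
```

In this run the chunks at ticks 0-2 and 5-7 are sent, while those at ticks 3-4 and 8-9 are read out but withheld.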
`The operation of snippet extractor 94 is similar to that
`discussed above with respect to FIG. 5 if the customer does
`not select the music recognition option, and instead picks a
`composition from a menu of identifiers displayed on the
`monitor 36. The main difference would be that the snippet
`extraction process would begin at step 110 after the file of the
`selected composition had been read out of the library 90 on
`the basis of the identifier selected by the customer, and the
`reading of the file from the memory 92 would start from the
`beginning of the composition.
`FIG. 6 illustrates an embodiment of two units in the
`recognition unit 72: a feature extraction unit 80' and a
`normalization unit 84'. It is assumed that the specimen file from the
`customer has been transformed to an analog audio signal that
`is applied to an input port 82. This signal is filtered by a
`narrow bandpass filter 132 whose passband is limited to a few
`octaves in which most customers can be expected to sing. The
`bandpass filtered signal is then supplied to a frequency
`analyzer such as a filter bank 134. The feature extraction unit 80'
`also includes a level detector 136, which compares the level of
`the bandpass filtered signal to a predetermined value, outputting
`a digital one if the level is above the predetermined value
`and otherwise outputting a zero.
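The behavior of the level detector can be illustrated with a short sketch that operates on digital samples rather than on the analog signal described above. It assumes that the "level" is measured as the root-mean-square amplitude of fixed-length frames; the text does not specify the measure, and the name `level_detector` and its parameters are hypothetical.

```python
import math


def level_detector(samples, frame_len, threshold):
    """Return one bit per complete frame: 1 if the frame's RMS level is
    above the predetermined threshold, otherwise 0. Using RMS as the
    level measure is an assumption of this sketch."""
    bits = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        bits.append(1 if rms > threshold else 0)
    return bits


# Silence, a loud frame, then a very quiet frame.
signal = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.01] * 4
bits = level_detector(signal, frame_len=4, threshold=0.1)
```

The resulting bit stream is what tells the downstream unit whether the filter bank output should be trusted for the current frame.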
`The normalization unit 84' includes a strongest tone identifier
`140, which identifies the frequency of the strongest signal
`from filter bank 134 if the level detector 136 determines that
`the level of the bandpass filtered signal exceeds the predetermined
`value. If the frequency of the strongest signal from
`filter bank 134 changes, the new frequency is identified by
`strongest tone identifier 140 and the old one is transferred to
`a prior tone memory 142. In the event that the level detected
`by the detector 136 falls below the predetermined value, the
`strongest tone identifier 140 continues to identify the frequency
`that was strongest when the level was above the predetermined
`value, and no changes are made in the content of
`memory 142. The reason for this is that many customers can
`be expected to vocalize their specimen using a string of
`dummy words, such as “da-da-da,” leaving pauses or periods
`of substantially reduced volume between the “das,” even
`though the composition they are audibilizing may lack such
`pauses or periods of reduced volume. It is worth noting that
`the strongest tone identifier 140 will continue to identify a
`tone even when the customer’s audibilization accurately
`reflects a rest (or period of silence) in the composition itself.
`Although this could be looked at as an error if the purpose
`were to fully characterize the composition so as to be able to
`accurately reproduce it from the characterization alone, the
`purpose here is not to regenerate the music, but instead to
`generate a pattern for pattern matching. It is believed that
`ignoring periods of silence in the specimen will be beneficial
`since this will accommodate differences in the way that customers
`vocalize music (particularly when using dummy
`words at differen