(12) Patent Application Publication
DEVINNEY, JR. et al.
(10) Pub. No.: US 2003/0046083 A1
(43) Pub. Date: Mar. 6, 2003
`
`US 20030046083A1
`
`(54) USER VALIDATION FOR INFORMATION
`SYSTEM ACCESS AND TRANSACTION
`PROCESSING
`
(76) Inventors: EDWARD J. DEVINNEY JR., DELANCO, NJ (US); MANISH SHARMA, SOMERSET, NJ (US); CHRIS KEYSER, ROEBLING, NJ (US); RAINER ROTHACKER, CLIFFWOOD BEACH, NJ (US); RICHARD J. MAMMONE, BRIDGEWATER, NJ (US)
`
`Correspondence Address:
`Thomas H. Young, Esq.
Merchant & Gould, P.C.
`3200 IDS Center
`80 South Eighth Street
`Minneapolis, MN 55402-2215 (US)
`
( * ) Notice: This is a publication of a continued prosecution application (CPA) filed under 37 CFR 1.53(d).

(21) Appl. No.: 08/976,279

(22) Filed: Nov. 21, 1997
`
`Related U.S. Application Data
`
`(60) Provisional application No. 60/031,638, filed on Nov.
`22, 1996.
`
`Publication Classification
`
(51) Int. Cl.⁷ ..... G10L 21/00
(52) U.S. Cl. ..... 704/273
`
(57) ABSTRACT

The present invention applies speech recognition technology to remote access, verification, and identification applications. Speech recognition is used to raise the security level of many types of transaction systems which previously had serious safety drawbacks, including: point of sale systems, home authorization systems, systems for establishing a call to a called party (including prison telephone systems), internet access systems, web site access systems, systems for obtaining access to protected computer networks, systems for accessing a restricted hyperlink, desktop computer security systems, and systems for gaining access to a networked server. A general speech recognition system using communication is also presented. Further, different types of speech recognition methodologies are useful with the present invention, such as “simple” security methods and systems, multi-tiered security methods and systems, conditional multi-tiered security methods and systems, and randomly prompted voice token methods and systems.
`
[Representative drawing (FIG. 1): speech recognition system 201, with test speech, prompt, and index inputs to the speech recognition unit and a decision/confidence output 216.]
`
`PAGE 1
`
`SECURUS EXHIBIT 1003
`
`
`
`
[Drawing sheets 1 to 18 are not reproduced. The labels recoverable from each sheet are listed below; figure numbers in parentheses do not appear on the sheet and are inferred from sheet order.]
Sheet 1, FIG. 1: speech recognition system 201; test speech; speech recognition unit containing a preprocessor unit and a comparison/processing unit; decision/confidence 216.
Sheet 2, FIG. 2: prompt; speech recognition; decision/confidence.
Sheet 3 (FIG. 3): process in speech recognition unit 204; outcomes “No Authorize/No Identify” and “Authorize/Identify”; (confidence).
Sheet 4, FIG. 4A: recognize 1st password; recognize 2nd password; “No Authorize/No Identify”.
Sheet 5, FIG. 4B: recognize 1st password; recognize 2nd password.
Sheet 6, FIG. 4C: identify characteristics; recognize characteristics; “No Authorize/No Identify”; “Authorize/Identify”.
Sheet 7 (FIG. 5A): terminal; voice recognition system/service.
Sheet 8 (FIG. 5B): client terminal; system/service; feature extraction; encryption; comparison/processing unit.
Sheet 9 (FIG. 6): point of sale terminal; card reader; validation service; voice identification database.
Sheet 10 (FIG. 7): labels not recoverable.
Sheet 11, FIG. 8: called party; voice recognition system/service.
Sheet 12 (FIG. 9): labels not recoverable.
Sheet 13, FIG. 10A: several PCs; remaining labels not recoverable.
Sheet 14, FIG. 10B: labels not recoverable.
Sheet 15, FIG. 10C: recognition server; voice information database (VIDB); restricted hyperlink.
Sheet 16, FIG. 11: desktop station; voice secured system log on; voice secured screen saver; administrative application; file encryption.
Sheet 17, FIG. 12A: authentication server; networked server; voice information database (VIDB); voice secured system log on; administrative application.
Sheet 18, FIG. 12B: access attempt; accept.
`
`
`
`
`
`
USER VALIDATION FOR INFORMATION SYSTEM ACCESS AND TRANSACTION PROCESSING

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority from U.S. Provisional Application Ser. No. 60/031,638, filed Nov. 22, 1996, entitled “User Validation For Information System Access And Transaction Processing.”
`
BACKGROUND OF THE INVENTION

[0002] The invention is a verification system for ensuring that transactions are completed securely. The invention uses the principle of speaker recognition to allow a user to complete a transaction.

[0003] 1. Field of the Invention

[0004] The invention relates to the fields of signal processing, communications, speaker recognition and security, and secure transactions.

[0005] 2. Description of Related Art
`
[0006] With the increased use of credit card and computer related transactions, security of the transactions is a recurring problem of increasing concern. Conventional approaches for credit card validation have included reading a magnetic strip of the credit card at a point of sale. Information stored on the credit card, such as account information, is forwarded over a telephone connection to a credit verification service at the credit card company. For example, an X.25 connection to the credit verification system has been used. A response from the credit verification service indicates to the salesperson whether the customer’s credit card is valid and whether the customer has sufficient credit. An example of the above-described system is manufactured by VeriFone® of Redwood City, Calif., U.S.A. These prior art systems, however, have the disadvantage that the credit card may be verified as valid and as having sufficient credit even if it is used by someone who is not authorized to use the credit card.
`
[0007] The identity of the consumer who presents a credit card is manually verified by a merchant. The back of the credit card contains a signature strip, which the consumer signs upon credit card issuance. The actual signature of the consumer at the time of sale is compared to the signature on the back of the credit card by the merchant. If, in the merchant’s judgment, the signatures match, the transaction is allowed to proceed.
`
[0008] Other systems of the prior art include placing photographs of authorized users on the credit card. At the time of the transaction, the merchant compares the photograph on the card with the face of the person presenting the card. If there appears to be a match, the transaction is allowed to proceed.
`
[0009] While signatures and photographs are personal characteristics of the user, they have not been very effective. Signatures are relatively easy to forge and differences between signatures and photographs may go unnoticed by inattentive merchants. These systems are manual and consequently prone to human error. Further, these systems cannot be used with credit card transactions which do not occur in person, i.e., which occur via telephone.
`
`
[0010] Computer related applications, such as accessing systems, local area networks, databases and computer network (such as “Internet”) systems, have conventionally used passwords (known as personal identification numbers, or “PINs”) entered from a keyboard as a security method for accessing information. Computer passwords have the shortcoming of being capable of being stolen, intercepted or re-created by third parties. Computer programs exist for guessing (“hacking”) passwords. Additionally, computer passwords/PINs are not personal characteristics, which means that they are less complex and easier to generate by a third party with no knowledge of the authorized individual’s personal characteristics.
`
[0011] With the advent of electronic commerce on the internet, goods and services are increasingly being purchased by consumers, who submit credit card or other “secure” information to merchants over the internet. Transactions initiated from users connected to the internet currently have limited security provisions. For example, a retail provider receiving a user’s credit card number from the internet has no idea whether the person providing the number is authorized to use the credit card, or has obtained a credit card number from an illegal source.
`
[0012] As computers play a greater and more critical role in everyday life, security has emerged as a significant concern. Whether it’s restricting children from playing with their parents’ tax return (local access), protecting against an employee stealing trade secrets (network access), or limiting access to a value added WEB site (remote network access), the ability to determine that the claimed user is the real user is absolutely necessary.
`
[0013] Additional areas in which a need for heightened security exists are cellular telephone systems and prison telephone systems. In cellular systems, fraud from unauthorized calling is a recurring problem. In prison systems, the identity of inmates must be closely monitored, for the purpose of authorizing certain transactions, such as telephone calls.
`
[0014] What is needed are local and remote secure access systems and methods using personal characteristics of users for identifying and/or verifying the users.
`
SUMMARY OF THE INVENTION

[0015] The present invention is an improved method and system for increasing the security of credit card transactions, prison inmate transactions, database access requests, internet transactions, and other transaction processing applications in which high security is necessary. According to the present invention, voice print and speaker recognition technology are used to validate a transaction or identify a user.
`
[0016] Within speaker recognition (also referred to as voice recognition herein), there exist two main areas: speaker identification and speaker verification. A speaker identification system attempts to determine the identity of a person within a known group of people using a sample of his or her voice. Speaker identification can be accomplished by comparing a voice sample of the user in question to a database of voice data, and selecting the closest match in the database. In contrast, a speaker verification system attempts to determine if a person’s claimed identity (whom the person claims to be) is valid using a sample of his or her voice. Speaker verification systems are informed of the person’s claimed identity by index information, such as the person’s claimed name, credit card number, or social security number. Therefore, speaker verification systems typically compare the voice of the user in question to one set of voice data stored in a database, the set of voice data identified by the index information.
`
[0017] Speaker recognition provides an advantage over other security measures such as passwords (including personal identification numbers) and personal information, because a person’s voice is a personal characteristic uniquely tied to his or her identity. Speaker verification therefore provides a robust method for security enhancement.
`
[0018] Speaker verification consists of determining whether or not a speech sample provides a sufficient match to a claimed identity. The speech sample can be text-dependent or text-independent. Text-dependent speaker verification systems identify the speaker after the utterance of a password phrase. The password phrase is chosen during enrollment and the same password is used in subsequent verification. Typically, the password phrase is constrained within a specific vocabulary (i.e., number of digits). A text-independent speaker verification system does not use any pre-defined password phrases. However, the computational complexity of text-independent speaker verification is much higher than that of text-dependent speaker verification, because of the unlimited vocabulary.
`
[0019] The present invention uses speech biometrics as a natural interface to authenticate users in today’s multi-media networked environment, rather than a password that can be easily compromised.
`
[0020] In accordance with the present invention, security can be incorporated in at least three access levels: at the desktop, on corporate network servers (NT, NOVELL, or UNIX), and at a WEB server (internets/intranets/extranet). The security mechanisms may control access to a work station, to network file servers, to a web site, or may secure a specific transaction. Nesting of these security levels can provide additional security; for instance, a company could choose to have its work stations secured locally by a desktop security mechanism, as well as protect corporate data on a file server with an NT, NOVELL or FTP server security mechanism.
`
[0021] Use of speaker recognition, and therefore voice biometric data, is able to provide varying levels of security based upon customer requirements. A biometric confirms the actual identity of the user; other prevalent high security methods, such as token cards, can still be compromised if the token card is stolen from the owner. A system can employ any of these methods at any access level. In all cases of the inventive methods described herein, the user must know an additional identifying piece of information. The security system is not compromised whether this information is publicly obtainable information, such as their name, or a private piece of information, such as a PIN, a social security number, or an account number.
`
[0022] In accordance with the present invention, “simple” security systems and methods (single spoken password), multi-tiered security systems (multiple tiers of spoken passwords) and randomly prompted voice tokens (prompting of words obtained through a random look-up) are provided for improved security. These security systems and methods may be used to increase the security of point of sale systems, home authorization systems, systems for establishing a call to a called party (including prison telephone systems), internet access systems, web site access systems, systems for obtaining access to protected computer networks, systems for accessing a restricted hyperlink, desktop computer security systems, and systems for gaining access to a networked server.
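
The three schemes named in paragraph [0022] differ mainly in how many spoken items are requested and how they are chosen. The following is a minimal Python sketch of that selection step only, under stated assumptions: the word list, the function names, the placeholder passwords, and the choice of three prompted words are hypothetical and do not come from the application. The random look-up uses the standard `secrets` module.

```python
import secrets

# Hypothetical vocabulary of words for which a user has enrolled voice data.
ENROLLED_WORDS = ["seven", "harbor", "window", "delta", "orange", "kettle"]


def choose_prompt_words(enrolled_words, count):
    """Pick `count` distinct words through a random look-up."""
    if count > len(enrolled_words):
        raise ValueError("not enough enrolled words")
    pool = list(enrolled_words)
    chosen = []
    for _ in range(count):
        # secrets.randbelow returns an unpredictable index into the remaining pool
        chosen.append(pool.pop(secrets.randbelow(len(pool))))
    return chosen


def build_challenge(scheme, enrolled_words=ENROLLED_WORDS):
    """Return the list of items the user is asked to speak for a scheme."""
    if scheme == "simple":
        return ["first password"]                      # single spoken password
    if scheme == "multi-tiered":
        return ["first password", "second password"]   # multiple tiers
    if scheme == "random-token":
        return choose_prompt_words(enrolled_words, 3)  # assumed: three words
    raise ValueError(f"unknown scheme: {scheme}")
```

Because the prompted words change on every attempt, a recording of an earlier session is unlikely to contain the words requested in a later one.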
`
BRIEF DESCRIPTION OF THE DRAWINGS

[0023] FIG. 1 is a diagram of a speech recognition unit.

[0024] FIG. 2 is a high level representation of the unit shown in FIG. 1.

[0025] FIG. 3 shows a “simple” security method and system.

[0026] FIG. 4A shows a diagram of a multi-tiered security method and system.

[0027] FIG. 4B shows a diagram of a multi-tiered security method and system with conditional tiers.

[0028] FIG. 4C shows a diagram of a randomly prompted voice token method and system.

[0029] FIG. 5A shows a schematic diagram of the general configuration of a speaker verification method and system.

[0030] FIG. 5B shows a more specific schematic of the FIG. 5A method and system.

[0031] FIG. 6 is a schematic diagram of a speaker recognition method and system for a point of sale system.

[0032] FIG. 7 is a schematic diagram of an embodiment where home authorization is obtained through a call center.

[0033] FIG. 8 is a schematic diagram of an embodiment for establishing a call to a called party using speaker recognition.

[0034] FIG. 9 is a schematic diagram of an embodiment for use in establishing an internet connection using speaker recognition.

[0035] FIG. 10A is a schematic diagram of an embodiment for use in establishing a connection to a web site using speaker recognition.

[0036] FIG. 10B is a schematic diagram of an embodiment for use in establishing a connection to a protected network using speaker recognition.

[0037] FIG. 10C is a schematic diagram of an embodiment for use in establishing a connection to a restricted hyperlink on a web server using speaker recognition.

[0038] FIG. 11 shows an embodiment for use in securing a desktop computer using speaker recognition.

[0039] FIG. 12A shows a system for use in gaining access to a networked server using speaker recognition.

[0040] FIG. 12B shows a method for use in gaining access to a networked server using speaker recognition.
`
DESCRIPTION OF THE PREFERRED EMBODIMENT(S)

[0041] The present invention uses speech recognition in combination with various security and communications systems and methods. As a result, an inventive, remotely accessible and fully automatic speech verification and/or identification system results.

[0042] 1. Speech Recognition Unit.
`
[0043] FIG. 1 illustrates a speech recognition system 201. Test speech 202 from a user is input into a speech recognition unit 204, which contains a database of stored speech data. A prompt 203 may be presented to the user to inform the user to speak a password or enter index information. In a speaker verification system, an index 206 is normally supplied, which informs the speech recognition unit 204 as to which data in the database 208 is to be matched up with the user. In a speaker identification system, an index 206 is normally not input, and the speech recognition unit 204 cycles through all of the stored speech data in the database to find the best match, and identifies the user as the person corresponding to the match. Alternatively, if a certain threshold is not met, the speech identification system 204 may decide that no match exists.
`
[0044] In either case, the speech recognition unit 204 utilizes a comparison processing unit 210 to compare the test speech 202 with stored speech data in a database 208. The stored speech data may be extracted features of the speech, a model, a recording, speech characteristics, analog or digital speech samples, or any information concerning speech or derived from speech. The speech recognition unit 204 then outputs a decision 216, either verifying (or not) the user, or identifying (or not) the user. Alternatively, the “decision” 216 from the speech recognition unit includes a confidence level, with or without the verification/identification decision. The confidence level may be data indicating how close the speech recognition match is, or other information relating to how successful the speech recognition unit was in obtaining a match. The “decision” 216, which may be an identification, verification, and/or confidence level, is then used to “recognize” the user, meaning to identify or verify the user, or perform some other type of recognition. Either verification or identification may be performed with the system 201 shown in FIG. 1. Should identification be preferred, the database 208 is cycled through in order to obtain the closest match.
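
The difference between the indexed (verification) and cycle-through (identification) modes, and the role of the threshold and confidence level, can be illustrated with a short Python sketch. It is only an illustration under assumptions: stored speech data is reduced to a plain feature vector, closeness is a hypothetical score derived from Euclidean distance, and the names `Decision`, `verify`, `identify`, and the 0.5 threshold are invented for this example rather than taken from the application.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Optional


@dataclass
class Decision:
    accepted: bool         # verified/identified, or not
    user: Optional[str]    # matched user, if any
    confidence: float      # closeness of the match, in (0, 1]; 0.0 if no entry


def similarity(a: List[float], b: List[float]) -> float:
    """Map the Euclidean distance between two feature vectors to (0, 1]."""
    dist = sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + dist)


def verify(test: List[float], index: str,
           database: Dict[str, List[float]], threshold: float = 0.5) -> Decision:
    """Verification: compare only against the entry named by the index."""
    stored = database.get(index)
    if stored is None:
        return Decision(False, None, 0.0)
    score = similarity(test, stored)
    ok = score >= threshold
    return Decision(ok, index if ok else None, score)


def identify(test: List[float], database: Dict[str, List[float]],
             threshold: float = 0.5) -> Decision:
    """Identification: cycle through every entry and keep the closest match."""
    best_user, best_score = None, 0.0
    for user, stored in database.items():
        score = similarity(test, stored)
        if score > best_score:
            best_user, best_score = user, score
    if best_score < threshold:
        return Decision(False, None, best_score)  # threshold not met: no match
    return Decision(True, best_user, best_score)
```

Returning the confidence alongside the accept/reject flag lets a calling system apply its own policy, for example requesting a further password when the score is only marginal.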
`
`
`
`
[0045] Systems which may be used to implement the speech recognition system of FIG. 1 are disclosed in U.S. Pat. No. 5,522,012, entitled “Speaker Identification and Verification System,” issued on May 28, 1996, patent application Ser. No. 08/479,012 entitled “Speaker Verification System,” U.S. patent application Ser. No. 08/______, entitled “Model Adaption System And Method For Speaker Verification,” filed on Nov. 3, 1997 by Kevin Farrell and William Mistretta, U.S. patent application Ser. No. 08/______, filed on Nov. 21, 1997, entitled “Voice Print System and Method,” by Richard J. Mammone, Xiaoyu Zhang, and Manish Sharma, each of which is incorporated herein by reference in its entirety.
`
[0046] Referring to FIG. 1, the speech recognition unit 204 may contain a preprocessor unit 212 for preprocessing the speech prior to making any comparisons. Preprocessing may include analog to digital conversion of the speech signal. The analog to digital conversion can be performed with standard telephony boards such as those manufactured by Dialogic. A speech encoding method such as ITU G.711 standard μ-law and A-law can be used to encode the speech samples. Preferably, a sampling rate of 8000 Hz is used.
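
As a rough illustration of what μ-law encoding does to a sample, the Python sketch below applies the continuous μ-law companding curve with μ = 255 and quantizes the result to 8 bits. This is an assumption-laden simplification: the G.711 standard itself specifies a segmented, bit-exact encoding that this sketch does not reproduce, and the function names are invented for the example.

```python
import math

MU = 255            # mu value of the mu-law curve
SAMPLE_RATE = 8000  # Hz, the preferred sampling rate named in the text


def mu_law_compress(x: float) -> float:
    """Continuous mu-law curve for a sample x in [-1.0, 1.0]."""
    x = max(-1.0, min(1.0, x))
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode_8bit(x: float) -> int:
    """Quantize the companded value to one of 256 levels (0..255)."""
    return int(round((mu_law_compress(x) + 1.0) / 2.0 * 255))


def decode_8bit(code: int) -> float:
    """Map an 8-bit code back to an approximate sample value."""
    return mu_law_expand(code / 255 * 2.0 - 1.0)
```

The logarithmic curve spends more of the 256 levels on quiet samples, which is why low-amplitude speech survives 8-bit quantization better than it would with uniform steps.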
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
[0047] The preprocessor unit may perform any number of noise removal or silence removal techniques on the test speech, including the following techniques which are known in the art:

    [0048] Digital filtering to remove pre-emphasis. In this case, a digital filter $H(z) = 1 - \alpha z^{-1}$ is used, where $\alpha$ is set between 0.9 and 1.0.

    [0049] Silence removal using energy and zero-crossing statistics. The success of this technique is primarily based on finding a short interval which is guaranteed to be background silence (generally found in a few milliseconds at the beginning of the utterance, before the speaker actually starts recording).

    [0050] Silence removal based on an energy histogram. In this method, a histogram of frame energies is generated. A threshold energy value is determined based on the assumption that the biggest peak in the histogram at the lower energy region shall correspond to the background silence frame energies. This threshold energy value is used to perform speech versus silence discrimination.
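
The three techniques listed above can be sketched in plain Python as follows. This is a minimal illustration under assumptions rather than the method of the application: the frame size of 80 samples (10 ms at 8000 Hz), the factor of 3 over the background level, the 20-bin histogram, and the rule of placing the threshold at the upper edge of the tallest low-energy bin are all choices made for the example.

```python
def pre_emphasis(samples, alpha=0.95):
    """Apply H(z) = 1 - alpha*z^-1, i.e. y[n] = x[n] - alpha*x[n-1]."""
    out, prev = [], 0.0
    for x in samples:
        out.append(x - alpha * prev)
        prev = x
    return out


def frames(samples, size):
    """Split into non-overlapping frames, dropping a short tail."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def energy(frame):
    return sum(x * x for x in frame) / len(frame)


def zero_crossings(frame):
    return sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))


def remove_silence_by_statistics(samples, size=80, lead_frames=2, factor=3.0):
    """Treat the first `lead_frames` frames as background silence and keep
    frames whose energy or zero-crossing count clearly exceeds that baseline."""
    fs = frames(samples, size)
    if len(fs) <= lead_frames:
        return []
    lead = fs[:lead_frames]
    e_ref = max(energy(f) for f in lead)
    z_ref = max(max(zero_crossings(f) for f in lead), 1)
    kept = [f for f in fs
            if energy(f) > factor * e_ref or zero_crossings(f) > factor * z_ref]
    return [x for f in kept for x in f]


def histogram_threshold(energies, bins=20):
    """Threshold at the upper edge of the tallest bin in the lower half of
    the frame-energy histogram (assumed to hold the background silence)."""
    lo, hi = min(energies), max(energies)
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for e in energies:
        counts[min(int((e - lo) / width), bins - 1)] += 1
    lower = counts[:bins // 2]
    peak = lower.index(max(lower))
    return lo + (peak + 1) * width


def remove_silence_by_histogram(samples, size=80):
    fs = frames(samples, size)
    if not fs:
        return []
    thr = histogram_threshold([energy(f) for f in fs])
    return [x for f in fs if energy(f) > thr for x in f]
```

The statistics-based variant depends on the opening frames really being silent, as the text notes; the histogram variant avoids that dependence but assumes silence frames are common enough to form the dominant low-energy peak.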
`
`
`
`
`
`
`
[0051] Additionally, the speech recognition unit may optionally contain a microprocessor-based feature extraction unit 214 to extract features of the voice prior to making a comparison. Spectral speech features may be represented by speech feature vectors determined within each frame of the processed speech signal. In the feature extraction unit 214, spectral feature vectors can be obtained with conventional methods such as linear predictive (LP) analysis to determine LP cepstral coefficients, Fourier Transform Analysis and filter bank analysis. One type of feature extraction is disclosed in previously mentioned U.S. Pat. No. 5,522,012, entitled “Speaker Identification and Verification System,” issued on May 28, 1996 and incorporated herein by reference in its entirety.
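
One conventional way to obtain LP cepstral coefficients for a frame is the autocorrelation method: compute the frame's autocorrelation, solve for the predictor coefficients with the Levinson-Durbin recursion, then convert them to cepstral coefficients with the standard recursion. The Python sketch below follows that textbook route as an illustration only; the application does not say which LP method is used, and the order of 12 and all function names are assumptions.

```python
def autocorrelation(frame, max_lag):
    """r[lag] = sum over n of frame[n] * frame[n + lag]."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations for the model
    s[n] ~ a[1]*s[n-1] + ... + a[p]*s[n-p].
    Returns ([a1..ap], residual_energy)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate (e.g. all-zero) frame
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err


def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum of 1 / (1 - sum_k a_k z^-k):
    c_n = a_n + sum_{k=1}^{n-1} (k/n) * c_k * a_{n-k}, with a_n = 0 for n > p."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def lp_cepstral_features(frame, order=12, n_ceps=12):
    """Feature vector for one frame (order and size are assumed values)."""
    a, _ = levinson_durbin(autocorrelation(frame, order), order)
    return lpc_to_cepstrum(a, n_ceps)
```

Applied frame by frame to the preprocessed signal, `lp_cepstral_features` would yield one feature vector per frame, which is the form of input a comparison stage could match against stored data.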
`
`
`
`
[0052] The speech recognition unit 204 may be implemented using an Intel Pentium platform general purpose computer processing unit (CPU) of at least 100 MHz having about 10 MB associated RAM memory and a hard or fixed drive as storage. Alternatively, an additional embodiment could be the Dialogic Antares card.
`
[0053] While the speech recognition systems previously incorporated by reference are preferred, other speech recognition systems may be employed with the present invention. The type of speech recognition system is not critical to the invention; any known speech recognition system may be used. The present invention applies these speech recognition systems in the field of security to increase the level of security of prior, ineffective, systems.
`
[0054] 2. Security Methodology and Systems.

[0055] According to the present invention, speaker recognition can provide varying levels of security based upon customer requirements. A biometric, such as voice verification, confirms the actual identity of the user. Other prevalent high security methods, such as token cards, can still be compromised if the token card is stolen from the owner. With speaker recognition, the user need know only a single piece of information, what to speak, and the voice itself supplies another identifying piece of information. The present invention contemplates at least three levels of security: “simple” security, multi-tiered security, and randomly prompted voice tokens.
`
[0056] A more general depiction of a speaker recognition system 215 is shown in FIG. 2. As shown in FIG. 2, the user supplies a spoken password 217 to the speech recognition unit 204. The spoken password is preferably input into a microphone at the user’s location (not shown) or in the speech recognition unit 204 (not shown). The password may also be obtained from a telephone or other voice communications device (not shown). In response to the spoken password, or subsequent data, the speech recognition unit 204 outputs a decision 216, which may be or include a confidence level. To increase the level of security, an optional user index input unit 218 may be included to obtain index information, such as a credit card number, social security number, or PIN. The user index input unit 218 may be a keyboard, card reader, joystick, mouse, or other input device. The index may be confidential or public, depending on the level of security desired. An optional prompt input unit 220 may be included to prompt the user for a speech password or index information. The prompt input unit may be a display, speaker, or other audio/visual device.
`
[0057] A “simple” security method 221 is shown in FIG. 3. This method may be implemented in the system of FIG. 1 or 2. The “simple” security system requires only the password and the voice bi