`
WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification 6: H04J 1/00
(11) International Publication Number: WO 98/47252
(43) International Publication Date: 22 October 1998 (22.10.98)

(21) International Application Number: PCT/US98/07228
(22) International Filing Date: 11 April 1998 (11.04.98)
(30) Priority Data: 60/043,302, 11 April 1997 (11.04.97), US

(71)(72) Applicant and Inventor: STERN, Geoffrey [US/US]; 9 Apache Trail, Westport, CT 06880 (US).
(72) Inventor; and
(75) Inventor/Applicant (for US only): WEXLER, Gil [IL/IL]; Beeri Street 54, 64233 Tel Aviv (IL).
(74) Agents: LERCH, Joseph, B. et al.; Darby & Darby, P.C., 805 Third Avenue, New York, NY 10022-7513 (US).

(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE, GW, HU, ID, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MD, MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, UA, UG, US, UZ, VN, YU, ZW, ARIPO patent (GH, GM, KE, LS, MW, SD, SZ, UG, ZW), Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European patent (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN, ML, MR, NE, SN, TD, TG).

Published
Without international search report and to be republished upon receipt of that report.
`
`
`
`
`(54) Title: PERSONAL AUDIO MESSAGE PROCESSOR AND METHOD
`
`(57) Abstract
`
`
`
`
`
`
A portable device is disclosed which permits the user to record, edit, play and review voice messages and other audio material which may be received from, and subsequently transmitted to, a remote voice processing or interactive voice response (IVR) host computer over a communication link. A preferred device contains its own power source, integrated circuitry and control buttons to permit the localized recording, editing, storage and playback of audio signals through a built-in speaker, microphone and removable memory card. The device also contains a standard RJ-11 telephone jack, modem chip set and DTMF tone decoder to permit the transmission and control of audio signals to and from a host computer. The device contains circuitry which permits it to transmit and receive audio signals at a rate substantially faster than originally recorded.
`
`1
`
`GOOGLE 1006
`
`
`
`
FOR THE PURPOSES OF INFORMATION ONLY

Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT.

AL  Albania
AM  Armenia
AT  Austria
AU  Australia
AZ  Azerbaijan
BA  Bosnia and Herzegovina
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Côte d'Ivoire
CM  Cameroon
CN  China
CU  Cuba
CZ  Czech Republic
DE  Germany
DK  Denmark
EE  Estonia
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GH  Ghana
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IL  Israel
IS  Iceland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgyzstan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakstan
LC  Saint Lucia
LI  Liechtenstein
LK  Sri Lanka
LR  Liberia
LS  Lesotho
LT  Lithuania
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
MK  The former Yugoslav Republic of Macedonia
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
MX  Mexico
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SG  Singapore
SI  Slovenia
SK  Slovakia
SN  Senegal
SZ  Swaziland
TD  Chad
TG  Togo
TJ  Tajikistan
TM  Turkmenistan
TR  Turkey
TT  Trinidad and Tobago
UA  Ukraine
UG  Uganda
US  United States of America
UZ  Uzbekistan
VN  Viet Nam
YU  Yugoslavia
ZW  Zimbabwe
`
WO 98/47252    PCT/US98/07228
`
`PERSONAL AUDIO MESSAGE PROCESSOR AND METHOD
`
`Field of the Invention
`
The present invention relates generally to dictation and audio communication devices and, more particularly, concerns a method and portable apparatus for audio communication, including the recording and editing of voice mail and audio content and its transmission and reception over a private or public network, such as the Internet, using common electrical communication media or data links.
`
`
`Background of the Invention
`
All electronic message systems, with the exception of voice-mail, have intermediate devices or storage media whereby data may be transferred, preferably at a high transmission rate, over a standard communication link and stored in a storage medium or onto an unattended device for later off-line access, review and editing by the intended user. In the case of a facsimile transmission, an image is scanned by the transmitter and then transmitted and ultimately printed at a remote site for off-line utilization by the intended receiver. In the case of electronic mail, data is generated on a computer and then transmitted and stored either directly on the intended user's unattended computer or on a central host computer linked to a network of computers for subsequent retrieval by the intended user. The most common networks are Local Area Networks (LAN), Wide Area Networks (WAN), and public networks, such as the Internet, or private networks. When the intended user accesses his computer, either the E-mail is already resident, or he finds a message displayed in a graphic editor indicating that he has mail and how he can retrieve it. Once the E-mail is retrieved, it likewise may be read, reviewed and manipulated by the intended user off-line on the user's computer. Alternatively, it may be output to a printer, providing the user a hard copy for review at his convenience.
`
When a facsimile machine is unavailable, a facsimile may be transmitted to a computer or to a handheld, paperless fax machine, such as Reflection Technology, Inc.'s FaxView personal fax reader, for off-line and independent review by the recipient. Utilities exist for both facsimile and E-mail messages, whereby messages may be selected from a host by an authorized user for subsequent transmission to the user's E-mail address or unattended facsimile machine. See, for example, Duehren et al., U.S. Patent No. 4,918,722.
`
Recently, with the widespread and growing usage of the Internet and, more particularly, with the growing popularity of Web sites offering published material in the form of HTML (Hyper Text Markup Language) documents, utilities have been created which permit such files to be selected for subsequent off-line access and independent review by fax. See, for example, FactsLine for the Web, by Ibex Technologies, Inc. Such a utility makes the large volume of information and graphics offered over the Internet available to users who either do not have access to a computer connected to the Internet, or wish to limit the amount of time spent on-line.
`
A large percentage of potential users do not have access to the Internet or, even if they do, may be traveling; may not have access to their computers; or may not wish to spend time booting their computer and waiting for Web site graphics. Utilities such as Web-On-Call Voice Browser by Netphonic Communications, Inc. have been introduced which permit users to access the Internet in response to voice prompts, to navigate to a document or E-mail of interest, to identify a document by number and to have a selected document read in real-time over the phone using text synthesizing voice and faxed back or sent as an e-mail attachment.
`
Similarly, the widespread use of the Internet and heavy traffic to particularly popular Web sites or during particular peak usage times has created a demand for utilities called off-line browsers which permit Internet users to "subscribe" to particular Web sites from which their computer then automatically retrieves material during off-peak hours, categorizes and organizes new and updated information and permits the user to review it off-line using his browser of choice (e.g. FreeLoader by FreeLoader, Inc.).
`
Similarly, subscription services have been introduced which permit voice mail to be sent to an e-mail address and also permit audio content offered on a Web site to be updated, both by way of a standard phone call to an interactive voice response (IVR) system (e.g. "Amail" and "Dialweb" by Telet Communications).
`
Recently, voice processor system manufacturers have established a work group consisting of more than 60% of the world's voice mail system market to develop an interoperability standard for a Voice Profile for Internet Mail (VPIM). TCP/IP (Transmission Control Protocol/Internet Protocol) has been selected as the vehicle of connectivity, because of its globally accessible points of contact, primarily on the Internet, and because of its use of commonly recognized transmission protocols, specifically Simple Mail Transfer Protocol (SMTP) and Multipurpose Internet Mail Extensions (MIME), as the core of VPIM (see April 29, 1996 issue of Business Wire). Once implemented, interoperable standards such as VPIM will permit voice mail users to send and receive their voice messages over the Internet or an Intranet as easily as they can now do so over the telephone.
`
In addition to voice messaging and audio e-mail over the Internet, the recent introduction of proprietary client server software systems permits users with conventional multimedia personal computers and voice grade telephone lines to browse, select, and play back audio or audio-based multimedia content in real-time streams (RE) or download on-demand (REM). An interested user need only download software from the content provider's Web site to access such audio content (e.g. Progressive Networks' RealAudio Player and Server). Systems such as this represent a real breakthrough, since in the past, delivery of audio by conventional on-line methods downloaded it at such low rates that acquiring the information took five times as long as the actual program. This required the listener to wait 25 minutes before listening to 5 minutes of audio.
`
As a result of the availability of streaming audio over the Internet, a number of companies have introduced Internet telephone products which permit users having multimedia computers programmed with proprietary software to talk in real time over the Internet (see Vocaltec). Such a system is useful over long distances when users can access a local Internet access point or point of presence, making a long distance call into a local call.

Similarly, as a result of streaming audio over the Internet, content providers are able to broadcast live audio from a Web site (e.g. AudioNet by Cameron Audio Networks).
`
Recently a standard-based implementation for communication over the Internet has been introduced, and supported by Intel and Microsoft, which makes use of the DSP Group's TrueSpeech G.723 compression technology. This uses an advanced algorithm that results in excellent voice quality, despite a high compression ratio, and operates at 6.3 kilobits per second (kbps) and 5.3 kbps with compression ratios of 20:1 and 24:1, respectively. It also includes silence compression which can bring the effective rate down to less than 3.7 kbps at 28.8 kbps modem speed. This would permit the transmission of audio at a rate of 1:7.78, or 10 minutes of audio in 1.3 minutes. Using Texas Instruments' C80 DSP chip and a V.34 modem running at 28.8 kbps, a transmission rate of audio of 10:1 (ten minutes of speech in 1 minute of transmission) can be achieved with telephone grade sound quality.
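The rate figures in this paragraph can be checked with a short sketch. It assumes, for illustration only, that the speed-up is simply the modem rate divided by the effective codec bit rate, with protocol overhead ignored; the 3.7 kbps figure is the value implied by the stated 1:7.78 ratio, and the function names are hypothetical rather than taken from the application.

```python
def speedup(modem_kbps: float, codec_kbps: float) -> float:
    """Seconds of audio carried per second of line time (overhead ignored)."""
    if modem_kbps <= 0 or codec_kbps <= 0:
        raise ValueError("rates must be positive")
    return modem_kbps / codec_kbps


def transfer_minutes(audio_minutes: float, modem_kbps: float, codec_kbps: float) -> float:
    """Line time needed to move `audio_minutes` of compressed speech."""
    return audio_minutes / speedup(modem_kbps, codec_kbps)


if __name__ == "__main__":
    # Assumed effective rate of 3.7 kbps over a 28.8 kbps modem link.
    print(round(speedup(28.8, 3.7), 2))
    print(round(transfer_minutes(10, 28.8, 3.7), 1))
    # A 10:1 speed-up at 28.8 kbps would imply an effective rate near 2.88 kbps.
    print(round(28.8 / 10, 2))
```

Under these assumptions the two stated cases differ only in the effective bit rate of the compressed speech: roughly 3.7 kbps for the 1:7.78 case and roughly 2.88 kbps for the 10:1 case.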
`
`
From the above, it is apparent that while the transfer of data, graphics and audio messaging and content over a network has become more widespread and convenient, this growth has also highlighted certain historic shortcomings associated with the transfer and input/output of voice messaging and audio content. As voice messaging and audio content become more available, the deficiency created by the lack of an intermediate device or storage medium for such audio will become more pronounced.
`
For both E-mail and facsimile, use of a telephone link is limited to the transmission of the data and the transmission of control codes for that data. With the growth and widespread usage of network computing, the telephone link for e-mail and facsimile (e.g. PASSaFAX from RADLinx) is further limited to a hook-up to a local point of presence to access the network. Both e-mail and facsimile contain content which may be outputted by the intended user to a printer, which permits the user to take a hard copy of the material with him for review at his convenience, while he is away from his office or traveling.
`
In sharp contrast, voice messages and voice-text are currently recorded by the sender and retrieved by the intended recipient primarily in real-time and on-line. At best, a user can use his multimedia notebook computer to record and access a stored audio file or streaming voice file. Off-line access to audio is limited to downloading audio files onto a multimedia computer and having the sound card equipped computer play the audio. However, a multimedia computer, with its screen, keyboard and multipurpose processing capability, is hardly the size of a traditional dictation device or voice recorder. This dependence on a telephone hand set or multimedia computer to create and access audio is analogous to requiring a recipient of a facsimile to view, edit and prepare a facsimile only while in close proximity to a facsimile machine or fax enabled computer. Not being able to prepare, review and access network based voice mail other than in real-time from a telephone hand set or off-line from a multimedia computer severely limits the desirability of integrating voice messaging and audio content into network based messaging. There exist no dedicated and portable devices to store network based voice messaging and likewise there exists no method or utility to scan and select personal voice messages or public announcements from a host connected to a network for subsequent high speed transmission to a device for subsequent off-line review by the user.
`
The only dedicated device which permits the user to review his/her voice messages off-line is the Telephone Answering Device (TAD), which is primarily a residential or small-office, home-office (SOHO) appliance which uses digital recording technologies to replace the standard functions of a traditional tape-based answering machine. The TAD, plugged into both an electrical outlet and phone jack, is not portable, so the user must either be within hearing distance of the TAD's speaker or, using a telephone, may call in to retrieve his/her messages on-line and in real-time. While traditionally TADs have offered very limited outbound messaging capabilities, whatever outbound messaging was offered required that the owner record any outbound message (e.g. a general greeting or caller-specific/mail box-specific message) either from within range of the microphone on the TAD or from a real-time telephone call.
`
It is unfortunate that voice messaging, whether network based or TAD based, is limited to on-line and real-time transmission and physically requires access to a telephone set, TAD or multimedia computer, particularly because voice communication inherently does not require any external hardware or instrumentation other than the mouth and ear for a human being to create or access it. Speech is the most natural and self-sufficient form of communication. Speech is hands-free, requiring neither writing instrument, keyboard, screen, dedicated vision nor hand-to-eye coordination on the part of the user to input or retrieve. That voice mail is nonetheless so widely used is more a function of speech's unique characteristics than a vote of approval on the adequacy of the current technology. Similarly, that so many innovative utilities have been introduced which make audio and voice available over public and private networks is a commentary on the compelling nature of audio and voice for content, messaging and issuing commands and only underscores the need to make audio and voice more easily available. Until such time that voice messaging and audio content are made more accessible, many of the network based audio utilities mentioned above will remain novelties for technophiles.
`
Much has been said about Computer Telephone Integration (CTI) and the Universal Mail Box, where network based messages and content may originate in any medium and by any input device of choice and, likewise, may be retrieved in any medium or by any output device of choice. Faxes can be accessed as data on a computer screen, data can be accessed as a fax or text-to-speech audio-text and, as automatic speech transcription utilities become more capable, audio will be accessed as printed text in email or fax. However, as long as audio does not have an input/output device of choice other than a telephone handset or screen/keyboard-based multimedia computer, its desirability as a medium of choice will likewise be severely limited.
`
Since speech is a direct record of the user's voice, the urgency, meaning and emotional content is never lost. Similarly, since so much data is first generated in voice and is only later transcribed to text or data, info-text should be the preferred medium for timely data on meetings, speeches and radio broadcasts. Ideally, voice mail should be the preferred mode of communications when traveling, when communicating through time-zones and when accessing timely information which originated in the spoken word (e.g. minutes of a meeting or lecture). Voice text (i.e. data or text which is spoken by a computer or pre-recorded by a human) should be the preferred format for messaging information to be accessed where use of motor skills and vision are not convenient or are impaired, such as when driving, operating equipment or engaged in a leisure activity.
`
`
The current use of a telephone to access voice messages directly has significantly limited the potential utilization of voice messaging. Real-time transmission of voice messages and info-text makes the recording and retrieval of voice mail, especially from long distances, very costly. The cost and inconvenience involved means that one cannot compose and review voice mail and info-text in a cost efficient manner and at one's own pace. One is limited to a location and situation in which a telephone is accessible and, in the case of a wireless communication link, to a place where wireless transmission is both possible and desirable.
`
The application of multimedia computers to compose and review voice mail has had little effect on making voice messaging more convenient since the use of keyboards, pointing devices and screens is hardly hands-free, nor is the size and expense of a multimedia computer conducive to widespread use and transportability. In its present state, voice mail is limited to short messages between individuals wishing to communicate in a more substantive fashion at another time (telephone tag). Voice "mail" becomes limited to voice "messaging" because of the cost and inconvenience to both the sender and receiver of listening to lengthy, content-rich "mail" over the phone or at a multimedia computer. Furthermore, the cost of transmitting audio signals in real-time, through a direct communication link to the user's voice processor or TAD, and only when the user has access to a telephone (as opposed to un-attended recording at off-peak hours) makes more commercial use of info text (recorded instructions, recorded travelogues, speech transcripts, articles or books on "tape" etc.) and other innovative advertiser/subscriber supported uses of voice-text unfeasible.
`
`
Recently, U.S. Pat. No. 5,444,768, issued to Charles Lamar et al. and assigned to International Business Machines Corp., and U.S. Pat. No. 5,359,698, issued to Shmuel Goldberg et al. and assigned to Espro Engineering, both disclose a portable computer device for audible processing of audio messages stored at one or more remote central message facilities. The Lamar et al. system permits the user to record and playback, transmit (upload) and receive (download) voice messages from a central message facility and over a communication link and onto a portable device; however, the Lamar et al. system requires that a direct telephonic link be established between the portable device and one or more remote central message facilities. The Lamar et al. and Goldberg et al. systems enable the portable device to individually access a traditional, closed, expensive, proprietary voice processing system through a direct communication link. The Lamar et al. and Goldberg et al. systems do not provide a commercially feasible solution for accessing voice mail other than by way of a long distance call to a central message facility. The expense associated with such a long distance toll charge would make extended usage of the Lamar et al. system prohibitive. In addition, the Lamar et al. system requires that a user contact one or more remote central message facilities to retrieve and transmit selected audio files. The inconvenience associated with such a polling procedure nullifies the convenience provided by the system.
`
Similarly, the Lamar et al. system does not provide for a method by which the user may browse available audio content nor for a method to select audio files from a menu for subsequent retrieval by the portable computer device. Similarly, the Lamar et al. system does not provide for a utility whereby the user may remotely access a central server linked to a network of servers to download control code, search a personal user group or public database for an address other than by way of initiating a dedicated "training" mode by either coupling the portable computer device directly to a computer or by way of detecting and recording DTMF tones generated locally by a standard touch-tone telephone device. Since a typical user's mail box utilities are handled on his network e-mail server and modified regularly in the course of his sending and receiving e-mail, such a dedicated training session for the portable computer device is impractical. Similarly, since new audio server platforms, utilities and compression schemes are being introduced regularly, there is a need for a dynamic and transparent method for updating both control codes and address books without the need for a dedicated training session.
`
Broadly, it is an object of the present invention to provide an Internet-ready dictation and voice message recording/reviewing device and method which enable a user to compose and review voice mail off-line, from any location, while engaged in any activity, at a leisurely pace, without incurring telephone toll charges and whether a communication link is presently accessible or not.

It is also an object of the present invention to use a telephone link, preferably to a local network access point, primarily as a communications link for high speed transmission of pre-recorded material and control codes to facilitate that transmission, thereby limiting the use of a telephone or a multimedia computer and telephone line for voice messaging as a recording or playback device.

It is also an object of the present invention to provide a protocol whereby pre-message handshaking occurs between a dictation and voice message recording/reviewing device and a network server to conform the digitized voice signal to one of the standard voice compression protocols and TCP/IP protocol stacks to facilitate a high speed transmission of voice messages over the network.

It is another object of the present invention to provide a portable and dedicated voice capable network (Internet) access device which enables the user to record, edit and play audio files which may be transmitted and/or received over a public or private network.

It is also an object of the present invention to provide a portable access device and method which permit the owner of a specially modem-configured Telephone Answering Device (M-TAD) to access and download compressed voice message files directly from the TAD's digital memory onto a portable voice message record/playback device either by way of a direct cable connection to the TAD or by a telephone link.
`
Providing such a portable access device and method would permit TAD owners to encourage inbound callers to leave more robust and data-rich audio messages on their TAD as well as permit TAD owners to subscribe to audio content which could be regularly delivered to their TAD in compressed digital form and downloaded onto the present invention for play-back and review at a convenient time and place. This would also permit TAD owners, while away from their home or office, to have their portable dictation and voice message recording/reviewing device establish a telephone link with their TAD and economically and automatically retrieve all stored messages and update all outgoing messages (e.g. general and caller specific greetings), with all stored messages and outbound greetings being transmitted in digitized and compressed format.
`
The invention provides a low cost, portable recording and playback dictation and voice message recording/reviewing device which permits the user to record, edit, play and review voice messages including audio-text, text-to-speech and other audio material which may be received from and subsequently transmitted to a remote host computer located on a public or private network over a communication link such as the public switched telephone system.
`
A preferred device contains its own rechargeable power source, integrated circuitry and control buttons to permit the localized recording, editing, storage, playback and transcription of audio signals through a built-in speaker, microphone or plug-in headset, foot pedal and removable memory card. The device also contains a standard RJ-11 telephone jack, modem chip set (or software), or a removable PCMCIA connector to which a standard or wireless modem card could be connected, and a DTMF tone decoder to permit the transmission and control of audio signals to and from a host computer connected to a public or private network. The device contains circuitry which permits it to transmit and receive audio signals at a rate substantially faster than originally recorded.
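The application does not say how the DTMF tone decoder works. As a hedged illustration only, the sketch below uses the Goertzel algorithm, a common software technique for detecting the eight standard DTMF frequencies; it is not presented as the device's actual circuit. The 8000 Hz sample rate, the 205-sample block and all function names are assumptions, and the sketch omits the silence and threshold checks a practical decoder would need.

```python
import math

LOW_HZ = (697, 770, 852, 941)        # keypad row frequencies
HIGH_HZ = (1209, 1336, 1477, 1633)   # keypad column frequencies
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq_hz, rate_hz):
    """Signal power near freq_hz, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / rate_hz)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def decode_key(samples, rate_hz=8000):
    """Pick the strongest row and column frequency and map them to a key."""
    row = max(range(4), key=lambda i: goertzel_power(samples, LOW_HZ[i], rate_hz))
    col = max(range(4), key=lambda i: goertzel_power(samples, HIGH_HZ[i], rate_hz))
    return KEYPAD[row][col]


def make_tone(key, rate_hz=8000, n=205):
    """Synthesize n samples of the two-tone signal for one key (test helper)."""
    if len(key) != 1:
        raise ValueError("key must be a single character")
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col >= 0:
            lo, hi = LOW_HZ[row], HIGH_HZ[col]
            return [
                math.sin(2 * math.pi * lo * t / rate_hz)
                + math.sin(2 * math.pi * hi * t / rate_hz)
                for t in range(n)
            ]
    raise ValueError(f"not a DTMF key: {key!r}")


if __name__ == "__main__":
    print(decode_key(make_tone("5")))
```

Each key press is the sum of one row tone and one column tone, so a decoder of this kind only has to measure eight frequencies rather than compute a full spectrum.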
`
A preferred device also contains a processor which includes the necessary terminal emulation to permit a network user to access a network directly from a local point of access, such as an Internet service provider's (ISP) point of access and shell account, using a standard protocol such as SMTP (Simple Mail Transfer Protocol), Post Office Protocol (POP3) and MIME (Multipurpose Internet Mail Extensions) in the TCP/IP suite to review, select and retrieve audio files that have been sent to the user's e-mail address (or similarly, data/text files which can be translated into voice), and to download and transmit such files.
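The retrieval step described above, pulling mail over POP3 and picking out the MIME parts that carry audio, can be sketched with Python's standard `poplib` and `email` modules. This is an illustration under stated assumptions, not the device's firmware: the host name, credentials, sample message and function names are hypothetical, and the POP3 function is shown without error handling or encryption.

```python
import email
import poplib
from email import policy
from email.message import EmailMessage


def audio_attachments(raw_message: bytes):
    """Return (filename, payload bytes) for every audio/* MIME part."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    found = []
    for part in msg.walk():
        if part.get_content_maintype() == "audio":
            found.append((part.get_filename(), part.get_payload(decode=True)))
    return found


def fetch_audio(host: str, user: str, password: str):
    """Read every message in a POP3 mailbox and collect its audio parts."""
    box = poplib.POP3(host)  # placeholder host; plain-text POP3 for brevity
    try:
        box.user(user)
        box.pass_(password)
        count, _size = box.stat()
        results = []
        for number in range(1, count + 1):
            _resp, lines, _octets = box.retr(number)
            results.extend(audio_attachments(b"\r\n".join(lines)))
        return results
    finally:
        box.quit()


def sample_voice_mail() -> bytes:
    """Build a small MIME message with one audio attachment (demo data)."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "user@example.com"
    msg["Subject"] = "Voice memo"
    msg.set_content("Voice message attached.")
    msg.add_attachment(b"RIFF-demo-bytes", maintype="audio",
                       subtype="wav", filename="memo.wav")
    return msg.as_bytes()


if __name__ == "__main__":
    for name, data in audio_attachments(sample_voice_mail()):
        print(name, len(data))
```

Filtering on the MIME main type lets the same loop skip text and image parts, which matches the paragraph's distinction between audio files and data/text files that would need conversion to voice.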
`
A preferred device also contains a standard or touchscreen display and software which permits the user to display a similar graphical editor for composing and reading e-mail messages as is displayed on his computer screen when accessing his e-mail, so that the user can scroll through his e-mail messages, selecting those audio files he wishes to download and selecting text messages he wishes to have converted, either by the network server or at the device, into an audio format (text-to-speech).
`
A preferred device also contains a cradle into which the device may be placed, the cradle having ports which enable it to be connected to: a power source to recharge the device's batteries; a phone jack to enable it to establish a communication link; and a serial or parallel port on a computer for downloading and uploading files directly to the computer or for receiving "redirected" files.
`
A preferred device also contains a language user interface capable of recognizing and responding to speech with speech. Such an interface includes speaker independent functions but also permits speaker adaptation which allows the personal device to adjust to the peculiarities of the user's voice or pronunciations and thus improve accuracy. This speaker adaptation is achieved through a protocol which allows the system to adapt to the user's voice through the repetition of a set of sentences prior to first use of the device (see Lernout & Hauspie Speech Products' [LHSP] asr1000 product line). The language interface includes a vocabulary builder which permits the user to extend the vocabulary, including special terms and proper nouns, to the speech recognition application (see LHSP Lextool™); a user template which enables the user to create words which the device will associate with user defined commands, e.g. "home" could be associated with an address (LHSP asr 200 product line); alphabet recognition for spelling an address; as well as background noise tolerance and speech at a distance software which improve the accuracy of the language user interface even in an automobile, airplane or public place and even if the user is not wearing a headset (see LHSP).
`
A preferred device also contains public-key encryption technology designed to ensure reliable and secure transmission of sensitive information by encrypting and decrypting the message data and by authenticating the sender's identity by using a secure digital or voice signature.

A preferred device also contains a text-to-speech utility which permits the user to download data not already converted to speech by a network server and to do so at the device.
`
A preferred device also contains a bar code reader which permits the user to scan a printed bar code associated with printed matter such as a news article, a map, a menu of available audio files or in a travel guide which would give the device all the information it needs including network server address, file location and file ID so that the audio file associated with the printed matter could be automatically retrieved from a network such as the Internet.
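The application names three items a scanned bar code would carry (network server address, file location and file ID) but gives no encoding. The sketch below assumes, purely for illustration, a pipe-delimited text payload; the delimiter, field order, class name and function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioReference:
    """The three items the text says a bar code would carry."""
    server: str     # network server address
    location: str   # file location on that server
    file_id: str    # identifier of the audio file


def parse_bar_code(payload: str) -> AudioReference:
    """Split an assumed 'server|location|file_id' payload into its fields."""
    fields = [field.strip() for field in payload.split("|")]
    if len(fields) != 3 or not all(fields):
        raise ValueError(f"expected 'server|location|file_id', got {payload!r}")
    return AudioReference(*fields)


if __name__ == "__main__":
    print(parse_bar_code("audio.example.com|/guides/paris|0042"))
```

With the reference parsed, the device would have what it needs to request the file without the user typing an address, which is the convenience the paragraph describes.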
`
A preferred device also contains a bar code reader which permits the user to scan a printed bar code associated with printed matter such as a news article, a map, a menu of available audio files or in a travel guide which would give the device all the information it needs to play a file from a previously retrieved group of audio files (such as described in Goldberg et al.).
`
`
A preferred device also contains an Infrared interface using a standard such as the Infrared Data Association (IrDA) for high speed local wireless transmission (e.g. 1.2 Mbps and 4 Mbps) of audio files and control codes between the device and a public phone, kiosk or the user's computer.
`
`A prefe