`
WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification 6: H04J 1/00
(11) International Publication Number: WO 98/47252
(43) International Publication Date: 22 October 1998 (22.10.98)
(21) International Application Number: PCT/US98/07228
(22) International Filing Date: 11 April 1998 (11.04.98)
(30) Priority Data: 60/043,302, 11 April 1997 (11.04.97), US
(71)(72) Applicant and Inventor: STERN, Geoffrey [US/US]; 9 Apache Trail, Westport, CT 06880 (US).
(72) Inventor; and
(75) Inventor/Applicant (for US only): WEXLER, Gil [IL/IL]; Beeri Street 54, 64233 Tel Aviv (IL).
(74) Agents: LERCH, Joseph, B. et al.; Darby & Darby P.C., 805 Third Avenue, New York, NY 10022-7513 (US).
(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE, GW, HU, ID, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MD, MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, UA, UG, US, UZ, VN, YU, ZW, ARIPO patent (GH, GM, KE, LS, MW, SD, SZ, UG, ZW), Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European patent (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN, ML, MR, NE, SN, TD, TG).

Published
Without international search report and to be republished upon receipt of that report.
`
`
`
`
`
`GOOGLE 1006
`
`
`
`
`(54) Title: PERSONAL AUDIO MESSAGE PROCESSOR AND METHOD
`
`(57) Abstract
`
A portable device is disclosed which permits the user to record, edit, play and review voice messages and other audio material which may be received from, and subsequently transmitted to, a remote voice processing or interactive voice response (IVR) host computer over a communication link. A preferred device contains its own power source, integrated circuitry and control buttons to permit the localized recording, editing, storage and playback of audio signals through a built-in speaker, microphone and removable memory card. The device also contains a standard RJ-11 telephone jack, modem chip set and DTMF tone decoder to permit the transmission and control of audio signals to and from a host computer. The device contains circuitry which permits it to transmit and receive audio signals at a rate substantially faster than originally recorded.
`
`
`
`
`
`
`FOR THE PURPOSES OF INFORMATION ONLY
`
Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT.

AL  Albania
AM  Armenia
AT  Austria
AU  Australia
AZ  Azerbaijan
BA  Bosnia and Herzegovina
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Côte d'Ivoire
CM  Cameroon
CN  China
CU  Cuba
CZ  Czech Republic
DE  Germany
DK  Denmark
EE  Estonia
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GH  Ghana
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IL  Israel
IS  Iceland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgyzstan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakstan
LC  Saint Lucia
LI  Liechtenstein
LK  Sri Lanka
LR  Liberia
LS  Lesotho
LT  Lithuania
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
MK  The former Yugoslav Republic of Macedonia
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
MX  Mexico
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SG  Singapore
SI  Slovenia
SK  Slovakia
SN  Senegal
SZ  Swaziland
TD  Chad
TG  Togo
TJ  Tajikistan
TM  Turkmenistan
TR  Turkey
TT  Trinidad and Tobago
UA  Ukraine
UG  Uganda
US  United States of America
UZ  Uzbekistan
VN  Viet Nam
YU  Yugoslavia
ZW  Zimbabwe
`
`
`
`
`PERSONAL AUDIO MESSAGE PROCESSOR AND METHOD
`
`
`
Field of the Invention

The present invention relates generally to dictation and audio communication devices and, more particularly, concerns a method and portable apparatus for audio communication, including the recording and editing of voice mail and audio content and its transmission and reception over a private or public network, such as the Internet, using common electrical communication media or data links.
`
`Background of the Invention
`
All electronic message systems, with the exception of voice-mail, have intermediate devices or storage media whereby data may be transferred, preferably at a high transmission rate, over a standard communication link and stored in a storage medium or onto an unattended device for later off-line access, review and editing by the intended user.

In the case of a facsimile transmission, an image is scanned by the transmitter and then transmitted and ultimately printed at a remote site for off-line utilization by the intended receiver. In the case of electronic mail, data is generated on a computer and then transmitted and stored either directly on the intended user's unattended computer or on a central host computer linked to a network of computers for subsequent retrieval by the intended user. The most common networks are Local Area Networks (LAN), Wide Area Networks (WAN), and public networks, such as the Internet, or private networks. When the intended user accesses his computer, either the E-mail is already resident, or he finds a message displayed
`
`SUBSTITUTE SHEET (RULE 26)
`
in a graphic editor indicating that he has mail and how he can retrieve it. Once the E-mail is retrieved, it likewise may be read, reviewed and manipulated by the intended user off-line on the user's computer. Alternately, it may be outputted to a printer, providing the user a hard copy for review at his convenience.
`
When a facsimile machine is unavailable, a facsimile may be transmitted to a computer or to a handheld, paperless fax machine, such as Reflection Technology, Inc.'s FaxView personal fax reader, for off-line and independent review by the recipient. Utilities exist for both facsimile and E-mail messages, whereby messages may be selected from a host by an authorized user for subsequent transmission to the user's E-mail address or unattended facsimile machine. See, for example, Duehren et al., U.S. Patent No. 4,918,722.
`
Recently, with the widespread and growing usage of the Internet and, more particularly, with the growing popularity of Web sites offering published material in the form of HTML (Hyper Text Markup Language) documents, utilities have been created which permit such files to be selected for subsequent off-line access and independent review by fax. See, for example, FactsLine for the Web, by Ibex Technologies, Inc. Such a utility makes the large volume of information and graphics offered over the Internet available to users who either do not have access to a computer connected to the Internet, or wish to limit the amount of time spent on-line.
`
A large percentage of potential users do not have access to the Internet or, even if they do, may be traveling, may not have access to their computers, or may not wish to spend time booting their computer and waiting for Web site graphics. For such users, utilities such as Web-On Call Voice Browser by Netphonic Communications, Inc. have been introduced which permit users to access the Internet in response to voice prompts, to navigate to a document or E-mail of interest, to identify a document by number, and to have a selected document read in real time over the phone using a text-synthesizing voice, faxed back, or sent as an e-mail attachment.
`
Similarly, the widespread use of the Internet and heavy traffic to particularly popular Web sites or during particular peak usage times has created a demand for utilities called off-line browsers which permit Internet users to "Subscribe" to particular Web sites from which their computer then automatically retrieves material during off-peak hours, categorizes and organizes new and updated information and permits the user to review it off-line using his browser of choice (e.g. FreeLoader by FreeLoader, Inc.).
`
Similarly, subscription services have been introduced which permit voice mail to be sent to an e-mail address and also permit audio content offered on a Web site to be updated, both by way of a standard phone call to an interactive voice response (IVR) system (e.g. "Amail" and "Dialweb" by Telet Communications).
`
Recently, voice processor system manufacturers have established a work group consisting of more than 60% of the world's voice mail system market to develop an interoperability standard for a Voice Profile for Internet Mail (VPIM). TCP/IP (Transmission Control Protocol/Internet Protocol) has been selected as the vehicle of connectivity, because of its globally accessible points of contact, primarily on the Internet, and because of its use of commonly recognized transmission protocols, specifically Simple Mail Transfer Protocol (SMTP) and Multipurpose Internet Mail Extensions (MIME), as the core of VPIM (see April 29, 1996 issue of Business Wire). Once implemented, interoperable standards such as VPIM will permit voice mail users to send and receive their voice messages over the Internet or an Intranet as easily as they can now do so over the telephone.
`
In addition to voice messaging and audio e-mail over the Internet, the recent introduction of proprietary client server software systems permits users with conventional multimedia personal computers and voice grade telephone lines to browse, select, and play back audio or audio-based multimedia content in real-time streams (RE) or download on-demand (REM). An interested user need only download software from the content provider's Web site to access such audio content (e.g. Progressive Networks' RealAudio Player and Server). Systems such as this represent a real breakthrough, since in the past, delivery of audio by conventional on-line methods downloaded it at such low rates that acquiring the information took five times as long as the actual program. This required the listener to wait 25 minutes before listening to 5 minutes of audio.
`
As a result of the availability of streaming audio over the Internet, a number of companies have introduced Internet telephone products which permit users having multimedia computers programmed with proprietary software to talk in real time over the Internet (see VocalTec). Such a system is useful over long distances when users can access a local Internet access point or point of presence, making a long distance call into a local call.
`
Similarly, as a result of streaming audio over the Internet, content providers are able to broadcast live audio from a Web site (e.g. AudioNet by Cameron Audio Networks).
`
Recently a standard-based implementation for communication over the Internet has been introduced, and supported by Intel and Microsoft, which makes use of the DSP Group's TrueSpeech G.723 compression technology. This uses an advanced algorithm that results in excellent voice quality, despite a high compression ratio, and operates at 6.3 kilobits per second (kbps) and 5.3 kbps with compression ratios of 20:1 and 24:1, respectively. It also includes silence compression which can bring the effective rate down to less than 3.7 kbps. At a 28.8 kbps modem speed, this would permit the transmission of audio at a rate of 1:7.78, or 10 minutes of audio in 1.3 minutes.
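The arithmetic behind these figures can be checked with a short sketch. It assumes only the numbers quoted above (a 28.8 kbps modem link and an effective audio rate of 3.7 kbps after silence compression); the function names are illustrative and are not part of the original disclosure.

```python
def speedup(link_kbps: float, audio_kbps: float) -> float:
    """How many times faster than real time audio can cross the link."""
    return link_kbps / audio_kbps


def transfer_minutes(audio_minutes: float, link_kbps: float, audio_kbps: float) -> float:
    """Time needed to send a recording of the given length over the link."""
    return audio_minutes / speedup(link_kbps, audio_kbps)


# Figures quoted in the text: 28.8 kbps modem, 3.7 kbps effective audio rate.
ratio = speedup(28.8, 3.7)                 # about 7.78
minutes = transfer_minutes(10, 28.8, 3.7)  # about 1.3 minutes for 10 minutes of audio
print(f"1:{ratio:.2f}, {minutes:.1f} minutes")
```

The same calculation without silence compression, at the 6.3 kbps rate, gives a ratio of roughly 4.6, which suggests how much of the quoted speed-up depends on removing silence.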
`
Using Texas Instruments' C80 DSP chip with a V.34 modem running at 28.8 kbps, audio can be transmitted at a rate of 10:1 (ten minutes of speech in 1 minute of transmission) with telephone grade sound quality.
`
From the above, it is apparent that while the transfer of data, graphics and audio messaging and content over a network has become more widespread and convenient, this growth has also highlighted certain historic shortcomings associated with the transfer and input/output of voice messaging and audio content. As voice messaging and audio content become more available, the deficiency created by the lack of an intermediate device or storage medium for such audio will become more pronounced.
`
For both E-mail and facsimile, use of a telephone link is limited to the transmission of the data and the transmission of control codes for that data. With the growth and widespread usage of network computing, the telephone link for e-mail and facsimile (e.g. PASSaFAX from RADLinx) is further limited to a hook-up to a local point of presence to access the network. Both e-mail and facsimile contain content which may be outputted by the intended user to a printer, which permits the user to take a hard copy of the material with him for review at his convenience, while he is away from his office or traveling.
`
In sharp contrast, voice messages and voice-text are currently recorded by the sender and retrieved by the intended recipient primarily in real-time and on-line. At best, a user can use his multimedia notebook computer to record and access a stored audio file or streaming voice file. Off-line access to audio is limited to downloading audio files onto a multimedia computer and having the sound card equipped computer play the audio. However, a multimedia computer, with its screen, keyboard and multipurpose processing capability, is hardly the size of a traditional dictation device or voice recorder. This dependence on a telephone hand set or multimedia computer to create and access audio is analogous to requiring a recipient of a facsimile to view, edit and prepare a facsimile only while in close proximity to a facsimile machine or fax enabled computer. Not being able to prepare, review and access network based voice mail other than in real-time from a telephone hand set or off-line from a multimedia computer severely limits the desirability of integrating voice messaging and audio content into network based messaging. There exist no dedicated and portable devices to store network based voice messaging and likewise there exists no method or utility to scan and select personal voice messages or public announcements from a host connected to a network for subsequent high speed transmission to a device for subsequent off-line review by the user.

The only dedicated device which permits the user to review his/her voice messages off-line is the Telephone Answering Device (TAD), which is primarily a residential or small-office, home-office (SOHO) appliance which uses digital recording technologies to replace the standard functions of a traditional tape-based answering machine. The TAD, plugged into both an electrical outlet and phone jack, is not portable, so the user must either be within hearing distance of the TAD's speaker or, using a telephone, may call in to retrieve his/her messages on-line and in real-time. While traditionally TADs have offered very limited outbound messaging capabilities, whatever outbound messaging was offered required that the owner record any outbound message (e.g. a general greeting or caller-specific/mail box-specific message) either from within range of the microphone on the TAD or from a real-time telephone call.
`
It is unfortunate that voice messaging, whether network based or TAD based, is limited to on-line and real-time transmission and physically requires access to a telephone set, TAD or multimedia computer, particularly because voice communication inherently does not require any external hardware or instrumentation other than the mouth and ear for a human being to create or access it. Speech is the most natural and self-sufficient form of communication. Speech is hands-free, requiring neither writing instrument, keyboard, screen, dedicated vision nor hand-to-eye coordination on the part of the user to input or retrieve. That voice mail is nonetheless so widely used is more a function of speech's unique characteristics than a vote of approval on the adequacy of the current technology. Similarly, that so many innovative utilities have been introduced which make audio and voice available over public and private networks is a commentary on the compelling nature of audio and voice for content, messaging and issuing commands and only underscores the need to make audio and voice more easily available. Until such time that voice messaging and audio content are made more accessible, many of the network based audio utilities mentioned above will remain novelties for technophiles.
`
Much has been said about Computer Telephone Integration (CTI) and the Universal Mail Box, where network based messages and content may originate in any medium and by any input device of choice and, likewise, may be retrieved in any medium or by any output device of choice. Faxes can be accessed as data on a computer screen, data can be accessed as a fax or text-to-speech audio-text and, as automatic speech transcription utilities become more capable, audio will be accessed as printed text in email or fax. However, as long as audio does not have an input/output device of choice other than a telephone handset or screen/keyboard based multimedia computer, its desirability as a medium of choice will likewise be severely limited.
`
Since speech is a direct record of the user's voice, the urgency, meaning and emotional content is never lost. Similarly, since so much data is first generated in voice and is only later transcribed to text or data, info-text should be the preferred medium for timely data on meetings, speeches and radio broadcasts. Ideally, voice mail should be the preferred mode of communications when traveling, when communicating through time-zones and when accessing timely information which originated in the spoken word (e.g. minutes of a meeting or lecture). Voice text (i.e. data or text which is spoken by a computer or pre-recorded by a human) should be the preferred format for messaging information to be accessed where use of motor skills and vision are not convenient or are impaired such as when driving, operating equipment or engaged in a leisure activity.
`
The current use of a telephone to access voice messages directly has significantly limited the potential utilization of voice messaging. Real-time transmission of voice messages and info-text makes the recording and retrieval of voice mail, especially from long distances, very costly. The cost and inconvenience involved means that one cannot compose and review voice mail and info-text in a cost efficient manner and at one's own pace. One is limited to a location and situation in which a telephone is accessible and, in the case of a wireless communication link, to a place where wireless transmission is both possible and desirable.
`
The application of multimedia computers to compose and review voice mail has had little effect on making voice messaging more convenient since the use of keyboards, pointing devices and screens is hardly hands-free, nor is the size and expense of a multimedia computer conducive to widespread use and transportability. In its present state, voice mail is limited to short messages between individuals wishing to communicate in a more substantive fashion at another time (telephone tag). Voice "mail" becomes limited to voice "messaging" because of the cost and inconvenience to both the sender and receiver of listening to lengthy, content-rich "mail" over the phone or at a multimedia computer. Furthermore, the cost of transmitting audio signals in real-time, through a direct communication link to the user's voice processor or TAD, and only when the user has access to a telephone (as opposed to un-attended recording at off-peak hours), makes more commercial use of info text (recorded instructions, recorded travelogues, speech transcripts, articles or books on "tape" etc.) and other innovative advertiser/subscriber supported uses of voice-text unfeasible.
`
Recently, U.S. Pat. No. 5,444,768, issued to Charles Lamer et al., and assigned to International Business Machines Corp., and U.S. Pat. No. 5,359,698, issued to Shmuel Goldberg et al. and assigned to Espro Engineering, both disclose a portable computer device for audible processing of audio messages stored at one or more remote central message facilities. The Lamer et al. system permits the user to record and playback, transmit (upload) and receive (download) voice messages from a central message facility over a communication link and onto a portable device; however, the Lamer et al. system requires that a direct telephonic link be established between the portable device and one or more remote central message facilities. The Lamer et al. and Goldberg et al. systems enable the portable device to individually access a traditional, closed, expensive, proprietary voice processing system through a direct communication link. The Lamer et al. and Goldberg et al. systems do not provide a commercially feasible solution for accessing voice mail other than by way of a long distance call to a central message facility. The expense associated with such a long distance toll charge would make extended usage of the Lamer et al. system prohibitive. In addition, the Lamer et al. system requires that a user contact one or more remote central message facilities to retrieve and transmit selected audio files. The inconvenience associated with such a polling procedure nullifies the convenience provided by the system.
`
Similarly, the Lamer et al. system does not provide for a method by which the user may browse available audio content nor for a method to select audio files from a menu for subsequent retrieval by the portable computer device. Similarly, the Lamer et al. system does not provide for a utility whereby the user may remotely access a central server linked to a network of servers to download control code, search a personal user group or public database for an address other than by way of initiating a dedicated "training" mode by either coupling the portable computer device directly to a computer or by way of detecting and recording DTMF tones generated locally by a standard touch-tone telephone device. Since a typical user's mail box utilities are handled on his network e-mail server and modified regularly in the course of his sending and receiving e-mail, such a dedicated training session for the portable computer device is impractical. Similarly, since new audio server platforms, utilities and compression schemes are being introduced regularly, there is a need for a dynamic and transparent method for updating both control codes and address books without the need for a dedicated training session.
`
Broadly, it is an object of the present invention to provide an Internet-ready dictation and voice message recording/reviewing device and method which enable a user to compose and review voice mail off-line, from any location, while engaged in any activity, at a leisurely pace, without incurring telephone toll charges and whether a communication link is presently accessible or not.

It is also an object of the present invention to use a telephone link preferably to a local network access point primarily as a communications link for high speed transmission of pre-recorded material and control codes to facilitate that transmission, thereby limiting the use of a telephone or a multimedia computer and telephone line for voice messaging as a recording or playback device.
`
It is also an object of the present invention to provide a protocol whereby pre-message handshaking occurs between a dictation and voice message recording/reviewing device and a network server to conform the digitized voice signal to one of the standard voice compression protocols and TCP/IP protocol stacks to facilitate a high speed transmission of voice messages over the network.
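One way to picture such pre-message handshaking is as a capability negotiation: the device offers the compression schemes it supports in order of preference, and the first one the server also supports is used. The sketch below is only an illustration under that assumption; the text does not specify the handshake, and the function name and codec labels are hypothetical.

```python
from typing import Optional, Sequence


def negotiate_codec(device_codecs: Sequence[str],
                    server_codecs: Sequence[str]) -> Optional[str]:
    """Return the first codec in the device's preference order that the
    server also supports, or None when there is no common codec."""
    supported = set(server_codecs)
    for codec in device_codecs:
        if codec in supported:
            return codec
    return None


# Hypothetical labels: the device prefers the lowest bit rate it can produce.
device = ["G.723-5.3k", "G.723-6.3k", "PCM-64k"]
server = ["PCM-64k", "G.723-6.3k"]
print(negotiate_codec(device, server))  # G.723-6.3k
```

Once a common codec is agreed, the device would encode (or transcode) the recorded message accordingly before sending it; a None result would mean falling back to some default or refusing the transfer.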
`
It is another object of the present invention to provide a portable and dedicated voice capable network (Internet) access device which enables the user to record, edit and play audio files which may be transmitted and/or received over a public or private network.

It is also an object of the present invention to provide a portable access device and method which permit the owner of a specially modem-configured Telephone Answering Device (M-TAD) to access and download compressed voice message files directly from the TAD's digital memory onto a portable voice message record/playback device either by way of a direct cable connection to the TAD or by a telephone link.

Providing such a portable access device and method would permit TAD owners to encourage inbound callers to leave more robust and data-rich audio messages on their TAD as well as permit TAD owners to subscribe to audio content which could be regularly delivered to their TAD in compressed digital form and downloaded onto the present invention for play-back and review at a convenient time and place. This would also permit TAD owners, while away from their home or office, to have their portable dictation and voice message recording/reviewing device establish a telephone link with their TAD and economically and automatically retrieve all stored messages and update all outgoing messages (e.g. general and caller specific greetings), with all stored messages and outbound greetings being transmitted in digitized and compressed format.
`
The invention provides a low cost, portable recording and playback dictation and voice message recording/reviewing device which permits the user to record, edit, play and review voice messages including audio-text, text-to-speech and other audio material which may be received from and subsequently transmitted to a remote host computer located on a public or private network over a communication link such as the public switched telephone system.

A preferred device contains its own rechargeable power source, integrated circuitry and control buttons to permit the localized recording, editing, storage, playback and transcription of audio signals through a built-in speaker, microphone or plug-in headset, foot pedal and removable memory card. The device also contains a standard RJ-11 telephone jack, modem chip set (or software), or a removable PCMCIA connector to which a standard or wireless modem card could be connected; and a DTMF tone decoder to permit the transmission and control of audio signals to and from a host computer connected to a public or private network. The device contains circuitry which permits it to transmit and receive audio signals at a rate substantially faster than originally recorded.
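The text does not say how the DTMF tone decoder works. A common software approach is the Goertzel algorithm, which measures signal energy at each of the eight DTMF frequencies and picks the strongest row and column tone. The sketch below is a minimal illustration of that general technique, not the circuit described here; the 8 kHz sample rate, 205-sample block and all function names are assumptions.

```python
import math

LOW_FREQS = (697, 770, 852, 941)        # DTMF row frequencies in Hz
HIGH_FREQS = (1209, 1336, 1477, 1633)   # DTMF column frequencies in Hz
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, sample_rate):
    """Relative signal power at one frequency (Goertzel recurrence)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def decode_dtmf(samples, sample_rate=8000):
    """Pick the strongest row and column tone and map them to a key.
    No energy threshold is applied, so silence still yields some key."""
    row = max(range(4), key=lambda i: goertzel_power(samples, LOW_FREQS[i], sample_rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, HIGH_FREQS[i], sample_rate))
    return KEYPAD[row][col]


def make_tone(key, sample_rate=8000, n=205):
    """Synthesize a test block containing the two tones for one key."""
    row = next(i for i, keys in enumerate(KEYPAD) if key in keys)
    col = KEYPAD[row].index(key)
    return [math.sin(2 * math.pi * LOW_FREQS[row] * t / sample_rate)
            + math.sin(2 * math.pi * HIGH_FREQS[col] * t / sample_rate)
            for t in range(n)]


print(decode_dtmf(make_tone("5")))
```

A practical decoder would add a minimum-energy threshold and a check that exactly one row and one column tone dominate, so that speech or silence is not mistaken for a key press.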
`
A preferred device also contains a processor which includes the necessary terminal emulation to permit a network user to access a network directly from a local point of access, such as an Internet service provider's (ISP) point of access and shell account, using a standard protocol such as SMTP (Simple Mail Transfer Protocol), Post Office Protocol (POP3) and MIME (Multipurpose Internet Mail Extensions) in the TCP/IP suite to review, select and retrieve audio files that have been sent to the user's e-mail address (or similarly, data/text files which can be translated into voice), and to download and transmit such files.
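To make the role of MIME concrete, the following sketch uses Python's standard `email` package to wrap a recorded message as an audio attachment and to pull audio attachments back out of a received message. It is a minimal illustration, not the device's firmware: the addresses, file name and function names are hypothetical, and no network connection is made. In practice the raw message bytes would be handed to an SMTP client for sending or obtained from a POP3 mailbox on retrieval.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_voice_mail(sender: str, recipient: str, audio: bytes,
                     filename: str = "message.wav") -> EmailMessage:
    """Wrap recorded audio as a MIME attachment on an e-mail message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Voice message"
    msg.set_content("A voice message is attached.")
    msg.add_attachment(audio, maintype="audio", subtype="wav", filename=filename)
    return msg


def extract_audio_parts(raw: bytes):
    """Return (filename, audio bytes) for every audio attachment in a message."""
    msg = message_from_bytes(raw, policy=policy.default)
    return [(part.get_filename(), part.get_content())
            for part in msg.iter_attachments()
            if part.get_content_maintype() == "audio"]


# Round trip with placeholder data standing in for a real recording.
raw = build_voice_mail("user@example.com", "friend@example.com", b"RIFF-demo").as_bytes()
print(extract_audio_parts(raw))
```

Filtering on the MIME main type is what lets a device list only the audio items in a mailbox and skip ordinary text messages, or route those to text-to-speech instead.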
`
A preferred device also contains a standard or touchscreen display and software which permits the user to display a similar graphical editor for composing and reading e-mail messages as is displayed on his computer screen when accessing his e-mail, so that the user can scroll through his e-mail messages, selecting those audio files he wishes to download and selecting text messages he wishes to have converted, either by the network server or at the device, into an audio format (text-to-speech).

A preferred device also contains: a cradle into which the device may be placed, the cradle having ports which enable it to be connected to a power source to recharge the device's batteries; a phone jack to enable it to establish a communication link; and a serial or parallel port on a computer for downloading and uploading files directly to the computer or for receiving "redirected" files.
`
A preferred device also contains a language user interface capable of recognizing and responding to speech with speech. Such an interface includes speaker independent functions but also permits speaker adaptation which allows the personal device to adjust to the peculiarities of the user's voice or pronunciations and thus improve accuracy. This speaker adaptation is achieved through a protocol which allows the system to adapt to the user's voice through the repetition of a set of sentences prior to first use of the device (see Lernout & Hauspie Speech Product's [LHSP] asr1000 product line). The language interface includes a vocabulary builder which permits the user to extend the vocabulary including special terms and proper nouns to the speech recognition application (see LHSP Lextool™), a user template which enables the user to create words which the device will associate with user defined commands, e.g. "home" could be associated with an address (LHSP asr 200 product line), alphabet recognition for spelling an address, as well as background noise tolerance and speech at a distance software which improve the accuracy of the language user interface even in an automobile, airplane or public place and even if the user is not wearing a headset (see LHSP).
`
A preferred device also contains public-key encryption technology designed to ensure reliable and secure transmission of sensitive information by encrypting and decrypting the message data and by authenticating the sender's identity by using a secure digital or voice signature.

A preferred device also contains a text-to-speech utility which permits the user to download data not already converted to speech by a network server and to do so at the device.
`
A preferred device also contains a bar code reader which permits the user to scan a printed bar code associated with printed matter such as a news article, a map, a menu of available audio files or in a travel guide which would give the device all the information it needs including network server address, file location and file ID so that the audio file associated with the printed matter could be automatically retrieved from a network such as the Internet.
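The three items the bar code is said to carry (network server address, file location and file ID) can be pictured as a small record decoded from the scanned string. The text does not define an encoding, so the pipe-delimited layout, the class name and the example values below are purely hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioFileRef:
    server: str     # network server address
    location: str   # file location on that server
    file_id: str    # identifier of the audio file


def parse_barcode_payload(payload: str) -> AudioFileRef:
    """Decode a scanned string of the assumed form 'server|location|file_id'."""
    fields = payload.strip().split("|")
    if len(fields) != 3 or not all(fields):
        raise ValueError(f"expected 'server|location|file_id', got {payload!r}")
    return AudioFileRef(*fields)


ref = parse_barcode_payload("audio.example.com|/guides/paris|0042")
print(ref.server, ref.location, ref.file_id)
```

With such a record in hand, the device would have what it needs to request the file from the named server without the user typing or speaking an address.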
`
A preferred device also contains a bar code reader which permits the user to scan a printed bar code associated with printed matter such as a news article, a map, a menu of available audio files or in a travel guide which would give the device all the information it needs to play a file from a previously retrieved group of audio files (such as described in Goldberg et al.).
`
A preferred device also contains an Infrared interface using a standard such as the Infrared Data Association (IrDA) for high speed local wireless transmission (e.g. 1.2 Mbps and 4 Mbps) of audio files and control codes between the device and a public phone, kiosk or the user's computer.

A preferred device also includes a software utility called an off-line browser which programs the device to automatically retrieve audio files from the network during off-peak hours to which the user has subscribed, or from selected Web sites which have new audio material available, or from e-mail addresses that the user has programmed the off-line browser