United States Patent
Katseff et al.

(10) Patent No.: US 6,301,258 B1
(45) Date of Patent: *Oct. 9, 2001

(54) LOW-LATENCY BUFFERING FOR PACKET TELEPHONY

(75) Inventors: Howard Paul Katseff, Englishtown; Robert Patrick Lyons, Jackson; Bethany Scott Robinson, Lebanon, all of NJ (US)

(73) Assignee: AT&T Corp., New York, NY (US)

(*) Notice: This patent issued on a continued prosecution application filed under 37 CFR 1.53(d), and is subject to the twenty year patent term provisions of 35 U.S.C. 154(a)(2).

Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 08/985,229
(22) Filed: Dec. 4, 1997

(51) Int. Cl.: H04L 12/56; H04J 3/06
(52) U.S. Cl.: 370/412; 370/508; 370/517
(58) Field of Search: 370/465, 468, 370/477, 508, 516-517, 519; 375/371-372

Primary Examiner: Huy D. Vu
Assistant Examiner: Yoan Nguyen
(74) Attorney, Agent, or Firm: Kenyon & Kenyon

(57) ABSTRACT

In a method for reducing latency in packet telephony caused by anti-jitter buffering, audio data elements are received and placed in a telephony input buffer used for anti-jitter buffering. Rather than wait until the buffer is full, the audio data elements are clocked, or played, out of the buffer at a rate slower than the normal play rate. In this way, latency due to the initial buffer fill period is reduced or eliminated. Audio data elements continue to be played out at a slower than normal rate until the buffer fill level reaches a threshold. At that time, the play rate for sending data elements out of the telephony input buffer is adjusted to the normal play rate. In an alternative embodiment of the present invention, the fill level of the telephony input buffer is controlled within a desired range by speeding up or slowing down the rate at which audio data elements are played out of the telephony input buffer. In yet another alternative embodiment, the amount of latency jitter in the packet network is measured and the size of the telephony input buffer is adjusted based upon the relative amount of jitter, such that the relative size of the buffer is reduced when the packet network is quiet, and the size of the buffer is increased when the network is relatively jittery.

50 Claims, 6 Drawing Sheets
`
`GOOGLE 1009
Sheet 1 of 6

FIG. 1
[Functional block diagram of the PC-based packet phone. Legible labels: 100; AUDIO IN 105; AUDIO OUT 160; SPEAKER 170; SOUND A/D, D/A; BUFFER MANAGER 150; TELEPHONE LINE 145.]

Sheet 2 of 6

FIG. 2
[Flow diagram of the startup process.]
201: RECEIVE AUDIO DATA ELEMENT AND STORE IN TELEPHONY INPUT BUFFER
202: NUMBER OF ELEMENTS < T?
203: SET ELEMENTS TO PLAY AT SLOWER THAN NORMAL RATE
204: SET ELEMENTS TO PLAY AT NORMAL RATE
205: PLAY ELEMENTS WITHOUT WAITING FOR BUFFER TO FILL

FIG. 3
[Flow diagram for controlled buffering with two thresholds.]
301: RECEIVE AUDIO DATA ELEMENT AND STORE IN TELEPHONY INPUT BUFFER
303: PLAY ELEMENTS OUT AT SLOWER THAN NORMAL RATE
305: PLAY ELEMENTS OUT AT FASTER THAN NORMAL RATE
306: PLAY ELEMENTS OUT AT NORMAL RATE

Sheet 3 of 6

FIG. 4A
[Flow diagram for controlled buffering with a single threshold.]
401: RECEIVE AUDIO DATA ELEMENT AND STORE IN TELEPHONY INPUT BUFFER
402: NUMBER OF ELEMENTS < T?
403: PLAY ELEMENTS OUT AT SLOWER THAN NORMAL RATE
404: PLAY ELEMENTS OUT AT FASTER THAN NORMAL RATE

FIG. 4B
[Flow diagram for controlled buffering with a single threshold and a normal-rate branch.]
411: RECEIVE AUDIO DATA ELEMENT AND STORE IN TELEPHONY INPUT BUFFER
413: PLAY ELEMENTS OUT AT SLOWER THAN NORMAL RATE
414: PLAY ELEMENTS OUT AT FASTER THAN NORMAL RATE
415: PLAY ELEMENTS OUT AT NORMAL RATE

Sheet 4 of 6

FIG. 5
[Data structure for the jitter array.]

Sheet 5 of 6

FIG. 6A
[Flow diagram for updating the jitter array. Reference numerals: 601, 602, 603.]
RECEIVE AUDIO DATA ELEMENT INTO BUFFER
DETERMINE NUMBER OF DATA ELEMENTS IN BUFFER
ENTER CURRENT TIME ARRAY AT INDEX GIVEN BY CURRENT NUMBER OF ELEMENTS

FIG. 6B
[Flow diagram for measuring jitter. Reference numerals: 611, 612, 613, 615.]
SCAN ARRAY FOR MOST RECENT ENTRY Tr
SCAN EACH OTHER ELEMENT Tn OF ARRAY FOR COMPARISON WITH Tr
INCREASE BUFFER SIZE
REDUCE BUFFER SIZE

Sheet 6 of 6

FIG. 6C
[Flow diagram for measuring jitter, alternative embodiment. Reference numerals: 621, 622, 624, 626.]
SCAN ARRAY FOR MOST RECENT ENTRY Tr
SCAN EACH OTHER ELEMENT Tn OF ARRAY FOR COMPARISON WITH Tr
REDUCE BUFFER SIZE
INCREASE BUFFER SIZE
MAINTAIN CURRENT BUFFER SIZE

LOW-LATENCY BUFFERING FOR PACKET TELEPHONY

The present application is related to U.S. application entitled “Low-Latency Audio Interface for Packet Telephony,” which is filed on even date herewith. These two applications are co-pending and commonly assigned.

TECHNICAL FIELD

This invention relates to packet telephony in general and, more particularly, provides a way of reducing latency in packet telephony communications.

BACKGROUND OF THE INVENTION

Packet telephony involves the use of a packet network, such as the Internet or an “intranet” (modeled in functionality based upon the Internet and used by a company locally or internally) for telecommunicating voice, pictures, moving images and multimedia (e.g., voice and pictures) content. Instead of a pair of telephones connected by switched telephone lines, however, packet telephony typically involves the use of a “packet phone” or “Internet phone” at one or both ends of the telephony link, with the information transferred over a packet network using packet switching techniques. A “packet phone” or “Internet phone” typically includes a personal computer (PC) running application software for implementing packetized transmission of audio signals over a packet network (such as the Internet); in addition, the PC-based configuration of a packet or Internet phone typically includes additional hardware devices, such as a microphone, speakers and a sound card, which are plugged or incorporated into the PC.

The amount of time it takes for a communication to travel through a communications network is referred to as latency. The amount of latency can impact the quality of the communication; the higher the latency, the lesser the quality of the communication. Latency of about 150 milliseconds (ms) or more produces a noticeable effect upon conversations that, for some people, can render a conversation next to impossible. The Plain Old Telephone Service (POTS) network controls latency to an acceptable degree, which is one of the ways in which the POTS network is deemed a reliable and quality communications service.

However, latency is a significant problem in packet telephony. Latency problems may be caused by factors such as traffic congestion or bottlenecks in the packet network, which can delay delivery of packets to the destination. Another problem is caused by packet network “jitter.” “Jitter” is the variance in latency from packet-to-packet or between groups of packets, such that packets (or packet groups) are not received at the destination at regular intervals. In packet telephony, packets are clocked into the packet network from the sending station at a regular rate; thus, network characteristics are responsible for deviation from regularity in the rate of receiving data packets at the receiving station.

Packet telephony programs use an input buffer at the receiving station to compensate for network jitter. Anti-jitter buffering is used to allow data to be clocked out of the buffer and into the telephony section at a regular rate. Each time voice input from the network starts at the receiving station, the packet telephony program directs the incoming data into the telephony input buffer, and does not start clocking the data out of the buffer (clocking audio data out of a memory is often called “playing” the data) and along to the speaker output until the telephony input buffer is full. For example, such input buffers may run several packets deep with an equivalent length (in terms of time) ranging from ½ to a full second of audio data. Thus, each time a speaker starts talking, perceptible latency is introduced as a result of anti-jitter buffering, making interactive conversations difficult or unnatural.

What is desired is a way of reducing the latency in packet telephony communications caused by buffering used to reduce or eliminate network jitter.

SUMMARY OF THE INVENTION

The present invention is directed to a method for reducing latency in packet telephony caused by anti-jitter buffering. When a second, or remote, user begins to speak, the telephony input buffer, which is used for anti-jitter buffering, is initially empty. As audio data are received, the data are placed in the telephony input buffer. However, rather than wait until the buffer is full, the audio data are clocked, or played, out of the buffer as soon as the first data element arrives and at a rate slower than the normal play rate. In this way, latency due to the initial buffer fill period is reduced or eliminated. Audio data continue to be played out at a slower than normal rate until the buffer fill level reaches a threshold. At that time, the play rate for sending data out of the telephony input buffer is adjusted to the normal play rate. This technique for starting playback at a slower rate before the buffer is filled may be employed whenever the buffer empties, e.g., either as the result of the startup of a conversation, or of network delays or of loss of a burst of packets.

In an alternative embodiment of the present invention, the fill level of the telephony input buffer is controlled within a desired range by speeding up or slowing down the rate at which audio data are played out of the telephony input buffer. In yet another alternative embodiment, the amount of latency jitter in the packet network is measured and the size of the telephony input buffer is adjusted based upon the relative amount of jitter, such that the relative size of the buffer is reduced when the packet network is quiet, and the size of the buffer is increased when the network is relatively jittery.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a functional diagram for a PC-based packet phone utilizing the buffering management of the present invention.
FIG. 2 shows a flow diagram of the startup process of the present invention.
FIG. 3 shows a flow diagram for controlled buffering in an alternative embodiment of the present invention.
FIG. 4A shows a flow diagram for controlled buffering in another alternative embodiment of the present invention.
FIG. 4B shows a flow diagram for controlled buffering in another alternative embodiment of the present invention.
FIG. 5 shows the data structure for a jitter array used in accordance with the present invention.
FIG. 6A shows a flow diagram for updating a jitter array in accordance with the present invention.
FIG. 6B shows a flow diagram for measuring jitter in accordance with the present invention.
FIG. 6C shows a flow diagram for measuring jitter in accordance with an alternative embodiment of the present invention.

DETAILED DESCRIPTION

The present invention is directed to a method for reducing latency in packet telephony caused by anti-jitter buffering. In
accordance with the present invention, data are played out of the receiver’s input buffer at variable rates, such that latency is reduced, while controlling the size of the buffer to reduce the effects of network jitter.

FIG. 1 shows a functional diagram for a PC-based packet phone utilizing the buffering management of the present invention; the functionality shown in FIG. 1 is based upon the hardware/software functionality typically found in a PC-based packet phone. When a first user begins to speak into microphone 100 (which serves as the audio input device for the packet phone), an analog audio signal from microphone 100 is received into the PC-based packet phone via audio input port 105. Audio input port 105 is connected to sound card 110. The analog audio signal is delivered to sound card 110, where it is then digitized using an analog-to-digital (A/D) converter 112. Sound card 110 may be any one of a number of standard PC sound cards, such as the SoundBlaster™ 16 from Creative Labs. Sound card 110 also typically contains a pair of data buffers 114 and 116. Data buffer 114 buffers the audio data received from audio input port 105 and digitized by A/D converter 112 before being sent to CODEC 120. Typically, this data buffering is performed in accordance with an established protocol, such as that provided by a standard Microsoft audio driver supplied with the Microsoft Windows™ operating system.

CODEC 120 compresses the audio data for efficient transmission over the packet network. CODEC 120 may, typically, be either a hardware or software component that is well-known in communications and telephony applications to those skilled in the art.

The packet telephony product is a telephony program 125 having a telephony application 127 and data buffers 128 and 129. Telephony application 127 implements the functionality needed to prepare the data for transmission over a packet network. For example, telephony application 127 places the data into a form compatible with a data communications protocol used for transmitting data over a packet network. Telephony output buffer 128 buffers the data output by telephony application 127. Telephony output buffer 128 is kept as short as possible and is used to buffer data going out to the packet network in the event the network becomes temporarily busy at a particular instant, so that outgoing data are not lost.

The audio data from telephony output buffer 128 is then processed by network layer 130. Telephony application 127 requests that network layer 130 play data out of telephony buffer 128 as soon as placed in the buffer. Network layer 130 is a software communications application which adds one or more layers of data protocol, such as the well-known Transmission Control Protocol and Internet Protocol (TCP/IP), or the known User Datagram Protocol and Internet Protocol (UDP/IP), and/or the well-established point-to-point protocol (PPP) used for communicating over a packet network. TCP/IP is typically used for control and setup, while UDP/IP is often used for transmitting audio data because UDP/IP does not cause lost packets of audio data to be retransmitted. UDP/IP may be preferred for transmitting audio data because, for packet telephony, retransmitting lost audio data will degrade a conversation. PPP is typically employed when a modem is used to permit the PC to connect to a packet network, such as the Internet, using a standard dial-up telephone line. Network layer 130 is typically included as the network stack in the Microsoft Windows™ operating system. Packets are then sent to input/output (I/O) port 135, which is a standard serial port used for establishing a serial data connection between a PC and a peripheral device, such as a modem.

From I/O port 135, the data proceeds to modem 140, which converts the data to tones suitable for transmission over a standard POTS telephone line 145 to a connecting service used to connect to a packet network, such as the Internet.

It should be noted that connecting a PC to a packet network, such as the Internet, may be accomplished by any number of known techniques, such as through the use of a modem over a telephone line described above. Access to a packet network, such as the Internet, may also be accomplished through, e.g., use of an ISDN line, a cable television line, or a local area network using techniques known to those skilled in the art.

Once the data is sent to the packet network, the packet network transmits the data which is ultimately directed to a second user having a receiving terminal (e.g., another PC-based packet phone) at the other end of a TCP/IP-compatible connection established between the two users.
The second user may transmit audio or speech data back to the first user. The process of receiving external audio data from the second user over the packet network into the first user’s PC-based packet phone is, in many respects, a reversal of the steps described above in connection with sending audio data from the first user’s packet phone to the second user over the packet network. The external audio data packets are received from a packet network connecting service (i.e., Internet service provider) over POTS telephone line 145 into modem 140, which converts the data from tones into digital data packets. From modem 140 the data proceeds to I/O port 135 and then to network layer 130, which removes one or more protocol layers (such as TCP/IP, UDP/IP and/or PPP).
After network layer 130, the data is sent to telephony application 125 which directs the data into telephony input buffer 129. As mentioned above, telephony input buffer 129 is typically several data elements deep and, in an attempt to compensate for jitter, telephony program 125 delays playing the data out to speaker 170 from buffer 129, through telephony application 127, until telephony input buffer 129 is full or has reached a given threshold (typically, such a threshold would be one-half full); this introduces latency whenever the buffer empties, e.g., when the second user begins to speak.
In accordance with the present invention, buffer manager 150 operates to control telephony application 127 and telephony input buffer 129, such that data is played out of the telephony input buffer before the buffer fills up. Buffer manager 150 clocks the audio data out at a rate less than the normal rate (i.e., at less than the real-time rate), which allows telephony input buffer 129 to fill, thus utilizing the effectiveness of the buffer in reducing jitter. In this way, latency normally introduced by virtue of the delay in the start of playing data from telephony input buffer 129 is eliminated while, at the same time, taking advantage of the benefits afforded by buffering in the reduction of network jitter. This technique for starting play of data from the buffer at a slower rate before the buffer is filled may be employed whenever the buffer empties, e.g., either as the result of the startup of a conversation, or of network delays or of loss of a burst of packets.
Once the data is played from telephony input buffer 129, it proceeds through the rest of the telephony and audio processing. Upon leaving input buffer 129, the data is processed by telephony application 127 and is sent to CODEC 120. CODEC 120 decompresses the audio data that was compressed (by the transmitting packet phone) for transmission over the packet network. From CODEC 120 the audio data is then sent to data buffer 116 of sound card 110. Data buffer 116 buffers the audio data, which is then converted into analog form by D/A converter 112 and sent in analog form through audio output port 160 to speaker 170.
For purposes of the various aspects and embodiments of the invention described below, it is assumed that the PC-based packet telephone operates as discussed above with reference to the functional diagram of FIG. 1.
1. Startup.
The operation of the present invention at startup of a conversation by the second user will now be described in further detail with reference to FIG. 2, which shows a flow diagram of the startup process of the present invention. Those skilled in the art will recognize that the startup process may be employed whenever the buffer empties (and not just when a conversation begins). When the second user (who is remote from the first user) initially begins to speak, telephony input buffer 129 is typically empty. Buffer 129 will also typically be empty after a period of silence from the remote terminal as the result of silence suppression (in which the transmission of audio data packets from the remote terminal of the second user is temporarily halted during the period of silence by the second user), or as the result of network delays or loss of a burst of packets. As the first audio data begin to arrive after the second user begins to speak, they are placed in telephony input buffer 129 as shown in block 201 of FIG. 2. Rather than waiting for the buffer to fill (either fully or to some predetermined level) with the data, as soon as the first data element(s) are received they are played out of the buffer. Initially, the number of data elements in telephony input buffer 129 is small, so that the data cannot be played out at the normal rate; otherwise, the buffer would not fill.

The initiation of playing out data from the buffer may begin as soon as the first data element is received. Alternatively, the playing of data out of the buffer may await the receipt of a small number of data elements, such number being less than one-half full (e.g., 2 elements), before the data are played out of the buffer.
Thus, in accordance with the present invention, the number of data elements in buffer 129 is compared against a threshold value, T, as shown in block 202. The threshold value represents a number of data elements or, alternatively, the threshold T could be provided as a unit of time (as the data elements are expected to be received at regular intervals, there is a direct correlation between expected duration and the number of elements of audio data) or the desired relative fill percentage for the buffer (i.e., the ratio of the number of data elements in the buffer to the total size or capacity of the full buffer). Illustratively, the threshold value may be set to represent approximately 50% of the size of the buffer (i.e., the desired fill percentage would be 50%).
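
The three ways of expressing the threshold T are interchangeable once the buffer capacity and the duration of one data element are known. The following is a minimal sketch of that conversion; the 20-element capacity and 30 ms element duration are hypothetical values chosen for illustration and do not come from the patent, and the function names are likewise assumptions.

```python
def threshold_in_elements(capacity: int, fill_fraction: float) -> int:
    """Convert a desired fill fraction (0.0 to 1.0) into a count of data elements, rounded down."""
    if not 0.0 <= fill_fraction <= 1.0:
        raise ValueError("fill_fraction must be between 0 and 1")
    return int(capacity * fill_fraction)


def threshold_in_ms(elements: int, element_ms: int) -> int:
    """Convert a threshold in data elements into the playing time it represents."""
    return elements * element_ms


CAPACITY = 20    # hypothetical buffer capacity, in data elements
ELEMENT_MS = 30  # hypothetical audio duration carried by one element, in ms

t_elements = threshold_in_elements(CAPACITY, 0.50)  # the illustrative 50% fill level
t_ms = threshold_in_ms(t_elements, ELEMENT_MS)
print(t_elements, t_ms)  # prints: 10 300
```

Under these assumed figures, a 50% threshold corresponds to 10 elements, or 300 ms of audio.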
If the number of data elements in buffer 129 is less than the threshold value T, the elements are set to play out at a slower than normal rate as shown in block 203, and the elements are played as shown in block 205 without waiting for the buffer to fill. Illustratively, elements may be played out at a rate equal to approximately 90% of the normal or expected rate at which audio data elements arrive. In this way, the audio data will be processed and speech heard at speaker 170 at a time before buffer 129 would have filled without playing any data. At the same time, because the data are being played out at a rate slower than the rate at which they arrive in telephony input buffer 129, buffer 129 slowly begins to fill with data elements.
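
A rough, jitter-free estimate shows how slowly the buffer fills under this scheme. If elements arrive at a steady rate and are played at a fraction of that rate, the buffer grows by the difference between the two. The sketch below assumes a hypothetical 30 ms element duration and a threshold of 10 elements; only the 90% play rate is taken from the text.

```python
def seconds_to_reach_threshold(threshold: int, arrivals_per_s: float, play_fraction: float) -> float:
    """Time for an empty buffer to grow to `threshold` elements while
    playing at `play_fraction` of the (steady) arrival rate."""
    if not 0.0 <= play_fraction < 1.0:
        raise ValueError("play_fraction must be below 1.0, or the buffer never fills")
    net_fill_per_s = arrivals_per_s * (1.0 - play_fraction)
    return threshold / net_fill_per_s


ELEMENT_MS = 30   # hypothetical duration of one data element, in ms
THRESHOLD = 10    # hypothetical threshold T, in data elements

arrivals_per_s = 1000 / ELEMENT_MS
fill_time = seconds_to_reach_threshold(THRESHOLD, arrivals_per_s, 0.90)

# Delay a conventional scheme would add by waiting for T elements before playing.
avoided_delay_ms = THRESHOLD * ELEMENT_MS
```

With these assumed numbers the buffer takes about 3 seconds of slightly slowed speech to reach the threshold, in exchange for avoiding roughly 300 ms of silence at the start of each talk spurt.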
The slower rate at which the data is played out should be a rate at which the speech nonetheless is intelligible. A 90% play rate is barely noticeable and not objectionable; other rates may, similarly, prove acceptable.
At some point during the continuing receipt of audio data, the number of data elements in the buffer will be equal to or exceed the threshold value, T, where T may represent an optimal or desired number of elements to be contained in telephony input buffer 129. When that is the case, comparison of the number of elements in the buffer with the threshold value T at block 202 will produce a response that the number of data elements is not less than the threshold T and, as shown in block 204, the rate of playing the audio data out of buffer 129 is then set to the normal rate. In this way, it is expected that the number of data elements arriving at buffer 129 will be roughly equal to the number of elements played out of buffer 129 and, thus, the relative buffer “fill” percentage would be expected to remain approximately the same over time. Of course, given the nature of the network jitter problem, there will be short-term fluctuations in the telephony input buffer fill percentage.
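
The decision made at blocks 202 through 204 of FIG. 2 can be sketched as a single comparison. This is an illustrative reading of the flow described above, not the patent's own code; the function name and the representation of play rates as multiples of the normal rate are assumptions.

```python
NORMAL_RATE = 1.0  # play rate expressed as a multiple of the real-time rate
SLOW_RATE = 0.9    # the illustrative 90% rate mentioned in the text


def startup_play_rate(num_elements: int, threshold: int) -> float:
    """Block 202: compare the buffer fill against T.
    Block 203: below T, play slower than normal so the buffer can fill.
    Block 204: at or above T, play at the normal rate."""
    if num_elements < threshold:
        return SLOW_RATE
    return NORMAL_RATE
```

In either branch the elements are played immediately (block 205); only the rate differs.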
2. Controlled Buffering.
Turning now to FIG. 3, another embodiment of the present invention will now be described in which the number of elements in telephony buffer 129 is controlled over time to be constrained within a desired range (this process is not limited to the startup of a conversation). This provides the benefit of not permitting the buffer to overfill or empty out, such that the conversation will appear more natural. As before, in block 301 audio data arriving from the remote terminal of the second user are placed in telephony input buffer 129. As shown in block 302, the number of data elements in buffer 129 is compared against a first threshold value, T1. The threshold value T1 represents a number of data elements or, alternatively, could be provided as a unit of time (as the data elements are expected to be received at regular intervals, there is a direct correlation between expected duration and the number of elements of audio data). If the number of elements in buffer 129 is less than the first threshold value T1, the elements are played out at a slower than normal rate as shown in block 303. As discussed above, the play rate should be one that provides acceptable audio speech quality at the speaker output. Because the rate is lower than the normal rate, it is expected that the buffer would slowly begin to fill to a greater percentage.
If the number of elements is not less than the first threshold, T1, the number is then compared at block 304 against a second threshold value, T2, which is higher than the first threshold value, T1. If the number of elements is greater than the second threshold value, T2, the rate at which audio data elements are played out of the buffer is set at block 305 to a rate faster than the normal rate. When the play rate is faster than normal, it is then expected that the buffer would slowly begin to empty.
If comparison at block 304 of the number of elements filling input buffer 129 with the second threshold, T2, reveals that the number of elements is less than (or equal to) T2, then, as shown at block 306, the rate of playing the audio data out of buffer 129 is set to the normal play rate. In this way, the number of data elements filling telephony input buffer 129 is controlled such that the number of elements is steered to the range between T1 and T2. As described above, the threshold values T1 and T2 may be expressed as a number of elements, a unit of time, or the relative fill percentage for the buffer. Illustratively, the threshold values T1 and T2 may be set to represent approximately 25% and 75% of the size of the buffer, respectively.
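
The two-threshold control of FIG. 3 amounts to a three-way rate selection. The sketch below is one possible reading of blocks 302 through 306; the function name is an assumption, and the 1.1 faster-than-normal rate is a hypothetical value, since the text gives no figure for it.

```python
NORMAL_RATE = 1.0
SLOW_RATE = 0.9   # illustrative 90% rate from the startup discussion
FAST_RATE = 1.1   # hypothetical; the text does not specify a faster-than-normal rate


def controlled_play_rate(num_elements: int, t1: int, t2: int) -> float:
    """Steer the buffer fill toward the range between T1 and T2."""
    if t1 >= t2:
        raise ValueError("T2 must be higher than T1")
    if num_elements < t1:   # block 302 -> block 303: let the buffer fill
        return SLOW_RATE
    if num_elements > t2:   # block 304 -> block 305: let the buffer drain
        return FAST_RATE
    return NORMAL_RATE      # block 306: fill is within [T1, T2]
```

For an assumed 20-element buffer, the illustrative 25% and 75% levels give T1 = 5 and T2 = 15, so `controlled_play_rate(3, 5, 15)` selects the slow rate and `controlled_play_rate(18, 5, 15)` selects the fast rate.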
`In an alternative embodiment of the present invention,
`controlled buffering may be performed in essentially a
`
`11
`
`

`

`US 6,301,258 B1
`
`7
`continuous mannerby collapsing the two thresholds, T1 and
`T2, into a single threshold value T, as shown in FIG. 4A. At
`block 401 audio data elements arriving from the remote
`terminal of the second user are placed in telephony input
`buffer 129. As shown in block 402, the number of elements
`in buffer 129 is compared against a threshold value, T. If the
`number of elements is less than the threshold value, T, the
`audio data are played out at a slower than normal rate as
`shown in block 403. If the number of elements is greater
`than or equal to the threshold value, the elements are played
`out at a faster than normal rate as shown in block 404.
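To see why a single threshold gives essentially continuous control, the FIG. 4A rule can be run against a simple arrival model. The sketch below is a toy simulation, not something taken from the patent: the function names, the rate multipliers 0.9 and 1.1, and the assumption that exactly one element arrives per tick with no jitter are all hypothetical.

```python
# Hypothetical rate multipliers; a value of 1.0 would be the normal rate.
SLOW, FAST = 0.9, 1.1


def rate_single_threshold(num_elements, t):
    """FIG. 4A: slower below T (block 403), faster at or above T (block 404)."""
    return SLOW if num_elements < t else FAST


def simulate(t, ticks, start=0.0):
    """Track buffer occupancy, assuming one element arrives per tick."""
    occupancy = start
    history = []
    for _ in range(ticks):
        rate = rate_single_threshold(occupancy, t)
        # One element arrives; `rate` elements are played out this tick.
        occupancy = max(0.0, occupancy + 1.0 - rate)
        history.append(occupancy)
    return history


history = simulate(t=20, ticks=400)
# After the initial fill, occupancy stays close to the threshold.
print(min(history[250:]), max(history[250:]))
```

Because the rule never selects the normal rate, the play rate changes on every crossing of T, and under this model the occupancy hovers in a narrow band around the threshold rather than settling inside a range.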
`An alternative to this embodiment is shown in FIG. 4B. At
`block 411 audio data arriving from the remote terminal of
`the second user are placed in telephony input buffer 129. As
`shown in block 412, the number of data elements in buffer
`129 is compared against a threshold value, T. If the number
`of elements is less than the threshold value, T, the elements
`are played out at a slower than normal rate as shown in block
`413. If the number of data elements is greater than the
`threshold value, the elements are played out at a faster than
`normal rate as shown in block 414. If the number of
`elements is equal to the threshold, T, the rate is set to the
`normal rate at block 415.
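The two single-threshold variants differ only in how they treat an occupancy exactly equal to T. A minimal side-by-side sketch follows, with hypothetical function names and assumed rate multipliers (the source gives no numeric rates):

```python
# Hypothetical rate multipliers relative to the normal play rate.
SLOW, NORMAL, FAST = 0.9, 1.0, 1.1


def rate_fig_4a(num_elements, t):
    """FIG. 4A: two outcomes; occupancy at or above T plays faster."""
    return SLOW if num_elements < t else FAST


def rate_fig_4b(num_elements, t):
    """FIG. 4B: three outcomes; occupancy exactly at T plays at normal rate."""
    if num_elements < t:
        return SLOW      # block 413
    if num_elements > t:
        return FAST      # block 414
    return NORMAL        # block 415


for count in (19, 20, 21):
    print(count, rate_fig_4a(count, 20), rate_fig_4b(count, 20))
```

With occupancy measured as a whole number of elements, the FIG. 4B rule lets playout rest at the normal rate whenever the buffer sits exactly at T, whereas the FIG. 4A rule always plays either slower or faster.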
`
`3. Dynamic Buffer Sizing.
`In addition to controlling the rate at which audio data
`elements are played out of the telephony input buffer, the
`size of the buffer itself may be controlled. The longer the
`buffer, the greater the potential latency in the audio that is
`eventually heard at the speaker output. The length of the
`buffer required to be effective in reducing jitter is dependent
`upon the relative amount of jitter presented by the packet
`network. If the network
