US007286652Bl
(12) United States Patent
Azriel et al.

(10) Patent No.: US 7,286,652 B1
(45) Date of Patent: Oct. 23, 2007

(54) FOUR CHANNEL AUDIO RECORDING IN A PACKET BASED NETWORK

(75) Inventors: Gad Azriel, Holon (IL); Yackov Sfadya, Kfar Saba (IL)

(73) Assignee: 3Com Corporation, Marlborough, MA (US)

( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1606 days.

(21) Appl. No.: 09/584,581

(22) Filed: May 31, 2000

(51) Int. Cl.
H04M 1/64 (2006.01)
H04L 12/66 (2006.01)
G06F 15/173 (2006.01)

(52) U.S. Cl. .......... 379/88.22; 370/352; 709/224
(58) Field of Classification Search .......... None
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

5,710,591 A *       1/1998   Bruno et al. ........ 348/14.09
6,100,882 A *       8/2000   Sharman et al. ...... 704/235
6,122,665 A *       9/2000   Bar et al. .......... 709/224
6,487,196 B1 *     11/2002   Verthein et al. ..... 370/352
6,614,781 B1 *      9/2003   Elliott et al. ...... 370/352
6,850,609 B1 *      2/2005   Schrage ............. 379/202.01
2001/0043571 A1 *  11/2001   Jang et al. ......... 370/260

OTHER PUBLICATIONS

International Telecommunication Union, H.225.0, Annex A, RTP/RTCP, Feb. 1998, pp. 73-106.
International Telecommunication Union, H.323, Draft V4, Aug. 1999, Chapters 6 & 7, pp. 13-52.
Pocket Telephony Primer, 3COM Corporation, Mar. 1998.

* Cited by examiner

Primary Examiner: Fan Tsang
Assistant Examiner: Joseph T Phan

(57) ABSTRACT
`
`An apparatus for and a method of audio recording in packet
`based telephony systems. Using the present invention, the
`equivalent of four audio channels are recorded utilizing only
`two recording channels. Each channel recorded comprises
`the stream of packets generated and transmitted by each
`endpoint to the other side. The RTP packets include the
`samples generated by the particular endpoint in addition to
`the timestamp of the samples received from the other side
actually played by the endpoint. The recording device has
knowledge of what was played at the other endpoint in order
to play back accurately the audio samples generated by and
received from the other endpoint. The recording device
`receives a packet stream containing the audio generated on
`each endpoint and the timestamp of the packet from the
`other side that was played on the endpoint. The recording
`device can reconstruct from this data the audio signal that
`was actually played on each endpoint.
`
`28 Claims, 11 Drawing Sheets
`
[Front-page drawing. Labels: endpoint A, IP packet network, 2 channel IP recorder]
U.S. Patent, Oct. 23, 2007, Sheet 1 of 11, US 7,286,652 B1

[FIG. 1 (PRIOR ART)]
[Sheet 2 of 11: drawing; labels not recoverable from the scan]
[Sheet 3 of 11: drawing; labels not recoverable from the scan]
[Sheet 4 of 11: FIG. 4 (PRIOR ART). Labels: 4 channel IP recording device]
[Sheet 5 of 11: drawing; labels not recoverable from the scan]
[Sheet 6 of 11: FIG. 6 and FIG. 7]
FIG. 6 labels: endpoint A, IP packet network, 2 channel IP recorder 126.
FIG. 7 labels: endpoint A, IP packet network, TA(n), 1 channel IP recording device A, 1 channel IP recording device B.
[Sheet 7 of 11: FIG. 8A, flowchart, reference numerals 200 to 212]
RECORDING METHOD: ENDPOINT
INITIALIZATION
ADDITIONAL SAMPLES TO BE PLAYED IN THE CURRENT RECEIVED RTP PACKET? (YES/NO)
GET A SAMPLE FROM THE CURRENT RECEIVED RTP PACKET POINTED TO BY RX_OFFSET
INCREMENT RX_OFFSET BY ONE
RX_TIMESTAMP_COUNTER = RX_PACKET_TIMESTAMP + RX_OFFSET * (ENDPOINT B TIMESTAMP CLOCK RATE / ENDPOINT B SAMPLING CLOCK RATE)
PLAY THE SAMPLE
TX_OFFSET = 0? (YES/NO)
[Sheet 8 of 11: FIG. 8B, flowchart, reference numerals 214 to 232]
UPDATE THE ENDPOINT A TIMESTAMP COUNTER AND PUT THE TX_SEQUENCE AND ENDPOINT A TIMESTAMP IN THE RTP HEADER
PUT THE (RX_TIMESTAMP_COUNTER)/(RX_SEQUENCE AND RX_OFFSET) IN THE RTP HEADER EXTENSION
RECORD A SAMPLE
PLACE THE SAMPLE IN THE RTP PACKET FOR TRANSMISSION AT OFFSET TX_OFFSET
INCREMENT TX_OFFSET BY ONE
RTP TRANSMISSION PACKET FULL? (YES/NO)
SEND THE RTP PACKET TO THE ENDPOINT
SEND COPY OF THE RTP PACKET TO THE RECORDING DEVICE
ALLOCATE EMPTY BUFFER FOR THE NEXT RTP PACKET; SET TX_OFFSET TO ZERO; INCREMENT TX_SEQUENCE BY ONE
SYNCHRONIZATION FLAG SET? (YES/NO)
RESET SYNCHRONIZATION FLAG
[Sheet 9 of 11: drawing; labels not recoverable from the scan]
[Sheet 10 of 11: FIG. 9, flowchart, reference numerals 250 to 254]
RECORDING METHOD: RECORDING DEVICE
RECEIVE TRANSMIT RTP PACKET FROM ENDPOINT
BUFFER RECEIVE PACKETS
STORE PACKETS IN SEQUENCE ORDER IN MEMORY
END
[Sheet 11 of 11: FIG. 10, flowchart, reference numerals 260 to 274]
PLAYBACK METHOD: RECORDING DEVICE
TIMESTAMP INDICATION USED? (YES/NO)
EXTRACT THE SEQUENCE NUMBER AND THE OFFSET FROM THE RTP HEADER EXTENSION AND SAVE THEM IN RX_B_SEQUENCE AND RX_B_OFFSET, RESPECTIVELY
EXTRACT TIMESTAMP FROM THE RTP HEADER EXTENSION AND SAVE IT IN RX_PACKET_B_TIMESTAMP
SET NUMBER_OF_SAMPLES TO THE NUMBER OF SAMPLES IN THE RTP PACKET PAYLOAD
SILENCE INDICATION DETECTED? (YES/NO)
APPEND A VECTOR OF ZEROS TO THE PA(n) VECTOR
GET A VECTOR OF SAMPLES TO REPLAY
PLAY RECONSTRUCTED AUDIO
END

`FOUR CHANNEL AUDIO RECORDING IN A
`PACKET BASED NETWORK
`
`FIELD OF THE INVENTION
`
`The present invention relates generally to voice over IP
`networks and more particularly relates to four-channel audio
`recording for use in a packet-based network.
`
`BACKGROUND OF THE INVENTION
`
Separate Voice and Data Networks

Currently, there is a growing trend to converge voice and
data networks so that both utilize the same network
infrastructure. The currently available systems that combine
voice and data have limited applications and scope. An
example is Automatic Call Distribution (ACD), which permits
service agents in call centers to access customer files in
conjunction with incoming telephone calls. ACD centers,
however, remain costly and difficult to deploy, requiring
custom systems integration in most cases. Another example
is the voice logging/auditing system used by emergency call
centers (e.g., 911) and financial institutions. Deployment has
been limited due to the limited scalability of the system since
voice is on one network and data is on another, both tied
together by awkward database linkages.

The aim of IP telephony is to provision voice over IP
based networks in both the local area network (LAN) and the
wide area network (WAN). Currently, voice and data
generally flow over separate networks; the goal is to transmit
them both over a single medium and on a single network.

A block diagram illustrating example separate prior art
data and voice networks is shown in FIG. 1. The LAN
portion, generally referenced 10, comprises the LAN cabling
infrastructure, routers, switches and gateways 12 and one or
more network devices connected to the LAN. Examples of
typical network devices include servers 14, workstations 16
and printers. The voice portion, generally referenced 20, has
at its core a private branch exchange (PBX) 24 which
comprises one or more trunk line interfaces and one or more
telephone and/or facsimile extension interfaces. The PBX is
connected to the public switched telephone network (PSTN)
22 via one or more trunk lines 28, e.g., analog, T1, E1, T3,
ISDN, etc. A plurality of user telephones 26 and one or more
facsimile machines 27 are also connected directly to the
PBX via phone line extensions 29.

The paradigm currently in widespread use consists of a
circuit switched fabric 20 for voice networks and a
completely separate LAN infrastructure 10 for data. Most
enterprises today use proprietary PBX equipment for voice
traffic.

Voice and Data Over a Shared Network

An increasingly common IP telephony paradigm consists
of telephone and data tightly coupled on IP packet-based,
switched, multimedia networks where voice and data share
a common transport mechanism. It is expected that this
paradigm will spur the development of a wealth of new
applications that take advantage of the simultaneous
delivery of voice and data over a single unified fabric.

A block diagram illustrating a voice over IP network
where voice and data share a common infrastructure is
shown in FIG. 2. The IP telephony system, generally
referenced 30, comprises a LAN infrastructure represented by
an Ethernet switch 32, a router, one or more telephones 36,
workstations 34, a gateway 42, a gatekeeper 46, a PBX 33
with a LAN interface port and a Layer 3 switch 38. The key
components of an IP telephony system 30 are the modified
desktop, gatekeeper and gateway entities. For the desktop,
users may have an Ethernet phone 36 that plugs into an
Ethernet RJ-45 jack or a handset or headset 35 that plugs
into a PC 37.

Today, all LAN based telephony systems need to connect
to the PSTN 44. The gateway is the entity that is specifically
designed to convert voice from the IP domain to the PSTN
domain. The gatekeeper is primarily the IP telephony
equivalent of the PBX in the PSTN world.

Typically, the IP telephony traffic is supported by a
packet-based infrastructure such as an Ethernet network but
a circuit-based infrastructure can be used as well with some
provisions (e.g., ATM LAN emulation on ATM networks).
Telephony calls traversing the intranet may pass through a
Layer 3 switch 38 or a router (not shown) connecting a
corporate intranet 40. The Layer 3 switch and the router
should support Quality of Service (QoS) features such as
IEEE 802.1p and 802.1Q and Resource Reservation Protocol
(RSVP).

ITU-T Recommendation H.323

The Telecommunication Standardization Sector of the
International Telecommunication Union (ITU-T) has issued a
number of standards related to telecommunications. The
Series H standards deal with audiovisual and multimedia
systems and describe standards for systems and terminal
equipment for audiovisual services. The H.323 standard is
an umbrella standard that covers various audio and video
encoding standards. Related standards include H.225.0 that
covers media stream packetization and call signaling
protocols and H.245 that covers audio and video capability
exchange, management of logical channels and transport of
control and indication signals. Details describing these
standards can be found in ITU-T Recommendation H.323
(Draft 4, Aug. 1999), ITU-T Recommendation H.225.0
(February 1998) and ITU-T Recommendation H.245 (Jun. 3, 1999).
`A block diagram illustrating example prior art H.323
`compliant terminal equipment
`is shown in FIG. 3. The
`H.323 terminal 50 comprises a video codec 52, audio codec
`54, system control 56 and H.225.0 layer 64. The system
`control comprises H.245 control 58, call control 60 and
`Registration, Admission and Status (RAS) control 62.
`Attached video equipment 66 includes any type of video
`equipment, such as cameras and monitors including their
`control and selection, and various video processing equip-
`ment. Attached audio equipment 70 includes devices such as
`those providing voice activation sensing, microphones,
loudspeakers, telephone instruments and microphone mixers.
Data applications and associated user interfaces 72 include
those that use the T.120 real time audiographics conferencing
standard or other data services over the data channel.
`The attached system control and user interface 74 provides
`the human user interface for system control. The network
`interface 68 provides the interface to the IP based network.
`The video codec 52 functions to encode video signals
`from the video source (e.g., video camera) for transmission
`over the network and to decode the received video data for
`
`output to a video display. If a terminal incorporates video
`communications, it must be capable of encoding and decod-
`ing video information in accordance with H.261. A terminal
`may also optionally support encoding and decoding video in
`accordance with other recommendations such as H.263.
`
`The audio codec 54 functions to encode audio signals
`from the audio source (e.g., microphone) for transmission
`
`over the network and to decode the received audio data for
`
output to a loudspeaker. All H.323 audio terminals must be
capable of encoding and decoding speech in accordance
with G.711 including both A-law and μ-law encoding. Other
types of audio that may be supported include G.722, G.723,
G.728 and G.729.
`
The data channel supports telematic applications such as
electronic whiteboards, still image transfer, file exchange,
database access, real time audiographics conferencing
(T.120), etc. The system control unit 56 provides services as
defined in the H.245 and H.225.0 standards. For example, the
system control unit provides signaling for proper operation
of the H.323 terminal, call control, capability exchange,
signaling of commands and indications and messaging to
describe the content of logical channels. The H.225.0 Layer
64 is operative to format the transmitted video, audio, data
and control streams into messages for output to the network
interface. It also functions to retrieve the received video,
audio, data and control streams from messages
received from the network interface 68.
`
The gateway functions to convert voice from the IP
domain to the PSTN domain. In particular, it converts IP
packetized voice to a format that can be accepted by the
PSTN. The actual format depends on the type of media and
protocol used for connecting to the PSTN (e.g., T1, E1,
ISDN BRI, ISDN PRI, analog lines, etc.). The gateway
provides the appropriate translation between different video,
audio and data transmission formats and between different
communications procedures and media.
`Note that since the digitization format for voice on the IP
`packet network is often different than on the PSTN, the
`gateway needs to provide this type of conversion that is
`known as transcoding. Note also that gateways also function
`to pass signaling information such as dial tone, busy tone,
`etc. Typical connections supported by the gateway include
`analog, T1, E1, ISDN, frame relay and ATM at OC-3 and
`higher rates. Additional functions performed by the gateway
`include call setup and clearing on both the network side and
the PSTN side. The gateway may be omitted if
communication with the PSTN is not required.
`The gatekeeper functions to provide call control services,
`address translation services, call routing services, call autho-
`rization services, billing, bandwidth management and tele-
`phony supplementary services like call forwarding and call
`transfer to terminal endpoints on the network. It is primarily
`designed to be the IP telephony equivalent of the PBX.
`Logical endpoints register themselves with the gatekeeper
`before attempting to bring up a session. The gatekeeper may
`deny a request to bring up a session or may grant the request
`at a reduced data rate. This is particularly relevant to video
`connections that typically consume huge amounts of band-
`width for a high quality connection.
`Call control signaling is optional as the gatekeeper may
`choose to complete the call signaling with the H.323 end-
`points and process the call signaling or it may direct the
`endpoints to connect the call signaling channel directly to
`each other, thus the gatekeeper avoids handling the H.225.0
`call control signals.
`Through the use of H.225.0 signaling, the gatekeeper may
`reject calls from a terminal due to authorization failure. The
`reasons for rejection may include restricted access to or from
`particular terminals or gateways, or restricted access during
`certain time periods.
Bandwidth management entails controlling the number of
H.323 terminals that are allowed to simultaneously access
the network. Via H.225.0 signaling, the gatekeeper may
reject calls from a terminal due to bandwidth limitations.
This may occur if the gatekeeper determines that there is not
sufficient bandwidth available on the network to support the
call.
`
`The call management function performed by the gate-
`keeper includes maintaining a list of currently active H.323
`calls. This information is used to indicate that a terminal is
`
`busy and to provide information for the bandwidth manage-
`ment function.
`
`The gatekeeper also provides address translation whereby
`an alias address is translated to a Transport Address. This is
`performed using a translation table that is updated using
`Registration messages, for example.
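
The translation table described above can be pictured as a simple lookup keyed by alias. The following is only a sketch under stated assumptions: the class name `TranslationTable`, its methods, and the example IP address and aliases are all hypothetical and are not taken from H.323 or from this patent.

```python
class TranslationTable:
    """Maps alias addresses to transport addresses (IP address, port)."""

    def __init__(self):
        self._by_alias = {}

    def register(self, transport_address, aliases):
        """Handle a registration: every alias now resolves to the address."""
        for alias in aliases:
            self._by_alias[alias] = transport_address

    def translate(self, alias):
        """Return the transport address for an alias, or None if unknown."""
        return self._by_alias.get(alias)


# Hypothetical endpoint registering two aliases for one transport address.
table = TranslationTable()
table.register(("10.0.0.5", 1720), ["alice", "5551234"])
```

Because several aliases may point at the same transport address, the table is keyed by alias rather than by address.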
`
`Real-Time Transport Protocol
`
`The H.225.0 standard dictates the usage of the Real-Time
`Transport Protocol (RTP) which is defined by the IETF in
`RFC 1889 for conveying the data between the call endpoints
`and for monitoring the network congestion. The RTP pro-
`tocol defines the RTP packet structure that includes two
`parts:
`the RTP packet header part and the RTP packet
`payload part. The RTP packet header includes several fields.
`Among those fields, are the payload type identification field,
`the sequence numbering field and the time stamping field.
`Typically, applications encapsulate RTP in a UDP packet.
`UDP/IP is an unreliable transport mechanism and therefore
`there is no guarantee that the RTP packet would reach its
`destination. RTP may, however, be used with other suitable
`underlying network or transport protocols.
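
To make the header fields mentioned above concrete, the sketch below decodes the 12-byte fixed RTP header as laid out in RFC 1889. The names `RtpHeader` and `parse_rtp_header` are illustrative only, and the example packet values are made up.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header (network byte order)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Made-up example: version 2, payload type 0, sequence 7, timestamp 160.
example = struct.pack("!BBHII", 0x80, 0x00, 7, 160, 0x1234ABCD)
header = parse_rtp_header(example)
```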
`RTP does not itself provide any mechanism to ensure
`timely delivery or other QoS guarantees, but relies on lower
`layer services to do so. It also does not guarantee delivery,
`nor does it assume that the underlying network is reliable
`and delivers packets in sequence. RTP includes sequence
`numbers and timestamps in the packet to allow the receiver
`to reconstruct the sender’s packet sequence and timing.
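
As an illustration of how a receiver might use the sequence numbers, the sketch below restores send order for packets that arrived out of order, allowing for wraparound of the 16-bit sequence number. The helper names and the assumption that the first sequence number of the run is known are hypothetical simplifications.

```python
def seq_newer(a: int, b: int) -> bool:
    """True if 16-bit sequence number a comes after b, allowing wraparound."""
    return a != b and ((a - b) & 0xFFFF) < 0x8000


def in_send_order(packets, first_seq):
    """Sort (sequence, payload) pairs by distance from first_seq modulo 2**16."""
    return sorted(packets, key=lambda p: (p[0] - first_seq) & 0xFFFF)


# Four packets that arrived out of order across the 65535 -> 0 wrap.
arrived = [(0, b"c"), (65534, b"a"), (1, b"d"), (65535, b"b")]
ordered = in_send_order(arrived, first_seq=65534)
```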
RTP is intended to be flexible so as to provide the
information required by a particular application. Unlike
conventional protocols in which additional functions may be
accommodated by making the protocol more general or by
adding an option mechanism that requires parsing, RTP can
be tailored through modifications and/or additions to the
headers.
`
`The RTP Control Protocol (RTCP) functions to periodi-
`cally transmit control packets to all participants in a session.
`The primary function of RTCP is to provide feedback on the
`quality of the data distribution that is useful for monitoring
`network congestion. The RTCP protocol
`is designed to
`monitor the quality of service and to convey information
`about the participants in an on-going session. RTCP also
`carries a transport level identifier for an RTP source called
`the canonical name or CNAME. Receivers require the
`CNAME to associate multiple data streams from a given
`participant in a set of related RTP sessions. The RTCP
`protocol can also be used to convey session control infor-
`mation such as participant identification. Each RTCP packet
`begins with a fixed header followed by structured elements
of variable length. Note that the signaling/control
information carried in the RTCP packets is transmitted using
the reliable TCP/IP protocol.
Also under the H.323 protocol umbrella are a number of
standards for voice codecs including, for example, G.711,
G.729, G.729.1 and G.723.1.
`
`Call Signaling
`
Call signaling encompasses the messages and procedures
used to establish a call, request changes in bandwidth of the
call, get status of the endpoints in the call and disconnect the
call. Call signaling uses messages defined in the H.225.0
standard. In particular, the RAS signaling function uses
H.225.0 messages to perform registration, admissions,
bandwidth changes, status and disengage procedures between
endpoints and Gatekeepers. The RAS Signaling Channel is
independent from the Call Signaling Channel and the H.245
Control Channel.
`
Each H.323 entity has at least one network address that
uniquely identifies the H.323 entity on the network. For each
network address, each H.323 entity may have several TSAP
identifiers that enable the multiplexing of several channels
sharing the same network address. Endpoints have one
well-known TSAP identifier known as the Call Signaling
Channel TSAP Identifier. In addition, Gatekeepers have
one well-known TSAP identifier known as the RAS
Channel TSAP Identifier, and one well-known multicast
address known as the Discovery Multicast Address.
Endpoints and H.323 entities use dynamic TSAP Identifiers
for the H.245 Control Channel, Audio Channels, Video
Channels, and Data Channels while the Gatekeeper uses a
dynamic TSAP Identifier for Call Signaling Channels.
Further, an endpoint may have one or more alias addresses
associated with it. An alias address represents the endpoint
and provides an alternate method of addressing the endpoint.
It is important to note that an endpoint may have more than
one alias address that translates to the same TSAP. The alias
may comprise, for example, private telephone numbers,
E.164 numbers, any alphanumeric string that may represent
a name, e-mail address, etc. In addition, the alias may
comprise a MAC address, IP address, ATM address, access
token, DNS address, TSAP as IP address concatenated with
port number or name alias. Note that alias addresses are
unique within a zone and that gatekeepers do not have alias
addresses.
`
When there is a Gatekeeper in the network, the calling
endpoint addresses the called endpoint by its Call Signaling
Channel Transport Address or by its alias address. The
Gatekeeper translates the latter into a Call Signaling
Channel Transport Address.
An endpoint joins a zone via the registration process
whereby it informs the Gatekeeper of its Transport
Addresses and one or more associated alias addresses. Note
that registration must take place before any calls are
attempted. When endpoints are powered up, they look on the
network for the Gatekeeper and once found, they register
their TSAP and one or more aliases therewith.
`
Prior Art Four Channel Audio Recording
`
In LAN Telephony applications, the voice samples
generated are packed within RTP packets that are then
encapsulated within UDP/IP packets. The UDP packets that travel
over an IP network may, however, be delayed, dropped or
arrive out of order from their original transmission sequence
depending on the degree of network congestion. Therefore,
the rate at which the packets arrive at the receive side
is not constant.

In order to combat the delay problems, many devices
implement a jitter buffer on the receive side. If packets are
only delayed on the network, arriving at the receiver before
the jitter buffer underflows, the receive side will hear the
sound as it was originally transmitted by the local endpoint.
If, however, packets are dropped or packets are delayed too
much and the jitter buffer underflows (i.e. becomes empty),
the receiving device either (1) replays the last packet
received or (2) injects a silence. Thus, in the event packets
are dropped or are delayed excessively causing jitter buffer
underflow, the sound that is played on the receive side is not
the original sound that was transmitted.
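
The underflow behavior described above can be sketched as follows. This is a minimal illustration, not any device's actual implementation: the `JitterBuffer` class, its method names, the `replay_last` switch and the use of 0 as the silence sample are all assumptions made for the example.

```python
from collections import deque

SILENCE = 0  # hypothetical sample value used for injected silence


class JitterBuffer:
    def __init__(self, samples_per_packet, replay_last=True):
        self.queue = deque()
        self.samples_per_packet = samples_per_packet
        self.replay_last = replay_last
        self.last = None

    def push(self, payload):
        """Queue a payload that arrived from the network."""
        self.queue.append(payload)

    def pop_for_playout(self):
        """Return (payload, source): 'received', 'replayed' or 'silence'."""
        if self.queue:
            self.last = self.queue.popleft()
            return self.last, "received"
        # Underflow: the buffer is empty when the player needs audio.
        if self.replay_last and self.last is not None:
            return self.last, "replayed"
        return [SILENCE] * self.samples_per_packet, "silence"


replaying = JitterBuffer(samples_per_packet=4)
replaying.push([1, 2, 3, 4])
first = replaying.pop_for_playout()
second = replaying.pop_for_playout()  # buffer is empty, so the last packet repeats

silent = JitterBuffer(samples_per_packet=4, replay_last=False)
third = silent.pop_for_playout()  # nothing received yet, so silence is injected
```

In both underflow cases the played audio differs from what the other side transmitted, which is why the received stream alone is not enough for a recorder.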
Many audio applications including voice require that the
audio (or voice) be recorded at one or both ends of a
conversation. A block diagram illustrating a prior art
packet-based four channel audio recorder is shown in FIG. 4. The
system, generally referenced 80, comprises a packet network
88 to which are connected a plurality of endpoints 82, such
as endpoints A and B. Each endpoint comprises a
loudspeaker (not shown) for generating audio and a microphone
for converting audio, i.e. voice, to an electrical signal. Each
endpoint is operative to receive an Rx signal 90 from the
other endpoint and to generate a Tx signal 92 to the other
side.
`
The system further comprises a four channel IP recorder
device 94 that is adapted to receive a plurality of digitized
audio channels and record them on storage media such as a
hard disk, flash memory disk, RAM, NVRAM, magnetic
tape, etc. Each endpoint sends two separate channels of
audio to the recording device: (1) a played audio channel and
(2) a transmitted audio channel. Endpoint A is adapted to
send a separate played audio signal PA(n) 96 and a
transmitted audio signal TA(n) 98 to the recording device. Note
that the signal received (Rx 90) is not forwarded to the
recorder as this signal is not necessarily the signal that is
played by the endpoint. Similarly, endpoint B is adapted to
send a separate played audio signal PB(n) 100 and a
transmitted audio signal TB(n) 102 to the recording device.
A requirement of any accurate recording system is to be
able to faithfully play back the sound that was originally
recorded. In a packet telephony system, a recorder must be
able to play back the sound that was generated on the side of
the talking endpoint (i.e. sent by the transmitter) in addition
to the sound that was played at the listening endpoint (i.e. the
playback signal sent to the loudspeaker). Therefore, each
endpoint must forward two separate audio streams: the audio
that is played through the speaker and the audio that is
transmitted to the other side.

In addition, the recording device must synchronize the
four channels of audio it receives from the two endpoints. It
must be adapted not only to synchronize between playback
and transmit between two endpoints, but also to
synchronize audio between transmit and playback
from the same endpoint.
`
`SUMMARY OF THE INVENTION
`
`The present invention provides an apparatus for and a
`method of audio recording in packet-based telephony sys-
`tems. Using the present invention, the equivalent of four
`audio channels are recorded utilizing only two recording
`channels. Each channel recorded comprises the stream of
`packets (e.g., RTP packets) generated and transmitted by
`each endpoint to the other side. The RTP packets include the
`samples generated by the particular endpoint in addition to
`an indication (e.g., a pointer) of the samples received from
`the other side actually played by the endpoint. Note that the
`audio played on an endpoint is not necessarily the samples
`received from the other side.
`
The transmit data, including the indication of the samples
played, generated by each side of a connection is sent to the
recording device. The recording device is operative to store
the received packet stream on some type of storage media
such as a hard disk drive, a flash memory disk, RAM,
NVRAM, magnetic tape, etc. The recording device
comprises means for synchronizing the audio stream of one
endpoint to the audio stream from the other endpoint. The
recording device must know what was played at the
endpoint in order to play back accurately the audio samples
generated by and received from the other endpoint. Thus, the
recording device is effectively provided knowledge of the
actual audio played on both ends of the connection.
In one embodiment, a two channel IP recording device is
adapted to receive a single packet stream generated by each
side of a connection. The packet stream is transported from
each endpoint to the recording device over a reliable
connection, using either a reliable protocol such as TCP/IP, a
point-to-point connection, or a circuit based connection.
Note that it is not necessary that the reliable connection be
a real time connection. The packet stream includes the
digital audio data generated on the endpoint, e.g., voice from
a microphone, and an indication, e.g., pointer, of the packet
from the other side that was played on the endpoint. In a
second embodiment, each endpoint comprises recording
means for recording the transmit packet stream sent to the
other side. A subsequent offline process combines and
synchronizes the two recorded packet streams using the
indications that were added to the RTP packets.
Since the recorder receives the audio signal that was
generated and transmitted from each endpoint, it can
reconstruct the audio signal that was actually played on the
endpoint. To play back an audio signal, the recording device
needs to know the samples that were actually played on each
endpoint. The recorder is provided knowledge of the audio
played on the other end via information transmitted in the
data sample packets it receives. Each endpoint is adapted to
include an indication of the audio that it played, with the
packet of data samples sent to the recorder.
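
The reconstruction idea can be sketched as follows, under simplifying assumptions that are not taken from the patent: each transmitted packet carries exactly one indication, all packets hold the same number of samples, and a hypothetical `SILENCE` marker stands for injected silence. The function name and data layout are illustrative only.

```python
SILENCE = None  # hypothetical marker: the endpoint played silence


def reconstruct_played(indications, other_side_payloads, samples_per_packet):
    """Rebuild what one endpoint played from the two recorded transmit streams.

    indications: one entry per packet the endpoint transmitted, naming the
        other side's packet being played at that time (or SILENCE).
    other_side_payloads: recorded payloads of the other endpoint, keyed by
        the same identifier the indications use.
    """
    played = []
    for indication in indications:
        if indication is SILENCE:
            played.extend([0] * samples_per_packet)  # silence was injected
        else:
            played.extend(other_side_payloads[indication])
    return played


# Endpoint B sent two packets; packet 200 never reached endpoint A in time.
b_stream = {100: [1, 2], 200: [3, 4]}
# Endpoint A played packet 100, replayed it, then played silence.
played_at_a = reconstruct_played([100, 100, SILENCE], b_stream, samples_per_packet=2)
```

The recorder holds packet 200 because it received endpoint B's stream, yet the reconstruction omits it because endpoint A never played it.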
`To perform accurate playback, the recording device needs
`to know for each sample an endpoint recorded, what sample
`the endpoint played at that time. The recording device is
`provided knowledge of the audio played on the endpoint via
`information transmitted in the header and header extension
`portions of the RTP packets and via the knowledge of the
`number of samples in the payload portion of the RTP packet.
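The per-sample bookkeeping described above can be illustrated with a minimal sketch. It assumes that the indication carried with each packet is a position counted in samples, so that the i-th sample of the payload was recorded while position `played_start + i` of the other stream was being played. The names `RecordedPacket`, `per_sample_alignment`, and `reconstruct_played` are hypothetical and do not appear in the specification; filling unknown positions with zero is likewise an assumption of this sketch.

```python
from typing import Dict, List, NamedTuple, Tuple


class RecordedPacket(NamedTuple):
    """One packet as seen by the recording device (hypothetical layout)."""
    local_start: int    # position of the first sample generated in this packet
    played_start: int   # position in the other stream played at that moment
    samples: List[int]  # payload: the audio samples generated by the endpoint


def per_sample_alignment(packet: RecordedPacket) -> List[Tuple[int, int]]:
    """Expand the per-packet indication into one (recorded, played) pair per sample.

    The payload length tells the recorder how many consecutive positions
    the packet covers.
    """
    return [
        (packet.local_start + i, packet.played_start + i)
        for i in range(len(packet.samples))
    ]


def reconstruct_played(packet: RecordedPacket,
                       remote_by_position: Dict[int, int]) -> List[int]:
    """Return the remote samples that were played while this packet was recorded.

    Positions the recorder has no data for are filled with 0 (an assumption).
    """
    return [
        remote_by_position.get(played_pos, 0)
        for _, played_pos in per_sample_alignment(packet)
    ]
```

For example, a packet of three samples starting at local position 1600 with a played position of 960 pairs positions 1600 to 1602 with remote positions 960 to 962.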
`There are two methods by which an endpoint informs the
`recording device which samples were played when the
`samples in the data packet were recorded: the first method
`uses timestamps and the second method uses the RTP packet
`sequence numbers and offset pointers into the RTP packets.
`In the timestamp method, each endpoint is adapted to
`include the timestamp of the packet of audio that is played,
`with the packet of data samples sent to the recording device.
`Thus, two timestamps are sent in the RTP packet including
`(1) a first timestamp of the data samples generated by the
`endpoint (this timestamp value is taken when the first
`sample in the packet is taken) and (2) a second timestamp of
`the packet received from the other endpoint and played at a
`point in time when the first sample of the local endpoint
`packet is generated.
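As an illustration of the timestamp method, the following sketch packs and parses a packet that carries both timestamps: the first in the standard 32-bit RTP timestamp field and the second in a one-word RTP header extension. The general shape (12-byte fixed header, X bit, 16-bit profile identifier and 16-bit length before the extension data) follows the RTP header format; the profile identifier value `EXT_PROFILE_ID`, the function names, and the choice of a single 32-bit extension word are assumptions of this sketch, not details taken from the specification. It also assumes no CSRC entries are present.

```python
import struct

RTP_VERSION = 2
EXT_PROFILE_ID = 0x0001  # hypothetical profile-defined identifier


def build_packet(seq: int, local_ts: int, played_ts: int,
                 ssrc: int, payload: bytes, payload_type: int = 0) -> bytes:
    """Pack an RTP packet whose header extension carries the played timestamp."""
    first_byte = (RTP_VERSION << 6) | (1 << 4)  # version 2, X (extension) bit set
    second_byte = payload_type & 0x7F           # marker bit clear
    header = struct.pack("!BBHII", first_byte, second_byte,
                         seq & 0xFFFF, local_ts & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    # Extension: profile id, length in 32-bit words (1), then the second timestamp.
    extension = struct.pack("!HHI", EXT_PROFILE_ID, 1, played_ts & 0xFFFFFFFF)
    return header + extension + payload


def parse_packet(packet: bytes) -> dict:
    """Recover both timestamps and the payload (assumes no CSRC entries)."""
    first_byte, _, seq, local_ts, _ = struct.unpack_from("!BBHII", packet, 0)
    if not (first_byte >> 4) & 1:
        raise ValueError("packet has no header extension")
    _, length_words, played_ts = struct.unpack_from("!HHI", packet, 12)
    payload = packet[12 + 4 + 4 * length_words:]
    return {"seq": seq, "local_ts": local_ts,
            "played_ts": played_ts, "payload": payload}
```

A recording device receiving such a packet would read `local_ts` for the samples generated by the endpoint and `played_ts` for the packet being played when the first of those samples was taken.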
`Each endpoint is operative to track the timestamp of the
`data samples received encapsulated in RTP packets sent
`from the other endpoint. These data samples are subse-
`quently played by the endpoint
`through its associated
`speaker. The data samples generated by the endpoint are
`timestamped and placed in RTP packets. In addition, the
`timestamp of the data samples played by the other endpoint
`at that moment in time is also placed in the extension portion
`of the header of the RTP packet sent to the recording device.
`If the last packet received was replayed, an indication is
`placed in the header extension of the packet that comprises
`the timestamp of the most recently received RTP packet. If
`a silence is played, a zero is placed in the header extension.
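The choice of value placed in the header extension can be sketched as a small function on the endpoint side. The three playout states and the names `Playout` and `played_indication` are hypothetical; returning zero for a replay when no packet has yet been received is also an assumption of this sketch.

```python
from enum import Enum
from typing import Optional


class Playout(Enum):
    """What the endpoint's speaker is doing when a local packet is generated."""
    NORMAL = "normal"    # a newly received packet is being played
    REPLAY = "replay"    # the last packet received is being replayed
    SILENCE = "silence"  # silence is being played


def played_indication(state: Playout,
                      current_ts: Optional[int],
                      last_received_ts: Optional[int]) -> int:
    """Return the value to place in the header extension of the outgoing packet."""
    if state is Playout.SILENCE:
        return 0  # zero marks silence
    if state is Playout.REPLAY:
        # Timestamp of the most recently received RTP packet.
        # Assumption: fall back to zero if nothing has been received yet.
        return last_received_ts if last_received_ts is not None else 0
    if current_ts is None:
        raise ValueError("normal playout requires the timestamp being played")
    return current_ts
```

Note that in this sketch a genuine timestamp of zero would be indistinguishable from the silence marker; a real implementation would need to handle that case.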
`The completed RTP packet is then sent over a real time
`connection (e.g., UDP/IP) to the remote endpoint for
