`
(12) United States Patent
Dye et al.

(10) Patent No.: US 7,664,056 B2
(45) Date of Patent: Feb. 16, 2010

(54) MEDIA BASED COLLABORATION USING MIXED-MODE PSTN AND INTERNET NETWORKS

(75) Inventors: Thomas A. Dye, Austin, TX (US); Thomas A. Dundon, Austin, TX (US)

(73) Assignee: Meetrix Corporation, Santa Monica, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1054 days.

(21) Appl. No.: 10/796,560

(22) Filed: Mar. 9, 2004

(65) Prior Publication Data
US 2004/0223464 A1, Nov. 11, 2004

Related U.S. Application Data
(60) Provisional application No. 60/453,307, filed on Mar. 10, 2003.

(51) Int. Cl.
H04L 12/66 (2006.01)

(52) U.S. Cl.: 370/260; 370/352; 370/354

(58) Field of Classification Search: 370/352-356, 370/260-263, 265
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS
6,751,477 B1* 6/2004 Alperovich et al. 455/560
2003/0137976 A1* 7/2003 Zhu et al. 370/354

* cited by examiner

Primary Examiner: Suhan Ni
(74) Attorney, Agent, or Firm: Kerr IP Group, LLC
`
`(57)
`
`ABSTRACT
`
A method which allows standard telephone users to audio conference with video conferencing participants over IP networks in a private secure environment. A dial-out is performed from one or more conference client terminals bridging audio between the Internet and the PSTN networks. The process uses a mixed mode hybrid network architecture for call set-up, initialization and teardown including the method to mix audio at the desktop terminal instead of in a general purpose server as in the prior art. The method conferences video and audio between multiple clients and includes audio from a standard telephone network within the conference. A virtual private network connects all of the IP clients together including the voice over IP server used to transcode the proprietary audio into the H.323 standard for transport into the telephony network.
`
`8 Claims, 6 Drawing Sheets
`
[Representative drawing. Legible labels: VoIP Moderator 401; Client Video Input device 451; Audio input device 452; Router/Modem 453; Client "POTS" Telephone 457; Client Computing device 459; VPN Bridge; Virtual Private Network 461; Internet 435; Server; Database; VoIP Server 409; PSTN Gateway 411; PSTN Network 433; Global Dial Network 450; PSTN Client 2 412 (Wireless Phone); Standard Phone 413; Audio/Video Client 417; Conference Callee.]
`
`Zoho Corp. and Zoho Corp. Pvt., Ltd.
`Exhibit 1036 – 001
`
`
`
[Sheet 1 of 6: FIG. 1 (Prior Art). Legible labels: PBX/POTS phone; wireless phone; 433; 411; 203 (Multipoint Control); H.323 Network (i.e., Ethernet).]
`
`
`
[Sheet 2 of 6: drawing text not recoverable from the scan.]
`
`
`
[Sheet 3 of 6: drawing text not recoverable from the scan.]
`
`
`
[Sheet 4 of 6: drawing text largely unrecoverable (rotated labels). Legible labels include: PSTN; Gateway; VoIP/PSTN Gateway; Appliance; VPN Bridge.]
`
`
`
`
`
`
`
`
[Sheet 5 of 6: drawing text largely unrecoverable (rotated labels). Legible labels include: PSTN; PSTN Network; Internet; H.323 Network; VPN Bridge; VPN Tunnel; Moderator; Client #1; Client #N; Conference Audio and Video; Conference VoIP Audio and Call Setup.]
`
`
`
`
`
`
`
`
`
`
`
[Sheet 6 of 6: FIG. 6. Legible labels: Remote Speaker; Remote Mic; Remote AV Clients 418; VPN Network Transport 461; Remote Transport Input for AV Clients 542; Remote Transport Output for AV Clients 544; Remote Client/Moderator Boundary 510; Local Mic 452; Local Transport Output; Local Transport Input 548; Audio Encoder; Local Moderator Client 401; Conference Audio Output 552; Decoders 1 to n; Audio Decompressors 525; Audio Mixer 532; Remote Mixed Audio 527; Sound I/F Boundary; Moderator VoIP Mixer 568; Moderator Audio Input 550; Local Speaker Output 564; Local Speaker 454; Conference Application Boundary 566; Local Client VoIP Boundary 520; VoIP Encoder 522; VoIP Decoder 524; VoIP Server (H.323) 409; PSTN Boundary 515; PSTN Transport 433; Audio Mixer 534; Internet Interfaces 435; PSTN Client 412.]
`
`
`
MEDIA BASED COLLABORATION USING MIXED-MODE PSTN AND INTERNET NETWORKS

PRIORITY CLAIM

This application claims benefit of priority of U.S. provisional application Serial No. 60/453,307 titled "THE METHOD AND PROCESS FOR MEDIA BASED COLLABORATION USING MIXED-MODE PSTN AND INTERNET NETWORKS" filed Mar. 10, 2003, whose inventors are Thomas A. Dye and Tom Dundon, which is hereby incorporated by reference in its entirety.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to computer system architectures, and more particularly to audio and video telecommunications for collaboration over hybrid networks.

2. Description of the Related Art

Since their introduction in the early 1980s, audio/video conferencing systems ("video conferencing systems") have enabled users to communicate between remote sites using telephone lines based on dedicated or switched networks. Recently, technology and products to achieve the same over Internet Protocol have been attempted. Many such systems have emerged on the marketplace. Such systems produce low-frame-rate and low quality communications due to the unpredictable nature of the Internet. Such connections have been known to produce long latencies with limited bandwidth, resulting in jerky video, dropped audio and loss of lip sync.

Therefore, most video conferencing solutions have relied on dedicated switched networks such as T1/T3, ISDN or ATM. These systems have the disadvantage of higher cost and complexity and a lack of flexibility due largely to interoperability issues and higher cost client equipment. High costs are typically related to expensive conferencing hardware and dedicated pay-per-minute communications usage. Most often these dedicated communications circuits are switched circuits which use a fixed bandwidth allocation.

In most prior art systems the public switched telephone network (PSTN) is used to transfer audio during conferencing and collaboration with remote parties. It is known that quality of audio reception is poor over typical prior art Internet protocol (IP) systems. Prior art audio/video conferencing systems which use IP networks for audio and video transport lack the ability to terminate audio to client end systems through both PSTN and IP networks. Thus, it is desirable to achieve a hybrid mix of audio and video data over PSTN and IP based audio/video conferencing to achieve full duplex real-time operation for all conference participants.

Modern voice over IP telephony systems have used the H.323 standard from the International Telecommunication Union (ITU). The H.323 standard focuses on the transmission of audio and video information through the Internet or switched private networks. FIG. 1 illustrates a prior art H.323 system. The block diagram of FIG. 1 includes a number of major components, including the general Internet 435, Internet H.323 bridges or gateways 411, telecommunications PSTN 433 (Public Switched Telephone Network), wireless and land-line phone handsets 412/413, standard Internet router 453, an optional gatekeeper 205, a multipoint control unit 203, a standard local area network 457, a voice over IP server running the H.323 protocol 201, and multiple I/O and display terminals 455. FIG. 1 is an example of the prior art conferencing system used between hybrid networks connecting the PSTN and Internet. Hybrid networks are used to communicate audio on internal LAN and WAN networks as well as to transfer audio to the existing telephone or PSTN network. While the H.323 recommendation allows for video conferencing, the prior art systems use private switched networks to establish transport that require expensive H.323 bridges between dedicated networks and the PSTN. Each of the components in FIG. 3 serves this purpose to achieve audio telecommunications between multiple parties.

Referring again to FIG. 1, the components of FIG. 1 are interconnected as follows. Prior art technology uses PC or client terminals 455 connected through a local area network 457 to either a data server or a specialized audio/video server 201. The network server 201 contains the application necessary to generate the H.323 network protocol. The data server 201 may be connected to a local gatekeeper 205 that is responsible for management control functions. As known, the gatekeeper 205 is responsible for various duties such as admission control, status determination, and bandwidth management. Data server 201 functions are specified and handled through the ITU-T H.225.0 RAS recommendations. In addition, a multipoint control unit (MCU) 203 is connected to the data server 201. The multipoint control unit 203 is required by the eight-step ITU-T H.323 recommendation for flexibility to negotiate endpoints and determine compatible setups for any conference media correspondents. The multipoint control unit 203 enables communication between three or more endpoints. Similar to a multipoint bridge, the gatekeeper 205 and the multipoint control unit 203 are optional components of the H.323 enabled network. Another useful job of the multipoint control unit 203 is to determine whether to unicast or multicast the audio or video streams. As known by one skilled in the art, these decisions are dependent on the capability of the underlying network and the topology of the multipoint conference. The multipoint control unit 203 determines the capabilities of each client terminal 455 and the status of each media stream.

Again referring to FIG. 1, a standard network router 453 is connected between the local area network 457 and the Internet 435. At the outer edges of the Internet, "points of presence" are located at multiple end-point or call termination sites. Gateways 411 are used to transcode the H.323 network information onto the PSTN 433. Standard telephone handsets 413 or wireless phones 412 are connected to the PSTN telephony system.

FIG. 2 illustrates the embodiment of the H.323 protocol stack 200, its components and their interfaces to the local area network computers at the network interface 300. The input and control devices 455 along with a local area network 457 of FIG. 1 are shown in FIG. 2, consisting of the audio input and output block 452, the video input and output block 451, and the system control unit and data collaboration unit 459. These input devices are largely responsible for the delivery of media data to the H.323 protocol stack 200 shown in FIG. 2.

Again referring to FIG. 2, the sub blocks of functionality that make up the H.323 protocol stack 200 are described. The H.323 protocol stack consists of an audio CoDec 211 and a video CoDec 213 connected to the audio and video input and output blocks 452, 451. The audio and video CoDecs are responsible for compression and decompression of the audio and video sources. The real-time network protocol component 215 is connected to the audio and video CoDecs and is also responsible for preparation of the media data for transport according to the RTP (Real-time Transport Protocol) recommendations.

Again referring to the prior art system of FIG. 2, the H.323 protocol stack has a system control unit 459 which connects to multiple control blocks within the H.323 protocol stack 200. The system control unit connects to the RTC Protocol block 217 for real-time transport of the control information used to set up and tear down the conference. The system control unit 459 also connects to the call-signaling units 221 and 219 for call signaling protocols and the media stream packetization application used for packet based multimedia communications. The system control unit 459 also connects to the control signaling block 223 used for control of protocols for multimedia communications. Lastly, the H.323 recommendation defines a data collaboration capability as known and outlined in the T.120 data collaboration unit 225.

All of the defined blocks make up the H.323 protocol network interface to the Transport protocol and network interface unit 300 for transport of data through the modem or router 453 to the Internet 435.
`
`
`SUMMARY OF THE INVENTION
`
`
The present invention comprises various embodiments which enable audio from standard and wireless telephone systems to be mixed with audio, video and collaboration data resident in IP networks in preparation for transport, preferably over a novel multicasting technique using virtual private networks. In one embodiment, audio data terminating or originating from the PSTN may be multiplexed into open or private IP networks for efficient transport to multiple local or remote client computers. This allows video and audio collaboration clients to talk with remote telephony devices during the process of multiparty audio/video conferencing.
In an alternate embodiment, without video conferencing, the method may use public networks to transport a multicast enabled IP audio stream during multi-party audio conferencing without the need for a conventional audio bridge device. Audio data is transported in a hybrid network comprising the PSTN and IP network. In this embodiment, a local client initiates a call to the remote telephone or wireless telephony device from a local dial-out application located preferably on the client's computer. Call set-up is initiated as a series of control packet data transfers to a Voice-over-IP (VoIP) server or PSTN gateway located at some predetermined Internet address on the world-wide-web. Control data packets are transported to the VoIP server via a secure multicast enabled virtual private network. The local client computer compresses the audio data prior to transport to the VoIP server. The VoIP server uses standard ITU-T H.323 or SIP audio telephony transport protocol on the primary network connection protocol in preparation for entry to the secondary PSTN. The H.323 or SIP call instantiation is a protocol completed by the VoIP server which requests further transport of the digitized audio stream through a gateway to the public PSTN. In this embodiment, the majority of the audio data in transport over virtual private tunnels is multicast enabled such that the final termination or origination points are geographically close to the local or remote client computers. Once the proprietary data packets are handed off to the VoIP server or remote PSTN gateway, the invention ensures that standard protocols such as H.323 or SIP are used to further process audio call set-up, tear-down and transport as known by those knowledgeable in the art.
H.323 or the Session Initiation Protocol (SIP) is used for call set-up of the network connections between the hybrid networks and the remote telephone(s) (PSTN). Once the IP network to PSTN call connection is established, compressed digitized audio packet data is grouped into multicast packets and encapsulated for traversal through the open Internet. Transport between the remote PSTN client (Callee) and the Local (Caller) is accomplished with full duplex audio between all audio and video participants within the conference. In one embodiment, compression may be accomplished with a standard audio CoDec such as that specified in the ITU-T G.729 recommendation or with a proprietary audio CoDec as known in the art. Thus, audio data transcoders at the VoIP server may be used to match the expected audio decoders located at the PSTN gateways. The unique process compresses the "Callee" audio data at the local client computer prior to multicast transport to other remote clients and to the VoIP server. This process minimizes the transport bandwidth during the first mile connection to/from the Internet.
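
As a rough illustration of why compressing audio at the client reduces first-mile bandwidth, the sketch below compares the per-stream bit rate of uncompressed G.711 audio (64 kbit/s) with G.729 audio (8 kbit/s). The 20 ms packet interval and the 40 bytes of IPv4/UDP/RTP headers per packet are assumptions made only for this example; the text above does not specify packetization, and the function name `stream_kbps` is hypothetical.

```python
def stream_kbps(codec_kbps: float, packet_ms: int = 20, header_bytes: int = 40) -> float:
    """Approximate on-the-wire bit rate of one audio stream, in kbit/s.

    codec_kbps   -- codec payload rate (64 for G.711, 8 for G.729)
    packet_ms    -- assumed packetization interval in milliseconds
    header_bytes -- assumed per-packet header overhead (IPv4 + UDP + RTP = 40)
    """
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000  # codec bytes per packet
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000


if __name__ == "__main__":
    for name, rate in (("G.711", 64), ("G.729", 8)):
        print(f"{name}: {stream_kbps(rate):.1f} kbit/s per stream")
```

Under these assumptions the G.729 stream needs roughly a third of the bandwidth of the G.711 stream, and the fixed header overhead becomes the dominant cost.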
In one embodiment, the method for adding a telephone participant to a multi-participant video conference operates as follows. A first message is sent to each of a plurality of multicast appliances over the Internet, wherein the first message comprises a group address which identifies participants. Each of the multicast appliances receives the first message. A plurality of virtual private networks are then established across the Internet between the multicast appliances. As a result, one or more of the participants are able to communicate in the multi-participant video conference. The telephone participant then joins the multi-participant video conference, wherein this comprises a first participant contacting the telephone participant; establishing a phone number with a VoIP server; the VoIP server communicating with a gateway to call the telephone participant; and the telephone participant participating in the multi-participant video conference.
In one embodiment, the telephone participant participates in the multi-participant video conference as follows: the telephone participant speaking in the video conference; generating digital voice data in response to the telephone participant speaking; transforming the digital voice data into IP packets; transmitting the IP packets containing the digital voice data to the first participant; at a computer system of the first participant, decoding the IP packets containing the digital voice data to produce the digital voice data; mixing the digital voice data of the telephone participant with digital voice data of the first participant; and providing the mixed digital voice data of the telephone participant and the first participant to the other participants.
The method may further comprise: mixing the digital voice data of the first participant and the digital voice data of the other participants; and providing the mixed digital voice data of the first participant and the other participants to the telephone participant.
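
The two mixes described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes decoded audio is available as equal-length frames of 16-bit PCM samples, it mixes by summing and clipping, and the names `mix` and `mixes_at_first_participant` are hypothetical.

```python
from typing import Dict, List, Tuple

INT16_MIN, INT16_MAX = -32768, 32767


def mix(frames: List[List[int]]) -> List[int]:
    """Sum equal-length 16-bit PCM frames sample by sample, clipping to the int16 range."""
    if not frames:
        return []
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("all frames must have the same length")
    return [max(INT16_MIN, min(INT16_MAX, sum(samples))) for samples in zip(*frames)]


def mixes_at_first_participant(
    first: List[int],
    telephone: List[int],
    others: Dict[str, List[int]],
) -> Tuple[List[int], List[int]]:
    """Build the two mixes produced at the first participant's computer.

    to_others    -- telephone participant + first participant, for the other participants
    to_telephone -- first participant + other participants, for the telephone participant
    """
    to_others = mix([telephone, first])
    to_telephone = mix([first] + list(others.values()))
    return to_others, to_telephone
```

Neither mix contains the voice of the party it is sent to, so the telephone participant does not hear an echo of their own speech.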
In another embodiment, the telephone participant participates in the multi-participant video conference as follows: the telephone participant speaking in the video conference; generating digital voice data in response to the telephone participant speaking; transforming the digital voice data into IP packets; configuring the IP packet with a group address according to a multicast protocol to create a multicast IP packet; encapsulating the multicast IP packet as a unicast packet; transmitting the unicast packet over the virtual private networks across the Internet between one or more appliances; one or more of the appliances determining the multicast data from the unicast packet; and the appliances providing the multicast data to each of the other participants in the group address.
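
The encapsulation step above can be illustrated with a small sketch. The packet layout used here (a 4-byte group address and a 2-byte length in front of the payload) is purely hypothetical and chosen for readability; it is not the wire format of the patent or of any standard tunneling protocol, and the function names are likewise illustrative.

```python
import ipaddress
import struct
from typing import Tuple

HEADER = "!4sH"  # hypothetical inner header: group address (4 bytes) + payload length (2 bytes)
HEADER_SIZE = struct.calcsize(HEADER)


def make_multicast_packet(group: str, payload: bytes) -> bytes:
    """Tag voice payload with a multicast group address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast group address")
    return struct.pack(HEADER, addr.packed, len(payload)) + payload


def encapsulate(inner: bytes, appliance: str) -> Tuple[str, bytes]:
    """Wrap the multicast packet in a unicast envelope addressed to one appliance."""
    dest = ipaddress.IPv4Address(appliance)
    if dest.is_multicast:
        raise ValueError("the tunnel endpoint must be a unicast address")
    return str(dest), inner


def decapsulate(unicast: Tuple[str, bytes]) -> Tuple[str, bytes]:
    """At the receiving appliance, recover the group address and the voice payload."""
    _, inner = unicast
    group_raw, length = struct.unpack(HEADER, inner[:HEADER_SIZE])
    return str(ipaddress.IPv4Address(group_raw)), inner[HEADER_SIZE:HEADER_SIZE + length]
```

The receiving appliance can then deliver the recovered payload to every local listener of the group address, while routers between the appliances only ever see ordinary unicast traffic.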
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
A better understanding of the present invention can be obtained when the following detailed description of the preferred embodiment is considered in conjunction with the following drawings, in which:

FIG. 1 illustrates a typical H.323 audio and video conferencing system implemented in accordance with prior art;
FIG. 2 illustrates an H.323 protocol stack and its components implemented in accordance with prior art;
FIG. 3 illustrates one embodiment of the present invention;
FIG. 4 illustrates an embodiment using multicast protocol;
FIG. 5 illustrates the audio and video data flow over hybrid networks; and
FIG. 6 illustrates the local client data mixing used in the preferred embodiment.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Incorporation by Reference

The following applications and references are hereby incorporated by reference as though fully and completely set forth herein.

U.S. application Ser. No. 10/446,407 titled "Transmission Of Independently Compressed Video Objects Over Internet Protocol," Dye et al., filed May 28, 2003.

U.S. application Ser. No. 10/620,684 titled "Assigning Prioritization During Encode Of Independently Compressed Objects," Dye et al., filed on Jul. 16, 2003.

International Telecommunications Union Recommendation H.323, titled "Packet Based Multimedia Communication System," November, 2000.

International Telecommunications Union Recommendation H.261, titled "Video Coding for Audio Visual Services at Px64 kbps."

International Telecommunications Union Recommendation H.263, titled "Video Coding for Low Bit-Rate Communications," February, 1998.

One embodiment of the present invention uses a decentralized model for multipoint conferencing. The multipoint control unit ensures communication capability once the media stream is transcoded to the H.323 standard as known. However, this embodiment mixes media streams at each terminal prior to multicast.

FIG. 3 illustrates one embodiment of the invention. This embodiment allows audio, video and data collaboration information to be securely transferred between a plurality of local and remote clients, preferably within a virtual private network. This embodiment provides the ability for a moderator (single member of the conference) to dial out from a desktop computer or terminal (using a novel hybrid network structure), connecting an external telephone user's audio into the audio/video conference. The embodiment integrates full duplex audio, video and data connections between clients conferencing on the Internet and clients conferencing on standard telephone systems. The Internet/PSTN hybrid network is the medium used for transport. FIG. 3 depicts the necessary equipment and protocols to complete the dial out to PSTN network method and process.

Now referring to FIG. 3, the voice over IP moderator 401 (call initiator or caller) typically has a number of peripherals used for real input output devices at the desktop. These include a client computing device such as a PC or other computer 459, a client terminal 455 including a keyboard and mouse for input output control, a standard desktop telephone 457, a video input device or camera 451 and the audio input device, microphone 452. In one embodiment each conference call connected to the Internet will have similar peripheral hardware devices. FIG. 3 illustrates a multi-party virtual conference connected over the Internet. Internet clients include audio video client 415, audio video client 418, and audio video client 417. In addition FIG. 3 shows two possible telephony clients using standard wired 412 or wireless telephone 413 systems. PSTN client #1 412 is connected to a wireless cell-phone that in turn is connected to the global dial network 450 as specified by the PSTN 433. Remote telephony user client 2 413 is connected to a standard telephone handset 413 which again is connected to the global dial network 450 based on the PSTN 433.

Again referring to FIG. 3, the Internet based clients 401, 415, 418 and 417 are connected through routers or modems 453, preferably in a virtual private network configuration 461. A virtual private network bridge 407 is used to connect local and remote clients together within a secure private network. A local connection from the VPN bridge 407 to the voice over IP server 409 is used to transfer conference audio from any participant on the IP network to any participant in the PSTN. Thus, the voice over IP server 409 is responsible for transcoding audio information from the virtual private network 461 to and from the PSTN gateway 411, thus bridging the PSTN and VPN together.

FIG. 4 illustrates one embodiment of the present invention. The system of FIG. 4 performs audio transport between multiple client groups who all share the same multicast group address such that audio/video and data may be shared interactively without the need of central servers. Multicast protocol and encapsulated media packets are implemented so that media data may be routed through public or private IP networks without the need for special hardware and software during the majority of the network transport. FIG. 4 shows a system of virtual networks that interconnect as a virtual private network 423. Each VPN tunnel can be connected in a series or star topology between one or more multicasting appliances 449-451. One or more central servers or VPN bridge(s) 407 are at the center of the network topology. Multicasting enabled appliances 447, 449, 451, 453, 455 are used at the origination or termination points for audio, video or data (media data) to form the backbone of the transport path. PSTN gateways are used to provide "points of presence" throughout and are responsible for origination or termination of audio data on and off of the PSTN from the IP network topology. Multicast enabled routing allows remote clients to be PCs or PSTN gateways which become "Listeners" of media data. Thus, media data is presented or broadcast onto a network with one or more group addresses. This method uses less bandwidth and reduces latency during transport.

Again referring to FIG. 4, PSTN group #1 412 has three analog telephones which are switched into a PSTN gateway and VoIP server 471 which is networked over a public or private network connection to a multicast enabled VPN appliance 447. Appliance 447 is connected to a VPN bridge server 407 also by means of a virtual private network. The VPN Bridge 407 is used to authenticate clients and assign multicast IP group addresses to various PC clients and VoIP gateway servers. In addition the VPN Bridge Server 407 may have additional meeting room or conferencing features necessary to carry out a multi-party conference. Connected to the VPN Bridge 407 are various virtual private networks which form network tunnels to one or more other multicasting appliances 449, 451, 453, 455, 457 which connect to one or more PSTN gateways typically located in geographically dispersed areas.

For the purpose of the illustration of FIG. 4, PSTN group #1 412 is audio conferencing with PSTN client #3 414 and PSTN client #5 416, each of which is audio conferencing with Audio/Video client group #4 415. In the illustration of FIG. 4 each member of audio/video client group #4 shares audio with all the clients and video with each other. One example may be illustrated again referring to FIG. 4. If telephone client #5 416 is talking, the analog audio is converted from switched network (PSTN) to IP in the VoIP/PSTN gateway 475. The digital IP is routed via the Internet to an appliance 455 at the edge of the network, typically co-located with the VoIP/PSTN gateway 475. The appliance has been configured to have a virtual private network creating a tunnel through the Internet to appliance 453, which also has Internet based virtual private tunnels to appliance 457 and appliance 447. Audio from PSTN client #5 416 is broadcast from appliance 457 whereby all the audio/video client PCs of group #4 are "listeners" and receive the audio from PSTN client 416 at the same time. Additionally, PSTN client #5's 416 audio is routed over another virtual private network to one or more appliances, in this case appliances 447 and 449. PSTN Client group #1 412 are also "listeners" of the multicast group, as well as PSTN Client #3 414. Thus, audio is broadcast to multiple audio devices in both IP networks and the PSTN using a unique group address and a virtual private network structure. Interactivity is gained by using the same process no matter who in the group is the broadcaster of audio or video.
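
The FIG. 4 example can be pictured as a flood over the tunnel graph: audio entering at one appliance is forwarded across each tunnel once and handed to every listener of the group except the sender. The sketch below is only an illustration under stated assumptions. The tunnel map and the assignment of listeners to appliances are hypothetical (the text does not state, for example, which appliance serves PSTN client #3), and `deliver` is an illustrative name.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical VPN tunnel map loosely following the FIG. 4 example (appliance -> neighbors).
TUNNELS: Dict[str, List[str]] = {
    "455": ["453"],
    "453": ["455", "457", "447", "449"],
    "457": ["453"],
    "447": ["453"],
    "449": ["453"],
}

# Hypothetical listeners of the shared group address attached to each appliance.
LISTENERS: Dict[str, List[str]] = {
    "455": ["PSTN client #5"],
    "457": ["A/V client group #4"],
    "447": ["PSTN client group #1"],
    "449": ["PSTN client #3"],
}


def deliver(source_appliance: str, sender: str) -> List[str]:
    """Return every listener that receives audio injected at source_appliance."""
    seen: Set[str] = {source_appliance}
    queue = deque([source_appliance])
    reached: List[str] = []
    while queue:
        appliance = queue.popleft()
        # Hand the audio to local listeners, skipping the party that is speaking.
        reached.extend(l for l in LISTENERS.get(appliance, []) if l != sender)
        # Forward over each tunnel once so no appliance receives a duplicate.
        for neighbor in TUNNELS.get(appliance, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return sorted(reached)
```

Calling `deliver("447", "PSTN client group #1")` uses the same code path as `deliver("455", "PSTN client #5")`, which mirrors the point that the process is the same no matter who in the group is speaking.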
FIG. 5 shows a more detailed block diagram of the embodiment of the present invention. The moderator client #1 401 initiates the call using the application code running on the voice over IP server 409. Call initiation and call transfer may be accomplished through a VPN tunnel 421 connected to the moderator client 401. Two connections to the moderator client #1 401 through the VPN tunnel 421 are established. The first connection connects the VoIP conference data for call initiation, set-up and control 405. The second connection 403 through the VPN tunnel connects the conference audio and video 403 between the moderator client 401 and multiple remote clients 415, 417, 413 connected to the Internet. The VPN tunnel 421 is connected into the VPN bridge 407, which may be located within the Internet 435 at either local or remote sites. As indicated in FIG. 5, the VPN bridge 407 is responsible for connecting and establishing the virtual private network used for secure conferencing. In the embodiment of the present invention the VPN bridge 407 bridges all the tunnels for data transfer. Thus, VPN tunnel 421, VPN tunnel 423 and VPN tunnel 425 are on the same virtual private network. Alternate embodiments may include a plethora of tunnels connected through a single VPN bridge or multiple VPN bridges based on scalability of the system. An additional tunnel containing the conference voice over IP audio and call set-up data 405 is connected to a separate voice over IP server 409. The server 409 is responsible for transcoding the voice over IP audio and call set-up control 405 in preparation for data transfer across the H.323 network 437. The H.323 network 437 traverses across the Internet to one of many PSTN gateways 411. PSTN gateways 411 form the bridge between the Internet and the public switched telephone network 433. These VoIP gateways are typically located at the local exchange carrier (LEC) in a plethora of individual points of presence throughout the world. Audio telephony calls are terminated at the voice over IP client 413. These termination points may be located throughout the world. Thus, the embodiment shown in FIG. 5 allows for the dial-out to s