US007180997B2

(12) United States Patent
     Knappe

(10) Patent No.: US 7,180,997 B2
(45) Date of Patent: Feb. 20, 2007

(54) METHOD AND SYSTEM FOR IMPROVING THE INTELLIGIBILITY OF A MODERATOR DURING A MULTIPARTY COMMUNICATION SESSION

(75) Inventor: Michael E. Knappe, Sunnyvale, CA (US)

(73) Assignee: Cisco Technology, Inc., San Jose, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 241 days.

(21) Appl. No.: 10/236,484

(22) Filed: Sep. 6, 2002

(65) Prior Publication Data
     US 2004/0052218 A1    Mar. 18, 2004

(51) Int. Cl.
     H04M 3/56 (2006.01)
(52) U.S. Cl.: 379/387.01; 379/202.01
(58) Field of Classification Search: 379/202, 379/202.01, 203.01, 204.01, 205.01, 206.01
     See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

4,499,578 A        2/1985   Marouf et al.   370/62
6,011,851 A        1/2000   Connor et al.   381/17
6,178,237 B1       1/2001   Horn            379/202
6,272,214 B1       8/2001   Jonsson         379/202.01
2002/0181686 A1*   12/2002  Howard et al.   379/202.01

FOREIGN PATENT DOCUMENTS

EP   0 730 365 A2     9/1996
WO   WO 00/72560 A1   11/2000

OTHER PUBLICATIONS

PCT Notification of International Search Report, Application No. PCT/US03/25580 filed Aug. 15, 2003, Authorized by Carole Emery and mailed Jan. 1, 2004.
Kok Soon Phua et al., “Spatial Speech Coding for Multi-Teleconferencing,” TENCON 99, Proceedings of the IEEE Region 10 Conference, Cheju Island, South Korea, Sep. 15-17, 1999, pp. 313-316.

* cited by examiner

Primary Examiner: Sinh Tran
Assistant Examiner: Walter F Briney, III
(74) Attorney, Agent, or Firm: Baker Botts L.L.P.

(57) ABSTRACT

A system and method for improving the intelligibility of a moderator during a multiparty communication session includes receiving a plurality of participant voice streams from a plurality of respective conference participants. An incoming moderator voice stream may be received from a moderator. The plurality of participant voice streams and the moderator voice stream are transmitted such that the intelligibility of the moderator voice stream is improved relative to at least one of the participant voice streams.

40 Claims, 4 Drawing Sheets
`
`
`Zoho Corp. and Zoho Corp. Pvt., Ltd.
`Exhibit 1006 – 001
`
`

`

U.S. Patent    Feb. 20, 2007    Sheet 1 of 4    US 7,180,997 B2

[FIG. 1: block diagram of communications devices connected over links 28 to a network that includes a call manager and a conference bridge.]

[FIG. 2: block diagram of conference bridge 32, showing controller 50, buffers 52, converters 54, normalizers 56, mixer 58, and database 60 with conference parameters 62 (conference participants 64, participant priorities 66).]
`
`

`

U.S. Patent    Feb. 20, 2007    Sheet 2 of 4    US 7,180,997 B2

[FIG. 3: monaural mixer: participant inputs, summers, conference outputs.]

[FIG. 4: stereo mixer 100: participant inputs, directional processors (DP) 106, summers 108, conference outputs.]

[FIG. 5: spatial placement of sources around listener 122; axes: lateral directivity, depth perception (front to back).]
`
`

`

U.S. Patent    Feb. 20, 2007    Sheet 3 of 4    US 7,180,997 B2

[FIG. 6: spatial placement with the moderator moved to a position of higher prominence; axes: lateral directivity, depth.]

[FIG. 7: directional processor 106 containing spatial processors, feeding summer 108.]
`
`

`

U.S. Patent    Feb. 20, 2007    Sheet 4 of 4    US 7,180,997 B2

[FIG. 8: flow diagram:
200  Establish multi-party communication session
202  Identify priority of participants
204  Receive participant voice streams
206  Moderator speaking?
208  (No) Transmit participant voice streams
210  (Yes) Improve intelligibility of moderator voice stream
212  Transmit participant and moderator voice streams
     Conference terminated?]
`
`

`

METHOD AND SYSTEM FOR IMPROVING THE INTELLIGIBILITY OF A MODERATOR DURING A MULTIPARTY COMMUNICATION SESSION
`
`TECHNICAL FIELD OF THE INVENTION
`
`The present invention relates generally to the field of
`multiparty communications, and more particularly to a
`method and system for improving the intelligibility of a
`moderator during a multiparty communication session.
`
`
`BACKGROUND OF THE INVENTION
`
Communication networks, such as the Public Switched Telephone Network (PSTN), for transporting electrical representations of audible sounds from one location to another, are well known. Additionally, packet switched networks, such as the Internet, are able to perform a similar function by transporting packets containing data that represents audible sounds from one location to another. The audible sounds are encoded into digital data and placed into packets at the origination point, and extracted from the packets and decoded into audible sounds at the destination point.

Such communication networks also allow multiple people to participate in a single call, typically known as a “conference call.” In a conference call, the audible sounds at each device, usually telephones, are distributed to all of the other devices participating in the conference call. Thus, each participant in the conference call may share information with all of the other participants.

Modern business practices often require that several persons meet on the telephone to engage in a conference call. The conference call has introduced certain applications and techniques that are superior to those found in a meeting with persons physically present in the same location. For example, a conference call attendee who is not participating at the moment may wish to mute their audio output and simply listen to the other conferencees. This allows the particular conferencee to work on another project while still participating in the conference.

While the conference call has been substantially helpful in minimizing travel expenses and other costs associated with business over long distances, significant obstacles still remain in accomplishing many tasks with the same efficiency as one would in having a meeting with all persons in the same physical location. For example, the inability of conferencees to use or see visual aids or commands complicates the control and organization of the conference. This often results in multiple speakers “stepping on” each other's speech such that the resultant audio is incomprehensible. Furthermore, it is difficult to determine which participant(s) is speaking at any given time. Accordingly, it is difficult for a designated moderator(s) to control the flow and/or organization of the conference.
`
`
`SUMMARY OF THE INVENTION
`
The present invention provides an improved method and system for enhancing the intelligibility of a moderator during a multiparty communication session that substantially eliminate or reduce the disadvantages and problems associated with previous systems and methods. In particular, the intelligibility of audio generated by a conference moderator is improved with respect to other conference participants. This enhances the ability of the moderator to control the organization, flow and/or control of the conference.

In accordance with a particular embodiment of the present invention, a system and method for improving the intelligibility of a moderator during a multi-party communications session includes receiving a plurality of participant voice streams from a plurality of respective conference participants. An incoming moderator voice stream is also received from a moderator. The method includes processing the plurality of participant voice streams and the moderator voice stream such that the intelligibility of the moderator voice stream is enhanced relative to at least one of the participant voice streams.

In accordance with another embodiment of the present invention, the method includes storing a priority associated with each of the plurality of participant voice streams, such that at least one lowest priority voice stream may be identified. In this embodiment, the incoming moderator voice stream is detected and transmitted to the conference participants. The at least one lowest priority voice stream may be blocked from transmission while the moderator voice stream is being transmitted.
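
By way of illustration only, the blocking behavior of this embodiment might be sketched in Python as follows. The `Stream` type, the `select_streams` function and the convention that a larger number means a higher priority are assumptions made for this sketch; none of them appears in the specification.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    participant: str   # hypothetical participant identifier
    priority: int      # assumption: larger number = higher priority
    samples: list      # one frame of audio samples


def select_streams(participants, moderator_active):
    """Return the participant streams to transmit for the current frame.

    While the moderator is speaking, every stream that shares the lowest
    stored priority is blocked; otherwise all streams pass through.
    """
    if not moderator_active or not participants:
        return list(participants)
    lowest = min(s.priority for s in participants)
    return [s for s in participants if s.priority != lowest]
```

With three participants of priority 3, 1 and 2, only the priority-1 stream would be withheld while the moderator is speaking, and it would be restored as soon as the moderator stops.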

In accordance with yet another embodiment, an increase in signal strength of the incoming moderator voice stream is detected. The participant voice streams are transmitted with a diminished signal strength approximately proportional to the increase in the signal strength of the incoming moderator voice stream.
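
One possible reading of this proportional attenuation is sketched below in Python. The function name `participant_gain`, the notion of a baseline level, the proportionality constant `k` and the `floor` parameter are all assumptions introduced for illustration; the specification does not state how the increase is measured or how the attenuation is bounded.

```python
def participant_gain(moderator_level, baseline_level, k=1.0, floor=0.0):
    """Gain applied to participant streams for the current frame.

    The reduction in gain is proportional (factor k) to the relative
    increase of the moderator's level over a baseline level. The result
    is clamped to the range [floor, 1.0].
    """
    if baseline_level <= 0:
        return 1.0  # no meaningful baseline, so leave participants unchanged
    increase = max(0.0, (moderator_level - baseline_level) / baseline_level)
    return max(floor, min(1.0, 1.0 - k * increase))
```

Under these assumptions, a moderator level 20% above baseline with `k=0.5` yields a participant gain of about 0.9, and a moderator level at or below baseline leaves the participants unattenuated.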

Technical advantages of particular embodiments of the present invention include an improved method and system for improving the intelligibility of a moderator during a multiparty communication session. The present invention allows the moderator's voice stream to be enhanced with respect to other conference participants. Accordingly, conference participants can distinguish the moderator from other conference participants.

Another technical advantage of particular embodiments of the present invention includes a method for improving the intelligibility of a moderator, in which the lowest priority voice stream may be blocked while the moderator's voice stream is being transmitted. This allows the moderator to speak over one participant, while allowing the rest of the participants to continue speaking intelligibly, while the moderator is exercising control over the telephone conference.

Other technical advantages of the present invention will be readily apparent to one skilled in the art from the following figures, description and claims. Moreover, while specific advantages have been enumerated above, various embodiments may include all, some, or none of the enumerated advantages.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
For a more complete understanding of the present invention and its advantages, reference is now made to the following description taken in conjunction with the accompanying drawings, wherein like numerals represent like parts, in which:
FIG. 1 is a block diagram illustrating a communications system, in accordance with one embodiment of the present invention;
FIG. 2 is a block diagram illustrating details of the conference bridge of FIG. 1, in accordance with one embodiment of the present invention;
FIG. 3 is a block diagram illustrating a monaural mixer for the conference bridge of FIG. 2, in accordance with one embodiment of the present invention;
FIG. 4 is a block diagram illustrating a stereo mixer for the conference bridge of FIG. 2, in accordance with one embodiment of the present invention;
FIG. 5 is a block diagram illustrating spatial placements of participants in a stereo conference stream generated by the stereo mixer of FIG. 4, in accordance with one embodiment of the present invention;
FIG. 6 is a block diagram illustrating spatial movement of a conference moderator to a position of higher prominence in a stereo conference stream, in accordance with one embodiment of the present invention;
FIG. 7 is a block diagram illustrating the directional processors and summers of the stereo mixer of FIG. 4, in accordance with one embodiment of the present invention; and
FIG. 8 is a flow diagram illustrating a method for improving the intelligibility of a moderator during a conference call, in accordance with one embodiment of the present invention.
`
DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 illustrates a communication system 12 in accordance with one embodiment of the present invention. In this embodiment, the communication system 12 is a distributed system transmitting audio, video, voice, data and other suitable types of real-time and/or non real-time traffic between source and destination endpoints. Communication system 12 may be used to conduct multiple party telephone conference communication sessions. In accordance with a particular embodiment of the present invention, various components of communication system 12 may be configured to automatically improve the intelligibility of a moderator during a multi-party communication session. The disclosed embodiments allow the moderator to exercise control and influence over the telephone conference, without completely silencing all other participants. The various methods and systems by which this is accomplished are described and illustrated herein.

Referring to FIG. 1, communication system 12 includes a network 14 connecting a plurality of communication devices 16 to each other and to standard analog telephones 18 through a gateway 20 and the public switched telephone network (PSTN) 22. The communication devices 16, standard analog telephones 18 and gateway 20 are connected to the network 14 and/or PSTN 22 through twisted pair, cable, fiber optic, radio frequency, infrared, microwave and/or any other suitable type of wireline or wireless links 28.

In accordance with a particular embodiment, network 14 is the Internet, a wide area network (WAN), a local area network (LAN) or other suitable packet-switched network. In the Internet embodiment, the network 14 transmits information in Internet Protocol (IP) packets. Telephony voice information is transmitted in the Voice over IP (VoIP) format. Real-time IP packets such as VoIP packets are encapsulated in real-time transport protocol (RTP) packets for transmission over the network 14. It will be understood that the network 14 may comprise any other suitable types of elements and links and that traffic may be otherwise suitably transmitted using other protocols and formats.

The communication devices 16 comprise IP or other digital telephones, personal and other suitable computers or computing devices, personal digital assistants (PDAs), cell or other mobile telephones or handset or any other device or set of devices such as the telephone 18 and gateway 20 combination capable of communicating real-time audio, video and/or other information over the network 14. The communication devices 16 also communicate control information with the network 14 to control call setup, teardown and processing as well as call services.

For voice calls, the communication devices 16 comprise real-time applications that play traffic as it is received or substantially as it is received and to which packet delivery cannot be interrupted without severely degrading performance. A codec (coder/decoder) converts audio, video or other suitable signals generated by users from analog signals into digital form. The digital encoded data is encapsulated into IP or other suitable packets for transmission over the network 14. IP packets received from the network 14 are converted back into analog signals and played to the user. It will be understood that the communication devices 16 may otherwise suitably encode and decode signals transmitted over or received from the network 14.

The gateway 20 provides conversion between analog and/or digital formats. The standard analog telephones 18 communicate standard telephony signals through PSTN 22 to the gateway 20. At the gateway 20, the signals are converted to IP packets in the VoIP format. Similarly, VoIP packets received from the network 14 are converted into standard telephony signals for delivery to the destination telephone 18 through PSTN 22. The gateway 20 also translates between the network call control system and the Signaling System 7 (SS7) protocol and/or other signaling protocols used in PSTN 22.

In one embodiment, the network 14 includes a call manager 30 and a conference bridge 32. The call manager 30 and the conference bridge 32 may be located in a central facility or have their functionality distributed across and/or at the periphery of the network 14. The call manager 30 and the conference bridge 32 are connected to the network 14 by any suitable type of wireline or wireless link. In another embodiment, the network 14 may be operated without the call manager 30, in which case the communication devices 16 may communicate control information directly with each other or with other suitable network elements. In this embodiment, services are provided by the communication devices 16 and/or other suitable network elements.

The call manager 30 manages calls in the network 14. A call is any communication session between two or more parties. The parties may be persons and/or equipment such as computers. The sessions may include real-time connections, connections having real-time characteristics, non real-time connections and/or a combination of connection types.

The call manager 30 is responsive to service requests from the communication devices 16, including the standard telephones 18 through the gateway 20. For example, the call manager 30 may provide voicemail, bridging, multicasting, call hold, conference call and other multiparty communications and/or other suitable services for the communications devices 16. The call manager 30 provides services by performing the services, controlling performance of the services, delegating performance of the services and/or by otherwise initiating the services.

The conference bridge 32 provides conference call and other suitable audio, video, and/or real-time multiparty communication sessions between communication devices 16. A multiparty communication session includes three or more parties exchanging audio and/or other suitable information. In particular, the conference bridge 32 receives media from participating devices 16 and, using suitable signal processing techniques, mixes the media to produce conference signals. During normal operation, each device 16 receives a conference signal that includes contributions from all other participating devices. As used herein, the term each means every one of at least a subset of the identified items.

As described in more detail below, the conference bridge 32 improves the intelligibility of a moderator during multiparty communication sessions. In particular, the conference bridge 32 provides a system and method for enhancing the voice stream of the conference moderator, with respect to voice streams of other participants.

In operation, a call initiation request is first sent to the call manager 30 when a call is placed over the network 14. The call initiation request may be generated by a communication device 16 and/or the gateway 20 for telephones 18. Once the call manager 30 receives the call initiation request, the call manager 30 sends a signal to the initiating communication device 16 and/or gateway 20 for telephones 18 offering to call the destination device. If the destination device can accept the call, the destination device replies to the call manager 30 that it will accept the call. Upon receiving this acceptance, the call manager 30 transmits a signal to the destination device causing it to ring. When the call is answered, the call manager 30 instructs the called device and the originating device to begin media streaming to each other. If the originating device is a PSTN telephone 18, the media streaming occurs between the gateway 20 and the destination device. The gateway 20 then transmits the media to the telephone 18.

For conference calls, the call manager 30 identifies participants based on the called number or other suitable criteria. The call manager 30 controls the conference bridge 32 to set up, process and tear down conference calls and other multiparty communication sessions. During the multiparty communications sessions, participants are connected and stream media through the conference bridge 32. The media is cross connected and mixed to produce conference output streams for each participant. The conference output stream for a participant includes the media of all other participants, a subset of other participants or other suitable mix dictated by the type of multiparty session and/or the participant.

FIG. 2 illustrates details of the conference bridge 32 in accordance with a particular embodiment of the present invention. In this embodiment, the conference bridge 32 provides real-time multiparty audio connections between three or more participants. It will be understood that the conference bridge 32 may support other types of suitable multiparty communications sessions including real-time audio streams and/or video streams, without departing from the scope of the present invention.

Referring to FIG. 2, conference bridge 32 includes a controller 50, buffers 52, converters 54, normalizer 56, mixer 58 and database 60. The controller 50, buffers 52, converters 54, normalizer 56, mixer (e.g., adaptive summers) 58 and database 60 of the conference bridge as well as other suitable components of the communications system 12 may comprise logic encoded in media. Logic comprises functional instructions for carrying out programmed tasks. The media comprises computer disks or other suitable computer-readable media, application specific integrated circuits (ASIC), field programmable gate arrays (FPGA) or other suitable specific or general purpose processors, transmission media or other suitable media in which logic may be encoded and utilized.

The controller 50 directs the other components of the conference bridge 32 and communicates with the call manager 30 to set up, process and tear down conference calls. The controller 50 also receives information regarding the priority of each participant either directly from the communication devices 16 or through the call manager 30. Such information is stored in the database 60.

The buffers 52 include input and output buffers. The input buffers receive and buffer packets of input audio streams from participants for processing by the conference bridge 32. The output buffers receive and buffer conference output streams generated by the conference bridge 32 for transmission to participants. In a particular embodiment, a particular input buffer or set of input buffer resources is assigned to each audio input stream and a particular output buffer or set of output buffer resources is assigned to each conference output stream. The input and output buffers may be associated with corresponding input and output ports or interfaces and perform error check, packet loss prevention, packet ordering and congestion control functions.

The converters 54 include input and output converters. The input converters receive input packets of a participant from a corresponding buffer and convert the packets from the native format of the participant's device 16 to a standard format of the conference bridge 32 for cross linking and processing in the conference bridge 32. Conversely, the output converters receive conference output streams for participants in the standard format and convert the conference output streams to the native format of the participants' devices. In this way, the conference bridge 32 allows participants to connect using a variety of devices and technologies.

The normalizers 56 include input and/or output normalizers. The normalizers receive packets from the input audio streams in a common format and normalize the timing of the packets for cross connections in the mixer 58.

The mixer 58 includes a plurality of summers or other suitable signal processing resources each operable to sum, add or otherwise combine a plurality of input streams into conference output streams for participants to a conference call. As described in more detail below, the mixer 58 may be a monaural mixer or a stereo mixer. Once the mixer 58 has generated the conference output streams, each conference output stream is converted by a corresponding converter and buffered by a corresponding output buffer for transmission to the corresponding participant.

The database 60 includes a set of conference parameters 62 for each ongoing conference call of the conference bridge 32. The conference parameters 62 for each conference call include an identification of participants 64 and the priority assigned to each participant (priority parameters) 66 for the conference call. In one embodiment, the participants are identified at the beginning of a conference call based on caller ID, phone number or other suitable identifier. The priority parameters may be initially set to a default.
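
As a rough illustration of how the conference parameters might be organized, the following Python sketch stores, per conference call, the identified participants and a priority for each. The class name `ConferenceParameters`, the `DEFAULT_PRIORITY` value and the use of plain strings as identifiers are assumptions made for this sketch; the specification does not prescribe a data layout.

```python
from dataclasses import dataclass, field

DEFAULT_PRIORITY = 0  # assumption: the default value is not given in the text


@dataclass
class ConferenceParameters:
    """Per-conference record: participants (64) and their priorities (66)."""
    participants: list = field(default_factory=list)  # e.g. caller ID strings
    priorities: dict = field(default_factory=dict)    # identifier -> priority

    def add_participant(self, identifier, priority=DEFAULT_PRIORITY):
        # A participant is listed once; a repeated call updates the priority.
        if identifier not in self.priorities:
            self.participants.append(identifier)
        self.priorities[identifier] = priority


# Hypothetical database keyed by a conference identifier.
database = {"conf-1": ConferenceParameters()}
database["conf-1"].add_participant("408-555-0100")
database["conf-1"].add_participant("408-555-0101", priority=5)
```

Keeping the priorities in a mapping makes it straightforward for a controller to update a participant's priority during the call without touching the participant list.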

FIG. 3 illustrates components and operation of the mixer 58 in a monaural embodiment of the present invention. In particular, FIG. 3 illustrates details of a monaural mixer 80 in accordance with a particular embodiment. It will be understood that a monaural mixer may be otherwise suitably implemented without departing from the scope of the present invention.

Referring to FIG. 3, the monaural mixer 80 receives participant input streams 84 and combines the streams in summers 82 to generate conference output streams 86 for each participant to a conference call. In one embodiment, each participant is assigned a summer 82 that receives audio input streams from each other participant to the conference call. The summer 82 combines the audio input streams to generate a conference output stream for delivery to the participant.

During normal operation, each participant receives the audio input of each other participant. Thus, for example, the conference output stream of participant 1 includes the audio inputs of participants 2–5. Similarly, the conference output stream of participant 2 includes the audio inputs of participants 1 and 3–5. The conference output stream of participant 3 includes the audio inputs of participants 1–2 and 4–5. The conference output stream of participant 4 includes the audio inputs of participants 1–3 and 5. The conference output stream of participant 5 includes the audio inputs of participants 1–4.
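
The per-participant summing described above can be sketched in Python as follows. This is a minimal illustration only: the function name `conference_outputs` and the representation of each input stream as a list of samples for one frame are assumptions, not details taken from the specification.

```python
def conference_outputs(inputs):
    """Build one output frame per participant from everyone else's input.

    `inputs` maps a participant identifier to a list of samples for the
    current frame; all frames are assumed to have the same length.
    """
    outputs = {}
    for listener in inputs:
        # One "summer" per listener: it receives every stream except the
        # listener's own and adds them sample by sample.
        others = [frame for p, frame in inputs.items() if p != listener]
        outputs[listener] = [sum(samples) for samples in zip(*others)]
    return outputs
```

For example, with inputs `{1: [1, 2], 2: [10, 20], 3: [100, 200]}`, the output for participant 1 is `[110, 220]`, which contains only the contributions of participants 2 and 3.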

The audio input 84 of the conference moderator may be amplified and/or the audio input 84 of the remaining participants attenuated to focus on or provide higher prominence to the audio input 84 of the conference moderator. A higher prominence is provided by increasing the intelligibility of the moderator relative to the remaining participants.

For a conference moderator, the audio streams may be made prominent in the conference output stream by amplifying the voice input stream of the moderator or by attenuating voice input streams 90 of the other participants. For example, the voice input stream of the moderator may be multiplied by “1.2” while the voice input streams of the other conference participants are multiplied by “0.8.” Other methods for enhancing the intelligibility of the moderator with respect to other conference participants will be described with regard to FIGS. 4–8.
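
The gain example above can be illustrated with a short Python sketch. The multipliers 1.2 and 0.8 come from the text; the function name `mix_for_listener` and the frame representation (a list of samples per participant) are assumptions made for illustration.

```python
MODERATOR_GAIN = 1.2    # multiplier from the example in the text
PARTICIPANT_GAIN = 0.8  # multiplier from the example in the text


def mix_for_listener(inputs, listener, moderator):
    """Sum all streams except the listener's own, weighting the moderator up.

    `inputs` maps participant identifiers to equal-length sample lists.
    """
    mixed = None
    for participant, frame in inputs.items():
        if participant == listener:
            continue  # a listener does not receive their own audio
        gain = MODERATOR_GAIN if participant == moderator else PARTICIPANT_GAIN
        scaled = [gain * sample for sample in frame]
        mixed = scaled if mixed is None else [a + b for a, b in zip(mixed, scaled)]
    return mixed if mixed is not None else []
```

With these multipliers the moderator's contribution is 1.5 times as strong as that of any other participant at equal input level, while the overall level of the mix changes only modestly.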

FIGS. 4–7 illustrate components and operation of the mixer 58 in a stereo embodiment of the present invention. In particular, FIG. 4 illustrates details of a stereo mixer 100 in accordance with a particular embodiment. FIG. 5 illustrates spatial positioning of participant audio in a stereo conference stream of a conference call participant. FIG. 7 illustrates details of a directional processor 106 and a summer 108 of the stereo mixer 100.

Referring to FIG. 4, the stereo mixer 100 receives participant input streams 102 and generates stereo conference output streams 104 using the directional processors 106 and the summers 108. In one embodiment, each participant is assigned a directional processor 106 and a summer 108. The directional processor 106 receives audio input streams 102 from other participants to the conference call and generates spatially positioned stereo streams that are combined by the summer 108 to generate the stereo conference output streams 104. Each stereo conference output stream 104 includes a left (L) and a right (R) channel.
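
A simplified Python sketch of this stereo path is given below. It uses a constant-power pan law as a stand-in for the spatial positioning step, which is an assumption made for illustration; the function names `pan` and `stereo_output` and the `positions` mapping (from -1.0 for full left to +1.0 for full right) are likewise hypothetical.

```python
import math


def pan(frame, position):
    """Place a monaural frame in the stereo field.

    `position` runs from -1.0 (full left) to +1.0 (full right). The
    constant-power law used here is an illustrative choice.
    """
    angle = (position + 1.0) * math.pi / 4.0
    left_gain, right_gain = math.cos(angle), math.sin(angle)
    return [left_gain * s for s in frame], [right_gain * s for s in frame]


def stereo_output(inputs, positions, listener):
    """Sum the panned streams of all other participants into L and R."""
    length = len(next(iter(inputs.values())))
    left, right = [0.0] * length, [0.0] * length
    for participant, frame in inputs.items():
        if participant == listener:
            continue  # the listener's own audio is not returned to them
        l, r = pan(frame, positions[participant])
        left = [a + b for a, b in zip(left, l)]
        right = [a + b for a, b in zip(right, r)]
    return left, right
```

A source at position 0.0 contributes equally to both channels, while a source at -1.0 appears only in the left channel, so listeners can separate speakers by direction.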

During normal operation, each participant receives the audio input of each other participant to a conference call. Thus, for example, the stereo conference output stream for participant 1 includes the audio inputs of participants 2–5. Similarly, the stereo conference output stream for participant 2 includes the audio inputs from participants 1 and 3–5. The stereo conference output stream for participant 3 includes the audio inputs of participants 1–2 and 4–5. The stereo conference output stream for participant 4 includes the audio inputs from participants 1–3 and 5. The stereo conference output stream for participant 5 includes the audio inputs from participants 1–4.

Referring to FIG. 5, each stereo conference output stream 104 includes audio inputs or sources 120 from the other participants or groups of participants that are perceived by the listener 122 as coming from different spatial locations. The spatial locations vary from front to back in the listener's depth perception and from left to right in the listener's lateral directivity. Because the sound sources are spatially separated, the listener 122 can more easily focus on individual sound sources of auditory information in the presence of other sound sources. Thus, the spatial separation of the sound sources 120 increases the ability of the listener 122 to differentiate between the multiple sound sources 120.

In the illustrated embodiment, each participant 1–4 is spatially positioned in front and at an equal distance from the participant 5. In this configuration, each participant 1–4 has an equal degree or substantial degree of prominence with respect to the participant 5. As described in more detail below, participants 1–4 in the stereo conference output stream 104 may be repositioned to the foreground to provide a higher degree of intelligibility and prominence to participant 5.

Referring to FIG. 6, the output stream 104 for participant 5, for example, includes the audio input of moderator 3 in the foreground with the other participants 1, 2 and 4 in the background. The foreground position provides participant 5 or other listener 160 with the highest degree of intelligibility such that the listener may focus on moderator 3 or other moderator audio sources 162 while still hearing non-moderator sources 164 in the background. It will be understood that moderator 162 may be otherwise suitably positioned in the output stream(s) 104, without departing from the scope of the present invention.

Referring to FIG. 7, the directional processor 106 of the stereo mixer 100 includes a plurality of spatial processors 180 and the summer 108 includes left and right channel summers 182. The spatial processors 180 each present monaural sources at different locations in a binaural sound field using standard intensity panning and/or Head Related Transfer Function (HRTF) position filtering. The binaural sound streams each include left and right channel components 184 generating a perceived position such as, for example, back/left, front/center and back/right. The left channel of each binaural stream is fed to the left channel summer 182 while the right channels are fed to the right channel summer 182. The summers 182 generate a combined left stream 186 and combined right stream 188 including a perceived plurality of discrete audio inputs spatially positioned in two or three dimensional space relative to the listener. Further information regarding the directional processor 106 a
