US006192395B1

(12) United States Patent
Lerner et al.

(10) Patent No.:     US 6,192,395 B1
(45) Date of Patent: Feb. 20, 2001

(54) SYSTEM AND METHOD FOR VISUALLY
     IDENTIFYING SPEAKING PARTICIPANTS
     IN A MULTI-PARTICIPANT NETWORKED
     EVENT

(75) Inventors: Edward A. Lerner, San Francisco, CA
     (US); Arthur W. Min, Kirkland, WA
     (US); James B. C. Morris, San Jose,
     CA (US)

 4,831,618    5/1989   Celli
 5,020,098    9/1995   Bieselin et al.
 5,558,875    4/1997   Fenton et al.
 5,619,555
 5,627,978 *  5/1997   Altom et al. .............. 395/330
 5,668,863    9/1997   Bieselin et al.
 5,710,591    1/1998   Bruno et al.
 5,894,306    4/1999   Ichimura
 5,991,385   11/1999   Dunn et al.
 5,999,208 * 12/1999   McNerney et al. ........... 348/15

(73) Assignee: Multitude, Inc., South San Francisco,
     CA (US)

* cited by examiner

(*) Notice: Under 35 U.S.C. 154(b), the term of this
     patent shall be extended for 0 days.

(21) Appl. No.: 09/465,717

(22) Filed: Dec. 17, 1999

(60) Related U.S. Application Data
     Provisional application No. 60/113,644, filed on Dec. 23,
     1998.

(51) Int. Cl.7 ................................. G06F 13/00
(52) U.S. Cl. ................ 709/204; 709/205; 709/314;
     379/202
(58) Field of Search ................... 709/204, 205,
     709/218, 223, 234, 313, 314, 328, 329;
     379/202

(56)                References Cited

              U.S. PATENT DOCUMENTS

 4,274,155    6/1981   Funderburk et al.

Primary Examiner—Viet D. Vu
(74) Attorney, Agent, or Firm—Pennie & Edmonds LLP

(57)                   ABSTRACT

A method of visually identifying speaking participants in a
multi-participant event such as an audio conference or an
on-line game includes the step of receiving packets of
digitized sound from a network connection. The identity of
the participant associated with each packet is used to route
the packet to a channel buffer or an overflow buffer. Each
channel buffer may be assigned to a single participant in the
multi-participant event. A visual identifier module updates
the visual identifier associated with participants that have
been assigned a channel buffer. In some embodiments, the
appearance of the visual identifier associated with the
participant is dependent upon the differential of an acoustic
parameter derived from content in the associated channel
buffer and a reference value stored in a participant record.

25 Claims, 7 Drawing Sheets

[Representative drawing: sound control module with packet controller 56, channel buffers 52-1 and 52-2, visual identification module 60, sound mixer 66, output to user I/O device 402, output device 38, and participant data structure 46]

0001

Apple 1021
U.S. Pat. 8,243,723

U.S. Patent          Feb. 20, 2001          Sheet 1 of 7          US 6,192,395 B1

[FIG. 1: client/server apparatus for identifying speaking participants, showing client computers 22, transmission channel 84, sound control module 48, receive sound buffers 50 with channel buffers 52-1 to 52-N, packet controller 56, visual identifier module 60, sound mixer 66, transmit sound buffer 62, operating system 40, participant data structure 46, and server 24]

U.S. Patent          Feb. 20, 2001          Sheet 2 of 7          US 6,192,395 B1

[FIG. 2: participant data structure 46 with records 202-1 to 202-N; each record comprises participant source identifier 204, visual ID 206, visual ID window 208, visual ID position 210, and visual ID state 212]

U.S. Patent          Feb. 20, 2001          Sheet 3 of 7          US 6,192,395 B1

[FIG. 3: flow diagram of updating the visual identifier on a client; labeled steps include "Send signal to other users" and "Cnt > Threshold" (314)]

U.S. Patent          Feb. 20, 2001          Sheet 4 of 7          US 6,192,395 B1

[FIG. 4: sound control module interfaces: network interface 34, packet controller 56, channel buffers 52-1 to 52-N, visual identification module 60, sound mixer 66, output to user I/O device 402, participant data structure 46, and output device 38]

U.S. Patent          Feb. 20, 2001          Sheet 5 of 7          US 6,192,395 B1

[FIG. 5: channel buffer 52-1 holding packets 502-1 through 502-N as a first-in first-out queue; packets are enqueued at the tail and dequeued from the head]

U.S. Patent          Feb. 20, 2001          Sheet 6 of 7          US 6,192,395 B1

[FIG. 6: processing steps for identifying speaking participants: set the visual identifier state of each participant in the participant data structure to state 2 (602); identify the participant associated with the tail packet in buffer[i] (606); set the visual ID state of the identified participant to state 1 (608); repeat until i > number of buffers (622); update the display of the visual identifier of each participant on the output device based upon the updated visual ID (624)]

U.S. Patent          Feb. 20, 2001          Sheet 7 of 7          US 6,192,395 B1

[FIGS. 7A-7C: stylized visual identifiers 206-1 through 206-N for Participant 1 through Participant N]

SYSTEM AND METHOD FOR VISUALLY
IDENTIFYING SPEAKING PARTICIPANTS
IN A MULTI-PARTICIPANT NETWORKED
EVENT

CROSS-REFERENCE TO RELATED
DOCUMENTS

The present invention is related to the subject matter
disclosed in U.S. patent application Ser. No. 09/358,877
("Apparatus and Method for Creating Audio Forums") filed
Jul. 22, 1999 and U.S. patent application Ser. No. 09/358,
878 ("Apparatus and Method for Establishing An Audio
Conference in a Networked Environment") filed Jul. 22,
1999. The present invention is also related to the subject
matter disclosed in U.S. Pat. No. 5,764,900 ("System and
Method for Communicating Digitally-Encoded Acoustic
Information Across a Network between Computers"). These
related documents are commonly assigned and hereby incor-
porated by reference.
This application claims priority to the provisional patent
application entitled, "System and Method For Visually Iden-
tifying Speaking Participants In a Multi-Participant Net-
worked Event," Ser. No. 60/113,644, filed Dec. 23, 1998.

BRIEF DESCRIPTION OF THE INVENTION

The present invention discloses an apparatus and method
for identifying which participants in a multi-participant
event are speaking. Exemplary multi-participant events
include audio conferences and on-line games.

BACKGROUND OF THE INVENTION

Historically, multi-participant events such as multi-party
conferences have been hosted using Public Switched Tele-
phone Networks (PSTNs) and/or commercial wireless net-
works. Although such networks allow multiple participants
to speak at once, they are unsatisfactory because they
provide no means for visually identifying each participant in
the event. More recently, teleconferencing systems that rely
on Internet Protocol based networks have been introduced.
Such systems, which enable two or more persons to speak to
each other using the Internet, are often referred to as
"Internet telephony."
Multi-participant events include audio conferences and
on-line games. Such events typically rely on the conversion
of analog speech to digitized speech. The digitized speech is
routed to all other participants across a network using the
Internet Protocol ("IP") and "voice over IP" or "VOIP"
technologies. Accordingly, each participant in the multi-
participant event has a client computer. When a participant
speaks, the speech is digitized and broken down into packets
that may be transferred to other participants using a protocol
such as IP, transmission control protocol (TCP), or user
datagram protocol (UDP). See, for example, Peterson &
Davie, Computer Networks, 1996, Morgan Kaufmann
Publishers, Inc., San Francisco, Calif.
While prior art Internet telephony is adequate for limited
purposes, such as a basic two-party conference call in which
only one participant speaks at any given time, prior art
telephony systems are unsatisfactory. First, they frequently
do not permit multiple participants to speak at the same time
without data loss. That is, if one participant speaks, the
participant typically cannot hear what other people said
while the participant was speaking. Second, prior art tele-
phony does not adequately associate a visual identifier with
each participant. Therefore, when a multi-participant event
includes several participants, they have difficulty determin-
ing who is speaking. Some Internet telephony systems have
attempted to remedy this deficiency by requiring (i) that only
one speaker talk at any given time and/or by (ii) posting, on
each client associated with a participant in the multi-
participant event, the icon of the current speaker. However,
such solutions to the problem in the art are unsatisfactory
because effective multi-participant communication requires
that there be an ability for multiple people to simultaneously
speak. Therefore, the concept of waiting in line for the
chance to speak is not a satisfactory solution to the problem
in the art.
A third drawback with prior art systems is that they
provide no mechanism for associating the characteristics of
a participant with a visual identifier that is displayed on the
client associated with each participant in the multi-
participant event. Such characteristics could be, for
example, a visual representation of how loudly a particular
speaker is speaking relative to some historical base state
associated with the participant. A fourth drawback of prior
art Internet telephony systems is that they provide an unsat-
isfactory privilege hierarchy for dictating who may partici-
pate in a particular multi-participant event. For example, in
typical prior art systems, there is no privilege hierarchy and
any user, i.e. the public, may join the multi-participant event.
Such multi-participant events can be designated as "public
forums." While public forums serve a limited purpose, they
suffer from the drawback that there is no protection against
hecklers or otherwise disruptive participants in the event. To
summarize this point, prior art systems are unsatisfactory
because they do not provide a set of hierarchical privileges
that are associated with a participant and that allow partici-
pants to designate events as private, public, or moderated. As
used in this context, private events include conference calls
in which the participants are preselected, typically by each
other. Other users of a system may not join the event unless
invited by one of the existing participants. Public events are
those in which anyone can join and speak at any time.
Moderated events may be public or private, but require that
at least one participant be given enhanced privileges, such as
the privilege to exclude particular participants, invite par-
ticipants, or grant and deny speaking privileges to partici-
pants.
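The private/public/moderated distinction described above can be sketched in a few lines. The following is a minimal illustration only; the class, field, and method names are hypothetical and the patent does not prescribe an implementation:

```python
from dataclasses import dataclass, field
from enum import Enum

class EventKind(Enum):
    PUBLIC = "public"        # anyone may join and speak at any time
    PRIVATE = "private"      # participants are preselected / invited
    MODERATED = "moderated"  # at least one participant holds enhanced privileges

@dataclass
class Event:
    kind: EventKind
    participants: set = field(default_factory=set)
    moderators: set = field(default_factory=set)
    invited: set = field(default_factory=set)

    def may_join(self, user):
        # Public events are open; private and moderated events admit only
        # users invited by an existing participant or moderator.
        if self.kind is EventKind.PUBLIC:
            return True
        return user in self.invited

    def invite(self, inviter, user):
        # In a private event any existing participant may invite; in a
        # moderated event the inviter must hold moderation privileges.
        if self.kind is EventKind.MODERATED and inviter not in self.moderators:
            raise PermissionError("moderation privileges required")
        if self.kind is EventKind.PRIVATE and inviter not in self.participants:
            raise PermissionError("only existing participants may invite")
        self.invited.add(user)
```

A moderator could additionally be given exclude and mute operations following the same pattern of privilege checks.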
What is needed in the art is an Internet telephony system
and method that provides the tools necessary to conduct an
effective multi-participant event. Such a system should not
have limitations on the number of participants that may
concurrently speak. Further, such a system should provide
an adequate way of identifying the participants in the
multi-participant event.

SUMMARY OF THE INVENTION

The system and method of the present invention addresses
the need in the art by providing an Internet telephony system
and method that visually identifies the participants in a
multi-participant event. In the present invention, there is no
limitation on the number of participants that may concur-
rently speak. Each participant in a multi-participant event is
associated with a visual identifier. The visual identifier of
each participant is displayed on the client display screen of
the respective participants in the multi-participant event. In
one embodiment, at least one characteristic of the participant
is reflected in the visual identifier associated with the
participant. Further, the system and method of the present
invention addresses the unmet need in the art by providing
participants with the flexibility to assign a privilege hierar-
chy. Using this privilege hierarchy, events may be desig-
nated as public, private, or moderated and selected partici-
pants may be granted moderation privileges.
A system in accordance with one aspect of the present
invention includes a participant data structure comprising a
plurality of participant records. Each participant record is
associated with a different participant in a multi-participant
event. Multi-participant events of the present invention
include audio conferences and on-line games. Further, sys-
tems in accordance with one aspect of the present invention
include an application module, which provides a user inter-
face to the multi-participant event, and a sound control
module that is capable of receiving packets from a network
connection. Each packet is associated with a participant in
the multi-participant event and includes digitized speech
from the participant. The sound controller has a set of
buffers. Each buffer preferably manages packets as a first-in
first-out queue. The sound controller further includes a
packet controller that determines which participant is asso-
ciated with each packet that has been received by the
network connection. The sound controller routes the packet
to a buffer based on the identity of the participant associated
with the packet. The sound controller also includes a visual
identification module for determining which participants in
the multi-participant event are speaking. The visual identi-
fication module updates the visual identifier associated with
each participant that is speaking to reflect the fact that they
are speaking. Further, the visual identification module
updates the visual identifier associated with each participant
that is not speaking to reflect the fact that they are not
speaking. Finally, systems in accordance with a preferred
embodiment of the present invention include a sound mixer
for mixing digitized speech from at least one of the buffers
to produce a mixed signal that is presented to an output
device.
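The first-in first-out buffer behavior described above (and depicted in FIG. 5, where packets are enqueued at the tail and dequeued from the head) can be sketched as follows. This is an illustrative sketch only; the class and method names are hypothetical, and the patent does not prescribe an implementation language:

```python
from collections import deque

class ChannelBuffer:
    """A receive sound buffer that manages packets as a first-in
    first-out queue: packets are enqueued at the tail and dequeued
    from the head."""

    def __init__(self):
        self._packets = deque()

    def enqueue(self, participant_id, payload):
        self._packets.append((participant_id, payload))   # tail

    def dequeue(self):
        return self._packets.popleft()                    # head

    @property
    def participant(self):
        # The buffer is associated with the participant whose packets
        # it holds; an empty buffer is "available" (bound to no one).
        return self._packets[-1][0] if self._packets else None
```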
In some embodiments of the present invention, the par-
ticipant record associated with each participant includes a
reference speech amplitude. In such embodiments, the visual
identification module determines a buffered speech ampli-
tude based upon a characteristic of digitized speech in at
least one packet, associated with said participant, that is
managed by a buffer and computes a speech amplitude
differential based on (i) the buffered speech amplitude and
(ii) the reference speech amplitude stored in the participant
record. The visual identifier associated with the participant
is updated based on this speech amplitude differential.
Further, the buffered speech amplitude is saved as a new
reference speech amplitude in the participant record asso-
ciated with the participant.
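The amplitude differential described in this paragraph can be sketched as follows. The root-mean-square measure and the function names are assumptions for illustration; the patent leaves the exact acoustic characteristic open:

```python
import math

def buffered_speech_amplitude(samples):
    # One possible characteristic: root-mean-square amplitude of the
    # digitized speech managed by the buffer.
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def speech_amplitude_differential(record, samples):
    amplitude = buffered_speech_amplitude(samples)
    # (i) buffered amplitude minus (ii) the reference amplitude stored
    # in the participant record; the result drives the visual identifier.
    differential = amplitude - record["reference_amplitude"]
    # The buffered amplitude is saved as the new reference amplitude.
    record["reference_amplitude"] = amplitude
    return differential
```

Because the reference is overwritten on each update, the differential reflects the change in loudness since the last update rather than absolute loudness.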
In a method in accordance with the present invention,
packets are received from a remote source and an identity
associated with each packet is determined. In one
embodiment, this identity does not disclose the true identity
of the participant. In such embodiments, the identity could
be a random number assigned to the participant for the
duration of the multi-participant event. The identity of each
packet received from the remote source is compared with an
identity associated with a channel buffer. The identity asso-
ciated with each channel buffer is determined by an identity
of a packet stored by the channel buffer. In one embodiment,
a channel buffer is reserved for a single participant in the
multi-participant event at any given time. When the channel
buffer is not storing a packet, the channel buffer is not
associated with a participant and is considered "available."
Packets are routed based on the following rules:
(i) to a channel buffer when the identity of the packet
matches the identity associated with a channel buffer;
(ii) to an available channel buffer when the identity of the
packet does not match an identity of a channel buffer;
and
(iii) to an overflow buffer when the identity of the packet
does not match an identity of a channel buffer and there
is no available channel buffer.
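Routing rules (i) through (iii) can be expressed directly in code. The following is a minimal sketch; the `Buffer` class and `route` function are hypothetical names, and here a buffer's identity is read from its tail packet, as in FIG. 6:

```python
from collections import deque

class Buffer:
    def __init__(self):
        self.packets = deque()

    @property
    def participant(self):
        # Identity of the buffer = identity of a packet it stores;
        # an empty buffer has no participant and is "available".
        return self.packets[-1][0] if self.packets else None

def route(packet, channel_buffers, overflow_buffer):
    identity = packet[0]
    # (i) identity matches the identity associated with a channel buffer
    for buf in channel_buffers:
        if buf.participant == identity:
            buf.packets.append(packet)
            return buf
    # (ii) no match: route to an available channel buffer
    for buf in channel_buffers:
        if buf.participant is None:
            buf.packets.append(packet)
            return buf
    # (iii) no match and no available channel buffer: overflow
    overflow_buffer.packets.append(packet)
    return overflow_buffer
```

With four channel buffers, as in the preferred embodiment, a fifth concurrent speaker would fall through to the overflow buffer.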
A different visual identifier is associated with each par-
ticipant in the multi-participant event. In some embodiments
of the present invention, the appearance of the visual iden-
tifier is determined by whether the identity of the participant
associated with the visual identifier matches an identity
associated with a channel buffer. In other embodiments of
the present invention, the appearance of the visual identifier
is determined by the difference between an acoustic param-
eter derived from digitized speech in a channel buffer
associated with the participant and a reference acoustic
parameter stored in a participant record.

BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the invention, reference
should be made to the following detailed description taken
in conjunction with the accompanying drawings, in which:
FIG. 1 illustrates a system for identifying which partici-
pants in a multi-participant event are speaking in accordance
with one embodiment of the invention.
FIG. 2 illustrates a participant data structure in accordance
with one embodiment of the invention.
FIG. 3 illustrates a flow diagram of the processing steps
associated with updating the visual identifier associated with
a participant on a client computer in accordance with one
embodiment of the invention.
FIG. 4 is a more detailed view of how a sound control
module interfaces with other components of memory in a
client computer in accordance with one embodiment of the
invention.
FIG. 5 illustrates the structure of a channel buffer in one
embodiment of the invention.
FIG. 6 illustrates the processing steps associated with
identifying which participants in a multi-participant event
are speaking in accordance with one embodiment of the
present invention.
FIGS. 7A-7C are a stylized illustration of the visual
identifiers associated with N participants in a multi-
participant event.
Like reference numerals refer to corresponding parts
throughout the several views of the drawings.

DETAILED DESCRIPTION OF THE
INVENTION

FIG. 1 illustrates a client/server computer apparatus 10
incorporating the technology of the present invention. The
apparatus 10 includes a set of client computers 22 which are
each linked to a transmission channel 84. The transmission
channel 84 generically refers to any wire or wireless link
between computers. The client computers 22 use transmis-
sion channel 84 to communicate with each other in a
multi-participant event. The multi-participant event could be
regulated by a server computer 24 or other server computers
designated by server computer 24.
Each client computer 22 has a standard computer con-
figuration including a central processing unit (CPU) 30,
network interface 34, and memory 32. Memory 32 stores a
set of executable programs. Client computer 22 also
includes input/output device 36. Input/output device 36 may
include a microphone, a keyboard, a mouse, a display 38,
and/or one or more speakers. In one embodiment, the
microphone is PC 99 compliant with a close speaking
headset design having a full scale output voltage of 100 mV.
Further, in one embodiment, the microphone has a frequency
response of ±5 dB from 100 Hz to 10 kHz, ±3 dB from 300
Hz to 5 kHz and 0 dB at 1 kHz. The microphone has been
implemented with a minimum sensitivity of -44 dB relative
to 1 V/Pa. CPU 30, memory 32, network interface 34 and
input/output device 36 are connected by bus 68. The execut-
able programs in memory 32 include operating system 40,
an application module 44 for providing a user interface to
the multi-participant event, a participant data structure 46 for
storing information about each participant in a multi-
participant event, and a sound control module 48. Sound
control module 48 receives sound from remote participants
through network interface 34 and transmits sound from the
local participant, which is associated with client 22, to
remote participants across transmission channel 84. Memory
32 also includes sound mixer 66 for combining the sound of
each participant in the multi-participant event into a single
signal that is sent to input/output device 36. In a preferred
embodiment, operating system 40 is capable of supporting
multiple concurrent processes or threads and includes sound
mixer 66. In an even more preferred embodiment, operating
system 40 is a WIN32 environment or an environment that
provides functionality equivalent to WIN32.
FIG. 1 illustrates that each client 22 is associated with a
local participant in the multi-participant event. The local
participant uses input/output device 36 to communicate to
remote participants in the multi-participant event via trans-
mission channel 84. Sound control module 48 has instruc-
tions for routing sound from the local participant to the
remote participants and for receiving sound from remote
participants. To receive sound from remote participants,
sound control module 48 includes a plurality of receive
sound buffers 50. In a preferred embodiment, one of the
receive sound buffers is an overflow buffer 54 and each of
the remaining receive sound buffers is a channel buffer 52.
In a preferred embodiment, receive sound buffers 50 com-
prise four channel buffers 52 and one overflow buffer 54.
Sound control module 48 further includes a packet controller
56 for determining the participant associated with a packet
of sound received from a remote participant and for routing
the packet to the appropriate receive sound buffer 50. In
addition, sound control module 48 includes a visual identi-
fier module 60 that determines which participants in a
multi-participant event are speaking.
Sound from the local participant is stored in a transmit
sound buffer 62 and routed to the appropriate destination by
transmit router 64. Transmit router 64 breaks the signal in
transmit sound buffer 62 into packets and places the appro-
priate header in each packet. Typically, the header includes
routing information that will cause the packet to be sent to
server 24 via transmission channel 84. Server 24 will then
route the packet to all participants in the multi-participant
event. However, in some embodiments, transmit router 64
may direct the packets to other clients 22 directly instead of
through server 24.
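The patent does not specify a wire format for these packets; as an illustration only, a header carrying a source identifier and a sequence number ahead of the digitized-speech payload might be framed like this (the field layout is an assumption):

```python
import struct

# Hypothetical header: 4-byte source identifier, 4-byte sequence number,
# both network byte order, followed by the digitized-speech payload.
HEADER = struct.Struct("!II")

def make_packet(source_id, seq, speech):
    # Prepend the routing header that lets server 24 (or a peer client)
    # forward the packet to every participant in the event.
    return HEADER.pack(source_id, seq) + speech

def parse_packet(data):
    source_id, seq = HEADER.unpack_from(data)
    return source_id, seq, data[HEADER.size:]
```

A sequence number is included because UDP-style transport can reorder or drop packets; the receiving channel buffer could use it to keep its queue in order.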
Server 24 in system 10 includes a network interface 70 for
receiving sound from clients 22 and for directing the sound
to each client 22 that is participating in a multi-participant
event. Server 24 further includes CPU 72 and memory 76.
Network interface 70, CPU 72 and memory 76 are con-
nected by bus 74. In a typical server 24, memory 76 includes
one or more server applications 78 for tracking multi-
participant events hosted by the server. Memory 76 further
includes the profile of each user that has the privilege of
using server 24 to participate in multi-participant events.
These profiles are stored as user data 80. An identity of each
participant in a multi-participant event hosted by server 24
is stored in memory 76 as participant data 82.
The general architecture and processing associated with
the invention has now been disclosed. Attention presently
turns to a more detailed consideration of the architecture of
the invention, the processing of the invention, the distinc-
tions between these elements and corresponding elements in
the prior art, and advantages associated with the disclosed
technology.
FIG. 2 provides a detailed view of participant data struc-
ture 46 that is used in one embodiment of the present
invention. Data structure 46 includes a record 202 for each
participant in a multi-participant event. Each record 202
includes a participant source identifier 204. In one embodi-
ment of the present invention, participant source identifier
204 does not provide information that identifies the actual
(true) identity of a participant in the multi-participant event.
In such embodiments, participant source identifier 204 does
not include the IP address or name of the remote participant.
For example, server application 78 (FIG. 1) may assign each
participant a random number when the participant joins a
multi-participant event. This random number is transiently
assigned for the duration of the multi-participant event and
cannot be traced to the participant. In other embodiments,
participant source identifier 204 is the true identity of the
participant or is a screen name of the participant. In such
embodiments, visual ID 206 (FIG. 2) identifies the associated
participant. Thus, each participant is aware of exactly who
is included in the multi-participant event.
In one embodiment, visual ID 206 is an icon that repre-
sents the participant. Visual ID 206 may be randomly chosen
from a library of icons by sound control module 48 when a
participant joins the multi-participant event. Automatic
assignment of a visual ID 206 to each participant has the
advantage of preserving participant anonymity.
Alternatively, visual ID 206 is selected by the participant or
uniquely identifies the participant by, for example, including
the participant's actual name, screen name, and/or a picture
of the participant.
In complex applications, a local participant may engage in
several concurrent multi-participant events using client 22.
Each multi-participant event will be assigned a window by
application module 44 (FIG. 1). In such embodiments, it is
necessary to provide a visual ID window field 208 in record
202 to indicate which window visual ID 206 is located in
and/or which multi-participant event visual ID 206 is asso-
ciated with. Each record 202 further includes the position
210 that visual ID 206 occupies in visual ID window 208. In
embodiments that do not support concurrent multi-
participant events on a single client 22, visual ID window
field 208 is not required. In such embodiments, visual ID
position 210 represents the position of visual ID 206 in the
window assigned to application module 44 by operating
system 40 (FIG. 1).
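The fields of record 202 described above (and in FIG. 2) can be collected into a single structure. This sketch uses hypothetical type choices; the patent does not fix the representation of any field:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class ParticipantRecord:                # record 202
    source_id: int                      # participant source identifier 204
    visual_id: str                      # visual ID 206, e.g. an icon name
    window: Optional[int] = None        # visual ID window 208 (None when the
                                        # client hosts only one event at a time)
    position: Tuple[int, int] = (0, 0)  # visual ID position 210 in the window
    state: int = 2                      # visual ID state 212 (here: 1 =
                                        # speaking, 2 = not speaking)
```

The participant data structure 46 would then simply be a collection of such records, one per participant in the event.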
One advantage of the present system over the prior art is
the use of visual ID state 212, which represents a state of
participant 204. In some embodiments, visual ID state 212
is a single bit. When participant 204 is speaking the bit is set
and when participant 204 is not speaking the bit is not set.
In other embodiments, visual ID state 212 is selected from
a spectrum of values. The low end of this spectrum indicates
that participant 204 is not speaking and the high end of the
spectrum indicates that participant 204 is speaking much
louder than normal. Regardless of how visual ID state 212
is configured, it is used to modify the appearance of visual
ID 206 on input/output device 36. For example, in embodi-
ments where visual ID state 212 is a value selected from a
spectrum, the value stored by visual ID state 212 may be
used to determine visual ID 206 brightness. As an
illustration, when visual ID 206 is a two-dimensional array
of pixel values, each pixel value in the array may be adjusted
by a constant determined by the value of visual state 212.
Thus, visual ID states 212 at the high end of the spectrum
alter the average pixel value in the array by a larger constant
than visual ID states 212 in the low end of the spectrum. In
an alternative embodiment, visual ID state 212 is used to
shift the color of each pixel in visual ID 206. As an
illustration, when visual ID state 212 is at the high end of the
spectrum, visual ID 206 is red-shifted and when visual ID
state 212 is at the low end of the spectrum, visual ID 206 is
green-shifted. In yet another embodiment, the particular
visual ID 206 is selected from a library of visual IDs to
represent participant 202 based on a function of the value of
visual ID state 212. For instance, a particularly animated
visual ID 206 may be selected to represent participant 202
when the participant is speaking loudly.
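Two of the appearance mappings above, brightness adjustment and library selection, can be sketched as follows. The scaling constants, file names, and an 8-bit state spectrum are illustrative assumptions, not values taken from the patent:

```python
def brighten(pixels, state, max_state=255):
    # Adjust each pixel value by a constant determined by the state:
    # states at the high end of the spectrum add a larger constant.
    delta = 64 * state // max_state        # illustrative scaling
    return [min(255, p + delta) for p in pixels]

def pick_icon(state, library=("quiet.png", "talking.png", "animated.png")):
    # Alternatively, select a visual ID from a library as a function of
    # the state: the most animated icon for the loudest speaker.
    return library[min(len(library) - 1, state * len(library) // 256)]
```

A red-shift variant would follow the same shape, scaling only the red component of each pixel by the state-derived constant.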
In yet another embodiment, visual state 212 includes
information about the participant, such as whether the par-
ticipant is speaking, has certain privileges, has placed the
event on hold, or is away from the keyboard. In one
embodiment, visual state 212 specifies whether the partici-
pant has moderation privileges, such as the privilege to
invite potential participants to the multi-participant event,
the privilege to remove people from the multi-participant
event, or the privilege to grant and deny speaking privileges
to other participants in the multi-participant event. In some
embodiments, a participant may place an event on hold in
the classic sense that the event is muted on client 22 and
sound is not communicated from or to the client 22 associ-
ated with the participant. Visual state 212 may be used to
reflect the fact that the participant is on hold. In some
embodiments, the participant may also have the option to
designate that he is "away from the keyboard." This state is
used to inform other participants that the participant is not
within earshot of the multi-participant event but will be back
momentarily. It will be appreciated that any information
included in visual state 212 may be used to update the
appearance of the visual ID 206 associated with the partici-
pant.
Referring to FIG. 3, detailed steps that describe how
sound from a local participant is processed by sound control
module 48 are shown. In step 302, sound control module 48
monitors a microphone 86 (FIG. 1) for sound. In the
embodiment depicted in FIG. 3, when the amplitude of
sound detected by microphone 86 exceeds a base state
(302-Yes), processing step 304 is triggered. In processing
step 304, the signal is stored in transmit sound buffer 62,
packaged into a packet and routed to remote participants of
the multi-participant event by transmit router 64 (FIG. 1). In
one embodiment, transmit router 64 will forward the packet
to server 24 and server 24 will route the packet to remote
participants based on participant data 82 information stored
in memory 76 (FIG. 1). Such an embodiment is capable of
preserving the anonymity of each participant in the multi-
participant event. In an alternative embodiment, the identity
of the participant is included in the packet and participant
anonymity is precluded. In processing step 306, the visual
ID 206 (FIG. 2) of the local participant is updated. If the
visual ID 206 (FIG. 2) corresponding to the local participant
is not in state 1 (306-No), processing step 308 is executed to
set visual ID 206 to state "1". Processing steps 306 and 308
represent an embodiment in which visual ID state 212 is set
to "1" when the local participant is speaking and "2" when
the local participant is not speaking. One of skill in the art
will appreciate that visual ID state 212 could be assigned any
number in a spectrum and this number could be used to
adjust the appearance of visual ID 206. Further, it will be
appreciated that processing steps 306 and 308 or their
equivalents could be executed prior to processing step 304
and that the updated value of visual ID state 212 could be
packaged into the sound packet that is transmitted in pro-
cessing step 304. Other clients 22 could then use the value
of visual ID state 212 to update the appearance of the
associated visual ID 206. Once processing steps 304 through
308 have been executed, the process repeats itself by return-
ing to processing step 302.
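One pass of the FIG. 3 loop can be sketched as follows. The base-state threshold, the peak-amplitude test, and the function name are assumptions for illustration; the patent leaves the amplitude measure unspecified:

```python
BASE_STATE = 100   # amplitude threshold that counts as "sound detected"

def process_frames(frames, base_state=BASE_STATE):
    # Mirror of steps 302-308: a frame whose amplitude exceeds the base
    # state is queued for transmit router 64 and the local visual ID
    # state is set to 1 (speaking); otherwise it stays 2 (not speaking).
    visual_id_state = 2
    transmit_queue = []
    for frame in frames:                                  # step 302
        if max(abs(s) for s in frame) > base_state:       # 302-Yes
            transmit_queue.append(frame)                  # step 304
            visual_id_state = 1                           # steps 306/308
    return visual_id_state, transmit_queue
```

In a running client this loop would execute continuously against freshly captured frames, so quiet frames are never transmitted and the visual ID state decays back to "not speaking" between utterances.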
Processing step 302 is a continuous input function. Micro-
phone 86 constantly detects sound and system 10 is capable
of storing sound in transmit sound buffer 62 during the
execution of processing steps 304 through 318. In a WIN32
environment, step 302 may be implemented using Microsoft
Windows DirectSoundCapture. Techniques well known in
the art may be used to control the capture of sound from
microphone 86 so that the speech from the local participant
is stored in transmit sound buffer 62 and periods of silence
or lengthy pauses in the speech of the local participant are
not stored in the buffer. In one embodiment, "frames" of
sound from microphone 86 are acoustically analyzed. Pref-
erably each frame of sound has a length of 500 milliseconds
or less. More preferably each frame of sound has a length of
250 milliseconds or less and more preferably 100 millisec-
onds or less. In an even more preferred embodiment, each
frame has a length of about ten milliseconds. Typically, the
acoustic analysis comprises deriving parameters such as
energy, zero-crossing rates, auto-correlation coe