(12) United States Patent
Lerner et al.

(10) Patent No.: US 6,192,395 B1
(45) Date of Patent: Feb. 20, 2001
`
`
`(54) SYSTEM AND METHOD FOR VISUALLY
`IDENTIFYING SPEAKING PARTICIPANTS
`IN A MULTI-PARTICIPANT NETWORKED
`EVENT
`
`(75)
`
`Inventors: Edward A. Lerner, San Francisco, CA
`(US); Arthur W. Min, Kirkland, WA
`‘
`7
`ro
`Ca Ee Ge Morais, Sani Jose;
`‘A
`(US)
`(73) Assignee: Multitude, Inc., South San Francisco,
`CA (US)
`
(*) Notice: Under 35 U.S.C. 154(b), the term of this
     patent shall be extended for 0 days.
`
(21) Appl. No.: 09/465,717

(22) Filed: Dec. 17, 1999
`
`(60)
`
`_
`Related U.S. Application Data
`Provisional application No. 60/113,644, filed on Dec. 23,
`ae
`(ST) Tnte@0! sensessceuccaconcs GO06F13/00
`(52) OSs! Qhiosccoseccissecscsos 709/204; 709/205; 709/314;
`379/202
`(58) Field of Search ........ccsccsecseesesseeseen 709/204, 205,
`709/218, 223, 224, 313, 314, 328, 329:
`379/202
`
`(56)
`
`References Cited
`;
`;
`U.S. PATENT DOCUMENTS
`
`5/1989 Bruce .
`4,831,618
`5/1991 Celli .
`5,020,098
`9/1996 Bieselin etal. .
`5,559,875
`4/1997 Fenton et al. .
`5,619,555
`5/1997 Altom et al.
`.sccesseeseeeeee 395/330
`5,627,978 *
`9/1997 Bieselin et al. .
`5,668,863
`1/1998 Bruno et al. .
`5,710,591
`4/1999 Ichimura.
`5,894,306
`11/1999 Dunnet al.
`5,991,385
`.
`5,999,208 * 12/1999 McNerneyet al. wow 348/15
`:
`:
`* cited by examiner
`
`Primary Examiner—Viet D. Vu
`(74) Attorney, Agent, or Firm—Pennie & Edmonds LLP
`
`(57)
`
`ABSTRACT
`
`A method ofvisually identifying speaking participants in a
`multi-participant event such as an audio conference or an
`on-line game includes the step of receiving packets of
`digitized sound from a network connection. The identity of
`the participant associated with each packet is used to route
`the packet to a channel buffer or an overflow buffer. Each
`channel buffer may be assigned to a single participantin the
`multi-participant. A visual
`identifier module updates the
`visual identifier associated with participants that have been
`assigned a channelbuffer. In some embodiments,the appear-
`anceof the visual identifier associated with the participant is
`dependent upon the differential of an acoustic parameter
`derived from content in the associated buffer channel and a
`reference value stored in a participant record.
`
`4,274,155
`
`6/1981 Funderburk et al. .
`
`25 Claims, 7 Drawing Sheets
`
[Representative drawing: sound control module 48 with channel buffers 52-1 et
seq., visual identification module 60, and sound mixer 66 feeding output
device 38 via output to user I/O device 402; participant data structure 46.]
Facebook Ex. 1021
U.S. Pat. 8,243,723
`
`
`
U.S. Patent        Feb. 20, 2001        Sheet 1 of 7        US 6,192,395 B1

[FIG. 1 (drawing): client computer showing network interface, user I/O device,
operating system, application module, participant data structure, and sound
control module with channel buffers 52-1 through 52-N, transmit sound buffer
62, receive sound buffers, packet controller 56, transmit router, and visual
identifier module.]
`
`
U.S. Patent        Feb. 20, 2001        Sheet 2 of 7        US 6,192,395 B1

[FIG. 2 (drawing): participant data structure with records 202-1 through
202-N, each holding a participant source identifier, visual ID, visual ID
window, visual ID position, and visual ID state.]
`
`
`
U.S. Patent        Feb. 20, 2001        Sheet 3 of 7        US 6,192,395 B1

[FIG. 3 (drawing): flow diagram of steps 302-318, including "Is visual ID
state = 2?" (310), "Cnt > Threshold", "Send signal to other users" (314),
step 304, "Set visual ID state to 1" (308), "Set visual ID state to 2" (316),
and "Set Cnt = 0" (318).]
`
`
`
U.S. Patent        Feb. 20, 2001        Sheet 4 of 7        US 6,192,395 B1

[FIG. 4 (drawing): network interface 34 feeding sound control module with
packet controller 56, channel buffers 1 through N (52-1 to 52-N), overflow
buffer, and visual identification module 60; sound mixer 66 outputs to user
I/O device 402 and output device 38; participant data structure 46.]
`
`
`
U.S. Patent        Feb. 20, 2001        Sheet 5 of 7        US 6,192,395 B1

[FIG. 5 (drawing): structure of a channel buffer, with elements 502-1 through
502-6.]
`
`
`
U.S. Patent        Feb. 20, 2001        Sheet 6 of 7        US 6,192,395 B1

[FIG. 6 (drawing): processing steps 602-624, including "Set visual identifier
state of each participant in the participant data structure to state 2" (602),
"Identify participant associated with the tail packet in buffer[i]" (606),
"Set visual ID state of identified participant to state 1" (608), a loop test
against the number of buffers (622), and "Update display of the visual
identifier of each participant on output device based upon updated visual ID"
(624).]
`
`
`
U.S. Patent        Feb. 20, 2001        Sheet 7 of 7        US 6,192,395 B1

[FIGS. 7A-7C (drawings): stylized views of the visual identifiers of
participants 1 through N, with highlighted elements (e.g., 206-1, 206-2, 208)
marking speaking participants.]
`
`
`
US 6,192,395 B1
`SYSTEM AND METHOD FOR VISUALLY
`IDENTIFYING SPEAKING PARTICIPANTS
IN A MULTI-PARTICIPANT NETWORKED
`EVENT
`
`CROSS-REFERENCE TO RELATED
`DOCUMENTS
`
The present invention is related to the subject matter
disclosed in U.S. patent application Ser. No. 09/358,877
(“Apparatus and Method for Creating Audio Forums”) filed
Jul. 22, 1999 and U.S. patent application Ser. No. 09/358,878
(“Apparatus and Method for Establishing An Audio
Conference in a Networked Environment”) filed Jul. 22,
1999. The present invention is also related to the subject
matter disclosed in U.S. Pat. No. 5,764,900 (“System and
Method for Communicating Digitally-Encoded Acoustic
Information Across a Network between Computers”). These
related documents are commonly assigned and hereby incor-
porated by reference.

This application claims priority to the provisional patent
application entitled, “System and Method For Visually Iden-
tifying Speaking Participants In a Multi-Participant Net-
worked Event,” Ser. No. 60/113,644, filed Dec. 23, 1998.
`
`BRIEF DESCRIPTION OF THE INVENTION
`
The present invention discloses an apparatus and method
for identifying which participants in a multi-participant
event are speaking. Exemplary multi-participant events
include audio conferences and on-line games.
`
`BACKGROUND OF THE INVENTION
`
Historically, multi-participant events such as multi-party
conferences have been hosted using Public Switched Tele-
phone Networks (PSTNs) and/or commercial wireless net-
works. Although such networks allow multiple participants
to speak at once, they are unsatisfactory because they
provide no means for visually identifying each participant in
the event. More recently, teleconferencing systems that rely
on Internet Protocol based networks have been introduced.
Such systems, which enable two or more persons to speak to
each other using the Internet, are often referred to as
“Internet telephony.”

Multi-participant events include audio conferences and
on-line games. Such events typically rely on the conversion
of analog speech to digitized speech. The digitized speech is
routed to all other participants across a network using the
Internet Protocol (“IP”) and “voice over IP” or “VOIP”
technologies. Accordingly, each participant in the multi-
participant event has a client computer. When a participant
speaks, the speech is digitized and broken down into packets
that may be transferred to other participants using a protocol
such as IP, transmission control protocol (TCP), or user
datagram protocol (UDP). See, for example, Peterson &
Davie, Computer Networks, 1996, Morgan Kaufmann
Publishers, Inc., San Francisco, Calif.

While prior art Internet telephony is adequate for limited
purposes, such as a basic two-party conference call in which
only one participant speaks at any given time, prior art
telephony systems are unsatisfactory. First, they frequently
do not permit multiple participants to speak at the same time
without data loss. That is, if one participant speaks, the
participant typically cannot hear what other people said
while the participant was speaking. Second, prior art tele-
phony does not adequately associate a visual identifier with
each participant. Therefore, when a multi-participant event
`15
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`2
`includes several participants, they have difficulty determin-
`ing who is speaking. Some Internet telephony systems have
`attempted to remedythis deficiency by requiring (i) that only
`one speaker talk at any given time and/or by(ii) posting, on
`each client associated with a participant
`in the multi-
`participant event, the icon of the current speaker. However,
`such solutions to the problem in the art are unsatisfactory
`because effective multi-participant communication requires
`that their be an ability for multiple people to simultaneously
`speak. Therefore,
`the concept of waiting in line for the
`chance to speak is nota satisfactory solution to the problem
`in the art.
`
`they
`A third drawback with prior art systems is that
`provide no mechanism for associating the characteristics of
`a participant with a visual identifier that is displayed on the
`client associated with each participant
`in the multi-
`participant event. Such characteristics could be,
`for
`example, a visual representation of how loudly a particular
`speaker is speaking relative to some historical base state
`associated with the participant. A fourth drawback ofprior
`art Internet telephony systemsis that they provide an unsat-
`istactory privilege hierarchy for dictating who may partici-
`pate in a particular multi-participant event. For example, in
`typical prior art systems, there is no privilege hierarchy and
`anyuser, i.c. the public, may join the multi-participant event.
`Such multi-participant events can be designated as “public
`forums.” While public forumsserve a limited purpose, they
`suffer from the drawback that there is no protection against
`hecklers or otherwise disruptive participants in the event. To
`summarize this point, prior art systems are unsatisfactory
`because they do not provide a set of hierarchical privileges
`that are associated with a participant and that allow partici-
`pants to designate events as private, public, or moderated. As
`used in this context, private events include conference calls
`in which the participants are preselected, typically by each
`other. Other users of a system may not join the event unless
`invited by one of the existing participants. Public events are
`those in which anyone can join and speak al any time.
`Moderated events may be public or private, but require that
`at least one participant be given enhanced privileges, such as
`the privilege to exclude particular participants, invite par-
`licipants or grant and deny speaking privileges to partici-
`pants.
`What is neededin theart is an Internet telephony system
`and method that provides the tools necessary to conduct an
`effective multi-participant event. Such a system should not
`have limitations on the number ofparticipants that may
`concurrently speak. Further, such a system should provide
`an adequate way of identifying the participants in the
`multi-participant event.
SUMMARY OF THE INVENTION

The system and method of the present invention addresses
the need in the art by providing an Internet telephony system
and method that visually identifies the participants in a
multi-participant event. In the present invention, there is no
limitation on the number of participants that may concur-
rently speak. Each participant in a multi-participant event is
associated with a visual identifier. The visual identifier of
each participant is displayed on the client display screen of
the respective participants in the multi-participant event. In
one embodiment, at least one characteristic of the participant
is reflected in the visual identifier associated with the
participant. Further, the system and method of the present
invention addresses the unmet need in the art by providing
participants with the flexibility to assign a privilege hierar-
chy. Using this privilege hierarchy, events may be desig-
nated as public, private, or moderated and selected partici-
pants may be granted moderation privileges.
A system in accordance with one aspect of the present
invention includes a participant data structure comprising a
plurality of participant records. Each participant record is
associated with a different participant in a multi-participant
event. Multi-participant events of the present invention
include audio conferences and on-line games. Further, sys-
tems in accordance with one aspect of the present invention
include an application module, which provides a user inter-
face to the multi-participant event, and a sound control
module that is capable of receiving packets from a network
connection. Each packet is associated with a participant in
the multi-participant event and includes digitized speech
from the participant. The sound controller has a set of
buffers. Each buffer preferably manages packets as a first-in
first-out queue. The sound controller further includes a
packet controller that determines which participant is asso-
ciated with each packet that has been received by the
network connection. The sound controller routes the packet
to a buffer based on the identity of the participant associated
with the packet. The sound controller also includes a visual
identification module for determining which participants in
the multi-participant event are speaking. The visual identi-
fication module updates the visual identifier associated with
each participant that is speaking to reflect the fact that they
are speaking. Further, the visual identification module
updates the visual identifier associated with each participant
that is not speaking to reflect the fact that they are not
speaking. Finally, systems in accordance with a preferred
embodiment of the present invention include a sound mixer
for mixing digitized speech from at least one of the buffers
to produce a mixed signal that is presented to an output
device.
`
In some embodiments of the present invention, the par-
ticipant record associated with each participant includes a
reference speech amplitude. In such embodiments, the visual
identification module determines a buffered speech ampli-
tude based upon a characteristic of digitized speech in at
least one packet, associated with said participant, that is
managed by a buffer, and computes a speech amplitude
differential based on (i) the buffered speech amplitude and
(ii) the reference speech amplitude stored in the participant
record. The visual identifier associated with the participant
is updated based on this speech amplitude differential.
Further, the buffered speech amplitude is saved as a new
reference speech amplitude in the participant record asso-
ciated with the participant.
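The amplitude-differential embodiment described above can be sketched in Python. This is only an illustrative sketch: the patent does not specify an implementation, and the function name, the dictionary fields, and the choice of peak absolute sample value as the "characteristic" of the digitized speech are all assumptions.

```python
def update_visual_identifier(record, buffered_packets):
    """Sketch of the visual identification module's amplitude differential.

    record           -- hypothetical participant record holding a reference
                        speech amplitude and a visual ID state
    buffered_packets -- lists of digitized speech samples managed by a buffer
    """
    # Characteristic of the digitized speech: here, peak absolute sample value
    # (an assumption; the patent says only "a characteristic").
    buffered_amplitude = max(abs(s) for pkt in buffered_packets for s in pkt)
    # Differential of buffered amplitude against the stored reference.
    differential = buffered_amplitude - record["reference_amplitude"]
    # Update the visual identifier from the differential (rendering not shown).
    record["visual_id_state"] = 1 if differential > 0 else 0
    # Save the buffered amplitude as the new reference amplitude.
    record["reference_amplitude"] = buffered_amplitude
    return differential
```

Because the reference rolls forward on each update, the differential tracks how a participant's current loudness compares with his or her recent history rather than with a fixed threshold.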
In a method in accordance with the present invention,
packets are received from a remote source and an identity
associated with each packet is determined. In one
embodiment, this identity does not disclose the true identity
of the participant. In such embodiments, the identity could
be a random number assigned to the participant for the
duration of the multi-participant event. The identity of each
packet received from the remote source is compared with an
identity associated with a channel buffer. The identity asso-
ciated with each channel buffer is determined by an identity
of a packet stored by the channel buffer. In one embodiment,
a channel buffer is reserved for a single participant in the
multi-participant event at any given time. When the channel
buffer is not storing a packet, the channel buffer is not
associated with a participant and is considered “available.”
Packets are routed based on the following rules:

(i) to a channel buffer when the identity of the packet
matches the identity associated with a channel buffer;

(ii) to an available channel buffer when the identity of the
packet does not match an identity of a channel buffer;
and
`
`15
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`4
`(iii) to an overflow buffer when the identity of the packet
`does not match an identity of a channel buffer and there
`is no available channel buffer.
`A different visual identifier is associated with each par-
`ticipant in the multi-participant event. In some embodiments
`of the present invention, the appearance of the visual iden-
`tifier is determined by whether the identity ofthe participant
`associated with the visual
`identifier matches an identity
`associated with a channel buffer. In other embodiments of
`the present invention, the appearanceofthe visual identifier
`is determined by the difference between an acoustic param-
`eter derived from digitized speech in a channel buffer
`associated with the participant and a reference acoustic
`parameter stored in a participant record.
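Routing rules (i)-(iii) above can be sketched in Python as follows. The class and method names are hypothetical; the patent specifies only the rules, not an implementation. An empty buffer is treated as "available," and a buffer's identity is taken from the packet at its head, as the method describes.

```python
from collections import deque

class SoundController:
    """Sketch of packet routing per rules (i)-(iii)."""

    def __init__(self, num_channels=4):
        # Each channel buffer is a FIFO queue of packets; empty means "available".
        self.channels = [deque() for _ in range(num_channels)]
        self.overflow = deque()

    def _owner(self, channel):
        # A channel buffer's identity is the identity of a packet it stores.
        return channel[0]["participant"] if channel else None

    def route(self, packet):
        pid = packet["participant"]
        # Rule (i): packet identity matches a channel buffer's identity.
        for channel in self.channels:
            if self._owner(channel) == pid:
                channel.append(packet)
                return "channel"
        # Rule (ii): no match, but an available (empty) channel buffer exists.
        for channel in self.channels:
            if not channel:
                channel.append(packet)
                return "channel"
        # Rule (iii): no match and no available channel buffer.
        self.overflow.append(packet)
        return "overflow"
```

With the preferred four channel buffers, a fifth simultaneous speaker's packets fall through to the overflow buffer until a channel buffer drains and becomes available again.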
BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the invention, reference
should be made to the following detailed description taken
in conjunction with the accompanying drawings, in which:

FIG. 1 illustrates a system for identifying which partici-
pants in a multi-participant event are speaking in accordance
with one embodiment of the invention.

FIG. 2 illustrates a participant data structure in accordance
with one embodiment of the invention.

FIG. 3 illustrates a flow diagram of the processing steps
associated with updating the visual identifier associated with
a participant on a client computer in accordance with one
embodiment of the invention.

FIG. 4 is a more detailed view of how a sound control
module interfaces with other components of memory in a
client computer in accordance with one embodiment of the
invention.

FIG. 5 illustrates the structure of a channel buffer in one
embodiment of the invention.

FIG. 6 illustrates the processing steps associated with
identifying which participants in a multi-participant event
are speaking in accordance with one embodiment of the
present invention.

FIGS. 7A-7C are a stylized illustration of the visual
identifiers associated with N participants in a multi-
participant event.

Like reference numerals refer to corresponding parts
throughout the several views of the drawings.
DETAILED DESCRIPTION OF THE
INVENTION

FIG. 1 illustrates a client/server computer apparatus 10
incorporating the technology of the present invention. The
apparatus 10 includes a set of client computers 22 which are
each linked to a transmission channel 84. The transmission
channel 84 generically refers to any wire or wireless link
between computers. The client computers 22 use transmis-
sion channel 84 to communicate with each other in a
multi-participant event. The multi-participant event could be
regulated by a server computer 24 or other server computers
designated by server computer 24.

Each client computer 22 has a standard computer con-
figuration including a central processing unit (CPU) 30,
network interface 34, and memory 32. Memory 32 stores a
set of executable programs. Client computer 22 also
includes input/output device 36. Input/output device 36 may
include a microphone, a keyboard, a mouse, a display 38,
and/or one or more speakers. In one embodiment, the
microphone is PC 99 compliant with a close speaking
headset design having a full scale output voltage of 100 mV.
`0010
`0010
`
`
`
Further, in one embodiment, the microphone has a frequency
response of ±5 dB from 100 Hz to 10 kHz, ±3 dB from 300
Hz to 5 kHz and 0 dB at 1 kHz. The microphone has been
implemented with a minimum sensitivity of -44 dB relative
to 1 V/Pa. CPU 30, memory 32, network interface 34 and
input/output device 36 are connected by bus 68. The execut-
able programs in memory 32 include operating system 40,
an application module 44 for providing a user interface to
the multi-participant event, a participant data structure 46 for
storing information about each participant in a multi-
participant event, and a sound control module 48. Sound
control module 48 receives sound from remote participants
through network interface 34 and transmits sound from the
local participant, which is associated with client 22, to
remote participants across transmission channel 84. Memory
32 also includes sound mixer 66 for combining the sound of
each participant in the multi-participant event into a single
signal that is sent to input/output device 36. In a preferred
embodiment, operating system 40 is capable of supporting
multiple concurrent processes or threads and includes sound
mixer 66. In an even more preferred embodiment, operating
system 40 is a WIN32 environment or an environment that
provides functionality equivalent to WIN32.
FIG. 1 illustrates that each client 22 is associated with a
local participant in the multi-participant event. The local
participant uses input/output device 36 to communicate to
remote participants in the multi-participant event via trans-
mission channel 84. Sound control module 48 has instruc-
tions for routing sound from the local participant to the
remote participants and for receiving sound from remote
participants. To receive sound from remote participants,
sound control module 48 includes a plurality of receive
sound buffers 50. In a preferred embodiment, one of the
receive sound buffers is an overflow buffer 54 and each of
the remaining receive sound buffers is a channel buffer 52.
In a preferred embodiment, receive sound buffers 50 com-
prise four channel buffers 52 and one overflow buffer 54.
Sound control module 48 further includes a packet controller
56 for determining the participant associated with a packet
of sound received from a remote participant and for routing
the packet to the appropriate receive sound buffer 50. In
addition, sound control module 48 includes a visual identi-
fier module 60 that determines which participants in a
multi-participant event are speaking.

Sound from the local participant is stored in a transmit
sound buffer 62 and routed to the appropriate destination by
transmit router 64. Transmit router 64 breaks the signal in
transmit sound buffer 62 into packets and places the appro-
priate header in each packet. Typically, the header includes
routing information that will cause the packet to be sent to
server 24 via transmission channel 84. Server 24 will then
route the packet to all participants in the multi-participant
event. However, in some embodiments, transmit router 64
may direct the packets to other clients 22 directly instead of
through server 24.
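The packetization performed by transmit router 64 can be sketched as below. The function name, the header fields, and the fixed chunk size are all hypothetical; the patent says only that the signal is split into packets and given a routing header.

```python
def packetize(buffer, chunk_size, destination):
    """Sketch of transmit router 64: split the transmit sound buffer into
    packets and prepend a routing header (header fields are assumptions)."""
    packets = []
    for i in range(0, len(buffer), chunk_size):
        packets.append({
            # Routing information directing the packet, e.g., to server 24.
            "header": {"dest": destination, "seq": i // chunk_size},
            "payload": buffer[i:i + chunk_size],
        })
    return packets
```

A sequence number in the header is one conventional way to let receivers reorder packets that arrive out of order over IP or UDP.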
`Server 24 in system 10 includes a network interface 70 for
`receiving sound from clients 22 and for directing the sound
`to each client 22 that is participating in a multi-participant
`event. Server 24 further includes CPU 72 and memory 76.
`Network interface 70, CPU 72 and memory 76 are con-
`nected by bus 74. In a typical server 24, memory 76 includes
`one or more server applications 78 for tracking multi-
`participant events hosted by the server. Memory 76 further
`includes the profile of each user that has the privilege of
`using server 24 to participate in multi-participant events.
`These profiles are stored as user data 80. An identity of each
`participant in a multi-participant event hosted by server 24
`is stored in memory 76 as participant data 82.
`
`15
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6
`The general architecture and processing associated with
`the invention has now been disclosed. Attention presently
`turns to a more detailed consideration of the architecture of
`the invention, the processing of the invention, the distinc-
`tions between these elements and corresponding elementsin
`the prior art, and advantages associated with the disclosed
`technology.
`FIG, 2 provides a detailed view of participant data struc-
`ture 46 that
`is used in one embodiment of the present
`invention. Data structure 46 includes a record 202 for each
`participant in an multi-participant event. Each record 202
`includes a participant source identifier 204. In one embodi-
`ment of the present invention, participant source identifier
`204 does not provide information that identifies the actual
`(true) identity of a participant in the multi-participant event.
`In such embodiments, participant source identifier 204 does
`not include the IP address or nameof the remote participant.
`For example, server application 78 (FIG. 1) may assign each
`participant a random number when the participant joins a
`multi-participant event. This random number is transiently
`assigned for the duration of the multi-participant event and
`cannot be traced to the participant. In other embodiments,
`participant source identifier 204 is the true identity of the
`participant or is a screen name of the participant. In such
`embodiments visual ID 206 (FIG. 3) identifies the associated
`participant. Thus, each participant is aware of exactly who
`is included in the multi-participant event.
`In one embodiment, visual ID 206 is an icon that repre-
`sents the participant. Visual ID 206 may be randomly chosen
`from a library of icons by sound control module 48 when a
`participant
`joins the multi-participant event. Automatic
`assignment of a visual ID 206 to each participant has the
`advantage of preserving participant anonymity.
`Alternatively, visual ID 206 is selected by the participant or
`uniquely identifies the participant by, for example, including
`the participant’s actual name, screen name, and/or a picture
`of the participant.
`In complex applications,a local participant may engagein
`several concurrent multi-participant events using client 22.
`Each multi-participant event will be assigned a window by
`application module 44 (FIG. 1). In such embodiments, it is
`necessary to provide a visual ID window field 208 in record
`202 to indicate which window visual ID 208 is located in
`and/or which multi-participant event visual ID 206 is asso-
`ciated with. Each record 202 further includes the position
`that visual ID 204 occupies in visual ID window 208. In
`embodiments that do not support concurrent mullti-
`participant events on a single client 22, visual ID window
`field 208 is not required. In such embodiments, visual ID
`position 210 represents the position of visual ID in the
`window assigned to application module 44 by operation
`system 40 (FIG. 1).
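The fields of record 202 described above can be modeled as a simple data structure. This is a sketch only; the field types and the tuple representation of the position are assumptions, and the reference numerals in the comments map back to FIG. 2.

```python
from dataclasses import dataclass

@dataclass
class ParticipantRecord:
    """Sketch of one record 202 in participant data structure 46 (FIG. 2)."""
    source_identifier: str        # 204: may be a transient random number
    visual_id: str                # 206: icon or other visual representation
    visual_id_window: int         # 208: window hosting the event (optional
                                  #      when only one event runs per client)
    visual_id_position: tuple     # 210: (x, y) of visual ID 206 in the window
    visual_id_state: int          # 212: speaking state; a single bit or a
                                  #      value selected from a spectrum
```

A participant data structure 46 would then simply be a collection of such records, one per participant in the event.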
One advantage of the present system over the prior art is
the use of visual ID state 212, which represents a state of
participant 204. In some embodiments, visual ID state 212
is a single bit. When participant 204 is speaking the bit is set
and when participant 204 is not speaking the bit is not set.
In other embodiments, visual ID state 212 is selected from
a spectrum of values. The low end of this spectrum indicates
that participant 204 is not speaking and the high end of the
spectrum indicates that participant 204 is speaking much
louder than normal. Regardless of how visual ID state 212
is configured, it is used to modify the appearance of visual
ID 206 on input/output device 36. For example, in embodi-
ments where visual ID state 212 is a value selected from a
spectrum, the value stored by visual ID state 212 may be
used to determine visual ID 206 brightness. As an
`0011
`0011
`
`
`
illustration, when visual ID 206 is a two-dimensional array
of pixel values, each pixel value in the array may be adjusted
by a constant determined by the value of visual state 212.
Thus, visual ID states 212 at the high end of the spectrum
alter the average pixel value in the array by a larger constant
than visual ID states 212 in the low end of the spectrum. In
an alternative embodiment, visual ID state 212 is used to
shift the color of each pixel in visual ID 206. As an
illustration, when visual ID state 212 is at the high end of the
spectrum, visual ID 206 is red-shifted and when visual ID
state 212 is at the low end of the spectrum, visual ID 206 is
green-shifted. In yet another embodiment, the particular
visual ID 206 is selected from a library of visual IDs to
represent participant 202 based on a function of the value of
visual ID state 212. For instance, a particularly animated
visual ID 206 may be selected to represent participant 202
when the participant is speaking loudly.

In yet another embodiment, visual state 212 includes
information about the participant, such as whether the par-
ticipant is speaking, has certain privileges, has placed the
event on hold, or is away from the keyboard. In one
embodiment, visual state 212 specifies whether the partici-
pant has moderation privileges, such as the privilege to
invite potential participants to the multi-participant event,
the privilege to remove people from the multi-participant
event, or the privilege to grant and deny speaking privileges
to other participants in the multi-participant event. In some
embodiments, a participant may place an event on hold in
the classic sense that the event is muted on client 22 and
sound is not communicated from or to the client 22 associ-
ated with the participant. Visual state 212 may be used to
reflect the fact that the participant is on hold. In some
embodiments, the participant may also have the option to
designate that he is “away from the keyboard.” This state is
used to inform other participants that the participant is not
within earshot of the multi-participant event but will be back
momentarily. It will be appreciated that any information
included in visual state 212 may be used to update the
appearance of the visual ID 206 associated with the partici-
pant.
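The brightness embodiment described above, in which each pixel value of visual ID 206 is adjusted by a constant determined by visual ID state 212, can be sketched as follows. This is an illustrative sketch only: the function name, the 0-255 pixel range, the normalization of the state to [0, 1], and the maximum boost are all assumptions.

```python
def adjust_icon(pixels, state, max_boost=64):
    """Sketch of adjusting a visual ID 206, modeled as a two-dimensional
    array of 0-255 pixel values, by a constant scaled by visual ID state
    212 (normalized here to the range 0.0-1.0), clamping at white."""
    boost = int(state * max_boost)  # higher state -> larger constant
    return [[min(255, p + boost) for p in row] for row in pixels]
```

States at the high end of the spectrum thus raise the average pixel value by a larger constant than states at the low end, which is exactly the relationship the text describes; a color-shift variant would scale the red and green channels against each other in the same fashion.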
Referring to FIG. 3, detailed steps that describe how
sound from a local participant is processed by sound control
module 48 are shown. In step 302, sound control module 48
monitors a microphone 86 (FIG. 1) for sound. In the
embodiment depicted in FIG. 3, when the amplitude of
sound detected by microphone 36 exceeds a base state
(302-Yes), processing step 304 is triggered. In processing
step 304, the signal is stored in transmit sound buffer 62,
packaged into a packet and routed to remote participants of
the multi-participant event by transmit router 64 (FIG. 1). In
one embodiment, transmit router 64 will forward the packet
to server 24 and server 24 will route the packet to remote
participants based on participant data 82 information stored
in memory 76 (FIG. 1). Such an embodiment is capable of
preserving the anonymity of each participant in the multi-
participant event. In an alternative embodiment, the identity
of the participant is included in the packet and participant
anonymity is precluded. In processing step 306, the visual
ID 206 (FIG. 2) of the local participant is updated. If the
visual ID 206 (FIG. 2) corresponding to the local participant
is not in state 1 (306-No), processing step 308 is executed to
set visual ID 206 to state “1”. Processing steps 306 and 308
represent an embodiment in which visual ID is set to “1”
when the local participant is speaking and “2” when the local
participant is not speaking. One of skill in the art will
appreciate that visual ID state 212 could be assigned any
number in a spectrum and this number could be used to
`15
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`8
`adjust the appearance ofvisual ID 206. Further, it will be
`appreciated that processing steps 306 and 308 or their
`equivalents could be executed prior to processing step 304
`and that the updated value of visual ID state 212 could be
`packagedinto the sound packet that is transmitted in pro-
`cessing step 304, Other clients 22 could then use the value
`of visual ID state 212 to update the appearance of the
`associated visual ID 206. Once processing steps 304 thru
`308 have been executed, the process repeats itself by return-
`ing to processing step 302.
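One pass of the FIG. 3 loop (steps 302 through 308) can be sketched as below. The function and field names are hypothetical, the packet contents are simplified, and transport and rendering are omitted; the step-number comments map back to the flow diagram.

```python
def process_local_sound(amplitude, base_state, record, send):
    """Sketch of one pass of the FIG. 3 loop for the local participant.

    amplitude  -- amplitude of sound currently detected at the microphone
    base_state -- threshold that detected sound must exceed (step 302)
    record     -- local participant's record, holding visual_id_state
    send       -- callable that routes a packet to remote participants
    """
    if amplitude <= base_state:            # step 302: no speech detected
        return False
    send({"speech_amplitude": amplitude})  # step 304: packetize and route
    if record["visual_id_state"] != 1:     # step 306: already in state 1?
        record["visual_id_state"] = 1      # step 308: mark as speaking
    return True
```

Calling this repeatedly reproduces the loop's behavior: packets flow only while detected sound exceeds the base state, and the local visual ID state flips to "speaking" on the first qualifying pass.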
Processing step 302 is a continuous input function. Micro-
phone 36 constantly detects sound and system 10 is capable
of storing sound from transmit sound buffer 62 during the
execution of processing steps 304 thru 318. In a WIN32
environment, step 302 may be implemented using Microsoft
Windows DirectSoundCapture. Techniques well known in
the art may be used to control the capture of sound from
microphone 36 so that the speech from the local participant
is stored in transmit sound buffer 62 and periods of silence
or lengthy pauses in the speech of the local participant are
not stored in the buffer. In one embodiment, “frames” of
sound from microphone 36 are acoustically analyzed. Pref-
erably each frame of sound has a length of 500 milliseconds
or less. More preferably each frame of sound has a length of
250 milliseconds or less and more preferably 100 millisec-
onds or less. In an even more preferred embodiment, each
frame has a length of about ten milliseconds. Typically, the
acoustic analysis comprises deriving parameters such as
energy, zero-crossing rates, auto-correl