US005539741A

United States Patent [19]
Barraclough et al.

[11] Patent Number:  5,539,741
[45] Date of Patent:  Jul. 23, 1996

[54] AUDIO CONFERENCING SYSTEM

[75] Inventors: Keith Barraclough, Fremont, Calif.; Peter R. Cripps, Southampton; Adrian Gay, Fareham, both of United Kingdom

[73] Assignee: IBM Corporation, Armonk, N.Y.

[21] Appl. No.: 346,553

[22] Filed: Nov. 29, 1994

[30] Foreign Application Priority Data
     Dec. 18, 1993 [GB] United Kingdom ............. 9325924

[51] Int. Cl.6 .............................. H04Q 11/04
[52] U.S. Cl. ......... 370/62; 370/110.1; 379/202; 348/15
[58] Field of Search ......... 370/62, 58.1, 61, 112, 119, 110.1; 379/201, 202, 205, 88, 157, 158, 165, 203, 93, 94, 96; 348/14, 13, 15

[56] References Cited

     U.S. PATENT DOCUMENTS

     4,389,720   6/1983  Baxter et al. ............ 370/62
     4,730,306   3/1988  Uchida ................... 379/202 X
     4,750,166   6/1988  Illman et al. ............ 370/62
     4,953,159   8/1990  Hayden et al. ............ 370/62
     5,014,267   5/1991  Tompkins et al. .......... 370/62
     5,127,001   6/1992  Steagall et al. .......... 379/202 X
     5,375,068  12/1994  Palmer et al. ............ 370/62
     5,379,280   1/1995  Cotton et al. ............ 370/62
     5,402,418   3/1995  Shibata et al. ........... 370/62

     FOREIGN PATENT DOCUMENTS

     2207581   7/1987  United Kingdom

Primary Examiner—Alphus H. Hsu
Assistant Examiner—Ricky Q. Ngo
Attorney, Agent, or Firm—Jeanine S. Ray-Yarletts

[57] ABSTRACT

A computer workstation receives multiple audio input streams over a network in an audio conference. The audio input streams are kept separate by storing them in different queues. Digital samples from each of the queues are transferred to an audio adapter card 28 for output. A digital signal processor 46 on the audio adapter card multiplies each audio stream by its own weighting parameter, before summing the audio streams together for output. Thus the relative volume of each of the audio output streams can be controlled. For each block of audio data, the volume is calculated and displayed to the user, allowing the user to see the volume in each audio input stream independently. The user is also provided with volume control for each audio input stream, which effectively adjusts the weighting parameter, thereby allowing the user to alter the relative volumes of each speaker in the conference.

10 Claims, 4 Drawing Sheets
U.S. Patent          Jul. 23, 1996          Sheet 1 of 4          5,539,741

[Drawing sheet; legible label: MEMORY]

U.S. Patent          Jul. 23, 1996          Sheet 2 of 4          5,539,741

[Drawing sheet]
U.S. Patent          Jul. 23, 1996          Sheet 3 of 4          5,539,741

FIG. 6 (flow chart):
     602  OBTAIN M BLOCKS
     604  CONVERT TO LINEAR SCALE
     606  MULTIPLY BY WEIGHTING PARAMETER
     608  UPDATE RUNNING AVERAGE
     610  SUM AUDIO STREAMS
     612  RESAMPLE
     614  PASS TO CODEC
U.S. Patent          Jul. 23, 1996          Sheet 4 of 4          5,539,741

FIG. 7 (screen layout; reference numerals 700, 720, 721, 722, 724)

FIG. 8 (software layers):
     810  APPLICATION
     812  APPLICATION SUPPORT LAYER
     814  OPERATING SYSTEM
     816  COMMS SOFTWARE
     818  DEVICE DRIVERS
AUDIO CONFERENCING SYSTEM

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to the processing by a computer workstation of multiple streams of audio data received over a network.
2. Description of the Prior Art

Conventionally, voice signals have been transmitted over standard analog telephone lines. However, with the increase in locations provided with local area networks (LANs) and the growing importance of multimedia communications, there has been considerable interest in the use of LANs to carry voice signals. This work is described, for example, in "Using Local Area Networks for Carrying Online Voice" by D. Cohen, pages 13-21, and "Voice Transmission over an Ethernet Backbone" by P. Ravasio, R. Marcogliese, and R. Novarese, pages 39-65, both in "Local Computer Networks" (edited by P. Ravasio, G. Hopkins, and N. Naffah; North Holland, 1982). The basic principles of such a scheme are that a first terminal or workstation digitally samples a voice input signal at a regular rate (e.g. 8 kHz). A number of samples are then assembled into a data packet for transmission over the network to a second terminal, which then feeds the samples to a loudspeaker or equivalent device for playout, again at a constant rate.

One of the problems with using a LAN to carry voice data is that the transmission time across the network is variable. Thus the arrival of packets at a destination node is both delayed and irregular. If the packets were played out in irregular fashion, this would have an extremely adverse effect on the intelligibility of the voice signal. Therefore, voice over LAN schemes utilize some degree of buffering at the reception end to absorb such irregularities. Care must be taken to avoid introducing too large a delay between the original voice signal and the audio output at the destination end, which would render natural interactive two-way conversation difficult (in the same way that an excessive delay on a conventional transatlantic phone call can be highly intrusive). A system is described in "Adaptive Audio Playout Algorithm for Shared Packet Networks" by B. Aldred, R. Bowater, and S. Woodman, IBM Technical Disclosure Bulletin, pp. 255-257, Vol. 36, No. 4, April 1993, in which packets that arrive later than a maximum allowed value are discarded. The amount of buffering is adaptively controlled depending on the number of discarded packets (any other appropriate measure of lateness could be used). If the number of discarded packets is high, the degree of buffering is increased, while if the number of discarded packets is low, the degree of buffering is decreased. The size of the buffer is altered by temporarily changing the play-out rate (this affects the pitch; a less noticeable technique would be to detect periods of silence and artificially increase or decrease them as appropriate).
Another important aspect of audio communications is conferencing, involving multipoint communications as opposed to two-way or point-to-point communications. When implemented over traditional analog telephone lines, audio conferencing requires each participant to send an audio signal to a central hub. The central hub mixes the incoming signals, possibly adjusting for the different levels, and sends each participant a summation of the signals from all the other participants (excluding the signal from that particular node). U.S. Pat. No. 4,650,929 discloses a centralized video/audio conferencing system in which individuals can adjust the relative volumes of the other participants. U.S. Pat. No. 4,389,720 discloses a telephone conferencing system with individual gain adjustment performed by system ports for multiple end user stations.
The use of a centralized mixing node, often referred to as a multipoint control unit (MCU), has been carried over into some multimedia (audio plus video) workstation conferencing systems. For example, U.S. Pat. No. 4,710,917 describes a multimedia conferencing system in which each participant transmits audio to and receives audio from a central mixing unit. Other multimedia conferencing systems are described in "Distributed Multiparty Desktop Conferencing System: MERMAID" by K. Watabe, S. Sakata, K. Maeno, H. Fukuoka, and T. Ohmori, pp. 27-38 in CSCW '90 (Proceedings of the Conference on Computer-Supported Cooperative Work, 1990, Los Angeles), and "Personal Multimedia Multipoint Communications Services for Broadband Networks" by E. Addeo, A. Gelman, and A. Dayao, pp. 53-57 in Vol. 1, IEEE GLOBECOM, 1988.
The use of a centralized MCU or summation node, however, has several drawbacks. Firstly, the architecture of most LANs is based on a peer-to-peer arrangement, and so there is no obvious central node. Moreover, the system relies totally on the continued availability of the nominated central node to operate the conference. There can also be problems with echo suppression (the central node must be careful not to include the audio from a node in the summation signal played back to that node).

These problems can be avoided by the use of a distributed audio conferencing system, in which each node receives a separate audio signal from every other node in the conference. U.S. Pat. No. 5,127,001 describes such a distributed system, and discusses the synchronisation problems that arise because of the variable transit time of packets across the network. U.S. Pat. No. 5,127,001 overcomes this problem by maintaining separate queues of incoming audio packets from each source node. These effectively absorb the jitter in arrival time in the same way as described above for simple point-to-point communications. At regular intervals a set of audio packets is read out, one packet from each of the queues, and summed together for playout. In U.S. Pat. No. 5,127,001 the audio contributions from the different parties are combined using a weighted sum. A somewhat similar approach is found in GB 2207581, which describes a rather specialized local area network for the communication of digital audio in aircraft. This system includes means for adjusting independently the gain of each audio channel using a store of predetermined gain coefficients.
One of the problems in audio conferencing systems, as discovered with the MERMAID system referred to above, is determining who is speaking at any given moment. U.S. Pat. No. 4,893,326 describes a multimedia conferencing system in which each workstation automatically detects if its user is speaking. This information is then fed through to a central control node, which switches the video so that each participant sees the current speaker on their screen. Such a system requires both video and audio capability to operate, and furthermore relies on the central video switching node, so that it cannot be used in a fully distributed system.
A distributed multimedia conferencing system is described in "Personal Multimedia-Multipoint Teleconference System" by H. Tanigawa, T. Arikawa, S. Masaki, and K. Shimamura, pp. 1127-1134 in IEEE INFOCOM '91, Proceedings Vol. 3. This system provides sound localization for a stereo workstation, in that as a window containing the video signal from a conference participant is moved from right to left across the screen, the apparent source of the corresponding audio signal moves likewise. This approach provides limited assistance in the identification of a speaker. A more comprehensive facility is described in Japanese abstract JP 02-123886, in which a bar graph is used to depict the output voice level associated with an adjacent window containing a video of the source of the sound.
The prior art therefore describes a variety of audio conferencing systems. While conventional centralized telephone audio conferencing is both widespread and well understood from a technological point of view, much work remains to be done to increase the performance of audio conferencing implementations in the desk-top environment.
SUMMARY OF THE INVENTION
Accordingly, the invention provides a computer workstation for connecting to a network and receiving multiple audio input streams from the network, each audio stream comprising a sequence of digital audio samples, the workstation including:
means for storing the digital audio samples from each audio input stream in a separate queue;
means for forming a sequence of sets containing one digital audio sample from each queue;
means for producing a weighted sum for each set of digital audio samples, each audio input stream having a weighting parameter associated therewith;
means for generating an audio output from the sequence of weighted sums;
and characterized by means responsive to user input for adjusting said weighting parameters to control the relative volumes within the audio output of the multiple audio streams.
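Expressed as a formula (the notation is ours, not the patent's): if $s_i[n]$ is the $n$-th digital audio sample taken from the queue for input stream $i$, and $w_i$ is the weighting parameter associated with that stream, the generated output sample is

$$y[n] \;=\; \sum_{i=1}^{M} w_i \, s_i[n],$$

where $M$ is the number of audio input streams; user input adjusts the $w_i$ and hence the relative volumes.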
The invention recognizes that the provision of audio conferencing over a distributed network, in which each node receives a separate audio stream from all of the other participants, naturally allows extra functionality that could previously be achieved only with great difficulty and expense in centralized conferencing systems. In particular, each user can adjust the relative volume of all the other participants according to their own personal preferences. This can be very desirable, for example, if they need to focus on one particular aspect of the conference, or because of language problems (for example, maybe one person has a strong accent which is difficult for some people to understand, or maybe the conference involves simultaneous translation). Moreover, even during the conference the system is responsive to user input to alter the relative volumes of the different participants. In order to permit this control, the incoming audio signals are kept separate, being placed into different queues according to their source (the queues are logically separate storage, although physically they may be adjacent or combined), before being weighted by the appropriate volume control factor. Only then are they combined together to produce the final audio output. The invention thus recognizes that a distributed audio conferencing system is particularly suited to the provision of individual control of the relative volumes.
Preferably the workstation further comprises means for providing a visual indication, for each of said multiple audio input streams, of whether or not that stream is currently silent. This overcomes one of the recognized problems in audio conferencing: determining who is speaking. The visual indication may simply be some form of on/off indicator, such as a light or equivalent feature, but in a preferred embodiment it is implemented by a display that indicates, for each of said multiple audio input streams, the instantaneous sound volume in that audio stream. In other words, the display provides a full indication of the volume of the relevant participant. The volume output can be calculated on the basis of a running root-mean-square value from the sequence of digital audio samples, or if processing power is limited a simpler algorithm may be employed, such as using the maximum digital audio value in a predetermined number of samples. In general the incoming audio data arrives in blocks, each containing a predetermined number of digital audio samples, and said visual indication is updated for each new block of audio data. Thus the volume figure will typically be calculated on a per-block basis.
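For a block of $N$ samples $x[1],\dots,x[N]$, the per-block volume figure is then the familiar root-mean-square value (notation ours):

$$V \;=\; \sqrt{\frac{1}{N}\sum_{n=1}^{N} x[n]^2}.$$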
It is also preferred that said visual indication is displayed adjacent a visual representation of the origin of that audio input stream, such as a video or still image. The former requires a full multimedia conferencing network, whereas the latter can be provided on much lower bandwidth networks which cannot support the transmission of video signals. Such a visual representation, whether still or moving, allows easy identification of the source of any audio.

Preferably the workstation further includes means for providing the user with a visual indication of the values of said weighting parameters, said means being responsive to user mouse operations to adjust said weighting parameters. This can be implemented as a scroll-bar or the like, one for each audio input stream, located adjacent the visual indication of output volume for that stream. It is also convenient for the computer workstation to further comprise means for disabling audio output from any of said multiple audio input streams. Thus the user is effectively provided with a full set of volume controls for each audio input stream.
The invention also provides a method of operating a computer workstation connected to a network for the receipt of multiple audio input streams, each audio stream comprising a sequence of digital audio samples, said method comprising the steps of:
storing the digital audio samples from each audio input stream in a separate queue;
forming a sequence of sets containing one digital audio sample from each queue;
producing a weighted sum for each set of digital audio samples, each audio input stream having a weighting parameter associated therewith;
generating an audio output from the sequence of weighted sums;
and characterized by adjusting, responsive to user input, said weighting parameters to control the relative volumes within the audio output of the multiple audio streams.
BRIEF DESCRIPTION OF THE DRAWINGS

An embodiment of the invention will now be described by way of example with reference to the following drawings:

FIG. 1 is a schematic diagram of a computer network;

FIG. 2 is a simplified block diagram of a computer workstation for use in audio conferencing;

FIG. 3 is a simplified block diagram of an audio adapter card in the computer workstation of FIG. 2;

FIG. 4 is a flow chart illustrating the processing performed on an incoming audio packet;

FIG. 5 illustrates the queue of incoming audio packets waiting to be played out;

FIG. 6 is a flow chart illustrating the processing performed by the digital signal processor on the audio adapter card;

FIG. 7 shows a typical screen interface presented to the user of the workstation of FIG. 2; and

FIG. 8 is a simplified diagram showing the main software components running on the workstation of FIG. 2.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT(S)

FIG. 1 is a schematic diagram of computer workstations A-E linked together in a local area network (LAN) 2. These workstations are participating in a multiway conference, whereby each workstation is broadcasting its audio signal to all the other workstations in the conference. Thus each workstation receives a separate audio signal from every other workstation. The network shown in FIG. 1 has a Token Ring architecture, in which a token circulates around the workstations. Only the workstation currently in possession of the token is allowed to transmit a message to another workstation. It should be understood that the physical transmission time for a message around the ring is extremely short. In other words, a message transmitted by A, for example, is received by all the other terminals almost simultaneously. This is why a token system is used to prevent interference arising from two nodes trying to transmit messages at the same time.

As described in more detail below, a one-way audio communication on a LAN typically requires a bandwidth of 64 kHz. In the conference of FIG. 1, each node will be broadcasting its audio signal to four other nodes, implying an overall bandwidth requirement of 5×4×64 kHz (1.28 MHz). This is comfortably within the capability of a standard Token Ring, which supports either a 4 or 16 Mbit per second transmission rate. It will be recognized that for larger conferences the bandwidth requirements quickly become problematic, although future networks are expected to offer much higher bandwidths.

Note that the invention can be implemented on many different network architectures or configurations other than Token Ring, providing of course that the technical requirements regarding bandwidth, latency and so on necessary to support audio conferencing can be satisfied.
FIG. 2 is a simplified schematic diagram of a computer system which may be used in the network of FIG. 1. The computer has a system unit 10, a display screen 12, a keyboard 14 and a mouse 16. The system unit 10 includes a microprocessor 22, semiconductor memory (ROM/RAM) 24, and a bus 26 over which data is transferred. The computer of FIG. 2 may be any conventional workstation, such as an IBM PS/2® computer.

The computer of FIG. 2 is equipped with two adapter cards. The first of these is a Token Ring adapter card 30. This card, together with accompanying software, allows messages to be transmitted onto and received from the Token Ring network shown in FIG. 1. The operation of the Token Ring card is well known, and so will not be described in detail. The second card is an audio card 28 which is connected to a microphone and a loudspeaker (not shown) for audio input and output respectively.
The audio card is shown in more detail in FIG. 3. The card illustrated and used in this particular embodiment is an M-Wave card available from IBM, although other cards are available that perform an analogous function. ("M-Wave" is a trademark of IBM Corporation.) The card contains an A/D converter 42 to digitize incoming audio signals from an attached microphone 40. The A/D converter is attached to a CODEC 44, which samples the incoming audio signal at a rate of 44.1 kHz into 16-bit samples (corresponding to the standard sampling rate/size for compact disks). Digitized samples are then passed to a digital signal processor (DSP) 46 on the card via a double buffer 48 (i.e. the CODEC loads a sample into one half of the double buffer while the DSP reads the previous sample from the other half). The DSP is controlled by one or more programs stored in semiconductor memory 52 on the card. Data can be transferred by the DSP to and from the main PC bus.

Audio signals to be played out are received by the DSP 46 from the PC bus 26, and processed in a converse fashion to audio input from the microphone. That is, the output audio signals are passed through the DSP 46 and a double buffer 50 to the CODEC 44, from there to a D/A converter 54, and finally to a loudspeaker 56 or other appropriate output device.
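The double-buffer handoff between CODEC and DSP can be sketched as follows. This is a minimal illustration only, assuming an interrupt-style notification from the CODEC; the names, buffer size, and notification mechanism are invented for the example, since the patent gives no code:

    #include <stdint.h>

    #define HALF 64  /* samples per half-buffer (illustrative size) */

    static int16_t buf[2][HALF];         /* double buffer: two halves */
    static volatile int fill_half = 0;   /* half currently being filled by the CODEC */

    /* Stand-in for the DSP's work on one completed half-buffer. */
    static void dsp_process_samples(const int16_t *s, int n)
    {
        (void)s; (void)n;  /* real code would convert/weight/mix these samples */
    }

    /* Invoked when the CODEC finishes filling one half: the CODEC flips to
     * the other half, and the DSP reads the half that was just completed. */
    void codec_half_complete(void)
    {
        int ready = fill_half;
        fill_half ^= 1;                  /* CODEC now fills the other half */
        dsp_process_samples(buf[ready], HALF);
    }

The point of the two halves is that the CODEC and the DSP never touch the same memory at the same time, so neither has to wait for the other within a sample period.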
In the particular embodiment shown, the DSP is programmed to transform samples from the CODEC from 16 bits at 44.1 kHz into a new digital signal having an 8 kHz sampling rate, with 8-bit samples on a μ-law scale (essentially logarithmic), corresponding to CCITT standard G.711, using standard re-sampling techniques. The total bandwidth of the signal passed to the workstation for transmission to other terminals is therefore 64 kHz. The DSP also performs the opposite conversion on an incoming signal received from the PC, i.e. it converts the signal from 8-bit, 8 kHz to 16-bit, 44.1 kHz, again using known re-sampling techniques. Note that this conversion between the two sampling formats is only necessary because of the particular choice of hardware, and has no direct bearing on the invention. Thus, for example, many other audio cards include native support for the 8 kHz format, i.e. the CODEC can operate according to the G.711 format, thereby using the 8 kHz format throughout (alternatively, the 44.1 kHz samples could be retained for transmission over the network, although the much higher bandwidth and greatly increased processing speed required render this unlikely unless there is a particular need for the transmitted audio signal to be of CD quality; for normal voice communications the 64 kHz bandwidth signal of the G.711 format is adequate).
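G.711 μ-law companding itself is standard. As a reference point (this is the classic bias-and-segment formulation of the standard, not code taken from the patent), the 8-bit/16-bit conversions look like this in C:

    #include <stdint.h>

    #define BIAS 0x84    /* 132: standard mu-law bias */
    #define CLIP 32635   /* clamp magnitude before encoding */

    /* Decode one 8-bit G.711 mu-law byte to 16-bit linear PCM. */
    int16_t mulaw_decode(uint8_t u)
    {
        u = ~u;                                    /* undo the final complement */
        int t = (((int)(u & 0x0F)) << 3) + BIAS;   /* mantissa plus bias */
        t <<= (u & 0x70) >> 4;                     /* apply segment (exponent) */
        return (int16_t)((u & 0x80) ? (BIAS - t) : (t - BIAS));
    }

    /* Encode one 16-bit linear PCM sample to 8-bit G.711 mu-law. */
    uint8_t mulaw_encode(int16_t pcm)
    {
        int sign = (pcm < 0) ? 0x80 : 0x00;
        int mag = sign ? -(int)pcm : (int)pcm;
        if (mag > CLIP) mag = CLIP;
        mag += BIAS;

        int exponent = 7;                          /* locate the segment */
        for (int mask = 0x4000; (mag & mask) == 0 && exponent > 0; mask >>= 1)
            exponent--;

        int mantissa = (mag >> (exponent + 3)) & 0x0F;
        return (uint8_t)~(sign | (exponent << 4) | mantissa);
    }

In practice, as the patent notes later, the decode direction is usually a 256-entry look-up table rather than computed per sample.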
Data is transferred between the audio adapter card and the workstation in blocks of 64 bytes, i.e. 8 ms of audio data for 8-bit data sampled at 8 kHz. The workstation then only processes whole blocks of data, and each data packet transmitted from or received by the workstation typically contains a single 64-byte block of data. The choice of 64 bytes for the block size is a compromise between minimizing the granularity of the system (which introduces delay), whilst maintaining efficiency both as regards internal processing in the workstation and transmission over the network. In other systems a block size of 32 or 128 bytes, for example, may be more appropriate.
The operation of a computer workstation as regards the transmission of audio data is well known in the prior art, and so will not be described in detail. Essentially, the audio card receives an input signal, whether in analogue form from a microphone or from some other audio source, such as a compact disk player, and produces blocks of digital audio data. These blocks are then transferred into the main memory of the workstation, and from there to the LAN adapter card (in some architectures it may be possible to transfer blocks from the audio adapter card directly into the LAN adapter card, without the need to go via the workstation memory). The LAN adapter card generates a data packet containing the digital audio data along with header information identifying the source and destination nodes, and this packet is then transmitted over the network to the desired recipient(s). It will be understood that in any two-way or multi-way communication this transmission process will be executing at the workstation simultaneously with the reception process described below.
The processing by the computer workstation as regards the reception of audio data packets is illustrated in FIG. 4. Whenever a new packet arrives (step 402), the LAN adapter card notifies a program executing on the microprocessor in the workstation, providing information to the program identifying the source of the data packet. The program then transfers the incoming 64-byte audio block into a queue in main memory (step 404). As shown in FIG. 5, the queue in main memory 500 actually comprises a set of separate subqueues containing audio blocks from each of the different source nodes. Thus one queue contains the audio blocks from one source node, one queue contains the audio blocks from another source node, and so on. In FIG. 5 there are three subqueues 501, 502, 503, for audio data from nodes B, C and D respectively; the number of subqueues will of course vary with the number of participants in the audio conference. The program uses the information in each received packet identifying the source node in order to allocate the block of incoming audio data to the correct queue. Pointers PB, PC, and PD indicate the position of the end of each queue and are updated whenever new packets are added. Packets are removed for further processing from the bottom of the subqueues ("OUT" as shown in FIG. 5). The subqueues in FIG. 5 are therefore essentially standard First In First Out queues and can be implemented using conventional programming techniques. Note that, apart from the support of multiple (parallel) queues, the processing of incoming audio blocks as described so far is exactly analogous to prior art methods, allowing equivalent buffering techniques to be used if desired, either with respect to individual subqueues, or on the combined queue in its entirety.
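A minimal sketch of the per-source subqueues of FIG. 5, as a fixed array of FIFO ring buffers indexed by source node. The structure, depth, and the mapping from node to index are illustrative assumptions, not taken from the patent:

    #include <stdint.h>
    #include <string.h>

    #define BLOCK_BYTES 64   /* one 8 ms audio block, as in the text */
    #define QUEUE_DEPTH 16   /* blocks buffered per source (illustrative) */
    #define MAX_SOURCES 8    /* other conference participants (illustrative) */

    typedef struct {
        uint8_t blocks[QUEUE_DEPTH][BLOCK_BYTES];
        int head, tail, count;               /* FIFO indices */
    } subqueue;

    static subqueue queues[MAX_SOURCES];     /* one subqueue per source node */

    /* Step 404: append an incoming 64-byte block to the queue for its source. */
    int enqueue_block(int source, const uint8_t *data)
    {
        subqueue *q = &queues[source];
        if (q->count == QUEUE_DEPTH)
            return -1;                       /* queue full: caller may drop the block */
        memcpy(q->blocks[q->tail], data, BLOCK_BYTES);
        q->tail = (q->tail + 1) % QUEUE_DEPTH;
        q->count++;
        return 0;
    }

    /* Step 602: remove one block from the bottom ("OUT") of a subqueue. */
    int dequeue_block(int source, uint8_t *out)
    {
        subqueue *q = &queues[source];
        if (q->count == 0)
            return -1;                       /* underrun: caller may substitute silence */
        memcpy(out, q->blocks[q->head], BLOCK_BYTES);
        q->head = (q->head + 1) % QUEUE_DEPTH;
        q->count--;
        return 0;
    }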
The operations performed by the DSP on the audio adapter card are illustrated in FIG. 6. The DSP runs in a cycle, processing a fresh set of audio blocks every 8 milliseconds in order to ensure a continuous audio output signal. Thus every 8 ms the DSP uses DMA access to read out one audio block from each of the subqueues corresponding to the different nodes, i.e. one block from the bottom of queues B, C, and D as shown in FIG. 5 (step 602: i.e. M=3 in this case). These blocks are treated as representing simultaneous time intervals: in the final output they will be added together to produce a single audio output for that time interval. The DSP therefore effectively performs a digital mixing function on the multiple audio input streams. Using a look-up table, the individual samples within the 64-byte blocks are then converted out of G.711 format (which is essentially logarithmic) into a linear scale (step 604). Each individual sample is then multiplied by a weighting parameter (step 606). There is a separate weighting parameter for each received audio data stream; i.e. for the three subqueues of FIG. 5, there is one weighting parameter for the audio stream from node B, one for the audio stream from node C, and one for the audio stream from node D. The weighting parameters are used to control the relative loudness of the audio signals from different sources.
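Steps 604 and 606 reduce to a table look-up followed by a per-sample multiply. A sketch in C, reusing the mulaw_decode routine above; the Q15 fixed-point weight format is our assumption, since the patent does not specify a representation for the weighting parameter:

    #include <stdint.h>

    int16_t mulaw_decode(uint8_t u);   /* as sketched earlier */

    /* Convert one 64-byte mu-law block to linear samples (step 604) and
     * scale each sample by the stream's weighting parameter (step 606).
     * weight_q15 is the weighting parameter in Q15: 32768 = unity gain. */
    void decode_and_weight(const uint8_t *block, int n,
                           int32_t weight_q15, int32_t *out)
    {
        for (int i = 0; i < n; i++) {
            int32_t lin = mulaw_decode(block[i]);   /* step 604 */
            out[i] = (lin * weight_q15) >> 15;      /* step 606 */
        }
    }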
The DSP maintains a running record of the root mean square (rms) value for each audio stream (step 608). Typically such an rms value is produced for each block of audio data (i.e. every 8 milliseconds) by generating the sum and sum of squares of the values within that block. The rms value represents the volume of that individual audio input stream and is used to provide volume information to the user as described below.
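A per-block rms computation of the kind described (step 608) could look like the following; the use of floating point and math.h sqrt here is for clarity of the sketch, where a real DSP would more likely use a fixed-point approximation:

    #include <math.h>
    #include <stdint.h>

    /* Compute the rms value of one block of weighted linear samples
     * (step 608); the result serves as the volume figure shown to the user. */
    double block_rms(const int32_t *samples, int n)
    {
        double sumsq = 0.0;
        for (int i = 0; i < n; i++)
            sumsq += (double)samples[i] * (double)samples[i];
        return sqrt(sumsq / (double)n);
    }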
Once the digital audio samples have been multiplied by the appropriate weighting parameter, they are summed together (step 610; note that this can effectively happen in parallel with the processing of step 606). Thus a single sequence of digital audio samples is produced, representing the weighted sum of the multiple input audio streams. This sequence of digital audio samples is then re-sampled up to 44.1 kHz (step 612; although, as mentioned previously, this is hardware-dependent and not directly relevant to the present invention), before being passed to the CODEC (step 614) for supply to the loudspeaker.
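The summation itself (step 610), with saturation back to the 16-bit range expected by the re-sampler and CODEC, might be sketched as below, again assuming the per-stream buffers produced by the decode_and_weight sketch:

    #include <stdint.h>

    /* Sum M weighted streams into one output block (step 610), saturating
     * to 16 bits rather than letting the accumulator wrap. */
    void mix_streams(int32_t *const streams[], int m, int n, int16_t *out)
    {
        for (int i = 0; i < n; i++) {
            int32_t acc = 0;
            for (int s = 0; s < m; s++)
                acc += streams[s][i];
            if (acc >  32767) acc =  32767;   /* clamp on overflow */
            if (acc < -32768) acc = -32768;
            out[i] = (int16_t)acc;
        }
    }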
Note that the actual DSP processing used to generate the volume-adjusted signals may vary somewhat from that shown in FIG. 6, although effectively the end result is similar. Such variations might typically be introduced to maximise computational efficiency or to reduce demands on the DSP. For instance, if processor power is limited, then the volume control can be implemented at the conversion out of μ-law format. Thus, after the correct look-up value has been located (step 604), the actual read-out value can be determined by moving up or down the table a predetermined number of places, according to whether the volume of the signal is to be increased or decreased from its normal value. In this case the weighting parameter is effectively the number of steps up or down to adjust the look-up table (obviously allowing for the fact that the G.711 format separates the original amplitudes according to whether they are positive or negative, and volume adjustment cannot convert one into the other). The above approach is computationally simple, but provides only discrete rather than continuous volume control. Alternatively, it would be possible to add the logarithm of the volume control value or weighting parameter to the μ-law figure. This approach effectively performs the multiplication of step 606 prior to the scale conversion of step 604 using logarithmic addition, which for most processors is computationally less expensive than multiplication. The result can then be converted back into a linear scale (step 604) for mixing with the other audio streams. This approach does permit fine volume control, providing the look-up table is sufficiently detailed (although note that the output remains limited to 16 bits). Typically the logarithm of the weighting parameter could be obtained from a look-up table, or alternatively could be supplied already in logarithmic form by the controlling application. Of course, it is only necessary to calculate a new logarithmic value when the volume control is adjusted, which is likely to be relatively infrequent.
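The discrete table-offset variant might be sketched as follows. The table layout (raw μ-law byte as index, sign in the top bit) and the step direction are our assumptions for illustration; the patent describes the idea only in prose:

    #include <stdint.h>

    int16_t mulaw_decode(uint8_t u);        /* as sketched earlier */

    static int16_t ulaw_table[256];         /* step 604 look-up table */

    static void build_ulaw_table(void)
    {
        for (int i = 0; i < 256; i++)
            ulaw_table[i] = mulaw_decode((uint8_t)i);
    }

    /* Discrete volume control: instead of multiplying, move a fixed number
     * of places within the look-up table. With the standard G.711 byte
     * layout assumed here, the low 7 bits index magnitude within each sign
     * half, and a smaller index means a larger magnitude, so positive
     * "louder" steps are subtracted. The sign half is preserved, as the
     * text requires. */
    static int16_t decode_with_offset(uint8_t u, int louder_steps)
    {
        int sign = u & 0x80;
        int mag  = (int)(u & 0x7F) - louder_steps;
        if (mag < 0)    mag = 0;            /* clamp within the same sign half */
        if (mag > 0x7F) mag = 0x7F;
        return ulaw_table[sign | mag];
    }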
Similarly, if the available processing power were insufficient to perform a continuous rms volume measurement, then the processing might perhaps be performed on every other block of data, or alternatively some computationally simpler algorithm, such as summing the absolute value of the difference between successive samples, could be used. Note that the summation of the squared values can be performed by logarithmic addition prior to step 604 (i.e. before the scale conversion). An even simpler approach would be simply to use the maximum sample value in any audio block as a volume indicator.
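Both of the cheaper volume measures mentioned are straightforward; illustrative C versions:

    #include <stdint.h>
    #include <stdlib.h>

    /* Cheaper volume estimate 1: sum of |x[i] - x[i-1]| over a block. */
    int32_t block_abs_diff(const int16_t *x, int n)
    {
        int32_t sum = 0;
        for (int i = 1; i < n; i++)
            sum += abs((int)x[i] - (int)x[i - 1]);
        return sum;
    }

    /* Cheaper volume estimate 2: the largest absolute sample in the block. */
    int32_t block_peak(const int16_t *x, int n)
    {
        int32_t peak = 0;
        for (int i = 0; i < n; i++) {
            int32_t a = x[i] < 0 ? -(int32_t)x[i] : x[i];
            if (a > peak) peak = a;
        }
        return peak;
    }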
FIG. 7 shows the screen 700 presented to the user at the workstation who is involved in an audio conference. As in the previous discussion, this involves the receipt of three different streams of audio data, although obviously the invention is not limited to just three participants. The screen in FIG. 7 has been split by dotted lines into three areas 701, 702, 703, each representing one participant, although in practice these dotted lines do not appear on the screen. Associated with each participant is a box 724 containing the name of the participant (in our case simply B, C and D). There is also an image window 720 which can be used to contain a video image of the audio source, transmitted over the network with the audio, or a still bit map (either supplied by the audio source at the start of the conference, or perhaps already existing locally at the workstation and displayed in response to the