`
(11) 3,601,530

(72) Inventors: Robert C. Edson, Brielle; Doren Mitchell, Martinsville;
     George P. Reid, Holmdel, all of N.J.
(21) Appl. No.: 820,131
(22) Filed: Apr. 29, 1969
     Patented: Aug. 24, 1971
(73) Assignee: Bell Telephone Laboratories, Incorporated

(54) VIDEO CONFERENCE SYSTEM USING VOICE-SWITCHED CAMERAS
     30 Claims, 13 Drawing Figs.

(52) U.S. Cl.: 178/5.6, 178/5.8, 178/DIG. 30, 179/1 CN, 179/2 TV
(51) Int. Cl.: H04n 5/24
(50) Field of Search: 178/5.6, 5.8, 6 TM, 6 PD, 6, 6.8, 7.2 ST;
     179/1 H, 1 CN, 2 TV; 235/151.52

(56) References Cited
UNITED STATES PATENTS
3,050,584   8/1962   Miller          179/1
3,128,348   4/1964   Lummis          179/1
3,369,073   2/1968   Scholz          178/5.6
3,419,674  12/1968   Burns et al.    179/2
3,423,532   1/1969   Coel et al.     179/1
3,492,419   1/1970   Bartonik        178/6.8
3,515,807   6/1970   Clark           179/1

Primary Examiner: Robert L. Richardson
Assistant Examiner: P. M. Pecori
Attorneys: R. J. Guenther and E. W. Adams, Jr.
`
ABSTRACT: This disclosure relates to a video conference system for a
plurality of groups of remotely located conferees. At each group
location, a plurality of video cameras are used and the field of each is
restricted to a small number of persons in the group. Voice voting and
switching are used to determine the location of the person in the group
who is talking and to "enable" the appropriate camera, in response
thereto, so that the talker will be seen at the remote location. As
different people in the group speak, the appropriate cameras covering
the same are successively enabled so that the outgoing video signal
matches the audio signal. Operational features include a graphic mode,
for the remote display of written or graphic material, and a conference
leader mode, in which the switching is biased in favor of the leader so
as to give him substantial control over the conference.
`
`CSCO-1007
`CISCO SYSTEMS, INC. / Page 1 of 21
`
`
`
DRAWINGS (Sheets 1 through 10; Patented Aug. 24, 1971; Inventors R. C. Edson, D. Mitchell, G. P. Reid)
Sheet 1 of 10: FIG. 1.
Sheet 2 of 10: FIG. 2 (blocks: audio splitting and isolation network, voting circuit, switch control logic, mode selector, video switch, audio conference set, MODEM, transmission facility, remote location video conference apparatus).
Sheets 3 through 8 of 10: detailed schematics (legible labels: integrating MV, one-shot MV, delay, driver).
Sheet 9 of 10: FIG. 10 (arrangement of FIGS. 4 through 9) and FIG. 11 (sync generator, MON-2, MON-3, incoming signal, outgoing signal).
Sheet 10 of 10: FIG. 12 (speech, trigger level, delay, time, reset signal, hangover time) and FIG. 13 (indicator lamp).
`
`
`
VIDEO CONFERENCE SYSTEM USING VOICE-SWITCHED CAMERAS

BACKGROUND OF THE INVENTION
This invention relates to visual telephone systems and more
particularly, to a video system for conference connecting two or more
groups of remote conferees in a manner which approaches a true
face-to-face conference situation.
Visual telephone systems presently provide communication between at
least two locations. With the use of wide-angle lenses at these
locations, a video conference can be provided for two groups of remotely
located conferees. Even though such arrangements are somewhat expensive,
it has been recognized for some time that this type of communication has
the potential of greatly reducing travel and thus justifying substantial
expense. Obviously, the reduction of travel not only saves travel
expenses, but even more importantly, the time of highly paid personnel.
Now this wide-angle lens approach is acceptable if each of the groups of
conferees is small in number. To achieve good visual contact (i.e. to
approximate a true face-to-face conference situation) it is not
practical to try to view more than a few people (e.g. three or four) at
a time. As the number of conferees in a group increases, it becomes
increasingly difficult to identify the conferees at the other location
and specifically the particular person talking at a given time.
Present day commercial television has, at times, provided programs which
contain discussions between two groups of remote conferees. In some
instances, a technician at each group location manually points or aims
the television camera at the person presently talking and may even
manually "zoom" in on the speaker to achieve good visual contact. In
other cases, several fixed cameras are used and the technician manually
camera-switches between the participants of the conference in order to
display to the viewing audience the person then talking. These prior art
approaches to a true face-to-face conference situation have not been
entirely satisfactory. The technicians are expensive and of course they
are fallible. It often happens that the camera is aimed at the wrong
person, i.e. at someone other than the present speaker. If conferencing
by way of visual telephone is to be at all possible, the luxury of
manual switching by video technicians can not be permitted.
Accordingly, the primary object of the present invention is to establish
a visual telephone conference connection between at least two groups of
remote conferees which closely approximates a true face-to-face
conference situation.
A related object of the invention is to provide a video conference
arrangement which utilizes voice-controlled switching to automatically
direct the field of view of the participants at one end of the line
toward the source of speech at the other end.

SUMMARY OF THE INVENTION
In accordance with the present invention two, or more, groups of
remotely located conferees are connected by a two-way video conference
system which, in function, approaches a true face-to-face conference
situation. At each location, a plurality of video cameras are used and
the field of each is restricted to a relatively small number of people
who can be seen well enough to provide good visual contact. Voice voting
and switching are used to determine the location of the person in the
group who is talking, and in response thereto the appropriate camera is
enabled so that the talker will be seen at the remote location. To this
end, a plurality of microphones, equal in number to the video cameras,
are positioned before a group; the microphone positions with respect to
the group correspond to fields of view of the cameras. The location of
the person who is speaking is determined by the level of speech signals
generated in each of the microphones. In response to the loudest speech
signal, a voting circuit causes the camera which is covering the
microphone generating the loudest speech signal to be enabled. And it is
this video image that is transmitted to the remote location along with
the audio signal. As different people in the group speak, in turn, the
appropriate cameras covering the same are successively enabled so that
the outgoing video provides a good visual image of the person when
talking. A corresponding operation takes place at the other location,
i.e. the video conferencing is two-way.
It is a feature of the invention to provide a group of conferees with a
display of the outgoing video. Thus, each conferee sees an image of the
person in his group who is presently talking, even though he might not
be able to see the talker directly because of intervening conferees.
This feature also provides a "self-view" so that a person can verify the
fact that he is adequately covered by a camera.
A further feature of the invention is the provision of an "overview"
camera with a wide-angle lens so as to take in the whole group of
conferees at a given location. In the presence of a sustained silence
(e.g. 12 seconds) at a location, the switching reverts to the overview
camera. Thus, one end or location will periodically be given a view of
the whole group at the other end. Among other uses, this feature shows
how the conferees are seated and tells one end when one or more
conferees at the other end has left the conference room.
A still further feature of the invention is an optional graphic mode of
operation which permits the visual exchange of graphic or written
material. And in a still further modification of this, a combined
graphic-voice-switching mode of operation is possible. In this latter
mode, the system continually reverts to the graphic display, but other
cameras may be selectively voted in (i.e. enabled) in response to
sustained speech. This hybrid mode of operation is advantageous when
graphic material is being presented with the expectation that the same
will be commented on by local conferees.
In accordance with another feature of the invention an optional
conference leader mode of operation is provided. In this mode, the
switching system is biased in favor of the conference leader, so as to
provide him with a substantial degree of control over the conference at
his location. Such a bias is, of course, analogous to that appropriated
by a leader in a true face-to-face conference situation.

BRIEF DESCRIPTION OF THE DRAWINGS
The invention will be more fully appreciated from the following detailed
description when considered in connection with the accompanying drawings
in which:
FIGS. 1 and 2, when arranged as shown in FIG. 3, show a schematic block
diagram of a visual telephone system constructed in accordance with the
principles of the present invention;
FIGS. 4 through 9, when arranged as shown in FIG. 10, show a detailed
schematic drawing of the voting circuit, mode selector and switch
control logic, shown in block form in FIG. 2;
FIG. 11 shows a detailed schematic drawing of the video switching
network;
FIG. 12 illustrates certain waveforms useful in explanation of the
invention; and
FIG. 13 shows a typical relay driver circuit.

DETAILED DESCRIPTION
Turning now to the drawings, FIGS. 1 and 2 show in schematic block
diagram a visual telephone system, which conference-connects two groups
of remotely located conferees. For purposes of illustrating the various
features and aspects of the invention only a two-group conference
situation need be considered. However, as will be evident hereinafter,
the features of the invention are in no way limited thereto and have
equal applicability to a three-group conference, a four-group one, etc.
For more than two groups of remote conferees, some additional switching
should be employed to interconnect automatically the remote groups. This
additional switching can be of the same nature as that disclosed in the
copending application of I. Dorros, D. B. Robinson, Ser. No. 646,525,
filed June 16, 1967, now Pat. No. 3,519,744.
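The voting behavior described in the Summary (enable the camera covering the loudest microphone, and revert to the overview camera after a sustained silence) can be illustrated with a minimal software sketch. This is only an illustration under stated assumptions: the patent describes an analog circuit, and the function name, the `OVERVIEW` label, and the single `speech_threshold` parameter are hypothetical. The 12-second figure is the example value given in the text.

```python
from typing import Dict, Optional

OVERVIEW = "overview"      # hypothetical label for the wide-angle camera
SILENCE_REVERT_S = 12.0    # "e.g. 12 seconds" of sustained silence


def select_camera(levels: Dict[str, float],
                  speech_threshold: float,
                  silence_s: float,
                  current: Optional[str]) -> Optional[str]:
    """Pick the camera to enable at one decision instant.

    levels           -- speech level per region, e.g. {"A": .., "B": .., "C": ..}
    speech_threshold -- assumed level above which a signal counts as speech
    silence_s        -- how long every microphone has been below threshold
    current          -- camera enabled so far (None if none yet)
    """
    loudest = max(levels, key=levels.get)
    if levels[loudest] >= speech_threshold:
        return loudest            # camera covering the loudest microphone
    if silence_s >= SILENCE_REVERT_S:
        return OVERVIEW           # sustained silence: show the whole group
    return current                # short pause: last talker stays on camera
```

For example, `select_camera({"A": 0.9, "B": 0.2, "C": 0.1}, 0.5, 0.0, None)` selects region A, while the same call with all levels below the threshold and `silence_s=12.0` selects the overview camera.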
`
`
`
The visual telephone system of FIGS. 1 and 2 comprises a near end or
proximate location, shown in detail, and a far end or remote location,
indicated by reference numeral 20. The apparatus and modes of operation
for the two locations are the same and hence only the one location need
be covered in detail herein.
A typical terminal or conference location is schematically shown in plan
in FIG. 1. Variations, particularly in the physical arrangement, will be
evident hereinafter and hence it should be clear that the principles of
the invention are in no way limited to the arrangement illustrated. For
example, three cameras C_A, C_B, C_C are used in FIG. 1 to cover the
local group of conferees, but two, or four, or five cameras can just as
readily be utilized, in the manner to be described, with only minor
modification of the station equipment. Other variations of the same
nature will be evident.
A table 11, of a nondescript nature, is shown to have 10 chairs 12
disposed along its length, one chair per conferee. A second row of
chairs can, if necessary, be placed directly behind chairs 12. Three
video cameras C_A, C_B, and C_C are shown and the field of each is
restricted to a sufficiently small number of people (four in this case)
who can be seen well enough to provide good visual contact.
The fields of view of the cameras are designated A, B, and C,
respectively. For each of these fields or regions there is also provided
a microphone (M_A, M_B, M_C) and a typical television receiver or
monitor (MON-1, MON-2, MON-3). The microphones are placed on table 11
more or less centrally disposed with respect to the field of view of the
associated camera, e.g., microphone M_A is approximately centered with
respect to field or region A of camera C_A. The monitors are set across
the table and preferably are large (e.g., 24 inches) so that the images
of the distant parties that appear thereon are about life size. A pair
of loudspeakers LS_1 and LS_2 can also be positioned on the table, as
shown, or, alternatively, they can hang down from the ceiling in a known
manner. Location of the loudspeakers should be such that acoustic
coupling to the microphones is minimized.
An additional camera C_O is provided with a wide-angle lens so that it
takes in the whole group of conferees; this camera is designated
hereinafter as the "overview" camera. A further camera C_G is typically
mounted in the ceiling of the conference room and it is provided with a
"zoom" lens system so that it can view graphic or written material
disposed on the table therebelow. The zooming is carried out
electromechanically under pushbutton control, the button being located
near either, or both, of the middle chair locations. A fourth monitor
MON-4 is centrally disposed with respect to the group of conferees and
it normally displays the outgoing video signal. The cameras and monitors
are typically at different elevations so as not to interfere with the
respective views thereof.
A pushbutton assembly (not shown) is used to select the mode of
operation and it is placed adjacent one of the middle chair locations,
preferably near the chair intended for the conference leader. The cough
buttons CB_A, CB_B, and CB_C are located as shown in FIG. 1 and these
may be used as desired to prevent a cough turning on a camera or to
assure privacy for a side conversation at a given location.
Since the conferees are preferably seated in a normal or natural
fashion, i.e., at uniformly spaced positions, the fields of view or
regions A, B, and C of cameras C_A, C_B, and C_C will overlap and some
conferees will, of course, be located in the midregions A-B and B-C. It
is a particularly advantageous feature of the voting circuit of the
present invention to positively detect when a speaker is in such a
midregion and to eliminate all possible camera-switching ambiguities
that might result therefrom.
The microphones M_A, M_B, and M_C are used for both audio and location
detection (i.e., location of the talker) purposes and hence the output
of each is initially coupled to an audio splitting and isolation network
13. The latter network delivers a respective portion of the speech
energy of each microphone to the voting circuit 14, with the remaining
portions of the speech energies then combined and delivered to the audio
conference set 10. To establish an audio conference connection between
two or more groups of conferees at remote locations and to assure
sufficient volume at each location, it is common practice to use voice
switching of speech to reduce the problems of echo and singing due to
acoustic feedback. The audio conference set 10 is utilized herein to
these ends and any one of several known voice-switching networks can
advantageously be used in the present system. For example, the audio
conference set 10 can be of the type disclosed in the article "General
Transmission Considerations in Telephone Conference Systems" by D.
Mitchell, IEEE Transactions on Communication Technology, Feb. 1968, Vol.
Com-16, No. 1, pages 163-167. The incoming audio signal from the remote
location is coupled to the loudspeakers LS_1 and LS_2 via this audio
conference set.
The voting circuit 14 serves to detect the location of the talker in the
group. The speech energies from the microphones M_A, M_B, and M_C are
compared in the voting circuit and a decision is made as to which is the
strongest. This is done on the basis of the speech envelope. If the
speech energy from microphone M_A is the strongest, an appropriate
signal is delivered by the voting circuit 14 to the switch control logic
15 which, in response thereto, serves to "enable" camera C_A so that the
remote conferees see the talker who is in region A.
As the name implies, the mode selector 16 serves to select the desired
mode of operation at that location. This selection is done manually by
depressing the appropriate pushbutton. There are four modes of operation
and each will be covered in detail hereinafter.
The switch control logic 15 receives the output signals from the voting
circuit 14 and in response thereto, and in accordance with the mode
established in mode selector 16, it delivers the appropriate signals to
the video switch 17 to selectively connect the video cameras and the
receiver monitors to the outgoing and incoming video lines. The possible
permutations in the connections established in the video switch 17, in
response to signals from the control logic 15, are too numerous to be
here set forth; these will be set forth in detail below.
In addition to selectively energizing the video switch 17, numerous
other functions are carried out by the switch control logic 15. For
example, control logic 15 contains memory to decide which camera should
be selected when a talker is in the midregion between two cameras, and
memory to keep a camera activated or enabled during pauses in speech. It
also includes circuitry which initiates a reversion to the overview
camera C_O, or in another instance to the graphic camera C_G, in the
presence of a sustained silence. These and other functions of the
control logic 15 will be covered in detail later.
The video switch, in response to the enabling signals from control logic
15, establishes the necessary video interconnections in accordance with
the desired functional modes of operation set forth below. When a camera
is said to be enabled, it is in fact connected via the video switch 17
to the outgoing or incoming video line, as the case may be.
To prevent the loudspeakers from initiating a camera switching
operation, the incoming audio signal, delivered to the speakers LS_1 and
LS_2, is also coupled to the control logic 15 where it performs an
inhibit operation.
The video switch 17 and the audio conference set 10 are each 4-wire
connected to the MODEM 18. The word MODEM is a commonly used acronym for
the modulator-demodulator apparatus of a transmitting-receiving terminal
or station. That is, a MODEM comprises all the necessary apparatus
forming the interface between the terminal equipment, of whatever
nature, and the transmission facility. This interface apparatus
modulates the outgoing signals (i.e., the video and audio) onto distinct
and appropriate carriers, and for the incoming signals it demodulates
each and delivers the same to the appropriate station equipment.
The transmission facility 19 may comprise any of the known transmission
links such as coaxial cable, radio relay, etcetera. It will be obvious
to those in the art that the station equipment in accordance with the
present invention is in no way limited to any particular transmission
facility or interface apparatus.
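Two of the control-logic functions described above (keeping a camera enabled during pauses in speech, and inhibiting camera switching while incoming audio is driving the loudspeakers) can be sketched in software as follows. This is a simplified illustration, not the relay and gate circuitry of the patent; the class name, method name, and boolean `far_end_audio_active` input are assumptions introduced for the example.

```python
from typing import Optional


class SwitchControlSketch:
    """Hypothetical stand-in for two functions of switch control logic 15."""

    def __init__(self) -> None:
        self.enabled: Optional[str] = None   # camera currently enabled

    def update(self, vote: Optional[str], far_end_audio_active: bool) -> Optional[str]:
        """Apply one voting-circuit decision.

        vote                 -- region voted by the voting circuit, or None
                                when no local speech is detected
        far_end_audio_active -- True while incoming audio feeds the
                                loudspeakers (the inhibit condition)
        """
        if far_end_audio_active:
            # Inhibit: loudspeaker sound picked up by the microphones
            # must not cause a camera switch.
            return self.enabled
        if vote is not None:
            self.enabled = vote
        # vote is None: a pause in speech, so the last camera is kept.
        return self.enabled
```

A short usage example: after `update("A", False)` the sketch reports camera A; a following `update("C", True)` still reports A because the far-end audio inhibits the switch, and `update(None, False)` keeps A through the pause.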
`Before proceeding with the detailed explanation of the
`schematic diagram of FIGS. 4 through 9 and the numerous
`operations thereof, it should prove advantageous to set forth
`at this point the four basic modes of operation of the video
`conference system. Each of these operating modes is available
`at each location.
Normal Mode
In this mode a conferee will see whatever video is being sent from the
remote end on monitors MON-1, MON-2 and MON-3. The conferee also sees
the outgoing video, sent from the local station to the remote one, on
the centrally disposed, overhead monitor MON-4. Speech from anyone in
the A, B or C regions will vote in (i.e., enable) the proper camera so
as to show the speaker. Thus, the outgoing video will in this instance
match the audio. The last speaker will remain on camera for a short time
(e.g. several seconds) unless someone else talks. When someone else, in
a different region, talks the camera covering him is enabled and the
previously enabled camera is disabled. If no one talks for a given
period, the overview camera C_O is enabled so as to show the whole group
of conferees to those at the remote end. A conferee in a midregion is
covered by two cameras; when such a conferee talks one, or the other, of
the two cameras will be enabled in accordance with "memory logic" in the
switch control logic 15.
Locked Graphic Mode
In this mode the graphic camera C_G is locked to the outgoing video line
and it is also connected to the three local monitors MON-1, MON-2, and
MON-3 for local viewing of the graphic material. The monitor MON-4 now
shows the video signal from the remote end. No other camera (e.g. C_A,
C_B, C_C) can be connected or enabled with the system in this mode,
i.e., no voice-controlled camera switching can occur.
Automatic Graphic Mode
This is similar to the locked graphic mode except that sustained speech
in region A or C will vote in camera C_A or C_C. A pause of a few
seconds, or even a brief speech by someone in region B, switches the
system back into the graphic mode. Thus, the system is, in this case,
biased in favor of the graphic mode.
Conference Leader Mode
This mode is used for lectures or for any other situation in which it is
desired to view the conference leader as much as possible. The leader
will sit at one of the middle chair locations, in region B. A sustained
speech in region A or C is required to vote in camera C_A or C_C. And a
short pause in the latter or a brief speech from region B, once again
enables camera C_B. Thus, the system is biased in favor of the
conference leader positioned in region B. The monitors MON-1, MON-2 and
MON-3 show the video from the far end, while MON-4 displays the outgoing
video.
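The way the four modes treat a speech vote can be summarized in a small decision function. This is a hedged sketch of the behavior described above, not the patent's logic circuitry: the mode strings, the single-letter camera labels ("A", "B", "C" for the region cameras and "G" for the graphic camera), and the `sustained` flag are all names assumed for the example, and `None` is used to mean "no change".

```python
from typing import Optional


def camera_voted_in(mode: str, region: str, sustained: bool) -> Optional[str]:
    """Return the camera a speech vote enables, or None for no change.

    mode      -- "normal", "locked_graphic", "automatic_graphic" or "leader"
    region    -- region of the talker: "A", "B" or "C"
    sustained -- True if the speech has lasted long enough to count
                 as "sustained" (duration is not modeled here)
    """
    if mode == "normal":
        return region                      # any speech votes in its camera
    if mode == "locked_graphic":
        return None                        # no voice-controlled switching
    if mode == "automatic_graphic":
        if region in ("A", "C"):
            return region if sustained else None
        return "G"                         # speech in B reverts to graphic
    if mode == "leader":
        if region in ("A", "C"):
            return region if sustained else None
        return "B"                         # brief speech from B suffices
    raise ValueError(f"unknown mode: {mode}")
```

Reversion on a pause (to the overview camera, the graphic camera, or the leader's camera, depending on mode) is deliberately left out of this sketch to keep it to the voting rule alone.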
As the name would imply, the normal mode is the one normally utilized.
The following description will, therefore, consider the detailed logic
circuitry and its functions with regard to this mode. The interaction of
the various ancillary features (e.g. reverting) and alternative
operating modes (e.g. graphic and leader) will then be subsequently
covered in detail.
Turning now to FIGS. 4 through 9, and first to FIG. 4, the output
signals of microphones M_A, M_B and M_C are coupled via the preamp
stages 401, 402 and 403, the impedance-matching transformers 405, 406
and 407 and the buffer or isolation amplifiers 410, 411 and 412 to the
band-pass filters 413, 414 and 415. The filters have the same passband
(e.g. 600-3,200 Hz.) and are used primarily to filter out nonspeech
sounds. The microphone outputs are, as heretofore indicated, also used
for audio conferencing purposes and, to this end, a portion of each
microphone output is coupled, via the respective isolation amplifiers
425, 426 and 427, to the four-way resistance pad 428. This pad is
conventional and serves merely to combine the microphone output signals
and thence delivers the same to the audio conference set 10.
`
The output signals of filters 413-415 are delivered to the voting
circuit 14 for the purpose of detecting the location of the talker in
the group. This determination of location is made by a comparison of the
amplitudes of the speech envelopes picked up by the microphones. When a
talker is decidedly in one, and only one, given region (i.e., A, B or
C), a simple amplitude voting operation takes place. The voice-operated
voting circuit, however, also determines if the talker is located in a
midregion by comparing the amplitude of the speech energy received by
adjacent microphones. When the difference in received energy is less
than a preset value (e.g. 2 db.) the signal will be recognized as one
coming from a midregion between two microphones. As will be covered
hereinafter, the physical width of the microphone midregions can be
varied and they preferably should correspond to the camera midregions
(A-B, B-C). When it has been determined that the talker is in a
midregion, a decision must be made to turn on one of the two adjacent
cameras; the control logic 15 makes this decision in a manner which will
be covered hereinafter.
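The midregion test described above (adjacent microphones whose envelopes differ by less than a preset value such as 2 db.) can be expressed as a short function. This is a functional sketch only, assuming positive envelope voltages and a hypothetical function name; the patent performs the comparison with analog comparators, and the tie-breaking used here (prefer the stronger adjacent neighbor) is an assumption of the example.

```python
import math
from typing import Dict, Tuple

REGIONS = ("A", "B", "C")   # microphone order along the table


def locate_talker(envelopes: Dict[str, float],
                  midregion_db: float = 2.0) -> Tuple[str, ...]:
    """Return ("A",) for a single region or ("A", "B") for a midregion.

    envelopes    -- positive envelope voltage per region
    midregion_db -- preset difference below which two adjacent
                    microphones are treated as equally strong
    """
    strongest = max(REGIONS, key=lambda r: envelopes[r])
    i = REGIONS.index(strongest)
    close = []
    for j in (i - 1, i + 1):                 # only adjacent microphones
        if 0 <= j < len(REGIONS):
            neighbor = REGIONS[j]
            diff_db = 20.0 * math.log10(envelopes[strongest] / envelopes[neighbor])
            if diff_db < midregion_db:
                close.append(neighbor)
    if not close:
        return (strongest,)
    partner = max(close, key=lambda r: envelopes[r])
    return tuple(sorted((strongest, partner)))
```

With envelopes of 10, 10 and 3 volts the talker is placed in midregion A-B; with 10, 7 and 3 volts the difference between A and B is about 3.1 db., so only region A is reported.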
Considering the voting circuit now in greater detail, the output signals
of filters 413, 414 and 415 are respectively delivered to three
full-wave, voltage doubler rectifiers 423, 424 and 425 which, as will be
recognized, are of a conventional design. The rectified outputs are
smoothed by the capacitors shown. Two transistors are connected to each
rectifier output. For example, the bases of transistors 431 and 441 are
connected across the output of rectifier 423, with the base of
transistor 441 being connected, of course, via the potentiometer 426. As
indicated, the three potentiometer arms are preferably ganged. The
transistors 431-433 and 441-443 are also connected in a two-stage,
common emitter, comparator configuration. That is, the transistors 431,
432 and 433 have their emitters connected to the source -V via the
common emitter resistance 450, and the transistors 441, 442 and 443
likewise have their emitters connected to said source via the common
emitter resistance 451. The transistors 461, 462 and 463 comprise
conventional emitter follower stages.
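The rectify-and-smooth step that produces the speech envelope can be approximated digitally. The sketch below is an assumption-laden analogue of the rectifier and smoothing capacitor, not a model of the voltage doubler itself: it takes the absolute value of each sample (full-wave rectification) and applies a one-pole low-pass filter whose time constant `time_constant_s` is a hypothetical parameter not given in the text.

```python
import math
from typing import List, Sequence


def speech_envelope(samples: Sequence[float],
                    sample_rate_hz: float,
                    time_constant_s: float = 0.02) -> List[float]:
    """Full-wave rectify and smooth a sampled signal.

    The smoothing is a one-pole RC-style low-pass filter; the default
    time constant is an illustrative value, not one from the patent.
    """
    alpha = 1.0 - math.exp(-1.0 / (sample_rate_hz * time_constant_s))
    level = 0.0
    envelope = []
    for x in samples:
        rectified = abs(x)                  # full-wave rectification
        level += alpha * (rectified - level)  # capacitor-like smoothing
        envelope.append(level)
    return envelope
```

Because the filter is linear, halving the input amplitude halves the envelope, which is the property the amplitude comparison between microphones relies on.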
The comparator circuit operates in the following manner. Assume, first,
that the talker is in the midregion A-B and the signals to the
microphones M_A and M_B are thus substantially the same and produce a
voltage e at each rectifier output (i.e. rectifiers 423 and 424) equal
to 10 volts. Also, assume that the arm or tap of each potentiometer is
adjusted to provide a voltage e' of 7.95 volts at the tap point (note,
20 log (10/7.95) = 2 db.). Accordingly, the relative value of voltages
measured between each base and reference point 460, for the first set of
emitter-coupled transistors 431, 432 and 433, are such that transistor
431 conducts and transistors 432 and 433 are cut off. This cutoff of
transistors 432 and 433 is due to the high emitter current flow of
transistor 431 through the common emitter resistance 450. This operation
is typical of common emitter comparators. In the second set of
emitter-coupled transistors 441, 442 and 443, a corresponding operation
takes place and transistor 442 conducts and transistors 441 and 443 are
cut off. With transistors 431 and 442 conducting, the emitter follower
transistors 461 and 462 are caused to conduct and an energizing signal
is delivered to each of the output leads 471 and 472. This output is
indicative of the fact that the talker is intermediate region A and
region B, i.e., he is in midregion A-B.
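The 2 db. figure quoted for the potentiometer tap can be checked directly from the two example voltages. The worked equation below uses only the values given in the text (e = 10 volts, e' = 7.95 volts); the second line, which solves for the tap ratio at exactly 2 db., is an added illustration.

```latex
\[
20\log_{10}\frac{e}{e'} \;=\; 20\log_{10}\frac{10}{7.95}
\;\approx\; 20 \times 0.0996 \;\approx\; 1.99\ \text{db.} \;\approx\; 2\ \text{db.}
\]
\[
\frac{e'}{e} \;=\; 10^{-2/20} \;\approx\; 0.794
\quad\Longrightarrow\quad e' \approx 7.94\ \text{volts for } e = 10\ \text{volts}.
\]
```

The tap voltage of 7.95 volts therefore corresponds, to within rounding, to a 2 db. difference between the full and the tapped rectifier output.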
The more common situation is where the talker is decidedly in one, and
only one, given region. Assume, for this case, that the talker is in
region A and the signal to microphone M_A is such as to provide an
output voltage e from rectifier 423 of 10 volts and a voltage e from
rectifier 424 of something less than 7.95 volts. The output of rectifier
425 will, of course, be even less than that of rectifier 424. For the
first set of emitter-coupled transistors 431, 432 and 433, the
transistor 431 conducts and transistors 432 and 433 are cut off. In the
second set of emitter-coupled transistors 441, 442 and 443, the
transistor 441 conducts since its input (7.95 volts) is greater than the
input to transistor 442. This is because the output of rectifier 424 was
assumed to be something less than 7.95 volts. Since transistor 441 is
conducting, transistors 442 and 443 are cut off and only the voting
circuit output lead 471 is energized. This output is indicative of the
fact that the talker is located in, and only in, region A.
The ganged potentiometers control the physical width of the midregions
between adjacent microphones. The greater the difference between the
voltages e and e', the larger the midregions, and, conversely, the
smaller this difference, the smaller the midregions. The microphone
midregions should correspond more or less to the overlap or midregions
defined by the cameras. This preferred setting of the potentiometers can
be arrived at empirically by talking in a known midregion location and
then while talking in a monotone gradually shift position until a camera
switching occurs. The display on local monitor MON-4 will provide an
indication of the degree of correspondence between the microphone and
camera midregions.
The zener diodes 481, 482 and 483 serve to prevent the associated
transistors from going into saturation; this extends the operating range
of the comparison circuitry.
To prevent a cough from turning on a camera, the cough button contacts
CB_A, CB_B, and CB_C are connected between ground and the output leads
of AND gates 511, 512 and 513, respectively. When a cough button is
depressed, the make contact thereof shorts the appropriate AND gate
output to ground and hence camera switching in response to a cough is
prevented.
For the normal mode, the occurrence of a sustained silence results in
the switching reverting to the overview camera C_O. For the automatic
graphic mode, the occurrence of a sustained silence of given duration
results in the reverting of the switching circuit back into the graphic
mode, i.e., camera C_G is enabled. And for the conference leader mode, a
sustained silence results in the reversion of the switching to camera
C_B, which covers the leader. The signal that initiates this reversion
is generated in the automatic reverting circuit 800 of FIG. 8, which
will be described in detail hereinafter. This reverting signal is
delivered to the input of the Schmitt trigger 504. The reverting signal
is in the nature of an RC-charging waveform, which, in the presence of
`a sustained silence, increases until it reaches the threshold
`The output signals of the voting circuit 14 are coupled to
`value of the Schmitt trigger circuit 504. The Schmitt trigger
`the analog to digital interface circuit 500, of FIG. 5. As the
`then goes to its "one' state and re