Baker

[54] TELECONFERENCING IMAGING SYSTEM WITH AUTOMATIC CAMERA STEERING

[75] Inventor: Robert G. Baker, Delray Beach, Fla.
[73] Assignee: International Business Machines Corporation, Armonk, N.Y.

[21] Appl. No.: 496,742
[22] Filed: Jun. 30, 1995

Related U.S. Application Data

[63] Continuation-in-part of Ser. No. 281,331, Jul. 27, 1994, Pat. No. 5,508,734.
348/214; 348/586
Field of Search: 348/53, 580; H04N 7/18; 36
`
`
`
`
`
US005686957A
[11] Patent Number: 5,686,957
[45] Date of Patent: Nov. 11, 1997

[56] References Cited

U.S. PATENT DOCUMENTS

4,264,928   4/1981   Schober   348/15
4,980,761   12/1990  Natori    348/15
5,508,734   4/1996   Baker     348/36

Primary Examiner: Howard W. Britton
Attorney, Agent, or Firm: Richard A. Tomlin; John C. Black; Malin, Haley, DiMaggio & Crosby, P.A.

[57] ABSTRACT

An automatic, voice-directional video camera image steering system specifically for use for teleconferencing that electronically selects segmented images from a selected panoramic video scene, typically around a conference table, so that the participant in the conference currently speaking will be the selected segmented image in the proper viewing aspect ratio, eliminating the need for manual camera movement or automated mechanical camera movement. The system includes an audio detection circuit from an array of microphones that can instantaneously determine the direction of a particular speaker and provide directional signals to a video camera and lens system that provides a panoramic image and that can electronically select portions of that image and, through warping techniques, remove any distortion from the most significant portions of the image, which lie from the horizon up to approximately 30 degrees in a hemispheric viewing area.

14 Claims, 7 Drawing Sheets
`
[Figure: block diagram with CAPTURE LENS, VIDEO CAMERA, AUDIO DIRECTION PROCESSOR, VIEW WARPING LOGIC, and PC OR CONTROLLER]

U.S. Patent    Nov. 11, 1997    Sheet 1 of 7    5,686,957

[Figure: drawing sheet 1]

Sheet 2 of 7

[Figure: block diagram with PANORAMIC IMAGE TRANSFORM PARAMETERS LOOKUP TABLE, NTSC TO DIGITAL CONVERSION, WARP ENGINE, ROW/COL, HOST BUS, SOURCE IMAGE BUFFER, INTERPOLATION COEFFICIENT BUFFER, MULTIPLY/ACCUMULATE UNIT, WARP IMAGE BUFFER, and MEMORY CONTROL]

[Figure: block diagram with CAPTURE LENS, VIDEO CAMERA, VIEW WARPING LOGIC, PC OR CONTROLLER, and AUDIO DIRECTION PROCESSOR]

Sheet 3 of 7

[Figure: drawing sheet 3]

Sheet 4 of 7

[Figure: drawing sheet 4]

Sheet 5 of 7

[Figure: drawing sheet 5]

[Figure: FIG. 4]

[Figure: block diagram with SOURCE ADDRESS, SOURCE IMAGE BUFFER, SOURCE DATA, INTERPOLATION COEFFICIENT BUFFER, MULTIPLY/ACCUMULATE, WARP IMAGE BUFFER, DESTINATION ADDRESS, and IMAGING WARPING CIRCUITS]

Sheet 7 of 7

[Figure: drawing sheet 7]

TELECONFERENCING IMAGING SYSTEM WITH AUTOMATIC CAMERA STEERING

This application is a continuation-in-part of U.S. patent application Ser. No. 281,331, filed Jul. 27, 1994, now U.S. Pat. No. 5,508,734.

BACKGROUND OF THE INVENTION

1. Field of the Invention
This invention relates to a video conferencing system that has automatic, voice-directional camera image steering, and specifically to a teleconferencing system that employs automatic video image selection of the current participant speaking, electronically selected from a panoramic video scene.
2. Description of the Prior Art
Teleconferencing provides for the exchange of video and audio information between remotely separated participants. Typically, a first group of participants is arranged around a conference table or seated strategically in a conference room and telecommunicating with a second group of participants similarly situated at a remote location. One or more video cameras at each location create video images of the participants through manual manipulation of each video camera, normally directed at the participant speaking at the moment. Microphones at each location provide for sound transmission signals. The video image and audio voice signals are then transmitted to the remote location. The video image is projected onto a large screen or other type of video display which also would include audio outputs for providing the sounds.
Manual manipulation of each video camera at each conference site is required to change the direction of each camera to different participants as speakers change, unless a large overall view of all the participants is maintained. Such a process is labor intensive. Also, image content and perspective, dependent on the location of the video camera relative to the participants, contribute to the quality of the final visual display available to the participants watching the display screen. The quality of the image and the scene content all contribute to the overall effectiveness of the telecommunication process. In particular, in a setting such as a conference table in a conference room, a hemispheric or panoramic viewpoint would be much more efficient for video image capture of surrounding selected participants. With a hemispheric scene, certain efficiencies are gained by eliminating large areas that are unused scene content while concentrating on a band of hemispheric areas populated by the teleconferencing participants. Therefore, it is believed that hemispheric or panoramic electronic imaging would be greatly beneficial to a teleconferencing environment, especially when controlled with audio directional processors. The selected video image is taken from a desired segment of a hemispherical view in the correct video aspect ratio. A centralized panoramic image capture system which already has a distorted picture of the hemisphere bounded by the plane of the table upward selects a portion of the scene and warps the image to correspond to a normal aspect ratio view of the person speaking. The signal can be converted to whatever display format is desired for transmission to a remote location. The present invention has incorporated, in one automated system, audio beam steering and electronically selectable subviews of a much larger panoramic scene. The video subviews can be converted to an NTSC display format for transmission to a remote location for video display.
The collection, storage, and display of large areas of visual information can be an expensive and difficult process to achieve accurately. With the recent increased emphasis on multimedia applications, various methods and apparatuses have been developed to manage visual data. A unique class of multimedia data sets is that of hemispheric visual data. Known multimedia methods and apparatuses attempt to combine various multimedia imaging data, such as still and motion (or video) images, with audio content using storage media such as photographic film, computer diskettes, compact discs (CDs), and interactive CDs. These are used in traditional multimedia applications in various fields, such as entertainment and education. Teleconferencing is an application where automated electronic selection of scene content would result in greatly improved usability. Non-multimedia applications also exist that would employ hemispheric visual data, such as in security, surveillance, unmanned exploration, and fire and police situations. However, as will be described below, the known methods and apparatuses have certain limitations in capturing and manipulating valuable information and hemispheric scenes in a rapid (i.e., real-time) and cost effective manner.
One well known multimedia technique is used at theme parks, wherein visual information from a scene is displayed on a screen or collection of screens that covers almost 360 degrees field of view. Such a technique unfortunately results in the consumption of vast quantities of film collected from multiple cameras, requires specially designed carriages to carry and support the cameras during filming of the scene, and necessitates synchronization of shots during capture and display. The technique is also limited in that the visual image cannot be obtained with a single camera nor manipulated for display, e.g., pan, tilt, zoom, etc., after initial acquisition. Hence, this technique, while providing entertainment, is unable to fulfill critical technical requirements of many functional applications.
Other known techniques for capturing and storing visual information about a large field of view (FOV) are described in U.S. Pat. Nos. 4,125,862; 4,442,453; and 5,185,667. In U.S. Pat. No. 4,125,862, a system is disclosed that converts signal information from a scene into digital form, stores the data of the digitized scene serially in two-dimensional format, and reads out the data by repetitive scan in a direction orthogonally related to the direction in which the data was stored. U.S. Pat. No. 4,442,453 discloses a system in which a landscape is photographed and stored on film. The film is then developed, with display accomplished by scanning with electro-optical sensors at “near real-time” rates. These techniques, however, do not provide instant visual image display, do not cover the field of view required for desired applications (hemispheric or 180 degrees field-of-view), do not generate visual image data in the format provided by the techniques of this invention, and are also not easily manipulated for further display, e.g., pan, tilt, etc.
The technique disclosed in U.S. Pat. No. 5,185,667 overcomes some of the above-identified drawbacks in that it is able to capture a near-hemispheric field of view, correct the image using high speed circuitry to form a normal image, and electronically manipulate and display the image at real-time rates.
For many hemispheric visual applications, however, even U.S. Pat. No. 5,185,667 has limitations in obtaining sufficient information of critical and useful details. This is particularly true when the camera is oriented with the central axis of the lens perpendicular to the plane bounding the hemisphere of acquisition (i.e., lens pointing straight up). In such applications, the majority of critical detail in a scene is contained in areas of the field along the horizon and little or no useful details are contained in central areas of the field
`
`
located closer to the axis of the lens (the horizon being defined as the plane parallel to the image or camera plane and perpendicular to the optical axis of the imaging system). For example, in surveillance, the imaging system is aimed upward and the majority of the critical detail in the scene includes people, buildings, trees, etc., most of which are located within only a few degrees along the horizon (i.e., this is the peripheral content). Also, in this example, although the sky makes up the larger central arc of the view, it contains little or no useful information requiring higher relative resolution.
To obtain sufficient detail on the critical objects in the scene, the technique should differentiate between the relevant visual information along the horizon and the remaining visual information in the scene in order to provide greater resolution in areas of higher importance. U.S. Pat. No. 5,185,667 does not differentiate between this relevant visual information contained along the horizon and the remaining visual information in this scene. Thus, it fails to yield a sufficient quality representation of the critical detail of the scene for projected applications.
Instead, techniques described above concentrate on obtaining, storing, and displaying the entire visual information in the scene, even when portions of this information are not necessary or useful. To obtain the near-hemispheric visual information, such techniques require specific lens types to map image information in the field of view to an image plane (where either a photographic film or electronic detector or imager is placed). Known examples of U.S. Pat. No. 5,185,667 and U.S. Pat. No. 4,442,453 respectively use a fish-eye lens and a general wide-angle lens. As these lenses map information of a large field without differentiation between the central and peripheral areas, information from the periphery will be less fully represented in the image plane than from the central area of acquisition.
In U.S. Pat. No. 4,170,400, Bach et al. describes a wide-angle optical system employing a fiber optic bundle that has differing geometric shapes at the imaging ends. Although this is useful in itself for collecting and repositioning image data, bending of light is a natural characteristic of optical fibers and not exclusive to that patent. Further, U.S. Pat. No. 4,170,400 employs a portion of a spherical mirror to gather optical information, rendering a very reduced subset of the periphery in the final imaging result. This configuration is significantly different from the multi-element lens combination described in the present invention.
Imperfections in the image representation of any field inherently result from the nature of creating an image with any spherical glass (or plastic) medium such as a lens. The magnitude of these imperfections increases proportionally to the distance a point in the field is from the axis perpendicular to the optical imaging system. As the angle between the optical axis and a point in the field increases, aberrations of the corresponding image increase proportional to this angle cubed. Hence, aberrations are more highly exaggerated in the peripheral areas with respect to more central areas of a hemispheric image.
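
The cubic dependence can be made concrete with a short calculation. The sketch below is illustrative only: the function name `relative_aberration` and the sample angles are assumptions chosen for the example, and the code models nothing beyond the stated proportionality to the angle cubed.

```python
def relative_aberration(angle_deg: float, reference_deg: float) -> float:
    """Ratio of the aberration at angle_deg to that at reference_deg,
    assuming aberration grows with the cube of the off-axis angle."""
    if reference_deg <= 0:
        raise ValueError("reference angle must be positive")
    return (angle_deg / reference_deg) ** 3


# Hypothetical comparison: a point 80 degrees off axis (near the horizon)
# against a point 20 degrees off axis (near the center of the field).
print(relative_aberration(80.0, 20.0))  # 64.0
```

Under this model, a fourfold increase in off-axis angle gives a 64-fold increase in aberration, which is why the periphery suffers most with conventional wide-angle lenses.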
Although the lens types above achieve a view of a large field, the valuable content from the peripheral areas lacks in potential image quality (resolution) mapping because the imaging device and system does not differentiate between these areas and the central areas of less valuable detail. Often, the difference between the imaging capabilities between the two areas is compensated for by using only the central portion of a lens to capture the scene (“stopping the lens down”). This works in effect to reduce the image quality of both areas such that the difference in error is a lesser percentage of the smallest area even the central area can resolve. Simultaneously, this compensation technique further degrades the performance of the lens by limiting the amount of light which is allowed to enter the lens, and thus reducing the overall intensity of the image.
More typically, the peripheral content imaged by a conventional lens is so degraded in comparison with the central area that the lens allows for only a minimal area of the periphery to be recorded by the film or electronic imager. As a result of these “off-axis” aberrations inherent to large field, the relevant information of the horizon in the scene can be underutilized or, worse yet, lost.
Another limitation in U.S. Pat. No. 5,185,667 is its organization for recording only views already corrected for perspective. The nature of that methodology is that the specific view of interest must be selected and transformed prior to the recording process. The result is that no additional selection of views can be accomplished after the storage process, reducing system flexibility from the user's perspective.
Hence, there is a demand in the industry for single camera imaging systems that efficiently capture, store, and display valuable visual information within a hemispheric field of view containing particularly peripheral content, and that allow electronic manipulation and selective display of the image post-acquisition while minimizing distortion effects. Such a system finds advantageous application in a teleconferencing environment in accordance with the present invention.
Limited control of video cameras is disclosed in the prior art. U.S. Pat. No. 4,980,761 issued to Natori, Sep. 25, 1990, describes an image processing system that rotates a camera for a teleconference system. A control unit outputs a drive signal based on an audio signal to control the movement of the image until the image controlling unit receives an operational completion signal. In this case, the rotational movement of the camera, moving the video image from one participant to another participant, alleviates having to view the camera movement. Once the camera stops skewing, the picture will then provide the proper aspect ratio. A plurality of microphones are provided to each attendant. A sound control unit then determines with a speaker detection unit which participant is speaking. U.S. Pat. No. 4,965,819 shows a video conferencing system for courtroom and other applications in which case each system includes a local module that includes a loud speaker, a video camera, a video monitoring unit and a microphone for each local conferee. U.S. Pat. No. 5,206,721 issued to Ashida, Apr. 27, 1993, shows a television conference system that allows for automatically mechanically moving and directing a camera towards a speaking participant. In this system a microphone is provided for each participant and is recognized by the control system. Image slew is corrected to avoid camera image motion. A review of these systems thus shows that the automation provided is very expensive and in every case requires individualized equipment for each participant.
Limited audio direction finding for multiple microphone arrays is known in the prior art. For example, a self steering digital microphone array defined by W. Kellerman of Bell Labs at ICASSP in 1991 created a teleconference in which a unique steering algorithm was used to determine direction of sound taking into account the acoustical environment in which the system was located. Also a two stage algorithm for determining talker location from linear microphone array data was developed by H. Silverman and S. Kirkman at Brown University and disclosed in April, 1992. The filtered cross correlation of the system is introduced as the locating algorithm.
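
The general idea of locating a talker from the time difference between two microphones can be sketched as follows. This is a minimal illustration using plain (unfiltered) cross-correlation and a far-field geometry, not the filtered cross-correlation of the cited work nor the method of the present invention; the function names, the speed-of-sound constant, and the sign convention are all assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def best_lag(a, b, max_lag):
    """Return the lag in samples that maximizes the cross-correlation of
    signals a and b. A positive lag means the sound reached microphone b
    later than microphone a."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += x * b[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def bearing_deg(lag, sample_rate, mic_spacing):
    """Angle of arrival in degrees from broadside (far-field assumption).
    sample_rate is in Hz and mic_spacing in meters."""
    s = SPEED_OF_SOUND * lag / (sample_rate * mic_spacing)
    s = max(-1.0, min(1.0, s))  # guard against lags beyond the physical limit
    return math.degrees(math.asin(s))


# Hypothetical pulse that arrives three samples later at microphone b.
mic_a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
mic_b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
lag = best_lag(mic_a, mic_b, max_lag=5)
print(lag, bearing_deg(lag, sample_rate=16000, mic_spacing=0.2))
```

With more than two microphones, pairwise estimates of this kind can be combined to resolve the direction around a full circle; a practical system would also need filtering and handling of room reverberation.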
A “telepresence” concept from BellCorp briefly described in IEEE Network Magazine in March, 1992 suggests a spherical camera for use in the teleconference system. However, the entire image is sent in composite form for the remote users to select from at the other end. The present invention is quite different and includes automated pointing and control including incorporation in one automated system of both audio beam steering and selectable subviews of a much larger panoramic scene.
`
SUMMARY OF THE INVENTION
The present invention comprises a video conferencing, voice-directional video imaging system for automatic electronic video image manipulation of a selected, directional signal of a hemispheric conference scene transmitted to a remote conference site. The system employs three separate subsystems for voice-directed, electronic image manipulation suitable for automated teleconferencing imaging in a desirable video aspect ratio.
The audio beam, voice pickup and directing subsystem includes a plurality of microphones strategically positioned near a predetermined central location, such as on a conference table. The microphone array is arranged to receive and transmit the voices of participants, while simultaneously determining the direction of a participant speaking relative to the second subsystem, which is a hemispheric imaging system used with a video camera. The third subsystem is a personal computer or controller circuits in conjunction with the hemispheric imaging system which ultimately provides automatic image selection of the participant speaking that is ultimately transmitted as a video signal to the remote video display at the remote teleconference location.
The hemispheric electronic image manipulator subsystem includes a video camera having a capture lens in accordance with the invention that allows for useful electronic manipulation of a segmented portion of a hemispheric scene. In a conference table setting, as viewed from the center of the conference table, participants are arranged around the table in the lower segment of the hemisphere, with the plane of the table top forming the base of the hemisphere. The electronic image is warped to provide a desired subview in proper aspect ratio in the audio selected direction.
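
One way to picture the selection of a subview in the audio-selected direction is as an angular window centered on the speaker. The sketch below is a simplification under stated assumptions: the function name `subview_window`, the 40 degree horizontal extent, and the treatment of angular extent as proportional to a 4:3 display aspect ratio are all hypothetical and not taken from the patent.

```python
def subview_window(speaker_azimuth_deg, h_extent_deg=40.0, aspect=(4, 3)):
    """Return (az_min, az_max, el_min, el_max) in degrees for a subview
    centered on the speaker's azimuth and rising from the horizon.

    Azimuths wrap around at 360 degrees. The vertical extent is derived
    from the horizontal extent and the aspect ratio (width, height)."""
    half = h_extent_deg / 2
    az_min = (speaker_azimuth_deg - half) % 360
    az_max = (speaker_azimuth_deg + half) % 360
    v_extent = h_extent_deg * aspect[1] / aspect[0]
    return az_min, az_max, 0.0, v_extent


# A speaker detected at 350 degrees: the window wraps past 0 degrees.
print(subview_window(350.0))  # (330.0, 10.0, 0.0, 30.0)
```

With these assumed values the window rises 30 degrees above the horizon, which happens to match the band the abstract identifies as most significant.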
The present invention provides a new and useful voice-directional visual imaging system that emphasizes the peripheral content of a hemispheric field of view using a single video camera. The invention allows user-selected portions of a hemispheric scene to be electronically manipulated, transmitted, and displayed remotely from the video camera in real-time and in a cost-effective manner.
The visual imaging system of the present invention involves a video image having a lens with enhanced peripheral content imaging capabilities. The lens provides an enhanced view of the valuable information in the scene’s periphery by imaging the field of view to the image plane such that the ratio of the size of the smallest detail contained within the periphery of the scene to the size of the smallest resolving pixel of an image device is increased. For this to be accomplished, the peripheral content must map to a larger percentage of a given image detector area and, simultaneously, the mapped image of the central area of the scene must be minimized by the lens so that it does not interfere with the peripheral content now covering a wider annulus in the image plane. Information in the image plane is then detected by the video camera. The detected information of the entire hemispheric scene is then stored as a single image in memory using traditional methods.
When a portion of the scene is to be displayed, the image information relating to the relevant portion of the scene is instantaneously retrieved from memory. A transform processor subsystem electronically manipulates the scene for display as a perspective-correct image on a display device, such as a teleconference display screen or monitor, as if the particular portion of the scene had been viewed directly with the video camera pointed in that direction. The transform processor subsystem compensates for the distortion or difference in magnification between the central and peripheral areas of the scene caused by the lens by applying appropriate correction criteria to bring the selected portion of the scene into standard viewing format. The transform processor subsystem can also more fully compensate for any aberrations of the enhanced peripheral image because of the image’s improved resolution as it covers a larger portion of the image device (increased number of pixels used to detect and measure the smallest detail in the periphery image). More pixels equates to more measurement data, hence more accurate data collection.
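
The kind of mapping a transform processor performs can be sketched in two steps: find where a viewing direction falls in the stored annular image, then interpolate between neighboring source pixels. The code below is a sketch under assumptions, not the patent's correction criteria: the linear elevation-to-radius mapping, the function names, and all parameters are hypothetical, and a real lens would require its own measured mapping.

```python
import math


def source_coords(azimuth_deg, elevation_deg, cx, cy,
                  r_outer, r_inner, max_elev_deg):
    """Map a viewing direction to (x, y) in the annular source image.

    Assumes the horizon (0 degrees elevation) falls on the outer radius,
    max_elev_deg falls on the inner radius, and spacing between them is
    linear. (cx, cy) is the image center in pixels."""
    t = elevation_deg / max_elev_deg
    r = r_outer - t * (r_outer - r_inner)
    a = math.radians(azimuth_deg)
    return cx + r * math.cos(a), cy + r * math.sin(a)


def bilinear(image, x, y):
    """Sample a 2-D list of intensities at fractional (x, y) by weighting
    the four surrounding pixels (a multiply/accumulate operation)."""
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1 = min(x0 + 1, len(image[0]) - 1)
    y1 = min(y0 + 1, len(image) - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


# A point on the horizon at azimuth 0 lands on the outer radius.
print(source_coords(0.0, 0.0, 100.0, 100.0, 80.0, 20.0, 45.0))  # (180.0, 100.0)
# Sampling midway between four pixels averages them.
print(bilinear([[0, 10], [20, 30]], 0.5, 0.5))  # 15.0
```

Filling an output frame would repeat these two steps for every destination pixel; precomputing the source coordinates into a lookup table is one common way to make this fast enough for video rates.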
The stored image can also be manipulated by the transform processor subsystem to display an operator-selected portion of the image through particular movements, such as pan, zoom, up/down, tilt, rotation, etc.
By emphasizing the peripheral content of a scene, the visual imaging system can use a single camera to capture the relevant visual information within a panoramic field of view existing along the horizon, while being able to conventionally store and easily display the scene, or portions thereof, in real-time. Using a single optical system and camera is not only cost-effective, but keeps all hemispheric visual data automatically time-synchronized.
In the present invention, at a conference table view point, with participants seated around a conference table, hemispheric scene content is ideally suited for segmented subviews of participants, especially when directionally electronically manipulated by voice actuation. The video image should be of the current speaker.
One advantage of the present invention is that the unique visual imaging system lens can capture information from a hemispheric scene by emphasizing the peripheral portion of the hemispheric field of view and thus provide greater resolution with existing imaging devices for the relevant visual information in the scene. As an example, if an ordinary fisheye lens focuses the lowest 15 degrees up from the horizon on ten percent of the imager at the imaging plane and the peripheral-enhancing lens focuses that same 15 degrees on fifty percent of the imager, there is a five-fold increase in resolution using the same imaging device. Depending on the application and exact formulation of the lens equations, there will be at least a five times increase in resolving power by this lens/imager combination.
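
The arithmetic behind the five-fold figure can be checked by counting the pixels that cover the same 15 degree band under each lens. In the sketch below, the 640 by 480 imager size and the function name `pixels_on_band` are assumptions for illustration; only the ten percent and fifty percent figures come from the text.

```python
def pixels_on_band(total_pixels: int, percent_of_imager: int) -> int:
    """Number of imager pixels devoted to the peripheral band."""
    return total_pixels * percent_of_imager // 100


TOTAL = 640 * 480  # hypothetical imager resolution

fisheye = pixels_on_band(TOTAL, 10)    # ordinary fisheye lens
enhanced = pixels_on_band(TOTAL, 50)   # peripheral-enhancing lens
print(fisheye, enhanced, enhanced / fisheye)  # 30720 153600 5.0
```

The ratio is independent of the imager size chosen, since it depends only on the two area fractions.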
The third subsystem of the present invention comprises a control apparatus such as a personal computer or other collection of electronic circuits, connected to the imagery system to allow flexible operation delivering options and defaults, including an override of the automated video image manipulation. A minimal control program is the software of the host controller to provide the options that may be necessary for individual teleconferences. An example would be to delay switching time segments between speakers, or perhaps the use of alternate cameras that may include a dual display.
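
The option of delaying the switch between speakers could be handled by a small piece of controller logic that changes the displayed direction only after a new direction has persisted for a hold time. The following is a hypothetical sketch: the class name `SpeakerSwitcher`, the hold-time parameter, and the representation of directions as comparable values (for example, sector indices) are assumptions, not details from the patent.

```python
class SpeakerSwitcher:
    """Switch the displayed direction only after a different direction
    has been reported continuously for at least hold_s seconds."""

    def __init__(self, hold_s=1.5):
        self.hold_s = hold_s
        self.current = None      # direction currently displayed
        self._candidate = None   # new direction waiting out the hold time
        self._since = None       # time the candidate was first reported

    def update(self, direction, now):
        """Report the detected direction at time now (seconds) and
        return the direction that should be displayed."""
        if self.current is None:
            self.current = direction
        elif direction == self.current:
            self._candidate = None  # brief interruption; discard candidate
        elif direction != self._candidate:
            self._candidate, self._since = direction, now
        elif now - self._since >= self.hold_s:
            self.current, self._candidate = direction, None
        return self.current


sw = SpeakerSwitcher(hold_s=1.0)
print(sw.update(90, 0.0))   # 90: first speaker is shown immediately
print(sw.update(180, 0.5))  # 90: new direction has just appeared
print(sw.update(180, 1.6))  # 180: new direction persisted past the hold time
```

A delay of this kind keeps a cough or short interjection from pulling the view away from the main speaker.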
`
`
`
In operation, at a particular teleconferencing site, participants will be arranged at a conference table or in a conference room with an array of microphones, each of which will pick up the normal speaking voice of each participant. The array of microphones is directly connected to an audio direction processor. The hemispheric lens system in conjunction with the video camera is attached to view warping logic as explained above and to the controller, or a personal computer. The video and audio signals are then transmitted through a transmission medium in an NTSC or other format to the remote teleconferencing site for remote display.
Sound from the participant speaking that is processed in the audio direction processor determines the direction of the participant speaking relative to the panoramic video camera and lens. Once the particular speaker direction is determined, the panoramic image of a specific hemispherical region of interest, such as the participant’s face, is processed to provide a normal video aspect ratio view for the remote participants using the system.
It is a principal object and advantage of this invention to provide a video conferencing system that automatically directs a video image to the participant that is speaking while providing a hemispherical video imaging subview that can be electronically manipulated in the direction of the speaker selected from a panoramic scene.
It is another principal advantage of this invention to provide an automatic teleconferencing system that saves transmission time, reduces coincident cost by eliminating or reducing manual operation of a video camera, and does not detract from the concentration of the subject during the conference.
And yet another advantage of this invention is to provide an automatic video camera with electronic image manipulation for video conferencing equipment that has no moving mechanical parts or physical mechanisms, which improves the reliability of the system and reduces maintenance costs and service costs.
In accordance with these and other objects which will become apparent hereinafter, the instant invention will now be described with particular reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic illustration of the visual imaging system organization and components of the parent application.
FIG. 1A is a schematic illustration of the automated video conferencing system organization and components.
FIGS. 2A, 2B, and 2C show a cross sectional diagram indicating the field input and output rays and the resulting relative field coverage a lens typically provides in the image plane for detection by an imager device.
FIGS. 3AA, 3AB, and 3AC show a cross sectional diagram indicating the field input and output rays and the resulting field coverage that optical system Example I, constructed according to the principles of the present invention, provides in the image plane for detection by an imaging device or substrate.
FIGS. 3BA, 3BB show a cross sectional diagram indicating the field input and output rays and the resulting field coverage that optical system Example II of this present invention provides in the image plane for detection by an imaging device or substrate.
FIG. 4 is a schematic representation of the mapping locations on the imaging device.
FIG. 5 is a schematic block diagram of the panoramic transform processor subsystem for use with the teleconferencing system of the present invention.
FIGS. 6A and 6B are a schematic diagram showing how multiple transform processor subsystems can be tied into the same distorted image to provide multiple different view perspectives to different users from the same source image as described in the parent application.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The invention will be defined initially with a brief description of the principles thereof.

`Principles of the Present Invention
`
As described in the parent U.S. patent application, the imaging invention stems from the realization by the inventors that in many of the technical hemispheric field applications, where the image detector is parallel to the plane of the horizon, much of the relevant visual information in the scene (e.g., trees, mountains, people, etc.) is found only in a small angle with respect to the horizon. Although the length of the arc from the horizon containing the relevant information varies depending upon the particular application, the inventors have determined that in many situations, almost all the relevant visual information is contained within about 10 to 45 degrees with respect to the horizon. This determination is especially true with respect to the teleconference environment which is normally centered around a conference table or conference room.
To maximize data collection and resolution for analysis and/or display of the relevant visual information located in this portion of the hemispheric scene, it is desirable to maximize the dedication of the available image detection area to this peripheral field portion. To accommodate this, it is necessary that the “central” portion of the scene (from 45 to 90 degrees with respect to the horizon) cover only the remaining areas of the imager plane so as not to interfere with light from the periphery.
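
The trade-off between the central and peripheral portions can be illustrated with simple geometry. The sketch below assumes, purely for illustration, that the scene is imaged as a circle on the detector with the central portion confined to a smaller concentric disc; the function name `peripheral_area_fraction` and the sample ratios are hypothetical.

```python
def peripheral_area_fraction(central_radius_ratio: float) -> float:
    """Fraction of a circular image area left for the periphery when the
    central portion is confined to a concentric disc whose radius is
    central_radius_ratio times the full image radius."""
    if not 0.0 <= central_radius_ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    return 1.0 - central_radius_ratio ** 2


# Shrinking the central disc frees image area for the peripheral annulus.
for ratio in (0.5, 0.25):
    print(ratio, peripheral_area_fraction(ratio))
# 0.5 0.75
# 0.25 0.9375
```

Because area scales with the square of the radius, halving the radius given to the central portion raises the share available to the periphery from 75 percent to about 94 percent of the image circle.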
In many cases, since the “central” area contains less detailed information, such as a solid white ceiling or a clear or lightly clouded sky, it is allowable to maximize completely the dedication of the available image detection area to the peripheral field portion by reducing the portion of the imager device representing the “central” area