Baker

[54] TELECONFERENCING IMAGING SYSTEM WITH AUTOMATIC CAMERA STEERING

[75] Inventor: Robert G. Baker, Delray Beach, Fla.

[73] Assignee: International Business Machines Corporation, Armonk, N.Y.

[21] Appl. No.: 496,742

[22] Filed: Jun. 30, 1995

Related U.S. Application Data

[63] Continuation-in-part of Ser. No. 281,331, Jul. 27, 1994, Pat. No. 5,508,734.

[51] Int. Cl.: H04N 7/18
[52] U.S. Cl.: 348/36; 348/15; 348/53; 348/214; 348/580
[58] Field of Search: 348/15, 214, 36, 348/53, 580; H04N 7/18

US005686957A
[11] Patent Number: 5,686,957
[45] Date of Patent: Nov. 11, 1997

[56] References Cited

U.S. PATENT DOCUMENTS

4,980,761 12/1990 Natori 348/15
5,508,734 4/1996 Baker 348/36
`
`Primary Examiner—Howard W. Britton
`Attorney, Agent, or Firm—Richard A. Tomlin; John C.
`Black; Malin, Haley, DiMaggio & Crosby, P.A.
`[57]
`ABSTRACT
`
An automatic, voice-directional video camera image steering
system, specifically for use in teleconferencing, that
electronically selects segmented images from a selected
panoramic video scene, typically around a conference table,
so that the participant in the conference currently speaking
will be the selected segmented image in the proper viewing
aspect ratio, eliminating the need for manual camera movement
or automated mechanical camera movement. The
system includes an audio detection circuit from an array of
microphones that can instantaneously determine the direction
of a particular speaker and provide directional signals to
a video camera and lens system that provides a panoramic
display that can electronically select portions of that image
and, through warping techniques, remove any distortion
from the most significant portions of the image, which lie
from the horizon up to approximately 30 degrees in a
hemispheric viewing area.
`
4,264,928 4/1981 Schober 348/15
`
`14 Claims, 7 Drawing Sheets
`
[Representative drawing; legible label: AUDIO DIRECTION PROCESSOR]
`
`LGE Exhibit 1005
`LGE v. ImmerVision
`Page 1 of 18
`
`
`
`
U.S. Patent    Nov. 11, 1997    Sheet 1 of 7    5,686,957

[Drawing; legible labels: HEADS-UP DISPLAY, TRANSFORM PROCESSOR ENGINE]
`
`
`
U.S. Patent    Nov. 11, 1997    Sheet 2 of 7    5,686,957

[Drawing; legible labels: DIGITAL CONVERSION, COEFFICIENT BUFFER, WARP INTERPOLATION, WARP ENGINE, DIRECTION PROCESSOR]
`
`
`
U.S. Patent    Nov. 11, 1997    Sheet 3 of 7    5,686,957

[Drawing not recoverable from the extracted text]
`
`
`
U.S. Patent    Nov. 11, 1997    Sheet 4 of 7    5,686,957

[Drawing not recoverable from the extracted text]
`
`
`
U.S. Patent    Nov. 11, 1997    Sheet 5 of 7    5,686,957

[Drawing not recoverable from the extracted text]
`
`
`
U.S. Patent    Nov. 11, 1997    Sheet 6 of 7    5,686,957

[FIG. 6A; legible labels: SOURCE ADDRESS, SOURCE DATA, WARP ENGINE, WARPED DATA, WARPED IMAGE BUFFER, DESTINATION ADDRESS, PROCESSING AND WARPING CIRCUITS]
`
`
`
U.S. Patent    Nov. 11, 1997    Sheet 7 of 7    5,686,957

[Drawing; legible labels: PROCESSING AND WARPING CIRCUITS (three blocks), MEMORY, HOST, BUS, CONTROL, TO DISPLAY]
`
`
`
`
`
`TELECONFERENCING IMAGING SYSTEM
`WITH AUTOMATIC CAMERA STEERING
`
This application is a continuation-in-part of U.S. patent
application Ser. No. 281,331, filed Jul. 27, 1994, now U.S. Pat. No.
5,508,734.
`
`BACKGROUND OF THE INVENTION
`1. Field of the Invention
`
This invention relates to a video conferencing system that
`has automatic, voice-directional camera image steering, and
`specifically to a teleconferencing system that employs auto-
`matic video image selection of the current participant speak-
`ing electronically selected from a panoramic video scene.
`2. Description of the Prior Art
`Teleconferencing provides for the exchange of video and
`audio information between remotely separated participants.
`Typically, a first group of participants is arranged around a
conference table or seated strategically in a conference room
and telecommunicating with a second group of participants
similarly situated at a remote location. One or more video
cameras at each location create video images of the participants
through manual manipulation of each video
camera, normally directed at the participant speaking at the
moment. Microphones at each location provide for sound
`transmission signals. The video image and audio voice
`signals are then transmitted to the remote location. The
`video image is projected onto a large screen or other type of
`video display which also would include audio outputs for
`providing the sounds.
`Manual manipulation of each video camera at each con-
`ference site is required to change the direction of each
`camera to different participants as speakers change, unless a
`large overall view of all the participants is maintained. Such
a process is labor intensive. Also, image content and
perspective, dependent on the location of the video camera
relative to the participants, contribute to the quality of the
`final visual display available to the participants watching the
`display screen. The quality of the image and the scene
`content all contribute to the overall effectiveness of the
`telecommunication process. In particular, in a setting such as
`a conference table in a conference room, a hemispheric or
panoramic viewpoint would be much more efficient for
video image capture of surrounding selected participants.
With a hemispheric scene, certain efficiencies are gained by
`eliminating large areas that are unused scene content while
`concentrating on a band of hemispheric areas populated by
`the teleconferencing participants. Therefore, it is believed
`that hemispheric or panoramic electronic imaging would be
`greatly beneficial to a teleconferencing environment, espe-
`cially when controlled with audio directional processors.
`The selected video image is taken from a desired segment of
`a hemispherical view in the correct video aspect ratio. A
`centralized panoramic image capture system which already
`has a distorted picture of the hemisphere bounded by the
`plane of the table upward selects a portion of the scene and
`warps the image to correspond to a normal aspect ratio view
`of the person speaking. The signal can be converted to
`whatever display format is desired for transmission to a
`remote location. The present invention has incorporated, in
`one automated system, audio beam steering and electroni-
`cally selectable subviews of a much larger panoramic scene.
`The video/subviews can be converted to an NTSC display
`format for transmission to a remote location for video
`display.
`The collection, storage, and display of large areas of
visual information can be an expensive and difficult process
to achieve accurately. With the recent increased emphasis on
`multimedia applications, various methods and apparatuses
`have been developed to manage visual data. A unique class
`of multimedia data sets is that of hemispheric visual data.
`Known multimedia methods and apparatuses attempt to
combine various multimedia imaging data, such as still and
`motion (or video) images, with audio content using storage
`media such as photographic film, computer diskettes, com-
`pact discs (CDs), and interactive CDs. These are used in
traditional multimedia applications in various fields, such as
`entertainment and education. Teleconferencing is an appli-
`cation where automated electronic selection of scene content
`would result in greatly improved usability. Non-multimedia
`applications also exist that would employ hemispheric visual
`data, such as in security, surveillance, unmanned
exploration, and fire and police situations. However, as will
be described below, the known methods and apparatuses
have certain limitations in capturing and manipulating valuable
information and hemispheric scenes in a rapid (i.e.,
real-time) and cost-effective manner.
`One well known multimedia technique is used at theme
parks, wherein visual information from a scene is displayed
on a screen or collection of screens that covers an almost
360-degree field of view. Such a technique unfortunately results
in the consumption of vast quantities of film collected from
multiple cameras, requires specially designed carriages to
carry and support the cameras during filming of the scene,
and necessitates synchronization of shots during capture and
display. The technique is also limited in that the visual image
cannot be obtained with a single camera nor manipulated for
display, e.g., pan, tilt, zoom, etc., after initial acquisition.
Hence, this technique, while providing entertainment, is
unable to fulfill critical technical requirements of many
functional applications.
Other known techniques for capturing and storing visual
information about a large field of view (FOV) are described
in U.S. Pat. Nos. 4,125,862; 4,442,453; and 5,185,667. In
U.S. Pat. No. 4,125,862, a system is disclosed that converts
signal information from a scene into digital form, stores the
data of the digitized scene serially in two-dimensional
format, and reads out the data by repetitive scan in a
direction orthogonally related to the direction in which the
data was stored. U.S. Pat. No. 4,442,453 discloses a system
in which a landscape is photographed and stored on film.
The film is then developed, with display accomplished by
scanning with electro-optical sensors at “near real-time”
rates. These techniques, however, do not provide instant
visual image display, do not cover the field of view required
for desired applications (hemispheric or 180 degrees
field-of-view), do not generate visual image data in the format
provided by the techniques of this invention, and are also not
easily manipulated for further display, e.g., pan, tilt, etc.
The technique disclosed in U.S. Pat. No. 5,185,667 overcomes
some of the above-identified drawbacks in that it is
able to capture a near-hemispheric field of view, correct the
image using high speed circuitry to form a normal image,
and electronically manipulate and display the image at
real-time rates.
`
`For many hemispheric visual applications, however, even
`U.S. Pat. No. 5,185,667 has limitations in obtaining suffi-
`cient information of critical and useful details. This is
particularly true when the camera is oriented with the central
axis of the lens perpendicular to the plane bounding the
hemisphere of acquisition (i.e., lens pointing straight up). In
such applications, the majority of critical detail in a scene is
contained in areas of the field along the horizon and little or
no useful details are contained in central areas of the field
`located closer to the axis of the lens (the horizon being
`defined as the plane parallel to the image or camera plane
`and perpendicular to the optical axis of the imaging system).
`For example, in surveillance, the imaging system is aimed
`upward and the majority of the critical detail in the scene
`includes people, buildings, trees, etc., most of which are
located within only a few degrees along the horizon (i.e., this
is the peripheral content). Also, in this example, although the
sky makes up the larger central area of the view, it contains
little or no useful information requiring higher relative
resolution.
`
`To obtain sufficient detail on the critical objects in the
`scene, the technique should differentiate between the rel-
`evant visual information along the horizon and the remain-
`ing visual information in the scene in order to provide
`greater resolution in areas of higher importance. U.S. Pat.
No. 5,185,667 does not differentiate between this relevant
visual information contained along the horizon and the
remaining visual information in this scene. Thus, it fails to
yield a sufficient quality representation of the critical detail
of the scene for projected applications.
Instead, techniques described above concentrate on
obtaining, storing, and displaying the entire visual information
in the scene, even when portions of this information are
`not necessary or useful. To obtain the near-hemispheric
`visual information, such techniques require specific lens
`types to map image information in the field of view to an
`image plane (where either a photographic film or electronic
`detector or imager is placed). Known examples of U.S. Pat.
`No. 5,185,667 and U.S. Pat. No. 4,442,453 respectively use
a fish-eye lens and a general wide-angle lens. As these lenses
map information of a large field without differentiation
between the central and peripheral areas, information from
the periphery will be less fully represented in the image
plane than from the central area of acquisition.
In U.S. Pat. No. 4,170,400, Bach et al. describe a
`wide-angle optical system employing a fiber optic bundle
`that has differing geometric shapes at the imaging ends.
`Although this is useful in itself for collecting and reposi-
`tioning image data, bending of light is a natural character-
`istic of optical fibers and not exclusive to that patent.
`Further, U.S. Pat. No. 4,170,400 employs a portion of a
`spherical mirror to gather optical information, rendering a
`very reduced subset of the periphery in the final imaging
`result. This configuration is significantly different from the
multi-element lens combination described in the present
`invention.
`
`Imperfections in the image representation of any field
`inherently result from the nature of creating an image with
`any spherical glass (or plastic) medium such as a lens. The
`magnitude of these imperfections increases proportionally to
the distance a point in the field is from the axis perpendicular
`to the optical imaging system. As the angle between the
`optical axis and a point in the field increases, aberrations of
`the corresponding image increase proportional to this angle
`cubed. Hence, aberrations are more highly exaggerated in
`the peripheral areas with respect to more central areas of a
`hemispheric image.
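
As a rough numerical illustration of the cubic relationship stated above, the following sketch compares relative aberration at several field angles. The function name and the normalization to a 10 degree reference angle are assumptions made only for illustration; the aberrations of an actual lens depend on its specific design.

```python
def relative_aberration(angle_deg, reference_deg=10.0):
    """Aberration relative to a reference angle, assuming growth with the angle cubed.

    Both the function and the 10 degree reference are hypothetical; the text
    only states that aberrations increase in proportion to the angle cubed.
    """
    return (angle_deg / reference_deg) ** 3


if __name__ == "__main__":
    for angle in (10, 30, 60, 90):
        factor = relative_aberration(angle)
        print(f"{angle:2d} deg off axis: {factor:6.1f} times the aberration at 10 deg")
```

Under this assumption, a point near the horizon (about 90 degrees off the optical axis) suffers several hundred times the aberration of a point 10 degrees off axis, which is why the peripheral areas are the most degraded.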
Although the lens types above achieve a view of a large
field, the valuable content from the peripheral areas lacks in
potential image quality (resolution) mapping because the
imaging device and system do not differentiate between
these areas and the central areas of less valuable detail.
Often, the difference in imaging capabilities
between the two areas is compensated for by using only the
central portion of a lens to capture the scene (“stopping the
`lens down”). This works in effect to reduce the image quality
`of both areas such that the difference in error is a lesser
`percentage of the smallest area even the central area can
`resolve. Simultaneously, this compensation technique fur-
`ther degrades the performance of the lens by limiting the
amount of light which is allowed to enter the lens, and thus
reducing the overall intensity of the image.
More typically, the peripheral content imaged by a conventional
lens is so degraded in comparison with the central
area that the lens allows for only a minimal area of the
periphery to be recorded by the film or electronic imager. As
a result of these “off-axis” aberrations inherent to a large field,
the relevant information of the horizon in the scene can be
underutilized or, worse yet, lost.
`Another limitation in U.S. Pat. No. 5,185,667 is its
`organization for recording only views already corrected for
`perspective. The nature of that methodology is that the
`specific view of interest must be selected and transformed
`prior to the recording process. The result is that no additional
`selection of views can be accomplished after the storage
`process, reducing system flexibility from the user’s perspec-
`tive.
`
Hence, there is a demand in the industry for single camera
imaging systems that efficiently capture, store, and display
`valuable visual information within a hemispheric field of
`view containing particularly peripheral content, and that
`allow electronic manipulation and selective display of the
`image post-acquisition while minimizing distortion effects.
`Such a system finds advantageous application in a tele-
`conferencing environment in accordance with the present
`invention.
`
`Limited control of video cameras is disclosed in the prior
`art. U.S. Pat. No. 4,980,761 issued to Natori, Sep. 25, 1990
`describes an image processing system that rotates a camera
`for a teleconference system. A control unit outputs a drive
`signal based on an audio signal to control the movement of
the image until the image controlling unit receives an
operational completion signal. In this case, the rotational
movement of the camera, moving the video image from one
participant to another participant, alleviates having to view
the camera movement. Once the camera stops skewing, the
picture will then provide the proper aspect ratio. A plurality
`of microphones are provided to each attendant. A sound
`control unit then determines with a speaker detection unit
`which participant is speaking. U.S. Pat. No. 4,965,819
`shows a video conferencing system for courtroom and other
`applications in which case each system includes a local
`module that includes a loud speaker, a video camera, a video
`monitoring unit and a microphone for each local conferee.
`U.S. Pat. No. 5,206,721 issued to Ashida, Apr. 27, 1993,
shows a television conference system that allows for automatically
mechanically moving and directing a camera
towards a speaking participant. In this system a microphone
is provided for each participant and is recognized by the
control system. Image slew is corrected to avoid camera
image motion. A review of these systems thus shows that the
`automation provided is very expensive and in every case
`requires individualized equipment for each participant.
Limited audio direction finding for multiple microphone
arrays is known in the prior art. For example, a self steering
digital microphone array defined by W. Kellerman of Bell
Labs at ICASSP in 1991 created a teleconference in which
a unique steering algorithm was used to determine direction
of sound taking into account the acoustical environment in
which the system was located. Also a two stage algorithm
for determining talker location from linear microphone array
`data was developed by H. Silverman and S. Kirkman at
Brown University and disclosed in April, 1992. The filtered
cross correlation of the system is introduced as the locating
`algorithm.
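
The general idea of locating a talker by cross correlation can be sketched as follows. This is a plain, unfiltered cross correlation between two microphone signals, not the filtered two stage algorithm referred to above; the function names, the two-microphone geometry, and the speed of sound value are assumptions chosen for illustration.

```python
import math


def best_lag(sig_a, sig_b, max_lag):
    """Return the lag (in samples) of sig_b relative to sig_a with the highest cross correlation."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def arrival_angle_deg(lag, sample_rate_hz, mic_spacing_m, speed_of_sound=343.0):
    """Convert a lag into an arrival angle measured from broadside of a two-microphone pair."""
    delay_s = lag / sample_rate_hz
    # Clamp to the physically possible range before taking the arcsine.
    ratio = max(-1.0, min(1.0, delay_s * speed_of_sound / mic_spacing_m))
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    pulse = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
    delayed = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]  # same pulse, three samples later
    lag = best_lag(pulse, delayed, max_lag=5)
    print("lag in samples:", lag)
    print("angle in degrees:", arrival_angle_deg(lag, 16000, 0.2))
```

With more than two microphones, pairwise lags of this kind could be combined to resolve a direction around the full circle; that step is omitted here.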
A “telepresence” concept from BellCorp briefly described
in IEEE Network Magazine in March, 1992 suggests a
spherical camera for use in the teleconference system.
However, the entire image is sent in composite form for the
`remote users to select from at the other end. The present
`invention is quite different and includes automated pointing
`and control including incorporation in one automated system
`of both audio beam steering and selectable subviews of a
`much larger panoramic scene.
`SUMMARY OF THE INVENTION
`
`The present invention comprises a video conferencing,
`voice-directional video imaging system for automatic elec-
`tronic video image manipulation of a selected, directional
`signal of a hemispheric conference scene transmitted to a
`remote conference site. The system employs three separate
subsystems for voice-directed, electronic image manipulation
suitable for automated teleconferencing imaging in a
desirable video aspect ratio.
The audio beam, voice pickup and directing subsystem
includes a plurality of microphones strategically positioned
near a predetermined central location, such as on a conference
table. The microphone array is arranged to receive and
transmit the voices of participants, while simultaneously
determining the direction of a participant speaking relative
to the second subsystem, which is a hemispheric imaging
system used with a video camera. The third subsystem is a
personal computer or controller circuits in conjunction with
the hemispheric imaging system which provides
automatic image selection of the participant speaking that is
`ultimately transmitted as a video signal to the remote video
`display at the remote teleconference location.
`The hemispheric electronic image manipulator subsystem
`includes a video camera having a capture lens in accordance
`with the invention that allows for useful electronic manipu-
`lation of a segmented portion of a hemispheric scene. In a
`conference table setting, as viewed from the center of the
`conference table, participants are arranged around the table
in the lower segment of the hemisphere, with the plane of the
table top forming the base of the hemisphere. The electronic
`image is warped to provide a desired subview in proper
`aspect ratio in the audio selected direction.
The present invention provides a new and useful voice-directional
visual imaging system that emphasizes the
`peripheral content of a hemispheric field of view using a
`single video camera. The invention allows user-selected
`portions of a hemispheric scene to be electronically
`manipulated, transmitted, and displayed remotely from the
`video camera in real-time and in a cost-effective manner.
`
The visual imaging system of the present invention
involves a video camera having a lens with enhanced peripheral
content imaging capabilities. The lens provides an
enhanced view of the valuable information in the scene’s
periphery by imaging the field of view to the image plane
such that the ratio of the size of the smallest detail contained
within the periphery of the scene to the size of the smallest
resolving pixel of an image device is increased. For this to
be accomplished, the peripheral content must map to a larger
percentage of a given image detector area and,
simultaneously, the mapped image of the central area of the
scene must be minimized by the lens so that it does not
interfere with the peripheral content now covering a wider
`annulus in the image plane. Information in the image plane
`is then detected by the video camera. The detected infor-
`mation of the entire hemispheric scene is then stored as a
`single image in memory using traditional methods.
When a portion of the scene is to be displayed, the image
information relating to the relevant portion of the scene is
instantaneously retrieved from memory. A transform processor
subsystem electronically manipulates the scene for
display as a perspective-correct image on a display device,
such as a teleconference display screen or monitor, as if the
particular portion of the scene had been viewed directly with
`the video camera pointed in that direction. The transform
`processor subsystem compensates for the distortion or dif-
`ference in magnification between the central and peripheral
`areas of the scene caused by the lens by applying appropriate
`correction criteria to bring the selected portion of the scene
`into standard viewing format. The transform processor sub-
`system can also more fully compensate for any aberrations
`of the enhanced peripheral image because of the image’s
`improved resolution as it covers a larger portion of the image
`device (increased number of pixels used to detect and
`measure the smallest detail in the periphery image). More
`pixels equates to more measurement data, hence more
`accurate data collection.
`
`The stored image can also be manipulated by the trans-
`form processor subsystem to display an operator-selected
`portion of the image through particular movements, such as
pan, zoom, up/down, tilt, rotation, etc.
`By emphasizing the peripheral content of a scene, the
`visual imaging system can use a single camera to capture the
relevant visual information within a panoramic field of view
`existing along the horizon, while being able to convention-
`ally store and easily display the scene, or portions thereof, in
`real-time. Using a single optical system and camera is not
`only cost-effective, but keeps all hemispheric visual data
`automatically time-synchronized.
`In the present invention, at a conference table view point,
`with participants seated around a conference table, hemi-
`spheric scene content is ideally suited for segmented sub-
`views of participants, especially when directionally elec-
`tronically manipulated by voice actuation. The video image
`should be of the current speaker.
One advantage of the present invention is that the unique
`visual imaging system lens can capture information from a
`hemispheric scene by emphasizing the peripheral portion of
`the hemispheric field of view and thus provide greater
`resolution with existing imaging devices for the relevant
visual information in the scene. As an example, if an
`ordinary fisheye lens focuses the lowest 15 degrees up from
`the horizon on ten percent of the imager at the imaging plane
`and the peripheral-enhancing lens focuses that same 15
`degrees on fifty percent of the imager, there is a five-fold
`increase in resolution using the same imaging device.
`Depending on the application and exact formulation of the
`lens equations, there will be at least a five times increase in
`resolving power by this lens/imager combination.
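
The arithmetic of this example can be written out as a short sketch. The function name is hypothetical, and the calculation simply follows the text in treating the share of the imager devoted to the band as the measure of resolution.

```python
def resolution_gain(conventional_percent, enhanced_percent):
    """Ratio of imager share given to the same angular band by two lenses."""
    return enhanced_percent / conventional_percent


if __name__ == "__main__":
    # Lowest 15 degrees above the horizon: 10 percent of the imager with an
    # ordinary fisheye lens versus 50 percent with the peripheral-enhancing lens.
    gain = resolution_gain(10, 50)
    print(f"resolution gain: {gain:.1f}x")
```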
The third subsystem of the present invention comprises a
control apparatus such as a personal computer or other
collection of electronic circuits, connected to the imagery
system to allow flexible operation delivering options and
defaults, including an override of the automated video image
manipulation. A minimal control program in the software of
the host controller provides the options that may be
`necessary for individual teleconferences. An example would
`be to delay switching time segments between speakers, or
`perhaps the use of alternate cameras that may include a dual
`display.
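
One way such a switching delay could be realized in the host controller software is sketched below. The class name, the hold time, and the use of simple direction labels are assumptions for illustration; the text does not specify how the delay is implemented.

```python
class SpeakerSwitcher:
    """Switch the displayed direction only after a new direction persists for hold_s seconds."""

    def __init__(self, hold_s=1.5):
        self.hold_s = hold_s      # hypothetical hold time before switching
        self.current = None       # direction currently displayed
        self._candidate = None    # new direction waiting to be confirmed
        self._since = None        # time at which the candidate first appeared

    def update(self, detected, now_s):
        """Feed the latest detected direction; return the direction to display."""
        if self.current is None:
            self.current = detected
            return self.current
        if detected == self.current:
            self._candidate = None          # interruption ended, keep current view
            return self.current
        if detected != self._candidate:
            self._candidate, self._since = detected, now_s
        elif now_s - self._since >= self.hold_s:
            self.current, self._candidate = detected, None
        return self.current


if __name__ == "__main__":
    switcher = SpeakerSwitcher(hold_s=1.0)
    for t, direction in [(0.0, "A"), (0.5, "B"), (1.0, "B"), (1.6, "B")]:
        print(t, switcher.update(direction, t))
```

A brief interjection from another participant therefore does not move the view, while a sustained change of speaker does.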
`
In operation, at a particular teleconferencing site, participants
will be arranged at a conference table or in a conference
room with an array of microphones, each of which will
`pick up the normal speaking voice of each participant. The
`array of microphones is directly connected to an audio
direction processor. The hemispheric lens system in conjunction
with the video camera is attached to view warping
logic as explained above and to the controller, or a personal
`computer. The video and audio signals are then transmitted
`through a transmission medium in an NTSC or other format
`to the remote teleconferencing site for remote display.
Sound from the participant speaking is processed in
the audio direction processor to determine the direction of the
participant speaking relative to the panoramic video camera
and lens. Once the particular speaker direction is
determined, the panoramic image of a specific hemispherical
`region of interest, such as the participant’s face, is processed
`to provide a normal video aspect ratio view for the remote
`participants using the system.
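
As a minimal sketch of the selection step, the following computes the angular bounds of a subview centered on the detected speaker direction. The function name, the 40 degree default width, the 4:3 aspect ratio, and the simplification that angular extent is proportional to displayed extent are all assumptions; the actual system derives the displayed view by warping the distorted source image.

```python
def subview_window(azimuth_deg, width_deg=40.0, aspect_w=4, aspect_h=3):
    """Return (az_min, az_max, el_min, el_max) in degrees for a subview centered on the speaker.

    Elevation is measured upward from the horizon (the plane of the table).
    """
    height_deg = width_deg * aspect_h / aspect_w
    az_min = (azimuth_deg - width_deg / 2) % 360
    az_max = (azimuth_deg + width_deg / 2) % 360
    return az_min, az_max, 0.0, height_deg


if __name__ == "__main__":
    # Speaker detected 10 degrees clockwise from the reference direction.
    print(subview_window(10.0))
```

With these assumed defaults, the window spans 30 degrees of elevation, and the azimuth bounds wrap around 360 degrees when the speaker sits near the reference direction.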
It is a principal object and advantage of this invention to
`provide a video conferencing system that automatically
`directs a video image to the participant that is speaking while
`providing a hemispherical video imaging subview that can
`be electronically manipulated in the direction of the speaker
`selected from a panoramic scene.
It is another principal advantage of this invention to
`provide an automatic teleconferencing system that saves
`transmission time, reduces coincident cost by eliminating or
`reducing manual operation of a video camera, and does not
`detract from the concentration of the subject during the
`conference.
`
And yet another advantage of this invention is to provide
`an automatic video camera with electronic image manipu-
`lation for video conferencing equipment that has no moving
`mechanical parts or physical mechanisms which improves
`the reliability of the system and reduces maintenance costs
`and service costs.
`
`In accordance with these and other objects which will
become apparent hereinafter, the instant invention will now
`be described with particular reference to the accompanying
`drawings.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a schematic illustration of the visual imaging
`system organization and components of the parent applica-
`tion.
FIG. 1A is a schematic illustration of the automated video
`conferencing system organization and components.
`FIGS. 2A, 2B, and 2C show a cross sectional diagram
`indicating the field input and output rays and the resulting
`relative field coverage a lens typically provides in the image
`plane for detection by an imager device.
`FIGS. 3AA, 3AB, and 3AC show a cross sectional
`diagram indicating the field input and output rays and the
`resulting field coverage that optical system Example I,
`constructed according to the principles of the present
`invention, provides in the image plane for detection by an
`imaging device or substrate.
FIGS. 3BA and 3BB show a cross sectional diagram indicating
the field input and output rays and the resulting field
`coverage that optical system Example I of this present
`invention provides in the image plane for detection by an
`imaging device or substrate.
`FIG. 4 is a schematic representation of the mapping
`locations on the imaging device.
`
`FIG. 5 is a schematic block diagram of the panoramic
`transform processor subsystem for use with the teleconfer-
`encing system of the present invention.
FIGS. 6A and 6B are a schematic diagram showing how
`multiple transform processor subsystems can be tied into the
`same distorted image to provide multiple different view
`perspectives to different users from the same source image
`as described in the parent application.
`
`DESCRIPTION OF THE PREFERRED
`EMBODIMENTS
`
`The invention will be defined initially with a brief
`description of the principles thereof.
`
`Principles of the Present Invention
`
`As described in the parent U.S. patent application, the
imaging invention stems from the realization by the inventors
that in many of the technical hemispheric field
applications, where the image detector is parallel to the
plane of the horizon, much of the relevant visual information
in the scene (e.g., trees, mountains, people, etc.) is found
only in a small angle with respect to the horizon. Although
the length of the arc from the horizon containing the relevant
information varies depending upon the particular
application, the inventors have determined that in many
situations, almost all the relevant visual information is
`contained within about 10 to 45 degrees with respect to the
`horizon. This determination is especially true with respect to
`the teleconference environment which is normally centered
`around a conference table or conference room.
`
`To maximize data collection and resolution for analysis
`and/or display of the relevant visual information located in
this portion of the hemispheric scene, it is desirable to
maximize the dedication of the available image detection
area to this peripheral field portion. To accommodate this, it
is necessary that the “central” portion of the scene (from 45
to 90 degrees with respect to the horizon) cover only the
remaining areas of the imager plane so as not to interfere
`with light from the periphery.
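
The allocation described above can be illustrated with a simple piecewise-linear mapping from elevation angle to image radius. This is only a sketch under stated assumptions: the function name, the 45 degree split, and the 20 percent share of the radius given to the central area are hypothetical and are not the lens equations of the invention.

```python
def image_radius(elevation_deg, split_deg=45.0, central_share=0.2):
    """Normalized image radius (0 = center, 1 = edge) for a field point.

    elevation_deg is measured upward from the horizon; 90 degrees lies on the
    optical axis and maps to the center of the imager.
    """
    if not 0.0 <= elevation_deg <= 90.0:
        raise ValueError("elevation must be within 0..90 degrees")
    if elevation_deg >= split_deg:
        # Central area: 90 deg maps to radius 0, split_deg maps to central_share.
        return central_share * (90.0 - elevation_deg) / (90.0 - split_deg)
    # Peripheral band: split_deg maps to central_share, the horizon maps to 1.
    return central_share + (1.0 - central_share) * (split_deg - elevation_deg) / split_deg


if __name__ == "__main__":
    for elevation in (0.0, 15.0, 45.0, 90.0):
        print(f"elevation {elevation:4.1f} deg -> radius {image_radius(elevation):.3f}")
```

In this sketch the band from the horizon up to 45 degrees occupies the outer 80 percent of the image radius, whereas a mapping that is linear in angle would give it only the outer 50 percent.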
In many cases, since the “central” area contains less
`detailed information, such as a solid white ceiling or a clear
`or lightly clouded sky, it is allowable to maximize com-
`pletely the dedication of the available image detection area
`to the peripheral field portion by reducing the portion of the
imager device representing the “central” area to near zero.
`Of course, in certain instances, it is desirable to analyze this
`less detailed information, but this portion of the scene can be
`minimized to some extent without significant degradation of
`such visu