VENABLE, BAETJER, HOWARD & CIVILETTI, LLP
Including professional corporations
1201 New York Avenue, N.W., Suite 1000
Washington, D.C. 20005-3917
(202) 962-4800, Fax (202) 962-8300
www.venable.com

October 24, 2000

VENABLE
ATTORNEYS AT LAW

OFFICES IN
WASHINGTON, D.C.
MARYLAND
VIRGINIA

Assistant Commissioner for Patents
Washington, D.C. 20231
ATTENTION: Box PATENT APPLICATION

Sir:
Attorney Docket No.: 37112-164994

Submitted herewith is a patent application under 37 C.F.R. § 1.53(b) for:

Inventor: ALAN J. LIPTON
Title: INTERACTIVE VIDEO MANIPULATION

This is not a Provisional Application.

The application includes:
_X_ Specification (39 pages), which includes claims (1-58) and an Abstract (1 page)
_X_ Formal drawings (8 sheets, Figures 1-8)

In view of the above, it is requested that this application be accorded a filing date.

Please address all communications to:
VENABLE
Post Office Box 34385
Washington, D.C. 20043-9998
Telephone: (202) 962-4800
Facsimile: (202) 962-8300

Respectfully submitted,

Michael A. Sartori, Ph.D.
Registration No. 41,289

MAS/rl
DCZ-247199
APPLICATION FOR UNITED STATES PATENT

INVENTOR:
ALAN J. LIPTON

TITLE:
INTERACTIVE VIDEO MANIPULATION

ATTORNEYS' ADDRESS:
VENABLE
1201 New York Avenue, N.W., Suite 1000
Washington, D.C. 20005-3917
Telephone: (202) 962-4800
Telefax: (202) 962-8300

ADDRESS FOR U.S.P.T.O. CORRESPONDENCE:
VENABLE
Post Office Box 34385
Washington, D.C. 20043-9998

ATTORNEY DOCKET NO.: 37112-164994
INTERACTIVE VIDEO MANIPULATION

BACKGROUND OF THE INVENTION

Field of the Invention

A system in the field of video processing is disclosed. More specifically, techniques are disclosed for interacting with and manipulating video streams for applications such as entertainment, education, video post-production, gaming, and others.

References

For the convenience of the reader, the references referred to herein are listed below. In the specification, the numerals within brackets refer to the respective references. The listed references are incorporated herein by reference.

[1] H. Fujiyoshi and A. Lipton, "Real-Time Human Motion Analysis by Image Skeletonization," Proceedings of IEEE WACV '98, Princeton, NJ, 1998, pp. 15-21.

[2] A. Lipton, H. Fujiyoshi, and R. S. Patil, "Moving Target Detection and Classification from Real-Time Video," Proceedings of IEEE WACV '98, Princeton, NJ, 1998, pp. 8-14.

[3] A. J. Lipton, "Local Application of Optic Flow to Analyse Rigid Versus Non-Rigid Motion," International Conference on Computer Vision, Corfu, Greece, September 1999.

[4] A. Selinger and L. Wixson, "Classifying Moving Objects as Rigid or Non-Rigid Without Correspondences," Proceedings of DARPA Image Understanding Workshop, 1, November 1998, pp. 341-358.
Background of the Invention

In augmented reality, which is a research topic in the computer vision community, video imagery is augmented by accurately registered computer graphics. Computerized x-ray vision and video-assisted surgery are two examples of augmented reality. One of the long-time goals of the computer vision community is to analyze and interact directly with real-time video-derived data.

One of the long-time goals of the entertainment industry, such as the movie industry and the computer gaming industry, is the creation of realism. To achieve this, the movie industry invested in computer graphics to create realistic false images. Additionally, the computer gaming industry integrates photo-realistic still imagery and video to enhance a user's experience. To date, this integration is largely non-interactive, using only "canned" video sequences to achieve little more than setting atmosphere.

Examples of the early use of imagery in games include still images or canned video sequences as a backdrop to the action, with computer-generated characters overlaid on top, rather than truly interacting with the action. A slightly more interactive use of video is displayed in more recent games, such as Return to Zork™ and Myst™, in which short, relevant video sequences provide the player with timely information or atmosphere. The most interactive use of video has been in video-disc based games, like Dragon's Lair™, in which the game itself is made up of small image sequences, each containing a small problem or challenge. Based on the player's choice, the next appropriate video sequence is selected to provide the next challenge, exploiting the fast random access time available to the videodisc medium.

There has been some effort made to use video interactively, most notably as an input device. There exist companies that produce games based on chroma key screen technology. Real players are inserted into a virtual environment to perform simple actions like tending a virtual soccer goal or shooting virtual baskets. These games require considerable infrastructure. The player must wear distinguishing clothing, for example, green gloves, so that the computer recognizes body parts, and the game is played in front of a large blue screen stage. More modest applications of this type that run on desktop computers include, for example, SGI's Lumbus™, in which the IndyCam is used for simple head or hand tracking to control a plant-like creature called a "Lumbus" in three-dimensional (3D) space.
SUMMARY OF THE INVENTION

It is an object of the invention to provide a system and techniques to accomplish real-time and non-real-time interactive video manipulation.

It is a further object of the invention to provide a system and techniques to apply interactive video processing to applications such as entertainment, simulation, video editing, and teleconferencing.

These and other objects are achieved by the invention, which is embodied as a method, a system, an apparatus, and an article of manufacture.

The invention includes a method comprising the steps of: extracting an object of interest from a video stream; analyzing said object from said video stream to obtain an analyzed object; manipulating said analyzed object to obtain a synthetic character; and assembling a virtual video using said synthetic character. The method further comprises the step of tracking said object. The step of assembling comprises the step of inserting the synthetic character into said video stream. The step of assembling comprises removing said synthetic character from said video stream. The method further comprises the step of determining functional areas within said video stream.

The invention includes a method comprising the steps of: obtaining a video stream as a setting for one of a video game, a simulation, a teleconference, and a distance education presentation; tracking a moving object in said video stream; analyzing said moving object to obtain an analyzed moving object; generating a synthetic character based on said analyzed moving object; and assembling a virtual video based on said synthetic character and said video stream.

The invention includes a method comprising the steps of: extracting in real time a background model from a video stream; generating in real time a synthetic character; and assembling in real time a virtual video based on said background model and said synthetic character. The step of generating comprises generating said synthetic character using a computer graphics engine, an object extracted from the video stream, or both a computer graphics engine and an object extracted from the video stream.

The system of the invention includes a computer system to perform the method of the invention.

The system of the invention includes means for processing to perform the method of the invention.

The apparatus of the invention includes a computer to perform the method of the invention.

The apparatus of the invention includes application-specific hardware to perform the method of the invention.

The apparatus of the invention includes a computer-readable medium comprising software to perform the method of the invention.

Moreover, the above objects and advantages of the invention are illustrative, and not exhaustive, of those which can be achieved by the invention. Thus, these and other objects and advantages of the invention will be apparent from the description herein, both as embodied herein and as modified in view of any variations which will be apparent to those skilled in the art.

Definitions

In describing the invention, the following definitions are applicable throughout.
`A "computer" refers to any apparatus that is capable of accepting a structured input,
`
`processing the structured input according to prescribed rules, and producing results of the
`
`processing as output. Examples of a computer include: a computer; a general purpose computer;
`
`a supercomputer; a mainframe; a super mini-computer; a mini-computer; a workstation; a micro(cid:173)
`
`computer; a server; an interactive television; a hybrid combination of a computer and an
`
`··~!:
`
`interactive television; and application-specific hardware to emulate a computer and/or software.
`~~'.f 5 A computer can have a single processor or multiple processors, which can operate in parallel
`and/or not in parallel. A computer also refers to two or more computers connected together via a
`
`network for transmitting or receiving information between the computers. An example of such a
`
`computer includes a distributed computer system for processing information via computers
`
`linked by a network.
`
`20
`
`A "computer-readable medium" refers to any storage device used for storing data
`
`accessible by a computer. Examples of a computer-readable medium include: a magnetic hard
`
`disk; a floppy disk; an optical disk, like a CD-ROM or a DVD; a magnetic tape; a memory chip;
`
`- 5 -
`
`(Atty Dkt No. 37112-164994)
`
`

`

`and a carrier wave used to carry computer-readable electronic data, such as those used in
`
`transmitting and receiving e-mail or in accessing a network.
`
`"Software" refers to prescribed rules to operate a computer. Examples of software
`
`include: software; code segments; instructions; computer programs; and programmed logic.
`
`5
`
`A "computer system" refers to a system having a computer, where the computer
`
`comprises a computer-readable medium embodying software to operate the computer.
`
`A "network" refers to a number of computers and associated devices that are connected
`
`by communication facilities. A network involves permanent connections such as cables or
`
`temporary connections such as those made through telephone or other communication links.
`
`:;::.
`:~~l
`rh/10
`
`,/:
`
`Examples of a network include: an internet, such as the Internet; an intranet; a local area network
`
`(LAN); a wide area network (WAN); and a combination of networks, such as an internet and an
`
`intranet.
`
BRIEF DESCRIPTION OF THE DRAWINGS

The above, as well as other aspects of the inventive system and techniques, are further explained via the description below, taken in combination with the drawings, in which:

Figure 1 illustrates an overview of a virtual video architecture for the invention;

Figure 2 illustrates an example extraction of foreground objects in a video stream;

Figures 3 and 4 illustrate an example for determining object rigidity based on residual flow;

Figure 5 illustrates an example for determining a periodic sequence corresponding to a non-rigid character;

Figure 6 illustrates an example of synthesizing a synthetic character;

Figure 7 illustrates an example of functional areas in a video image; and

Figure 8 illustrates a plan view for the invention.
DETAILED DESCRIPTION OF THE INVENTION

"Virtual video," which is a term coined by the inventor, is the concept that a video stream is altered in real time and treated as a virtual world into which one or more objects are interactively inserted or removed at will. Furthermore, augmentations to the video stream are derived directly from the video stream, rather than being solely computer generated. Thus, "real" objects appear to move through space and/or time in a synthetic manner.

Two fundamental challenges of virtual video are: (1) the ability to remove seamlessly a character from a video stream; and (2) the ability to add seamlessly a synthetic character to a video stream. A synthetic character is derived from the video stream itself, and thus, the motion of the synthetic character must be understood to re-create the synthetic character accurately in different times and places.
In describing the invention, reference is made to Virtual Postman, which is a video game developed by the inventor to demonstrate and experiment with virtual video techniques. With virtual video, real-time, live interactive video is used for the first time as a game playing field. In Virtual Postman, a camera is pointed at a scene, either indoor or outdoor, and the video stream is viewed by a user (e.g., a player) on a desktop computer. Moving objects, like vehicles and people, are detected and presented to the player as "targets." The player simulates shooting the targets, which appear to expire in computer-generated explosions. "Dead" targets are synthetically removed from the video stream in real time. Furthermore, the dead targets are, at random, synthetically brought back to "life" as "zombies" enhanced by computer graphics and re-inserted into the video stream at any position and/or time.

There are several situations when it is necessary to insert synthetic characters into the virtual video stream in the context of Virtual Postman. When a "dead" character is brought back to life, the dead character must appear to interact with the environment in a realistic manner. A subtler situation is when a "live" character is occluded by a "dead" one. Here, because no imagery in the video stream exists to represent the "live" character, synthetic imagery is inserted to complete the real segments without apparent discontinuity to the user. To achieve this, the appearance of the motion of a character is modeled. For the purposes of Virtual Postman, a character which is a vehicle is assumed to be rigid and move with non-periodic motion, and a character which is a human or an animal is assumed to be non-rigid and move with periodic motion. Hence, for Virtual Postman, it is only necessary to determine the rigidity and periodicity of a character.
Figure 1 illustrates an overview of a virtual video architecture for the invention. The architecture is able to operate in real time or non-real time. An example of non-real time operation is using the invention to perform video editing.

In block 1, a source video stream is obtained. Examples of the source video stream include: a video stream input in real time from a camera, such as a digital video camera, a color camera, or a monochrome camera; a video stream generated via computer animation; a video input to a computer, such as from a firewire digital camera interface or through digitization; a video stream stored on a computer-readable medium; and a video stream received via a network.

In block 2, the video stream is smoothed. Preferably, the video stream is smoothed by applying a Gaussian filter to each frame. As an option, other filters are used to achieve desired smoothing properties and processing speed. As an option, block 2 is skipped, which is likely to be beneficial if the video stream is computer generated.
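For illustration, a minimal sketch of the smoothing in block 2 follows. It is not the specification's implementation; the kernel size and sigma are assumed values chosen for illustration, and OpenCV's GaussianBlur stands in for whichever filter meets the desired smoothing and speed trade-off.

```python
import cv2  # OpenCV: cv2.GaussianBlur performs Gaussian smoothing

def smooth_frame(frame, ksize=5, sigma=1.0):
    """Block 2 sketch: Gaussian-smooth one frame of the video stream.

    ksize and sigma are illustrative assumptions; a larger kernel gives
    stronger smoothing at the cost of detail and processing speed.
    """
    return cv2.GaussianBlur(frame, (ksize, ksize), sigma)
```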
In block 3, one or more objects of interest are extracted from the smoothed video stream. The video stream is segmented into foreground objects (or blobs) 4 and background components 5. An object extracted in a frame of the video stream is identified as a foreground object 4, and the remainder of the frame is identified as background components 5. A foreground object is one or more pixels in a frame that are deemed to be in the foreground of the frame because the pixels do not conform to a background model of the frame.

An object of interest is any object in a frame that is of interest to a user and/or for the generation of the virtual video stream. Examples of an object of interest include: a moving object, such as a person or a vehicle; a geographical region, such as a doorway; and a consumer product, such as furniture or clothing.

Numerous techniques are available for extracting an object from a video stream. For example, some approaches are model-based and identify specific types of objects, such as vehicles or people. Other approaches use segmentation, while others use motion detection schemes. Preferably, foreground object extraction is accomplished using a stochastic background modeling technique, such as dynamically adaptive background subtraction. Dynamically adaptive background subtraction is preferred for two reasons. First, dynamically adaptive background subtraction provides a desirably complete extraction of a moving object. Second, a by-product of the motion detection is a model of the background in the video stream, which is provided to block 6.
The preferred technique of dynamically adaptive background subtraction employs motion detection to extract objects from a frame as described in [2] and has several steps. First, a stochastic model of each pixel is created and includes a mean and a threshold pair (B, T) for each pixel, where B represents the mean value of the pixel intensity and T represents a number of standard deviations. Preferably, the stochastic model is an infinite impulse response (IIR) filtered Gaussian model. Preferably, the mean and standard deviation are computed from the red-green-blue (RGB) values of the pixel over time, with R, G, and B treated separately. Thus, the model contains three means and variances for each pixel location, and the procedure is applied to each color band in an identical fashion. Instead of using RGB values, other chromatic representations of the color space are possible, for example: monochrome; hue-saturation-value (HSV); YUV, where Y represents the luminosity of the black and white signal, and U and V represent color difference signals; cyan-magenta-yellow (CMY); and cyan-magenta-yellow-black (CMYK).

Second, using this pair designation (B, T) for each pixel, a pixel having an intensity value greater than T color levels from B is considered a foreground pixel and is otherwise considered a background pixel.

Third, a first frame I_0 of the video stream is taken as the initial background model B_0, and the initial threshold T_0 is set to a default value.
Fourth, a binary motion mask image M_n is determined and contains a "1" at each pixel which represents a "moving" pixel and a "0" at each pixel which represents a "non-moving" pixel. A "moving" pixel is a pixel that does not conform to the background model and is, hence, considered to be in the foreground, and a "non-moving" pixel is a pixel that does conform to the background and is, hence, considered to be in the background. At each frame n, "moving" pixels are detected with the binary motion mask image M_n(x) for pixel x of frame n as follows:

$$M_n(x) = \begin{cases} 1 & \text{if } |I_n(x) - B_{n-1}(x)| > T_{n-1}(x) \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$

where n is the subsequent frame, n-1 is the previous frame, and T is an appropriate threshold.
Fifth, the stochastic models of the "non-moving" pixels are updated. The B value for each pixel is updated using an IIR filter to reflect changes in the scene (e.g., illumination, which makes the technique appropriate to both indoor and outdoor settings):

$$B_n(x) = \begin{cases} B_{n-1}(x) & \text{if } M_n(x) = 1 \\ a\,I_n(x) + (1-a)\,B_{n-1}(x) & \text{if } M_n(x) = 0 \end{cases} \qquad (2)$$

where a is the filter's time constant parameter. Further, the threshold T for each non-moving pixel is updated using an IIR filter as follows:

$$T_n(x) = \begin{cases} T_{n-1}(x) & \text{if } M_n(x) = 1 \\ a\,K\,|I_n(x) - B_{n-1}(x)| + (1-a)\,T_{n-1}(x) & \text{if } M_n(x) = 0 \end{cases} \qquad (3)$$

where K represents the number of standard deviations.
Sixth and finally, the "moving" pixels are clustered into "blobs," preferably using a connected component algorithm. As an option, any clustering scheme is used to cluster the "moving" pixels.
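As a concrete reading of the six steps above, here is a minimal NumPy sketch of dynamically adaptive background subtraction per equations (1)-(3), written for a single color band (an RGB implementation repeats it per channel, as noted above). The default values of a, K, and T_0, and the use of SciPy's connected-component labeling, are assumptions for illustration, not values from the specification.

```python
import numpy as np
from scipy import ndimage  # used only for connected-component labeling

class AdaptiveBackground:
    """Dynamically adaptive background subtraction, equations (1)-(3),
    sketched for one color band. a, K, and t0 are illustrative defaults."""

    def __init__(self, first_frame, t0=20.0, a=0.05, k=3.0):
        self.B = first_frame.astype(np.float64)  # initial background model B0 = I0
        self.T = np.full(first_frame.shape, t0)  # initial threshold T0 (default value)
        self.a, self.k = a, k

    def step(self, frame):
        I = frame.astype(np.float64)
        diff = np.abs(I - self.B)
        M = diff > self.T                        # equation (1): "moving" pixel mask Mn

        still = ~M                               # update only "non-moving" pixels
        self.B[still] = self.a * I[still] + (1 - self.a) * self.B[still]  # equation (2)
        self.T[still] = (self.a * self.k * diff[still]
                         + (1 - self.a) * self.T[still])                  # equation (3)

        # Sixth step: cluster "moving" pixels into blobs via connected components.
        labels, n_blobs = ndimage.label(M)
        return M, labels, n_blobs
```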
With the preferred technique for extracting objects via motion detection, two benefits occur. First, the resulting dynamic background model contains the most recent background image information for every pixel, including the pixels that are occluded. Second, the extracted moving objects are complete and contain neither background pixels nor holes. The extracted moving objects are ideal templates to be removed from and inserted into the virtual video stream. Consequently, removing characters from the video stream is achieved by replacing the pixels of the characters with the corresponding background pixels from B_n.
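The removal operation just described amounts to a single masked assignment. A minimal sketch (the function name and array conventions are assumed for illustration):

```python
import numpy as np

def remove_character(frame, blob_mask, background):
    """Replace a character's pixels with the corresponding pixels of the
    background model Bn, so the character vanishes from the virtual video."""
    out = frame.copy()
    out[blob_mask] = background[blob_mask]  # blob_mask: boolean mask of the character
    return out
```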
Figure 2 illustrates an example extraction of foreground objects in a video stream. Specifically, Figure 2 illustrates blobs extracted from a frame of a video stream in Virtual Postman. From frame 31, individual blobs 32a and 33a are extracted as extracted blobs 32b and 33b, respectively. Blob 32a represents a vehicle, and blob 33b represents a person.

In block 6, the background components 5 extracted in block 3 are stored as a background model. When determining the background model, preferably no foreground objects are present in the frame. With a clean background model, removing "dead" characters is able to be realistically accomplished. The background model 6 provides background components 21 for assembling the virtual video in block 22. The background components 21 are used in block 22 to insert synthetic objects and remove objects. The background model 6 also provides background components 25 for rendering the virtual video stream in block 24.
In block 7, the foreground objects 4 are tracked to obtain tracked foreground objects 8. A foreground object is identified as a character (or a blob, or a target) and is distinguished from other characters and the background components. A character is tracked through occlusions with other characters and background objects. Proper tracking of a character ensures that a removed character is not accidentally reinstated in the virtual video stream.

To track a character in a video stream, numerous techniques are available. For example, a character is tracked using Kalman filtering or the CONDENSATION algorithm. With the invention, because video-derived characters are tracked, a template matching technique, such as described in [2], is preferably used. More preferably, an extension of the template matching technique that provides for tracking multiple objects through occlusion is used, as described in [3]. With the template matching technique in [3], as a character is tracked, templates are collected showing visual variability over time. These image sequences are useful for generating a synthetic character based on the tracked character.
For the preferred technique, a standard frame-to-frame tracking algorithm is employed as described in [3] and has several steps. First, a known character from previous frames is selected, and its position is predicted into the coordinate system of the current frame using the previously computed velocity of the character.

Second, blobs extracted from the current frame are identified as candidate matches based on the proximity of each blob to the predicted position of the character. The position of the character is predicted based on the previous motion of the character.

Third, using a template matching algorithm, the candidate blob which best matches the character is selected to define the position of the character in the current frame. Preferably, the template matching algorithm is a standard sum of absolute differences (SAD) algorithm and involves convolving the character with the blob and taking the sum of the absolute differences of the pixel values for each pixel location. The result is a correlation surface D for each candidate blob. A low value for the SAD correlation indicates a good match, and the candidate blob with the minimum SAD correlation value is selected as the new position of the character. The displacement that corresponds to the minimum of the SAD correlation is considered to be the frame-to-frame displacement of the character.

Fourth and finally, the character is updated with the new information in the subsequent frame. The process repeats until either the character exits the video stream or the video stream ends.
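A minimal sketch of the SAD matching used in the third step follows. It exhaustively slides the character template over a candidate search window and returns the displacement with the minimum SAD value; the brute-force scan and grayscale arrays are simplifying assumptions, and the multi-object occlusion handling of [3] is not reproduced.

```python
import numpy as np

def sad_correlation(template, window):
    """Compute the SAD correlation surface D of a character template over a
    search window, and the displacement minimizing it (taken as the
    frame-to-frame displacement of the character)."""
    th, tw = template.shape
    t = template.astype(np.float64)
    rows = window.shape[0] - th + 1
    cols = window.shape[1] - tw + 1
    D = np.empty((rows, cols))
    for y in range(rows):        # brute-force scan; a real system restricts
        for x in range(cols):    # the search to the predicted position
            patch = window[y:y + th, x:x + tw].astype(np.float64)
            D[y, x] = np.abs(patch - t).sum()
    dy, dx = np.unravel_index(np.argmin(D), D.shape)
    return (dy, dx), D
```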
In block 9, the tracked foreground objects (or characters) 8 are analyzed to obtain analyzed foreground objects 10. Preferably, the analysis includes: (1) performing a rigidity analysis; and (2) performing a periodicity analysis. Through this analysis, a synthetic character, such as a video-derived character, is able to be generated from the foreground object and inserted into the virtual video stream as a realistic synthetic character.

For the rigidity analysis, a character is classified as being a rigid character or a non-rigid character. With a rigid character, like a vehicle, less information is required to generate a synthetic character based on the rigid character, and with a non-rigid character, like a person, more information is required to generate a synthetic character based on the non-rigid character. The type of information required for a rigid character and a non-rigid character is discussed with respect to determining a periodic sequence for a character.
View-invariant approaches exist for determining character rigidity. For example, character rigidity is determined using image matching or image skeletons, which determine walking and running of humans [1]. Preferably, character rigidity is determined as described in [3] by examining the internal optic flow of a character.

Preferably, the rigidity of a character is determined by the optical residual flow of the character, as described in [3]. A local optic flow computation is applied to a tracked character to produce a flow field v(x) for the pixels in that character. The residual flow v_R is a measure of the amount of internal visual motion within the character. In this case, v_R is the standard deviation of the flow vectors:

$$v_R = \frac{\sum_x |v(x) - \bar{v}|}{p} \qquad (4)$$

where \bar{v} is the average flow vector, and p is the number of pixels in the image of the character. If the residual flow v_R for a character is low, the character is assumed to be a rigid character, and if the residual flow v_R for a character is high, the character is assumed to be a non-rigid character.
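Equation (4) is a short computation once a flow field is in hand. The sketch below assumes the local optic flow v(x) for the character's pixels has already been computed (shape (p, 2), one vector per pixel); the rigidity threshold is a hypothetical value, since the specification says only "low" versus "high."

```python
import numpy as np

def residual_flow(flow):
    """Equation (4): average deviation of the flow vectors v(x) from the
    mean flow vector, over the p pixels of the character."""
    v_bar = flow.mean(axis=0)  # average flow vector
    return np.linalg.norm(flow - v_bar, axis=1).sum() / flow.shape[0]

def is_rigid(flow, threshold=0.5):
    """Low residual flow -> rigid (vehicle-like); high -> non-rigid
    (human-like). threshold is an assumed illustrative value."""
    return residual_flow(flow) < threshold
```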
Figures 3 and 4 illustrate an example for determining character rigidity based on residual flow. The example is taken from Virtual Postman. Figure 3 illustrates a clustering 42 based on a residual flow field v(x) for a human (non-rigid) character. Figure 3 also illustrates a clustering 44 based on a residual flow field v(x) for a vehicle (rigid) character. Arms and legs are apparent in the human cluster 42, while there is only one significant area apparent for the vehicle cluster 44.

Figure 4 illustrates the results of the residual flow computations for the residual flow fields of Figure 3. The residual flows v_R are computed over time for a sequence of frames in the video stream containing the two characters. The residual flow 46 for the human character has a high average residual flow, and the human character is considered to be a non-rigid character. The residual flow 47 for the vehicle character is low, and the vehicle character is considered to be a rigid character. The residual flow 46 of the human character also reflects the periodic nature of a human moving with a constant gait.
After the rigidity analysis is completed, the periodicity analysis is performed for block 9. Depending on whether the character is rigid or non-rigid, different information on the periodic sequence of the character is used. To determine the periodicity and the periodic sequence for a character, the motion of the character is analyzed. View-invariant approaches exist for determining the periodicity of a moving object. For example, using image matching or image skeletons, walking and running of a human is determined and results in determining the periodicity of the human [1]. Preferably, the periodic sequence of a character is determined by an image matching technique.
Preferably, a periodicity model for a non-rigid object is generated and includes an image sequence representing a complete cycle of the motion of the non-rigid object and a displacement sequence representing the spatial relationship between each image. Preferably, for a rigid character, only one sample of the image of the rigid character is required for the periodic sequence of the rigid character. In the virtual video stream, the periodic sequence of the rigid or non-rigid character is repeated, scaled, translated, inverted, enhanced by computer graphics, etc., to simulate the character appearing in any position and/or at any time and to create a realistic synthetic character.

For a non-rigid character, periodic motion is assumed for the character. The periodic sequence P(k) = {Q_k, d_k} is extracted from the frames of the video stream in which the character appears, where Q_k represents the visual appearance of the character at frame k, and d_k represents the velocity (or frame-to-frame displacement) of the character from frame k to frame k+1. The periodic sequence P(k) represents the character exhibiting one cycle of motion over a set of frames k ∈ [P_0, P_N], where P_0 represents the first frame in the periodic sequence, and P_N represents the last frame in the periodic sequence.

For a rigid character, the periodic sequence includes only one pair {Q_0, d_0}, where Q_0 is a good view of the rigid character and d_0 is the frame-to-frame displacement, in pixels per frame (or pixels per second), of the rigid character.
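As a data-structure sketch of the periodic sequence P(k) = {Q_k, d_k} (the class and field names are hypothetical, chosen for illustration), the templates and displacements can be stored as paired lists, with a rigid character holding a single pair:

```python
from dataclasses import dataclass, field
from typing import List, Tuple
import numpy as np

@dataclass
class PeriodicSequence:
    """P(k) = {Qk, dk}: one cycle of a character's motion.

    templates[i] is the visual appearance Qk at frame k; displacements[i]
    is the frame-to-frame displacement dk. A rigid character stores only
    the single pair {Q0, d0}."""
    templates: List[np.ndarray] = field(default_factory=list)
    displacements: List[Tuple[float, float]] = field(default_factory=list)
```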
Periodicity is preferably determined by a method similar to that discussed in [4] and has several steps. First, for each instance of a character detected in the video stream, visual templates are collected over time, which results in a series of visual templates Q_1, ..., Q_n for the character for frames 1 to n. The frame-to-frame displacements d_1 to d_n are also collected for frames 1 to n. A sufficient number of visual templates are collected to ensure that at least one cycle of character motion is accounted for by the series of visual templates.

Second, visual template Q_n is matched with each of the collected visual templates Q_1, ..., Q_n using the SAD correlation algorithm discussed above for tracking a foreground object.

Third, visual template Q_k is identified from among the visual templates Q_1, ..., Q_n as the visual template having the minimum correlation value determined in the second step. Visual template Q_k is the closest visual template in appearance to the visual template Q_n and is selected as the first visual template in the periodic sequence. Visual template Q_{n-1} is selected as the last visual template in the periodic sequence.
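The three steps above reduce to matching the newest template against its predecessors and taking the argmin of the SAD scores. A minimal sketch, assuming the collected templates have been normalized to a common size so they can be compared directly:

```python
import numpy as np

def find_period_start(templates):
    """Given visual templates Q1..Qn (newest last), return the index k of
    the earlier template most similar in appearance to Qn; the periodic
    sequence then runs from Qk through Q(n-1)."""
    qn = templates[-1].astype(np.float64)
    scores = [np.abs(q.astype(np.float64) - qn).sum()  # SAD against Qn
              for q in templates[:-1]]
    return int(np.argmin(scores))
```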
Figure 5 illustrates an example of determining a periodic sequence for a non-rigid character. The example is taken from Virtual Postman, and the non-rigid character in Figure 5 is a human. The most recent template 51 is compared (i.e., convolved) with previous templates 52, and the results are plotted as a template match sequence 53. The previous template 54 that most closely matches the most recent template 51 has a minimum for the template match sequence 53 and is considered to mark the beginning of a periodic sequence 55.
In block 11,
