`Bing-Yu Chen, Jan Kautz, Tong-Yee Lee, and Ming C. Lin
`(Guest Editors)
`
`Volume 30 (2011), Number 7
`
`Creating Fluid Animation
`from a Single Image using Video Database
`
`Makoto Okabe†1,2
`
`Ken Anjyo‡3
`
`Rikio Onai§1
`
`1The University of Electro-Communications
`
`2JST PRESTO
`
`3OLM Digital, Inc. / JST CREST
`
`Abstract
`
`We present a method for synthesizing fluid animation from a single image, using a fluid video database. The user
`inputs a target painting or photograph of a fluid scene along with its alpha matte that extracts the fluid region of
`interest in the scene. Our approach allows the user to generate a fluid animation from the input image and to enter
`a few additional commands about fluid orientation or speed. Employing the database of fluid examples, the core
`algorithm in our method then automatically assigns fluid videos for each part of the target image. Our method
`can therefore deal with various paintings and photographs of a river, waterfall, fire, and smoke. The resulting
`animations demonstrate that our method is more powerful and efficient than our prior work.
`
`Categories and Subject Descriptors (according to ACM CCS): I.4.8 [Image Processing and Computer Vision]: Scene
`Analysis—Motion
`
`1. Introduction
Pictures were first animated in lift-the-flap books, and the
animation of pictures is now recognized as a classic visual
effect in the animation industry. It is also an active area of
research within the field of computer graphics [HAA97,
IMH05, HDK07]. In the animation of pictures,
`the designer specifies a single target image along with sev-
`eral characteristics regarding motion and uses a computer to
`synthesize animated sequences derived from the input. Of
`course, the level of difficulty involved in animating a picture
`varies markedly according to the complexity of the scene and
`the objects to be animated. It is difficult to animate a picture
`of a fluid. Early research [CGZ∗05,OAIS09] was successful
`in designing fluid animation. Here, we focus on this chal-
`lenging issue using an efficient data-driven approach to han-
`dle a wide variety of fluid motions that could not be animated
`previously.
`
`A picture of a fluid can be animated in three ways. First,
`a physics-based fluid simulation can be applied to the fluid
`part of a target image. However, it is difficult to set the many
`
`† m.o@acm.org
`‡ anjyo@olm.co.jp
`§ onai@onailab.com
`
`
`physical parameters required to reproduce the target appear-
`ance. The second method can synthesize relatively calm
`fluid motions, such as water surfaces [CGZ∗05]. However,
`our study focuses on synthesizing more dynamic motions,
`such as water splashes, smoke, or fire, in addition to calm
`fluids. A third method, which we described in our previous
`paper, requires users to specify a video and then transfers
`its fluid features to the target image [OAIS09]. This tech-
`nique relies on a single video example that limits variation in
`the available fluid features. Another problem is that the user-
`specified motion field is temporally stationary, which limits
`the dynamics. The user must also expend considerable effort
`to find an appropriate video example and specify the motion
`field.
`
`To address these problems, we develop a data-driven
`method to create a fluid animation from a picture (Fig. 1).
`The user inputs a target image and an alpha matte that ex-
`tracts the fluid region of interest, while providing a few op-
`tional suggestions about fluid motion (i.e., flow direction and
`speed). We employ a video database that includes hundreds
`of video examples of fluids, which therefore helps the user
`to synthesize better quality animations with less effort than
`with previous methods. The quality of the synthesized ani-
`mation is improved due to the large variety of available video
`examples. We test the flexibility of our method by synthesiz-
`
`
`Figure 1: We employ a database of video examples of fluids (a). The user specifies a target image (b) with a few optional
`suggestions about fluid motion, e.g., sketches of flow direction, shown as orange arrows. The user also provides an alpha matte
`of the region of interest (c). The system synthesizes an animation (d).
`
`ing a wide variety of fluid animations, i.e., river, waterfall,
`smoke, and fire.
`
The present study makes three major technical contributions
to solve the problem addressed in this paper.
`First, we develop an image-search-based technique to effi-
`ciently extract local fluid features from a video database. We
`then cut each video into smaller pieces, which are vector-
`quantized to construct a bag-of-features codebook for effi-
`ciently finding a video piece with an appearance that is sim-
`ilar to part of the picture. Second, we develop an algorithm
`to assign the video pieces to the picture so that the integrated
`appearance and motion become smooth. This task is for-
`mulated as a multi-label assignment problem based on the
`Markov random field (MRF). The third contribution is the
`extension of the original synthesis algorithm. In our previ-
`ous study, we decomposed a video into three components,
`i.e., the average image, the motion field, and the residu-
`als [OAIS09]. However, we found that two components (the
`average image and the differences between the average im-
`age and the original frames) produce a wider variety of dy-
`namics than previously possible by using the video database
`effectively. Additionally, our approach markedly reduces the
`user burden, as illustrated below.
`
`2. Previous Work
`
`Image and Video Database Image and video searching is
`an active research field. Recently, this kind of search has
`also been applied to image and video syntheses for com-
`puter graphics applications. Millions of images are useful
`for scene completion [HE07]. Sketch2Photo allows a user to
`draw rough sketches and search for adequate images from a
`database to synthesize a desired scene [CCT∗09]. Skyfinder
`searches for a user-desired sky image, modifies its appear-
`ance, and creates a composite [TYS09]. With regard to video
`synthesis, SIFT Flow can be used to estimate the motion
`field in a single image and also to transfer part of a video
`example onto a single image [LYT∗08]. Webcam Clip Art
`provides a video database of outdoor scenes and is useful
`for scene relighting [LEN09]. However, no existing method
`addresses the problem of animating a picture of fluid using
`a video database.
`
`Animating Pictures Many methods have been proposed for
`creating an animation from a single image [HAA97,IMH05,
`HDK07]. These methods are useful for touring into a pic-
`ture and creating character animations, but fluid animation
`is beyond their scope. Chuang et al. proposed a method to
`synthesize an animation with stochastic motion from a sin-
`gle image [CGZ∗05], but this technique only supports oscil-
`lation of the water surface, such as ripples, and not flowing
`fluid animation. Lin et al. synthesized animation from mul-
`tiple, instead of single, high-resolution stills [LWW∗07].
`
`Video Texture Synthesis Video texture synthesis has
`also been well-explored [WL00, BJEYLW01, SSSE00,
`DCWS03]. However, it is difficult for users to modify the
`appearance or motion of a synthesized animation. Wang and
`Zhu analyzed fluid animation, represented it with textons,
`and synthesized an animation [WZ03]. Bhat et al. proposed
`a sketching interface that enables users to edit a fluid anima-
`tion of a video example [BSHK04]. Users can also change
`the appearance of the animation, but the problem of animat-
`ing a picture of fluid has not yet been solved. Kwatra et al.
`developed a method that can design the animation of a tex-
`ture flowing over a user-specified motion field [KEBK05],
`where only a stationary motion field is demonstrated with-
`out any high-frequency fluid features. Ma et al. extended
`example-based texture synthesis to allow an exemplar to af-
`fect the details of motion fields [MWGZ09]. These methods
`enable users to modify an existing fluid animation and syn-
`thesize an animation, but they do not allow users to specify a
`single image as an appearance constraint. Okabe et al. devel-
`oped a technique for animating a picture of fluid [OAIS09];
`this technique requires users to design a motion field manu-
`ally and to search for an appropriate video example. We ex-
`tended this work, incorporating a video database to reduce
`user effort and to improve the quality of the synthesized an-
`imations.
`
`3. System Overview
Given a target image of fluid from which the user wants to
create a fluid animation, our process involves selecting appropriate
`video examples from a database, assigning them onto parts
`of the target image, and integrating them all seamlessly into
`
`
`Figure 2: System overview.
`
`a final animation. We assume that the user inputs a single
`target image and an alpha matte that extracts the fluid region
`of interest. The user can also provide a desired motion field
`by sketching the flow direction and painting a speed map,
`which is used as a constraint in the assignment process.
`
`Our system consists of three components: 1) construction
`of a video database of fluids (Fig. 2-a), where each video
`example is cut into small pieces, 2) a best-match search for
`an appropriate video example piece and assignment of this
`to a part of the target image, and 3) synthesis of the final
`animation through seamless integration of all the assigned
`pieces and adjustment of the overall appearance. The of-
`fline process of database construction begins with gathering
`original video examples of fluids (Fig. 2-b). To increase the
`number of video examples, we cut each video example into
`smaller pieces (Fig. 2-c). For each video example, we com-
`pute the average image by averaging the frames (Fig. 2-e) to
`obtain representative appearance information. We also com-
`pute differences between the average image and the frames
`(Fig. 2-f) that have no significant color properties but cap-
`ture high-frequency fluid features. From all of the averaged
`images in the database, we construct a bag-of-features code-
`book and describe each average image using a histogram of
`visual words (Fig. 2-d).
`
`Given a target image (Fig. 2-g), we cut it into pieces using
`the same process used in the database construction (Fig. 2-
`i). Next, we compute the histogram of visual words for each
`piece (Fig. 2-h), perform a best-match search between his-
`tograms of visual words (see Figs. 2-d and h), and assign
`video example pieces that have appearances similar to target
`image pieces. When a user-specified motion field is given, it
`is used as a constraint for solving the assignment problem.
`Based on the assignment results, differences are copied onto
`the corresponding target image pieces (Fig. 2-j). Finally, all
assigned differences are integrated seamlessly and the final
appearance is synthesized by adjusting the appearance
`(Fig. 2-k).
`Our data-driven transfer of fluid features to a target image
`was inspired by Image Analogies [HJO∗01]; the transferred
`differences relate to the target image piece in the same way
`as the differences originally related to the average image of
`the video example. Our method differs from Image Analo-
`gies in that it is not pixel-based but rather is a patch-based
`texture synthesis, similar to Image Quilting [EF01]. When
`integrating patches, Image Quilting computes a minimum
`error boundary cut to remove seams between patches; we
`compute alpha blending using a computationally inexpen-
`sive image pyramid. A video example piece is decomposed
`into its average image and differences; the former captures
`the overall low-frequency appearance and the latter has high-
`frequency fluid features. The high-frequency features can be
`integrated seamlessly through simple alpha blending.
`
`4. Database of Video Examples
`We need to prepare as many video examples as possible, be-
`cause a large amount of data is always important to achieve
`high-quality data-driven image synthesis [HE07, TYS09].
`However, despite the large volume of video data available
`on the Internet, we found it difficult to gather the tens of
`thousands of video examples suitable for our purpose. Our
`criteria for video examples are:
• The camera must be fixed and focused on the fluid itself, i.e.,
`the fluid must be the primary character of the video.
`• The video examples of fluids need to be of sufficiently
`high quality for animation synthesis.
• To simplify our method, we use only video examples with
no significant moving objects other than the fluid.
`We gathered several hundred video examples of water scenes
`from commercially available video footage collections, but
`discarded any that did not satisfy our requirements. Finally,
`
`
`we have a total of 151 video examples of water scenes
`(Fig. 3). All video examples in the database have a resolution
`between 640× 480 pixels and 400× 300 pixels.
`
`Figure 3: Thumbnails of video examples in the database.
`All video examples satisfy three requirements: 1) the fluid is
`the main character in each scene, 2) each video example has
a resolution between 640 × 480 and 400 × 300
`pixels, and 3) no other moving objects appear in the scene.
`
`To increase the amount of data, our strategy is to observe
`each video example locally. For example, a video example of
`a waterfall may have a different global shape from the shape
`of the waterfall in the target image. However, its local parts,
`such as droplets or splashes, may be used to animate a part
`of the target image.
`
`To increase the number of video examples in the database,
`we flip each video example horizontally (thereby doubling
`the number of examples); we also scale and rotate each ex-
`ample (Fig. 4-a). We then cut examples into smaller video
`pieces (Figs. 2-c and Fig. 4-b). We make three versions of
`each video example by scaling them to 50, 75, and 100% of
`their original size. We also make three versions of each video
`example by rotating the original by −22.5, 0.0, and +22.5
`degrees. Each version is then cut into smaller pieces with a
`resolution of 48× 48 pixels, allowing neighboring pieces to
overlap each other (Fig. 4-b). A similar patch-library idea is
also used for synthesizing facial images [MPK09].
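As a concrete illustration of this augmentation and cutting step, the following Python/OpenCV sketch flips, scales, rotates, and cuts a single frame into overlapping 48 × 48 pieces. The stride of 24 pixels and the function names are our own assumptions (the paper does not specify them), and the same geometric transform would be applied to every frame of a video example.

import cv2

PATCH, STRIDE = 48, 24   # piece size from the paper; the stride (overlap) is an assumed value

def augmented_versions(frame):
    """Yield horizontally flipped, scaled (50/75/100%), and rotated (+/-22.5 deg) versions."""
    for flip in (False, True):
        f = cv2.flip(frame, 1) if flip else frame
        for scale in (0.5, 0.75, 1.0):
            g = cv2.resize(f, None, fx=scale, fy=scale)
            for angle in (-22.5, 0.0, 22.5):
                h, w = g.shape[:2]
                M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle, 1.0)
                yield cv2.warpAffine(g, M, (w, h))

def cut_pieces(frame):
    """Cut one frame into overlapping 48 x 48 pieces (Fig. 4-b)."""
    h, w = frame.shape[:2]
    for y in range(0, h - PATCH + 1, STRIDE):
        for x in range(0, w - PATCH + 1, STRIDE):
            yield (x, y), frame[y:y + PATCH, x:x + PATCH]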
`
`Figure 4: Database construction.
`
`Video example pieces that include not only fluids but also
`other objects such as rocks, trees, or the sky degrade the effi-
`ciency of the database. Therefore, we remove such stationary
`examples by calculating the significance of motion. For each
`piece, we compute optical flow between neighboring frames
`(Fig. 5-b) to obtain an average through all of the frames
`
(Fig. 5-c). In a video example piece, only pixel positions with
significant averaged motion are used; other pixels are masked
out (Fig. 5-d). In addition, any video example piece in which no
pixel position has significant motion is completely removed
`from the database. We use OFLib to compute a dense optical
`flow [ZPB07]. This algorithm has three important open pa-
`rameters: θ, λ, and the number of iterations, which we set to
`1.0, 0.8, and 25, respectively. Because a fluid video has high-
`frequency dynamics, we set parameters that reduce sensitiv-
`ity to such dynamics but still yield relatively smooth results.
In total, this process yields 246,477 video example pieces for the water scenes.
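The masking step can be sketched as follows in Python; note that OpenCV's Farneback flow is used here only as a stand-in for OFLib [ZPB07], and the significance threshold is an assumed value.

import cv2
import numpy as np

def motion_mask(frames, threshold=0.5):
    """Average dense optical flow over a piece and mask out near-static pixels (Fig. 5)."""
    gray = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    flows = [cv2.calcOpticalFlowFarneback(gray[i], gray[i + 1], None,
                                          0.5, 3, 15, 3, 5, 1.2, 0)
             for i in range(len(gray) - 1)]
    mean_flow = np.mean(flows, axis=0)             # average motion field (Fig. 5-c)
    speed = np.linalg.norm(mean_flow, axis=2)      # per-pixel magnitude of the averaged motion
    mask = speed > threshold                       # keep only significant motion (Fig. 5-d)
    keep_piece = bool(mask.any())                  # discard the whole piece if nothing moves
    return mean_flow, mask, keep_piece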
`
`Figure 5: Processing a video example piece.
`
`To ensure efficient best-match searching and efficient as-
`signment of video examples, we construct a bag-of-features
`codebook, by which each video example piece is described
`using a histogram of visual words (Fig. 2-d). The bag-
`of-features technique works well for searches involving
`small image patches such as those in our database, espe-
`cially when it is combined with spatial pyramid represen-
`tation [SSSFF09]: while bag-of-features has only informa-
`tion of presence of textures, the spatial pyramid representa-
`tion adds information of location of textures that is important
`even for our small patches. We make a representative image
`by averaging the frames of each video example piece when
constructing the codebook (Fig. 2-e). Averaging introduces the
motion blur that often appears in a picture of fluid; this blur
is often intentionally portrayed by a painter or a photog-
rapher to visualize the trajectories of fluid motion. We ex-
tract SIFT features from each representative image by applying
a SIFT descriptor, computing a 128-dimensional feature vec-
tor at each point of a regular 9 × 9 grid over the 48 × 48
pixel image. Using SIFT descriptors on a regular grid is often
better for image searching than using key points detected by
difference-of-Gaussians [LP05].
`
`We then compute vector quantization on all extracted
`SIFT features to obtain the visual words used to construct the
`codebook. For the vector quantization, we apply a repeated
`cluster bisectioning approach, because this method is report-
`edly faster and yields a better quality of clustering than K-
means [ZKF05]. As with K-means, we must choose the number of
clusters, k; based on experiments with settings of 100,
200, 300, and 500, we chose k = 200 as the best parameter.
`Given the codebook, we generate a bag of features for each
`representative image of a video example piece. We assign
`
`
`each SIFT feature to its nearest visual word and construct a
`histogram with spatial pyramid representation [LSP06].
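A hedged sketch of this descriptor pipeline is shown below: dense SIFT on a 9 × 9 grid, assignment of each descriptor to the nearest of the k = 200 visual words, and a simple two-level spatial pyramid histogram. The codebook is assumed to have been built already by the repeated cluster bisectioning described above; the grid coordinates, the keypoint size, and the pyramid layout are our assumptions, not details from the paper.

import cv2
import numpy as np

sift = cv2.SIFT_create()

def dense_sift(patch48):
    """128-D SIFT descriptors at a regular 9 x 9 grid over a 48 x 48 grayscale patch."""
    coords = np.linspace(4, 44, 9)
    pts = [cv2.KeyPoint(float(x), float(y), 6.0) for y in coords for x in coords]
    _, desc = sift.compute(patch48, pts)
    return desc, pts

def bof_histogram(patch48, codebook):
    """Histogram of visual words with a simple two-level spatial pyramid."""
    desc, pts = dense_sift(patch48)
    words = np.argmin(np.linalg.norm(desc[:, None, :] - codebook[None, :, :], axis=2), axis=1)
    k = len(codebook)
    hists = [np.bincount(words, minlength=k)]                  # level 0: whole patch
    for x0, y0 in [(0, 0), (24, 0), (0, 24), (24, 24)]:        # level 1: 2 x 2 cells
        idx = [i for i, p in enumerate(pts)
               if x0 <= p.pt[0] < x0 + 24 and y0 <= p.pt[1] < y0 + 24]
        hists.append(np.bincount(words[idx], minlength=k))
    return np.concatenate(hists)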
`
field at x, i.e., [M_1(x), M_2(x), ..., M_N(x)], and N is the
number of frames. We can thus define the smoothness of
neighboring motion fields, M^i and M^j, as

smooth(M^i, M^j) = d(\hat{M}^i_1, \hat{M}^j_1) + σ ∑_{f=2}^{N/2} d(\hat{M}^i_f, \hat{M}^j_f),    (2)

d(F, G) = ∑_{x ∈ Ω} (F(x) − G(x))^2,    (3)

where Ω is the set of pixel positions in the overlapping area
between neighboring target image pieces, F and G, and σ is a
parameter balancing the DC component, which corresponds
to the average motion field, against the AC components, which
describe the higher-frequency dynamics.
`
`Figure 6: Fourier analysis on motion fields.
`
`For the third criterion, we can describe the textureness of
`an assigned video example piece and measure its similarity
`to neighboring pieces. Fig. 7 illustrates why this criterion
`is required. The two video examples of smoke have differ-
`ent appearances: one has strong contrast and the other has a
`smooth appearance. However, because they have similar mo-
`tions, flowing from left to right, the first and second criteria
`allow them to be neighboring pieces and will result in a vi-
`sual artifact with inconsistent neighboring textures: one has a
`high contrast appearance and its neighbor has a low contrast
`appearance. This is because the first criterion is based on a
`bag of features of a SIFT descriptor, which is invariant to
`illumination and contrast change and therefore too weak to
`preserve consistency. To solve this problem, we can describe
`the textureness of each frame of a video example piece us-
`ing a four-level scale-space Laplacian pyramid, and sum the
`texture through all frames; this process distinguishes Fig. 7-
`a and b, because the former has stronger coefficients than the
latter through all of the sub-bands. The textureness of each
video example piece, T, has dimensions of 48 × 48 × 4. The
similarity of neighboring textureness values, texture(T^i, T^j),
is computed as the Euclidean distance over the overlapping
area, similarly to Eq. 3.
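A possible sketch of this textureness descriptor and its comparison is given below; taking absolute band-pass responses and the exact resizing back to 48 × 48 are our assumptions.

import cv2
import numpy as np

def textureness(frames):
    """Four-level Laplacian-pyramid responses, summed over all frames, as a 48 x 48 x 4 tensor."""
    T = np.zeros((48, 48, 4), np.float32)
    for f in frames:
        cur = cv2.cvtColor(f, cv2.COLOR_BGR2GRAY).astype(np.float32)
        for level in range(4):
            down = cv2.pyrDown(cur)
            up = cv2.pyrUp(down, dstsize=(cur.shape[1], cur.shape[0]))
            band = np.abs(cur - up)                          # band-pass energy at this scale
            T[:, :, level] += cv2.resize(band, (48, 48))
            cur = down
    return T

def texture(Ti, Tj, overlap):
    """Distance of textureness over the overlapping area, analogous to Eq. 3."""
    return float(np.sum(((Ti - Tj) ** 2)[overlap]))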
`
We solve the assignment problem by first selecting multiple
candidate video example pieces for each target image piece,
based on appearance similarity (Eq. 1). We set the number of can-
`didates to 100, i.e., each target image piece has 100 labels,
`then determine which label should be assigned to each tar-
`get image piece. We next formulate the assignment problem
`with all of the criteria using MRF, which can be expressed
`
`5. Assignment of Video Example Pieces
`Given a target image, we first decompose it into pieces
`(Fig. 2-i) in the same manner as in the database construc-
`tion (Fig. 4-b). Next, we assign an appropriate video exam-
`ple piece to each target image piece. Our solution of the
`assignment problem involves three criteria: 1) appearance
`similarity between a video example piece and a target im-
`age piece; 2) smoothness of motion fields between neighbor-
`ing assigned pieces; and 3) appearance similarity between
`neighboring assigned pieces. The first criterion means that
`the more similar a video example piece is to a target image
`piece, the more likely it is that both pieces have similar fluid
`motions. The second criterion means that if a part of the tar-
`get image is assigned a fluid motion, its neighboring piece
`is likely to have a similar fluid motion, i.e., their velocities
`and temporal video textural patterns would be similar. The
`third criterion prevents neighboring pieces from generating
`an inconsistent appearance in the video texture.
`To address the first criterion, we compute appearance sim-
`ilarity by comparing bag-of-features histograms (Figs. 2-d
`and h). As in the database construction, we extract SIFT
`features on the regular grid in each target image piece and
`describe the image feature as a histogram of visual words.
`Appearance similarity is computed using a histogram inter-
`section,
`
I(H_e, H_t) = ∑_{i=1}^{k} min(H_e[i], H_t[i]),    (1)

where H_e and H_t are the histograms of visual words of a
video example piece and a target image piece, respectively.
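Eq. 1 is straightforward to compute; a minimal sketch:

import numpy as np

def intersection(He, Ht):
    """Eq. 1: histogram intersection of two visual-word histograms."""
    return float(np.minimum(He, Ht).sum())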
For the second criterion, a simple definition of the smoothness
between neighboring motion fields is the similarity of their
average motion fields (Fig. 5-c).
`However, this definition is insufficient, as shown in Fig. 6,
`when each video example piece has a similar average mo-
`tion field; that is, when they all flow from top to bottom at a
`similar speed. However, if we select motion fields and assign
`them to neighboring target image pieces, the result looks un-
`natural because their temporal video textural patterns differ.
`Fig. 6-a shows a small waterfall captured from a short dis-
`tance in which the motion is temporally unstable due to vis-
`ible droplets and splashes. Fig. 6-b , however, shows a large
`waterfall captured from a long distance, with a temporally
`smooth motion. These differences in temporal patterns are
`well described by Fourier analysis; Fig. 6-a has uniformly
`strong coefficients at all frequencies whereas Fig. 6-b has
`strong coefficients only in the DC component and weak co-
`efficients at other frequencies. We can compute the Fourier
transform at each pixel position x as \hat{M}(x) = F(M(x)), where F
is the one-dimensional discrete Fourier transform followed by
conversion to the power spectrum, and M(x) is the sequence of motion
`
`
`Figure 7: Analyzing textureness using Laplacian pyramid.
`
`as:
`
`Figure 8: User-specified motion field. Here, the user wants
`fire to move from bottom to top along the green arrows (a
`and b) with a slower speed at the bottom and a higher speed
`at the top (c).
`tween neighboring pieces. Our strategy for seamless inte-
`gration is to decompose a video example piece into its av-
`erage image and differences. We average all the frames of
each video example piece as A = (1/N) ∑_{i=1}^{N} F_i, where F_i are
the frames of the video example piece and N is the number
of frames. We compute the differences as D_i = F_i − A. These
differences capture the high-frequency fluid features, not the
low-frequency appearance, and can therefore be integrated
using alpha blending without introducing artifacts (Fig. 9). To
smoothly merge the overlapping areas between neighboring tar-
`get image pieces, we apply an image pyramid [BA83] and
`perform alpha blending for each sub-band. This generates
`better results than standard alpha blending along boundaries
`because it can introduce more continuity at low frequencies
`and also preserves high-frequency fluid features.
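A hedged sketch of this per-sub-band alpha blending over a Laplacian pyramid [BA83] follows; the number of pyramid levels, the alpha map (which would ramp across the overlap region), and the function names are assumptions.

import cv2
import numpy as np

def laplacian_pyramid(img, levels=4):
    pyr, cur = [], img.astype(np.float32)
    for _ in range(levels - 1):
        down = cv2.pyrDown(cur)
        up = cv2.pyrUp(down, dstsize=(cur.shape[1], cur.shape[0]))
        pyr.append(cur - up)                       # band-pass layer
        cur = down
    pyr.append(cur)                                # low-pass residual
    return pyr

def blend_differences(Da, Db, alpha, levels=4):
    """Alpha-blend two overlapping difference images for every sub-band, then collapse."""
    pa, pb = laplacian_pyramid(Da, levels), laplacian_pyramid(Db, levels)
    out = None
    for la, lb in zip(reversed(pa), reversed(pb)):       # coarse to fine
        a = cv2.resize(alpha.astype(np.float32), (la.shape[1], la.shape[0]))
        if la.ndim == 3:
            a = a[..., None]
        band = a * la + (1.0 - a) * lb
        out = band if out is None else cv2.pyrUp(out, dstsize=(la.shape[1], la.shape[0])) + band
    return out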
`
`Figure 9: Integration of neighboring video example pieces.
`The left side illustrates how each target image piece has an
`assigned video example piece, and the right side illustrates
`how corresponding differences are copied and integrated us-
`ing alpha blending with an image pyramid.
`
`Recovery of Color Appearance The integrated differences
`have video textures but no color appearance. To recover
`the color of the target image, we can synthesize an image
`that corresponds to the average image (Fig. 2-e) and add it
`onto the integrated differences. We can synthesize this ap-
`proximated average image by applying image-based motion
`blur [BE01] and Gaussian blur to the target image. For mo-
`tion blur, we compute a motion field over the target image,
`based on the motion fields of assigned video pieces. An open
`parameter to simulate shutter speed controls the effect of
`motion blur. The additional Gaussian blur removes the sharp
`edges that are introduced by image-based motion blur; ker-
`nel size is determined in proportion to the magnitude of the
`motion field.
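An illustrative approximation of this step is sketched below; a single line kernel oriented along the mean flow is a simplification of image-based motion blur [BE01], and the shutter length and blur sizes are assumed parameters.

import cv2
import numpy as np

def approx_average(target, flow, shutter=9):
    """Blur the target along the mean flow direction, then Gaussian-blur in proportion to flow magnitude."""
    angle = np.degrees(np.arctan2(np.mean(flow[..., 1]), np.mean(flow[..., 0])))
    k = np.zeros((shutter, shutter), np.float32)
    k[shutter // 2, :] = 1.0 / shutter                           # horizontal line kernel
    M = cv2.getRotationMatrix2D((shutter / 2.0, shutter / 2.0), angle, 1.0)
    k = cv2.warpAffine(k, M, (shutter, shutter))                 # orient it along the flow
    blurred = cv2.filter2D(target, -1, k / max(k.sum(), 1e-6))   # renormalize after rotation
    sigma = max(1.0, float(np.linalg.norm(flow, axis=2).mean()))
    return cv2.GaussianBlur(blurred, (0, 0), sigma)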
`
`Appearance Matching Finally, we can match the appear-
`ance of the synthesized animation more precisely to the
`
`
argmin_{l_i} E = ∑_p V_p(l_i) + λ ∑_{(p,q)} W_{p,q}(l_i, l_j),    (4)

where the labels l_i and l_j are assigned to the neighboring
target image pieces p and q. V_p is the data term and W_{p,q} is
the smoothness term; these are defined using a balancing
parameter ρ, as follows:

V_p(l_i) = I(H_p, H_{l_i}),    (5)
W_{p,q}(l_i, l_j) = smooth(M_{l_i}, M_{l_j}) + ρ · texture(T_{l_i}, T_{l_j}).    (6)
For the energy minimization, we apply the graph-cut
method with α-expansion [BVZ01], using the package provided
by [SZS∗08]. After the graph cut is run, each target
`image piece has an assigned video example piece that best
`satisfies all criteria.
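The following sketch illustrates the energy of Eqs. 4 to 6 in code. The paper minimizes it with graph-cut α-expansion [BVZ01, SZS∗08]; here a simple iterated-conditional-modes loop stands in for that optimizer purely for illustration. The cost arrays are assumptions as well, e.g., V[p, l] could be the negated histogram intersection of Eq. 5 (negated so that higher similarity gives lower cost) and W[p, q, :, :] the sum of the terms in Eq. 6.

import numpy as np

def assign_labels(V, W, neighbors, lam=1.0, iters=10):
    """V[p, l]: data cost for giving target piece p its l-th candidate.
    W[p, q, lp, lq]: smoothness cost between neighboring pieces p and q.
    neighbors: list of (p, q) index pairs. Returns one label per piece."""
    P, L = V.shape
    labels = V.argmin(axis=1)                      # initialize with the best data term
    for _ in range(iters):                         # ICM sweeps (a stand-in for alpha-expansion)
        for p in range(P):
            cost = V[p].copy()
            for a, b in neighbors:
                if a == p:
                    cost += lam * W[a, b, :, labels[b]]
                elif b == p:
                    cost += lam * W[a, b, labels[a], :]
            labels[p] = int(cost.argmin())
    return labels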
`
`User-Specified Motion Field A user can manually spec-
`ify a desired motion field and influence the assignment pro-
`cess, which is useful for controlling dynamics or modify-
`ing a failed assignment. To make the editing process as easy
as possible, we decompose the motion field into orientation
and speed maps [OAIS09]. The user draws a sparse set of strokes
(Fig. 8-a), and the system generates an orientation map by
interpolating them using a radial basis function
`(Fig. 8-b). The speed map is a gray-scale image that is easily
`edited using paint software (Fig. 8-c). The user can specify
`the orientation and the speed map, or only one of these. It
`is possible to edit a speed map from scratch, but it is eas-
`ier to edit a speed map obtained from the already assigned
`video example pieces. The edited motion field affects the se-
`lection of the 100 candidates: before the selection, we can
`narrow down candidates in advance by discarding video ex-
`ample pieces with motions that differ greatly from the user-
`specified motion field.
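As an illustrative sketch, the orientation map could be interpolated from stroke samples with radial basis functions as follows; interpolating the (cos, sin) components instead of raw angles, and the use of SciPy's Rbf, are our own choices rather than details from the paper.

import numpy as np
from scipy.interpolate import Rbf

def orientation_map(xs, ys, angles_deg, width, height):
    """Dense orientation map (Fig. 8-b) interpolated from samples along the user strokes (Fig. 8-a)."""
    a = np.radians(angles_deg)
    fu = Rbf(xs, ys, np.cos(a), function='thin_plate')
    fv = Rbf(xs, ys, np.sin(a), function='thin_plate')
    gx, gy = np.meshgrid(np.arange(width), np.arange(height))
    return np.degrees(np.arctan2(fv(gx, gy), fu(gx, gy)))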
`
`6. Video Synthesis
`Integration of Assigned Video Example Pieces After as-
signment, we expect neighboring target image pieces to have
video textures. However, because they
`are likely to come from different parts of different video ex-
`amples, naïve integration of these pieces causes visible ar-
`tifacts, especially discontinuities along the boundaries be-
`
`
`target image. We apply histogram-matching between each
`frame of the synthesized animation and the target image in
`the same manner as [OAIS09] using Heeger and Bergen’s
`texture synthesis algorithm [HB95]. To recover local appear-
`ance, we divide the image space into regular grids and apply
`the process to each region independently.
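For illustration, the per-region matching might be sketched as follows, with scikit-image's match_histograms standing in for the Heeger-Bergen procedure [HB95] that the method actually uses, and with an assumed grid size (channel_axis requires a recent scikit-image).

from skimage.exposure import match_histograms

def match_appearance(frame, target, grid=64):
    """Histogram-match every grid cell of a synthesized frame to the target image."""
    out = frame.copy()
    h, w = target.shape[:2]
    for y in range(0, h, grid):
        for x in range(0, w, grid):
            src = frame[y:y + grid, x:x + grid]
            ref = target[y:y + grid, x:x + grid]
            out[y:y + grid, x:x + grid] = match_histograms(src, ref, channel_axis=-1)
    return out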
`
Figures 1, 11-d, and 11-g. Designing a speed map from
`scratch is difficult, but when synthesizing the first version of
`an animation, our system outputs the resulting speed map.
`This is a gray-scale image that can be easily loaded into any
`paint software. By editing the map, the user can make the
`animation run faster or slower.
`
`Dynamic Boundaries of Fluid The synthesis method de-
`scribed above is useful for rendering fluid features with
`a static overall shape, e.g., a waterfall seen from a long
`distance. However, when we apply this method to fire or
`smoke, the synthesized animation looks unnatural. We de-
`velop a simple ad-hoc method to achieve dynamic bound-
`aries of fluid by advecting the alpha matte according to mo-
`tion fields [BNTS07]. We can define an image-warping func-
`tion Wi(I) that deforms an image I according to the motion
`field Mi of the synthesized animation. Given an alpha matte
`α, we repeatedly synthesize the time-varying alpha matte Bi:
`Bi+1 = τWi(Bi) + (1−τ)α, where B1 is equal to α and τ con-
`trols the dynamics of Bi. We also apply an area-preserving
`adjustment each time Bi is computed to ensure that the area
`of Bi is always the same. Synthesis of dynamic boundaries
`requires that the user perform an additional task: manual de-
`sign of the background image. The background is hidden
`by fluid in the original image, but it appears when the fluid
`moves. In our experiment, we used Adobe Photoshop’s clone
`stamp tool and found that it took less than 10 min to design
`each background.
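A minimal sketch of this advection, assuming float32 motion fields and backward warping with cv2.remap, is given below; the value of τ, the clipping, and the simple rescaling used for area preservation are assumed details.

import cv2
import numpy as np

def advect_matte(alpha, motion_fields, tau=0.9):
    """B_{i+1} = tau * W_i(B_i) + (1 - tau) * alpha, with an area-preserving rescaling.
    alpha: float matte in [0, 1]; motion_fields: one (H, W, 2) float32 field per frame."""
    h, w = alpha.shape
    gx, gy = np.meshgrid(np.arange(w, dtype=np.float32), np.arange(h, dtype=np.float32))
    area0, B = float(alpha.sum()), alpha.astype(np.float32)
    mattes = [B.copy()]
    for M in motion_fields:
        warped = cv2.remap(B, gx - M[..., 0], gy - M[..., 1], cv2.INTER_LINEAR)
        B = tau * warped + (1.0 - tau) * alpha
        B *= area0 / max(float(B.sum()), 1e-6)               # keep the matte area constant
        np.clip(B, 0.0, 1.0, out=B)
        mattes.append(B.copy())
    return mattes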
`
`Figure 10: Each frame of a fire animation has dynamic
`boundaries and a corresponding alpha matte.
`
`7. Results and Discussion
We employed independent databases for water, fire, and
`smoke scenes. This involved gathering 151, 96, and 89
`video examples for the water, fire, and smoke databases, re-
`spectively. From these, we obtained 246,477, 226,986 and
`195,318 video example pieces. We synthesized fluid anima-
`tions for target images, including photographs and paint-
ings (Figs. 1 and 11). Figure 1 shows the application
`of our technique to a tour into the picture in the supple-
`mentary video [HAA97]. We designed an alpha matte for
`each target image using a scribble-based image segmenta-
`tion tool [LSTS04]; this process takes less than 5 min. We
`specified the orientation map for all target images; design-
`ing the orientation map requires a sparse set of user-drawn
`strokes, which is a simple task and takes less than 1 min.
`We specified the speed map only for the examples shown in
`
`
Figure 11: Thumbnails of synthesized animations: photographs (a, b, c, e, h) and paintings (d, f, g).
`
`Assignment of Video Example Pieces The supplementary
`video demonstrates that our assignment algorithm works
`well, using Figures 11-e and 11-f as target images. Render-
`ings from the assigned video example pieces shown in the
`supplementary video already resemble the target image. In
`Figure 11-e, the top region has strong flames, with smaller
`flames appearing around the tree trunks. For each region,
`the assignment process copies and pastes appropriate ani-
`mations from various video examples. Figure 11-f has re-
`gions with waterfalls, whitewater, and calm water surfaces;
`each of these areas is based on appropriate video examples.
`Waterfalls or regions with flames tend to have successful
`assignment even without any user input because they have
`strong image edges that correlate to the flow direction. For
`example, waterfalls and flames always move from top to bot-
`tom and from bottom to top, respectively. High-frequency
`regions tend to have slower, more detailed motion. However,
`the flow direction of the water surface in Figures 11-a and
`11-h is ambiguous. It is also difficult to determine the flow
`direction of smoke, because image edges do not correlate
`to an underlying flow direction and speed. The correlation
`between appearance and motion is most noticeably absent in
`paintings. In these cases, orientation and speed maps become
`especially important.
`
`Timing In our current implementation, which is not opti-
`mized, we spend an average of approximately 1 hour cre-
`ating the first version of the synthesized animation. We use
a workstation with an Intel(R) Xeon(R) 3.33 GHz CPU for
`video synthesis and a remote file server for the database.
`We spend 30 min on the assignment process and 30 min on
video synthesis.