
Lazy Snapping

Yin Li∗†    Jian Sun†    Chi-Keung Tang‡    Heung-Yeung Shum†

†Microsoft Research Asia    ‡Hong Kong University of Science and Technology
Figure 1: Lazy Snapping is an interactive image cutout system consisting of two steps: a quick object marking step and a simple boundary editing step. (a) Input image; (b) object marking; (c) boundary editing; (d) output composition. In (b), only two (yellow) lines are drawn to indicate the foreground and one (blue) line to indicate the background; all of these lines are far away from the true object boundary. In (c), an accurate boundary is obtained by simply clicking and dragging a few polygon vertices in the zoomed-in view. In (d), the cutout is composited onto another Van Gogh painting.
`
Abstract

In this paper, we present Lazy Snapping, an interactive image cutout tool. Lazy Snapping separates coarse- and fine-scale processing, making object specification and detailed adjustment easy. Moreover, Lazy Snapping provides instant visual feedback, snapping the cutout contour to the true object boundary efficiently despite the presence of ambiguous or low-contrast edges. Instant feedback is made possible by a novel image segmentation algorithm that combines graph cut with a pre-computed over-segmentation. A set of intuitive user interface (UI) tools is designed and implemented to give users flexible control and editing. Usability studies indicate that Lazy Snapping provides a better user experience and produces better segmentation results than the state-of-the-art interactive image cutout tool, Magnetic Lasso in Adobe Photoshop.

Keywords: User Interface, Image Cutout, Interactive Image Segmentation, Graph Cut
`
1 Introduction

“Image cutout” is the technique of removing an object in a picture or photograph from its background. The cutout result is typically composited onto a different background to create a new scene. Image cutout has been around for many years and is popular in film, television, publication, and photography. It is simple enough to explain that even young children make cutouts from magazines or picture books.

∗This research was done when Yin Li was with Microsoft Research Asia as an intern.
`
With the advent of digital imaging, it has become possible to specify the foreground and background at the individual pixel level, providing more accurate results than any scissors could, but this is no less tedious.

The task in image cutout is specifying which parts of the image are “foreground” (the part you want to cut out) and which belong to the background. While a human finds it quite easy to specify foreground and background to another human by saying something like “cut out the tree from the field of flowers”, the computer is still a long way from the sort of cognitive image understanding required to do this work unassisted. The user is forced to specify each region of foreground individually, with pixel accuracy. The tediousness of this pixel-accurate work, done in support of what is a cognitively simple task, makes image cutout particularly frustrating for users.

The challenge, therefore, is to come up with a way to specify the foreground that is less tedious than marking every pixel individually, without sacrificing pixel-accurate quality.

Related Work

For general image cutout, there are two main kinds of methods that improve on standard pixel-level selection tools: boundary-based and region-based. Each takes features of the image that the computer can detect (such as color consistency) and uses them to help automate or guide the foreground specification process.

Boundary-based methods cut out the foreground by letting the user surround its boundary with an evolving curve. The user traces along the object boundary and the system optimizes the curve in a piecewise manner. Examples include intelligent scissors [Mortensen and Barrett 1995; Mortensen and Barrett 1999], image snapping [Gleicher 1995] and JetStream [Perez and Blake 2001].

While easier than selecting pixels manually with a traditional selection tool, these techniques still demand a large amount of attention from the user. There is never a perfect match between the features used by the algorithms and the foreground image, so the user must control the curve carefully. If a mistake is made,

2 Object Marking

In the object marking step, the major task is to let the user conceptually group the foreground object against its background. Instead of tracing the object boundary, our system lets users specify the extent of the object of interest with lines and curves.

2.1 UI Design

To specify an object, the user marks a few lines on the image by dragging the mouse cursor while holding a button (the left button indicating the foreground and the right button the background). A yellow or a blue line is displayed for a foreground or a background marker respectively. This high-level, painting-style UI does not require very precise user input; as shown in Figure 1(b), most marking lines are in fact far from the object boundary. Similar marking UIs for separating an object from its background are also presented in [Falcao et al. 2000; Boykov and Jolly 2001; Fails and Olsen 2003] for image segmentation or for gesture tracking in camera-based interaction.

The segmentation process is triggered once the user releases the mouse button after each marking line is drawn. The user then inspects the segmentation result on screen and decides whether more lines need to be marked. It is therefore critical that our system generate the cutout boundary with very little delay. Our system adopts a novel interactive graph cut algorithm that optimizes the object boundary by maximizing both the color similarity inside the object and the gradient along the boundary.
`
2.2 Graph Cut Image Segmentation

An image cutout problem can be posed as a binary labelling problem. Suppose that the image is a graph G = ⟨V, E⟩, where V is the set of all nodes and E is the set of all arcs connecting adjacent nodes. Usually, the nodes are pixels on the image and the arcs are adjacency relationships, with four or eight connections between neighboring pixels. The labelling problem is to assign a unique label xi for each node i ∈ V, i.e. xi ∈ {foreground (= 1), background (= 0)}. The solution X = {xi} can be obtained by minimizing a Gibbs energy E(X) [Geman and Geman 1984]:

    E(X) = Σ_{i ∈ V} E1(xi) + λ Σ_{(i,j) ∈ E} E2(xi, xj)        (1)

where E1(xi) is the likelihood energy, encoding the cost when the label of node i is xi, and E2(xi, xj) is the prior energy, denoting the cost when the labels of adjacent nodes i and j are xi and xj respectively.
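As a concrete reading of Equation (1), the following minimal sketch (with assumed function names and array layout, not taken from the paper) evaluates the Gibbs energy of a candidate labelling; minimizing this quantity over all labellings is what the graph cut computes.

```python
import numpy as np

def gibbs_energy(labels, e1, edges, e2, lam=1.0):
    """Evaluate E(X) from Equation (1) for a candidate labelling.

    labels : (N,) array of 0/1 labels, one per node
    e1     : (N, 2) array, e1[i, x] = likelihood cost of giving node i label x
    edges  : iterable of (i, j) index pairs for adjacent nodes
    e2     : function (i, j, xi, xj) -> prior cost for that edge
    lam    : the weight lambda balancing the two terms
    """
    likelihood = sum(e1[i, labels[i]] for i in range(len(labels)))
    prior = sum(e2(i, j, labels[i], labels[j]) for (i, j) in edges)
    return likelihood + lam * prior
```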
In this paper, we concentrate on how to define the energy terms E1 and E2 according to the user input. We refer readers to [Boykov and Jolly 2001] for a detailed formulation of energy minimization as a graph cut problem and for how to solve it. The graph cut algorithm has also been used in the computer graphics community, for example in Graph Cut Textures [Kwatra et al. 2003], GrabCut [Rother et al. 2004] and Photomontage [Agarwala et al. 2004].

Once the user marks the image, the two sets of pixels intersecting the foreground and background markers are defined as foreground seeds F and background seeds B respectively, as shown in Figure 2.

Likelihood energy. In Equation (1), E1 encodes the color similarity of a node, indicating whether it belongs to the foreground or the background. To compute E1, the colors in the seeds F and B are first clustered by the K-means method [Duda et al. 2000]. The mean colors of the foreground and background clusters are denoted {K^F_n} and {K^B_m} respectively. The K-means method is initialized with 64 clusters in our experiments. Then, for each node i, we compute the minimum distance d^F_i from its color C(i) to the foreground clusters, and similarly the minimum distance d^B_i to the background clusters.
the user has to “back up” the curve and try again. The user is also required to enclose the entire boundary, which can take some time for a complex, high-resolution object. The close control required interferes with the user’s ability to get an overview of their progress, and it is difficult to zoom in and out of the image while dragging the pixel-accurate boundary line. Finally, once the boundary is specified, the tool is no longer helpful: any errors must be cleaned up at the end using traditional selection tools (e.g., the Lasso tool with Boolean operations in Photoshop).

Recently, researchers have improved image cutout with region-based methods, e.g., the magic wand in Photoshop, intelligent paint [Reese and Barrett 2002; Barrett and Cheney 2002], marker drawing [Falcao et al. 2000], sketch-based interaction [Tan and Ahuja 2001], interactive graph cut image segmentation [Boykov and Jolly 2001], GrabCut [Rother et al. 2004] and interactive digital photomontage [Agarwala et al. 2004]. Region-based methods work by allowing the user to give loose hints as to which parts of the image are foreground or background, without enclosing regions or being pixel accurate. These hints usually take the form of clicking or dragging on foreground or background elements, and are thus quick and easy to give. An underlying optimization algorithm then extracts the actual object boundary from the user's hints.

Region-based methods allow the user to operate at whatever scale they want, and they show partial results: after each hint, the foreground/background specification becomes more and more accurate. The problem with region-based techniques is that there are often cases where the features used by the region detection algorithms do not match the desired foreground or background. Areas in shadow, low-contrast edges, and other ambiguous areas can be extremely tedious to hint; sometimes they cannot be hinted at all and must be specified explicitly by hand.

Clearly, there is still a need for a user interface that combines the quick hinting of region-based approaches with a simple affordance for pixel-accurate boundary editing.
Our approach

We propose Lazy Snapping, a novel coarse-to-fine UI design for image cutout. As shown in Figure 1, Lazy Snapping consists of two steps: a quick object marking step (b) and a simple boundary editing step (c). The first step, object marking, works at a coarse scale and specifies the object of interest with a few marking lines (Section 2). The second step, boundary editing, works at a finer scale or on a zoomed-in view and lets the user edit the object boundary by simply clicking and dragging polygon vertices (Section 3). Our system thus inherits the advantages of both region-based and boundary-based methods: the first step is intuitive and quick for specifying the object context, while the second step is easy and efficient for accurate boundary control.

Inspired by [Boykov and Jolly 2001], we formulate image cutout as a graph cut problem in both steps. At the object marking step, we propose an efficient graph cut algorithm that employs a pre-computed over-segmentation, so that the marking UI can provide instant visual feedback. At the boundary editing step, we introduce a simple polygon editing UI and use the polygon locations as soft constraints to improve the snapping results around ambiguous or low-contrast edges.

We have conducted usability studies (Section 4) comparing Lazy Snapping with the state-of-the-art interactive cutout tool, Magnetic Lasso in Photoshop, which has perhaps the best implementation of intelligent scissors. The studies show that Lazy Snapping outperforms it in ease of use, efficiency, and quality of results. We have also experimented with our system on many natural images.
`
Figure 3: Our new graph cut algorithm works on a graph whose nodes are small regions from watershed segmentation. (a) A small region produced by the pre-segmentation; (b) the nodes and edges of the graph cut algorithm with pre-segmentation; (c) the boundary output by the graph cut segmentation.
`
2.3 Graph Cut with Pre-segmentation

To improve efficiency, we introduce a novel graph cut formulation that is built on a pre-computed image over-segmentation instead of on image pixels. We choose the watershed algorithm [Vincent and Soille 1991], which locates boundaries well and preserves small differences inside each small region.

We again formulate object cutout as a graph cut problem, where the nodes are now the segmented regions from the watershed segmentation. As shown in Figure 3, we use the same notation G = ⟨V, E⟩ for the new graph, while the nodes V are the set of all small regions from the pre-segmentation and the edges E are the set of all arcs connecting adjacent regions.

The foreground seeds F, the background seeds B, and the uncertain region U are defined as in Section 2.2, except that these nodes are now small regions instead of pixels. The likelihood energy E1 is also similar to Equation (2), while the color C(i) is computed as the mean color of the small region i.

For the prior energy E2 in Equation (3), we compared two definitions of Cij: 1) Cij is the mean color difference between the two regions i and j; 2) the same mean color difference, but weighted by the length of the shared boundary between regions i and j. In our experiments, the two definitions produced similar results.

Since the watershed segmentation provides a good superset of the object boundaries, this approximation produces reasonable results and improves the speed significantly. As shown in Table 1, the number of nodes and edges for the graph cut algorithm is reduced by more than a factor of 10 compared to the pixel-based method in our experiments with real-life images. Most importantly, our new algorithm feeds back the cutout result almost instantly.
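As a minimal illustration of this region-level formulation (not the paper's implementation), the sketch below builds the region adjacency structure from a pre-computed label map using only NumPy; the function name region_graph and its exact outputs are assumptions. The shared-boundary lengths it returns can serve for the boundary-length-weighted variant of Cij mentioned above.

```python
import numpy as np

def region_graph(labels, image):
    """Build the region-level graph of Section 2.3 from a pre-segmentation.

    labels : (H, W) int array, one region id per pixel (e.g. from a watershed
             implementation); ids are assumed contiguous from 0
    image  : (H, W, 3) float array of colors
    Returns the mean color per region and a dict of adjacent region pairs,
    each mapped to the length of the shared boundary (number of neighboring
    pixel pairs across the two regions).
    """
    n = labels.max() + 1
    counts = np.bincount(labels.ravel(), minlength=n).astype(float)
    mean_color = np.stack([
        np.bincount(labels.ravel(), weights=image[..., c].ravel(), minlength=n) / counts
        for c in range(3)
    ], axis=1)

    boundary_len = {}
    # Scan horizontal and vertical 4-neighbor pairs that straddle two regions.
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        diff = a != b
        for i, j in zip(a[diff], b[diff]):
            key = (min(i, j), max(i, j))
            boundary_len[key] = boundary_len.get(key, 0) + 1
    return mean_color, boundary_len
```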
`
3 Boundary Editing

Although the object marking step preserves the object boundary as accurately as possible, some errors remain, especially around ambiguous or low-contrast edges. Therefore, we design a simple polygon editing UI that lets the user refine the object boundary.

3.1 UI Design

The object boundary produced by the previous step is first converted into editable polygons. The polygon is constructed iteratively: the initial polygon has only one vertex, the point of highest curvature on the boundary. At each step, we compute the distance of each point on the boundary to the polygon from the previous step, and the farthest point is inserted to generate a new polygon. The iteration stops when the largest distance is less than a pre-defined threshold (typically 3.2 pixels).
`
Figure 2: Graph cut formulation for object marking. (a) Foreground seeds F; (b) background seeds B; (c) uncertain regions U; (d) foreground marker; (e) background marker; (f) graph cut result. The graph cut algorithm is defined on F, B, and U. All of these nodes participate in the optimization and are assigned a unique label, either foreground or background.
`
These distances are d^F_i = min_n ‖C(i) − K^F_n‖ and d^B_i = min_m ‖C(i) − K^B_m‖. Therefore, E1(xi) is defined as follows:

    E1(xi = 1) = 0,                           E1(xi = 0) = ∞                            ∀ i ∈ F
    E1(xi = 1) = ∞,                           E1(xi = 0) = 0                            ∀ i ∈ B
    E1(xi = 1) = d^F_i / (d^F_i + d^B_i),     E1(xi = 0) = d^B_i / (d^F_i + d^B_i)      ∀ i ∈ U        (2)
Here, U = V \ {F ∪ B} is the uncertain region (Figure 2). The first two equations guarantee that the nodes in F or B always receive labels consistent with the user input. The third equation encourages a node to take the label of the seeds whose colors are more similar to its own.
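A minimal sketch of this likelihood term follows, assuming scikit-learn's KMeans as a stand-in for the K-means method cited in the paper and an e1[i, label] array layout; a large finite constant stands in for the infinite seed costs.

```python
import numpy as np
from sklearn.cluster import KMeans  # stand-in for the K-means method of [Duda et al. 2000]

def likelihood_energy(colors, fg_seed_idx, bg_seed_idx, n_clusters=64):
    """Build E1 per Equation (2). colors: (N, 3) float array of node colors.
    Assumes each seed set contains at least n_clusters pixels."""
    kf = KMeans(n_clusters=n_clusters, n_init=1).fit(colors[fg_seed_idx]).cluster_centers_
    kb = KMeans(n_clusters=n_clusters, n_init=1).fit(colors[bg_seed_idx]).cluster_centers_

    # Minimum distance from each node color to the foreground / background cluster means.
    d_f = np.min(np.linalg.norm(colors[:, None, :] - kf[None, :, :], axis=2), axis=1)
    d_b = np.min(np.linalg.norm(colors[:, None, :] - kb[None, :, :], axis=2), axis=1)

    e1 = np.empty((len(colors), 2))                 # e1[i, label]
    e1[:, 1] = d_f / (d_f + d_b + 1e-12)            # cost of labelling i as foreground
    e1[:, 0] = d_b / (d_f + d_b + 1e-12)            # cost of labelling i as background

    big = 1e9                                       # stands in for the infinite costs of Equation (2)
    e1[fg_seed_idx] = [big, 0.0]                    # seeds in F must be foreground
    e1[bg_seed_idx] = [0.0, big]                    # seeds in B must be background
    return e1
```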
Prior energy. We use E2 to represent the energy due to the gradient along the object boundary, defining E2 as a function of the color gradient between two nodes i and j:

    E2(xi, xj) = |xi − xj| · g(Cij)        (3)

where g(ξ) = 1/(ξ + 1), and Cij = ‖C(i) − C(j)‖₂ is the L2-norm of the RGB color difference of the two pixels i and j. Note that the factor |xi − xj| captures the gradient information only along the segmentation boundary; in other words, E2 is a penalty term only when adjacent nodes are assigned different labels. The more similar the colors of the two nodes are, the larger E2 is, and thus the less likely the edge is to lie on the object boundary.

To minimize the energy E(X) in Equation (1), we use the max-flow algorithm of [Boykov and Kolmogorov 2001], which is specially designed for vision problems of this kind. Unfortunately, as shown in the last column of Table 1, at the pixel level it fails to provide interactive visual feedback for real-life image cutouts.
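For illustration only, the same energy can be handed to any generic s-t minimum-cut solver; the sketch below uses networkx's minimum_cut rather than the specialized max-flow code of [Boykov and Kolmogorov 2001], and the function and argument names are assumptions.

```python
import networkx as nx

def solve_labelling(e1, edges, n_link, lam=1.0):
    """Minimize Equation (1) exactly with an s-t minimum cut (illustrative solver).

    e1     : (N, 2) array, e1[i, x] = E1(xi = x)
    edges  : iterable of (i, j) node index pairs
    n_link : dict mapping (i, j) -> prior weight g(Cij) from Equation (3)
    lam    : the weight lambda of Equation (1)
    """
    g = nx.DiGraph()
    s, t = "src", "sink"
    for i in range(len(e1)):
        g.add_edge(s, i, capacity=float(e1[i, 0]))  # paid when xi = 0 (background side)
        g.add_edge(i, t, capacity=float(e1[i, 1]))  # paid when xi = 1 (foreground side)
    for (i, j) in edges:
        w = lam * n_link[(i, j)]                    # lambda * E2, paid only when labels differ
        g.add_edge(i, j, capacity=w)
        g.add_edge(j, i, capacity=w)

    _, (source_side, _) = nx.minimum_cut(g, s, t, capacity="capacity")
    return [1 if i in source_side else 0 for i in range(len(e1))]
```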
`
`
Image     Dimension      Nodes ratio   Edges ratio   Lag with pre-segmentation   Lag without pre-segmentation
Boy       (408, 600)     10.7          16.8          0.12 s                      0.57 s
Ballet    (440, 800)     11.4          18.3          0.21 s                      1.39 s
Twins     (1024, 768)    20.7          32.5          0.25 s                      1.82 s
Girl      (768, 1147)    23.8          37.6          0.22 s                      2.49 s
Grandpa   (1147, 768)    19.3          30.5          0.22 s                      3.56 s

• The nodes (edges) ratio is the number of pixels (connections between pixels) divided by the number of nodes (edges) after the pre-segmentation.
• The feedback lag is the delay from when the user releases the mouse to when the object boundary is displayed.
• All lags are timed on a laptop PC with a 1.5 GHz Centrino CPU and 512 MB of memory.

Table 1: Performance comparison of the graph cut segmentation algorithm with and without pre-segmentation on the images shown in Figure 9.
`

The likelihood energy E1 is defined exactly as in Equation (2) of the object marking step, but the prior energy E2 is defined differently. In addition to the gradient term, E2 uses the polygon location as a soft constraint in order to deal with ambiguous and low-contrast boundaries:

    E2(xi, xj) = |xi − xj| · g( (1 − β) · Cij + β · η · g(D²ij) )        (4)

where g(·) is the same as in Equation (3), Dij is the distance from the center of arc (i, j) to the polygon, and η is a scale that unifies the units of the two terms (a typical value is 10).

In Equation (4), β ∈ [0, 1] controls the influence of Dij. A typical value of β is 0.5, which works well in most of our experiments, although we allow expert users to adjust this parameter for better performance. Note that β = 1 makes the graph cut segmentation output a result that snaps onto the polygon regardless of the image gradient.
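The sketch below spells out one reading of Equation (4) for a single arc; because the equation is reconstructed from a garbled source, the exact grouping of terms should be checked against the original paper, and the names are illustrative.

```python
def g(x):
    """Decreasing weighting function of Equation (3): g(xi) = 1 / (xi + 1)."""
    return 1.0 / (x + 1.0)

def soft_constraint_weight(c_ij, d_ij, beta=0.5, eta=10.0):
    """Prior weight of Equation (4) for one arc (i, j) in the editing band.

    c_ij : color difference ||C(i) - C(j)|| across the arc (as in Equation (3))
    d_ij : distance from the center of the arc to the current polygon
    The |xi - xj| factor is applied by the graph construction: the weight is
    paid only when the two pixels receive different labels.
    """
    return g((1.0 - beta) * c_ij + beta * eta * g(d_ij ** 2))
```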
When the color gradient Cij is small, g(D²ij) dominates E2, which encourages the result to snap close to the polygon location. This is shown in Figure 5, where low-contrast edges are very difficult to snap without the polygon soft constraint. As shown in Figure 5(a), it is also a difficult example for region-based methods (e.g., in the object marking step).

If there are two edges with comparable strength, the polygon location can also help the user select the desired one, as shown in Figure 6(b). Without the soft constraint, the segmentation result may not be fully controlled by the polygon, as shown in Figure 6(c).

Hard vertex constraint: The user may prefer to specify a polygon vertex manually as a “hard” constraint, so that the system ensures that the graph cut segmentation result passes through this vertex. For such a hard-constrained vertex, the uncertain region U is automatically split into two parts along the bisector at the vertex. The two “split” lines are added to the foreground seeds F and the background seeds B respectively, so that the graph cut segmentation must output a result passing through this vertex, because the vertex is the only remaining connection between foreground and background at this place.
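A simplified sketch of this hard vertex constraint follows, assuming the bisector direction at the vertex is already known; it only generates the two runs of seed pixels on either side of the vertex (the inner run joins F, the outer run joins B), which is a loose reading of the paper's band splitting.

```python
import numpy as np

def hard_vertex_seeds(vertex, inward_dir, band_half_width=7):
    """Seed pixels enforcing a hard vertex constraint (Section 3.2).

    vertex     : (x, y) position of the hard polygon vertex
    inward_dir : unit vector along the bisector pointing to the object interior
    Returns two lists of integer pixel positions: the inner half of the
    bisector (to add to F) and the outer half (to add to B), so that the cut
    can only pass through the vertex itself.
    """
    v = np.asarray(vertex, dtype=float)
    u = np.asarray(inward_dir, dtype=float)
    steps = np.arange(1, band_half_width + 1)
    inner = np.rint(v + steps[:, None] * u).astype(int)      # goes to foreground seeds F
    outer = np.rint(v - steps[:, None] * u).astype(int)      # goes to background seeds B
    return inner.tolist(), outer.tolist()
```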
`
`
4 Usability Study

We believe that Lazy Snapping is superior to existing cutout methods in being easier to learn and in producing results of equal or better quality in less time. To test this, we conducted a usability study that compared the performance of our Lazy Snapping prototype system to Magnetic Lasso, Adobe Photoshop’s image cutout tool.

Methodology

Fourteen subjects were selected. Ten were novices with little or no experience with Photoshop or its image cutout tools, while four were Photoshop experts. Each subject was given a five-minute
`
Figure 4: Graph cut formulation for boundary editing. (a) Foreground seeds F; (b) background seeds B; (c) uncertain regions U; (d) pixels ignored by the graph cut; (e) polygon vertices and lines. Only pixels in F, B, or U are considered in the optimization. The polygon location is encoded as an energy term that guides the optimization to snap to the user's inputs.
`
Two UI tools are provided for polygon editing:

Direct vertex editing allows the user to drag a vertex to adjust the shape of the polygon. Vertices can also be added or deleted, and multiple vertices can be grouped and manipulated together.

Overriding brush enables the user to draw a single stroke to replace a segment of the polygon. This is more efficient than dragging many vertices individually.

The overriding brush is inspired by the Paintbrush tool in Adobe Illustrator. The user brushes a stroke starting and stopping at two points A and B on the original polygon, so that the original polygon is split into two parts; the part whose angle differs less from the user stroke is replaced by the stroke to generate a new polygon. The angles of the user stroke and of the two polygon parts are measured by the tangent direction at point A and by the direction from A to B.
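One simplified reading of this rule is sketched below: of the two candidate polygon parts between A and B, pick the one whose initial tangent direction at A is closer to the stroke's initial direction. The function names and the assumption idx_a < idx_b are illustrative, not from the paper.

```python
import numpy as np

def angle_between(u, v):
    """Unsigned angle between two 2-D direction vectors."""
    u = u / (np.linalg.norm(u) + 1e-12)
    v = v / (np.linalg.norm(v) + 1e-12)
    return float(np.arccos(np.clip(np.dot(u, v), -1.0, 1.0)))

def part_to_replace(polygon, stroke, idx_a, idx_b):
    """Decide which of the two polygon parts between vertices idx_a and idx_b
    the overriding brush should replace (assumes idx_a < idx_b and len(stroke) >= 2).

    polygon : (N, 2) array of polygon vertices in order
    stroke  : (M, 2) array of brushed points, starting near vertex idx_a
    Returns the index array of the part whose initial tangent at A is closer
    to the stroke's direction; that part is then swapped for the stroke.
    """
    n = len(polygon)
    part_fwd = np.arange(idx_a, idx_b + 1)                                        # A -> B, forward
    part_bwd = np.r_[np.arange(idx_a, -1, -1), np.arange(n - 1, idx_b - 1, -1)]   # A -> B, backward
    stroke_dir = stroke[1] - stroke[0]

    def start_dir(part):
        return polygon[part[1]] - polygon[part[0]]

    if angle_between(start_dir(part_fwd), stroke_dir) <= angle_between(start_dir(part_bwd), stroke_dir):
        return part_fwd
    return part_bwd
```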
Once the user releases the mouse button after each polygon editing operation, the system optimizes the object boundary with the graph cut segmentation algorithm again. The optimized boundary automatically snaps to the object boundary even though the polygon vertices may not lie on it. Compared with a simple polygon boundary, for which the user would need to adjust a large number of vertices, our UI describes the object shape with far fewer polygon vertices.

3.2 Boundary Editing using Graph Cut

Again, we formulate boundary editing as a pixel-level graph cut problem in a small band around the polygon. The band is 7 pixels wide by default. Figure 4 shows the foreground seeds F, the background seeds B and the uncertain region U. Given the editable polygon, U is a band computed by dilating the polygon, whereas F and B are defined as the inner and outer boundaries of U respectively.
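A minimal sketch of this band construction, assuming the current polygon is available as a filled binary mask and using SciPy's binary morphology; here F and B are simply taken as everything strictly inside and outside the band, rather than only its inner and outer boundary curves.

```python
from scipy.ndimage import binary_dilation, binary_erosion

def editing_band(polygon_mask, band_radius=3):
    """Build the seed regions of Section 3.2 around the current polygon.

    polygon_mask : (H, W) boolean array, True inside the filled polygon
    band_radius  : dilation/erosion radius on each side; the default gives a
                   band roughly 7 pixels wide, as in the paper
    Returns boolean masks (F, B, U): U is the band straddling the polygon
    outline, F the pixels safely inside, B the pixels safely outside.
    """
    grown = binary_dilation(polygon_mask, iterations=band_radius)
    shrunk = binary_erosion(polygon_mask, iterations=band_radius)
    u = grown & ~shrunk          # uncertain band around the polygon outline
    f = shrunk                   # foreground seeds
    b = ~grown                   # background seeds
    return f, b, u
```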
`
Figure 5: The polygon soft constraint can override edge locations in low-contrast regions. (a) Original image: the object marking step produces a bad boundary, and the polygon overriding brush (thick orange line) replaces a segment of the polygon. (b) With the polygon enabled as a soft constraint, the result (dotted line) is very close to the polygon (solid line). (c) Without the constraint, the optimization is vulnerable to noise because of the weak edges.

Figure 6: (a) The original image; (b) and (c) show the color gradient image of the region marked by the white square in (a). (b) With polygon soft constraints, users can select which of the strong edges to snap to. (c) Without polygon soft constraints, the same input polygon may produce erroneous edges because of the inherent edge ambiguity.
`

[Figure 8 plots: (a), (b) cutout time (object marking plus boundary editing) normalized to Photoshop = 100%, per subject and per image; (c), (d) number of error pixels versus time for Photoshop and Lazy Snapping.]

Figure 8: (a) and (b) show the average time of the cutout process across the fourteen subjects and the eight images respectively. The time is normalized by the Photoshop time for each column so that all data can be compared together. The number of error pixels versus time is shown in (c) for the first task and in (d) for the second task; time and quality are normalized by the mean of all samples of each image over all subjects, so that data from different images can be aligned around 100% for comparison. Lazy Snapping clusters in the lower-left corner, indicating better quality in less time.
`
overall took less than 60% of the time they needed with Photoshop (Figure 8(a)(b)). The exact benefit varied widely depending on the subject and the image (the standard deviation is 30%). We also compared the quality of the Photoshop results with that of the Lazy Snapping results. For the second task, Lazy Snapping produced fewer than 60% of the error pixels that Photoshop did (averaged over all 14 subjects and 4 images). In this time-restricted task, Lazy Snapping was a clear winner. Most subjects were able to complete the entire task within the 60 seconds allotted (86% less time than with Photoshop), and for those who were not, Lazy Snapping produced satisfactory intermediate results (fewer than 53% of Photoshop’s error pixels). Photoshop’s magnetic lasso tool does not produce intermediate results, so users who ran out of time were left with large errors (see Figure 8(d)).

Subject Feedback

Overall, subjects preferred Lazy Snapping to the tools in Photoshop. They reported it to be “much easier” and “almost magic”. One expert user expressed the concern that the ease of working with Lazy Snapping might encourage him to be lazy himself and work less accurately than with the more tedious traditional tools. Other users made suggestions for combining the Lazy Snapping tools with existing tools such as the lasso and the magic wand. Several users expressed some dissatisfaction with the two-step workflow and wondered whether we could make it easier to go back and forth between the steps. We are considering these and other suggestions for improving the user experience.
`
instruction session on the Lazy Snapping software. The novices were also given five minutes of instruction on Photoshop’s magnetic lasso tool. All users were allowed to experiment with both software packages until they were comfortable that they understood their functions.

The study consisted of two tasks. In the first task, the subjects had to cut out four images (A, B, C and D in Figure 7) as accurately as possible. The subjects were asked to work as quickly as they could without sacrificing accuracy. For each image, the subject had access to a printed version of the desired cutout. After they completed the task using one software package, they repeated the task with the same four images using the other. The order was alternated between subjects in case there was an ordering effect, with half of the subjects using Photoshop first and the other half using Lazy Snapping. For the second task, the subjects were given another four images (E, F, G and H in Figure 7) to cut out, but this time they were given only 60 seconds per image. They were instructed to get the cutout as accurate as possible in the allotted time. Again, the order was alternated between subjects.

When using Photoshop, the users were advised to use the magnetic lasso as the main tool, but they were also allowed to use other Photoshop tools, such as the free lasso and work path editing.

Subjects were videotaped and their cutout results were saved for detailed logging and quality analysis.

We evaluate the quality of a cutout by measuring the number of error pixels in the image. To avoid bias, we compute the quality by averaging the number of error pixels with respect to four ‘ground truth’ cutouts, which were produced by two experts (not study subjects) using Lazy Snapping and Photoshop respectively. We also exclude pixels in hairy and furry regions, to avoid the influence of subjective judgement.
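A small sketch of this error measure under stated assumptions: cutouts are binary masks, the four expert cutouts are given, and an optional mask marks the hairy or furry regions to exclude; the function name is illustrative.

```python
import numpy as np

def error_pixels(result_mask, truth_masks, exclude_mask=None):
    """Count error pixels for one cutout, in the spirit of Section 4.

    result_mask  : (H, W) boolean cutout produced by a subject
    truth_masks  : list of (H, W) boolean 'ground truth' cutouts
                   (the study averages over four expert cutouts)
    exclude_mask : optional (H, W) boolean mask of hairy / furry regions
                   that are left out of the comparison
    Returns the number of error pixels averaged over the ground truths.
    """
    errors = []
    for truth in truth_masks:
        diff = result_mask != truth
        if exclude_mask is not None:
            diff = diff & ~exclude_mask
        errors.append(int(diff.sum()))
    return float(np.mean(errors))
```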
Ease of Use

We tested ease of use by counting the number of errors made by the novice users. Using the video tapes, we counted the number of times the users chose an incorrect tool or had to invoke the UNDO command; for example, a user may intend to draw a foreground marker in Lazy Snapping but use the background brush instead, or, when using the magnetic lasso, click the zoom button in the navigator tool window and get an unexpected result. We found the error rate of Lazy Snapping to be less than 20% of the rate of Photoshop on the same image. Users also subjectively reported Lazy Snapping to be far easier to use than Photoshop.

Better Quality in Less Time

We tested time improvements by measuring how long it took users to complete the first task. We found subjects using Lazy Snapping
`
Figure 7: Images A–H used in the usability study. The four images in the first row (A, B, C, and D) are for the first task; the other four (E, F, G, and H) are for the second task.
`
5 Experiments and Summary

Figure 9 shows more examples produced by Lazy Snapping. The number of object markers and the number of polygon editing operations are also listed for each image.

We use coherent matting [Shum et al. 2004], an extension of Bayesian
`

(a) Girl (4/2/12)    (b) Ballet (4/7/14)    (c) Boy (6/2/13)    (d) Grandpa (4/2/11)    (e) Twins (4/4/12)

Figure 9: More experiments. The numbers in brackets denote the number of foreground markers, the number of background markers, and the number of polygon vertex adjustments, respectively. Each pair of images shows the marking lines of the first step and the final result. The polygons of the boundary editing step are not shown here; please refer to the accompanying video for the polygon editing process.
`
matting [Chuang et al. 2001] with an alpha prior, to compute the opacity around the object boundary before compositing the cutout object onto a new background. The uncertain region for matting is computed by dilating the object boundary, usually by 4 pixels on each side in most of our experiments.

In this paper, we have developed an interactive image cutout system that is easy to learn and produces better-quality cutouts in less time than existing image cutout tools. Our system explicitly separates two tasks, object context specification and boundary refinement, and we have designed a user interface for each: a marking UI that quickly specifies the object, and a polygon editing UI that allows simple and fast boundary adjustment. Our usability study shows that our system is easy to learn and produces high-quality cutouts. The UI can be easily and naturally extended to pen-computing devices.

Our current system does not handle thin and branching structures well; we are working on this. We are also trying to combine the object marking and boundary editing steps seamlessly, without explicit switching between them. Moreover, we plan to extend our work to video segmentation.

Acknowledgements

We would like to thank the anonymous reviewers for their constructive critiques. Many thanks to Dave Vronay for his help with the usability study, and to Steve Lin for his professional help with video production and proofreading. Chi-Keung Tang’s research is supported in part by the Research Grants Council of the Hong Kong Special Administrative Region, China: HKUST6193/02E and AOE/E-01/99.
References

AGARWALA, A., DONTCHEVA, M., AGRAWALA, M., DRUCKER, S., COLBURN, A., CURLESS, B., SALESIN, D., AND COHEN, M. 2004. Interactive digital photomontage. In Proceedings of ACM SIGGRAPH 2004.

BARRETT, W. A., AND CHENEY, A. S. 2002. Object-based image editing. In Proceedings of ACM SIGGRAPH 2002.

BOYKOV, Y., AND JOLLY, M. P. 2001. Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In Proceedings of ICCV 2001.

BOYKOV, Y., AND KOLMOGOROV, V. 2001. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. In Energy Minimization Methods in Computer Vision and Pattern Recognition, 2001.

CHUANG, Y.-Y., CURLESS, B., SALESIN, D. H., AND SZELISKI, R. 2001. A Bayesian approach to digital matting. In Proceedings of CVPR 2001.
