Hyperlinked Video
Jonathan Dakss, Stefan Agamanolis, Edmond Chalom, and V. Michael Bove, Jr.

MIT Media Laboratory
20 Ames Street, Cambridge, Massachusetts 02139 USA

ABSTRACT

Hyperlinked video is video in which specific objects are made selectable by some form of user interface, and the user's interactions with these objects modify the presentation of the video. Identifying and tracking the objects remains one of the chief difficulties in authoring hyperlinked video; we solve this problem through the use of a video tracking and segmentation algorithm that uses color, texture, motion, and position parameters. An author uses a computer mouse to scribble roughly on each desired object in a frame of video, and the system generates a segmentation mask for that frame and following frames. We have applied this technique in the production of a soap opera program, with the result that the user can inquire about purchasing clothing and furnishings used in the show. We will discuss this and other uses of the technology, describe our experiences in using the segmentation algorithm for hyperlinked video purposes, and present several different user-interface methods appropriate for hyperlinked video.

Keywords: hyperlinked video, hypertext, video object segmentation, video object tracking, digital television
1. INTRODUCTION

Users of the World Wide Web are familiar with the concept of hyperlinks, in which "clicking" on specially tagged words or graphics in a document retrieves other documents, or perhaps modifies the current one. The idea of applying the same kind of interaction in video programs has often been discussed as a desirable possibility: consider for instance a fashion program in which clicking on an article of clothing provides information about it, or a nature documentary in which children click on plants and animals in the scene to learn more about them. Playback of such material is well within the capabilities of typical digital television decoders with graphical overlay capability, but creating it has posed a challenge because of the difficulty of identifying and tracking the selectable regions in every frame, by either manual or automatic methods.

We have developed a method for tracking and segmenting video objects that simplifies the process of creating hyperlinked video. The author of the video uses a computer mouse to scribble roughly on each desired object in a frame of video, and the system generates full segmentation masks for that frame and for following and preceding frames until there is a scene change or an entrance of new objects. These masks label every pixel in every frame of the video as belonging to one of the regions roughly sketched out by the author at the beginning of the process. The author may then associate each region with a particular action (e.g. graphical overlay, switching to a different video data stream, transmission of data on a back channel). During playback, the viewer can select objects with a mouse or an analogous device, such as an enhanced TV remote control with point-and-click capability. In our demonstrations, we use a video projector that can identify the location of a laser pointer aimed at its screen.

Further author information:
J.D.: E-mail: dakss@media.mit.edu
S.A.: E-mail: stefan@media.mit.edu
E.C.: E-mail: chalom@media.mit.edu
V.M.B. (correspondence): E-mail: vmb@media.mit.edu
Hulu Exhibit 1012, Page 0001
We apply a novel method of using color, texture, motion, and position features to segment and track video objects. Our system uses a combination of these attributes to develop multi-modal statistical models for each region as roughly defined by the author. The system then creates the segmentation masks by finding areas that are statistically similar and tracking them throughout a video scene. The authoring tool and the playback system are supported by Isis, a programming language specially tailored for object-based media.

We utilized this system to create HyperSoap, a hyperlinked video program that resembles television serial dramas (known as "soap operas") in which the viewer can select props, clothing and scenery to see purchasing information for the item, such as the item's price and retailer. We produced this program entirely from scratch, not starting with pre-made video material, in order to learn more about how the production (scripting, shooting, editing) of hyperlinked video would differ from that of traditional television programming. We also learned a great deal about how people interact with hyperlinked video, and based our design of several modes of user interaction on this information.

In the following section we briefly describe existing hyperlinked video software. In section 3, we discuss our authoring system and how it differs from other approaches. In sections 4 and 5, we describe HyperSoap in more detail and discuss several issues that we confronted in the process of designing it. Section 6 summarizes our work and presents other applications for this technology.
2. RELATED WORK

Several companies have announced products for authoring hyperlinked video. VisualSHOCK MOVIE from the New Business Development Group of Mitsubishi Electric America, Inc.1 concentrates on playback from the Web or a local source (e.g. CD-ROM, DVD-ROM). The authoring tool, called the Movie MapEditor, requires specifying a rectangular bounding box around the location of a desired object in the starting and ending frames of a video sequence, and a tracking algorithm estimates the position of the "hot button" in intermediate frames. Manual tracking by the operator is also supported. HotVideo from IBM's Internet Media Group2 has a similar approach, but supports a larger set of geometrical object shapes. Intermediate locations of an object may be specified in multiple keyframes or located by a tracking algorithm. Veon's V-Active3 is another current authoring tool that incorporates an object-tracking algorithm.

Although satisfactory in some situations, these methods for associating links with objects are too coarse for many varieties of video sequences. Particular difficulties include associating links with objects that overlap or are enclosed within larger objects, such as a window of a house and the house itself, and perhaps a large tree partially obscuring both. To enable the association of links with arbitrarily shaped regions that may be changing over time, a system that can differentiate among objects with greater precision is needed. Many segmentation systems exist that employ an automatic process to identify objects in video or imagery with block-level or pixel-level precision. For example, the VisualSEEk system4 automatically identifies regions based on color and texture distribution. This and other automatic systems are limited by the quality of the segmentation of objects and the relevance of the identifiable objects to the needs of the author. Precise and high-resolution segmentation masks can be an important factor in playback systems for hyperlinked video as well, since they could be used in graphical overlays to indicate the presence of hyperlinks or to highlight objects when they are selected by the viewer.
3. OUR APPROACH

We have developed a novel segmentation system that classifies every pixel in every frame of a video sequence as a member of an object or group of objects.5,6 The author of the video defines what the objects are by providing the algorithm with "training data" in the form of rough scribbles on each object in one frame of a sequence. The system creates a statistical model for each object based on color, texture, motion, and position features calculated from the training data. These models are then used to classify each of the remaining pixels in the frame as a member of the object with the highest statistical similarity. By tracking the training data forward and backward through the sequence and applying the same operations, several seconds' worth of video can be segmented. In the following sections, we describe this process in greater detail.
Figure 1. A frame from the HyperSoap video, and an example of "training data" scribbled by the author.
3.1. Object identification

In the first step of the segmentation process, the author selects a single representative frame from the sequence (typically one in which the important objects are fully visible) and highlights representative pixels inside each object using a simple drawing tool. This information serves as the "training data" for the segmentation algorithm (Figure 1).

The system then estimates the location of the training data pixels within each of the remaining frames in the sequence using a block-matching tracking scheme. When this stage is complete, there are pixels in each frame that have been classified as corresponding to the objects defined by the author.
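The paper identifies the tracker only as a block-matching scheme, so the following is a minimal sketch of one, assuming an exhaustive search that minimizes the sum of absolute differences (SAD); the block size and search range here are illustrative choices, not the authors' parameters:

```python
import numpy as np

def block_match(prev, curr, y, x, block=8, search=7):
    """Estimate where the block of `prev` centered at (y, x) moved in `curr`
    by exhaustive search over a +/- `search` pixel window, minimizing the
    sum of absolute differences (SAD). Returns the (dy, dx) displacement."""
    h, w = prev.shape
    half = block // 2
    ref = prev[y - half:y + half, x - half:x + half].astype(int)
    best_sad, best = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ny, nx = y + dy, x + dx
            # skip candidate blocks that fall outside the frame
            if ny - half < 0 or nx - half < 0 or ny + half > h or nx + half > w:
                continue
            cand = curr[ny - half:ny + half, nx - half:nx + half].astype(int)
            sad = np.abs(ref - cand).sum()
            if sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best
```

Propagating training pixels frame to frame this way accumulates error, which is consistent with the observation in section 4 that fresh training data was needed roughly every 30 frames.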
3.2. Feature calculation

In the next stage of the process, the system calculates a multi-dimensional feature vector for every pixel in the video sequence. Unlike many segmentation systems that use only one or two different features to help distinguish objects in a video sequence, our system allows the author to select any combination from a variety of different features that estimate the color, motion, and texture properties of each pixel. The row and column positions of a pixel in the frame may also be used as features. Combining the data of several features in this way yields a more robust segmentation whose quality is less affected by misleading data from a single feature.
Three primary color spaces, RGB, YIQ and LAB, may be used to indicate the color features of pixels, although other spaces are possible as well. Motion features may be calculated with two different constant brightness constraint (or "optical flow") algorithms, one proposed by Kanade and Lucas7 and the other by Horn and Schunck.8 The system incorporates two different texture classification models. One is a local-statistics measure which computes intensity mean and standard deviation across pixel blocks at several scales. A more complex technique, known as simultaneous autoregressive modeling (SAR),9 uses a linear prediction method in which a pixel's value is estimated from those of its neighbors. Here the weighting coefficients of the neighbors and the error term are the texture features.
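The simpler of the two texture measures, the local-statistics features, could be sketched as below: per-pixel intensity mean and standard deviation over neighborhoods at several scales. The window sizes are illustrative assumptions, not the authors' choices:

```python
import numpy as np

def local_stats_features(img, scales=(3, 5, 9)):
    """Compute, for every pixel, the intensity mean and standard deviation
    over square neighborhoods at several scales, yielding a feature vector
    of 2 * len(scales) texture values per pixel."""
    h, w = img.shape
    feats = np.zeros((h, w, 2 * len(scales)))
    for i, k in enumerate(scales):
        # reflect-pad so every pixel has a full k x k neighborhood
        p = np.pad(img.astype(float), k // 2, mode='reflect')
        win = np.lib.stride_tricks.sliding_window_view(p, (k, k))
        feats[:, :, 2 * i] = win.mean(axis=(2, 3))
        feats[:, :, 2 * i + 1] = win.std(axis=(2, 3))
    return feats
```

These per-pixel values would then be concatenated with the color, motion, and position features to form the full feature vector.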
3.3. Pixel classification

The next step in the process is to build statistical models for each object based on the feature information calculated for the tracked "training data" pixels. The system then compares the feature information of each remaining pixel to these models and labels the pixel as a member of the region to which it bears the greatest statistical similarity. A certainty measurement is also included in the classification of each pixel.
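As an illustration of this step, the sketch below fits a single diagonal Gaussian per object and labels a pixel by maximum likelihood; the paper's models are multi-modal, so treat this as a simplified stand-in. The certainty value here, the winning object's share of the total likelihood, is one plausible form of the certainty measurement, not necessarily the authors':

```python
import numpy as np

def fit_models(train_feats):
    """train_feats: {object_id: (N, D) array of feature vectors gathered
    from the author's scribbles}. Fit a diagonal Gaussian per object."""
    return {k: (v.mean(axis=0), v.var(axis=0) + 1e-6)
            for k, v in train_feats.items()}

def classify(feat, models):
    """Label one pixel's feature vector with the most likely object,
    plus a certainty score in (0, 1]."""
    scores = {}
    for k, (mu, var) in models.items():
        # diagonal-Gaussian log-likelihood of this feature vector
        scores[k] = -0.5 * np.sum(np.log(2 * np.pi * var) + (feat - mu) ** 2 / var)
    top = max(scores.values())
    likes = {k: np.exp(s - top) for k, s in scores.items()}  # avoid underflow
    best = max(likes, key=likes.get)
    return best, likes[best] / sum(likes.values())
```

A pixel whose feature vector sits far from every model still gets the nearest label, but with a low certainty, which feeds the clean-up pass described next.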
Figure 2. The output segmentation mask for the frame shown in Figure 1.

Because only statistical methods are employed to classify pixels, it is likely that there will be small aberrations in the output, such as a small group of incorrectly classified pixels surrounded by properly classified pixels. To help rectify these anomalies, any pixel that was classified with a certainty factor below a specific threshold is reclassified according to a "K-nearest neighbor" strategy that reassigns the pixel to the object that is most popular among its neighboring pixels. Once this stage is complete, the final output of the system is a segmentation mask that associates every pixel in the entire video sequence with a particular object (Figure 2).
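A minimal sketch of this clean-up pass, assuming a 3 x 3 neighborhood and a simple majority vote (the paper specifies neither the neighborhood size nor the threshold):

```python
import numpy as np
from collections import Counter

def knn_cleanup(labels, certainty, thresh=0.8, k=1):
    """Reassign any pixel whose classification certainty fell below
    `thresh` to the label most popular among its neighbors within a
    (2k+1) x (2k+1) window."""
    h, w = labels.shape
    out = labels.copy()
    for y in range(h):
        for x in range(w):
            if certainty[y, x] >= thresh:
                continue
            ys = slice(max(0, y - k), min(h, y + k + 1))
            xs = slice(max(0, x - k), min(w, x + k + 1))
            neigh = labels[ys, xs].ravel().tolist()
            out[y, x] = Counter(neigh).most_common(1)[0][0]
    return out
```

Voting against the original label map (rather than the partially updated one) keeps the result independent of scan order.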
3.4. Linking objects with actions

The last step in the authoring process is to link the objects in the video with the actions that should be taken when they are selected by the viewer, such as the display of additional information or a change in the presentation of the video. Selecting an object might cause a graphic overlay to appear containing a description of the object, or it might cause a new video clip to be displayed. For example, in HyperCafe10 the viewer can navigate through various video clips of conversations occurring in a cafe by selecting actors in the video or icons that appear at certain times during the presentation.

Linking objects with actions involves creating a system for associating each object with some kind of data, such as a URL, a procedure, an image, or a structure containing several separate pieces of data. It also involves writing the computer program or script that will interpret that data to perform the desired action. Associating objects with data can be accomplished in a number of ways, including information attached separately to each frame of the video or a central database that can be queried during playback. For example, VisualSHOCK maintains an "anchor list," accessible via a drop-down menu system, where data structures containing link information are stored and later referenced by the VisualSHOCK movie player to perform the desired action.

In our system, a simple text file database associates each object with important data, although other strategies may be used. The playback system, written in the Isis programming language,11 includes instructions on how to read and interpret the data in the database. When a viewer selects an object in the video, the value of the corresponding pixel in the segmentation mask identifies the object. The playback system retrieves the data associated with that object from the database and changes the presentation of the video accordingly.
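The record layout below is a hypothetical illustration of such a text-file database and of the mask lookup at selection time; the delimiter and field names are assumptions, not the paper's actual format (and the real system is written in Isis, not Python):

```python
import numpy as np

def load_object_db(text):
    """Parse one 'id|name|price|retailer' record per line into a dict
    keyed by the object number used in the segmentation mask."""
    db = {}
    for line in text.strip().splitlines():
        oid, name, price, retailer = line.split('|')
        db[int(oid)] = {'name': name, 'price': price, 'retailer': retailer}
    return db

def on_select(mask, x, y, db):
    """When the viewer points at (x, y), the mask pixel value identifies
    the object; return its record, or None for unlinked background."""
    return db.get(int(mask[y, x]))
```

Because the mask resolves the selection, the playback program needs no per-frame geometry of its own: one pixel read plus one dictionary lookup yields the product information to display.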
4. HYPERSOAP

We used our authoring tool to create HyperSoap, a four-minute hyperlinked video drama closely patterned after daytime television soap operas. The viewer watches this show on a large projection screen with the knowledge that everything in the scene is for sale, including clothes, props and furnishings. Instead of using a mouse, the viewer selects objects with a laser pointer (Figure 3). Pointing the laser at the screen highlights selectable objects, and keeping the pointer fixed on one object for a short period selects it (indicated by a change in color of the highlight).

Figure 3. Viewers interact with HyperSoap by aiming a laser pointer at objects in a large-screen video projection.
Depending on the mode of playback and the preferences of the viewer, the playback system will display information about the selected object in a number of different ways. In one particular mode, the system waits for an appropriate point to interrupt the video, typically when an actor has finished speaking his line, and displays a separate screen containing a detailed still image of the selected product along with a text box that includes the product's brand name, description, and price. In another mode, appropriate for a broadcast scenario or when the viewer desires more instant feedback, an abbreviated information box appears immediately, without pausing the video, and then fades away after a few seconds (Figure 4). If requested by the viewer, a summary of all the products that were selected is shown at the end of the video.

We oversaw every aspect of the production of HyperSoap, including scriptwriting, storyboarding, shooting and editing. We also supervised the creation of a unique musical soundtrack in which individual pieces, composed to match the mood of a particular part of the scene, can be seamlessly looped and cross-faded. When the video is paused to display product information, the music continues to play in the background, lessening the impact of the interruption on the continuity of the video.

The pixel-level segmentation of the 40 products appearing in HyperSoap was good enough to permit the use of the segmentation mask for highlighting objects. A total of 45 shots were processed through the system individually, with the number of linkable objects per shot ranging from 10 to 20. We found that to maintain a consistently high quality of segmentation, we needed to provide new training data to the system approximately every 30 frames, or one second of video, in order to minimize the propagation of error when estimating the location of the trained pixels in unmarked frames. However, this still provided an exceptional level of automation of the segmentation process; the author scribbled on an average of 5000 pixels in each training frame, meaning that for each 30-frame sequence the algorithm was required to classify 99.8 percent of the total pixels.
Figure 4. A screen grab of HyperSoap during playback in which an object has been selected; the information overlay reads "Earrings, Marina ID., $10". The crosshairs have been added to indicate where the viewer is aiming the laser pointer.
5. DESIGNING HYPERSOAP

In this section, we describe some of the issues we confronted throughout the process of designing HyperSoap, such as how viewers should interact with our video, how the presence of hyperlinks should be indicated, and how hyperlink actions should be executed. In addition, we discuss ways in which the production of hyperlinked video might differ from that of traditional programs.

5.1. Venue and mode of interaction

One of the most critical issues in designing hyperlinked video is that of venue. The video might be presented in a window on a computer monitor or on a standard television set, or perhaps in a large-screen projection. The viewer might be sitting at a desk at work or on a sofa at home. Several different kinds of devices might be used to select objects and interact with the video. In some ways, using a mouse on a PC may be natural for hyperlinked video, considering that people are familiar with using a mouse to activate hyperlinks on the World Wide Web. However, the desktop is not always an ideal place to view video. Many of the genres of content suitable for hyperlinked video are those that people are accustomed to watching in their living rooms.
A television viewing situation may be more natural for certain kinds of programming, but devices enabling viewers to interact with the video are less developed in this environment than in other venues. Since viewers are comfortable using a remote control device to browse through channels of programming on their television, it makes sense that they might also use this device to browse through hyperlinks. For example, WebTV's WebPIP12 users can press a button on their remote control when an icon appears on the screen indicating the presence of a link. They can then choose to display the web page content referenced by the link or archive the link for later viewing. However, this system allows for only one link per segment of video. We envision several different kinds of devices that could be used to select from several links that are available simultaneously. For example, a viewer could cycle through available links with a button on a remote control, and then activate a link by pressing another button. Other approaches might incorporate a position touch pad or a joystick. Alternatively, embedding an inertial sensor inside a remote control would allow the viewer to move the remote in the air to control the position of a cursor on the screen.
We felt the usual computer screen and mouse style of interaction would not be appropriate for the kind of content we were developing in HyperSoap. We designed our playback system with the intent of simulating a future television viewing scenario, one in which a device with significant computational ability (e.g. a set-top box, digital television receiver or DVD player) would be capable of mediating viewer interactions and modifying the presentation based on the author's design. Our prototype viewing system includes a large-screen projection driven by a workstation running Isis, the programming language in which the playback software is written.
As discussed earlier, the viewer selects objects with a hand-held laser pointer. A small video camera attached to the video projector enables the playback system to sense the location of the laser dot on the projection. This is possible because the red dot generated by the laser is always brighter than the brightest possible image our projector can produce. Since the camera's image is never perfectly aligned to the projection, a coordinate transformation is applied to correct for any displacement or keystone distortion. The parameters for this transformation need to be calculated only once by a special calibration program after the position of the projector/camera is fixed relative to the screen.
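Displacement and keystone distortion together amount to a planar projective transform, so the calibration can be posed as fitting a 3x3 homography from four known camera/projection correspondences (e.g. the projection corners as seen by the camera). The paper does not say this is how its calibration program works; the sketch below shows the standard direct linear transform, solved via SVD:

```python
import numpy as np

def fit_homography(src, dst):
    """Solve for the 3x3 projective transform H mapping each (x, y) in
    `src` (camera coordinates) to (u, v) in `dst` (projection coordinates),
    given four point correspondences. H is the null vector of the stacked
    DLT constraint matrix, recovered from the SVD."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, vt = np.linalg.svd(np.array(A, dtype=float))
    return vt[-1].reshape(3, 3)

def apply_homography(H, x, y):
    """Map a camera-space point into projection space (projective divide)."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w
```

Once H is stored, every detected laser-dot position can be mapped to screen coordinates with a single matrix multiply and divide, which is why the calibration needs to run only once per projector/camera placement.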
5.2. Indication of hyperlinks

When displaying hyperlinked video, it is important to indicate the presence of hyperlinks to viewers in a manner suited to the application and to the goals of the viewing experience. In most cases, the presence of hyperlinks should be readily apparent but not overly distracting. If possible, the preferences of the viewer should be addressed as well.

Many of the existing hyperlinked video playback tools that employ a pointing device for interaction indicate the presence of hyperlinks by changing the shape of the cursor when it is positioned over a hyperlinked object. When this happens, a name or description for the link may be displayed in an area of the computer screen. For example, IBM's HotVideo software changes the cursor to an icon that corresponds to the media type of the information contained in the link. HotVideo also displays an icon at all times during playback that changes color to indicate when hyperlink opportunities exist. Similarly, the WebPIP system displays a small icon in a graphic overlay when a hyperlink opportunity is available.

Other approaches have been used as well. HotVideo is capable of displaying wireframe shapes around the hyperlinked objects in the video, although this method can cause the video to become quite cluttered if there are several hyperlinks in a single frame. It can also be confusing to the viewer if a linked object occupies a large portion of the video frame or surrounds other objects. A HotVideo author can also choose to indicate hyperlinks by changing the brightness or tint of the pixels inside the wireframe shape, or by applying other transformations.
In developing HyperSoap, we wanted every object to be selectable all the time. If the viewer is given this knowledge prior to watching the video, then indicating the presence of hyperlinks is not necessary. When nobody is interacting, the video looks fairly ordinary, enabling the viewer to watch the presentation in a more passive manner without any distraction that might arise from indicating hyperlink opportunities. When the viewer selects an object with the laser pointer, the playback system utilizes the information in the segmentation mask to highlight all of the pixels that comprise the object. The highlight remains on the object for a few seconds after the laser pointer is turned off and while the corresponding product information is presented, and then fades out. We also found that subsampling the segmentation mask by a factor of two had a minimal effect on the quality of the rendered object highlights.
5.3. Initiation of hyperlink actions

Unlike traditional hypertext documents, the temporal nature of hyperlinked video raises interesting issues regarding the timing of actions associated with selected hyperlinks. In many hyperlinked video playback systems, clicking on a link causes a new web page to be loaded immediately in a separate window. However, displaying lengthy or detailed information while the video continues to run may cause the viewer to feel distracted from the video content. Likewise, any delay in showing the link information may blur the context of the information such that it is no longer relevant to the video presentation. Thus, it is important to consider how and when hyperlink actions should occur in order to best serve the goals of the application. It may be important to maintain the continuity of the video while still presenting the desired link information to the viewer in a timely and meaningful fashion.

From our experimentation with HyperSoap, we found that viewers have a variety of preferences regarding the timing with which product information is displayed when an object is selected. When a viewer selected a linked object in an early version of HyperSoap, the video was interrupted immediately to show the link content. After a few seconds, the link content would disappear and the video would resume from the point at which it was interrupted. While some users enjoyed the instant response from the system, others found it somewhat troubling that their interactions would cut off actors in the middle of delivering lines or interrupt other important moments within the program. Also, viewers tended to forget the context of the video content before the interruption and found themselves somewhat lost in the plot when it resumed.

Our next implementation yielded a more desirable manner of interaction: when a viewer selected a hyperlink, the system waited until the video reached a break in the action, typically after an actor finished speaking a line, before pausing the video and showing the link information. This yielded a more favorable reaction from viewers, although the delay in displaying the information caused many first-time users to think the playback mechanism was not working properly. Others who felt more interested in the link information expressed that they would have preferred instant feedback.

In the latest implementation, there are three modes from which to choose. In the first, link information is displayed at breakpoints as described above, but with the addition that a selected object remains highlighted until the breakpoint is reached in order to indicate to the viewer that the system hasn't "forgotten" about it. In the second mode, abbreviated product information is displayed immediately in a graphic overlay when an object is selected, without pausing the video. In the last mode, no information about selected products is displayed until the very end of the show.
5.4. Production design

HyperSoap raises several interesting issues regarding the differences between traditional television programming and hyperlinked video content. A traditional television soap opera is designed in such a way that simply adding hyperlinks and allowing interruptions would likely detract from its effectiveness, because they would obstruct the flow of the story.

In designing the content for HyperSoap, we had to move away from a format where the story is central and any product placement or viewer interaction is a decoration, to a new format where all of these components are equally important and dependent on each other for the survival of the presentation. The knowledge that everything would be for sale and that viewers would be interacting was part of the program's design from the earliest stages of planning. This entailed several changes in the way the scene was scripted, shot, and edited when compared to a traditional TV program.

For example, shots were designed to emphasize the products for sale while not diverting too much attention from the flow of the story. Certain actions were written into the script in order to provide natural opportunities for close-ups of products which would otherwise have been difficult to see or select with our interaction device. Similarly, the scene was edited so that shots are either long enough or returned to often enough for viewers to spot products and decide to select them. The story itself involves the products in ways that might increase their value to the consumer. Integrating the story with the products makes interruptions to show product information less jarring overall. The knowledge that everything is for sale and that the program was designed for the dual purpose of entertainment and shopping motivates the viewer to "stay tuned" and to interact with the products.
6. CONCLUSIONS

The pixel-level segmentation enabled by our approach was critical in HyperSoap because the scene contained many oddly-shaped and overlapping objects. Identifying objects by scribbling roughly on key frames in the sequence was an effective way of dealing with complex shapes and scenes containing large numbers of objects. The pixel-level tracking of objects helped automate the authoring process, since many of these objects move rapidly and change shape as a result of both camera movement and the action in the scene.

We are currently exploring how hyperlinked video capabilities might be utilized in educational applications. For example, children could collect specimens and learn more about particular aspects of nature by selecting animals and foliage in a "safari" program. Hyperlinked video might also be useful in training applications, especially those that involve complicated tools and a strong temporal component. Consider a presentation of a surgical procedure in which selecting objects would allow viewers with different backgrounds to learn more about the instruments and people involved, or follow alternative edits of the footage that emphasize their individual interests or preferences. Stories in which there is an emphasis on interactions with physical objects can also benefit from hyperlinked video. For example, we implemented a different version of HyperSoap in which selecting objects reveals hidden plot details related to those objects.

The tools we have developed can be applied to these and other kinds of hyperlinked video programming as well. Although our segmentation system can handle many different types of video content, it is particularly well-suited to highly detailed sequences containing complex object shapes and movements. Our playback system was specially tailored to fulfill the specific goals of the HyperSoap application, but many of the ideas behind its design are applicable to other kinds of content. In addition, the creation of HyperSoap alerted us to many important aspects of the design of hyperlinked video in general, including production techniques, choice of venue, and methods for viewer interaction.
ACKNOWLEDGMENTS

We would like to thank Michael Ponder, JCPenney, and the JCPenney Headquarters Production Staff for their invaluable assistance with the HyperSoap shoot. We would also like to thank Kevin Brooks, Paul Nemirovsky and Alex Westner for providing their scriptwriting, scoring and Foley sound expertise. Also, a special thank you to JuliaAnn Lewis, Matthew Rupe and Andrew T. Chandler for their tremendous acting talents. This research has been supported by the Digital Life Consortium at the MIT Media Laboratory.
REFERENCES

1. Mitsubishi Electric America, Inc., VisualSHOCK MOVIE Web Site, http://www.visualshock.com.
2. IBM, HotVideo Web Site, http://www.software.ibm.com/net.media/solutions/hotvideo.
3. Veon, V-Active Web Site, http://www.veon.com.
4. J. R. Smith and S.-F. Chang, "Local color and texture extraction in spatial query," in IEEE Proc. Int. Conf. Image Processing, pp. III-1011–III-1016, 1996.
5. E. Chalom, Statistical Image Sequence Segmentation Using Multidimensional Attributes. PhD thesis, MIT, Cambridge MA, 1998.
6. E. Chalom and V. M. Bove, Jr., "Segmentation of an image sequence using multi-dimensional image attributes," in IEEE Proc. Int. Conf. Image Processing, pp. II-525–II-528, 1996.
7. T. Kanade and B. D. Lucas, "An iterative image registration technique with an application to stereo vision," in Proc. DARPA Image Understanding Workshop, pp. 121–130, 1981.
8. B. K. P. Horn and B. G. Schunck, "Determining optical flow," AI Memo 572, MIT Artificial Intelligence Laboratory, April 1980.
9. A. K. Jain and J. Mao, "Texture classification and segmentation using multiresolution simultaneous autoregressive models," Pattern Recognition 25, pp. 173–188, February 1992.
10. N. Sawhney, D. Balcom, and I. Smith, "Authoring and navigating video in space and time," IEEE Multimedia 4, pp. 30–39, October 1997.
11. S. Agamanolis and V. M. Bove, Jr., "Multilevel scripting for responsive multimedia," IEEE Multimedia 4, pp. 40–49, October 1997.
12. WebTV Web Site, http://www.webtv.com.