`
`Name: David C. Gibbon
`Date: 04/18/01
`Social Security Number: 145-58-1984
`Telephone Number: 732 420 9127
`E-mail Address: dcg @ research.att.com
`Organization: HA157000O
`AT&T Business Unit: ALRES
`Location: RM A5-4FO2, Middletown, NJ
`
`Please answer the following questions as completely as possible.
`
`1.
`
`SUBJECT (Title of your idea)
`
`Method for content-based non-linear control of multimedia playback.
`
`2.
`
`OBJECTIVE (What problem does the proposal solve or what purpose does it serve?)
`
The method provides an intelligent and efficient means for users to easily
find and navigate high quality video material, using a separate (preferably small,
portable) device (e.g., a PDA), based on a detailed program-specific index.
`
`3.
`
BRIEF DESCRIPTION (1. What is it? 2. How does it operate? 3. Is there a date
involved, e.g. introduction or announcement of a service or product?)

1. It can be thought of as the next generation of interactive television
remote control that enables the searching and browsing of video
and multimedia content archives. The remote control can be similar
to today's high-end PDA devices, which are capable of displaying
dynamic content including graphic images. It is assumed that the
remote control communicates with a multimedia database which
includes not only metadata such as program title etc., but also
detailed content-specific index data that are extracted from the
content either by automatic media processing techniques (e.g.,
video indexing, audio indexing), or manually by a human.
2. See attached disclosure for details.
3. Our current negotiations with potential technology licensees for our
multimedia indexing technology have confirmed a market for such a
method. We would like to include this submission as part of the
package of technologies to be licensed.
`
`
`
`AT&T - PROPRIETARY
`
`Use pursuant to Company instructions
`EXHIBIT 2001 Patent Owner IPR2016-00047 Page 1 of 14
`
`
`
`4.
`
`COMPARISON (1. What is the known prior art (e.g. past publications or products), if
`any? 2. What are the differences over the prior art? 3. What commercial benefits are
`derived from these differences?)
`
1. Multimedia indexing systems that support searching and browsing of
video using a single device such as a desktop computer, and that display
condensed versions of video programs on portable devices, are covered by:
"Second Supplemental Preliminary Amendment Method for Providing a
Compressed Rendition of a Video Program in a Format Suitable for
Electronic Searching and Retrieval," docket # 109579 C. Dynamic
bitmapped displays in remote controls for home entertainment systems
are known. Using a PDA as a remote by displaying control keys (similar to
the mechanical keys on today's remote controls, and the dynamic bitmaps
on remote controls) on the touch-sensitive screen is known.
2. Unlike the existing methods that either simulate the physical media control
keys on the touch-sensitive screen, or display metadata, such as the title
of the movie or name of a song, the proposed method is based on the
display of detailed content-specific information from the content. Unlike
existing systems that display information related to the content on the
same display as the video, the new method employs two separate devices and uses
each to its maximum advantage.
3. The method enables the creation of more user-friendly network-based
video-on-demand entertainment and information services. It also has
commercial applications to self-contained home entertainment systems.
`
`5.
`
USE (1. What is the probability of commercial use? By AT&T? By Others? 2. Is it
scheduled for use in an AT&T Product or service? 3. Which one, and when? 4. Is this
idea likely to be adopted by others? If so, to what extent? Why? 5. Is it likely to become
a standard? 6. Do you see applications for the idea other than the one described above?)
`
1. There is a high probability of commercial use of this technology
given the trends toward lower device cost and increasing availability of
wireless IP networks and broadband IP to the home. Any cable TV
service provider, including AT&T, can make use of this service.
2. Not currently.
3. Not applicable.
4. Yes. AT&T Broadband does not operate in every geographic
market, so it is likely that other cable companies will adopt this
method. The invention also has applications in home networking
environments for controlling the replay of video and multimedia
content stored at the customer premises or at a remote location.
5. No, but the method makes use of existing and emerging standards
such as XML, HTML, MPEG-2, MPEG-7.
6. Yes, e.g., pay-per-view video services on-the-go (e.g., airports)
where the user uses a personal remote (e.g., PDA) to find/select
video or multimedia content to be delivered on a separate device
(e.g., a network-connected video monitor at the airport).
`
`
`
`
`
`
`6.
`
`SUBMITTERS (You and any others who collaborated with you in the development of
`this idea) Please include name, social security number, home address, including
`county, and citizenship of each submitter.
`
`David C. Gibbon
`US Citizen
`‘I 45-58-1 984
`
`107 Majestic South
`Lincroft, NJ 07738
`
`Behzad Shahraray
`369-82-2270
`194 Sherwood Drive
`
`Freehold, NJ 07728
`
`Edward Y. Chen
`
`Laurence W. Ruedisueli
`
`
`
`
`
`
`
`
` AT&T
`
`
`AT&T Labs - Research
`
`subject: Method for content-based non-linear
`control of multimedia playback.
`
date: April 30, 2001

from: David C. Gibbon,
Behzad Shahraray,
Edward Y. Chen,
Laurence W. Ruedisueli
`
`Overview:
This document discloses a method for content-based non-linear control of multimedia
playback. A request for opinion of counsel has been submitted under the same title, and the
introduction section repeats some of the information from that document.
`
`Introduction:
`
The method provides an intelligent and efficient means for users to easily find and
navigate high quality video material, using a separate (preferably small, portable) device (e.g., a
PDA), based on a detailed program-specific index.
It can be thought of as the next generation of interactive television remote control that
enables the searching and browsing of video and multimedia content archives. The remote control
can be similar to today's high-end PDA devices, which are capable of displaying dynamic content
including graphic images. It is assumed that the remote control communicates with a multimedia
database which includes not only metadata such as program title etc., but also detailed
content-specific index data that are extracted from the content either by automatic media processing
techniques (e.g., video indexing, audio indexing), or manually by a human.
Unlike the existing methods that either simulate the physical media control keys on the
touch-sensitive screen, or display metadata, such as the title of the movie or name of a song, the
proposed method is based on the display of detailed content-specific information from the content.
Unlike existing systems that display information related to the content on the same display as the
video, the new method employs two separate devices and uses each to its maximum advantage.
The method enables the creation of more user-friendly network-based video-on-demand
entertainment and information services. It also has commercial applications to self-contained
home entertainment systems. Other possible uses of the invention include pay-per-view video
services on-the-go (e.g., airports) where the user uses a personal remote (e.g., PDA) to
find/select video or multimedia content to be delivered on a separate device (e.g., a
network-connected video monitor at the airport).
`
`Operation:
The method involves retrieving video material from a multimedia database and associated
video server. During operation of the method, a user interacts with a control device and observes
video material on a video device such as a television monitor. It is assumed that the control
device is not capable of playing video material, but that it is capable of displaying dynamically
generated content, preferably color still images and text. The control device must be capable of
data communications, but may do so at low bandwidth (such as less than 100 Kbps).
(Alternatively, the control device could play preview video; further, the device may be incapable of
displaying color images or any images at all.) The data communications of the control device are
preferably wireless and may employ Bluetooth, IEEE 802.11b, infrared or other means. The video
device is preferably connected to an IP network of at least 10 Mbps bandwidth and can decode
compressed digital video. Wireless connectivity is also possible for the video device, but
the wireless connection, if used, must be of sufficient bandwidth to support video.
Figure 1 shows how video material is added to the database. The video is preferably
analyzed automatically to detect video shot boundaries and to record any associated closed
caption information. Additionally, large-vocabulary automatic speech recognition (LVASR) can be
used to obtain a transcription. Optionally,
ancillary source material can be added to the database to improve the accuracy or to bring in
other information suitable for indexing and retrieval of the video content. Examples include:
off-line transcriptions, manual annotations, topic classifications, post-production scripts, metadata
such as actors' names, genre classifications, etc. It is assumed that each television program or
logical unit of video material (such as a single video tape, or a single speech from a corporate
CEO) will be entered into the database as a distinct entry, and that the database will typically
consist of a large number of such entries. (Alternatively, another embodiment would break down
video programs into smaller units such as topic or story boundaries. The invention can be applied
in this case as well.) The advantage of the current invention is to facilitate navigation of a large
collection of video content, and further, to navigate within a particular entry in the database, such
as a single video program. An illustrative example is an archive of all television programs in a
particular geographic area for a seven-day period.
In addition to the metadata or information used for indexing, the video material is
preferably digitized and compressed in a standard format such as MPEG-2 and stored on a video
server. Other embodiments include digital video encoded for delivery at lower bitrates such as
300 Kbps MPEG-4. The preferred embodiment maintains multiple versions of the video, and the
highest possible quality version is selected based upon the available bandwidth and client terminal
capabilities. While centralizing the video database offers economies of scale and ease of
maintenance, it is also possible that the video material can be stored locally (in close proximity to
the user's video device). Additionally, hybrid embodiments are possible in which some of the
video material is stored locally and other video content is stored remotely, perhaps in several
distinct geographic locations. Such embodiments may be organized such that popular or
frequently viewed content is stored locally to minimize the amount of video material that is
transported by the network. Further, intelligent content distribution networks can be utilized to
efficiently distribute the content from the source to the consumers.
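The selection of the highest quality version for a given client can be illustrated with a minimal sketch. This is not the disclosed implementation: the class name, field names, and the second URL are hypothetical, and encoded bitrate is used as a stand-in for "quality".

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class VideoVersion:
    """One stored encoding of a program (all field names are illustrative)."""
    codec: str          # e.g. "MPEG-2" or "MPEG-4"
    bitrate_kbps: int   # encoded bitrate in kilobits per second
    url: str            # hypothetical location on the video server


def select_version(versions: Sequence[VideoVersion],
                   available_kbps: int,
                   supported_codecs: Sequence[str]) -> Optional[VideoVersion]:
    """Return the highest-bitrate version the client can decode and the link can carry."""
    playable = [v for v in versions
                if v.codec in supported_codecs and v.bitrate_kbps <= available_kbps]
    # Assumption: higher bitrate means higher quality; a real system might rank differently.
    return max(playable, key=lambda v: v.bitrate_kbps, default=None)


versions = [
    VideoVersion("MPEG-2", 6000, "http://videoserver/content.mpg"),
    VideoVersion("MPEG-4", 300, "http://videoserver/content_300k.mp4"),  # hypothetical URL
]
```

A video device on a 10 Mbps link that decodes both formats would receive the MPEG-2 version under this sketch, while a client limited to 1 Mbps would fall back to the 300 Kbps MPEG-4 version.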
`The system incorporates a content generation or rendering engine as disclosed in US
`Patent #6,098,082, which is used to generate content for remote devices. During searching the
`multimedia database also generates content to help the user quickly navigate to the desired
`program (see figures). During browsing, the generated content serves two main purposes: 1) it
`conveys a summary or condensed representation of the video program for browsing, and 2) it acts
`as a dynamic control pad for initiating video playback.
`
`One Embodiment:
`
In one embodiment of the invention, the control device is a Compaq iPAQ model 3650
running the Microsoft PocketPC (Windows CE) operating system. An expansion pack supports
PCMCIA cards, and an Orinoco 802.11b wireless LAN card is used for data communications. An
Orinoco network access point connects the wireless LAN to a 100BaseT LAN.
The iPAQ includes a MS Windows IE web browser that supports HTML 2.0. The
multimedia database runs MS Windows 2000 Advanced Server and includes MS Internet
Information and Index Services. Custom application software (including CGI programs and
templates) is used to dynamically generate content from a multimedia database in HTML format
specifically designed for the iPAQ device.
A dedicated PC running the Linux operating system and including a Real Magic
Netstream MPEG-2 decoder PCI bus card is used to serve as an Internet protocol transport
endpoint and to convert compressed digital video to baseband analog S-Video (and audio) format
to feed a television monitor.
`
The video device includes application software for receiving compressed video data over
IP using either HTTP (TCP) or RTSP (UDP). The software also listens on a socket for control
commands such as STOP, PAUSE, etc. The preferred control protocol is HTTP, although a
custom protocol has also been implemented.
`
`
`
`
`
`
As there may be more than one video device that may be controlled by the control device,
the user first specifies which video device is to be controlled. The list of available devices may be
a predetermined list of device names maintained on an HTTP server. (The list may be created at
the time that the devices are installed.) Preferably, each video display device has a friendly name
and there is a corresponding DNS entry that maps the friendly name to an IP address.
Alternatively, a protocol similar to ARP may be used to determine the list of active video devices.
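As a rough illustration of the name lookup described above, the following sketch checks a predetermined list first and falls back to an ordinary DNS query. The registry contents and the address are assumptions made for the example (192.0.2.0/24 is a range reserved for documentation); the ARP-like discovery alternative is not shown.

```python
import socket
from typing import Dict, Optional

# Hypothetical predetermined list, e.g. created when the devices are installed.
KNOWN_DEVICES: Dict[str, str] = {"NTV1": "192.0.2.10"}


def resolve_device(name: str, known: Dict[str, str] = KNOWN_DEVICES) -> Optional[str]:
    """Map a friendly device name to an IP address, or return None if unknown."""
    if name in known:
        return known[name]
    try:
        # Fall back to the DNS entry that maps the friendly name to an address.
        return socket.gethostbyname(name)
    except OSError:
        return None
```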
`For each television program in the database, the following multimedia data are stored on
`a server and are accessible via HTTP:
`
o 6 Mbps MPEG-2 program stream,
o JPEG frames and associated metadata (e.g., time within the broadcast that the
frame was sampled, type of video transition) representing each scene,
o program metadata including title, broadcaster, time and date the program aired,
o closed caption text,
o data structures indicating pagination (Programs are divided into sets of HTML
pages and these data structures include the number of pages, and indicate which
captions and images appear on a given page. An index page is also represented.
See US 6,098,082.)
o Optionally: an offline transcription that has been synchronized with the broadcast
(See US Patent application "Generating Hypermedia Documents from
Transcriptions of Television Programs Using Parallel Text Alignment.")
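One way to picture a program data set of this kind is sketched below. The class and field names are hypothetical, and the pagination rule (a fixed number of key frames per page, with caption i paired to frame i) is a simplification chosen for the example, not the method of US 6,098,082.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KeyFrame:
    jpeg_path: str      # hypothetical file name of the sampled JPEG frame
    time_sec: float     # time within the broadcast at which the frame was sampled
    transition: str     # type of video transition, e.g. "cut" or "dissolve"


@dataclass
class Page:
    frame_indices: List[int]    # key frames that appear on this page
    caption_indices: List[int]  # caption segments that appear on this page


@dataclass
class Program:
    title: str
    broadcaster: str
    air_time: str               # time and date the program aired
    stream_url: str             # location of the MPEG-2 program stream
    frames: List[KeyFrame] = field(default_factory=list)
    captions: List[str] = field(default_factory=list)
    pages: List[Page] = field(default_factory=list)


def paginate(program: Program, frames_per_page: int = 4) -> None:
    """Fill program.pages with a fixed number of key frames per page."""
    program.pages = []
    total = len(program.frames)
    for start in range(0, total, frames_per_page):
        idx = list(range(start, min(start + frames_per_page, total)))
        # Simplification: caption i is shown alongside frame i when it exists.
        caps = [i for i in idx if i < len(program.captions)]
        program.pages.append(Page(frame_indices=idx, caption_indices=caps))
```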
`
`
`
A number of such program data sets are stored on the server and the MS Index Services are
used to provide a full text search capability on either the closed caption (CC) text or the offline
transcription. HTML "Meta"
tags are included with the text so that relational database queries can be supported (e.g., find
programs containing the term "NASA" from the broadcaster "NBC" that are less than one year
old). The system combines the basic CC text with metadata to generate files that the index
server uses as content for searching. Once the index server has identified programs matching the
user's query, application software generates content-rich user interfaces in HTML format for
browsing the multimedia content and for initiating video playback. The browsing capability consists
of text extracts and corresponding key frame images with hyperlinks for navigation to other points
of interest within the program, or to other relevant programs. Templates through which the
application software maps the multimedia content govern the form, as well as the appearance, of
the content.
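The kind of query described above (a text term combined with metadata restrictions) can be sketched as follows. This is only an illustration: a case-insensitive substring match stands in for the full text search provided by MS Index Services, and the record type and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional, Sequence


@dataclass
class IndexedProgram:
    title: str
    broadcaster: str
    air_date: date
    text: str   # closed caption text or offline transcription


def search(programs: Sequence[IndexedProgram], term: str,
           broadcaster: Optional[str] = None,
           max_age_days: Optional[int] = None,
           today: Optional[date] = None) -> List[IndexedProgram]:
    """Return programs whose text contains the term and whose metadata matches."""
    today = today or date.today()
    needle = term.lower()
    hits = []
    for p in programs:
        if needle not in p.text.lower():
            continue  # term does not occur in the caption or transcript text
        if broadcaster is not None and p.broadcaster != broadcaster:
            continue  # restricted to a different broadcaster
        if max_age_days is not None and today - p.air_date > timedelta(days=max_age_days):
            continue  # older than the requested date range
        hits.append(p)
    return hits
```

The example query from the text would then read `search(programs, "NASA", broadcaster="NBC", max_age_days=365)`.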
`
For initiating and controlling video playback, the application software generates URLs
that pass parameters to a CGI application running under the HTTP server on the video device.
(In another implementation, the CGI application runs on the multimedia index server, opens a
TCP/IP socket connection to the video device, and communicates using a proprietary protocol. In
yet another implementation, the video device acts as a content filter similar to a web proxy.) A
CGI syntax of name/value pairs is used for passing parameters from the control device to the
video device as follows:
`
o MediaURL - A URL, URN or URI indicating the video stream (typically an MPEG-2
program stream or an MPEG-1 systems stream). The protocol is also specified, as either
HTTP or RTSP. For example: http://videoserver/content.mpg or
rtsp://videoserver/content.mpg
o VideoDevice - a host name or IP address indicating the device which is being
directed to display the video
o StartTime - the video play position in units of floating point seconds since the start of the
media. The decoder application may round this value down to the nearest feasible starting
point (e.g., integer seconds). If omitted, StartTime is assumed to be zero.
o Volume - volume amplitude on a linear scale from 0 to 100.
o Reply - parameter may take on one of the following values:
    o Yes - send a status message response in HTML format via HTTP
    o No - issue an HTTP 204 (no response) message
o Command - parameter may take on one of the following values:
    o play - play at the given StartTime
    o stop - stop the video and blank the screen
    o mute - set volume to 0
    o volup - increase the volume 10 units
    o voldown - decrease the volume 10 units
    o volume - set the volume using the given Volume
    o pause - stop the video, freezing on the current video frame
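The name/value parameters above can be exercised with a short sketch that builds a control URL on the control-device side and interprets it on the video-device side. Several details are assumptions made for illustration only: the CGI path "/cgi-bin/control", the in-memory PlayerState, the default volume of 50, and the clamping of volume to the 0 to 100 range. Only the parameter and command names come from the description.

```python
from dataclasses import dataclass
from math import floor
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_control_url(video_device: str, command: str,
                      media_url: Optional[str] = None,
                      start_time: Optional[float] = None,
                      volume: Optional[int] = None,
                      reply: str = "No") -> str:
    """Encode a control request as CGI name/value pairs."""
    params = {"VideoDevice": video_device, "Command": command, "Reply": reply}
    if media_url is not None:
        params["MediaURL"] = media_url
    if start_time is not None:
        params["StartTime"] = str(start_time)
    if volume is not None:
        params["Volume"] = str(volume)
    # "/cgi-bin/control" is a hypothetical path, not taken from the disclosure.
    return f"http://{video_device}/cgi-bin/control?{urlencode(params)}"


@dataclass
class PlayerState:
    playing: bool = False
    paused: bool = False
    position_sec: int = 0
    volume: int = 50            # assumed default
    media_url: Optional[str] = None


def apply_command(state: PlayerState, url: str) -> PlayerState:
    """Update the player state according to the Command in a control URL."""
    q = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    cmd = q.get("Command", "")
    if cmd == "play":
        state.media_url = q.get("MediaURL", state.media_url)
        # StartTime defaults to zero and is rounded down to integer seconds.
        state.position_sec = floor(float(q.get("StartTime", 0)))
        state.playing, state.paused = True, False
    elif cmd == "stop":
        state.playing, state.paused = False, False
    elif cmd == "pause":
        state.playing, state.paused = False, True
    elif cmd == "mute":
        state.volume = 0
    elif cmd == "volup":
        state.volume = min(100, state.volume + 10)   # clamping is an assumption
    elif cmd == "voldown":
        state.volume = max(0, state.volume - 10)
    elif cmd == "volume":
        state.volume = max(0, min(100, int(q.get("Volume", state.volume))))
    return state
```

For instance, `build_control_url("NTV1", "play", media_url="http://videoserver/content.mpg", start_time=83.7)` yields a URL that, when applied, starts playback at second 83 under the round-down rule.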
`
`
`
`Figure Descriptions
Figure 1 shows the system architecture for adding video material to the database.
Figure 2 shows several alternative network topologies for implementing the invention.
Figure 2a corresponds to the embodiment described above in which all devices and servers are
connected to a common network. Figure 2b depicts a topology in which the control device
communicates with the video device, and the video device in turn is connected to the servers.
Figure 2c shows yet another embodiment in which there is no upstream connectivity. In this case,
content indexing information is downloaded and stored locally in the video device. It is also
possible for the video content itself to be downloaded and stored locally (the local storage can be
in the video device, or in a connected system component such as a PVR). If the video is not stored
locally, the video content available would be limited to what is currently being broadcast.
Figure 3 represents a more detailed view of the device subcomponents. This figure
follows the network topology depicted in Figure 2a.
Figure 4 is a photograph of a device that can be used as the control device to implement
the invention.
`
Figures 5 through 11 are screen shots from the control device and disclose one possible
implementation of the invention. Figure 5 shows the interface for specifying the video device that
is to be controlled. In this example, the user has entered the video device name "NTV1".
Preferably the interface would include a list of all available devices to which the user has access.
Also, names entered by the user or selected previously can be presented to the user in a
selectable list.
`
Figure 6 depicts the main or "home" screen after the video device has been selected. At
the top of the screen are icons for device control. These icons are linked via CGI URLs to control
commands for (from left to right) stop, pause, volume up, volume down. In this embodiment,
the buttons remain at the top of the screen whenever a video device has been selected for
control. This is so that the user may easily stop or turn the volume down regardless of where the
user is in a navigation session. Additional command icons for mute and other functions can be
included in this button cluster. A button for linking back to a more conventional numeric keypad
display would be desirable to allow the user to select a live broadcast channel. It would further be
desirable for the control device to display a status message indicating the currently playing
content (perhaps by title and some indication of the playing time). This message could similarly be
displayed in a persistent manner, as is the button cluster. In the embodiment shown here, the
button cluster is implemented as an HTML frame.
In Figure 6 the user may enter a search term and may restrict the search to a particular
broadcaster, program or date range. If the user selects the "Topics" link, a list of common search
terms is displayed (Figure 7). If one of these topics is selected, it has the same effect as if the term
had been entered into the search form in Figure 6.
Once a search term has been specified (either by clicking on a term on the Topic screen,
or by entering a term on the main screen and clicking "search"), a list of programs that are relevant
to that term is displayed as in Figure 8.
After a program is selected from the list of programs shown in Figure 8, the
relevant content extracted from the video program is displayed as in Figure 9. In this example, the
search term was "NASA", and the system has selected excerpts of the pictorial transcription that
contain that term. If the user clicks on one of the images, the video will start playing on the video
device at that point in the program. If the user selects one of the arrow icons, the full transcription
is displayed as shown in Figure 10. At this point, the user may scroll the display and select an
image to initiate video playback.
If the user selects the "Latest News" link shown in Figure 6, a list similar to that shown in
Figure 8 will be displayed. However, the list will contain the most recently aired programs in the
database, shown in reverse chronological order. In this case, selecting one of these will bring the
user directly to the full transcription as shown in Figure 10.
Figure 11 depicts an interface for browsing video material that has no closed
caption text and has not otherwise been transcribed. The example shows home video content that has been
arranged into a series of thumbnail pages or "contact sheets."
The intent of these interfaces is to facilitate searching and browsing of the video material. By
allowing the user to see some rendition of the video content, the user can make informed
decisions as to the relevance or desirability of viewing the full video program. This will minimize
requests for irrelevant content, thus minimizing the load on the video server and saving the user
time. These figures are for illustrative purposes only, and it is possible to implement the invention
with user interfaces that differ considerably in appearance from the one depicted here. In fact, it
would be desirable to have a plurality of interfaces that are customizable based upon the user's
preferences or those of the content providers or broadcasters.
`
`Prior Art:
US Patents 5,844,620 and 5,539,479 disclose methods of using remote controls to
interact with menu or other user interface elements overlaid on top of live video. We
advantageously make no use of overlays, which obstruct the view of the video and distract other
viewers. US Patent 5,995,155 discloses a system for recording video programs selectively based
on program guide information. We do not selectively record programs. US Patent 5,410,326
discloses a device that includes a touch screen display that displays advertisements. It does not
include the display of
content derived from the video material for the purposes of searching and browsing.
`
US5995155: Database navigation system for a home entertainment system
US5844620: Method and apparatus for displaying an interactive television program guide
US5539479: Video receiver display of cursor and menu overlaying video
US5410326: Programmable remote control device for interacting with a plurality of remotely
controlled devices
`
`
`
`
`
`
`Figures
`
`
[Figure 1 diagram. Labels: Video Source Material; Ancillary Source Material; Media Analysis - Metadata Extraction; Multimedia Database]

Figure 1: Content acquisition
`
`
`
[Figure 2a diagram. Labels: Consumer Premises; MM Database; Video Database; Network Access Point; 100BaseT LAN]

Figure 2a: System Architecture for one Implementation
`
`
`
`
`
`
[Figure 2b diagram. Labels: Control Device; Video Device; MM/Video Database]

Figure 2b: Control device communicates with video device
`
`
`
[Figure 2c diagram. Labels: Broadcast Video and Datacast; Video Device; Control Device]

Figure 2c: Content datacast and stored locally (in video device or control device)
`
[Figure 3 diagram. Labels: Video Device; Control Device; Multimedia Database; Display Controller; Query Processing; Request Processing; Network Interface; Streaming Video and Control. Reference numerals: 100, 200, 300, 400]

Figure 3: System Architecture
`
`
`
`
`
`
`
`
`
`
Figure 5: Interface for specifying the video device to control (here "NTV1")
`
`
`
`
Figure 6: Main screen after connection

Figure 7: Main Topic Screen
`
`
`
`
`
`
`
`
`
`Figure 8: List of Programs
`
`Figure 9: Display of program segments matching a topic (NASA)
`
`
`
`
`
`
`Figure 10: Browsable view of a program
`
`
`
`
`
`Figure 11: Browsable view of Home video content
`