US006154771A
Rangan et al.
(11) Patent Number: 6,154,771
(45) Date of Patent: Nov. 28, 2000
(54) REAL-TIME RECEIPT, DECOMPRESSION AND PLAY OF COMPRESSED STREAMING
VIDEO/HYPERVIDEO; WITH THUMBNAIL DISPLAY OF PAST SCENES AND WITH
REPLAY, HYPERLINKING AND/OR RECORDING PERMISSIVELY INITIATED
RETROSPECTIVELY
`
(75) Inventors: P. Venkat Rangan; Vijnan Shastri; Arya Ashwani; Parag Arole, all of San Diego, Calif.

(73) Assignee: Mediastra, Inc., San Diego, Calif.

(21) Appl. No.: 09/088,513

(22) Filed: Jun. 1, 1998
`51
`Int. Cl." ............................. G06F 13/38; G06F 15/17
`52) U.S. Cl. .......................... 709/217; 709/226; 709/229;
`709/231; 709/247; 709/305; 345/327; 345/328;
`345/335; 345/302; 707/501; 707/513
`58 Field of Search ..................................... 345/327, 328,
`345/302, 433,335, 2, 10, 104; 386/109,
`52; 709/217, 231, 203, 247, 226, 229, 305;
`348/552; 707/501, 502,513
`
`56)
`
`References Cited
`U.S. PATENT DOCUMENTS
5,684,715  11/1997  Palmer ................ 364/514
5,760,767   6/1998  Shore et al. .......... 345/328
5,801,685   9/1998  Miller et al. ......... 345/302
5,809,512   9/1998  Kato .................. 707/502
5,850,352  12/1998  Moezzi et al. ......... 364/514
5,861,880   1/1999  Shimizu et al. ........ 345/302
5,862,260   1/1999  Rhoads ................ 382/232
5,893,053   4/1999  Trueblood ............. 702/187
5,903,892   5/1999  Hoffert et al. ........ 707/10
5,918,012   6/1999  Astiz et al. .......... 709/217
5,929,850   7/1999  Broadwin et al. ....... 345/327
5,936,679   8/1999  Kasahara et al. ....... 348/553
5,961,603  10/1999  Kunkel et al. ......... 709/229
5,966,121  10/1999  Hubbell et al. ........ 345/328
5,970,473  10/1999  Gerszber et al. ....... 705/26
6,006,241  12/1999  Purnaveja et al. ...... 707/512
6,006,265  12/1999  Mishima et al. ........ 386/11
6,009,236  12/1999  Mishima et al. ........ 386/11
6,025,837   2/2000  Matthew, III et al. ... 345/327
6,025,886   2/2000  Koda .................. 348/700
6,026,433   2/2000  Darlach et al. ........ 709/217
6,058,141   5/2000  Barger et al. ......... 375/240
6,061,054   5/2000  Jolly ................. 345/302
Primary Examiner-Zarni Maung
Assistant Examiner-Bunjob Jaroenchonwanit
Attorney, Agent, or Firm-Fuess & Davidenas
`57
`ABSTRACT
Streaming compressed digital hypervideo received upon a digital communications network is decoded (decompressed) and played in a client-computer-based "video on Web VCR" software system. Scene changes, if not previously marked upstream, are automatically detected, and typically twenty-one past scenes are displayed as thumbnail images. Hyperlinks within the main video scene, and/or any thumbnail image, show as hotspots, with text annotations typically appearing upon a cursor "mouse over". All hyperlinks, as are provided and inserted by, inter alia, the upstream network service provider (the "ISP"), may be, and preferably are, full-custom dynamically resolved to each subscriber/user/viewer ("SUV") upon volitional "click throughs" by the SUV, including retrospectively on past hypervideo scenes as appear within the thumbnail images. Hyperlinking permits (i) retrieving information and commercials, including streaming video/hypervideo, from any of local storage, a network (or Internet) service provider ("ISP"), a network content provider, and/or an advertiser network site, (ii) entering a contest of skill or a lottery of chance, (iii) gambling, (iv) buying (and less often, selling), (v) responding to a survey, and expressing an opinion, and/or (vi) sounding an alert.
`
`23 Claims, 15 Drawing Sheets
`
`
`
`
`
`
[Fig. 1 (Sheet 1 of 15): Network overview. Content providers/producers 3 and live content 2 are delivered by the Internet Service Provider (ISP) onto the network from VOW servers 9 at the ISP premises 5 to subscriber/user/viewer clients SUV 1 through SUV n, each associated with an element 7 (partial).]
`
`
`
[Fig. 2 (Sheet 2 of 15): Channel selection display, including an ad channel and a shopping channel.]
`
`
`
[Fig. 3 (Sheet 3 of 15): Client/server architecture. A video server 9 (VOW server) supplies a video stream 91 and a control stream 11 (http://) to the VOW VCR client 7 at SUVx, which contains a control module and a player module.]
`
`
`
[Fig. 4 (Sheet 4 of 15): Player window showing a hotspot 73 within the displayed video, with reference numerals 72 and 74.]
`
`
`
[Fig. 5 (Sheet 5 of 15): Software component diagram showing, inter alia, channels (Fig. 5b) and a replay window.]
`
`
`
[Fig. 6 (Sheet 6 of 15): User's web page playing hypervideo in the VOW VCR; a click on a hotspot invokes a web page.]
`
`
`
[Fig. 7 (Sheet 7 of 15): User's web page playing hypervideo in the VOW VCR; a click on a hotspot takes the user to another video.]
`
`
`
[Fig. 8 (Sheet 8 of 15): User's web page playing hypervideo in the VOW VCR; a click on a hotspot takes the user to a slide show.]
`
`
`
`6
`
`pmmamas:W09525N_n90one25%5m%a\l:
`
`U
`
`m
`
`28:
`
`0
`
`%woman:
`
`m_n
`
`ab”newmom
`PwonvowNew
`
`
`t5520MQELa
`
`
`
`2.ransom258Gashm82>82>82>2:
`
`In0:mo»ooh
`
`Sn
`
`EQE
`
`850m
`
`55
`
`82>
`
`Emobm
`
`
`
`owns:x005EmobmE65:00
`
`65:00
`
`17
`
`7,M4O5m.E1c:6m2.
`65:00w:
`
`Page 10
`
`Page 10
`
`
`
`
`
[Fig. 10 (Sheet 10 of 15)]
`
`
`
`
[Fig. 11 (Sheet 11 of 15): Flowchart for multiplexing recorded video and audio into packs. Pick up video frame data; if an I or P frame, compute PTS = DTS of first frame + (no. of elapsed frames, in display-order frame number) / frame rate; calculate the clock value of the SCR from the previous SCR and the number of bytes in the pack at the mux rate, and enter it in the pack header; pick up leftover bytes (LB) from the previous picture; pick 2 KB - (LB) bytes from the current picture and store the leftover bytes of the current picture; pick audio bytes such that the audio AAU time corresponds to the video PTS; fill the audio packet; and write pack header + video packet + audio packet.]
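The multiplexing loop of Fig. 11 can be summarized in code. The following Python sketch is an illustration only, not the patent's implementation: the helper names, the mux-rate parameter, the audio pacing rule, and the simplified PTS arithmetic are assumptions; the 2 KB pack size and the leftover-bytes handling follow the figure.

```python
# Minimal sketch of the Fig. 11 pack multiplexer (illustrative only).

PACK_SIZE = 2 * 1024        # "2 KB" video payload per pack, per Fig. 11

def mux_packs(pictures, audio_aaus, frame_rate, mux_rate_bytes_per_s, dts_first_frame):
    """Interleave picture data and audio access units (AAUs) into packs.

    pictures:    list of (frame_type, picture_bytes) in display order
    audio_aaus:  list of (aau_bytes, aau_duration_seconds)
    """
    packs = []
    scr = 0.0               # system clock reference, in seconds
    leftover = b""          # leftover bytes (LB) from the previous picture
    audio_time = 0.0
    audio_iter = iter(audio_aaus)

    for display_order, (frame_type, data) in enumerate(pictures):
        # PTS for I and P frames; a simplified reading of the Fig. 11 expression
        # built from the DTS of the first frame, elapsed frames and frame rate.
        pts = dts_first_frame + display_order / frame_rate if frame_type in ("I", "P") else None

        # Video packet: leftover bytes first, then bytes of the current picture.
        take = PACK_SIZE - len(leftover)
        video_packet = leftover + data[:take]
        leftover = data[take:]                    # store leftover of current picture

        # Advance the SCR by the time this pack occupies at the mux rate.
        scr += len(video_packet) / mux_rate_bytes_per_s

        # Pick audio AAUs until the audio time catches up with the video PTS.
        audio_packet = b""
        while pts is not None and audio_time < pts:
            aau_bytes, aau_duration = next(audio_iter, (b"", 0.0))
            if not aau_duration:
                break
            audio_packet += aau_bytes
            audio_time += aau_duration

        # "Write pack header + video packet + audio packet".
        packs.append({"scr": scr, "pts": pts, "video": video_packet, "audio": audio_packet})
    return packs
```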
`
`
`
[Fig. 12 (Sheet 12 of 15): Scene-change detection. Images are obtained from the video by sampling every 15th frame. Each of image (A) and image (B) is divided into regions, and the R, G, B pixel values (0-255) are fitted into appropriate buckets. Corresponding regions of pictures (A) and (B) are compared for a 60% match; if 5 of 6 regions do not match between image (A) and image (B), there exists a scene change.]
`
`
`
[Fig. 13 (Sheet 13 of 15): Retrospective replay. The user clicks on a scene; the time of the scene is sent to the video/audio buffer unit, which looks up the video and audio offsets in the V/A buffer and sends the data to the mux. The mux writes the data to a file and triggers an event to the browser. The browser activates a second HoTv player with the file name, and the second instance of the player begins playing from the replay point.]
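A rough Python sketch of this replay handoff follows. It is an illustration only: the buffer layout, the temporary-file handling, and the callback names are assumptions made for clarity, not the patent's code.

```python
# Illustrative sketch of the Fig. 13 replay handoff.
import tempfile

class VideoAudioBuffer:
    """Buffer of (time, video_bytes, audio_bytes) entries for recently played scenes."""
    def __init__(self):
        self.entries = []                  # appended as the live stream is played

    def lookup(self, scene_time):
        # Find the buffered video/audio data for the clicked scene time.
        for t, video, audio in self.entries:
            if t >= scene_time:
                return video, audio
        raise KeyError("scene no longer buffered")

def replay_scene(scene_time, buffer, mux, start_second_player):
    """User clicked a thumbnail: mux the buffered data and hand it to a new player."""
    video, audio = buffer.lookup(scene_time)
    muxed = mux(video, audio)                          # a Fig. 11 style multiplexer
    with tempfile.NamedTemporaryFile(delete=False, suffix=".mpg") as f:
        f.write(muxed)
        path = f.name
    start_second_player(path)                          # browser launches the 2nd player
```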
`
`
`
[Fig. 14 (Sheet 14 of 15): Scene-change detection flowchart. Get a bitmap from the renderer filter; reduce the image by a factor of 8 in the X and Y directions; divide the image into 6 rectangular regions. For each pixel, calculate the R, G and B values and increment the appropriate bucket count for R, G and B; if the pixel value is within 5 of a bucket edge, increment the count of both buckets. When all pixels in a region are processed, test whether there is a 60% match in the two corresponding regions of the previous and current image (regions match / regions do not match). When all regions are processed: if 5 of 6 regions show a mismatch, a scene change is declared; otherwise there is no scene change.]
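The following Python sketch illustrates the Fig. 14 procedure. The downscale factor of 8, the six regions, the near-edge double counting, the 60% region match, and the 5-of-6 mismatch test come from the figure; the bucket width of 32 levels, the 2 x 3 region grid, the use of NumPy, and the function names are assumptions made only for illustration.

```python
import numpy as np

BUCKET_WIDTH = 32          # assumed: 256 levels split into 8 buckets per channel
EDGE_SLACK = 5             # Fig. 14: within 5 of a bucket edge counts in both buckets

def six_regions(frame):
    """Downscale by 8 in X and Y, then split into a 2 x 3 grid of regions."""
    small = frame[::8, ::8]                   # crude decimation stands in for scaling
    h, w = small.shape[0] // 2, small.shape[1] // 3
    return [small[r * h:(r + 1) * h, c * w:(c + 1) * w]
            for r in range(2) for c in range(3)]

def region_histogram(region):
    """Bucketed R, G, B histogram of one region (region: H x W x 3 uint8 array)."""
    hist = np.zeros((3, 256 // BUCKET_WIDTH))
    for channel in range(3):
        values = region[:, :, channel].ravel().astype(int)
        for v in values:
            b = v // BUCKET_WIDTH
            hist[channel, b] += 1
            # Near a bucket edge: also credit the neighbouring bucket.
            if v % BUCKET_WIDTH < EDGE_SLACK and b > 0:
                hist[channel, b - 1] += 1
            elif v % BUCKET_WIDTH >= BUCKET_WIDTH - EDGE_SLACK and b < hist.shape[1] - 1:
                hist[channel, b + 1] += 1
    return hist

def regions_match(hist_a, hist_b, threshold=0.60):
    """Fig. 14: a 60% overlap between corresponding region histograms is a match."""
    overlap = np.minimum(hist_a, hist_b).sum()
    return overlap >= threshold * max(hist_a.sum(), hist_b.sum())

def is_scene_change(prev_frame, curr_frame):
    """Return True if 5 of 6 regions mismatch between the two frames."""
    mismatches = 0
    for a, b in zip(six_regions(prev_frame), six_regions(curr_frame)):
        if not regions_match(region_histogram(a), region_histogram(b)):
            mismatches += 1
    return mismatches >= 5
```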
`
`
`
[Fig. 15 (Sheet 15 of 15): Video map structure. For each of Picture 1 through Picture N, the map records a shape ID, the shape's center co-ordinates, and a shape description (rectangle/ellipse/polygon).]
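Rendered as a data structure, the video map of Fig. 15 might look like the following. The class and field names are hypothetical and simply mirror the labels in the figure; they are not taken from the patent.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple

class ShapeKind(Enum):
    RECTANGLE = "rectangle"
    ELLIPSE = "ellipse"
    POLYGON = "polygon"

@dataclass
class HotspotShape:
    shape_id: int
    center: Tuple[float, float]            # shape center co-ordinates
    kind: ShapeKind                        # shape description
    vertices: List[Tuple[float, float]] = field(default_factory=list)  # used for polygons

@dataclass
class PictureEntry:
    picture_number: int                    # Picture 1 .. Picture N
    shapes: List[HotspotShape] = field(default_factory=list)

# A video map is simply the ordered list of per-picture entries.
VideoMap = List[PictureEntry]
```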
`
`
`
`6,154,771
`
`1
`REAL-TIME RECEIPT, DECOMPRESSION
`AND PLAY OF COMPRESSED STREAMING
`VIDEO/HYPERVIDEO; WITH THUMBNAIL
`DISPLAY OF PAST SCENES AND WITH
`REPLAY, HYPERLINKING AND/OR
RECORDING PERMISSIVELY INITIATED
`RETROSPECTIVELY
`
RELATION TO RELATED PATENT APPLICATIONS

The present patent application is related to the following U.S. patent application: Ser. No. 09/054,362 for HYPERLINK RESOLUTION AT AND BY A SPECIAL NETWORK SERVER IN ORDER TO ENABLE DIVERSE SOPHISTICATED HYPERLINKING UPON A DIGITAL NETWORK, U.S. Pat. No. 6,006,265 issued Dec. 21, 1999. Both related applications are to inventors including the inventors of the present application. Both related applications concern servers upon a digital communications network communicating hypervideo, whereas this application concerns a client upon the same network.

Both applications were initially assigned to common assignee Tata America International Corporation, and were later assigned to common assignee HOTV, Inc. The contents of the related patent application are incorporated herein by reference.
`
`BACKGROUND OF THE INVENTION
`
`2
`(vi) the playing back of, and/or (vii) hyperlinking from,
`and/or (viii) recording of, digital Video/hyperVideo, either
`from a current playback position or from the Start of any
`Stored Scene.
`The present invention Still further particularly concerns
`recording and archiving Streaming digital video and hyper
`video.
2. Description of the Prior Art

2.1. Introduction to the Theory of Hypervideo

There is no requirement to read the present Section 2.1 (which section is based on the early investigations and research into hypervideo of Sawhney, et al., as transpired at MIT; reference cited below) in order to understand the function, and, at a crude level, the purpose(s) of the present invention. However, hypervideo is, as of the present time (1998), very new, and few people have experienced it. The present section may accordingly beneficially be read in order to gain a "feel" for hypervideo.

More fundamentally, the present section discusses the considerable power of hypervideo, and ends with a discussion of the empowerment that hypervideo provides to a subscriber/user/viewer. The present and related inventions, although they can be narrowly thought of as mere systems and methods for delivering lowly commercials in the hypervideo environment, are really totally consistent with the more profound, and the more ennobling, purposes of hypervideo. Therefore the present section may also beneficially be read to understand to what purposes, both good and ill, hypervideo may be put, and as background to how the present and related inventions serve these purposes.
In recent years Sawhney, et al., at MIT (reference cited below) have developed an experimental hypermedia prototype called "HyperCafe" as an illustration of a general hypervideo system. This program places the user in a virtual cafe, composed primarily of digital video clips of actors involved in fictional conversations in the cafe; HyperCafe allows the user to follow different conversations, and offers dynamic opportunities of interaction via temporal, spatio-temporal and textual links to present alternative narratives. Textual elements are also present in the form of explanatory text, contradictory subtitles, and intruding narratives. Based on their work with HyperCafe, Sawhney, et al. have been leaders in discussing the necessary components and a framework for hypervideo structures, along with the underlying aesthetic considerations. The following discussion is drawn entirely from their work.

"Hypervideo" can be defined as "digital video and hypertext, offering to its user and author the richness of multiple narratives, even multiple means of structuring narrative (or non-narrative), combining digital video with a polyvocal, linked text." Hypervideo brings the hypertext link to digital video. See Sawhney, Nitin, David Balcom, Ian Smith, "HyperCafe: Narrative and Aesthetic Properties of Hypervideo," Proceedings of the Seventh ACM Conference on Hypertext. New York: Association of Computing Machinery, 1996.

An even earlier approach to hypermedia was proposed by George Landow, in which he offered rules for hypermedia authors, rules that took into account hypermedia's derivation from print media and technologies of writing. Landow proposed that hypermedia "authors" learn which aspects of writing applied to the emerging hypermedium, and which traits or characteristics needed redefinition and rethinking. He noted: "To communicate effectively, hypermedia authors must make use of a range of techniques, suited to their medium, that will enable the reader to process the informa
`35
`
`1. Field of the Invention
`The present and related inventions generally concern (i)
`the machine-automated distribution, processing and network
`communication of streaming digital Video/hypervideo, par
`ticularly upon digital networks having network content
`providers (nominally an “Internet Content Provider”, or
`“ICP”), network service providers (nominally an “Internet
`Service Provider”, or "ISP"), and network client subscribers/
`users/viewers (“client SUVs”). The present and related
`inventions also generally concern the provision of diverse
`Sophisticated responses-including branching, Storage,
`playback/replay, Subscriber/user-specific responses, and
`contests-to SUV "click-throughs' on hyperlinks embedded
`within Streaming digital hyperVideo.
`The present invention itself generally concerns the receipt
`of, the client subscriber-user-viewer (“client SUV") inter
`action with, and the machine processing of, Streaming digital
`Video and hyperVideo.
`The present invention particularly concerns receiving
`(upon digital communications network), decompressing,
`and playing back interactive Video, also known as
`hyperVideo, in real time, including by making manifest to
`the Subscriber/User/Viewer (“SUV") all available imbedded
`hyperVideo linkS.
`The present invention further particularly concerns fol
`55
`lowing in real time any and all hyperlinkS acted upon
`normally by "clicking through' with a computer mouse-by
`the SUV so as to (i) make responses and/or (ii) retrieve
`further information, which further information may include
`the receipt, decompression and playing of further Streaming
`digital Video, and hyperVideo.
`The present invention still further particularly concerns (i)
`caching of digital Video and hypervideo including
`hyperlinks, (ii) detecting Scene changes, (iii) generating
`Scene "keyframes', or thumbnail images, (iv) displaying
`detected Scene changes, and (V) retrospectively initiating the
`recording of, and/or initiating, potentially retrospectively,
`
`40
`
`45
`
`50
`
`60
`
`65
`
`Page 17
`
`
`
`6,154,771
`
`15
`
`25
`
`3
tion presented by this new technology." See Landow, George P., "The Rhetoric of Hypermedia: Some Rules for Authors," Journal of Computing in Higher Education, 1 (1989), pp. 39-64; reprinted in Hypermedia and Literary Studies, ed. by Paul Delany and George P. Landow, Cambridge, Mass.: MIT Press, 1991.

Hypervideo has its roots in both hypertext and film. As a result, hypervideo embodies properties of each field, but wholly can be placed in neither, for hypervideo is not strictly linear motion picture, nor is it strictly hypertext. This convergence known as hypervideo comments on each discipline, on their similarities, and on their differences. Hypervideo is potentially nonlinear, like hypertext, but displays moving images, like film. Hypervideo can signify through montage, like film, and can generate multiple dictions, like hypertext. Properties of each medium are present in hypervideo. These properties take on new forms and practices in hypervideo.

Hypervideo relocates narrative film and video from a linear, fixed environment to one of multivocality; narrative sequences (video clips followed by other video clips) need not subscribe to linearity. Instead of creating a passive viewing subject, hypervideo asks its user to be an agent actively involved in creation of text through choice and interaction. Hypervideo can potentially change the viewing subject from a passive consumer of the text to an active agent who participates in the text, and indeed, is engaged in constructing the text.

Just as hypertext necessitated a re-reading of the act of reading and writing, hypervideo asks for a re-viewing of narrative film and film making and practices of viewing a film. Hypervideo redefines the viewing subject by breaking the frame of the passive screen. Hypervideo users are participants in the creation of text, as hypertext readers are. Research is presently (circa 1997) projected to determine just how users of hypervideo systems navigate, interact with, and experience hypervideo-texts. Just as J. Yellowlees Douglas has exhaustively researched hypertext readers and the act of hypertext reading, similar projects are expected to be undertaken by hypervideo researchers. See Douglas, J. Yellowlees, "Understanding the Act of Reading: the WOE Beginner's Guide to Dissection," Writing on the Edge, 2.2, University of California at Davis, Spring 1991, pp. 112-125. See also Douglas, J. Yellowlees, "How Do I Stop This Thing?: Closure and Indeterminacy in Interactive Narratives," Hyper/Text/Theory, ed. by George P. Landow. Baltimore: The Johns Hopkins University Press, 1994.
Hypervideo is related to film. Hypervideo has the potential to reveal important associations present in a film, and the constructedness of linear filmic narratives, and to this end, would be a beneficial tool for use with film studies education and research. Hypervideo can make available, by way of link opportunities, the different associations and allusions present in a filmic work. These associations are made manifest with hypervideo, made available for the student (or teacher) to see and explore. Relationships between different films can then be tracked, linked, commented on, revealed.

Hypervideo engages the same idea of "processing" that hypertext writing does: in writing hypertext, one makes available the process of writing, representing it visually (in the form of the web the writer builds) and rhetorically (in the linking structure of the work, the points of arrival and departure present in the text), and so one makes apparent the tensions and lines of force present in the act of writing, and the creation or reification of narrative. Writing hypervideo does the same for image-making, that is, makes clear the notion of constructing images and narrative. In the case of hypervideo, "narrative" refers to narrative film making. Just as hypertext has within it the potential to reveal the constructedness of linear writing, and to challenge that structure, hypervideo does the same for narrative film making, while also offering the possibilities for creating rich hypervideo texts, or videotexts.
How does narrative film function in hypervideo? Narrative film is necessarily re-contextualized as part of a network of visual elements, rather than a stand-alone filmic device. Because narrative segments can be encountered out of sequence and (original) context, even strictly linear video clips are given nonlinear properties.

Sergei Eisenstein pioneered the concept and use of montage in film. Hypervideo reveals and foregrounds this use. Eisenstein proposed that a juxtaposition of disparate images through editing formed an idea in the viewer's head. It was Eisenstein's belief that an idea-image, or thesis, when juxtaposed through editing with another, disparate image, or antithesis, produced a synthesis in the viewing subject's mind. In other words, synthesis existed not on film as idea-image, but was a literal product of images to form a separate image-idea that existed solely for the viewer.

Eisenstein deliberately opposed himself to continuity editing, seeking out and exploiting what Hollywood could call "discontinuities." He staged, shot, and cut his films for the maximum collision from shot to shot, sequence to sequence, since he believed that only through being forced to synthesize such conflicts does the viewer participate in a dialectical process. Eisenstein sought to make the collisions and conflicts not only perceptual but also emotional and intellectual. See Bordwell, David and Kristin Thompson, Film Art: An Introduction, Fourth Edition. New York: McGraw-Hill, Inc., 1993.
Hypervideo potentially reveals this thesis/antithesis dialectic, by allowing the user to choose an image-idea (in this case, a video clip), and juxtaposing it with another image-idea (another video clip). Hypervideo allows the user to act on discontinuities and collisions, to engage with colliding subtexts and threads.

The user selects a video clip from a black canvas of three or four clips. Each clip lies motionless on the canvas. The user drags a clip onto another one, and they both start to play. Voices emerge and collide, and once-separate image-ideas now play concurrently, with one image extending the frame of the other. The user is left to determine the relationship between the two (or three or four) video clips.

Such video intersections recall Jim Rosenberg's notion of simultaneities, or the "literal layering on top of one another of language elements." See Rosenberg, Jim, "Navigating Nowhere/Hypertext Infrawhere," ECHT94, ACM SIGLINK Newsletter, December 1994, pp. 6-19. Instead of language elements, video intersections represent the layering of visual elements, or more specifically, visual elements in motion. This is not to say that words, in the case of Rosenberg's Intergrams, are not visual elements; on the contrary, they are. In fact, their image-ness is conveyed with much more clarity (and even urgency) than are non-simultaneous words, or words without an apparent visual significance (save the "transparent" practice of seeing "through" letter-images into words into sentences into concepts). Once the word-images have to contend with their neighbor-layers for foreground screen space, their role in both the practice of signification (where meaning is contingent on what neighborly 0's and 1's are NOT), and as elements of a user interface (words that yield to the touch or click or wave of the mouse), becomes
`35
`
`45
`
`50
`
`55
`
`60
`
`65
`
`Page 18
`
`
`
`6,154,771
`
`15
`
`35
`
`40
`
`25
`
`S
immediate and obvious. Nor is this to say that video clips aren't "language elements"; on the contrary, they are. The hypervideo clip is caught, as are words and letters, in the act of signification and relational meaning-making (. . . what neighborly 0's and 1's are not . . .), mutable to the very touch of the user, to the layers above and below.

The hypervideo author can structure these video intersections in such a way that only X and Y clips can be seen together, or X and Z if Y has already been seen (like StorySpace's guard fields), and so on, and the author can decide if a third video should appear upon the juxtaposition of X and Y. For example, video X is dragged onto video Y and they both start to play. The author can make a choice to then show video Z as a product, or synthesis, of the juxtaposition of videos X and Y, that reflects or reveals the relationship between videos X and Y. This literal revealing of Eisenstein's synthesis is made possible with hypervideo.

Of course, no synthesis need be literally revealed; that can be left up to the viewer. While the interactions are structured by the hypervideo author or authors (as Eisenstein structured the placement and editing of thesis and antithesis idea-images), the meaning-making is left up to the hypervideo user. His or her choice reveals meaning with each video intersection; meaning in the system is neither fixed nor pre-determined. This empowering principle of hypertext is also a property of hypervideo.
2.2. MPEG Standards

2.2.1. Overview

The present invention will be seen to involve computer systems and computer processes dealing with compressed video, and hypervideo, digital data. The video digital data compression may be accomplished in accordance with many known techniques and standards, and is only optionally in accordance with the MPEG family of standards. One short part of the explanation of the invention within this specification will show the operation of the system of the invention in the recording of video that is, by way of example, MPEG-compressed. Accordingly, some slight background as to the MPEG standard is useful, and is provided in this and the following three sections.

The Motion Picture Experts Group (MPEG) is a joint committee of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). The first MPEG standard, known as MPEG-1, was introduced by this committee in 1991. Both video and audio standards were set, with the video standard built around the Standard Image Format (SIF) of 352x240 at 30 frames per second. MPEG data rates are variable, although MPEG-1 was designed to provide VHS video quality at a data rate of 1.2 megabits per second, or 150 KB/sec.
The MPEG-2 standard, adopted in the spring of 1994, is a broadcast standard specifying 720x480 playback at 60 fields per second at data rates ranging from 500 KB/sec to over 2 megabytes (MB) per second.

The expanded name of the MPEG-1 standard is "Coding of Moving Pictures and Associated Audio for Digital Storage Media". The standard covers compression of moving pictures and synchronized audio signals for storage on, and real-time delivery from, CD-ROM. The sponsoring body is ISO/IEC JTC1/SC29 WG11 (also known as the Moving Pictures Expert Group). The standard is set forth in ISO/IEC 11172:1993, Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s.
Characteristics and description of the MPEG-1 standard are as follows. A typical interlaced (PAL) TV image has 576 by 720 pixels of picture information, a picture speed of 25 frames per second, and requires data to be delivered at around 140 Mbit/s. Computer systems typically use even higher quality images, up to 640 by 800 pixels, each with up to 24 bits of color information, and so require up to 12 Mbits per frame, or over 300 Mbit/s. CDs, and other optical storage devices, can only be guaranteed to deliver data at speeds of around 1.5 Mbit/s, so high compression ratios are required to store full-screen moving images on optical devices.
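The computer-image figures quoted above can be checked with simple arithmetic. The short sketch below is an illustration only (a 25 frames-per-second display rate is assumed); it reproduces the roughly 12 Mbit/frame and 300 Mbit/s numbers and shows why compression on the order of 200:1 is needed for CD-ROM delivery at 1.5 Mbit/s.

```python
# Back-of-the-envelope check of the uncompressed data rates quoted above.

computer_bits_per_frame = 640 * 800 * 24               # ~12.3 Mbit per frame, as stated
computer_bits_per_s = computer_bits_per_frame * 25     # ~307 Mbit/s at an assumed 25 frames/s

cd_bits_per_s = 1.5e6                                  # guaranteed optical-device delivery rate
compression_needed = computer_bits_per_s / cd_bits_per_s   # roughly 200:1

print(computer_bits_per_frame / 1e6, computer_bits_per_s / 1e6, round(compression_needed))
```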
The MPEG-1 standard is intended to allow data from non-interlaced video formats having approximately 288 by 352 pixels and picture rates of between 24 and 30 Hz to be displayed directly from a CD-ROM or similar optical storage device, or from a magnetic storage medium, including tape. It is designed to provide a digital equivalent of the popular VHS video tape recording format.

High compression rates are not achievable using standard, intraframe, compression algorithms. MPEG-1 utilizes block-based motion compensation techniques to provide interframe compression. This involves the use of three types of frame encoding: 1) intra coded I-Pictures are coded without reference to other pictures; 2) predictive coded P-Pictures are coded using motion compensation prediction based on preceding I-Pictures or P-Pictures; and 3) bidirectionally-predictive coded B-Pictures use both past and future I-Pictures and P-Pictures as their reference points for motion compensation.

While B-Pictures provide the highest level of compression, they cannot be interpreted until the next I-Picture or P-Picture has been processed to provide the required reference points. This means that frame buffering is required for intermediate B-Pictures. The amount of frame buffering likely to be available at the receiver, the speed at which the intermediate frames can be processed, and the degree of motion within the picture therefore control the level of compression that can be achieved.

MPEG-1 uses a block-based discrete cosine transform (DCT) method with visually weighted quantification and run-length encoding for video compression. MPEG-1 audio signals can be encoded in single channel, dual channel (two independent signals), stereo or joint stereo formats using pulse code modulation (PCM) signals sampled at 32, 44.1 or 48 kHz. A psychoacoustic model is used to control audio signals sent for quantification and coding.
2.2.2. How MPEG Works

Like most video compression schemes, MPEG uses both interframe and intraframe compression to achieve its target data rate. Interframe compression is compression achieved between frames through, essentially, eliminating redundant interframe information. The classic case is the "talking head" shot, such as with a news anchor, where the background remains stable and movement primarily relates to minor face and shoulder movements. Interframe compression techniques store the background information once, and then retain only the data required to describe the minor changes (facial movements, for example) occurring between the frames.

Intraframe compression is compression achieved by eliminating redundant information from within a frame, without reference to other video frames. MPEG uses the Discrete Cosine Transform algorithm, or DCT, as its intraframe compression engine. By and large, however, most of MPEG's power comes from interframe, rather than intraframe, compression.

MPEG uses three kinds of frames during the compression process: 1) intra, or I frames; 2) predicted, or P frames; and
3) bi-directional interpolated, or B frames. Most MPEG encoding schemes use a twelve- to fifteen-frame sequence called a group of pictures, or GOP.

I frames start every GOP, and serve as a reference for the first two B frames and the first P frame. Since the quality of the entire GOP depends upon the quality of its initial I frame, compression is usually very limited in the I frame.

P frames refer back to the immediately preceding P or I frame, whichever is closer. For example, P frame 4 could refer back to I frame 1, with P frame 7 referring back to frame 4. During the encoding process, frame 4 is searched against frame 1 for redundancies, and the data describing those redundancies are essentially discarded. Regions in frame 4 that have changed since frame 1 (called "change regions") are compressed using MPEG's intraframe compression engine, DCT. This combination of interframe and intraframe compression typically generates a higher degree of compression than that achieved with I frames.

MPEG uses yet another compression strategy: B frames refer backwards and forwards to the immediately preceding or succeeding P or I frame. For example, for frame 11, a B frame, the compression scheme would search for redundant information in P frame 10 and the next I frame; once again, redundant information is discarded and change regions are compressed using DCT. This double dose of interframe compression typically generates the highest compression of the three frame types.

All three types of encoders use the same basic GOP scheme defined in the MPEG specification. From a pure compression standpoint, the schemes differ in two key ways: their relative ability to identify interframe redundancies, and whether they can modify GOP placement and order to maximize compressed video quality.
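As a concrete illustration of the GOP structure just described, the short sketch below builds a 15-frame GOP in display order and records which earlier (or later) anchor frame each P and B frame references. The particular IBBP... pattern and the helper names are illustrative assumptions, not taken from the patent.

```python
# Illustrative GOP walk-through: for each frame, list the anchor frames it depends on.

GOP = "IBBPBBPBBPBBPBB"          # assumed display-order frame types for one group of pictures

def reference_frames(gop=GOP):
    refs = {}
    last_anchor = None                        # most recent I or P frame seen so far
    anchors = [i for i, t in enumerate(gop) if t in "IP"]
    for i, frame_type in enumerate(gop):
        if frame_type == "I":
            refs[i] = []                      # intra coded: no references
        elif frame_type == "P":
            refs[i] = [last_anchor]           # refers back to the preceding I or P frame
        else:                                 # B frame: preceding and succeeding I or P frame
            next_anchor = next((a for a in anchors if a > i), None)
            refs[i] = [a for a in (last_anchor, next_anchor) if a is not None]
        if frame_type in "IP":
            last_anchor = i
    return refs

for frame, deps in reference_frames().items():
    print(f"frame {frame:2d} ({GOP[frame]}) references {deps}")
```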
2.3 Practical Problems With Hypervideo

The concept is clear that, once hyperlinks can be inserted into streaming digital video (as they efficiently are by the related inventions) so as to make hypervideo, then a Subscriber/User/Viewer ("SUV") of the streaming digital hypervideo can, by volitionally exercising the hyperlinks, normally by "clicking through" on a link with a comput