United States Patent [19]
Davis et al.

US005969716A
[11] Patent Number: 5,969,716
[45] Date of Patent: Oct. 19, 1999

[54] TIME-BASED MEDIA PROCESSING SYSTEM

[75] Inventors: Marc Davis, San Francisco; David Levitt, Palo Alto, both of Calif.
[73] Assignee: Interval Research Corporation, Calif.

Appl. No.: 08/693,004
Filed: Aug. 6, 1996
Int. Cl.: G06F 17/00; G06F 3/00; G11B 27/10
U.S. Cl.: 345/328; 345/302; 345/967; 386/102; 386/55
Field of Search: 345/328, 302, 327, 967, 349, 356; 386/4, 52, 55, 102; 707/104

References Cited

U.S. PATENT DOCUMENTS
4,914,568   4/1990   Kodosky et al.    345/349
5,099,422   3/1992   Foresman et al.   386/54 X
5,119,474   6/1992   Beitel et al.     345/302
5,177,513   1/1993   Saito             352/129
5,247,666   9/1993   Buckwold          707/100
5,291,587   3/1994   Kodosky et al.    345/349 X
5,301,336   4/1994   Kodosky et al.    345/348
5,359,712   10/1994  Cohen et al.      345/328
5,388,197   2/1995   Rayner            345/328
5,414,808   5/1995   Williams          345/328
5,548,340   8/1996   Bertram           348/559
5,623,587   4/1997   Bulman            345/435
5,659,793   8/1997   Escobar et al.    345/302
5,682,326   10/1997  Klingler et al.   345/302
5,708,767   1/1998   Yeo et al.        345/302 X
5,724,605   3/1998   Wissner           345/302
5,748,956   5/1998   Lafer et al.      707/104
5,760,767   6/1998   Shore et al.      345/328
5,781,188   7/1998   Amiot et al.      345/328
5,861,880   1/1999   Shimizu et al.    345/302
5,889,514   3/1999   Boezeman et al.   345/302
5,892,506   4/1999   Hermanson         345/302

FOREIGN PATENT DOCUMENTS
0564247A1    10/1993  European Pat. Off.
0687109A1    12/1995  European Pat. Off.
0706124A     4/1996   European Pat. Off.
WO93/08664   4/1993   WIPO
WO93/21635   10/1993  WIPO
WO94/16443   7/1994   WIPO
WO96/31829   10/1996  WIPO

OTHER PUBLICATIONS
“Advance bei Matador”, Fernseh- und Kino-Technik, Vol. 48, No. 5, May 1, 1994, Heidelberg, DE, pp. 259-260.
Weitzman et al., “Automatic Presentation of Multimedia Documents Using Relational Grammars”, 1994 ACM Proceedings, Multimedia 94, pp. 443-451, Oct. 1994, San Francisco, California.
Davis, “Media Streams: An Iconic Visual Language for Video Representation”, Proceedings of the 1993 Symposium on Visual Languages, pp. 196-220, 1993.
Adobe After Effects, URL: http://www.adobe.com/prodindex/aftereffects/main.html, http://www.adobe.com/prodindex/aftereffects/details.html#features, 1997.
Cinebase, URL: http://www.cinesoft.com/info/aboutcinebase/index.html, 1997.

Primary Examiner: Raymond J. Bayerl
Attorney, Agent, or Firm: Burns, Doane, Swecker & Mathis, L.L.P.

[57] ABSTRACT
Existing media signals are processed to create new media content by defining content representations for the existing media and establishing functional dependencies between the representations. The content representations constitute different data types which determine the kinds of operations that can be performed and dependencies that can be established. Among the types of transformation that can be achieved are synchronization, substitution, resequencing, temporal compression and dilation, and the creation of parametric special effects. The content representations and their functional dependencies are combined to construct a functional dependency network which causes the desired transformations to occur on input media signals. The inputs to the functional dependency network are parametrically specified by media data types to construct a template that can be used to create adaptive media productions.
`
`17 Claims, 8 Drawing Sheets
`
`
`Akamai Ex. 1009
`Akamai Techs. v. Equil IP Holdings
`IPR2023-00332
`Page 00001

[Drawing Sheet 1 of 8. FIG. 1: block diagram of a computer system, with peripheral blocks labeled MEDIA INPUT, KEYBOARD, CURSOR CONTROL, DISPLAY, PRINTER and NETWORK COMM. FIG. 2A: MEDIA, PARSER, CR. FIG. 2B: CR, PARSER, CR. FIG. 2C: CR, PRODUCER, MEDIA. FIG. 2D: MEDIA, PRODUCER, MEDIA. FIG. 3: MEDIA and CONTENT REPRESENTATIONS, with legible labels PHONES and PROSODY.]

[Drawing Sheet 2 of 8. Block diagrams with blocks labeled MEDIA 1, MEDIA 2, PARSER, CONTENT REPRESENTATION, PRODUCER and MEDIA. The only figure label that survives in the scan is FIG. 5.]

[Drawing Sheet 3 of 8. FIG. 6: system architecture. Legible labels include PREVIEW WINDOW, TIMELINE WINDOW, MEDIA FILES, BROWSE/SEARCH, APPLY FUNCTIONS, FUNCTION LIBRARY, QUERY PALETTE, DATA LIBRARY, FIRST ORDER DATA TYPES, HIGHER ORDER DATA TYPES, USER-DEFINED FUNCTIONS, USER-DEFINED DATA TYPES and ADAPTIVE TEMPLATES.]

[Drawing Sheet 4 of 8. FIG. 7: function palette.]

[Drawing Sheet 5 of 8. No figure label or legible text survives in the scan.]

[Drawing Sheet 6 of 8. FIG. 9; the only other legible text reads "KUNGFUTIMELINE".]

[Drawing Sheet 7 of 8. No figure label or legible text survives in the scan.]

[Drawing Sheet 8 of 8. No figure label or legible text survives in the scan.]

TIME-BASED MEDIA PROCESSING SYSTEM

FIELD OF THE INVENTION
The present invention is directed to the production, transformation, modification, resequencing, and distribution of time-based media signals, such as video and audio signals, and more particularly to a media processing system that is capable of providing reconfigurable, adaptive media productions that can accept, adapt, and/or be adapted to new media signals provided by a user, without requiring high levels of skill on the user's part. These processes are directed to, but not limited to, the motion picture, television, music, audio, and on-line content industries.
`
BACKGROUND OF THE INVENTION
Today's most advanced media processing systems are mechanical, rather than computational, devices. They directly manipulate extents of temporal media in the same manner as the first film editing systems at the dawn of the century, and their users are still required to think that way. In order to understand how even the most advanced media editing systems operate, one can imagine a virtual robot arm manipulating media according to temporal in and out points. A different model of the content being operated upon, and of the operations being performed, could result in different methods of media production and different kinds of media productions. Two historical analogies are illustrative in this connection. The first relates to the invention of manufactured interchangeable parts in the process of gun manufacture in the later part of the 18th century. Before the invention of interchangeable parts, gun manufacture suffered from a lack of standardization and reusability of components. Every part was a unique result of handicraft, rather than a standardized manufactured component. The invention of manufactured interchangeable parts transformed gun production from a pre-industrial to an industrial mode of production. In the later part of the twentieth century, media production methods have yet to achieve the stage of industrialization reached by gun manufacture at the end of the eighteenth century. The current invention aims to alter that situation.
In order for media to be produced by means of the manufacture of interchangeable parts, purely mechanical modes of production are insufficient. Computational media production methods are required, in a manner analogous to the invention in the 1980's of computational production methods in software design which enabled the simple definition, creation, and reuse of software components.
The ability to quickly, simply and iteratively produce new media content is of special interest in contexts where movie making has been historically hampered by lack of skill and resources. In particular, home consumer production of movie content suffers from the lack of the following three capabilities which are needed to meet these objectives:
    easy-to-use yet powerful composition tools
    access to media content which cannot be produced in the home
    tools for producing high-quality soundtracks (including multitrack music, dialogue, narration, and sound effects)
Another limitation associated with current media processing systems is the fact that they are poorly suited for the re-use of pre-existing media content. This is especially the case in situations in which the cost and/or difficulty of creating new media content exceed the cost and/or difficulty of reusing existing media content. For consumers wishing to participate in media productions, access to existing media is of paramount importance given their lack of production skill, financial resources, and media assets. Currently, there is no mechanism by which pre-existing recordings can be efficiently retrieved and combined to present the desired effect.
In summary, there is a need for a time-based media processing system which is capable of providing high-quality, adaptive media productions without requiring a significant level of skill on the part of the user, and is therefore suited for use by the average consumer. The objective of the invention is to enable new efficiencies, methods, and forms in the production and distribution of media content. The invention also aims to satisfy a need for a media-processing system which facilitates the re-use of media content, and indirectly the labor and expertise that created it.
`
SUMMARY OF THE INVENTION
In pursuit of these objectives, the present invention embodies a new paradigm for computational media processing which is comprised of two fundamental components:
    Content Representation (automatically, semi-automatically, and manually generated descriptive data that represent the content of media signals)
    Functional Dependency (functional relationships that operate on content representations and media signals to compute new media content)
The invention combines these two techniques to create time-based media processing systems, which manipulate representations of media content in order to compute new media content. The invention is intended to support a paradigm shift from the direct manipulation of simple temporal representations of media (frames, timecodes, etc.), to the interactive computation of new media from higher level representations of media content and functional dependencies among them. This paradigm of media processing and composition enables the production of traditional media (e.g., movies, television programs, music videos, etc.) to be orders of magnitude faster than current methods. As such, uses of the invention may have fundamental consequences for the current industrial processes of media production, distribution, and reuse. By means of content representation and functional dependency, the current invention creates a production process for computational media components which can determine what they contain, and how they can be processed, adapted, and reused.
In accordance with the present invention, a media signal is processed in a media parser to obtain descriptive representations of its contents. Each content representation is data that provides information about the media signal, and is functionally dependent on the media signal. Depending upon the particular data type of the content representation, different kinds of information can be obtained about the media, and different types of operations can be performed on this information and the media it is functionally dependent upon. Content representations also support inheritance of behavior through directed graph structures (e.g., general to specific) and are composable into new content representations. For example, an audio signal can be parsed to identify its pitch. Higher order parsing can be performed on this content representation to obtain additional information about the media signal, such as its prosody (i.e., its pitch pattern), or in the case of music, its chord structures.
`
Media parsers may operate automatically, semi-automatically, or manually. Automatic media parsers require no human input in order to produce their content representations from their input media signals. Semi-automatic and manual media parsers require human input or manual annotation to produce their content representations.
The information that is obtained from the content representation of a media signal is fed to a media producer which defines a functional relationship between input media signals and content representations, to produce the new media production. For example, the rate of events of a particular song might be used to control the rate at which a video signal is played, so that events in the video are synchronized with events in the song. Alternatively, a soundtrack can be accelerated, decelerated and/or modified to fit it to a video sequence. In another example, the functional relationship can be used to substitute one item of media for another. For instance, original sounds in a soundtrack for a video signal can be replaced by a new set of sounds having similar properties, e.g. durations, which correspond to those of the original sounds. In another example, events in a video or audio signal can be detected and used to modify one or both media signals in a particular manner to create special effects. In yet another example, specific media signals can be triggered in response to the content of another media signal to, for instance, produce an animation which reacts to the semantic content of an incoming stream of media signal with its dependent content representation.
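
The synchronization example above, in which events in a song control the rate at which a video is played, can be illustrated with a minimal sketch. It assumes that both signals have already been parsed into equal-length lists of event times in seconds; the function name `segment_rates` and the sample times are hypothetical and are not taken from the specification.

```python
def segment_rates(video_events, song_events):
    """Return one playback rate per segment between consecutive events.

    Playing each video segment at its rate makes the video event that ends
    the segment coincide with the matching song event.
    """
    if len(video_events) != len(song_events) or len(video_events) < 2:
        raise ValueError("need two equal-length lists of at least two events")
    rates = []
    for i in range(1, len(video_events)):
        video_span = video_events[i] - video_events[i - 1]
        song_span = song_events[i] - song_events[i - 1]
        if video_span <= 0 or song_span <= 0:
            raise ValueError("event times must be strictly increasing")
        # A rate above 1 speeds the video up; a rate below 1 slows it down.
        rates.append(video_span / song_span)
    return rates


# Hypothetical event times: the video events are unevenly spaced, while the
# song has a beat once per second.
video_events = [0.0, 2.0, 3.0, 6.0]
song_events = [0.0, 1.0, 2.0, 3.0]
rates = segment_rates(video_events, song_events)
```

In this sketch the first video segment lasts two seconds and must fit one second of song, so it would be played at twice normal speed.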
In the system of the present invention, the generation of a reconfigurable and adaptive media production is carried out in two major phases. In the first phase, a functional dependency network is built by a person referred to herein as a template builder. The functional dependency network provides a functional structure, or template, which outputs the ultimate media production. To this end, a multiplicity of different media parsers and media producers are employed to respectively process different types of media signals and different data types for the content representations. The functional dependency network is built by combining selected ones of the media parsers and media producers in a manner to process media signals and provide a desired functional relationship between them. During the building phase, a fixed set of media signals are input to the functional dependency network, and the template builder can iteratively vary the parsers and producers to obtain a desired result using this constant set of input signals. In addition, new content representations and new data types can be defined during this phase. Template builders can re-use existing templates in the construction of new ones.
Once the template has been built, one or more inputs to the functional dependency network can be changed from constant input signals to parameters that are defined by their data types. The resulting functional dependency network with parametric input(s) forms an adaptive template that is provided to a template user. In the second phase of the procedure, the template user provides media signals which are of the required data type, to be used as input signals to the functional dependency network. These media signals are processed in accordance with the functions built into the adaptive template to produce a new media production that adapts, and/or adapts to, the template user's input.
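
The two phases described above can be sketched as follows, purely as an illustration. The sketch assumes that a functional dependency network can be modeled as an ordinary function, that data types can be modeled as strings, and that a production can be modeled as a descriptive string; the names `Media`, `AdaptiveTemplate`, `music_video` and the sample media are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Media:
    name: str
    data_type: str  # e.g. "audio" or "video" (assumed labels)


@dataclass
class AdaptiveTemplate:
    network: Callable[..., str]   # stands in for the functional dependency network
    constants: Dict[str, Media]   # inputs the template builder left fixed
    parameters: Dict[str, str]    # parametric inputs: slot name -> required data type

    def produce(self, **user_inputs: Media) -> str:
        # Second phase: the template user supplies media of the required type.
        for slot, required_type in self.parameters.items():
            if slot not in user_inputs:
                raise TypeError(f"missing input for parameter {slot!r}")
            if user_inputs[slot].data_type != required_type:
                raise TypeError(f"parameter {slot!r} requires {required_type}")
        extra = set(user_inputs) - set(self.parameters)
        if extra:
            raise TypeError(f"unexpected inputs: {sorted(extra)}")
        return self.network(**self.constants, **user_inputs)


def music_video(song: Media, footage: Media) -> str:
    # Placeholder for real parsing and producing; it only describes the result.
    return f"{footage.name} synchronized to {song.name}"


# First phase: the builder fixes the song and turns the footage into a parameter.
template = AdaptiveTemplate(
    network=music_video,
    constants={"song": Media("theme_song", "audio")},
    parameters={"footage": "video"},
)
production = template.produce(footage=Media("home_movie", "video"))
```

Checking the data type before running the network reflects the statement that parameters are defined by their data types; a real system would presumably carry far richer type information than a string.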
In an alternative embodiment of the invention, the constant input signals need not be changed to parameters once the functional dependency network has been defined. In this case, a traditional media presentation, i.e. one which is not adaptive, is obtained. However, the ability to produce and alter the media production in an iterative manner provides a greater degree of efficiency and automation than more traditional methods of media production. In addition, the system permits pre-existing media content to be reused in a meaningful way.
As a further feature of the invention, a visual data flow interface is provided to facilitate the selection, combination and construction of media parsers and producers in the building of the functional dependency network. The manipulation of parsers, producers, functions, media signals, data types, and content representations is effected as the template builder selects, drags and connects their iconic representations in a graphical data flow network. The functionality provided by the interface is analogous to the operation of a spreadsheet, in the sense that the network builder can select and place data items, i.e. media signals, in a particular arrangement, and specify functional dependencies between the data items. The interface displays the input signals, intermediate processing results, and final outputs in both a spatial and a temporal manner, to provide ready comprehension of the relationships of the media signals and the content representations in the functional dependency network. This feature allows the network to be constructed in an intuitive manner.
With the capabilities provided by the present invention, data in any particular medium, or combination of media, undergoes parsing and/or annotation, and subsequent functional combination, to construct a template which can produce new media productions. The new media productions may be produced by other template users each providing their own media, or by the template builder, to make multiple productions with similar structures.
The invention enables consumers to produce movie content with high production values without the traditionally high production costs of training, expertise, and time. The invention also enables the creation of a new type of media production which can adapt, and adapt to, new media input. An example of such an adaptive media production is a music video which can incorporate new video without loss of synchronization, or alternatively adapt its video content to new music. From the viewpoint of consumers who desire to see themselves reflected in movies, videos, and television programs, only simple interactive selection, rather than editing, is required to make or see a media production adapted to and/or adapting their own media content.
These features of the invention, as well as the advantages offered thereby, are explained in greater detail hereinafter with reference to specific examples illustrated in the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a general block diagram of a computer system of the type in which the present invention might be implemented;
FIGS. 2A-2D are schematic diagrams of the basic operations that are performed in the context of the present invention;
FIG. 3 is a block diagram of the relationships of different types of content representations;
FIG. 4 is a block diagram of a functional dependency network;
FIG. 5 is a block diagram of an exemplary template;
FIG. 6 is a block diagram of the architecture of a system constructed in accordance with the present invention;
FIG. 7 is an illustration of a function palette;
FIG. 8 is an illustration of a user interface for manipulating an audio/video signal to synchronize its events with the events of another audio signal;
FIG. 9 is an illustration of a user interface for manipulating an audio/video signal to substitute new sounds;
FIG. 10 is an illustration of a user interface for manipulating a video signal to create an auto rumble effect; and
FIG. 11 is an illustration of a user interface for selecting new media signals to produce a new media production from an adaptive template.

DETAILED DESCRIPTION
To facilitate an understanding of the principles and features of the present invention, it is described hereinafter with reference to particular examples of media content and processing. In particular, the analysis and transformation of various video and audio streams are described in the context of simple, readily comprehensible implementations of the invention. It will be appreciated, however, that the practical applications of the principles which underlie the invention are not limited to these specific examples. Rather, the invention will find utility in a wide variety of situations and in connection with numerous different types of media and production contexts.
In general, the present invention is directed to the processing and transformation of various types of media signals, to generate new media content. The particular hardware components of a system in which the following principles might be implemented do not form part of the invention itself. However, an exemplary computer system is briefly described herein to provide a thorough understanding of the manner in which the features of the invention cooperate with the components of such a system to produce the desired results.
Referring to FIG. 1, a computer system includes a computer 10 having a variety of external peripheral devices 12 connected thereto. The computer 10 includes a central processing unit 14 and associated memory. This memory generally includes a main memory which is typically implemented in the form of a random access memory 16, a static memory that can comprise a read only memory 18, and a permanent storage device, such as a magnetic or optical disk 20. The CPU 14 communicates with each of these forms of memory through an internal bus 22. Data pertaining to a variety of media signals can be stored in the permanent storage device 20, and selectively loaded into the RAM 16 as needed for processing.
The peripheral devices 12 include a data entry device such as a keyboard 24, a pointing or cursor control device 26 such as a mouse, trackball, pen or the like, and suitable media input devices 27, such as a microphone and a camera. An A/V display device 28, such as a CRT monitor or an LCD screen, provides a visual display of video and audio information that is being processed within the computer. The display device may also include a set of speakers (not shown) to produce audio sounds generated in the computer. A permanent copy of the media signal can be recorded on a suitable recording mechanism 30, such as a video cassette recorder, or the like. A network communications device 31, such as a modem or a transceiver, provides for communication with other computer systems. Each of these peripheral devices communicates with the CPU 14 by means of one or more input/output ports 32 on the computer.
In the processing of media signals in accordance with the present invention, four fundamental types of operations are performed. Referring to FIG. 2A, one type of operation is to parse an original media signal into a content representation of that signal. The original media signal comprises data which defines the content of the signal. In the case of an audio signal, for example, that data comprises individual samples of the amplitude of an audio pressure wave. In the case of a video signal, that data might be the values of the individual pixels that make up the frames of the signal.
In a first order parser, the original media data is processed, or analyzed, to obtain new data which describes one or more attributes of the original data. The new data, and its corresponding type information, is referred to herein as content representation. For instance, in the case of an audio signal, one type of first order parser can produce output data which describes the fundamental frequency, or pitch, of the signal. A first order parser for video might indicate each time that the video image switches to a different camera shot. Various types of media signals will have associated forms of content representation. For example, a speech signal could be represented by the individual speech components, e.g., phones, which are uttered by the speaker. In this regard, reference is made to U.S. patent application Ser. No. 08/620,949, filed Mar. 25, 1996, for a detailed discussion of the annotation and transformation of media signals in accordance with speech components. Video signals can likewise be analyzed to provide a number of different forms of content representation. In this regard, reference is made to Davis, “Media Streams: Representing Video for Retrieval and Repurposing”, Ph.D. thesis submitted to the Program in Media Arts and Sciences, Massachusetts Institute of Technology, February 1995, particularly at Chapter 4, for a detailed discussion of the content representation of video. The disclosure of this thesis is incorporated herein by reference thereto.
The parsing of a media signal to generate a content representation can be carried out automatically, semi-automatically, or manually. For instance, to manually parse a video signal to identify different camera shots, a human observer can view the video and annotate the frames to identify those in which the camera shot changes. In an automatic approach, each frame can be analyzed to determine its color histogram, and a new shot can be labeled as one in which the histogram changes from one frame to the next by a prespecified threshold value. In a semi-automatic approach, the viewer can manually identify the first few times a new shot occurs, from which the system can determine the appropriate threshold value and thereafter automatically detect the new camera angles.
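
The automatic approach described above can be sketched as follows. This is only an illustration under stated assumptions: frames are modeled as lists of (r, g, b) tuples, each channel is quantized into four levels, and the change between histograms is measured as half the sum of absolute bin differences. None of these choices, nor the names `color_histogram`, `histogram_distance` and `detect_shots` or the default threshold, come from the specification.

```python
from collections import Counter


def color_histogram(frame, levels=4):
    """Return the fraction of pixels falling in each quantized color bin."""
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in frame)
    total = len(frame)
    return {color_bin: n / total for color_bin, n in counts.items()}


def histogram_distance(h1, h2):
    """Half the sum of absolute bin differences: 0 = identical, 1 = disjoint."""
    bins = set(h1) | set(h2)
    return 0.5 * sum(abs(h1.get(b, 0.0) - h2.get(b, 0.0)) for b in bins)


def detect_shots(frames, threshold=0.5):
    """Return indices of frames whose histogram differs from the previous
    frame's histogram by more than the prespecified threshold."""
    if not frames:
        return []
    boundaries = []
    previous = color_histogram(frames[0])
    for index in range(1, len(frames)):
        current = color_histogram(frames[index])
        if histogram_distance(previous, current) > threshold:
            boundaries.append(index)
        previous = current
    return boundaries


# Hypothetical 16-pixel frames: two reddish, two bluish, one reddish.
red = [(250, 10, 10)] * 16
blue = [(10, 10, 250)] * 16
shot_starts = detect_shots([red, red, blue, blue, red])
```

The list of boundary indices returned here plays the role of a first order content representation of the video. In the semi-automatic approach, the threshold would be derived from a few manual annotations rather than fixed in advance; how that is done is not specified, so it is not sketched here.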
Referring to FIG. 2B, in the second fundamental type of operation, a content representation is processed in a second or higher order parser to generate additional forms of content representation. For example, the pitch content representation of an audio signal can be parsed to indicate properties of its prosody, i.e. whether the pitch is rising or falling. In the case of a video signal, a first order content representation might compute the location of a colored object using the color of pixels in a frame, while a second order parser might calculate the velocity of that object from the first order representation. In another video example, higher order parsing of the shot data can produce content representations which identify scene boundaries in a sequence of shots according to continuity of diegetic (i.e. story) time and location. These types of content representation may depend on aspects of human perception which are not readily computable, and therefore manual and/or semi-automatic annotation might be employed.
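
The two computable second order examples above, prosody from pitch and velocity from object location, can be sketched as follows. The sketch assumes that the pitch track is a list of frequencies in hertz, that object locations are per-frame (x, y) pixel coordinates, and that the frame rate is known; the function names and sample values are hypothetical.

```python
def parse_prosody(pitch_hz):
    """Second order parser: label each step of a pitch track."""
    labels = []
    for earlier, later in zip(pitch_hz, pitch_hz[1:]):
        if later > earlier:
            labels.append("rising")
        elif later < earlier:
            labels.append("falling")
        else:
            labels.append("level")
    return labels


def parse_velocity(positions, frame_rate):
    """Second order parser: velocity in pixels per second from the
    per-frame object locations produced by a first order parser."""
    return [
        ((x1 - x0) * frame_rate, (y1 - y0) * frame_rate)
        for (x0, y0), (x1, y1) in zip(positions, positions[1:])
    ]


# Hypothetical first order content representations.
prosody = parse_prosody([100.0, 120.0, 120.0, 90.0])
velocity = parse_velocity([(0, 0), (2, 1), (2, 4)], frame_rate=30)
```

Neither function touches the original audio samples or pixels; each consumes only a lower order content representation, which is what distinguishes it from a first order parser.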
Each different form of content representation employs a data type whose data values are functionally dependent upon the data of the media signal. These data types effectively define a component architecture for all media signals. In this regard, different representations can have a hierarchical or peer-to-peer relationship to one another. Referring to FIG. 3, different content representations produced by first-order parsing of a given media signal have a peer-to-peer relationship. Thus, pitch data and phone data derived from parsing a speech signal are peers of one another. Content representations which are produced by higher order parsers may have a hierarchical relationship to the content representations generated by lower-order parsers, and may have a peer-to-peer relationship to one another. Hence, prosody data is hierarchically dependent on pitch data. The data type inherently defines the types of content representations and media signals that a parser or producer can compute, and in what manner. Based on this information, desired functional dependencies can be established between different content representations and media signals to generate new media content from a template.
Referring to FIG. 2C, a third type of operation is the processing of content representations to produce a new media signal. In this type of operation, the data of the content representation might be an input parameter to a media producer which causes a media signal to be generated. For example, a synthetic media signal may be rendered from its content representation, such as computer animation parameters or MIDI sequences. In the fourth type of operation, depicted in FIG. 2D, a media signal is transformed in accordance with a defined media producer to produce new media signals.
These fundamental operations define two basic types of operators that are employed in the present invention. As used herein, a media parser is an operator which produces content representation as its output data, whether the input data is media data, i.e. a first-order parser, or another form of content representation as in second and higher order parsers. A media producer, on the other hand, is an operator which transforms input data to produce a media signal as its output data.
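
The distinction between the two operator types can be made concrete with a small sketch. It is illustrative only: the classes `MediaSignal` and `ContentRepresentation`, the three operators, and the 0.5 threshold are all hypothetical stand-ins, with audio modeled as a tuple of amplitude samples.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MediaSignal:
    samples: Tuple[float, ...]


@dataclass(frozen=True)
class ContentRepresentation:
    data_type: str   # determines which operators may consume this data
    values: tuple


def loudness_parser(signal: MediaSignal) -> ContentRepresentation:
    """First order parser: media in, content representation out."""
    return ContentRepresentation("loudness", tuple(abs(s) for s in signal.samples))


def peak_parser(loudness: ContentRepresentation, threshold: float = 0.5) -> ContentRepresentation:
    """Second order parser: content representation in, content representation out."""
    if loudness.data_type != "loudness":
        raise TypeError("peak_parser requires a loudness representation")
    peaks = tuple(i for i, v in enumerate(loudness.values) if v >= threshold)
    return ContentRepresentation("peaks", peaks)


def gate_producer(signal: MediaSignal, peaks: ContentRepresentation) -> MediaSignal:
    """Media producer: output is a media signal, here the input silenced
    everywhere except at the sample positions listed in `peaks`."""
    if peaks.data_type != "peaks":
        raise TypeError("gate_producer requires a peaks representation")
    keep = set(peaks.values)
    return MediaSignal(tuple(s if i in keep else 0.0 for i, s in enumerate(signal.samples)))


signal = MediaSignal((0.1, -0.9, 0.2, 0.8))
loudness = loudness_parser(signal)
peaks = peak_parser(loudness)
gated = gate_producer(signal, peaks)
```

Chaining the three calls already forms a very small dependency chain: the output signal depends on the peaks, which depend on the loudness, which depends on the input signal.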
In the context of the present invention, these operators are selectively combined to build a functional dependency network. A simple example of a functional dependency network is illustrated in FIG. 4. Referring thereto, the functional dependency network receives one or more media signals as input signals, and parses these input signals to generate content representations for each. The media signals which are input to the functional dependency network could be retrieved from a storage medium, such as the hard disk 20, or they can be real-time signals. The content representations and media signals are processed in a media producer to generate a new media signal. In the context of the present invention, a multitude of different kinds of transformations can be performed on media signals within the functional dependency network. One example of a media transformation includes synchronization, in which the events in one media signal are synchronized with events in another media signal, e.g. by varying their playback rates. Another type of transformation comprises sound substitution, such as foley in traditional motion picture production, in which one type of sound is substituted for another type of sound in an audio/video signal. A third type of processing is the modification of a media signal in accordance with another media signal, to produce parametric special effects. A fourth type of processing is the triggering of a specific media signal in accord with another media signal to, for example, produce a reactive animation to an incoming stream of media signal with its dependent content representation. For example, an animated character may respond to content representations parsed in real-time from live closed-captioned tex
