`
`
`
`
`
`
`UNITED STATES PATENT AND TRADEMARK OFFICE
`____________
`BEFORE THE PATENT TRIAL AND APPEAL BOARD
` ____________
`UBISOFT ENTERTAINMENT SA,
`Petitioner
`v.
`PRINCETON DIGITAL IMAGE CORPORATION,
`Patent Owner
`____________
`
`Case No. TBD
`Patent No. 5,513,129
` ____________
`
`
`
`
DECLARATION OF STEPHEN T. POPE

`I, Stephen T. Pope, hereby declare the following:
I. BACKGROUND AND EDUCATION
1. I am currently active as chief technology officer of FASTLab and as a multimedia software design/development consultant. I am an expert in the field of interactive computer graphics, and especially audio-controlled virtual objects (see my projects from the years 1988-94).
`
`2.
`
`I have a B.S. degree in Electrical Engineering from Cornell University
`
`in Ithaca, New York (1977), Honors Certificates in Recording Engineering and
`
`Electroacoustic Music from the Vienna Music Academy in Vienna, Austria (1980),
`
`and Honors Certificates in Music Theory and Composition, Form and Analysis,
`
`Music History, and Orchestration from the Academy of Music and the Performing
`
`Arts “Mozarteum” in Salzburg, Austria (1984). Although I discuss my expert
`
`qualifications in more detail below, I also attach as [Appendix A] a recent and
`
`complete curriculum vitae, which details my educational and professional
`
`background and includes a listing of most of my publications.
`
3. My research has concentrated on models and languages for multimedia (sound/image) processing, graphics and user interface software, immersive virtual reality systems, tools for distributed real-time software, and signal analysis and statistical processing for music information retrieval. From 1981 to 1986 I was the systems administrator at the Computer Music Center of the Mozarteum, where I performed software development, UNIX operating system programming, development of systems and applications for artificial intelligence, graphical user interfaces, and teaching activities. From 1982 to 1986 I was also active as a lecturer on the faculty of the Composition Department at the Academy of Music and the Performing Arts "Mozarteum" in Salzburg, Austria.
`
`4.
`
`In the years from 1986 to 1995 I was a postdoctoral research associate
`
`and composer first at the Stanford Center for Computer Research in Music and
`
`Acoustics (CCRMA) in Palo Alto, and then at the Center for New Music and
`
`Audio Technologies (CNMAT) at the University of California, Berkley. My
`
`project work there was in composition and multimedia software development.
`
`5.
`
`From 1988 to 1997, I served as editor-in-chief of Computer Music
`
`Journal, published quarterly by the MIT Press.
`
`6.
`
`During 1990-91, I aided in the development of new performance
`
`interfaces, composition and concert activity and worked as a visiting composer and
`
`researcher at both the STEIM Institute in Amsterdam, Netherlands and the Center
`
`for Art and Media Technology at the University of Utrecht, Netherlands. From
`
`1992-93, I worked at the Swedish Institute for Computer Science as a guest
`
`researcher in the DIVE Virtual Reality group and also as visiting composer at the
`
`Swedish Electronic Music Studio EMS.
`
`7.
`
`From 1995-2010 I worked at the University of California, Santa
`
`Barbara (UCSB), serving as senior continuing lecturer for the Graduate Program of
`
`
`
`3
`
`PETITIONERS EX. 1007 Page 3
`
`
`
`Media Art and Technology (which I co-founded) and as graduate lecturer in the
`
`UCSB Department of Computer Science. At UCSB I developed and taught
`
`required and graduate courses for composers and computer scientists including:
`
`Algorithms for Media Processing, Computing with Media Data, Media Software
`
`Engineering, and multiple courses on Digital Audio Programming.
`
`8.
`
`In 2000 I was the first-ever “Edgard Varese Visiting Professor” in
`
`computer music for the Department of Communication at the Technical University
`
`of Berlin.
`
`9.
`
`I have over 100 publications issued from the early 1980s to the
`
`present on topics related to artificial intelligence, graphics and user interfaces,
`
`virtual reality systems, integrated programming environments, object-oriented
`
`programming, music theory and composition, distributed systems, and digital
`
`multimedia.
`
10. As a result of my experience, I have served on numerous Media Art and Technology Program faculty committees, thesis committees, and a habilitation committee. In addition to teaching at UCSB, I have advised graduate students.
`
`11.
`
`In addition to my academic work, I have extensive industry
`
`experience related to computer software for multimedia applications. From 1972 to
`
`1975 and 1990 to 1993 I worked for Eventide Clockworks, New Jersey on
`
`
`
`4
`
`PETITIONERS EX. 1007 Page 4
`
`
`
`construction, prototyping and custom projects for digital signal processing devices
`
`for audio, and graphical development tools for assembly-language programming.
`
12. At PCS/Cadmus Computers in Munich (1983-86), I was the manager and lead programmer of the artificial intelligence and graphics software teams. I worked on the design/development of C, LISP and Smalltalk-80 software for graphics and window systems, and AI tools and applications. I participated as a group manager and planner in European-funded R&D projects for graphics and AI.
`
`13.
`
`In 1986 I began work as a software developer and team manager at
`
`Xerox Palo Alto Research Center, (later ParcPlace Systems, Inc.), where I was
`
`responsible for kernel code, user interface frameworks and developers tools, as
`
`well as implementing distributed processing environments until 1994.
`
14. Starting in the late 1990s, I have worked, doing business as FASTLab, as a development and management consultant/contractor for teams building multimedia, numerical signal processing, and data networking software.
`
15. From 2010-13 I was Chief Technology Officer of Imagine Research, Inc., delivering "software that listens," meaning audio analysis solutions for sound object recognition, content labeling and segmentation, and applications that profit from intelligent sound/music processing. Imagine Research was acquired by iZotope, Inc. in 2013 and I was bought out.
`
`16.
`
`In sum, I have over 30 years of experience in research and
`
`development of audio in computer graphics systems and applications as a
`
`professor, researcher and consultant. During this time, I have worked extensively
`
`with spatial sound and music processing, graphics and user interfaces, developing
`
`systems for artificial intelligence applications, and generally in software research
`
`and development.
`
17. At the time of the patent at issue in the current proceeding, I had been working for five years in the specific field of state-of-the-art multimedia software for VR systems, with the trend-setting groups at Xerox PARC, Stanford, UC Berkeley and SICS Stockholm.
`
`18.
`
`I am submitting this declaration to offer my independent expert
`
`opinion concerning certain issues raised in the petition for inter partes review
`
`(“Petition”). My compensation is not based on the substance of the opinions
`
`rendered here. As part of my work in connection with this matter, I have studied
`
`U.S. Patent No. 5,513,129 (the “‘129 patent”), including the respective written
`
`descriptions, figures, claims, and portions of the file history. In addition, I have
`
`reviewed the Petition for Inter Partes Review of the ‘129 patent. I have also
`
`carefully considered the following references:
`
• U.S. Patent No. 5,208,413 to Tsumura, et al. ("Tsumura"), entitled "Vocal Display Device," filed on December 5, 1991 and issued on May 4, 1993 [Exhibit 1002]

• W.T. Lytle, "Driving Computer Graphics Animation from a Musical Score," Scientific Excellence in Supercomputing: The IBM 1990 Contest Prize Papers, vol. 2 (The Baldwin Press, The University of Georgia, Athens, Georgia, 1992) [Exhibit 1003]

• U.S. Patent No. 5,048,390 to Adachi, et al. ("Adachi"), entitled "Tone Visualizing Apparatus," filed on September 1, 1988 and issued on September 17, 1991 [Exhibit 1004]

• U.S. Patent No. 5,430,835 to Williams, et al. ("Williams"), entitled "Method and Means for Computer Synchronization of Actions and Sounds," filed on May 26, 1994 as a continuation of Ser. No. 07/656,297 (filed on February 15, 1991), and issued on July 4, 1995 [Exhibit 1005]

• Daniel Thalmann, "Using Virtual Reality Techniques in the Animation Process," published in Proc. Virtual Reality Systems, British Computer Society (1992) [Exhibit 1006]
`
`
`II. OPINION
`A. Level of a Person Having Ordinary Skill in the Art
`19.
`In determining the characteristics of a hypothetical person of ordinary
`
`skill in the art of the ‘129 Patent at the time of the claimed invention, I considered
`
`several factors, including the type of problems encountered in the art, the solutions
`
`to those problems, the rapidity with which innovations are made in the field, the
`
`sophistication of the technology, and the education level of active workers in the
`
`field. I also placed myself back in the time frame of the claimed invention, and
`
`considered colleagues with whom I had worked at that time. In my view, a person
`
`of ordinary skill in the field of audio-controlled virtual objects in 1993 would have
`
`a B.S. in electrical engineering, computer engineering, computer science or related
`
`engineering discipline or equivalent experience and at least two years experience in
`
`
`
`7
`
`PETITIONERS EX. 1007 Page 7
`
`
`
`practical or post-graduate work in the area of computer-generated animations
`
`and/or graphics or equivalent experience or education. The person would also have
`
`some knowledge of media processing and digital audio programming. Based on
`
`my education, training, and professional experience in the field of the claimed
`
`invention, I am familiar with the level and abilities of a person of ordinary skill in
`
`the art at the time of the claimed invention.
`
`20.
`
`I have been informed that Patent Owner contends that a person of
`
`ordinary skill in the art would also have experience with virtual reality systems,
`
`without which one “would not have known how to generate a virtual environment
`
`from audio signals or a control track generated from audio signals.” Ex. 1008,
`
`IPR2014-00155, Paper No. 9 (Patent Owner Preliminary Response) at 41. I
`
`disagree that the definition of “virtual reality” used in the ‘129 patent is so limited,
`
`as discussed below in paragraph 24, among others. Nevertheless, I qualify as one
`
`of ordinary skill in the art even according to Patent Owner’s definition.
`
B. Background of Audio Controlled Virtual Objects
`
21. The filing date of the '129 Patent is July 14, 1993. Virtual reality systems that incorporated the concepts described and claimed in the '129 Patent were well-known in the art before the priority date of the '129 Patent. By the late 1970s, notes and chords were being used to control the color and geometry of shapes on a display. See, e.g., Appendix B, Mitroo, et al., "Movies from Music: Visualizing Musical Compositions" (1979) ("Mitroo") at 218-219. In another example, an audio source could be connected to a television on which an object would be displayed that varied both in shape and color in response to the characteristics of the audio source. See, e.g., Appendix C, U.S. Patent No. 4,081,829 to Brown at 1:31-35, 2:66-3:8, 4:6-14. Thus, as early as the 1970s it was not only a vision to use an audio signal as an input to generate a display – it was a reality. See, e.g., Appendix B, Mitroo at 220.
`
22. This concept evolved in the mid-1980s with the advent of a system called the Music Animation Machine, which generated animated bar-graph scores using data specific to the notes of a song. Music Animation Machine First Demonstration Reel Spring 1990, Stephen Malinowski, 1990, VHS. This technology led to many advancements in audio-controlled visual displays and virtual objects throughout the 1990s and even the 2000s. Similarly, speech recognition has been used in a virtual reality context since the 1980s. Voice commands, optionally augmented with gesture recognition, were used to control the color, size, shape, and even the location of objects on a display. See, e.g., Appendix D, Bolt, "'Put-That-There': Voice and Gesture at the Graphics Interface" (1980) at 262, 265-269. In the 1980s, computer-generated facial expressions were synchronized to speech. See, e.g., Appendix E, Hill, et al., "Animating Speech: An Automated Approach Using Speech Synthesised By Rules" (1988) at 277. To animate the face together with the voice, lip and jaw movements were controlled by pre-programmed parameters. Hill at 283, 285-287.
`
23. The DIVE Virtual Reality system, in whose early development I was involved, was the first sophisticated open-source, cross-platform, distributed virtual reality system; it was state of the art in the early 1990s. In the DIVE system, users were represented by graphical objects and could navigate and interact with other users and applications in the virtual environment using input devices such as a 6D mouse and a head-mounted display. The DIVE system was widely disseminated as open-source software across the world.
`
`F. Virtual Reality According to the Asserted Patent
`
`24.
`
`In the background of the asserted patent, virtual reality is described as
`
`“a computer-simulated environment (intended to be immersive) which includes a
`
`graphic display (from a user’s first person perspective, in a form intended to be
`
`immersive to the user), and optionally also sounds which simulate environmental
`
`sounds.” Ex. 1001, ‘129 Patent at 1:22-28. While the ‘129 Patent describes using a
`
`head-tracking system and/or input devices that interact with the VR systems (e.g.,
`
`instrument gloves with sensors, six-degree-of-freedom trackers, etc.), as was
`
`common for virtual reality systems at the time of the alleged invention, the ‘129
`
`Patent also discloses that neither a head-tracking system nor the various input
`
`devices are required. See e.g., Ex. 1001, ‘129 Patent at 8:18-32, 18:3-8, Claim 2.
`
`
`
`10
`
`PETITIONERS EX. 1007 Page 10
`
`
`
`Further, the ‘129 Patent discloses that a “virtual environment” may be displayed on
`
`a non-stereoscopic, two-dimensional display on a flat screen. ‘129 Patent at 1:34-
`
`35, 8:7-13. Indeed, the embodiments of the asserted patent, for example, pertain to
`
`analyzing music and controlling the virtual environment of a dancer dancing,
`
`displaying cylinders that change height in response to the control track, and
`
`displaying lyrics together with the word vocalized in a song. ‘129 Patent at 12:17-
`
`26; 18:16-37, Fig. 11; 18:38-56. Thus, the exemplary “virtual environments” are
`
`not characterized in terms of the proposed virtual reality system and further do not
`
`require the aspects of the computer-simulated environment set forth in the
`
`background of the ‘129 patent, namely an “inten[t] to be immersive” or a “first-
`
`person perspective.”
`
`G. Thalmann and Williams
`
`25.
`
`I have been asked to consider whether claims 1-6, 8, 9, 12, 13, 15-19
`
`and 21 are obvious over Thalmann in view of Williams. It is my opinion that they
`
`are indeed obvious and that the combination of Thalmann and Williams teaches all
`
`elements of claims 1-6, 8, 9, 12, 13, 15-19 and 21 as set forth in the claim charts
`
`for the combination of Thalmann and Williams in the Petition.
`
26. Thalmann and Williams both describe systems relating to, for example, controlling the display of a computer based on an audio signal. Ex. 1006, Thalmann at 1; Ex. 1005, Williams at Abstract.
`
27. For example, Thalmann specifically describes using audio input as a way of interactively controlling animation, such as facial animation to depict speech of an animated character in a virtual world. Ex. 1006, Thalmann at 4-5. The animated character can be viewed on a stereo display or head-mounted display to immerse users in a computer-generated world. Thalmann at 1.
`
28. Williams similarly describes a system in which a sound recording is analyzed to associate actions with a time in the sound recording. Ex. 1005, Williams at 4:36-63. Specifically, as one example, Williams describes analyzing the frequency, intensity or percussive sounds of a recording to determine whether a predetermined action should be associated with that particular time position of the recording. Williams at 4:36-48. One predetermined action is that the mouth of a character on a screen changes depending on the analysis of the sound recording. Williams at 4:13-27. Williams further describes arm movements, birds flying and candlesticks appearing as animations that can be used with its system. Williams at 4:38-36. These determinations can be performed automatically by a computer program. Williams at 4:46-48.
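
For illustration only, a minimal sketch of how such an automatic association might be implemented follows. This is my own illustration in Python, not code from Williams; the frame size, thresholds, and action names are assumptions. The recording is scanned frame by frame, and when a frame's intensity or dominant frequency crosses a chosen threshold, a predetermined action is associated with that frame's time position:

    import numpy as np

    FRAME = 1024   # samples per analysis frame (assumed)
    RATE = 22050   # sample rate in Hz (assumed)

    def associate_actions(samples):
        """Scan a mono recording (float array) frame by frame and pair
        predetermined actions with time positions, in the spirit of
        Williams' automatic analysis of intensity, frequency and
        percussive sounds."""
        actions = []                       # (time_seconds, action_name) pairs
        prev_rms = 0.0
        for start in range(0, len(samples) - FRAME, FRAME):
            frame = samples[start:start + FRAME]
            t = start / RATE               # time position of this frame
            rms = np.sqrt(np.mean(frame ** 2))            # intensity
            spectrum = np.abs(np.fft.rfft(frame))
            peak_hz = np.argmax(spectrum) * RATE / FRAME  # dominant frequency
            if rms > 4 * prev_rms and rms > 0.1:
                actions.append((t, "mouth_open_wide"))    # percussive attack
            elif peak_hz > 2000 and rms > 0.05:
                actions.append((t, "mouth_open_narrow"))  # bright sustained sound
            prev_rms = rms
        return actions

The particular features tested and the actions chosen here are placeholders; the point is only that a program can walk the recording and attach actions to time positions automatically, as Williams describes.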
`
29. By the time of the purported invention of the '129 Patent, it was well known to those of ordinary skill in the art that a computer system – virtual reality or otherwise – could be used to control production of a virtual environment. This concept is expressly described by Thalmann. Ex. 1006, Thalmann. Thalmann, in my opinion, therefore discloses a method for controlling production of a virtual environment by a virtual reality computer system. In particular, Thalmann describes that new technologies allow for the creation of computer-generated worlds, i.e., virtual environments, that result in an immersive experience for a user. Thalmann at 1. Examples of devices used to create this immersive computer-generated 3-D world included stereo displays, head-mounted displays and audio input devices. Thalmann at 2-4.
`
30. As noted above, one of the ways in which Thalmann describes controlling a virtual world is through audio input. In this manner, an audio signal may be used to control, in real time, an animation, e.g., facial changes of 3-D characters, generated on a display that may be a head-mounted display. Thalmann at 4-6.
`
31. While Thalmann expressly contemplates processing audio input in real time to generate an animated display in a virtual world, it does not expressly describe how the audio signals are to be processed. Williams does.
`
32. For example, Williams discloses various ways to synchronize sound to facial expressions, and also discloses automatically associating a change in facial expression with an audio signal. Williams discloses that the association can be made automatically based on different sound features, such as intensity, frequency, or percussive or fricative sounds. Williams at 4:37-48.
`
33. Upon reading the disclosure of Williams, a skilled artisan would have recognized that modifying Thalmann to be used with the audio processing described in Williams would be desirable in order to achieve Thalmann's stated goal of using audio input to drive animation in a virtual world. This modification would not affect the operation of Thalmann and, in reality, would enhance the user experience, as expressly recognized by Thalmann.
`
34. The combination of Thalmann and Williams is nothing more than using the audio processing technology of Williams to implement the embodiment described in Thalmann of using audio to animate a virtual world. This combination, therefore, simply expands upon the teachings of Thalmann and would have yielded predictable results – 3-D animations in a virtual world that are driven, and controlled, by an audio input – without undue experimentation.
`
35. Combining the audio processing of Williams with the virtual reality system disclosed by Thalmann to present a 3-D virtual world that is controlled, at least in part, by an audio signal would have been natural, and nothing more than the application of ordinary skill and common sense.
`
36. Accordingly, it is my opinion that it would have been obvious to a person having ordinary skill in the art to combine Thalmann with Williams. This combination could have been accomplished using known methods in the art and would have yielded predictable results. The combination of Thalmann and Williams, therefore, in my opinion, renders obvious claims 1-6, 8, 9, 12, 13, 15-19 and 21 of the '129 Patent.
`
`H. Tsumura and Williams
`
`37.
`
`I have been asked to consider whether claims 16-20 are obvious over
`
`Tsumura in view of Williams. It is my opinion that they are indeed obvious and
`
`that the combination of Tsumura and Williams teaches all elements of claims 16-
`
`20 as set forth in the claim charts for the combination of Tsumura and Williams in
`
`the Petition.
`
38. Tsumura and Williams both describe systems relating to, for example, controlling the display of a computer based on data relating to a sound recording. Ex. 1002, Tsumura at Abstract; Ex. 1005, Williams at Abstract.
`
39. For example, Tsumura specifically describes correlating vocal data, such as strength and pitch, with lyric position data. Ex. 1002, Tsumura at 1:27-38, 2:40-3:39. The correlation between the data is stored in a memory and is subsequently displayed on a screen. Tsumura at 1:38-47, 3:40-5:2. Essentially, Tsumura presents a karaoke device that depicts not only the lyrics to be sung, but also the pitch at which the lyrics are to be sung. Tsumura also discloses detecting the strength and basic frequency of an actual voice, which are compared to the vocal data in the memory. The result of that comparison is displayed on a screen. Tsumura at 1:48-61, 8:16-10:5. Tsumura specifically discloses a frequency analyzer for determining the basic frequency of a user's vocal performance. Tsumura at 8:30-63.
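
To illustrate the kind of comparison Tsumura describes, the following sketch shows stored vocal data for a lyric position being compared against a singer's detected pitch and strength to produce display prompts. It is my own illustration in Python; the field names, thresholds, and messages are assumptions, not Tsumura's disclosure:

    import math
    from dataclasses import dataclass

    @dataclass
    class VocalDatum:
        lyric: str        # syllable to be sung at this lyric position
        pitch_hz: float   # prerecorded target fundamental frequency
        strength: float   # prerecorded target loudness, 0.0-1.0

    def prompt_messages(stored, live_pitch_hz, live_strength):
        """Compare detected pitch/strength against the stored vocal data
        and return prompting messages for the display (my illustration)."""
        messages = []
        # offset from the target pitch in cents (100 cents = one semitone)
        cents = 1200 * math.log2(live_pitch_hz / stored.pitch_hz)
        if cents < -50:
            messages.append("sing higher")
        elif cents > 50:
            messages.append("sing lower")
        if live_strength < 0.5 * stored.strength:
            messages.append("sing louder")
        return messages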
`
40. As described above, Williams similarly describes a system in which a sound recording is analyzed to associate actions with a specific time in the sound recording. Ex. 1005, Williams at 4:36-63. Specifically, as one example, Williams describes analyzing the frequency, intensity or percussive sounds of a recording to determine whether a predetermined action should be associated with that particular time position of the recording. Williams at 4:36-48. One predetermined action is that the mouth of a character on a screen changes depending on the analysis of the sound recording. Williams at 4:13-27. These determinations can be performed automatically by a computer program or manually by a programmer. Williams at 4:37-48. The associations of predetermined actions and sounds are synchronized and stored, i.e., prerecorded, on a memory device that can be accessed and displayed by a computer system. Williams at 5:29-33, 7:17-24.
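
A prerecorded control track of this kind can be modeled very simply. The sketch below is mine, not code from Williams; Williams speaks of storing the associations on a memory device, not of any particular format, so the file format and record fields here are assumptions. It stores time-stamped action records and retrieves the actions that fall due as playback advances:

    import json

    def store_control_track(path, associations):
        """Write (time_seconds, action_name) pairs, e.g. from an analysis
        pass over the recording, as a prerecorded control track."""
        records = [{"t": t, "action": a} for t, a in sorted(associations)]
        with open(path, "w") as f:
            json.dump(records, f)

    def actions_due(records, playback_time, last_time):
        """Return actions whose time position falls in (last_time,
        playback_time], keeping the display in sync with the sound."""
        return [r["action"] for r in records
                if last_time < r["t"] <= playback_time]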
`
41. By the time of the purported invention of the '129 Patent, it was well known to those of ordinary skill in the art that a computer system – virtual reality or otherwise – could be used to control production of a virtual environment. This concept is expressly described by Tsumura. Ex. 1002, Tsumura.
`
42. As noted above, one of the ways in which Tsumura describes controlling a virtual world is through audio input, including a voice. In this manner, an audio signal representing a user's voice may be used to control the output of a display, e.g., prompting messages relating to pitch. Tsumura at 12:35-13:10. To accomplish this, Tsumura uses a frequency analyzer to determine the basic frequency of a live music signal. Tsumura at 12:35-13:10.
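
For illustration, a frequency analyzer of this kind can be approximated in a few lines. The sketch below is my own, not Tsumura's circuit, and the use of autocorrelation, the frame size, and the search range are my assumptions; it estimates the basic (fundamental) frequency of one frame of a live signal:

    import numpy as np

    def basic_frequency(frame, rate=22050, lo_hz=80, hi_hz=1000):
        """Return the strongest periodicity in a frame (e.g. 1024 samples)
        as an estimate of the basic frequency, in Hz."""
        frame = frame - np.mean(frame)      # remove any DC offset
        ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
        lo_lag = int(rate / hi_hz)          # shortest period considered
        hi_lag = int(rate / lo_hz)          # longest period considered
        lag = lo_lag + int(np.argmax(ac[lo_lag:hi_lag]))
        return rate / lag

An estimate produced this way could then be compared against the prerecorded vocal data at the current lyric position, as in the comparison sketch following paragraph 39 above.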
`
43. While Tsumura expressly contemplates storing vocal data and lyric data in a memory, it does not describe using the frequency analyzer for prerecording a control track containing this data. However, Williams does, and it would have been obvious to combine the two.
`
44. For example, Williams discloses various ways to synchronize sound to facial expressions, and also discloses automatically associating a change in facial expression with different sound features, such as intensity, frequency, or percussive or fricative sounds. Williams at 4:37-48. Williams also specifically discloses that this association between frequency, for example, and a predetermined action, such as a change in facial features, can be stored on a memory device. Williams at 5:29-33, 7:18-24.
`
45. Upon reading the disclosure of Williams, a skilled artisan would have recognized that modifying Tsumura to use its frequency analyzer for prerecording vocal data at a current lyric position, as described in Williams, would be desirable.
`
46. The combination of Tsumura and Williams is nothing more than using the means for prerecording a control track in Williams to implement the system described in Tsumura for providing vocal and lyric data to a computer for display. One of ordinary skill would have achieved this modification by merely using the frequency analyzer of Tsumura to prerecord the frequency data at a lyric position. This combination, therefore, simply expands upon the teachings of Tsumura and would have yielded predictable results – 3-D animations in a virtual world that are driven, and controlled, by a control track – without undue experimentation.
`
47. Combining the use of frequency data for prerecording a control track in Williams with the virtual reality system disclosed by Tsumura to present a 3-D virtual world that is controlled, at least in part, by a control track having music information would have been natural, and nothing more than the application of ordinary skill and common sense.
`
48. Accordingly, it is my opinion that it would have been obvious to a person having ordinary skill in the art to combine Tsumura with Williams. This combination could have been accomplished using known methods in the art and would have yielded predictable results. The combination of Tsumura and Williams, therefore, in my opinion, renders obvious claims 16-20 of the '129 Patent.
`
I. Lytle and Adachi
`
`49.
`
`I have been asked to consider whether claims 1, 8, 12, 13, 15 and 21
`
`are obvious over Lytle in view of Adachi. It is my opinion that they are indeed
`
`obvious.
`
`50.
`
`It is my opinion that the combination of Lytle and Adachi teaches all
`
`elements of claims 1, 8, 12, 13, 15 and 21 as set forth in the claim charts for the
`
`combination of Adachi and Lytle in the Petition.
`
51. Lytle and Adachi both describe systems relating to controlling a computer based on music or audio signals. Ex. 1004, Adachi at Abstract; Ex. 1003, Lytle at 644.
`
`52.
`
`In particular, Lytle describes a method for algorithmically controlling
`
`animations in a three-dimensional virtual world from an original musical score. Ex.
`
`1003, Lytle at Abstract. Lytle had access to the MIDI data for his musical score,
`
`having been generated from synthesizers connected to a personal computer, and
`
`used this to animate three-dimensional objects, such as musical instruments. Lytle
`
`at 646, 649-650. Thus, it was not necessary for Lytle to process an audio signal to
`
`extract data indicative of the audio signal. However, Lytle recognized that using
`
`musical sound was possible if the sound were analyzed and data was extracted, but
`
`it was not required by his system. For example, Lytle recognized that non-
`
`electrical instruments could be used in his system because other methods of
`
`encoding musical data existed at the time, including translating pitch to MIDI.
`
`
`
`19
`
`PETITIONERS EX. 1007 Page 19
`
`
`
`Lytle at 651, 667. As he envisioned an “integrated development process” in which
`
`the animated work was co-composed with the music, it would only be natural to
`
`extend his system to one in which musical sound is analyzed and used to populate
`
`and automatically control the actions of the three-dimensional animated objects.
`
`Lytle at 667. All that would be needed, therefore, is to couple a method for
`
`extracting data from music with the system of Lytle.
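
To make concrete what driving animation from score data involves, the following sketch maps MIDI note events to animation parameters of three-dimensional objects. It is my own illustration, loosely in the spirit of Lytle's score-driven animation; the event format, the channel-to-object mapping, and the parameter choices are assumptions:

    def animate_from_midi(events):
        """events: (time_s, channel, note, velocity) note-on tuples.
        Returns keyframes: (time_s, object_name, parameter, value)."""
        instruments = {0: "drum", 1: "flute", 2: "bass"}   # channel -> object
        keyframes = []
        for t, channel, note, velocity in events:
            obj = instruments.get(channel, "unknown")
            # note number drives a geometric parameter; velocity drives
            # the amplitude of the animated strike
            keyframes.append((t, obj, "height", (note - 36) / 48.0))
            keyframes.append((t, obj, "strike", velocity / 127.0))
        return keyframes

Replacing the MIDI event list with data extracted from an audio signal, as described next in connection with Adachi, is the kind of coupling contemplated here.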
`
53. Adachi similarly describes generating a three-dimensional display, including a stereoscopic display, based on the characteristics of an audio signal. Ex. 1004, Adachi at 5:22-64, 12:21-48, Fig. 10. In Adachi, an audio signal representative of a musical tone is input to an envelope detecting circuit, which detects the musical tone parameter and produces an envelope signal that corresponds to the audio signal. Adachi at 5:22-27. The musical tone parameter can include tone color, tone volume or frequency. Adachi at 5:65-6:2. The envelope signal is converted to a digital signal and supplied to a CPU that, in turn, supplies the signal to a display circuit. Adachi at 5:28-33. The display circuit displays an image, such as a bicycle, train or band, and in the case of a three-dimensional display modifies the scale of the image in response to the amplitude of the envelope signal. Adachi at 5:30-64. One of the various types of audio signals described by Adachi is that produced by a musical instrument such as a piano or guitar. Adachi at 12:21-48, Fig. 10. In one variation of Adachi, the audio signal is supplied to a Fast Fourier circuit, which transforms the audio signal into spectrum signals from which a CPU extracts a signal of the fundamental wave component having a certain frequency. Adachi at 8:54-9:5. Any changes in the frequency of the extracted signal will modify an image being displayed, such as by changing its color or outline characteristics. Adachi at 9:20-46.
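
For illustration, both Adachi-style analyses described above can be sketched briefly. This is my own Python illustration, not Adachi's circuitry; the sample rate, smoothing cutoff, and windowing are assumptions. The first function rectifies and smooths the signal into an envelope, whose amplitude could scale a displayed image; the second picks the strongest spectral peak as the fundamental component, whose changes could alter the image's color or outline:

    import numpy as np

    def envelope(samples, rate=22050, cutoff_hz=10.0):
        """Rectify and low-pass the signal, analogous to an envelope
        detecting circuit."""
        alpha = 1.0 - np.exp(-2 * np.pi * cutoff_hz / rate)
        env = 0.0
        out = np.empty(len(samples))
        for i, s in enumerate(np.abs(samples)):  # rectification
            env += alpha * (s - env)             # one-pole smoothing
            out[i] = env
        return out

    def fundamental_hz(frame, rate=22050):
        """Pick the strongest spectral peak as the fundamental wave
        component, analogous to an FFT circuit plus CPU extraction."""
        spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
        return np.argmax(spectrum) * rate / len(frame)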
`
54. Adachi is similar to Lytle in that it describes the use of an electrical MIDI instrument as well as non-electrical MIDI music sources. Adachi at 12:21-34, Fig. 10. As discussed above, both Lytle and Adachi also describe animating three-dimensional objects to correlate with an input audio signal.
`
55. By the time of the purported invention of the '129 Patent, it was well known to those of ordinary skill in the art that animated, three-dimensional objects could be controlled by musical data. This concept is described in detail in Lytle and is evident from the graphic below, from the same reference, which depicts a still from an animation of three-dimensional instruments that are controlled by music data.

[Graphic: still frame from Lytle's animation of three-dimensional instruments driven by music data]

Ex. 1003, Lytle at 672.
`
56. Upon reading the disclosure of Adachi, a skilled artisan would have recognized that modifying Lytle to include processing a music signal to extract data would not affect the operation of Lytle, and would in fact accomplish something contemplated by, though not required by, the system of Lytle.
`
57. As already mentioned, Lytle recognized that a direct music signal could be used to drive the animations in his system, but he did not expressly describe such a technique. Thus, the combination of Lytle and Adachi is nothing more than adding the step of processing a direct music signal and using the resulting data to operate the virtual environment of Lytle.
`
58. This modification to Lytle would have been natural, given the express suggestion in Lytle that this could be done, and would have been nothing more than the application of ordinary skill and common sense to combine the processing methods of Adachi with the system disclosed in Lytle.
`
59. Accordingly, it is my opinion that it would have been obvious to a person having ordinary skill in the art to combine the processing of a music signal in Adachi with the system of Lytle to provide a virtual world having three-dimensional, animated objects that are controlled by and correlated with an audio or music signal. This combination could have been accomplished using known methods in the art, as recognized by Lytle, and would have yielded predictable results. The combination of Lytle and Adachi, therefore, in my opinion, renders obvious claims 1, 8, 12, 13, 15 and 21 of the '129 Patent.
`
`III. CONCLUSION
`60.
`I declare under penalty of perjury that the above statements are