UNITED STATES PATENT AND TRADEMARK OFFICE

_________________

BEFORE THE PATENT TRIAL AND APPEAL BOARD

_________________

APPLE INC.,
Petitioner

v.

ZENTIAN LIMITED,
Patent Owner

_________________

Inter Partes Review Case No. IPR2023-00034
U.S. Patent No. 7,979,277

DECLARATION OF CHRISTOPHER SCHMANDT
IN SUPPORT OF PETITION FOR INTER PARTES REVIEW OF
U.S. PATENT NO. 7,979,277
TABLE OF CONTENTS

I.    INTRODUCTION AND QUALIFICATIONS
      A.  EDUCATIONAL BACKGROUND AND PROFESSIONAL EXPERIENCE
II.   METHODOLOGY: MATERIALS CONSIDERED
III.  OVERVIEW OF LEGAL STANDARDS
      A.  PERSON OF ORDINARY SKILL IN THE ART
      B.  OBVIOUSNESS
      C.  ANALOGOUS ART
      D.  CLAIM CONSTRUCTION
          1.  Construction Pursuant to 35 U.S.C. § 112, ¶ 6
IV.   LEVEL OF A PERSON OF ORDINARY SKILL
V.    OVERVIEW OF THE TECHNOLOGY
      A.  SPEECH RECOGNITION
      B.  FEATURE VECTORS
      C.  ACOUSTIC MODELS
      D.  HIDDEN MARKOV MODELS
      E.  DISTANCE CALCULATIONS
      F.  GAUSSIAN DISTRIBUTION AND PROBABILITY
      G.  SPEECH RECOGNITION SYSTEM HARDWARE
      H.  PIPELINING
      I.  INTERRUPTS
      J.  PRIOR ART SPEECH RECOGNITION SYSTEMS
VI.   OVERVIEW OF THE ’277 PATENT
VII.  SUMMARY OF UNPATENTABILITY
VIII. OVERVIEW OF THE PRIOR ART
      A.  OVERVIEW OF JIANG
      B.  OVERVIEW OF BAUMGARTNER
      C.  OVERVIEW OF BROWN
      D.  OVERVIEW OF KAZEROONIAN
      E.  OVERVIEW OF VENSKO
      F.  OVERVIEW OF SMYTH
IX.   OPINIONS REGARDING GROUND 1: CLAIMS 1, 5, 7, 12, AND 14-16 ARE OBVIOUS OVER JIANG, BAUMGARTNER, AND BROWN
      A.  INDEPENDENT CLAIM 1
          1.  Claim 1(Pre): “A speech recognition circuit, comprising”
          2.  Claim 1(a): “an audio front end for calculating a feature vector from an audio signal,”
          3.  Claim 1(b): “wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;”
              a)  The ’277 Patent’s Discussion of “extracted and/or derived quantities”
              b)  Jiang’s Teachings
          4.  Claim 1(c): “a calculating circuit for calculating distances indicating the similarity between a feature vector and a plurality of predetermined acoustic states of an acoustic model; and”
              a)  The ’277 Patent’s Discussion of “distances”
              b)  Jiang’s Teachings
                  (1)  “a calculating circuit”
                  (2)  “a plurality of predetermined acoustic states of an acoustic model”
                  (3)  “a feature vector”
                  (4)  “calculating distances …”
              c)  Baumgartner Teaches a “calculating circuit”
              d)  Motivation to Combine Jiang and Baumgartner’s Teachings
          5.  Claim 1(d): “a search stage for using said calculated distances to identify words within a lexical tree, the lexical tree comprising a model of words;”
              a)  “a lexical tree”
              b)  “the lexical tree comprising a model of words”
              c)  “a search stage for using said calculated distances to identify words within a lexical tree”
          6.  Claim 1(e): “wherein said audio front end and said search stage are implemented using a first processor, and said calculating circuit is implemented using a second processor, and”
              a)  Overview of Mapping
              b)  Baumgartner Teaches an “audio front end,” a “calculating circuit,” and a “search stage”
              c)  Baumgartner Teaches an Audio Front End “implemented using a first processor”
              d)  Baumgartner Teaches a Calculating Circuit “implemented using a second processor”
              e)  Baumgartner Teaches a Search Stage “implemented using a first processor”
              f)  Motivation to Combine Jiang and Baumgartner
          7.  Claim 1(f): “wherein data is pipelined from the front end to the calculating circuit to the search stage.”
              a)  Brown’s Teachings
              b)  Motivation to Combine Brown with Jiang-Baumgartner
      B.  DEPENDENT CLAIM 5
          1.  “A speech recognition circuit as claimed in claim 1, wherein the said calculating circuit is configured to autonomously calculate distances for every acoustic state defined by the acoustic model.”
      C.  DEPENDENT CLAIM 7
          1.  “The speech recognition circuit of claim 1, wherein the feature vector comprises a plurality of spectral components of an audio signal for a predetermined time frame.”
      D.  DEPENDENT CLAIM 12
          1.  “The speech recognition circuit of claim 1, wherein the audio front end is configured to input a digital audio signal.”
      E.  INDEPENDENT CLAIM 14
          1.  Claim 14(Pre): “A speech recognition circuit, comprising:”
          2.  Claim 14(a): “an audio front end for calculating a feature vector from an audio signal,”
          3.  Claim 14(b): “wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;”
          4.  Claim 14(c): “calculating means for calculating a distance indicating the similarity between a feature vector and a predetermined acoustic state of an acoustic model; and”
          5.  Claim 14(d): “a search stage for using said calculated distances to identify words within a lexical tree, the lexical tree comprising a model of words;”
          6.  Claim 14(e): “wherein said audio front end, said calculating means, and said search stage are connected to each other to enable pipelined data flow.”
      F.  INDEPENDENT CLAIM 15
          1.  Claim 15(Pre): “A speech recognition method, comprising:”
          2.  Claim 15(a): “calculating a feature vector from an audio signal using an audio front end,”
          3.  Claim 15(b): “wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;”
          4.  Claim 15(c): “calculating a distance indicating the similarity between a feature vector and a predetermined acoustic state of an acoustic model using a calculating circuit; and”
          5.  Claim 15(d): “using a search stage to identify words within a lexical tree using said calculated distances, the lexical tree comprising a model of words;”
          6.  Claim 15(e): “wherein data is pipelined from the front end, to the calculating circuit, and to the search stage.”
      G.  INDEPENDENT CLAIM 16
          1.  Claim 16(Pre): “A non-transitory storage medium storing processor implementable code for controlling at least one processor to implement a speech recognition method, the code comprising:”
          2.  Claim 16(a): “code for controlling the processor to calculate a feature vector from an audio signal,”
          3.  Claim 16(b): “wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;”
          4.  Claim 16(c): “code for controlling the processor to calculate a distance indicating the similarity between a feature vector and a predetermined acoustic state of an acoustic model; and”
          5.  Claim 16(d): “code for controlling the processor to identify words within a lexical tree using said calculated distances, the lexical tree comprising a model of words,”
          6.  Claim 16(e): “wherein data is pipelined by the processor pursuant to the code from the feature calculation, to the distance calculation, and to the word identification.”
X.    OPINIONS REGARDING GROUND 2: CLAIM 4 IS OBVIOUS OVER JIANG, BAUMGARTNER, BROWN, AND KAZEROONIAN
      A.  DEPENDENT CLAIM 4
          1.  Claim 4: “A speech recognition circuit as claimed in claim 1, wherein the first processor supports multi-threaded operation, and runs the search stage and front ends as separate threads.”
XI.   OPINIONS REGARDING GROUND 3: CLAIMS 9-10 ARE OBVIOUS OVER JIANG, BAUMGARTNER, BROWN, AND VENSKO
      A.  DEPENDENT CLAIM 9
          1.  Claim 9: “The speech recognition circuit of claim 1, wherein the speech accelerator has an interrupt signal to inform the front end that the accelerator is ready to receive a next feature vector from the front end.”
      B.  DEPENDENT CLAIM 10
          1.  Claim 10: “The speech recognition circuit of claim 1, wherein the accelerator signals to the search stage when the distances for a new frame are available in a result memory.”
XII.  OPINIONS REGARDING GROUND 4: CLAIMS 1, 5, 7, 12, AND 14-16 ARE OBVIOUS OVER JIANG, BAUMGARTNER, BROWN, AND SMYTH
      A.  INDEPENDENT CLAIM 1
          1.  Claim 1(Pre): “A speech recognition circuit, comprising:”
          2.  Claim 1(a): “an audio front end for calculating a feature vector from an audio signal,”
          3.  Claim 1(b): “wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;”
          4.  Claim 1(c): “a calculating circuit for calculating distances indicating the similarity between a feature vector and a plurality of predetermined acoustic states of an acoustic model; and”
              a)  “a calculating circuit”
              b)  “a plurality of predetermined acoustic states”
              c)  “[a plurality of predetermined acoustic states] of an acoustic model”
              d)  “for calculating distances …”
              e)  Motivation to Modify Jiang-Baumgartner-Brown with Smyth
          5.  Claim 1(d): “a search stage for using said calculated distances to identify words within a lexical tree, the lexical tree comprising a model of words;”
          6.  Claim 1(e): “wherein said audio front end and said search stage are implemented using a first processor, and said calculating circuit is implemented using a second processor, and”
          7.  Claim 1(f): “wherein data is pipelined from the front end to the calculating circuit to the search stage.”
      B.  DEPENDENT CLAIM 5
          1.  Claim 5: “A speech recognition circuit as claimed in claim 1, wherein the said calculating circuit is configured to autonomously calculate distances for every acoustic state defined by the acoustic model.”
      C.  DEPENDENT CLAIM 7
          1.  Claim 7: “The speech recognition circuit of claim 1, wherein the feature vector comprises a plurality of spectral components of an audio signal for a predetermined time frame.”
      D.  DEPENDENT CLAIM 12
          1.  Claim 12: “The speech recognition circuit of claim 1, wherein the audio front end is configured to input a digital audio signal.”
      E.  INDEPENDENT CLAIM 14
          1.  Claim 14(Pre): “a speech recognition circuit, comprising:”
          2.  Claim 14(a): “an audio front end for calculating a feature vector from an audio signal,”
          3.  Claim 14(b): “wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;”
          4.  Claim 14(c): “calculating means for calculating a distance indicating the similarity between a feature vector and a predetermined acoustic state of an acoustic model; and”
          5.  Claim 14(d): “a search stage for using said calculated distances to identify words within a lexical tree, the lexical tree comprising a model of words;”
          6.  Claim 14(e): “wherein said audio front end, said calculating means, and said search stage are connected to each other to enable pipelined data flow.”
      F.  INDEPENDENT CLAIM 15
          1.  Claim 15(Pre): “A speech recognition method, comprising:”
          2.  Claim 15(a): “calculating a feature vector from an audio signal using an audio front end,”
          3.  Claim 15(b): “wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;”
          4.  Claim 15(c): “calculating a distance indicating the similarity between a feature vector and a predetermined acoustic state of an acoustic model using a calculating circuit; and”
          5.  Claim 15(d): “using a search stage to identify words within a lexical tree using said calculated distances, the lexical tree comprising a model of words;”
          6.  Claim 15(e): “wherein data is pipelined from the front end, to the calculating circuit, and to the search stage.”
      G.  INDEPENDENT CLAIM 16
          1.  Claim 16(Pre): “A non-transitory storage medium storing processor implementable code for controlling at least one processor to implement a speech recognition method, the code comprising:”
          2.  Claim 16(a): “code for controlling the processor to calculate a feature vector from an audio signal,”
          3.  Claim 16(b): “wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;”
          4.  Claim 16(c): “code for controlling the processor to calculate a distance indicating the similarity between a feature vector and a predetermined acoustic state of an acoustic model; and”
          5.  Claim 16(d): “code for controlling the processor to identify words within a lexical tree using said calculated distances, the lexical tree comprising a model of words,”
          6.  Claim 16(e): “wherein data is pipelined by the processor pursuant to the code from the feature calculation, to the distance calculation, and to the word identification.”
XIII. OPINIONS REGARDING GROUND 5: CLAIM 4 IS OBVIOUS OVER JIANG, BAUMGARTNER, BROWN, SMYTH, AND KAZEROONIAN
      A.  DEPENDENT CLAIM 4
          1.  Claim 4: “A speech recognition circuit as claimed in claim 1, wherein the first processor supports multi-threaded operation, and runs the search stage and front ends as separate threads”
XIV.  OPINIONS REGARDING GROUND 6: CLAIMS 9-10 ARE OBVIOUS OVER JIANG, BAUMGARTNER, BROWN, SMYTH, AND VENSKO
      A.  DEPENDENT CLAIM 9
          1.  Claim 9: “The speech recognition circuit of claim 1, wherein the speech accelerator has an interrupt signal to inform the front end that the accelerator is ready to receive a next feature vector from the front end.”
      B.  DEPENDENT CLAIM 10
          1.  Claim 10: “The speech recognition circuit of claim 1, wherein the accelerator signals to the search stage when the distances for a new frame are available in a result memory.”
XV.   CONCLUSION
CLAIM LISTING

Claim 1:

1(Pre) A speech recognition circuit, comprising:

1(a) an audio front end for calculating a feature vector from an audio signal,

1(b) wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;

1(c) a calculating circuit for calculating distances indicating the similarity between a feature vector and a plurality of predetermined acoustic states of an acoustic model; and

1(d) a search stage for using said calculated distances to identify words within a lexical tree, the lexical tree comprising a model of words;

1(e) wherein said audio front end and said search stage are implemented using a first processor, and said calculating circuit is implemented using a second processor, and

1(f) wherein data is pipelined from the front end to the calculating circuit to the search stage.

Claim 4:

A speech recognition circuit as claimed in claim 1, wherein the first processor supports multi-threaded operation, and runs the search stage and front ends as separate threads.
Claim 5:

A speech recognition circuit as claimed in claim 1, wherein the said calculating circuit is configured to autonomously calculate distances for every acoustic state defined by the acoustic model.

Claim 7:

The speech recognition circuit of claim 1, wherein the feature vector comprises a plurality of spectral components of an audio signal for a predetermined time frame.

Claim 9:

The speech recognition circuit of claim 1, wherein the speech accelerator has an interrupt signal to inform the front end that the accelerator is ready to receive a next feature vector from the front end.

Claim 10:

The speech recognition circuit of claim 1, wherein the accelerator signals to the search stage when the distances for a new frame are available in a result memory.

Claim 12:

The speech recognition circuit of claim 1, wherein the audio front end is configured to input a digital audio signal.

Claim 14:

14(Pre) A speech recognition circuit, comprising:
14(a) an audio front end for calculating a feature vector from an audio signal,

14(b) wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;

14(c) calculating means for calculating a distance indicating the similarity between a feature vector and a predetermined acoustic state of an acoustic model; and

14(d) a search stage for using said calculated distances to identify words within a lexical tree, the lexical tree comprising a model of words;

14(e) wherein said audio front end, said calculating means, and said search stage are connected to each other to enable pipelined data flow.

Claim 15:

15(Pre) A speech recognition method, comprising:

15(a) calculating a feature vector from an audio signal using an audio front end,

15(b) wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;

15(c) calculating a distance indicating the similarity between a feature vector and a predetermined acoustic state of an acoustic model using a calculating circuit; and
15(d) using a search stage to identify words within a lexical tree using said calculated distances, the lexical tree comprising a model of words;

15(e) wherein data is pipelined from the front end, to the calculating circuit, and to the search stage.

Claim 16:

16(Pre) A non-transitory storage medium storing processor implementable code for controlling at least one processor to implement a speech recognition method, the code comprising:

16(a) code for controlling the processor to calculate a feature vector from an audio signal,

16(b) wherein the feature vector comprises a plurality of extracted and/or derived quantities from said audio signal during a defined audio time frame;

16(c) code for controlling the processor to calculate a distance indicating the similarity between a feature vector and a predetermined acoustic state of an acoustic model; and

16(d) code for controlling the processor to identify words within a lexical tree using said calculated distances, the lexical tree comprising a model of words,

16(e) wherein data is pipelined by the processor pursuant to the code from the feature calculation, to the distance calculation, and to the word identification.
I, Christopher Schmandt, declare as follows:

I.  INTRODUCTION AND QUALIFICATIONS

1.  I am over the age of 21 and am competent to make this declaration.

A.  Educational Background and Professional Experience
2.  I retired several years ago after a 40-year career at the Massachusetts Institute of Technology (“MIT”); for most of that time I was employed as a Principal Research Scientist at the Media Laboratory. In that role I also served as faculty for the MIT Media Arts and Sciences academic program. I was a founder of the Media Laboratory, a research lab which now spans two buildings.
3.  I received my B.S. degree in Electrical Engineering and Computer Science from MIT in 1978, and my M.S. in Visual Studies (Computer Graphics) also from MIT. I was employed at MIT beginning in 1980, initially at the Architecture Machine Group, which was an early computer graphics and interactive systems research lab. In 1985, I helped found the Media Laboratory and continued to work there until retirement. I was director of a research group titled “Living Mobile.” My research spanned distributed communication and collaborative systems, with an emphasis on multimedia and user interfaces and a strong focus on speech-based systems. I have over 70 published conference and journal papers and one book in the field of speech technology and user interaction.
4.  For the first fifteen years of my career, my research emphasized speech recognition and speech user interfaces. I built the first conversational computer system utilizing speech recognition and synthesis (“Put That There”) starting in 1980. I continued to innovate speech user interfaces using recognition, text-to-speech synthesis, and recorded audio in a wide variety of projects. I built one of the first graphical user interfaces for audio editing, employing keyword recognition on voice memos, in 1982 (Intelligent Ear). I built the first research-grade unified messaging system, which combined text and voice messages into a single inbox, with speech recognition over the phone for remote access and a graphical user interface for desktop access, in 1983 (Phone Slave). Along with my students, we built the first system for real-time spoken driving directions, including speech-accessible maps of Cambridge, Massachusetts, in 1987 (Back Seat Driver). We built some of the earliest speech-based personal assistants for managing messages, calendar, contacts, etc. (Conversational Desktop 1985, Chatter 1993, MailCall 1996). We built quite a few systems employing speech recognition in handheld mobile devices (ComMotion 1999, Nomadic Radio 2000, Impromptu 2001, and Symphony 2004, for example). We applied speech recognition to large bodies of everyday conversations captured with a wearable device and utilized as a memory aid (Memory Prosthesis 2004). We used speech recognition on radio newscasts to build a personalized version of audio newscasts (Synthetic News Radio, 1999) and, a few years earlier, investigated adding speech recognition to a mouse-based window system.
5.  I was later awarded the prestigious Association for Computing Machinery (ACM) Computer Human Interface (CHI) Academy membership specifically for those years of work pioneering speech user interfaces.
6.  In the course of my research, I built a number of speech recognition client/server distributed systems, the first in 1985. Much of the initial motivation for a server architecture was that speech recognition required expensive digital signal processing hardware that we could not afford to put on each computer, so a central server with the required hardware was used. Later versions of the speech recognition server architecture allowed certain computers to perform specialized tasks, serving a number of client computers that provided voice user interfaces either on screens or over telephone connections.
7.  Because of my early work with distributed speech systems, I served for several years in the mid-1990s with a working group on the impact of multimedia systems on the Internet, reporting to the Internet Engineering Task Force (IETF) and later the Internet Activities Board (IAB). This work impacted emerging standards such as Session Initiation Protocol (SIP).
8.  In my faculty position I taught graduate-level courses in speech technology and user interaction design, and directly supervised student research and theses at the Bachelor's, Master's, and PhD levels. I oversaw the Master's and PhD thesis programs for the entire Media Arts and Sciences academic program during my more senior years. I also served on the Media Laboratory intellectual property committee for many years.
II.  METHODOLOGY: MATERIALS CONSIDERED

9.  I have relied upon my education, knowledge, and experience with speech technology and speech recognition systems, as well as the other materials discussed in this declaration, in forming my opinions.

10.  For this work, I have been asked to review U.S. Patent No. 7,979,277 (“the ’277 Patent”) (Ex. 1001), including the specification and claims, and the ’277 Patent’s prosecution history (“’277 File History”) (Ex. 1002). In developing
