`
`
`
`
`
`
`
`
`
` UNITED STATES PATENT AND TRADEMARK OFFICE
`
`
`
`
`
`
`BEFORE THE PATENT TRIAL AND APPEAL BOARD
`
`
`
`
`
`
`APPLE INC.,
`Petitioner,
`
`v.
`
`Zentian Limited
`Patent Owner.
`____________________
`
`Case IPR2023-00037
`Patent No. 10,971,140
`____________________
`
`
`
`DECLARATION OF DAVID ANDERSON, Ph.D. IN SUPPORT OF
`PATENT OWNER’S RESPONSE
`
`
`
`
`
`
`
`
`
`
`
`
`
`I, David Anderson, Ph.D., do hereby declare as follows:
`
`I. Introduction
`
`A. Background and qualifications
`
`1. I am a professor in the School of Electrical and Computer Engineering
`
`at the Georgia Institute of Technology (“Georgia Tech”) in Atlanta, Georgia. I
`
`have been a professor at Georgia Tech since 1999. In 2009 I served as a visiting
`
`professor in the Department of Computer Science at Korea University in Seoul,
`
`South Korea.
`
`2. My full qualifications, including my professional experience and
`
`education, can be found in my Curriculum Vitae, which includes a complete list of
`
`my publications, and is attached as Ex. A to this declaration.
`
`3. I received my Ph.D. in Electrical and Computer Engineering from
`
`Georgia Tech in 1999. I received my B.S. and M.S. in Electrical Engineering from
`
`Brigham Young University in 1993 and 1994, respectively.
`
`4. In my employment prior to Georgia Tech as well as in my subsequent
`
`studies and research, I have worked extensively in areas related to the research,
`
`design, and implementation of speech and audio processing systems. I have also
`
`taught graduate and undergraduate level courses at Georgia Tech on the
`
`implementation of signal processing and embedded systems. For example, I have
`
`taught courses on statistical machine learning, machine learning for speech, pattern
`
`recognition, multimedia processing and systems, software design, computer
`
`architecture, real-time signal processing systems, and applications of signal
`
`processing (covering topics in audio processing and speech recognition). I have
`
`also designed and taught a course on signal processing in the context of human
`
`perception. These courses and my research have covered many topics relevant to
`
`the subject matter of the ’140 patent and the prior art cited therein.
`
`5. I have served as principal investigator or co-principal investigator in
`
`numerous multi-disciplinary research projects including “Blind Source Separation
`
`for Audio,” “Audio Classification,” “Auditory Scene Analysis,” “Hearing Aid
`
`Audio Processing,” “Speaker Driver Sound Enhancement,” “I-Vector Based Voice
`
`Quality,” “Analysis of Voice Exercise Using Signal Processing,” and “Smart
`
`Homes for Effective and Safe Remote Work During a Pandemic and Beyond.”
`
`6. I also have extensive experience with the practical implementation of
`
`signal processing algorithms, information theory, signal detection, and related
`
`topics through my research and consulting. I have published over 200 book
`
`chapters and papers in reviewed journals and conferences. Topics include those
`
`such as “Speech recognition using filter bank features,” “Speaker adaptation using
`
`speaker similarity score on DNN features,” “Segmentation based speech
`
`enhancement using auxiliary sensors,” “A framework for estimation of clean
`
`speech by fusion of outputs from multiple speech enhancement systems,”
`
`“Distributed acquisition and processing systems for speech and audio,” “A missing
`
`data-based feature fusion strategy for noise-robust automatic speech recognition
`
`using noisy sensors,” “Learning distances to improve phoneme classification,”
`
`“Identification of voice quality variation using i-vectors,” “Varying time-constants
`
`and gain adaptation in feature extraction for speech processing,” “Low bit-rate
`
`coding of speech in harsh conditions using non-acoustic auxiliary devices,”
`
`“Speech analysis and coding using a multi-resolution sinusoidal transform,”
`
`“Biologically inspired auditory sensing system interfaces on a chip,” “Cascade
`
`classifiers for audio classification,” and “Single acoustic channel speech
`
`enhancement based on glottal correlation using non-acoustic sensors.” I have also
`
`contributed book chapters for treatises such as “Independent Component Analysis
`
`for Audio and Biosignal Applications,” and written a book on Fixed-Point Signal
`
`Processing which is related to the practical implementation of systems for
`
`processing sound and other signals.
`
`7. I am a named inventor on eight patents, including “Speech activity
`
`detector for use in noise reduction system, and methods therefor” (U.S. Patent No.
`
`6,351,731), and “Analog audio signal enhancement system using a noise
`
`suppression algorithm” (U.S. Patent No. 7,590,250).
`
`8. I am a Senior Member of the Institute of Electrical and Electronics
`
`Engineers (“IEEE”) and have been a Member since 1991. I am also a Member of
`
`the IEEE Signal Processing Society. From 1994 to 2016, I was also a member of
`
`the Acoustical Society of America. In 2003, I served as the Co-Chair for the NSF
`
`Symposium on Next Generation Automatic Speech Recognition. In 2004, I
`
`received the Presidential Early Career Award for Scientists and Engineers,
`
`presented by then-President George W. Bush, for my work on ultra-low-power
`
`signal processing system design.
`
`B. Engagement
`
`9. I have been retained by Patent Owner Zentian Limited (“Zentian” or
`
`“Patent Owner”) to provide my opinions with respect to Zentian’s Response to the
`
`Petition in Inter Partes Review proceeding IPR2023-00037, with respect to U.S. Patent No.
`
`10,971,140. I am being compensated for my time spent on this matter. I have no
`
`interest in the outcome of this proceeding and the payment of my fees is in no way
`
`contingent on my providing any particular opinions.
`
`10. As part of this engagement, I have also been asked to provide my
`
`technical review, analysis, insights, and opinions regarding the materials cited and
`
`relied upon by the Petition, including the prior art references and the supporting
`
`Declaration of Mr. Schmandt.
`
`11. The statements made herein are based on my own knowledge and
`
`
`
`opinions.
`
`C. Materials considered
`12. In the course of preparing my opinions, I have reviewed and am familiar
`
`with the ’140 patent, including its written description, figures, and claims. I have
`
`also reviewed and am familiar with the Petition in this proceeding, the supporting
`
`Declaration of Mr. Schmandt, and the relied upon prior art, including Jiang and
`
`Chen. I have also reviewed the materials cited in this declaration. My opinions are
`
`based on my review of these materials as well as my more than two decades of experience,
`
`research, and education in the field of art.
`
`II. Relevant legal standards
`13. I am not an attorney. I offer no opinions on the law. But counsel has
`
`informed me of the following legal standards relevant to my analysis here. I have
`
`applied these standards in arriving at my conclusions.
`
`A. Person of ordinary skill in the art
`
`14. I understand that an analysis of the claims of a patent in view of prior
`
`art has to be provided from the perspective of a person having ordinary skill in the
`
`art at the time of invention of the ’140 patent. I understand that I should consider
`
`factors such as the educational level and years of experience of those working in the
`
`pertinent art; the types of problems encountered in the art; the teachings of the prior
`
`
`
`art; patents and publications of other persons or companies; and the sophistication
`
`of the technology. I understand that the person of ordinary skill in the art is not a
`
`specific real individual, but rather a hypothetical individual having the qualities
`
`reflected by the factors discussed above.
`
`15. I understand that the Petition applies a priority date of February 4, 2002,
`
`for the challenged claims, Pet. 5, and I apply the same date.
`
`16. I further understand that the Petition defines the person of ordinary skill
`
`in the art at the time of the invention as having had a master’s degree in computer
`
`engineering, computer science, electrical engineering, or a related field, with at least
`
`two years of experience in the field of speech recognition, or a bachelor’s degree in
`
`the same fields with at least four years of experience in the field of speech
`
`recognition. The Petition adds that further education or experience might substitute
`
`for the above requirements. I do not dispute the Petition’s assumptions at this time,
`
`and my opinions are rendered on the basis of the same definition of the ordinary
`
`artisan set forth in the Petition, but I note that obviousness must be viewed from the
`
`perspective of one of ordinary skill in the field of speech recognition, and my
`
`opinions are rendered on the basis of the definition of the ordinary artisan set forth
`
`in the Petition in view of that further clarification. Throughout my declaration, my
`
`statements as to the POSA’s capabilities and knowledge are based on the Petition’s
`
`
`
`definition in conjunction with my statements about that person’s capabilities,
`
`although I do not necessarily agree that the Petition has properly defined the POSA
`
`in this context.
`
`17. I note that the field of electrical engineering encompasses many
`
`specialties, including fiber optics, analog circuits, digital circuits, very large-scale
`
`integration (VLSI), systems and controls, digital signal processing, bio-medical
`
`sensors and systems, electrical energy, electromagnetics, nanotechnology,
`
`telecommunications, computer systems and software, and others (see
`
`https://ece.gatech.edu/research/tigs). At Georgia Tech, one of the top 5 electrical and
`
`computer engineering schools in the country, students specialize in one of these
`
`areas after taking some general courses. If a student desires to study speech
`
`recognition, they would specialize in the area of digital signal processing (DSP) and
`
`then pursue studies specific to the advanced subspecialty of speech recognition
`
`within DSP in their graduate studies. On the other hand, a student who desires to
`
`learn computer architecture would take a different set of core courses in one of the
`
`computer engineering specialties. This structure is common, with only slight
`
`variations, across all electrical and computer engineering programs. In this sense,
`
`electrical and computer engineering can be compared with medicine. The engineers
`
`each take basic math and physics classes as well as some introductory systems
`
`
`
`classes before specializing. Medical students take general courses as well but then
`
`may specialize into one of a variety of different subdisciplines such as cardiology,
`
`neurosurgery, dermatology, orthopedic surgery, etc. It would be understood that if a
`
`person was a specialist in orthopedic surgery they would not, under any ordinary
`
`circumstances, also be a specialist in neurosurgery.
`
`18. Mr. Schmandt’s proposed person of ordinary skill in the art was
`
`specifically identified to have specialized in speech recognition. Any such person of
`
`ordinary skill would not also be expected to have specialized in parallel processing
`
`architectures and methods or high-performance computing in addition to speech
`
`recognition. At Georgia Tech, I have a somewhat rare distinction of having been
`
`hired to be a part of both the computer engineering group and the digital signal
`
`processing group. This was due to my years of experience designing embedded
`
`systems in addition to my Ph.D. studies in signal processing. In my role I teach
`
`classes in both areas including courses on advanced digital signal processing, speech
`
`recognition, machine learning, digital circuit design, and computer architecture.
`
`However, I have never taught a course that would have equipped a masters-level
`
`student in speech recognition to apply a parallel processing architecture like Chen’s
`
`to known speech recognition techniques.
`
`19. While a person with a master’s degree in one of the fields identified in
`
`
`
`Mr. Schmandt’s POSA definition could be a person of ordinary skill in speech
`
`recognition or a person of ordinary skill in high performance computing and parallel
`
`processing, that person would not be both.
`
`20. Having studied, taught, and researched multiple aspects of computer
`
`architecture and the circuit-level implementation of signal processing systems, I am
`
`conversant with the basic concepts, trade-offs, and challenges that would be
`
`encountered in designing a system such as that claimed in the ’140 patent. An
`
`ordinarily skilled engineer at the time of the invention would have been trained in
`
`evaluating both the costs and benefits of a particular design choice. Engineers are
`
`trained (both in school and through general experience in the workforce) to
`
`recognize that design choices can have complex consequences that need to be
`
`evaluated before forming a motivation to pursue a particular design choice, and
`
`before forming an expectation of success as to that design choice. In my opinion,
`
`anyone who did not recognize these realities would not be a person of ordinary skill
`
`in the art. Thus, a person who would have simply formed design motivations based
`
`only on the premise that a particular combination of known elements would be
`
`possible would not be a person of ordinary skill regardless of their education,
`
`experience, or technical knowledge. Likewise, a person who would have formed
`
`design motivations as to a particular combination of known elements based only on
`
`
`
`the premise that the combination may provide some benefit, with no consideration
`
`of the relevance of the benefit in the specific context and in relation to the costs or
`
`disadvantages of that combination, would also not have been a person of ordinary
`
`skill in the art, regardless of their education, experience, or technical knowledge. In
`
`my opinion, a person of ordinary skill in the art would have been deliberative and
`
`considered, rather than impulsive.
`
`21. Throughout my declaration, even if I discuss my analysis in the present
`
`tense, I am always making my determinations based on what a person of ordinary
`
`skill in the art (“POSA”) would have known at the time of the invention. Based on
`
`my background and qualifications, I have experience and knowledge exceeding the
`
`level of a POSA, and am qualified to offer the testimony set forth in this declaration.
`
`B. Burden of proof
`
`22. I understand that in an inter partes review the petitioner has the burden
`
`of proving a proposition of unpatentability by a preponderance of the evidence.
`
`C. Claim construction
`23. I understand that in an inter partes review, claims are interpreted based
`
`on the same standard applied by Article III courts, i.e., based on their ordinary and
`
`customary meaning as understood in view of the claim language, the patent’s
`
`description, and the prosecution history viewed from the perspective of the ordinary
`
`
`
`artisan. I further understand that where a patent defines claim language, the
`
`definition in the patent controls, regardless of whether those working in the art may
`
`have understood the claim language differently based on ordinary meaning.
`
`D. Obviousness
`24. I understand that a patent may not be valid even though the invention
`
`is not identically disclosed or described in the prior art if the differences between the
`
`subject matter sought to be patented and the prior art are such that the subject matter
`
`as a whole would have been obvious to a person having ordinary skill in the art in
`
`the relevant subject matter at the time the invention was made.
`
`25. I understand that, to demonstrate obviousness, it is not sufficient for a
`
`petition to merely show that all of the elements of the claims at issue are found in
`
`separate prior art references or even scattered across different embodiments and
`
`teachings of a single reference. The petition must thus go further, to explain how a
`
`person of ordinary skill would combine specific prior art references or teachings,
`
`which combinations of elements in specific references would yield a predictable
`
`result, and how any specific combination would operate or read on the claims.
`
`Similarly, it is not sufficient to allege that the prior art could be combined, but rather,
`
`the petition must show why and how a person of ordinary skill would have combined
`
`
`
`them.
`
`26. I understand that where an alleged motivation to combine relies on a
`
`particular factual premise, the petitioner bears the burden of providing specific
`
`support for that premise. I understand that obviousness cannot be shown by
`
`conclusory statements, and that the petition must provide articulated reasoning with
`
`some rational underpinning to support its conclusion of obviousness. I also
`
`understand that skill in the art and “common sense” rarely operate to supply missing
`
`knowledge to show obviousness, nor does skill in the art or “common sense” act as
`
`a bridge over gaps in substantive presentation of an obviousness case.
`
`III. Overview of the ’140 Patent
`27. U.S. Patent 10,971,140, titled “Speech recognition using parallel
`
`processors,” is directed to an improved speech recognition circuit that “uses
`
`parallel processors for processing the input speech data in parallel.” Ex. 1001,
`
`1:18-20. The ’140 patent teaches multiple processors “arranged in groups or
`
`clusters,” with each group or cluster of processors connected to one of several
`
`“partial lexical memories” that “contains part of the lexical data.” Ex. 1001, 3:13-
`
`18. “Each lexical tree processor is operative to process the speech parameters using
`
`a partial lexical memory and the controller controls each lexical tree processor to
`
`process a lexical tree corresponding to partial lexical data in a corresponding
`
`partial lexical memory.” Ex. 1001, 3:19-24. The ’140 patent further teaches that
`
`the invention “provides a circuit in which speech recognition processing is
`
`performed in parallel by groups of processors operating in parallel in which each
`
`group accesses a common memory of lexical data.” Ex. 1001, 3:62-66.
`
`28. The specification of the ’140 patent thus provides: “[T]he present
`
`invention provides a circuit in which speech recognition processing is performed in
`
`parallel by groups of processors operating in parallel in which each group accesses
`
`a common memory of lexical data. . . . Each processor within a group can access
`
`the same lexical data as any other processor in the group. The controller can thus
`
`control the parallel processing of input speech parameters in a more flexible
`
`manner. For example, it allows more than one processor to process input speech
`
`parameters using the same lexical data in a lexical memory. This is because the
`
`lexical data is segmented into domains which are accessible by multiple
`
`processors.” Ex. 1001, 3:62-4:18.
`
`29. Figure 2 of the patent, annotated below, further illustrates that
`
`architecture by showing two groups of lexical tree processors, with each group
`
`containing multiple processors 1-k, and each group of processors connected to a
`
`dedicated “acoustic model memory,” such that there are at least two acoustic model
`
`memories for at least two groups of processors. Ex. 1001, Fig. 2 (annotated).
`
`
`
`
`
`
`
`
`30. Moreover, the ’140 patent expressly distinguishes its novel
`
`architecture from two prior known alternative designs, which the patent describes
`
`as less advantageous. In particular, the ’140 patent teaches that: “By providing a
`
`plurality of processors in a group with a common memory, flexibility in the
`
`processing is provided without being bandwidth limited by the interface to the
`
`memory that would occur if only a single memory were used for all processors.
`
`The arrangement is more flexible than the parallel processing arrangement in
`
`which each processor only has access to its own local memory and requires
`
`fewer memory interfaces (i.e. chip pins).” Ex. 1001, 4:1-9. The patent thus
`
`distinguishes its design from (1) a one-memory-to-all-processors design, which it
`
`describes as bandwidth limited in the processor to memory interface; and (2) a one-
`
`memory-to-one-processor design, which would require more memory interfaces
`
`and would be less flexible.
`
`IV. The POSA would not have had a reasonable expectation of success with
`respect to the Petition’s combination of Jiang and Chen
`23. I understand the Petition and Mr. Schmandt propose that the ordinary
`
`artisan would have found it obvious “to configure a computing platform comprising
`
`clusters of processors as taught by Chen to perform the speech recognition
`
`techniques of Jiang.” Ex. 1003 ¶ 68. I disagree for the reasons below.
`
`24. Jiang does not enable performing its speech recognition techniques
`
`using the particular clustered processor embodiment the Petition has selected from
`
`Chen, and Chen does not enable using that clustered processor embodiment to
`
`perform Jiang’s speech recognition techniques. Mr. Schmandt
`
`admitted those facts at his deposition. Ex. 2017 at 86:9-17.
`
`25. Mr. Schmandt and the Petition do not provide any explanation as to
`
`how the ordinary artisan could have “configure[d] a computing platform comprising
`
`clusters of processors as taught by Chen to perform the speech recognition
`
`techniques of Jiang.” Ex. 1003 ¶ 68. Nor do they provide any evidence that ordinary
`
`
`
`artisans at the time of the ’140 patent could have done so.
`
`26. Mr. Schmandt’s declaration simply states the “modifications necessary,
`
`such as configuring Chen’s circuitry to specifically recognize speech pursuant to
`
`Jiang would have required only software programming and well-known computing
`
`techniques and structures, and thus would have led a POSITA to a reasonable
`
`expectation of success.” Ex. 1003 ¶ 68. Mr. Schmandt acknowledged at his
`
`deposition, however, that his declaration does not specify what software
`
`programming, computer techniques, or structures the POSA would need to utilize to
`
`configure Chen’s circuitry to recognize speech pursuant to Jiang. Ex. 2017 at 48:13-
`
`21. Moreover, I find it significant that Mr. Schmandt has never built the processor
`
`to memory architecture for any of the systems identified in the background of his
`
`declaration, nor has he ever supervised anyone involved in the process of mapping
`
`a speech recognition model to a clustered processor and memory architecture like
`
`Chen’s. Ex. 2017 at 34:24-35:5, 32:3-9, 146:20-24.
`
`27. In my opinion, based on my experience, implementing Jiang’s speech
`
`recognition techniques on Chen’s clustered processor and memory architecture
`
`would have been highly complex and, without the benefit of the teachings of the ’140
`
`Patent, far outside the skill level of the ordinary artisan in the field of speech
`
`recognition as defined by the Petition. In
`
`
`
`particular, moving Jiang’s speech recognition techniques to Chen’s clustered,
`
`parallel processors and memories would have required coordinating multiple caches,
`
`avoiding memory conflicts, controlling task sharing in an efficient manner, resolving
`
`synchronous bottlenecks, addressing communication bandwidth and latency
`
`issues among the various hardware components, and developing a messaging
`
`strategy to coordinate information sharing between and within clusters, among other
`
`challenges. Such tasks were not within the skill level of the Petition’s ordinary
`
`artisan in the field of speech recognition prior to the ’140 Patent.
`
`28. As noted earlier, the ordinary artisan in the field of speech recognition
`
`would have specialized in the area of digital signal processing (DSP), and would
`
`have further pursued studies specific to speech recognition, an advanced
`
`subspecialty of DSP, in their graduate studies. The POSA in the field of speech
`
`recognition would not have also been a high-performance computing or parallel
`
`processing specialist, and would not have been capable of addressing the complex
`
`challenges of combining Chen with Jiang in the manner the Petition has proposed
`
`any more than a dermatologist could be expected to also perform open heart surgery.
`
`29. Moreover, speech recognition requires evaluating the comparative
`
`likelihoods of many possible outcomes for each incoming frame of samples. Jiang
`
`teaches the use of the Viterbi decoding algorithm. Ex. 1004 at 1:19-39, 2:1-46. The
`
`
`
`Viterbi algorithm compares the likelihoods of all possible sequences for a given
`
`input frame, given the most likely outcomes from the previous frame. Id. To perform
`
`these comparisons requires extensive communication between computational
`
`components that can have a drastic impact on the system design and performance.
`
`By contrast, Chen teaches that the memory associated with one cluster is not directly
`
`or adjacently accessible by the processors and memories in each other cluster. See
`
`Ex. 1003 ¶ 95; Ex. 1005 at 9:10-39. As a result, performing Viterbi decoding when
`
`the information needed by each node may be in a cluster that is not directly
`
`accessible, as in Chen, is not likely to be successful in a practical speech recognition
`
`system.
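
The frame-by-frame comparison described above can be illustrated with a minimal sketch of the textbook Viterbi recursion. It is not taken from Jiang or from any exhibit: the two-state model, the state names "A" and "B", the observation symbols, and every probability below are hypothetical values chosen only to make the example runnable.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for a small HMM, in the log domain."""
    # scores[s]: best log-probability of any state sequence ending in s so far.
    scores = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Updating one state reads the previous-frame score of every state.
            best_prev = max(states, key=lambda p: scores[p] + log_trans[p][s])
            new_scores[s] = (scores[best_prev] + log_trans[best_prev][s]
                             + log_emit[s][obs])
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full sequence.
    last = max(states, key=lambda s: scores[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return scores[last], path

def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: (to_log(v) if isinstance(v, dict) else math.log(v))
            for k, v in table.items()}

# Hypothetical toy model: all names and numbers are illustrative assumptions.
states = ["A", "B"]
log_start = to_log({"A": 0.6, "B": 0.4})
log_trans = to_log({"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}})
log_emit = to_log({"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}})

best_score, best_path = viterbi(["x", "y", "y"], states,
                                log_start, log_trans, log_emit)
print(best_path, math.exp(best_score))
```

In this sketch, the `scores` table is the data that would have to be exchanged on every frame if different states were assigned to different processors, which is the kind of inter-component communication the paragraph above refers to.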
`
`30. As explained by Hennessy & Patterson, communication between
`
`parallel processing nodes is one of several considerations that can determine the
`
`feasibility of a particular parallel architecture for use with a particular problem. Ex.
`
`2019 at 534-535. Hennessy & Patterson also points out that the private memory model
`
`in which memory is connected only within a cluster may be suitable “for
`
`applications that require little or no communication.” Id. at 533. Chen
`
`implements such a private memory architecture but the speech recognition search
`
`stage is an application that requires significant communication. Therefore, based on
`
`teachings at the time of the ’140 Patent, the POSA would not expect the proposed
`
`
`
`combination of Chen with Jiang to be successful.
`
`31. My own experience in supervising a team that attempted to implement
`
`an application within a parallel processing environment is illustrative. In addition to
`
`my support, this team of engineers worked in close collaboration with a DSP
`
`application expert at Texas Instruments for six months in order to implement a
`
`custom computer vision application on a multi-processor system supplied by Texas
`
`Instruments (Texas Instruments’ multi-processor DaVinci platform). Even with
`
`training, expert oversight, and extensive efforts on behalf of the team, the inter-
`
`processor communication details prevented the successful implementation of their
`
`system. Rather, the final product was only able to do a simple tracking of a black
`
`ball on a white background—something that could have more easily been performed
`
`on a single processor. While this system was not a speech recognition system, it was
`
`a signal processing system that was simpler than a speech recognition system in
`
`terms of complexity (only a few hundred lines of code). The failure of this design
`
`example demonstrates that transitioning a signal processing system to any particular
`
`multi-processing architecture is not necessarily trivial or practical, and that even
`
`those beyond the level of ordinary skill in the field of speech recognition could not
`
`necessarily expect success with respect to far simpler combinations than the one the
`
`
`
`Petition has proposed.
`
`32. In my opinion, based on actual experience, the POSA would not have
`
`“been capable of simply substituting” Chen’s clustered processors to implement
`
`Jiang’s speech recognition techniques, as Mr. Schmandt’s declaration simplistically
`
`presumes.
`
`V. The Petition and Mr. Schmandt fail to prove a motivation to combine
`Jiang with Chen
`33. Apple and Mr. Schmandt allege that the POSA would have been
`
`motivated to combine Jiang with Chen due to the alleged benefits of improved speed
`
`and power, a relaxed pruning threshold, and reduced cost. Ex. 1003 ¶¶ 67, 74.
`
`34. Separately, Mr. Schmandt contends that “the flexibility and scalability
`
`afforded to the MARS machine by its clustered architecture [ ] would have motivated
`
`a POSITA to implement clustered architecture in other speech recognition circuits.”
`
`Ex. 1003 ¶ 71.
`
`35. Mr. Schmandt’s declaration assumes that the clustered processor and
`
`memory arrangement of Chen would have necessarily improved the speed and
`
`power of Jiang’s speech recognition, and thus allowed for a relaxed pruning
`
`threshold. Ex. 1003 ¶¶ 67, 74. Mr. Schmandt’s declaration does not
`
`undertake any quantitative or other particularized analysis to demonstrate why
`
`Chen would allegedly speed up Jiang and relax its pruning threshold, or by how
`
`much. Rather, Mr. Schmandt seems to simply assume the conclusion that more
`
`processors would have been better.
`
`36. Mr. Schmandt’s assumptions are not correct in the context of his
`
`proposed combination. Mathew I provides a detailed report on computer architecture
`
`modifications intended to improve the speed of a speech recognizer, and in particular
`
`through the addition of more processors. Ex. 2021 at 11. Mathew I explains that it
`
`developed “a parallel version” of the Sphinx speech recognition system in which the
`
`distance calculation (GAU, or Gaussian probability estimation) was run on a
`
`separate processor from the search stage (HMM processing) with parallelization. Id.
`
`This two-processor parallel processing design was only 1.67x faster than the
`
`“original sequential version,” not 1.97x faster as the authors had predicted based on
`
`Amdahl’s law. Id. The authors then further modified the search stage (HMM phase)
`
`“to use 4 processors instead of 1,” thus moving from a two-processor system (one
`
`for distance calculation and one for search) to a five-processor system (one processor
`
`for the calculation stage, and four for search). Id. As Mathew I reports, “the resulting
`
`5 processor version was slower than the 2 processor version due to the high
`
`synchronization overhead.” Id. (emphasis added).
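
For reference, the textbook form of Amdahl’s law on which a prediction of this kind typically rests is sketched below. The symbols are introduced here for illustration and do not appear in Ex. 2021: p is the fraction of run time that can be parallelized and n is the number of processors. The value of p is back-calculated from the 1.97x figure quoted above and is an assumption, not a number reported by the authors.

```latex
\begin{align*}
S(n) &= \frac{1}{(1 - p) + \dfrac{p}{n}}
  && \text{(Amdahl's law, textbook form)} \\
S(2) = 1.97 \;&\Rightarrow\; 1 - \frac{p}{2} = \frac{1}{1.97} \approx 0.508
  \;\Rightarrow\; p \approx 0.985
  && \text{(assumed, back-calculated)} \\
\frac{S_{\text{measured}}(2)}{2} &= \frac{1.67}{2} \approx 0.84
  && \text{(parallel efficiency of the measured result)}
\end{align*}
```

This form of the law contains no term for synchronization or communication cost, which is consistent with a measured speedup falling below the predicted one, and with a larger processor count running slower once that overhead dominates.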
`
`37. Mathew I demonstrates that the ordinary artisan could not have
`
`assumed that Chen’s complex, clustered parallel processing architecture with
`
`
`
`numerous processors in each cluster and multiple distributed memories would have
`
`improved the speed of Jiang’s speech recognition system or relaxed its pruning
`
`threshold in view of the high synchronization overhead and other complications that
`
`such a system would have implicated.
`
`38. Mr. Schmandt’s declaration testimony as to the alleged cost-based
`
`motivations for the combination is likewise unsupported. As Mr. Schmandt
`
`admitted at his deposition, his declaration opines that the combination of Jiang and
`
`Chen would have been less expensive, but never states what the combination
`
`would have been less expensive than, or on what basis.
`
` Ex. 2017 at 61:14-62:9. Mr. Schmandt also made clear that his declaration
`
`testimony was not based on a cost comparison between Chen’s clustered processing
`
`architecture and any other allegedly alternative clustered processing architectures.
`
`
`Ex. 2017 at 63:8-15. And Mr. Schmandt likewise admitted that his declaration
`
`provides no quantitative cost comparison between Jiang’s unmodified architecture
`
`and his proposed modified architecture in combination with Chen. Ex. 2017 at
`
`40:23-41:4.
`
`39. Mr. Schmandt thus provides no basis for a cost-driven motivation to
`
`combine Jiang with Chen.
`
`40. Mr. Schmandt’s “flexibility and scalability” motivation theory is also
`
`lacking evidentiary support. Ex. 1003 ¶¶ 71, 72. Mr. Schmandt’s declaration cites
`
`Hon as purportedly teaching that clustered processing architectures in general
`
`present the benefits of flexibility and scalability, and that this would have motivated
`
`the combination of Jiang and Chen. Id. But Mr. Schmandt’s deposition testimony
`
`clarifies that Hon’s cited passages in