`
`IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
`
`
`Petition for Inter Partes Review
`
`Attorney Docket No.: 52959.38
`Customer No.: 27683
`
`Real Party in Interest:
`Apple Inc.
`
`
`
`
`
`In re patent of Bessette
`
`U.S. Patent No. 6,807,524
`
`Issued: October 19, 2004
`
`Title: PERCEPTUAL WEIGHTING
`DEVICE AND METHOD FOR
`EFFICIENT CODING OF
`WIDEBAND SIGNALS
`
`
`§
`§
`§
`§
`§
`§
`§
`§
`§
`
`Declaration of Jordan Cohen, Ph.D.
`Under 37 C.F.R. § 1.68
`
`
`
`
`
`
`
`– 1 –
`
`
`
`Ex. 1003 / Page 1 of 114
`Apple v. Saint Lawrence
`
`
`
`
`
`Table of Contents
`I. Introduction ...................................................................................................... 4
`II. Qualifications and Professional Experience ...................................................... 12
`III. Level of Ordinary Skill in the Art ..................................................................... 20
`IV. Relevant Legal Standards .................................................................................. 22
`V. State of the Art Before the ’524 Patent ............................................................. 24
`A. Digital Speech Coding ................................................................................ 24
`B. CELP ........................................................................................................... 27
`C. Pre-Emphasis Filtering ............................................................................... 30
`D. Perceptually Weighted Filtering ................................................................. 33
`VI. The ’524 Patent .................................................................................................. 38
`A. Overview ..................................................................................................... 38
`B. History of the ’524 Patent ........................................................................... 43
`VII. Claim Construction ...................................................................................... 43
`A. “pitch codebook search device responsive to said perceptually weighted
`signal for producing pitch codebook parameters and an innovative
`search target vector” ................................................................................... 44
`B. “innovative codebook search device, responsive to said synthesis filter
`coefficients and to said innovative search target vector, for producing
`innovative codebook parameters” .............................................................. 45
`C. “signal forming device for producing an encoded wideband speech
`signal” ......................................................................................................... 46
`D. “wideband [speech] signal” ........................................................................ 47
`E. “fixed denominator” ................................................................................... 48
`VIII. Challenge #1: Claims 1, 8, 15, 29, and 36 are obvious over Salami in
`view of Kroon ............................................................................................. 48
`A. Salami ......................................................................................................... 49
`B. Kroon .......................................................................................................... 53
`C. Reasons to Combine Salami and Kroon ..................................................... 54
`D. Detailed Analysis ........................................................................................ 59
`IX. Challenge #2: Claims 2-3, 9-10, 16-17, 30-31, and 37-38 are obvious over
`Salami in view of Kroon and Makamura .................................................... 78
`A. Makamura ................................................................................................... 79
`B. Reasons to Combine Salami, Kroon, and Makamura ................................ 81
`C. Detailed Analysis ........................................................................................ 84
`X. Challenge #3: Claims 6, 13, 20, 34, and 41 are obvious over Salami in
`view of Kroon, Lim, and ’524 APA ........................................................... 89
`A. Lim .............................................................................................................. 89
`B. ’524 APA .................................................................................................... 91
`C. Reasons to Combine Salami, Kroon, Lim, and ’524 APA ......................... 92
`D. Detailed Analysis ........................................................................................ 99
`XI. Challenge #4: Claims 4-5, 7, 11-12, 14, 18-19, 21, 32-33, 35, 39-40, and
`42 are obvious over Salami in view of Kroon, Lim, ’524 APA, and
`Makamura .................................................................................................105
`A. Reasons to Combine Salami, Kroon, Lim, ’524 APA, and Makamura ... 106
`B. Detailed Analysis ...................................................................................... 109
`XII. Declaration .................................................................................................114
`
`I. Introduction
`
`I, Jordan Cohen, Ph.D., declare:
`
`1. I am making this declaration at the request of Apple Inc. in the matter
`
`of the Inter Partes Review of U.S. Patent No. 6,807,524 to Bessette et al. (“the
`
`’524 Patent” or “Bessette”).
`
`2. I am being compensated for my work in this matter at the hourly rate
`
`of $500 per hour. I am also being reimbursed for reasonable and customary
`
`expenses associated with my work and testimony in this investigation. My
`
`compensation is not contingent on the outcome of this matter or the specifics of my
`
`testimony.
`
`3. In the preparation of this declaration, I have studied:
`
`(1) The ’524 Patent, Ex-1001;
`
`(2) The Prosecution History of the ’524 Patent, Ex-1002;
`
`(3) “Real-time Implementation of a 9.6 kbit/s ACELP Wideband Speech
`
`Coder,” CONF. REC. IEEE GLOBECOM, pg. 447-451 (Dec. 1992) to
`
`Salami et al. (“Salami”), Ex-1008;
`
`(4) “Regular-Pulse Excitation—A Novel Approach to Effective and
`
`Efficient Multipulse Coding of Speech,” IEEE TRANS. ACOUSTICS,
`
`SPEECH, AND SIG. PROC., Vol. 5, 1054-63 (Oct. 1986) to Kroon et al.
`
`(“Kroon”), Ex-1005;
`
`(5) “Enhancement and Bandwidth Compression of Noisy Speech,” PROC.
`
`IEEE, Vol. 67, 1586-1604 (Dec. 1979) to J. S. Lim et al. (“Lim”), Ex-
`
`1014;
`
`(6) U.S. Patent No. 5,295,224 to Makamura et al. (“Makamura”), Ex-
`
`1021;
`
`(7) MARC Record Information for IEEE Transactions on Acoustics,
`
`Speech, and Signal Processing, available at the Library of Congress
`
`online catalog at
`
`https://catalog.loc.gov/vwebv/staffView?searchId=7802&recPointer=
`
`0&recCount=25&searchType=1&bibId=11182211, accessed February
`
`17, 2017, Ex-1006;
`
`(8) Bibliographic Record Information for IEEE Transactions on
`
`Acoustics, Speech, and Signal Processing, available at the Library of
`
`Congress online catalog at
`
`https://catalog.loc.gov/vwebv/search?searchCode=STNO&searchTyp
`
`e=1&recCount=25&searchArg=0096-3518, accessed February 17,
`
`2017, Ex-1007;
`
`(9) MARC Record Information for Orlando Globecom ’92
`
`Communication for Global Users, available at the online catalog of
`
`the United States Naval Academy Nimitz Library
`
`https://library.usna.edu/search~S4?/tglobecom+92/tglobecom+++92/1
`
`%2C1%2C1%2CB/marc&FF=tglobecom+++92&1%2C1%2C,
`
`accessed March 8, 2017, Ex-1009;
`
`(10) Bibliographic Record Information for Orlando Globecom ’92
`
`Communication for Global Users, available at the online catalog of
`
`the United States Naval Academy Nimitz Library
`
`https://library.usna.edu/search/?searchtype=t&SORT=D&searcharg=g
`
`lobecom+92&searchscope=4, accessed March 8, 2017, Ex-1010;
`
`(11) MARC Record Information for Orlando Globecom ’92
`
`Communication for Global Users, available at the Library of Congress
`
`online catalog at
`
`https://catalog.loc.gov/vwebv/staffView?searchId=7122&recPointer=
`
`10&recCount=25&bibId=11454684, accessed February 17, 2017, Ex-
`
`1011;
`
`(12) Salami, R., LaFlamme, C., Adoul, J-P., 1992, “Real-Time
`
`Implementation of a 9.6 Kbit/s ACELP Wideband Speech Coder,”
`
`Orlando Globecom ’92 Communication for Global Users, Vol. 1,
`
`pp.447-451, obtained from the Library of Congress, Ex-1012;
`
`(13) Bibliographic Record Information for Orlando Globecom ’92
`
`Communication for Global Users, available at the Library of Congress
`
`online catalog at
`
`https://catalog.loc.gov/vwebv/holdingsInfo?searchId=7122&recPointe
`
`r=10&recCount=25&bibId=11454684, accessed February 17, 2017,
`
`Ex-1013;
`
`(14) MARC Record Information for Proceedings of the IEEE, available at
`
`the Library of Congress online catalog at
`
`https://catalog.loc.gov/vwebv/staffView?searchId=12413&recPointer
`
`=0&recCount=25&searchType=2&bibId=11315346, accessed March
`
`8, 2017, Ex-1015;
`
`(15) Bibliographic Record Information for Proceedings of the IEEE,
`
`available at the Library of Congress online catalog at
`
`https://catalog.loc.gov/vwebv/holdingsInfo?searchId=12413&recPoint
`
`er=0&recCount=25&searchType=2&bibId=11315346, accessed
`
`March 8, 2017, Ex-1016;
`
`(16) Declaration of Ingrid Hsieh-Yee, Ph.D. Under 37 C.F.R. § 1.68, Ex-1017;
`
`(17) Joint Claim Construction Chart in Saint Lawrence Communications
`
`LLC v. Apple Inc., et al., 2:16-cv-00082 (E.D. Tex. 2017), Ex-1018;
`
`(18) Claim Construction Order in Saint Lawrence Communications LLC v.
`
`ZTE Corporation, et al., 2:15-cv-00349 (E.D. Tex. 2016), Ex-1019;
`
`(19) Service of Apple in Saint Lawrence Communications LLC v. Apple
`
`Inc., et al., 2:16-cv-00082 (E.D. Tex. 2016), Ex-1020;
`
`(20) U.S. Patent No. 5,235,669 to Ordentlich et al. (“Ordentlich”), Ex-
`
`1022;
`
`(21) U.S. Patent No. 7,599,832 to Lin et al. (“Lin”), Ex-1023;
`
`(22) “A wideband codec at 16/24 kbit/s with 10 ms frames,” IEEE
`
`Workshop on Speech Coding for Telecom., pg. 103-104 (Sept. 1997)
`
`to R. Salami and R. Lefebvre et al. (“Salami-97”), Ex-1024;
`
`(23) “Energy-Based Effective Length of the Impulse Response of a
`
`Recursive Filter,” ICASSP ’98 PROCEEDINGS OF THE 1998 IEEE
`
`INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL
`
`PROCESSING, Vol. 3, pg. 1253-56 (May 1998) to Laakso et al.
`
`(“Laakso”), Ex-1025;
`
`(24) “Fast CELP coding based on algebraic codes,” ICASSP ’87, PROC.
`
`1987 IEEE INTL. CONF. ACOUSTICS, SPEECH, AND SIGNAL PROCESSING
`
`(1987) to Adoul et al. (“Adoul”), Ex-1026;
`
`(25) “Predictive Coding of Speech Signals and Subjective Error Criteria,”
`
`IEEE TRANS. ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (1979) to
`
`Atal et al. (“Atal 1979”), Ex-1027;
`
`(26) “A New Model of LPC Excitation for Producing Natural-Sounding
`
`Speech at Low Bit Rates,” ICASSP ’82, PROC. 1982 IEEE INTL. CONF.
`
`ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (1982) to Atal et al.
`
`(“Atal 1982”), Ex-1028;
`
`(27) “Efficient Vector Quantization of LPC Parameters for Harmonic
`
`Speech Coding,” PH.D THESIS, SIMON FRASER UNIV. (Oct. 1996) to B.
`
`Bhattacharya (“Bhattacharya”), Ex-1029;
`
`(28) “Speech Processing with Linear and Neural Network Models,” PH.D.
`
`THESIS, UNIV. OF CAMBRIDGE (1996) to T. L. Burrows (“Burrows”),
`
`Ex-1030;
`
`(29) “Waveform Interpolation Speech Coder at 4 kb/s,” M.S. THESIS,
`
`MCGILL UNIVERSITY, Aug. 1998, by E. Choy (“Choy”), Ex-1031;
`
`(30) “Efficient Calculation of Spectral Tilt from Various LPC Parameters,”
`
`NAVAL COMMAND, CONTROL AND OCEAN SURVEILLANCE CENTER
`
`(NCCOSC), No. 92152-52001 (May 1996) to Goncharoff et al.
`
`(“Goncharoff”), Ex-1032;
`
`(31) “On Fast FIR Filters Implemented as Tail-Canceling IIR Filters,”
`
`IEEE TRANS. SIGNAL PROCESSING (Jun 1997) to Wang et al.
`
`(“Wang”), Ex-1033;
`
`(32) GSM 06.10, v5.0.1 (1997), Ex-1034;
`
`(33) GSM 06.60, v5.0.0 (1996), Ex-1035;
`
`(34) “Analog to Digital Conversion of Voice by 2,400 Bit/Second Linear
`
`Predictive Coding,” FEDERAL STANDARD 1015 (Dec. 1996) (“FS1015-
`
`LPC10”), Ex-1036;
`
`(35) ITU G.728 (1992), Ex-1037;
`
`(36) ITU G.729 (1996), Ex-1038;
`
`(37) “16 kbps Wideband and Speech Coding Technique Based on
`
`Algebraic CELP,” ICASSP ’91 PROCEEDINGS OF THE 1991 IEEE
`
`INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL
`
`PROCESSING (1991) to Laflamme (“Laflamme”), Ex-1039;
`
`(38) “High Quality Coding of Wideband Audio Signals Using Transform
`
`Coded Excitation (TCX),” ICASSP ’94 PROCEEDINGS OF THE 1994
`
`IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND
`
`SIGNAL PROCESSING (1994) to Lefebvre et al. (“Lefebvre”), Ex-1040;
`
`(39) “RPCELP: A high quality and low complexity scheme for narrow
`
`band coding of speech,” CONF. PROC. ON AREA COMM. EUROCON-
`
`88 (1988) to Lever et al. (“Lever”), Ex-1041;
`
`(40) “Low Delay Code Excited Linear Predictive (LD-CELP) Coding of
`
`Wide Band Speech at 32kbits/sec,” M.S. THESIS, MASSACHUSETTS
`
`INSTITUTE OF TECHNOLOGY (April 1, 1990) to E. Ordentlich
`
`(“Ordentlich Thesis”), Ex-1042;
`
`(41) “Extrapolation of Wideband Speech From the Telephone Band,”
`
`MASTER’S THESIS, UNIVERSITY OF TORONTO (1997) to A. A. Pyke
`
`(“Pyke”), Ex-1043;
`
`(42) “Design and Description of CS-ACELP: A Toll Quality 8 kb/s Speech
`
`Coder,” IEEE TRANS. SPEECH AND AUDIO PROCESSING (Mar. 1998) to
`
`Salami et al. (“Salami-1998”), Ex-1044;
`
`(43) “A 13.0 kbit/s Wideband Speech Codec Based on SB-ACELP,”
`
`ICASSP ’98 PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL
`
`CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (1998)
`
`to J. Schnitzler (“Schnitzler”), Ex-1045;
`
`(44) “Code-Excited Linear Prediction (CELP): High-Quality Speech at
`
`Very Low Bit Rates,” ICASSP ’85, PROC. 1985 IEEE INTL. CONF.
`
`ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (1985) to Schroeder et al.
`
`(“Schroeder”), Ex-1046;
`
`(45) “Speech Coding: A Tutorial Review,” PROC. IEEE, vol. 82, no. 10
`
`(Oct. 1994) to Spanias (“Spanias”), Ex-1047.
`
`4. In forming the opinions expressed below, I have considered:
`
`(1) The documents listed above, and
`
`(2) My knowledge and experience based upon my work in the field of
`
`speech coding, as described below.
`
`II. Qualifications and Professional Experience
`
`5. My complete qualifications and professional experience are described
`
`in my curriculum vitae, a copy of which is submitted as Exhibit 1004 with this
`
`declaration. The following is a brief summary of my relevant qualifications and
`
`professional experience:
`
`6. I received a Bachelor of Science degree in Electrical Engineering
`
`from the University of Massachusetts at Amherst in 1968 and a Master of Science
`
`degree in Electrical Engineering from the University of Illinois at Urbana-
`
`Champaign in 1970.
`
`7. I have been working in the field of speech, signals, and language since
`
`1969. While a graduate student at the University of Illinois in 1969 and 1970, I
`
`designed and built a pitch detection device to track the pitch of speakers.
`
`8. I then reported for active duty in the U.S. Air Force at Chanute Air
`
`Force Base where I was an officer working for the Base Civil Engineers. In 1971,
`
`I was assigned to the National Security Agency at Fort Meade, Maryland, where I
`
`worked on various problems related to speech coding, speech analysis, and signal
`
`processing. While at NSA, I was the Contract Officer’s Technical Representative
`
`(COTR) on a contract with the Speech Communication Research Lab in Santa
`
`Barbara, CA that sponsored the work of Markel and Gray, which provided the
`
`foundation for early studies on linear-predictive speech coding and synthesis—
`
`which in turn led to the subsequent studies on high-frequency generation.
`
`9. I converted from military to civilian status in 1975 and was granted
`
`an NSA fellowship to attend graduate school for one year (I returned to NSA in
`
`1976). I attended the University of Connecticut, in the school of Linguistics, as I
`
`felt that the speech and signal problems I had been working on needed a deeper
`
`understanding of the language involved in those processes. Based on this work
`
`and subsequent thesis work, I later obtained a Ph.D. in Linguistics in 1982.
`
`10. I left NSA in 1982 to work as a member of the research staff at IBM
`
`TJ Watson Research Laboratories, where I designed and built (in software) the
`
`front-end processing system for TANGORA, a 5,000-word speaker-dependent
`
`continuous office dictation system. I also designed and implemented a telephone
`
`interface that allowed the system to be used over a digital telephone line rather
`
`than with a dedicated microphone. The front-end system is described in the paper
`
`“Application of an Auditory Model to Speech Recognition,” as noted in my CV.
`
`Ex-1004.
`
`11. In 1985, I was recruited to join the technical staff at the Institute for
`
`Defense Analyses (IDA) in Princeton, New Jersey, where I spent 14 years as a
`
`member of the research staff. I had also been detailed to IDA while working at
`
`NSA from 1979-1982. While most of my work at IDA is classified, it focused on
`
`speech and language systems and on the mathematical foundations of speech
`
`recognition systems. (Baum, Petrie, Soules and Weiss, at IDA in 1969, proved that
`
`Hidden Markov Model Training converged, and this mathematical foundation later
`
`became the dominant probabilistic model used in speech recognition systems.) Of
`
`note is my published unclassified paper, “Segmenting Speech Using Dynamic
`
`Programming,” which is a foundational study often cited in modern speech
`
`analysis.
`
`12. During my tenure at IDA, I served as a government advisor to
`
`DARPA, which was funding research in speech recognition, speech understanding,
`
`and interactive systems. I was an active participant in the Airline Transportation
`
`Information System (ATIS) program, serving on several committees, and I chaired
`
`the DARPA annual program review in spoken language technology in Austin, TX
`
`in 1995.
`
`13. Also while at IDA, I founded summer workshops in speech
`
`recognition, the first in 1993 and the second in 1994. I organized the workshops
`
`and served as co-chair with Jim Flanagan, the director of CAIP, a research
`
`laboratory at Rutgers that hosted the workshops. In these workshops, we invited
`
`30 to 40 researchers from academic, corporate, and government backgrounds to
`
`work for six to eight weeks on various problems relating to continuous speech
`
`recognition. In 1993, Sun Microsystems had just released a SPARCstation
`
`computer that was able to decode speech in real time, and we were able to obtain
`
`from Sun a workstation for each researcher, which we networked and connected to
`
`a large disk storage facility. Funded by DARPA, the project brought together the
`
`best minds in speech research, and the joint working conditions allowed rapid,
`
`cooperative development of new ideas. Of particular note was the development of
`
`Vocal Tract Length Normalization, developed by two colleagues and me, and
`
`published in “Vocal Tract Normalization in Speech Recognition: Compensating for
`
`Systematic Speaker Variability” in 1995. This technique is now found in
`
`essentially every commercial large vocabulary speech recognition system.
`
`14. The records and papers from these workshops were gathered together
`
`and are available from the National Institute of Standards and Technology (NIST).
`
`In 1995, the workshop was moved to Johns Hopkins under the tutelage of Fred
`
`Jelinek and it remains an event every summer to this day. Fred expanded the scope
`
`from speech recognition to speech and language more generally, including topics
`
`such as parsing and language translation, and I have remained an advisor to the
`
`workshop and an occasional participant. My most recent workshop attendance was
`
`at the 2014 workshop on meaning in Prague.
`
`15. I left IDA in 1999 to take a position as the Director of Business
`
`Relations at Dragon in Newton, Massachusetts. In that position, I collaborated
`
`with the research department, engineering department, and various government
`
`agencies to find areas of joint interest. I was responsible for $6M of funding from
`
`DARPA to Dragon to support various research topics. In addition, I assisted as a
`
`business mentor to the Audio Mining group at Dragon, who were developing a
`
`keyword search algorithm based on large vocabulary speech recognition for use in
`
`speech analytics in Interactive Voice Response systems.
`
`16. When Dragon was sold to Lernout and Hauspie in 2000, I left to
`
`become the Chief Technology Officer of Voice Signal Technologies (VST) in
`
`Woburn, Massachusetts. While at VST, I recruited VST’s research team and
`
`served as a mentor for their research and engineering efforts. We developed small-
`
`footprint speech recognition systems for toys and devices, as well as created the
`
`first embedded name recognition voice dialer in cell phones. We designed not
`
`only the software but also the user interfaces that made these systems accessible to
`
`consumers. The software or derivative software is currently available from
`
`Nuance as Vsuite and is found in more than 800 million mobile devices. (For
`
`example, if you have a SIRI-enabled phone, and turn SIRI off, the speech
`
`recognizer that remains for voice dialing and navigation is Vsuite.) While at VST,
`
`I spoke often at industry gatherings and wrote papers on the user interface and
`
`embedded speech recognition for the technical press.
`
`17. In 2006, I left VST to join SRI (formerly Stanford Research
`
`International) in Menlo Park, California. I was hired to be the Principal
`
`Investigator for Global Autonomous Language Exploitation (GALE), a DARPA
`
`project in which speech recognition, language translation, and information search
`
`and delivery were used cooperatively to analyze events and situations in foreign
`
`languages, particularly Arabic and Mandarin. SRI was one of three contractors
`
`participating in this project, and the SRI portion was about $12M per year. We had
`
`a local research organization at SRI in addition to 14 subcontractors spanning 5
`
`countries.
`
`18. In 2009, DARPA decided to discontinue the SRI participation in
`
`GALE (a planned reduction in the size of the project), and I left the company. I
`
`then formed SPELAMODE (SPEech, LAnguage, and MObile DEvices), an
`
`independent organization supporting technical advancement in speech and
`
`language technology and human interfaces. My work included founding a speech
`
`research organization at Cisco (jointly with Patti Price), analysis and correction of
`
`speech reconstructions for Audience (who specialized in noise and background
`
`suppression in audio signals), and design of the audio system and interface for the
`
`RIM 7 at Dash (a subsidiary of RIM).
`
`19. I also serve as the co-CTO for a Pittsburgh-based company named
`
`Kextil, which designs and builds speech interfaces for the field services industry. I
`
`was the Chief Scientist for Speech Morphing, a company in Campbell, California
`
`specializing in modifying the voices of talkers using signal processing techniques.
`
`I served as a technical advisor to Personics, a Boca Raton company that is
`
`developing a novel device for delivering and recording audio in the ears of people.
`
`I am a co-inventor on patent applications for all three of these companies. Finally,
`
`I recently served as the co-principal investigator for a project called OUCH
`
`(Outing Unfortunate Characteristics of Hidden Markov Models) at the International
`
`Computer Science Institute (ICSI) at Berkeley. Under a sponsorship from the
`
`Intelligence Advanced Research Projects Agency (IARPA), we studied the reasons
`
`for the failures of speech recognition systems and surveyed people in the field
`
`active in research and development to understand their experience with the current
`
`technology. The final report may be found at the ICSI website
`
`www.icsi.berkeley.edu.
`
`20. With several colleagues, I recently co-founded Semantic Machines, a
`
`company for creating infrastructure for the design and development of
`
`conversational systems, such as conversational artificial intelligence interfaces.
`
`The company is private, and our presence is announced at
`
`www.semanticmachines.com.
`
`21. Throughout my career, I have been a member of the IEEE, the
`
`Acoustical Society, and occasionally a member of the European-based ISCA. I
`
`served on the Board of Directors of ICSI in the mid-2000s. I have participated in
`
`the annual meeting of the IEEE in speech and signals (ICASSP) and also
`
`frequently attended Eurospeech/Interspeech. Additionally, I have regularly
`
`attended the meetings of the American Voice Input Output Society (AVIOS),
`
`where commercial speech and language devices and software were presented and
`
`discussed. I was a member of the ETSI (European Telecommunications Standards
`
`Institute) Committee on DSR (Distributed Speech Recognition) in the early 2000s and
`
`attended their meetings and deliberations on signal processing solutions for a
`
`speech recognition front end in mobile devices.
`
`22. I am an inventor or co-inventor of more than 14 United States Patents,
`
`almost all of which relate to speech or speech recognition. Further detail on my
`
`education, patents, work experience, and the cases in which I have previously
`
`given testimony in the past four years is contained in my CV, which is attached as
`
`an Exhibit to this Petition. Ex-1004.
`
`23. In summary, I have a deep familiarity with speech encoders and
`
`decoders and their associated mathematical models, speech synthesis and
`
`synthesizers, speech filtering, speech pre-emphasis and perceptual weighting, the
`
`analysis and correction of reconstructed speech, including noise reduction, and
`
`wideband speech signal processing. I had first-hand experience with these
`
`technologies at the relevant time of the ’524 Patent and before.
`
`III. Level of Ordinary Skill in the Art
`
`24. I am familiar with the knowledge and capabilities possessed by one of
`
`ordinary skill in the field of speech coding in the period before and around 1998,
`
`the year in which the parent Canadian patent application of the ’524 Patent was filed. In
`
`particular, I have been informed by Apple’s counsel that the earliest alleged
`
`priority date for the ’524 Patent is October 27, 1998 based on parent Canadian
`
`patent application No. 2252170. Unless otherwise stated, my testimony below
`
`refers to the knowledge of one of ordinary skill in the speech-coding arts in the
`
`period around and prior to October 27, 1998.
`
`25. My extensive experience (i) in the industry and (ii) with engineers
`
`practicing in the industry, as detailed above, allowed me to become personally
`
`familiar with the level of skill of individuals and the general state of the art. In my
`
`opinion, the level of ordinary skill in the art needed to have the capability of
`
`understanding the scientific, mathematical, and engineering principles applicable
`
`to the ’524 Patent is in the academic area of electrical engineering or equivalent
`
`training. Such academic training and relevant industry experience would include
`
`experience with speech coding technologies, including familiarity and design
`
`experience with the most common types of speech encoders and decoders at the
`
`time of this patent, one of the most common of which was the Code Excited Linear
`
`Prediction approach to speech coding that is relevant to the ’524 Patent and a topic
`
`of this declaration. That includes basic knowledge of linear prediction analysis and
`
`mathematics, long-term pitch prediction and adaptive codebooks, various types of
`
`fixed excitation or innovative codebook methods, signal pre-emphasis, perceptual
`
`weighting, the optimization principles underlying analysis-by-synthesis speech
`
`coding, and CELP decoding.
`
`26. Therefore, based on the technologies disclosed in the ’524 Patent, the
`
`level of ordinary skill in the art would include someone who had, at the priority
`
`date of the ’524 Patent, (i) a Master of Science (M.S.) degree in Electrical
`
`Engineering or equivalent training, and (ii) at least three to five years of relevant
`
`industry experience in the field of speech coding technology. Unless otherwise
`
`stated, when I provide my understanding and analysis below, it is consistent with
`
`the level of a person of ordinary skill in these technologies prior to the priority date
`
`of the ’524 Patent.
`
`IV. Relevant Legal Standards
`
`27. I have been asked to provide my opinions regarding whether
`
`claims 1-21 and 29-42 of the ’524 Patent are anticipated or would have been
`
`obvious to a person having ordinary skill in the art at the time of the alleged
`
`invention, in light of the prior art.
`
`28. I am not an attorney. In forming my opinions and considering the
`
`patentability of the claims of the ’524 Patent, I am relying upon certain legal
`
`principles that counsel has explained to me. These principles are discussed below.
`
`29. It is my understanding that in order to anticipate a claim under 35
`
`U.S.C. § 102, a reference must teach every element of the claim. A prior art
`
`reference may anticipate a claim inherently if an element is not expressly stated,
`
`but only if the prior art necessarily includes the claim limitation, and the fact that
`
`the reference might possibly practice or contain a claimed limitation is insufficient
`
`to establish that the reference inherently teaches the limitation.
`
`30. Further, it is my understanding that a claimed invention is
`
`unpatentable under 35 U.S.C. § 103 if the differences between the invention and
`
`the prior art are such that the subject matter as a whole would have been obvious at
`
`the time the invention was made to a person having ordinary skill in the art to
`
`which the subject matter pertains. I therefore understand that a claim is obvious
`
`over a prior art reference if that reference, combined with the knowledge of one
`
`skilled in the art or other prior art references, discloses each and every element of
`
`the recited claim. I have also been informed by counsel that the obviousness
`
`analysis takes into account factual inquiries including the level of ordinary skill in
`
`the art, the scope and content of the prior art, and the differences between the prior
`
`art and the claimed subject matter. Further, I have been informed by counsel that if
`
`a claim includes an equation, the equation is an obvious design choice if the need
`
`for a specific result was known and the corresponding analysis needed to develop
`
`an equation to achieve the result would have been straightforward and within the
`
`skill of such person.
`
`31. I have been informed by counsel that the Supreme Court has
`
`recognized several rationales for combining references or modifying a reference to
`
`show obviousness of claimed subject matter. Some of these rationales include the
`
`following: (a) combining prior art elements according to known methods to yield
`
`predictable results; (b) simple substitution of one known element for another to
`
`obtain predictable results; (c) use of a known technique to improve a similar device
`
`(method, or product) in the same way; (d) applying a known technique to a known
`
`device (method, or product) ready for improvement to yield predictable results; (e)
`
`choosing from a finite number of identified, predictable solutions, with a
`
`reasonable expectation of success; and (f) some teaching, suggestion, or motivation
`
`in the prior art that would have led one of ordinary skill to modify the prior art
`
`reference or to combine prior art reference teachings to arrive at the claimed
`
`invention.
`
`V. State of the Art Before the ’524 Patent
`
`A. Digital Speech Coding
`
`32. The ’524 Patent is based o