`Declaration of Oded Gottesman, Ph.D. (Exhibit 2004)
`
`
`
`
`UNITED STATES PATENT AND TRADEMARK OFFICE
`
`BEFORE THE PATENT TRIAL AND APPEAL BOARD
`
`
`APPLE, INC.
`Petitioner
`
`v.
`
`SAINT LAWRENCE COMMUNICATIONS LLC
`Patent Owner
`
`Case: IPR2016-01075
`Patent No. 7,151,802
`
`DECLARATION OF ODED GOTTESMAN, PH.D.
`Exhibit 2004
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Mail Stop “PATENT BOARD”
`Patent Trial and Appeal Board
`U.S. Patent and Trademark Office
`P.O. Box 1450
`Alexandria, VA 22313-1450
`
`
`
`SLC 2004
`
`
`
`Inter Partes Review of USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
TABLE OF CONTENTS

I.    INTRODUCTION .............................................................................................. 1
      A.  Background ............................................................................................. 1
      B.  Qualifications ......................................................................................... 2
II.   LIST OF DOCUMENTS CONSIDERED IN FORMULATING MY OPINIONS ........... 6
III.  TECHNICAL BACKGROUND AND STATE OF THE ART AT THE TIME OF THE ALLEGED INVENTION .......... 11
      A.  Speech coding and Linear Predictive Coding (LPC) analysis ............. 14
      B.  Long Term Prediction (Pitch Prediction) ............................................ 18
      C.  Quantization ......................................................................................... 20
      D.  Code Excited Linear Prediction (CELP) ............................................. 21
      E.  Perceptual Weighting ........................................................................... 28
      F.  Long term pitch prediction and using adaptive codebook ................... 29
      G.  CELP Decoding .................................................................................... 29
      H.  Speech Bandwidth extension ............................................................... 30
      I.  Speech Quality ...................................................................................... 31
      J.  Finite precision Considerations ........................................................... 31
IV.   PERSON OF ORDINARY SKILL IN THE ART ................................................ 34
V.    OVERVIEW OF THE ‘802 PATENT ............................................................... 35
VI.   THE CLAIMS OF THE ‘802 PATENT ............................................................ 40
VII.  LEGAL STANDARDS .................................................................................... 41
      A.  Requirements of a Method and System Patent .................................... 41
      B.  Obviousness ......................................................................................... 42
VIII. CLAIM CONSTRUCTION .............................................................................. 47
IX.   SUMMARY OF PRIOR ART TO THE ’802 PATENT ALLEGED IN THIS PETITION (CASE IPR2016-00704) .......... 47
      A.  A 13.0 kbit/s wideband speech codec based on SB-ACELP (“Schnitzler”) .......... 47
      B.  Pyke Master’s Thesis Exhibit 2010 (“Pyke”) ...................................... 51
X.    PATENTABILITY OF THE CHALLENGED CLAIMS OF THE ‘802 PATENT ........ 78
XI.   CONCLUSION ............................................................................................ 115
BIBLIOGRAPHY .................................................................................................. 117
DR. ODED GOTTESMAN – CURRICULUM VITAE ................................................ 119
`
`
`
`
`
`
`I, Oded Gottesman, hereby declare as follows:
`
`I.
`
`INTRODUCTION
`A. Background
`1. My name is Oded Gottesman. I am a researcher and consultant
`
`working in areas related to speech and audio coding and enhancement, digital
`
`signal processing, telecommunications, networks, and location and positioning
`
`systems.
`
`2.
`
`I have been retained to act as an expert witness on behalf of SAINT
`
`LAWRENCE COMMUNICATIONS Inc. (“Patent Owner”) in connection with the
`
`above captioned Petition for Method and System Review of U.S. Patent No.
`
`7,151,802 (“Petition”) submitted by Apple, Inc. (“Petitioner”). I understand that
`
`this proceeding involves U.S. Patent No. 7,151,802 (“the ‘802 Patent”), titled
`
`“High frequency Content Recovering Method and Device for Over-Samples
`
`Synthesized Wideband Signal.” The ‘802 Patent is provided as Exhibit 1001.
`
`3.
`
`I understand that Petitioner challenges the validity of Claims 1-3, 8-
`
`11, 16, 25-27, 32-35, 40, 49, 50, 52, and 53 of the ‘802 Patent (the “challenged
`
`claims”).
`
`4.
`
`I have reviewed and am familiar with the ‘802 Patent as well as its
`
`prosecution history. The ‘802 prosecution history is provided as Exhibit 1003.
`
`Additionally, I have reviewed materials identified in Section III.
`
`
`
`1
`
`
`
`
`
`Inter Partes Review USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
`
5. As set forth below, I am familiar with the technology at issue as of the effective filing date of the ‘802 patent. I have been asked to provide my technical
`
`review, analysis, insights, and opinions regarding the prior art references that form
`
`the basis for the Petition. In forming my opinions, I have relied on my own
`
`experience and knowledge, my review of the ‘802 Patent and its file history, and of
`
`the prior art references cited in the Petition.
`
`6. My opinions expressed in this Declaration rely to a great extent on my
`
`own personal knowledge and recollection. However, to the extent I considered
`
`specific documents or data in formulating the opinions expressed in this
`
`Declaration, such items are expressly referred to in this Declaration.
`
`7.
`
`I am being compensated for my time in connection with this covered
`
`patent review at my standard consulting rate, which is $525 per hour. My
`
`compensation is not contingent upon and in no way affects the substance of my
`
`testimony.
`
`B. Qualifications
8. I am a citizen of the United States, and I am currently employed as the
`
`Chief Technology Officer (“CTO”) of Compandent, Inc.
`
`9. My curriculum vitae, including my qualifications, a list of the
`
`publications that I have authored during my career, and a list of the cases in which,
`
`during the previous four years, I have testified as an expert at trial or by deposition,
`
`
`
`2
`
`
`
`
`
`Inter Partes Review USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
`
`is attached to this report as Exhibit A. I expect to testify regarding my background,
`
`qualifications, and experience relevant to the issues in this investigation.
`
`10.
`
`I earned my Bachelor of Science degree in Electrical Engineering
`
`from Ben-Gurion University in 1988.
`
`11.
`
`In 1992, I earned my Master of Science degree in Electrical and
`
Computer Engineering from Drexel University, which included performing research at AT&T Bell Labs, Murray Hill, considered at the time the “holy grail” of speech processing research worldwide. My research was in the area of wideband speech coding, and was titled “Algorithm Development and Real-Time Implementation of High-Quality 32kbps Wideband Speech Low-Delay Code-Excited Linear Predictive (LD-CELP) Coder”. The work continued prior research by E. Ordentlich and Y. Shoham, who was also my M.Sc. research advisor. As part of my work, I also implemented that algorithm in DSP assembly language on two DSPs running in parallel. I subsequently co-authored and published two articles about this work.
`
`12.
`
`I earned my Doctorate of Philosophy in Electrical and Computer
`
`Engineering from the University of California at Santa Barbara in 2000.
`
`13.
`
`I have worked in the field of digital signal processing (“DSP”) for
`
`over 25 years, and have extensive experience in DSP research, design, and
`
`development, as well as the design and development of DSP-related software and
`
`
`
`3
`
`
`
`
`
`Inter Partes Review USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
`
`hardware. Presently, I am the CTO of Compandent, a technology company that
`
`develops and provides telecommunication and DSP-related algorithms, software
`
`and hardware products, real-time DSP systems, speech coding and speech
`
`enhancement-related projects, and DSP, software, and hardware-related services.
`
While at Compandent, I have contributed to a speech coding algorithm and a noise canceller algorithm that have been adopted for secure voice communication by the
`
`U.S. Department of Defense & NATO. Currently, I am supporting the DoD’s and
`
`NATO’s use of these algorithms, and am performing real-time implementation
`
`projects for DoD and NATO vendors, as well as the Defense Advanced Research
`
`Projects Agency (DARPA).
`
`14.
`
`I have worked for numerous different companies in the field of digital
`
`signal processing during my career. I am very familiar with most, if not all, speech
`
`coding, speech enhancement, audio coding, and video coding techniques for
`
`various applications. As part of my work, I have developed real-time DSP
`
`systems, DSP software for telephony applications, various serial communication
`
`software and hardware, and Internet communication software. I have led real-time
`
`DSP and speech coding engineering groups in two high-tech companies before my
`
`present company (Comverse Technology, Inc. and Optibase Ltd.), and, at DSP
`
`Communications, Inc., I was involved with echo cancellation, noise cancellation,
`
`and the creation of state-of-the-art chipsets for cellular telephones.
`
`
`
`4
`
`
`
`
`
`Inter Partes Review USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
`
15. I have been working with, and have written programs for, personal
`
computers since around 1986: initially on DOS, and later on Windows 3.1, Windows 95, Windows 98, Windows 2000, Windows NT, Windows XP, Windows 7, and Windows 10, as well as on Apple computers, iOS, Unix, Linux, and Android operating systems, and on numerous digital signal processors (DSPs). Much of my programming concerned digital signal processing, particularly speech, audio, and image coding, and communications.
`
16. Since 2001, I have also been providing expert technology services in patent disputes. My biography and experience relevant to my work in these matters are more fully detailed in Exhibit A.
`
`17.
`
`I have been the co-recipient, with Dr. Allen Gersho, of the Ericsson-
`
`Nokia Best Paper Award for the paper: “Enhanced Waveform Interpolative Coding
`
`at 4 kbps,” IEEE Workshop on Speech Coding, Finland, 1999. The IEEE
`
`Workshop on Speech Coding is an exclusive workshop for speech-coding
`
`researchers from around the world.
`
`18.
`
`I have authored and co-authored approximately eight journal
`
`publications, in addition to conference proceedings, technical articles, technical
`
`papers, book chapters, and technical presentations concerning a broad array of
`
`signal processing technology. I have also developed and taught many courses
`
`
`
`5
`
`
`
`
`
`Inter Partes Review USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
`
`related to digital signal processing and signal processing systems. These courses
`
`have included introductory level and advanced courses.
`
`19.
`
`I have several international patents related to the field of audio signal
`
`enhancement, including U.S. Patent Nos. 6,614,370; 7,643,996; 7,010,482; and
`
`7,584,095.
`
`20.
`
`I am being compensated at the rate of $525 per hour for my work in
`
`connection with this matter. The compensation is not dependent in any way on the
`
`contents of this report, the substance of any further opinions or testimony that I
`
`may provide, or the ultimate outcome of this matter.
`
`II. LIST OF DOCUMENTS CONSIDERED IN FORMULATING MY
`OPINIONS
21. In formulating my opinions, I have reviewed and considered all of the
`
`following documents:
`
EXHIBIT NO.   DESCRIPTION

1001   U.S. Patent No. 7,151,802
1002   File history of U.S. Patent No. 7,151,802
1003   Declaration of Jordan Cohen, Ph.D., Under 37 C.F.R. § 1.68
1004   ITU G.722 (1988)
1005   “A 13.0 kbit/s Wideband Speech Codec Based on SB-ACELP,” ICASSP ’98, PROC. 1998 IEEE INTL. CONF. ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (1998) to J. Schnitzler (“Schnitzler”)
1006   “ITU-T G.729 Annex A: Reduced Complexity 8 kb/s CS-ACELP Codec for Digital Simultaneous Voice and Data,” IEEE COMM. MAG., 57-63 (Sept. 1997) to Salami et al. (“Salami 1997”)
1007   “A New Model of LPC Excitation for Producing Natural-Sounding Speech at Low Bit Rates,” ICASSP ’82, PROC. 1982 IEEE INTL. CONF. ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (1982) to Atal et al. (“Atal 1982”)
1008   “Waveform Interpolation Speech Coder at 4 kb/s,” M.S. Thesis, McGill University Department of Electrical and Computer Engineering (Aug. 1998) to E. Choy (“Choy”)
1009   GSM 06.60, v5.0.0 (1996)
1010   “Extrapolation of Wideband Speech From the Telephone Band,” University of Toronto Department of Electrical and Computer Engineering Master’s Thesis (1997) to A. A. Pyke (“Pyke”)
1011   ITU G.728 (1992)
1012   ITU G.729 (1996)
1013   “16 kbps Wideband Speech Coding Technique Based on Algebraic CELP,” ICASSP ’91, PROC. 1991 IEEE INTL. CONF. ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (1991) to Laflamme (“Laflamme”)
1014   “High Quality Coding of Wideband Audio Signals Using Transform Coded Excitation (TCX),” ICASSP ’94, PROC. 1994 IEEE INTL. CONF. ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (1994) to Lefebvre et al. (“Lefebvre”)
1015   “Low Delay Code Excited Linear Predictive (LD-CELP) Coding of Wide Band Speech at 32kbits/sec,” M.S. Thesis, Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science (April 1, 1990) to E. Ordentlich (“Ordentlich Thesis”)
1016   “Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates,” ICASSP ’85, PROC. 1985 IEEE INTL. CONF. ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (1985) to Schroeder et al. (“Schroeder”)
1017   “Speech Coding: A Tutorial Review,” PROC. IEEE, vol. 82, no. 10 (Oct. 1994) to Spanias (“Spanias”)
1021   Japanese Patent Application Publication No. JH08-123495 (May 17, 1996) to Tasaki et al.
1022   Japanese Patent Application Publication No. JH08-123495, Machine Translation (May 17, 1996) to Tasaki et al. (“Tasaki ’495”)
1023   “16 kbit/s Wideband Speech Coding Based on Unequal Subbands,” ICASSP ’96, PROC. 1996 IEEE INTL. CONF. ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 57-63 (May 7-10, 1996) to Paulus et al. (“Paulus”)
1024   “Reconstruction of Wideband Audio from Narrowband CELP Code,” ACOUSTICAL SOC. JPN., Lecture Paper Collection, 249 (1994) to Tasaki et al. (“Tasaki”)
1029   “Analog to Digital Conversion of Voice by 2,400 Bit/Second Linear Predictive Coding,” Federal Standard 1015, December 17, 1996
1030   Claim Construction Order in Saint Lawrence Communications LLC v. ZTE Corporation, et al., 2-15-cv-00349 (E.D. Tex. 2016)
1031   U.S. Patent No. 5,966,689 to McCree (“McCree”)
1032   “MELP: The New Federal Standard at 2400 BPS,” IEEE ICASSP (1997) to Supplee, Lynn M. et al. (“Supplee”)
1033   “High-Frequency Regeneration in Speech Coding Systems,” ICASSP ’79, PROC. 1979 IEEE INTL. CONF. ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 428-31 (April 2-4, 1979) to Makhoul et al. (“Makhoul”)
1034   Joint Claim Construction Chart in Saint Lawrence Communications LLC v. Apple Inc., et al., 2:16-cv-00082 (E.D. Tex. 2017)
1035   “Real-time Implementation of a 9.6 kbit/s ACELP Wideband Speech Coder,” Conf. Rec. IEEE GLOBECOM, 447-451 (Dec. 1992) to Salami et al. (“Salami”)
2001   P. Mermelstein, “G.722, A New CCITT Coding Standard for Digital Transmission of Wideband Audio Signals,” IEEE Comm. Mag., Vol. 26, No. 1, pp. 8-15, Jan. 1988
2002   Fuemmeler et al., “Techniques for the Regeneration of Wideband Speech from Narrowband Speech,” EURASIP Journal on Applied Signal Processing 2001:0, 1-9 (Sep. 2001)
2003   C.H. Ritz et al., “Lossless Wideband Speech Coding,” 10th Australian Int’l. Conference on Speech Science & Technology, p. 249 (Dec. 2004)
2005   “Discrete-Time Signal Processing,” by Alan V. Oppenheim and Ronald W. Schafer
2006   https://www.mathworks.com/help/matlab/math/random-numbers-with-specific-mean-and-variance.html
2007   Transcript of Deposition of Dr. Johnson
2008   O. Gottesman and A. Gersho, “Enhanced Waveform Interpolative Coding at Low Bit Rate,” IEEE Transactions on Speech and Audio Processing, vol. 9, November 2001, pp. 786-798
2009   O. Gottesman and A. Gersho, “Enhancing Waveform Interpolative Coding with Weighted REW Parametric Quantization,” IEEE Workshop on Speech Coding Proceedings, pp. 50-52, September 2000, Wisconsin, USA
2010   O. Gottesman and A. Gersho, “High Quality Enhanced Waveform Interpolative Coding at 2.8 kbps,” Proc. IEEE ICASSP’2000, vol. III, pp. 1363-1366, June 5-9, 2000, Istanbul, Turkey
2011   O. Gottesman and A. Gersho, “Enhanced Analysis-by-Synthesis Waveform Interpolative Coding at 4 kbps,” EUROSPEECH’99, pp. 1443-1446, 1999, Hungary
2012   O. Gottesman and A. Gersho, “Enhanced Waveform Interpolative Coding at 4 kbps,” IEEE Workshop on Speech Coding Proceedings, pp. 90-92, 1999, Finland
2013   O. Gottesman, “Dispersion Phase Vector Quantization For Enhancement of Waveform Interpolative Coder,” IEEE ICASSP’99, vol. 1, pp. 269-272, 1999
2014   O. Gottesman and Y. Shoham, “Real-Time Implementation of High Quality 32 kbps Wideband Speech LD-CELP Coder,” EUROSPEECH’93, 1993
2015   Oded Gottesman, “Redundant compression techniques for transmitting data over degraded communication links and/or storing data on media subject to degradation,” U.S. Patent 6,614,370
2016   Oded Gottesman, “Enhanced waveform interpolative coder,” U.S. Patent 7,643,996
2017   Oded Gottesman and Allen Gersho, “REW parametric vector quantization and dual-predictive SEW vector quantization for waveform interpolative coding,” U.S. Patent 7,584,095
2018   Oded Gottesman and Allen Gersho, “REW parametric vector quantization and dual-predictive SEW vector quantization for waveform interpolative coding,” U.S. Patent 7,010,482
2019   Rabiner and Schafer, “Digital Processing Of Speech Signals,” Prentice Hall Inc., 1978
`
`22.
`
`I have reviewed and am familiar with the response to Petition
`
`submitted on behalf of Patent Owner for covered patent review submitted with this
`
`Declaration and I agree with the technical analysis that underlies the positions set
`
`forth in the response to Petition.
`
`23.
`
`I have reviewed and am familiar with the Petition for covered patent
`
`submitted by Petitioner, and I disagree with some of it, and with its conclusions. I
`
`have reviewed and considered Dr. Cohen’s Report submitted on behalf of
`
`Petitioner, and I disagree with some of it, and with its conclusions.
`
`24.
`
`I may consider additional documents as they become available or
`
`other that are necessary to form my opinions. I reserve the right to revise,
`
`supplement, or amend my opinions based on new information and on my
`
`continuing analysis.
`
`
`
`10
`
`
`
`
`
`Inter Partes Review USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
`
`III. TECHNICAL BACKGROUND AND STATE OF THE ART AT THE
`TIME OF THE ALLEGED INVENTION
25. A speech signal is generated by air flow emanating from the lungs, passed through the vocal cords, which may vibrate to create quasi-periodic air pulses, then passed through the vocal tract (throat cavity, mouth cavity, and nasal cavity), and finally radiated out through the lips and/or the nose. When the vocal cords vibrate, the generated quasi-periodic speech is called “voiced”; when they are at rest, the generated speech is more noise-like and is called “unvoiced.” A classical model for speech production (“Digital Processing of Speech Signals,” L. R. Rabiner and R. W. Schafer, Prentice Hall, 1978) is illustrated below. This early model includes a switch for selecting between the “voiced” component generated by the periodic source and the “unvoiced” component generated by the noise source, each multiplied by an appropriate gain, to form the excitation signal; the excitation is then passed through a system that models the vocal tract using time-varying parameters, and the resulting signal finally passes through a lip-radiation model to form the generated speech. The fundamental period of the vocal cord vibration is known as the pitch period. The vocal tract is typically modeled by a time-varying filter implemented with a set of parameters (or coefficients).
`
`
`
`11
`
`
`
`
`
`
`
`Inter Partes Review USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
`
`
`
`
`
26. A speech signal is much richer than simply voiced or unvoiced; for example, buzzing sounds like “z” and “v” include both periodic and noise components. Therefore, in later models, such as the one below, the switch was replaced by a mixer that combines a time-varying mixture of the two sources. For simplicity, let us consider the following simplified diagram. To produce good-quality speech, the vocal tract parameters, representing the vocal tract resonances, are sufficiently updated every 20-30 ms, while the excitation components are sufficiently updated every 5-7 ms. It can be shown that the vocal tract effect adds correlation among neighboring samples, which is often treated as short-term prediction or redundancy. The quasi-periodicity of the vocal cords is often treated as long-term prediction or redundancy.
`
`
`
`Figure 1. Block diagram of simplified speech production model
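As a concrete illustration of the simplified production model in Figure 1, the following Python sketch is my own toy example (not code from the record; the sampling rate, pitch, gains, and filter coefficients are all hypothetical). It forms the excitation as a gain-scaled mix of a quasi-periodic pulse train and noise, then passes it through a simple all-pole vocal tract filter:

```python
import numpy as np

fs = 8000                       # sampling rate in Hz (illustrative)
pitch_hz = 100                  # fundamental (pitch) frequency of the voiced source
n = fs // 10                    # 100 ms of signal (800 samples)

# Excitation: quasi-periodic pulse train (voiced source) and white noise
# (unvoiced source), each scaled by its own gain, as in the classical model.
voiced = np.zeros(n)
voiced[:: fs // pitch_hz] = 1.0              # one impulse per pitch period
unvoiced = np.random.default_rng(0).standard_normal(n)

g_voiced, g_unvoiced = 1.0, 0.0              # "switch" set to fully voiced here
excitation = g_voiced * voiced + g_unvoiced * unvoiced

# Vocal tract modeled as an all-pole filter (two hypothetical poles here):
# speech[t] = excitation[t] + a1*speech[t-1] + a2*speech[t-2]
a1, a2 = 1.3, -0.6
speech = np.zeros(n)
for t in range(n):
    speech[t] = excitation[t]
    if t >= 1:
        speech[t] += a1 * speech[t - 1]
    if t >= 2:
        speech[t] += a2 * speech[t - 2]
```

In a real coder the gains and filter coefficients would be time-varying (updated every 20-30 ms for the vocal tract, every 5-7 ms for the excitation) rather than held fixed as in this sketch.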
`
27. Speech coding is used to represent speech signals in a compact form for transmission or storage. The compression is typically achieved by first capturing and removing the redundancies, in the forms of short-term prediction and long-term prediction, that exist in the speech signal, and forming a residual signal having a much smaller dynamic range and energy. The residual signal is then quantized, a process of representing it by a finite number of bits at limited resolution. The difference between the reduced-resolution signal and the original signal is considered quantization noise (or quantization error). The number of bits used for the quantization governs the amount of quantization noise, or alternatively the overall quality of the coded speech.
`
A. Speech coding and Linear Predictive Coding (LPC) analysis
28. Linear Predictive Coding (LPC) is a well-known analysis technique for calculating the short-term prediction coefficients that are used to model the vocal tract. It is typically applied directly to the input speech signal, and the LPC coefficients are encoded (or quantized) once per frame of 20-30 ms, which is an adequate interval given the speed at which the vocal tract changes.
`
29. The LPC analysis is used to capture and remove the short-term prediction from the input speech: the LPC coefficients are used to form a weighted sum of the most recent (typically 10) samples to predict the present sample, which is essentially a filtering operation. The short-term predicted sample is subtracted from the input sample to generate the prediction error, also known as the residual signal, which has a much smaller dynamic range and energy than the input speech. For this reason the residual signal is more desirable for quantization than the speech itself: thanks to its smaller energy and dynamic range, its quantization error (or noise) is also smaller.
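The short-term prediction removal just described can be sketched numerically. The following is my own illustrative example of the textbook autocorrelation method for LPC (not code from any cited codec); the strongly self-correlated two-pole test signal is made up, and the point is that the residual carries only a small fraction of the signal's energy:

```python
import numpy as np

def lpc_coefficients(x, order=10):
    # Autocorrelation method: solve the normal equations R a = r,
    # where R is the Toeplitz autocorrelation matrix of the signal.
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1 : order + 1])

def residual(x, a):
    # Prediction error: e[t] = x[t] - sum_k a[k] * x[t-1-k]
    order = len(a)
    e = x.copy()
    for t in range(order, len(x)):
        e[t] = x[t] - np.dot(a, x[t - order : t][::-1])
    return e

# Synthetic strongly self-correlated signal (a stable two-pole process);
# its residual should carry only a small fraction of the signal energy.
rng = np.random.default_rng(1)
drive = rng.standard_normal(2000)
x = np.zeros(2000)
for t in range(2000):
    x[t] = drive[t]
    if t >= 1:
        x[t] += 1.7 * x[t - 1]
    if t >= 2:
        x[t] -= 0.8 * x[t - 2]

a = lpc_coefficients(x, order=10)
e = residual(x, a)
```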
`
`
`
`14
`
`
`
`
`
`
`
`Inter Partes Review USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
`
Figure 2. Block diagram of open-loop short-term prediction removal example using LPC analysis
`
`
`
30. The removal of the short-term prediction from the input speech to generate the residual signal is essentially a decomposition of the speech signal into its slowly varying vocal tract characteristics (i.e., the LPC coefficients) and its rapidly changing excitation characteristics (i.e., the residual signal). Such a decomposition allows for applying slower LPC quantization at frame intervals of typically 20-30 ms, and faster residual encoding at a subframe rate of typically 5-7 ms. Examples of speech signals and their corresponding residual signals (scaled up in the figure for better viewing) are illustrated at Id. Fig. 8.5. As can be seen, the residual signal exhibits less short-term correlation among neighboring samples, and looks more like noise and comb pulses, but still exhibits long-term correlation between neighboring pitch periods.
`
`
`
`15
`
`
`
`
`
`
`
`Inter Partes Review USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
`
`
`
`31.
`
`In the frequency domain, the vocal tract shapes the speech spectral
`
`envelope, while the excitation component shapes the fine spectral structure about
`
`that envelope. The figure below illustrates from top to bottom time domain
`
`sampled speech segment, the corresponding residual signal that contains much less
`
`short-term prediction and energy, the speech signal’s (log) spectrum and the
`
`corresponding spectral envelope represented by the LPC, and the residual signal’s
`
`(log) spectrum which is flat since the spectral envelope captured by the LPC was
`
`removed from the speech. As illustrated, the speech was decomposed to the
`
`
`
`16
`
`
`
`
`
`Inter Partes Review USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
`
`spectral envelope - represented by the LPC, and the fine spectral variations around
`
`the spectral envelope - represented by the residual signal. As illustrated, during
`
`voiced speech segments, the residual signal exhibits comb-like harmonic structure
`
`at the fundamental frequency’s (pitch) multiple frequencies (harmonics). During
`
`buzzy sounds like “v” and “z”, and during transitions, the harmonic structure
`
`appears more prominent in the lower frequency range, and the higher frequency
`
`range (e.g. 4-5 kHz in the figure below) becomes less structured and more noise
`
`like.
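The flatness of the residual spectrum relative to the shaped speech spectrum can also be checked numerically. This is my own sketch, with a synthetic two-pole process standing in for real speech (all signal parameters are hypothetical); a standard spectral-flatness measure is near 1 for a flat spectrum and near 0 for a strongly shaped one:

```python
import numpy as np

# Synthetic "speech-like" signal: white noise shaped by a two-pole filter.
# The driving noise plays the role of the residual.
rng = np.random.default_rng(5)
drive = rng.standard_normal(4096)
x = np.zeros(4096)
for t in range(4096):
    x[t] = drive[t]
    if t >= 1:
        x[t] += 1.7 * x[t - 1]
    if t >= 2:
        x[t] -= 0.8 * x[t - 2]

spec_x = np.abs(np.fft.rfft(x)) ** 2        # shaped "speech" power spectrum
spec_e = np.abs(np.fft.rfft(drive)) ** 2    # residual power spectrum (flat-ish)

def flatness(p):
    # Spectral flatness: geometric mean over arithmetic mean of the
    # power-spectrum bins (<= 1, with equality for a perfectly flat spectrum).
    return np.exp(np.mean(np.log(p))) / np.mean(p)

f_speech, f_residual = flatness(spec_x), flatness(spec_e)
```

The residual's flatness comes out much higher than the shaped signal's, mirroring the flat residual spectrum in the figure.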
`
`
`
`17
`
`
`
`
`
`
`
`Inter Partes Review USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
`
`
`
B. Long Term Prediction (Pitch Prediction)
32. As explained above, the “pitch” periodicity of the vocal cords generates long-term correlation between neighboring speech periods. For encoding purposes, such long-term correlation can be modeled and captured by means of long-term prediction. The parameters typically used to encode the long-term prediction are the pitch period, which may be integer or fractional, and the gain used to predict the present period from the past period. Such parameters may be viewed as describing a comb filter, where the comb pulses are spaced by the pitch period and their exponential progression is given by the gain. A simple scheme is illustrated below. In this scheme the past residual signal is stored and used to generate a predicted residual sample, which is then subtracted from the input residual signal to form a long-term prediction error, often referred to as the remnant signal.
`
`
`
Figure 3. Block diagram of simplified open-loop long-term prediction removal example
`
33. The remnant signal is a noise-like signal that exhibits almost no short-term and no long-term correlation, and its energy and dynamic range are much smaller than those of the input speech. It is very hard to encode, since it has no particular structure, and it is hard to tell how much its quantization error would affect the overall quality of the speech generated at the decoder. In other words, the simple encoding scheme described so far operates in open loop and does not really consider the speech signal generated by the decoder.
`
`C. Quantization
34. Quantization is the process of reducing a signal’s resolution and representing the reduced-resolution signal by some code to be stored or transmitted. It can be done on a sample-by-sample basis, using a table of quantization levels, where the quantization level is selected by the encoder (a scalar quantizer) using, for example, a nearest-neighbor criterion. Quantization can also be performed on a set of values, often called a vector (e.g., a set of LPC parameters or a set of consecutive signal samples), where quantized vectors are selected from multi-dimensional tables also referred to as codebooks; again, the quantized vector is selected by the encoder (a vector quantizer, VQ) using, for example, a nearest-neighbor criterion. For a given number of bits, a vector quantizer produces much better quality than a scalar quantizer, at the cost of increased computational complexity.
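The nearest-neighbor selection for both quantizer types can be sketched as follows. This is my own minimal illustration; the levels and codebook entries are made-up values, not trained tables:

```python
import numpy as np

def scalar_quantize(x, levels):
    # Each sample is mapped to the index of its nearest quantization level.
    levels = np.asarray(levels)
    idx = np.argmin(np.abs(x[:, None] - levels[None, :]), axis=1)
    return idx, levels[idx]

def vector_quantize(vecs, codebook):
    # Each vector is mapped to the nearest codevector (squared Euclidean distance).
    d = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    idx = np.argmin(d, axis=1)
    return idx, codebook[idx]

# Scalar example: a 2-bit (4-level) quantizer applied sample by sample.
x = np.array([-0.9, -0.1, 0.3, 0.8])
sq_idx, sq_out = scalar_quantize(x, levels=[-0.75, -0.25, 0.25, 0.75])

# Vector example: the same 2 bits per index now select one of 4 codevectors,
# quantizing two samples jointly.
vecs = np.array([[0.9, 1.1], [-1.0, -0.8]])
codebook = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
vq_idx, vq_out = vector_quantize(vecs, codebook)
```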
`
35. Codebooks (and quantizers) are usually computed by training on data gathered during system operation, such that the trained codebook minimizes some distortion measure over all the training data generated by the system. Once the system is changed (e.g., its order of operations is changed, or computation elements are added, removed, or modified), the codebook would no longer operate optimally, and a new training process would need to be performed to achieve adequate performance. In other words, systems having quantizers and/or codebooks may operate unpredictably once they are changed or combined with other systems.
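The training process described above can be sketched with plain k-means, which is the core iteration of LBG-style codebook training. This is my own toy example (real training uses codebook splitting and large speech databases; the clustered 2-D training data here is synthetic):

```python
import numpy as np

def train_codebook(training_vecs, bits=2, iters=20):
    # Plain k-means: alternate nearest-neighbor assignment and centroid
    # update so the codebook minimizes average squared distortion over
    # the training data. Initialized with evenly spaced training vectors
    # (real LBG training initializes by splitting instead).
    k = 2 ** bits
    codebook = training_vecs[:: len(training_vecs) // k][:k].copy()
    for _ in range(iters):
        d = ((training_vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        assign = np.argmin(d, axis=1)
        for j in range(k):
            members = training_vecs[assign == j]
            if len(members):
                codebook[j] = members.mean(axis=0)
    return codebook

# Training data drawn from four well-separated clusters: a 2-bit codebook
# trained on it should land one codevector near each cluster center.
rng = np.random.default_rng(4)
centers = np.array([[3.0, 3.0], [3.0, -3.0], [-3.0, 3.0], [-3.0, -3.0]])
train = np.concatenate([c + 0.1 * rng.standard_normal((100, 2)) for c in centers])
cb = train_codebook(train, bits=2)
```

Retraining this codebook on data from a changed system is exactly the step the paragraph says cannot be skipped when systems are modified or combined.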
`
36. Similarly, a person of ordinary skill in the art knows that one cannot simply combine systems that were designed to operate at one sampling rate with systems that were designed to operate at a different sampling rate. Combining such systems typically yields unpredictable results.
`
`D. Code Excited Linear Prediction (CELP)
37. Code Excited Linear Prediction (CELP) has become the most widely used coding scheme of the past 30 years. Its core idea is closed-loop quantization of the residual and remnant signals, also known as analysis-by-synthesis (AbS). In this paradigm, the excitation and remnant signals are selected such that the resulting output speech best matches the input speech, as explained below.
`
38. A simplified block diagram of the encoder of such a system is illustrated in Figure 4. In the encoder, the LPC coefficients for the short-term correlation filter are computed and quantized in an open-loop manner once every 20-30 ms frame. This is done using the standard LPC analysis, which aims to maximize the speech’s short-term self-prediction, as explained above.
`
`
`
`21
`
`
`
`
`
`Inter Partes Review USPN 7,151,802
`Declaration of Oded Gottesman, Ph.D.
`
39. Given the LPC coefficients precomputed in open loop for the whole frame, the encoder then starts closed-loop processing of 3-5 subframes, each of 5-7 ms, where the distortion criterion is waveform matching between the input speech and the synthesized speech, or (theoretically) equivalently, minimizing the energy of the weighted error between the input speech and the synthesized speech. For convenience, let us consider the theoretically equivalent simplified diagram illustrated in Figure 4, although an actual CELP system is implemented differently and is more complicated.
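The closed-loop (analysis-by-synthesis) selection can be sketched as an exhaustive search: synthesize every candidate excitation through the LP synthesis filter and keep the index and gain that minimize the squared error against the target speech. This is my own toy illustration, not the structure of any actual CELP standard; the one-tap filter and random codebook are hypothetical:

```python
import numpy as np

def synthesize(excitation, a):
    # All-pole LP synthesis: out[t] = excitation[t] + sum_k a[k] * out[t-1-k]
    out = np.zeros(len(excitation))
    for t in range(len(excitation)):
        out[t] = excitation[t]
        for k, ak in enumerate(a):
            if t - 1 - k >= 0:
                out[t] += ak * out[t - 1 - k]
    return out

def celp_search(target, codebook, a):
    # Analysis-by-synthesis: try every codevector, compute its optimal gain,
    # and keep the candidate whose synthesized output best matches the target.
    best_idx, best_gain, best_err = 0, 0.0, np.inf
    for i, cv in enumerate(codebook):
        y = synthesize(cv, a)
        denom = np.dot(y, y)
        if denom == 0.0:
            continue
        g = np.dot(target, y) / denom
        err = np.dot(target - g * y, target - g * y)
        if err < best_err:
            best_idx, best_gain, best_err = i, g, err
    return best_idx, best_gain, best_err   # the index and gain are what get coded

# Toy check: build the target from codevector 2 itself, so the closed-loop
# search should select index 2 with unit gain and (near-)zero error.
rng = np.random.default_rng(3)
a = [0.9]                                  # one-tap hypothetical synthesis filter
codebook = rng.standard_normal((8, 40))    # 8 candidate excitation vectors
target = synthesize(codebook[2], a)
idx, gain, err = celp_search(target, codebook, a)
```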
`
40. The encoder consists of two components, namely the analyzer and the synthesizer. Since the synthesizer is actually a replica of the decoder, it is also known as the local decoder. In the synthesizer, the excitation is generated as the output of a long-term correlation filter, which generates the pitch periodicity. The excitation is then passed through the short-term correlation filter, also known as the LP synthesis filter, to produce the output speech. This filter models the vocal tract transfer function. It emphasizes certain frequency regions