`
`
`
`
`
`UNITED STATES PATENT AND TRADEMARK OFFICE
`_____________________________
`
`BEFORE THE PATENT TRIAL AND APPEAL BOARD
`
`_____________________________
`
`
`APPLE INC.,
`Petitioner,
`
`v.
`
`CALIFORNIA INSTITUTE OF TECHNOLOGY,
`Patent Owner.
`
`_____________________________
`
`IPR2017-00210 and IPR2017-00219
`Patent No. 7,116,710
`_____________________________
`
`
`DECLARATION OF DR. DARIUSH DIVSALAR
`
`
`
`
`
`
`
`
`
`CALTECH - EXHIBIT 2031
`Apple Inc. v. California Institute of Technology
`IPR2017-00210
`
`
`
`I, Dariush Divsalar, declare as follows:
`
I. ENGAGEMENT

1. I have been asked by counsel for the California Institute of Technology to be a witness in the above-captioned proceeding. I have been asked to address certain aspects of error correction coding technology during the time of the invention of U.S. Patent No. 7,116,710 (the “’710 patent”), specifically with regard to certain references cited and relied on in connection with cases IPR2017-00210 and IPR2017-00219. In particular, I offer my opinions regarding the following reference, and potential combinations: Dariush Divsalar, Hui Jin, and Robert McEliece, “Coding Theorems for ‘Turbo-Like’ Codes” (Exs. 1003/1203, “Divsalar”).
`
II. QUALIFICATIONS

2. I received my MSEE in 1975, my Engineer Degree in 1977, and my Ph.D. in 1978, all from the University of California, Los Angeles. I majored in Communication Systems (Department of Electrical Engineering) and double-minored in Computer System Modeling and Analysis (Department of Computer Science) and Applied Mathematics (Department of Mathematics).

3. In 1996 (as a report), and later in 1998 (published in IEEE Trans. on Info. Theory), along with S. Benedetto, G. Montorsi, and F. Pollara, I published a paper entitled “Serial Concatenation of Interleaved Codes: Performance Analysis, Design, and Iterative Decoding.” In this paper, we proposed a subclass of serial concatenated codes: serial concatenated convolutional codes (SCCCs). We demonstrated that, when combined with the iterative message-passing decoding algorithm of turbo codes, these SCCCs outperformed parallel concatenated turbo codes. This paper has been highly cited: according to Google Scholar it has been cited more than 1,500 times under this title, including more than 450 citations in IEEE publications. This paper is considered instrumental in advancing the field’s understanding of error correcting codes.
`
`4.
`
`In 1988 I won the IEEE Best Paper of the Year award for my work on
`
`Trellis coded DPSK for satellite communications.
`
`5.
`
`In 1999 I was promoted to my role as a senior research scientist at
`
`JPL/Caltech. The selection of senior research scientist requires a recommendation
`
`from the section division and also six to eight references from outside JPL; one of
`
`my recommenders was Dr. Andrew J. Viterbi, the inventor of the Viterbi
`
`algorithm, co-founder of Qualcomm, and one of the most prominent information
`
`theorists of all time.
`
`6. More recently, in 2008 I won the Joint Paper award of the IEEE
`
`Information Theory and IEEE Communication Theory societies for my paper,
`
`
`
`
`
`2
`
`
`
`“Accumulate Repeat Accumulate Codes”. My paper on Accumulate-Repeat-
`
`Accumulate (ARA) codes cites the inventors’ IRA paper.
`
`7.
`
`I have summarized in this section my educational background and
`
`relevant experience. My full curriculum vitae is attached as Exhibit 2032 to this
`
`declaration.
`
III. COMPENSATION AND PRIOR TESTIMONY

8. Other than my ordinary salary, I am not being compensated for the time I spend in connection with this case. My compensation is in no way linked to the outcome of this case. Furthermore, I am not listed on any of the patents-in-suit and have no financial interest in the outcome of this litigation.
`
IV. SUMMARY OF OPINIONS

9. I am the author of the paper “Coding Theorems for ‘Turbo-Like’ Codes,” identified by Petitioners as the “Divsalar” prior art. I was not involved in the development or research that led to IRA codes, and I made no contribution to the development of the IRA codes. I do not believe it would have been trivial or obvious to modify RA codes by making them “irregular” in order to arrive at IRA codes, nor would a person of ordinary skill in the art have been motivated to make such a modification.

10. It is completely untrue that people of ordinary skill in the art at the time of the invention thought that irregularity could be used to improve any code. It is not even clear what that phrase means for the vast majority of codes. Furthermore, it would not have been obvious to modify my RA codes to include an irregular repeat; there would have been no motivation to do so and no expectation that such a modification would improve performance. The contemporaneous technical literature on “irregular codes” was directed to specific types of irregularity that had no meaningful application to RA codes. For example, research on irregular low-density parity check (LDPC) codes was concerned with modifying traditional Gallager codes, a specific class of LDPC codes with a structure different from RA and IRA codes, to use irregular parity check matrices. A person of ordinary skill in the art would have had no motivation to apply such a teaching to RA codes, because RA codes were not described in terms of a low-density parity check matrix. In addition, even if such a person derived the parity check matrix for an RA code, that matrix is already irregular.

11. There is no disclosure in my RA paper (“Divsalar”) of a systematic code. I did not consider making RA codes systematic during my research. There was no point in making RA codes systematic, and a person of ordinary skill in the art would not have made RA codes systematic.
`
V. LEVEL OF ORDINARY SKILL

12. I understand that the ’710 patent claims a priority date of no later than May 18, 2000 for the invention. In my opinion, a person working in the field of error correction codes in the 1998 to 2000 period would typically have held a Ph.D. and would have been able to analyze and reach conclusions about these types of codes.
`
VI. THE ’710 PATENT

13. I understand that this IPR relates to Caltech’s ’710 patent. I further understand that the ’710 patent covers the irregular repeat accumulate (IRA) codes invented by Dr. Jin, Dr. Khandekar, and Dr. McEliece. These codes are not a subset of the RA codes described in my paper.

14. I first learned of IRA codes in 1999 or 2000, when I was teaching at Caltech. Hui Jin came to me and told me that he and Professor Bob McEliece had come up with an irregular-repeat-and-accumulate structure that resulted in well-performing codes. He took me to the lab and showed me an FPGA prototype that demonstrated some functionality of the IRA codes. I recall clearly that my initial reaction was one of surprise when I learned that Bob McEliece, along with Hui Jin and Aamod Khandekar, had designed codes that included an irregular repeat and yet performed so well. After further discussions with Bob McEliece, Hui Jin, and Aamod Khandekar, I recognized that their work was a groundbreaking discovery.

15. IRA codes are unique in that they offer the linear-in-time encoding of turbo codes and the decoding performance of the best irregular LDPC codes. Additionally, Dr. Jin, Dr. Khandekar, and Dr. McEliece provided an important analytical framework through their development of IRA codes. Their analysis of IRA codes using, among other techniques, Tanner graph representations was a significant step forward in unifying the worlds of concatenated codes, such as turbo codes, and low-density parity check codes. Through their analytical and empirical proofs of performance on the Binary Erasure Channel and the Additive White Gaussian Noise channel (via a Gaussian approximation), coding theorists gained a greater understanding of code performance and of what makes a good code. As such, Dr. Jin, Dr. McEliece, and Dr. Khandekar’s analysis of IRA codes was an important advance in the field of coding theory.
`
VII. DIVSALAR (EXS. 1003/1203)

16. In 1993, Berrou et al. (Exs. 1008/1208) introduced turbo codes to the industry. Turbo codes are a parallel concatenation of two convolutional codes, as shown below.

[Figure: block diagram of a turbo code, a parallel concatenation of two convolutional codes]
`
`
`
`
`
17. At one of these initial conferences, Berrou insisted that the entire family of concatenated convolutional codes, whether parallel or serial, be regarded as “turbo-like codes,” in my opinion because he wanted his turbo codes to achieve greater fame. However, I do not consider serial concatenated convolutional codes (SCCCs) to be turbo codes.

18. Turbo codes contain enough randomness to achieve reliable communication at data rates near the theoretical Shannon limit, yet enough structure to allow practical encoding and decoding algorithms. After Berrou’s initial research paper on turbo codes, researchers in the field of information theory were interested in determining how, fundamentally, turbo codes achieved such good performance. Dr. McEliece, Hui Jin, and I also began looking into why turbo codes were able to perform so well.
`
19. We submitted a paper entitled “Coding Theorems for ‘Turbo-Like’ Codes” in connection with the Allerton conference in 1998. Our paper was an attempt to determine how turbo codes provided near-Shannon-limit performance on the Additive White Gaussian Noise (AWGN) channel. I began working on this paper with Dr. McEliece and Hui Jin in 1998.
`
20. Traditional evaluation of code performance relied on various weight distributions and methods of calculating them. For example, the Hamming distance between two codewords is the number of bit positions in which the codewords differ. The Hamming weight of a codeword of length N is its Hamming distance from the all-zero string of length N.
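To illustrate these definitions concretely, the following short sketch (my own, not from the paper) computes the Hamming distance and Hamming weight of bit vectors:

```python
def hamming_distance(a, b):
    """Number of bit positions in which two equal-length words differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

def hamming_weight(word):
    """Weight = Hamming distance from the all-zero word of the same length."""
    return hamming_distance(word, [0] * len(word))

# Two length-7 codewords differing in two positions:
print(hamming_distance([1, 0, 1, 1, 0, 0, 1], [1, 1, 1, 0, 0, 0, 1]))  # 2
print(hamming_weight([1, 0, 1, 1, 0, 0, 1]))  # 4
```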
`
21. For a given code, if Ai is the number of codewords of weight i, the ordered set of the Ai is called the weight distribution of the code. A convenient method of representing the weight distribution of a code is as a generating function in two variables, called the weight enumerator (also referred to as the “weight enumerator polynomial”). The weight enumerator polynomial of a binary linear code specifies the number of codewords of each possible Hamming weight. Once the weight enumerator of a given code is calculated, some theoretical performance values may also be calculated.
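As a concrete illustration (my own; the generator matrix below defines a hypothetical toy code, not a code from the paper), the weight distribution of a small binary linear code can be computed by enumerating all codewords:

```python
from itertools import product

# Hypothetical toy (5,3) binary linear code, given by a generator matrix.
G = [[1, 0, 0, 1, 1],
     [0, 1, 0, 1, 0],
     [0, 0, 1, 0, 1]]

def encode(msg):
    """Multiply the message row-vector by G over GF(2)."""
    n = len(G[0])
    return [sum(m * G[k][j] for k, m in enumerate(msg)) % 2 for j in range(n)]

# Weight distribution: A[i] = number of codewords of Hamming weight i.
A = {}
for msg in product([0, 1], repeat=len(G)):
    w = sum(encode(list(msg)))
    A[w] = A.get(w, 0) + 1

print(sorted(A.items()))  # [(0, 1), (2, 2), (3, 4), (4, 1)]
```

The corresponding weight enumerator polynomial for this toy code would be 1 + 2x^2 + 4x^3 + x^4.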
`
22. The performance of a convolutional code is usually measured in terms of the bit-error rate (BER), word-error rate (WER), or first-event error rate (note that “FER” is also used for frame error rate). However, because no closed form for these quantities is known, it is common to use a union bound on the error probabilities. The union bound on the word-error probability Pw (using maximum-likelihood decoding) can be calculated by summing, over all input weights w and all output weights h, the ensemble input-output weight enumerator (IOWE) multiplied by a constant z raised to the h-th power.
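In symbols, this takes the standard union-bound form sketched below, where A_{w,h} denotes the ensemble IOWE, k the input block length, and n the codeword length. (This is my shorthand rendering of the standard form; the expression for z assumes BPSK signaling on the AWGN channel, and the paper’s exact statement appears in Exs. 1003/1203.)

```latex
P_W \;\le\; \sum_{w=1}^{k} \sum_{h=1}^{n} A_{w,h}\, z^{h},
\qquad z = e^{-rE_b/N_0},
```

where r is the code rate and E_b/N_0 is the signal-to-noise ratio per information bit.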
`
23. By calculating the IOWE, we sought to prove what we coined the “IGE Conjecture” (IGE = interleaving gain exponent). See Exs. 1003/1203 at 3-4. In a nutshell, we sought to prove that there is a word-error-probability interleaving gain.

24. However, due to the complicated structure of turbo codes (a parallel concatenation of two convolutional codes, for which obtaining a closed-form solution for the IOWE is very difficult or almost impossible), it was nearly impossible to calculate the IOWE for turbo codes in closed form. We sought to gain some insight into turbo codes by constructing an extremely simple SCCC, which we called repeat-accumulate (RA) codes. RA codes are not a subset or simplification of turbo codes, as they rely on serial rather than parallel concatenation. Nevertheless, we were hopeful that proving the goodness of RA codes could shed some light on why turbo codes performed as well as they did. We were not attempting to design new codes for immediate application when we created RA codes; rather, we were attempting to study the properties of a code with a simpler structure.
`
25. At the time we wrote the RA paper, the entire field of “turbo-like” codes (i.e., concatenated convolutional codes) was represented at the system level via block diagrams. The block diagrams would show the steps performed during the encoding process; for example, in an RA code, the message bits are fed into a repeater, followed by a permuter, then an accumulator, as shown below:

[Figure: block diagram of an RA encoder: repeater (q) → permuter → accumulator]
`
`
`
26. People in the coding community in the late 1990s generally did not think of concatenated convolutional codes in terms of their parity check matrices. Concatenated convolutional codes were defined by their component convolutional codes and the arrangement of the concatenation. While we were aware that Tanner graphs could be used to represent LDPC codes, we did not consider Tanner graph representation useful or applicable to concatenated convolutional codes, because the check nodes of a Tanner graph correspond to parity check equations, and we were not focused on the parity check matrix representation of turbo-like codes.
`
27. The RA codes described in my paper are regular repeat-accumulate codes, meaning each message bit is repeated the same number of times. In the diagram above, the repeater performs a q-fold repetition, meaning each bit is repeated q times. Thus, if a block of N bits is input to the repeater, qN bits exit the repeater. There is no provision in my paper for repeating some bits a different number of times than others, nor did we contemplate such an arrangement at the time we wrote the paper. Furthermore, it is well known that any value of q less than three results in very poor performance, as a single repeat, or no repeat at all, does not provide any word-error-probability interleaving gain. This was, in fact, one of the key conclusions of our paper. See Exs. 1003/1203 at 6 (“It follows from (5.6) that an RA code can have word error probability interleaving gain only if q >= 3.”).
`
28. The RA codes described in my paper are non-systematic; that is, the unencoded message bits are not transmitted across the channel. There is no disclosure in my paper of a systematic RA code, nor is there anything that implies the codes should be made systematic. Making the RA codes systematic would be expected to decrease the code rate, and we did not investigate the performance of systematic RA codes. In fact, there are two significant reasons why we did not, and why a person of ordinary skill would not, make RA codes systematic.
`
29. First, RA codes were designed to be research tools. Including a systematic output would have made them more complicated. As discussed above, we sought to keep the codes simple in order to calculate the ensemble IOWE.
`
30. Second, there is no need to make an RA code systematic. Because the repeated variable nodes are not combined with any other variable nodes before the accumulator, one can determine the information bits from the parity bits. In other words, to the extent RA codes have “check nodes,” their input degree is 1. At the decoder side, one of ordinary skill would be able to “back out” the information bits from the parity bits. Sending a separate copy of the information bits, thereby making the code systematic, would reduce the code rate to 1/(q+1) (for example, making a rate-1/3 non-systematic RA code systematic would result in a rate-1/4 code) without any additional benefit. This “backing out” of the information bits from a non-systematic codeword is not possible in IRA codes because of the degree of 2 or more at the check nodes. A person of ordinary skill would have been aware of these two reasons, and would not have made RA codes systematic.
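This “backing out” can be illustrated with a short sketch (my own, assuming a small hypothetical permutation known to the decoder and a noiseless channel): the accumulator is inverted by XOR-differencing consecutive parity bits, the permutation is undone, and each information bit is read off from any of its q copies.

```python
# Sketch of recovering information bits from non-systematic RA parity bits.
N, q = 4, 3
message = [1, 0, 1, 1]
perm = [7, 0, 9, 2, 11, 4, 1, 6, 3, 8, 5, 10]  # hypothetical permutation of qN = 12 bits

# Encoder: repeat q times, permute, accumulate (running XOR).
repeated = [b for b in message for _ in range(q)]
permuted = [repeated[i] for i in perm]
parity, acc = [], 0
for x in permuted:
    acc ^= x
    parity.append(acc)

# Decoder side: invert the accumulator by differencing, then undo the permutation.
diffs = [parity[0]] + [parity[i] ^ parity[i - 1] for i in range(1, len(parity))]
unpermuted = [0] * len(diffs)
for pos, i in enumerate(perm):
    unpermuted[i] = diffs[pos]

recovered = [unpermuted[q * k] for k in range(N)]  # any of the q copies works
print(recovered == message)  # True in this noiseless setting
```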
`
31. Furthermore, there is no mention of puncturing anywhere in my RA paper. Puncturing is a technique of not transmitting certain codeword bits in order to increase the code rate. Puncturing is not an arbitrary practice; it takes careful analysis and simulation to select the code bits whose removal has a relatively small adverse effect on the ability to correct errors. The paper does not explicitly suggest puncturing RA codes, nor does it implicitly suggest doing so. The purpose of my paper was to generate an extremely simple code for analysis. Puncturing introduces another variable that complicates the IOWE calculation, and possibly limits the results of the calculation to a very narrow ensemble of codes.
`
32. We proved the IGE Conjecture for RA codes, and determined that the performance of RA codes under maximum-likelihood (ML) decoding was rather good. However, ML decoding is not computationally practical. See Exs. 1003/1203 at 8-9. We wrote a program to test performance under iterative message-passing decoding, of the kind used in turbo codes, and found that the performance was surprisingly good, given that the exercise was aimed not at designing a high-performance code but at building a tool to aid our understanding of turbo codes. Practically speaking, compared to the best error correcting codes of the time, our performance was rather poor.
`
VIII. MODIFYING RA CODES

33. It would not have been trivial or obvious for a person of ordinary skill to modify the RA codes of my paper to create IRA codes. The goal of my paper was to find the simplest concatenated convolutional code for which the IOWE could be calculated and the IGE Conjecture proved. Including an irregular degree profile would have made the calculation harder (though not as difficult as the analysis for turbo codes, with their two parallel convolutional encoders).

34. In addition, even if someone had suggested to me that I apply irregularity to RA codes, it would still have been a difficult and complex task to determine how to apply irregularity in an effective manner. One would not have expected that any form of irregularity would provide a benefit to RA codes, or even that doing so would result in a workable or practical code.
`
35. Even if a person of ordinary skill were to contemplate applying irregular repetition of information bits to RA codes (despite the fact that my paper does not suggest doing so, for the reasons noted above), there would have been no expectation that the resulting code would perform well. A person of ordinary skill in 1999 would not have believed that simply applying irregularity to any code universally improves performance.

36. Applying irregular repetition of information bits to an RA code would actually have been expected to degrade the code’s performance. For example, if a person of ordinary skill in the art decided to add irregular repetition to the RA codes discussed in my paper, doing so without additional modifications would drastically reduce the coding rate. This decrease in coding rate would have deterred a person of ordinary skill who was looking for high-performing codes and had no interest in reducing the coding rate. Even if one erroneously believed that irregularity improves all codes, a person of ordinary skill would have expected any improvement to be offset by the decreased coding rate, which is undesirable in many applications.
`
37. Finally, I understand that the petition discussed a hypothetical example of modifying an RA code with repeat degree 3 such that the information bits are evenly split between repeat degree 2 and repeat degree 4. This makes little sense, at least for the reason that it would have been expected to degrade the code’s performance. As explained above, our research showed that any value of q less than three would result in very poor performance, because a single repeat, or no repeat at all, does not provide any word-error-probability interleaving gain. Again, this was one of the key conclusions of our paper. See Exs. 1003/1203 at 6 (“It follows from (5.6) that an RA code can have word error probability interleaving gain only if q >= 3.”).
`
IX. CONCLUDING STATEMENTS

38. In signing this declaration, I understand that the declaration will be filed as evidence in a contested case before the Patent Trial and Appeal Board of the United States Patent and Trademark Office. I acknowledge that I may be subject to cross-examination in this case and that cross-examination will take place within the United States. If cross-examination is required of me, I will appear for cross-examination within the United States during the time allotted for cross-examination.

39. I declare that all statements made herein of my own knowledge are true, and that all statements made on information and belief are believed to be true, and that these statements were made with the knowledge that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code.
`
`
`
Dated: 11/07/2017

By:
Dariush Divsalar, Ph.D.
`