This PDF is available from The National Academies Press at http://www.nap.edu/catalog.php?record_id=13163
`
Reference Manual on Scientific Evidence: Third Edition

ISBN 978-0-309-21421-6
1038 pages, 6 x 9, paperback (2011)

Committee on the Development of the Third Edition of the Reference
Manual on Scientific Evidence; Federal Judicial Center; National Research
Council
`
Copyright © National Academy of Sciences. All rights reserved.

Regeneron Exhibit 1198.001
Regeneron v. Novartis
IPR2021-00816

Reference Guide on Statistics

David H. Kaye and David A. Freedman
`
`David H. Kaye, M.A., J.D., is Distinguished Professor of Law and Weiss Family Scholar,
`The Pennsylvania State University, University Park, and Regents’ Professor Emeritus, Arizona
`State University Sandra Day O’Connor College of Law and School of Life Sciences, Tempe.
`
`David A. Freedman, Ph.D., was Professor of Statistics, University of California, Berkeley.
`
`[Editor’s Note: Sadly, Professor Freedman passed away during the production of this
`manual.]
`
Contents

I. Introduction
   A. Admissibility and Weight of Statistical Studies
   B. Varieties and Limits of Statistical Expertise
   C. Procedures That Enhance Statistical Testimony
      1. Maintaining professional autonomy
      2. Disclosing other analyses
      3. Disclosing data and analytical methods before trial
II. How Have the Data Been Collected?
   A. Is the Study Designed to Investigate Causation?
      1. Types of studies
      2. Randomized controlled experiments
      3. Observational studies
      4. Can the results be generalized?
   B. Descriptive Surveys and Censuses
      1. What method is used to select the units?
      2. Of the units selected, which are measured?
   C. Individual Measurements
      1. Is the measurement process reliable?
      2. Is the measurement process valid?
      3. Are the measurements recorded correctly?
   D. What Is Random?
III. How Have the Data Been Presented?
   A. Are Rates or Percentages Properly Interpreted?
      1. Have appropriate benchmarks been provided?
      2. Have the data collection procedures changed?
      3. Are the categories appropriate?
      4. How big is the base of a percentage?
      5. What comparisons are made?
   B. Is an Appropriate Measure of Association Used?
   C. Does a Graph Portray Data Fairly?
      1. How are trends displayed?
      2. How are distributions displayed?
   D. Is an Appropriate Measure Used for the Center of a Distribution?
   E. Is an Appropriate Measure of Variability Used?
IV. What Inferences Can Be Drawn from the Data?
   A. Estimation
      1. What estimator should be used?
      2. What is the standard error? The confidence interval?
      3. How big should the sample be?
      4. What are the technical difficulties?
   B. Significance Levels and Hypothesis Tests
      1. What is the p-value?
      2. Is a difference statistically significant?
      3. Tests or interval estimates?
      4. Is the sample statistically significant?
   C. Evaluating Hypothesis Tests
      1. What is the power of the test?
      2. What about small samples?
      3. One tail or two?
      4. How many tests have been done?
      5. What are the rival hypotheses?
   D. Posterior Probabilities
V. Correlation and Regression
   A. Scatter Diagrams
   B. Correlation Coefficients
      1. Is the association linear?
      2. Do outliers influence the correlation coefficient?
      3. Does a confounding variable influence the coefficient?
   C. Regression Lines
      1. What are the slope and intercept?
      2. What is the unit of analysis?
   D. Statistical Models
Appendix
   A. Frequentists and Bayesians
   B. The Spock Jury: Technical Details
   C. The Nixon Papers: Technical Details
   D. A Social Science Example of Regression: Gender Discrimination in Salaries
      1. The regression model
      2. Standard errors, t-statistics, and statistical significance
Glossary of Terms
References on Statistics
`
I. Introduction
Statistical assessments are prominent in many kinds of legal cases, including
antitrust, employment discrimination, toxic torts, and voting rights cases.1 This
reference guide describes the elements of statistical reasoning. We hope the
explanations will help judges and lawyers to understand statistical terminology,
to see the strengths and weaknesses of statistical arguments, and to apply relevant
legal doctrine. The guide is organized as follows:

• Section I provides an overview of the field, discusses the admissibility
  of statistical studies, and offers some suggestions about procedures that
  encourage the best use of statistical evidence.
• Section II addresses data collection and explains why the design of a study
  is the most important determinant of its quality. This section compares
  experiments with observational studies and surveys with censuses, indicating
  when the various kinds of study are likely to provide useful results.
• Section III discusses the art of summarizing data. This section considers the
  mean, median, and standard deviation. These are basic descriptive statistics,
  and most statistical analyses use them as building blocks. This section also
  discusses patterns in data that are brought out by graphs, percentages, and
  tables.
• Section IV describes the logic of statistical inference, emphasizing
  foundations and disclosing limitations. This section covers estimation, standard
  errors and confidence intervals, p-values, and hypothesis tests.
• Section V shows how associations can be described by scatter diagrams,
  correlation coefficients, and regression lines. Regression is often used to
  infer causation from association. This section explains the technique,
  indicating the circumstances under which it and other statistical models are
  likely to succeed—or fail.
• An appendix provides some technical details.
• The glossary defines statistical terms that may be encountered in litigation.
`
`1. See generally Statistical Science in the Courtroom (Joseph L. Gastwirth ed., 2000); Statistics
`and the Law (Morris H. DeGroot et al. eds., 1986); National Research Council, The Evolving Role
`of Statistical Assessments as Evidence in the Courts (Stephen E. Fienberg ed., 1989) [hereinafter The
`Evolving Role of Statistical Assessments as Evidence in the Courts]; Michael O. Finkelstein & Bruce
`Levin, Statistics for Lawyers (2d ed. 2001); 1 & 2 Joseph L. Gastwirth, Statistical Reasoning in Law
`and Public Policy (1988); Hans Zeisel & David Kaye, Prove It with Figures: Empirical Methods in
`Law and Litigation (1997).
`
`A. Admissibility and Weight of Statistical Studies
Statistical studies suitably designed to address a material issue generally will be
admissible under the Federal Rules of Evidence. The hearsay rule rarely is a
serious barrier to the presentation of statistical studies, because such studies may
be offered to explain the basis for an expert’s opinion or may be admissible under
the learned treatise exception to the hearsay rule.2 Because most statistical methods
relied on in court are described in textbooks or journal articles and are capable
of producing useful results when properly applied, these methods generally satisfy
important aspects of the “scientific knowledge” requirement in Daubert v. Merrell
Dow Pharmaceuticals, Inc.3 Of course, a particular study may use a method that is
entirely appropriate but that is so poorly executed that it should be inadmissible
under Federal Rules of Evidence 403 and 702.4 Or, the method may be inappropriate
for the problem at hand and thus lack the “fit” spoken of in Daubert.5 Or
the study might rest on data of the type not reasonably relied on by statisticians or
substantive experts and hence run afoul of Federal Rule of Evidence 703. Often,
however, the battle over statistical evidence concerns weight or sufficiency rather
than admissibility.
`
`B. Varieties and Limits of Statistical Expertise
For convenience, the field of statistics may be divided into three subfields:
probability theory, theoretical statistics, and applied statistics. Probability theory is the
mathematical study of outcomes that are governed, at least in part, by chance.
Theoretical statistics is about the properties of statistical procedures, including
error rates; probability theory plays a key role in this endeavor. Applied statistics
draws on both of these fields to develop techniques for collecting or analyzing
particular types of data.
`
`2. See generally 2 McCormick on Evidence §§ 321, 324.3 (Kenneth S. Broun ed., 6th ed. 2006).
`Studies published by government agencies also may be admissible as public records. Id. § 296.
`3. 509 U.S. 579, 589–90 (1993).
`4. See Kumho Tire Co. v. Carmichael, 526 U.S. 137, 152 (1999) (suggesting that the trial court
`should “make certain that an expert, whether basing testimony upon professional studies or personal
`experience, employs in the courtroom the same level of intellectual rigor that characterizes the practice
`of an expert in the relevant field.”); Malletier v. Dooney & Bourke, Inc., 525 F. Supp. 2d 558, 562–63
`(S.D.N.Y. 2007) (“While errors in a survey’s methodology usually go to the weight accorded to the
`conclusions rather than its admissibility, . . . ‘there will be occasions when the proffered survey is so
`flawed as to be completely unhelpful to the trier of fact.’”) (quoting AHP Subsidiary Holding Co. v.
`Stuart Hale Co., 1 F.3d 611, 618 (7th Cir.1993)).
`5. Daubert, 509 U.S. at 591; Anderson v. Westinghouse Savannah River Co., 406 F.3d 248 (4th
`Cir. 2005) (motion to exclude statistical analysis that compared black and white employees without
`adequately taking into account differences in their job titles or positions was properly granted under
`Daubert); Malletier, 525 F. Supp. 2d at 569 (excluding a consumer survey for “a lack of fit between the
`survey’s questions and the law of dilution” and errors in the execution of the survey).
`
Statistical expertise is not confined to those with degrees in statistics. Because
statistical reasoning underlies many kinds of empirical research, scholars in a
variety of fields—including biology, economics, epidemiology, political science,
and psychology—are exposed to statistical ideas, with an emphasis on the methods
most important to the discipline.

Experts who specialize in using statistical methods, and whose professional
careers demonstrate this orientation, are most likely to use appropriate procedures
and correctly interpret the results. By contrast, forensic scientists often lack basic
information about the studies underlying their testimony. State v. Garrison6
illustrates the problem. In this murder prosecution involving bite mark evidence, a
dentist was allowed to testify that “the probability factor of two sets of teeth being
identical in a case similar to this is, approximately, eight in one million,” even
though “he was unaware of the formula utilized to arrive at that figure other than
that it was ‘computerized.’”7

At the same time, the choice of which data to examine, or how best to model
a particular process, could require subject matter expertise that a statistician lacks.
As a result, cases involving statistical evidence frequently are (or should be) “two
expert” cases of interlocking testimony. A labor economist, for example, may
supply a definition of the relevant labor market from which an employer draws
its employees; the statistical expert may then compare the race of new hires to
the racial composition of the labor market. Naturally, the value of the statistical
analysis depends on the substantive knowledge that informs it.8
`
6. 585 P.2d 563 (Ariz. 1978).
7. Id. at 566, 568. For other examples, see David H. Kaye et al., The New Wigmore: A Treatise
on Evidence: Expert Evidence § 12.2 (2d ed. 2011).
8. In Vuyanich v. Republic National Bank, 505 F. Supp. 224, 319 (N.D. Tex. 1980), vacated, 723
F.2d 1195 (5th Cir. 1984), defendant’s statistical expert criticized the plaintiffs’ statistical model for an
implicit, but restrictive, assumption about male and female salaries. The district court trying the case
accepted the model because the plaintiffs’ expert had a “very strong guess” about the assumption, and
her expertise included labor economics as well as statistics. Id. It is doubtful, however, that economic
knowledge sheds much light on the assumption, and it would have been simple to perform a less
restrictive analysis.

C. Procedures That Enhance Statistical Testimony
1. Maintaining professional autonomy

Ideally, experts who conduct research in the context of litigation should proceed
with the same objectivity that would be required in other contexts. Thus, experts
who testify (or who supply results used in testimony) should conduct the analysis
required to address in a professionally responsible fashion the issues posed by the
litigation.9 Questions about the freedom of inquiry accorded to testifying experts,
as well as the scope and depth of their investigations, may reveal some of the
limitations to the testimony.

9. See The Evolving Role of Statistical Assessments as Evidence in the Courts, supra note 1, at
164 (recommending that the expert be free to consult with colleagues who have not been retained
by any party to the litigation and that the expert receive a letter of engagement providing for these
and other safeguards).

2. Disclosing other analyses

Statisticians analyze data using a variety of methods. There is much to be said for
looking at the data in several ways. To permit a fair evaluation of the analysis that
is eventually settled on, however, the testifying expert can be asked to explain
how that approach was developed. According to some commentators, counsel
who know of analyses that do not support the client’s position should reveal them,
rather than presenting only favorable results.10

10. Id. at 167; cf. William W. Schwarzer, In Defense of “Automatic Disclosure in Discovery,” 27
Ga. L. Rev. 655, 658–59 (1993) (“[T]he lawyer owes a duty to the court to make disclosure of core
information.”). The National Research Council also recommends that “if a party gives statistical data
to different experts for competing analyses, that fact be disclosed to the testifying expert, if any.” The
Evolving Role of Statistical Assessments as Evidence in the Courts, supra note 1, at 167.

3. Disclosing data and analytical methods before trial

The collection of data often is expensive and subject to errors and omissions.
Moreover, careful exploration of the data can be time-consuming. To minimize
debates at trial over the accuracy of data and the choice of analytical techniques,
pretrial discovery procedures should be used, particularly with respect to the
quality of the data and the method of analysis.11

11. See The Special Comm. on Empirical Data in Legal Decision Making, Recommendations
on Pretrial Proceedings in Cases with Voluminous Data, reprinted in The Evolving Role of Statistical
Assessments as Evidence in the Courts, supra note 1, app. F; see also David H. Kaye, Improving Legal
Statistics, 24 Law & Soc’y Rev. 1255 (1990).

II. How Have the Data Been Collected?
The interpretation of data often depends on understanding “study design”—the
plan for a statistical study and its implementation.12 Different designs are suited to
answering different questions. Also, flaws in the data can undermine any statistical
analysis, and data quality is often determined by study design.

12. For introductory treatments of data collection, see, for example, David Freedman et al.,
Statistics (4th ed. 2007); Darrell Huff, How to Lie with Statistics (1993); David S. Moore & William
I. Notz, Statistics: Concepts and Controversies (6th ed. 2005); Hans Zeisel, Say It with Figures (6th
ed. 1985); Zeisel & Kaye, supra note 1.

In many cases, statistical studies are used to show causation. Do food additives
cause cancer? Does capital punishment deter crime? Would additional disclosures
in a securities prospectus cause investors to behave differently? The design of
studies to investigate causation is the first topic of this section.13
`Sample data can be used to describe a population. The population is the
`whole class of units that are of interest; the sample is the set of units chosen for
`detailed study. Inferences from the part to the whole are justified when the sample
`is representative. Sampling is the second topic of this section.
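
The relationship between a population and a sample can be sketched in a few lines of Python. The population below, its size, and the 30% share are hypothetical figures chosen only for illustration; they do not come from any study discussed in this guide. A simple random sample is used here as one way of obtaining a sample that is likely to be representative.

```python
import random


def sample_proportion(population, n, rng):
    """Share of units coded 1 in a simple random sample of size n (no replacement)."""
    sample = rng.sample(population, n)
    return sum(sample) / n


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 units, 3,000 of which have the trait (coded 1).
population = [1] * 3000 + [0] * 7000
population_share = sum(population) / len(population)

# The sample is the part; the population is the whole.
estimate = sample_proportion(population, 500, rng)

print(f"population share: {population_share:.3f}")
print(f"sample estimate:  {estimate:.3f}")
```

With units drawn at random, the sample share will typically fall near the population share. A sample chosen in some other way carries no such assurance.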
`Finally, the accuracy of the data will be considered. Because making and
`recording measurements is an error-prone activity, error rates should be assessed
`and the likely impact of errors considered. Data quality is the third topic of this
`section.
`
`A. Is the Study Designed to Investigate Causation?
`1. Types of studies
`
When causation is the issue, anecdotal evidence can be brought to bear. So can
observational studies or controlled experiments. Anecdotal reports may be of
value, but they are ordinarily more helpful in generating lines of inquiry than in
proving causation.14 Observational studies can establish that one factor is
associated with another, but work is needed to bridge the gap between association and
causation. Randomized controlled experiments are ideally suited for demonstrating
causation.

13. See also Michael D. Green et al., Reference Guide on Epidemiology, Section V, in this
manual; Joseph Rodricks, Reference Guide on Exposure Science, Section E, in this manual.
14. In medicine, evidence from clinical practice can be the starting point for discovery of
cause-and-effect relationships. For examples, see David A. Freedman, On Types of Scientific Enquiry, in
The Oxford Handbook of Political Methodology 300 (Janet M. Box-Steffensmeier et al. eds., 2008).
Anecdotal evidence is rarely definitive, and some courts have suggested that attempts to infer
causation from anecdotal reports are inadmissible as unsound methodology under Daubert v. Merrell Dow
Pharmaceuticals, Inc., 509 U.S. 579 (1993). See, e.g., McClain v. Metabolife Int’l, Inc., 401 F.3d 1233,
1244 (11th Cir. 2005) (“simply because a person takes drugs and then suffers an injury does not show
causation. Drawing such a conclusion from temporal relationships leads to the blunder of the post hoc
ergo propter hoc fallacy.”); In re Baycol Prods. Litig., 532 F. Supp. 2d 1029, 1039–40 (D. Minn. 2007)
(excluding a meta-analysis based on reports to the Food and Drug Administration of adverse events);
Leblanc v. Chevron USA Inc., 513 F. Supp. 2d 641, 650 (E.D. La. 2007) (excluding plaintiffs’ experts’
opinions that benzene causes myelofibrosis because the causal hypothesis “that has been generated by
case reports . . . has not been confirmed by the vast majority of epidemiologic studies of workers being
exposed to benzene and more generally, petroleum products.”), vacated, 275 Fed. App’x. 319 (5th
Cir. 2008) (remanding for consideration of newer government report on health effects of benzene);
cf. Matrixx Initiatives, Inc. v. Siracusano, 131 S. Ct. 1309, 1321 (2011) (concluding that adverse event
reports combined with other information could be of concern to a reasonable investor and therefore
subject to a requirement of disclosure under SEC Rule 10b-5, but stating that “the mere existence of
reports of adverse events . . . says nothing in and of itself about whether the drug is causing the adverse
events”). Other courts are more open to “differential diagnoses” based primarily on timing. E.g., Best v.
Lowe’s Home Ctrs., Inc., 563 F.3d 171 (6th Cir. 2009) (reversing the exclusion of a physician’s opinion
that exposure to propenyl chloride caused a man to lose his sense of smell because of the timing in this
one case and the physician’s inability to attribute the change to anything else); Kaye et al., supra note
7, §§ 8.7.2 & 12.5.1. See also Matrixx Initiatives, supra, at 1322 (listing “a temporal relationship” in a
single patient as one indication of “a reliable causal link”).

Anecdotal evidence usually amounts to reports that events of one kind are
followed by events of another kind. Typically, the reports are not even sufficient
to show association, because there is no comparison group. For example, some
children who live near power lines develop leukemia. Does exposure to electrical
and magnetic fields cause this disease? The anecdotal evidence is not compelling
because leukemia also occurs among children without exposure.15 It is necessary
to compare disease rates among those who are exposed and those who are not.
If exposure causes the disease, the rate should be higher among the exposed and
lower among the unexposed. That would be association.
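
The comparison of rates described above is simple arithmetic, and a short sketch may make it concrete. The counts below are invented purely for illustration and are not taken from any power-line study; the names `rate`, `rate_difference`, and `rate_ratio` are likewise just labels chosen for this example.

```python
def rate(cases, total):
    """Disease rate in a group: number of cases divided by group size."""
    return cases / total


# Hypothetical counts, invented for illustration only.
exposed_cases, exposed_total = 30, 10_000
unexposed_cases, unexposed_total = 20, 10_000

exposed_rate = rate(exposed_cases, exposed_total)
unexposed_rate = rate(unexposed_cases, unexposed_total)

# Two common ways to compare the groups.
rate_difference = exposed_rate - unexposed_rate
rate_ratio = exposed_rate / unexposed_rate

print(f"exposed rate:    {exposed_rate:.4f}")
print(f"unexposed rate:  {unexposed_rate:.4f}")
print(f"rate difference: {rate_difference:.4f}")
print(f"rate ratio:      {rate_ratio:.2f}")
```

A higher rate among the exposed, as in these made-up numbers, shows association only. Whether the association reflects causation is a separate question.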

The next issue is crucial: Exposed and unexposed people may differ in ways
other than the exposure they have experienced. For example, children who live
near power lines could come from poorer families and be more at risk from other
environmental hazards. Such differences can create the appearance of a
cause-and-effect relationship. Other differences can mask a real relationship.
Cause-and-effect relationships often are quite subtle, and carefully designed studies
are needed to draw valid conclusions.

An epidemiological classic makes the point. At one time, it was thought that
lung cancer was caused by fumes from tarring the roads, because many lung cancer
patients lived near roads that recently had been tarred. This is anecdotal evidence.
But the argument is incomplete. For one thing, most people—whether exposed
to asphalt fumes or unexposed—did not develop lung cancer. A comparison of
rates was needed. The epidemiologists found that exposed persons and unexposed
persons suffered from lung cancer at similar rates: Tar was probably not the causal
agent. Exposure to cigarette smoke, however, turned out to be strongly associated
with lung cancer. This study, in combination with later ones, made a compelling
case that smoking cigarettes is the main cause of lung cancer.16

15. See National Research Council, Committee on the Possible Effects of Electromagnetic Fields
on Biologic Systems (1997); Zeisel & Kaye, supra note 1, at 66–67. There are problems in measuring
exposure to electromagnetic fields, and results are inconsistent from one study to another. For
such reasons, the epidemiological evidence for an effect on health is inconclusive. National Research
Council, supra; Zeisel & Kaye, supra; Edward W. Campion, Power Lines, Cancer, and Fear, 337 New
Eng. J. Med. 44 (1997) (editorial); Martha S. Linet et al., Residential Exposure to Magnetic Fields and Acute
Lymphoblastic Leukemia in Children, 337 New Eng. J. Med. 1 (1997); Gary Taubes, Magnetic Field-Cancer
Link: Will It Rest in Peace?, 277 Science 29 (1997) (quoting various epidemiologists).
16. Richard Doll & A. Bradford Hill, A Study of the Aetiology of Carcinoma of the Lung, 2 Brit.
Med. J. 1271 (1952). This was a matched case-control study. Cohort studies soon followed. See
Green et al., supra note 13. For a review of the evidence on causation, see 38 International Agency
for Research on Cancer (IARC), World Health Org., IARC Monographs on the Evaluation of the
Carcinogenic Risk of Chemicals to Humans: Tobacco Smoking (1986).

A good study design compares outcomes for subjects who are exposed to
some factor (the treatment group) with outcomes for other subjects who are
not exposed (the control group). Now there is another important distinction to
be made—that between controlled experiments and observational studies. In a
controlled experiment, the investigators decide which subjects will be exposed
and which subjects will go into the control group. In observational studies, by
contrast, the subjects themselves choose their exposures. Because of self-selection,
the treatment and control groups are likely to differ with respect to influential
factors other than the one of primary interest. (These other factors are called
lurking variables or confounding variables.)17 With the health effects of power lines,
family background is a possible confounder; so is exposure to other hazards. Many
confounders have been proposed to explain the association between smoking and
lung cancer, but careful epidemiological studies have ruled them out, one after
the other.

Confounding remains a problem to reckon with, even for the best observational
research. For example, women with herpes are more likely to develop cervical
cancer than other women. Some investigators concluded that herpes caused
cancer: In other words, they thought the association was causal. Later research
showed that the primary cause of cervical cancer was human papilloma virus
(HPV). Herpes was a marker of sexual activity. Women who had multiple sexual
partners were more likely to be exposed not only to herpes but also to HPV.
The association between herpes and cervical cancer was due to other variables.18
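
A small numerical sketch shows how a marker can be associated with an outcome without causing it. All counts below are hypothetical and are arranged so that, within each level of the true cause, the outcome rate is the same whether or not the marker is present. The marker is simply more common where the true cause is present. The names `counts`, `pooled_rate`, and `stratum_rate` are illustrative only.

```python
# (cause_present, marker_present) -> (cases, people); hypothetical counts.
counts = {
    (True, True): (80, 800),
    (True, False): (20, 200),
    (False, True): (2, 200),
    (False, False): (8, 800),
}


def pooled_rate(marker):
    """Outcome rate for one marker status, ignoring the true cause."""
    cases = sum(c for (_, m), (c, _n) in counts.items() if m == marker)
    people = sum(n for (_, m), (_c, n) in counts.items() if m == marker)
    return cases / people


def stratum_rate(cause, marker):
    """Outcome rate within one level of the true cause."""
    cases, people = counts[(cause, marker)]
    return cases / people


# Crude comparison: the marker looks like a strong risk factor.
crude_ratio = pooled_rate(True) / pooled_rate(False)

# Comparison within each level of the true cause: the marker makes no difference.
ratio_cause_present = stratum_rate(True, True) / stratum_rate(True, False)
ratio_cause_absent = stratum_rate(False, True) / stratum_rate(False, False)

print(f"crude rate ratio:            {crude_ratio:.2f}")
print(f"ratio where cause present:   {ratio_cause_present:.2f}")
print(f"ratio where cause is absent: {ratio_cause_absent:.2f}")
```

In these made-up numbers the crude ratio is close to 3, yet the ratio is 1 within each stratum. Comparing like with like removes the apparent effect of the marker, which is the pattern the herpes example describes.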

What are “variables?” In statistics, a variable is a characteristic of units in a
study. With a study of people, the unit of analysis is the person. Typical
variables include income (dollars per year) and educational level (years of schooling
completed): These variables describe people. With a study of school districts, the
unit of analysis is the district. Typical variables include average family income of
district residents and average test scores of students in the district: These variables
describe school districts.

When investigating a cause-and-effect relationship, the variable that represents
the effect is called the dependent variable, because it depends on the causes.
The variables that represent the causes are called independent variables. With a
study of smoking and lung cancer, the independent variable would be smoking
(e.g., number of cigarettes per day), and the dependent variable would mark the
presence or absence of lung cancer. Dependent variables also are called outcome
variables or response variables. Synonyms for independent variables are risk factors,
predictors, and explanatory variables.
`
`17. For example, a confounding variable may be correlated with the independent variable and
`act causally on the dependent variable. If the units being studied differ on the independent variable,
`they are also likely to differ on the confounder. The confounder—not the independent variable—could
`therefore be responsible for differences seen on the dependent variable.
`18. For additional examples and further discussion, see Freedman et al., supra note 12, at 12–28,
`150–52; David A. Freedman, From Association to Causation: Some Remarks on the History of Statistics, 14
`Stat. Sci. 243 (1999). Some studies find that herpes is a “cofactor,” which increases risk among women
`who are also exposed to HPV. Only certain strains of HPV are carcinogenic.
`
`2. Randomized controlled experiments
`
In randomized controlled experiments, investigators assign subjects to treatment
or control groups at random. The groups are therefore likely to be comparable,
except for the treatment. This minimizes the role of confounding. Minor
imbalances will remain, due to the play of random chance; the likely effect on study
results can be assessed by statistical techniques.19 The bottom line is that causal
inferences based on well-executed randomized experiments are generally more
secure than inferences based on well-executed observational studies.
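
The claim that random assignment tends to produce comparable groups can be checked with a small simulation. This is a minimal sketch under stated assumptions: the 10,000 subjects, the 30% who carry a background risk factor, and the helper names `randomize` and `share_with_factor` are all hypothetical.

```python
import random


def randomize(subjects, rng):
    """Split subjects into treatment and control groups by a random shuffle."""
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def share_with_factor(group):
    """Fraction of a group carrying the background risk factor (coded 1)."""
    return sum(group) / len(group)


rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical subjects: 3,000 of 10,000 carry a risk factor the investigators
# may not even know about.
subjects = [1] * 3000 + [0] * 7000
treatment, control = randomize(subjects, rng)

# The leftover difference is the "minor imbalance" due to chance.
imbalance = share_with_factor(treatment) - share_with_factor(control)

print(f"treatment share with factor: {share_with_factor(treatment):.3f}")
print(f"control share with factor:   {share_with_factor(control):.3f}")
print(f"imbalance:                   {imbalance:+.3f}")
```

Randomization balances the groups, on average, on factors the investigators never measured, which is why no list of suspected confounders is needed in advance.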

The following example should help bring the discussion together. Today, we
know that taking aspirin helps prevent heart attacks. But initially, there was some
controversy. People who take aspirin rarely have heart attacks. This is anecdotal
evidence for a protective effect, but it proves almost nothing. After all, few people
have frequent heart attacks, whether or not they take aspirin regularly. A good
study compares heart attack rates for two groups: people who take aspirin (the
treatment group) and people who do not (the controls). An observational study
would be easy to do, but in such a study the aspirin-takers are likely to be
different from the controls. Indeed, they are likely to be sicker—that is why they
are taking aspirin. The study would be biased against finding a protective effect.
Randomized experiments are harder to do, but they provide better evidence. It
is the experiments that demonstrate a protective effect.20
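
The direction of the bias in the aspirin example can be worked out with expected rates. Every number below is hypothetical: the baseline risks, the assumed 30% risk reduction from aspirin, and the shares of sicker people in each group are chosen only to illustrate the reasoning and are not drawn from any actual study.

```python
# Hypothetical baseline heart attack risks by health status.
BASELINE_RISK = {"sick": 0.10, "healthy": 0.02}
# Hypothetical protective effect: aspirin multiplies risk by 0.7.
ASPIRIN_RISK_MULTIPLIER = 0.7


def expected_rate(share_sick, takes_aspirin):
    """Expected heart attack rate for a group with the given share of sick people."""
    baseline = (share_sick * BASELINE_RISK["sick"]
                + (1 - share_sick) * BASELINE_RISK["healthy"])
    return baseline * (ASPIRIN_RISK_MULTIPLIER if takes_aspirin else 1.0)


# Observational study: aspirin-takers are mostly sick, controls mostly healthy.
observational_ratio = expected_rate(0.8, True) / expected_rate(0.2, False)

# Randomized experiment: both groups have the same mix of sick and healthy.
randomized_ratio = expected_rate(0.5, True) / expected_rate(0.5, False)

print(f"observational rate ratio: {observational_ratio:.2f}")
print(f"randomized rate ratio:    {randomized_ratio:.2f}")
```

Under these assumptions the randomized comparison recovers the assumed ratio of 0.7, while the observational comparison gives a ratio above 1 and makes a protective drug look harmful. The size of the distortion depends entirely on the invented inputs; only its direction is the point.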

In summary, data from a treatment group without a control group generally
reveal very little and can be misleading. Comparisons are essential. If subjects are
assigned to treatment and control groups at random, a difference in the outcomes
between the two groups can usually be accepted, within the limits of statistical
error (infra Section IV), as a good measure of the treatment effect. However, if
the groups are created in any other way, differences that existed before treatment
may contribute to differences in the outcomes or mask differences that otherwise
would become manifest. Observational studies succeed to the extent that the
treatment and control groups are comparable—apart from the treatment.
`
`3. Observational studies
`
The bulk of the statistical studies seen in court are observational, not
experimental. Take the question of whether capital punishment deters murder. To
conduct a randomized controlled experiment, people would need to be assigned
randomly to a treatment group or a control group. People in the treatment
group would know they were subject to the death penalty for murder; the
`
`19. Randomization of subjects to treatment or control groups puts statistical tests of s
