Regeneron Exhibit 1199.001
Regeneron v. Novartis
IPR2021-00816

Econometric Analysis of Cross Section and Panel Data

Jeffrey M. Wooldridge

The MIT Press
Cambridge, Massachusetts
London, England

Contents

Preface
Acknowledgments

I INTRODUCTION AND BACKGROUND

1 Introduction
  1.1 Causal Relationships and Ceteris Paribus Analysis
  1.2 The Stochastic Setting and Asymptotic Analysis
    1.2.1 Data Structures
    1.2.2 Asymptotic Analysis
  1.3 Some Examples
  1.4 Why Not Fixed Explanatory Variables?

2 Conditional Expectations and Related Concepts in Econometrics
  2.1 The Role of Conditional Expectations in Econometrics
  2.2 Features of Conditional Expectations
    2.2.1 Definition and Examples
    2.2.2 Partial Effects, Elasticities, and Semielasticities
    2.2.3 The Error Form of Models of Conditional Expectations
    2.2.4 Some Properties of Conditional Expectations
    2.2.5 Average Partial Effects
  2.3 Linear Projections
  Problems
  Appendix 2A
    2.A.1 Properties of Conditional Expectations
    2.A.2 Properties of Conditional Variances
    2.A.3 Properties of Linear Projections

3 Basic Asymptotic Theory
  3.1 Convergence of Deterministic Sequences
  3.2 Convergence in Probability and Bounded in Probability
  3.3 Convergence in Distribution
  3.4 Limit Theorems for Random Samples
  3.5 Limiting Behavior of Estimators and Test Statistics
    3.5.1 Asymptotic Properties of Estimators
    3.5.2 Asymptotic Properties of Test Statistics
  Problems

II LINEAR MODELS

4 The Single-Equation Linear Model and OLS Estimation
  4.1 Overview of the Single-Equation Linear Model
  4.2 Asymptotic Properties of OLS
    4.2.1 Consistency
    4.2.2 Asymptotic Inference Using OLS
    4.2.3 Heteroskedasticity-Robust Inference
    4.2.4 Lagrange Multiplier (Score) Tests
  4.3 OLS Solutions to the Omitted Variables Problem
    4.3.1 OLS Ignoring the Omitted Variables
    4.3.2 The Proxy Variable–OLS Solution
    4.3.3 Models with Interactions in Unobservables
  4.4 Properties of OLS under Measurement Error
    4.4.1 Measurement Error in the Dependent Variable
    4.4.2 Measurement Error in an Explanatory Variable
  Problems

5 Instrumental Variables Estimation of Single-Equation Linear Models
  5.1 Instrumental Variables and Two-Stage Least Squares
    5.1.1 Motivation for Instrumental Variables Estimation
    5.1.2 Multiple Instruments: Two-Stage Least Squares
  5.2 General Treatment of 2SLS
    5.2.1 Consistency
    5.2.2 Asymptotic Normality of 2SLS
    5.2.3 Asymptotic Efficiency of 2SLS
    5.2.4 Hypothesis Testing with 2SLS
    5.2.5 Heteroskedasticity-Robust Inference for 2SLS
    5.2.6 Potential Pitfalls with 2SLS
  5.3 IV Solutions to the Omitted Variables and Measurement Error Problems
    5.3.1 Leaving the Omitted Factors in the Error Term
    5.3.2 Solutions Using Indicators of the Unobservables
  Problems

6 Additional Single-Equation Topics
  6.1 Estimation with Generated Regressors and Instruments

    6.1.1 OLS with Generated Regressors
    6.1.2 2SLS with Generated Instruments
    6.1.3 Generated Instruments and Regressors
  6.2 Some Specification Tests
    6.2.1 Testing for Endogeneity
    6.2.2 Testing Overidentifying Restrictions
    6.2.3 Testing Functional Form
    6.2.4 Testing for Heteroskedasticity
  6.3 Single-Equation Methods under Other Sampling Schemes
    6.3.1 Pooled Cross Sections over Time
    6.3.2 Geographically Stratified Samples
    6.3.3 Spatial Dependence
    6.3.4 Cluster Samples
  Problems
  Appendix 6A

7 Estimating Systems of Equations by OLS and GLS
  7.1 Introduction
  7.2 Some Examples
  7.3 System OLS Estimation of a Multivariate Linear System
    7.3.1 Preliminaries
    7.3.2 Asymptotic Properties of System OLS
    7.3.3 Testing Multiple Hypotheses
  7.4 Consistency and Asymptotic Normality of Generalized Least Squares
    7.4.1 Consistency
    7.4.2 Asymptotic Normality
  7.5 Feasible GLS
    7.5.1 Asymptotic Properties
    7.5.2 Asymptotic Variance of FGLS under a Standard Assumption
  7.6 Testing Using FGLS
  7.7 Seemingly Unrelated Regressions, Revisited
    7.7.1 Comparison between OLS and FGLS for SUR Systems
    7.7.2 Systems with Cross Equation Restrictions
    7.7.3 Singular Variance Matrices in SUR Systems

  7.8 The Linear Panel Data Model, Revisited
    7.8.1 Assumptions for Pooled OLS
    7.8.2 Dynamic Completeness
    7.8.3 A Note on Time Series Persistence
    7.8.4 Robust Asymptotic Variance Matrix
    7.8.5 Testing for Serial Correlation and Heteroskedasticity after Pooled OLS
    7.8.6 Feasible GLS Estimation under Strict Exogeneity
  Problems

8 System Estimation by Instrumental Variables
  8.1 Introduction and Examples
  8.2 A General Linear System of Equations
  8.3 Generalized Method of Moments Estimation
    8.3.1 A General Weighting Matrix
    8.3.2 The System 2SLS Estimator
    8.3.3 The Optimal Weighting Matrix
    8.3.4 The Three-Stage Least Squares Estimator
    8.3.5 Comparison between GMM 3SLS and Traditional 3SLS
  8.4 Some Considerations When Choosing an Estimator
  8.5 Testing Using GMM
    8.5.1 Testing Classical Hypotheses
    8.5.2 Testing Overidentification Restrictions
  8.6 More Efficient Estimation and Optimal Instruments
  Problems

9 Simultaneous Equations Models
  9.1 The Scope of Simultaneous Equations Models
  9.2 Identification in a Linear System
    9.2.1 Exclusion Restrictions and Reduced Forms
    9.2.2 General Linear Restrictions and Structural Equations
    9.2.3 Unidentified, Just Identified, and Overidentified Equations
  9.3 Estimation after Identification
    9.3.1 The Robustness-Efficiency Trade-off
    9.3.2 When Are 2SLS and 3SLS Equivalent?
    9.3.3 Estimating the Reduced Form Parameters
  9.4 Additional Topics in Linear SEMs

    9.4.1 Using Cross Equation Restrictions to Achieve Identification
    9.4.2 Using Covariance Restrictions to Achieve Identification
    9.4.3 Subtleties Concerning Identification and Efficiency in Linear Systems
  9.5 SEMs Nonlinear in Endogenous Variables
    9.5.1 Identification
    9.5.2 Estimation
  9.6 Different Instruments for Different Equations
  Problems

10 Basic Linear Unobserved Effects Panel Data Models
  10.1 Motivation: The Omitted Variables Problem
  10.2 Assumptions about the Unobserved Effects and Explanatory Variables
    10.2.1 Random or Fixed Effects?
    10.2.2 Strict Exogeneity Assumptions on the Explanatory Variables
    10.2.3 Some Examples of Unobserved Effects Panel Data Models
  10.3 Estimating Unobserved Effects Models by Pooled OLS
  10.4 Random Effects Methods
    10.4.1 Estimation and Inference under the Basic Random Effects Assumptions
    10.4.2 Robust Variance Matrix Estimator
    10.4.3 A General FGLS Analysis
    10.4.4 Testing for the Presence of an Unobserved Effect
  10.5 Fixed Effects Methods
    10.5.1 Consistency of the Fixed Effects Estimator
    10.5.2 Asymptotic Inference with Fixed Effects
    10.5.3 The Dummy Variable Regression
    10.5.4 Serial Correlation and the Robust Variance Matrix Estimator
    10.5.5 Fixed Effects GLS
    10.5.6 Using Fixed Effects Estimation for Policy Analysis
  10.6 First Differencing Methods
    10.6.1 Inference
    10.6.2 Robust Variance Matrix

    10.6.3 Testing for Serial Correlation
    10.6.4 Policy Analysis Using First Differencing
  10.7 Comparison of Estimators
    10.7.1 Fixed Effects versus First Differencing
    10.7.2 The Relationship between the Random Effects and Fixed Effects Estimators
    10.7.3 The Hausman Test Comparing the RE and FE Estimators
  Problems

11 More Topics in Linear Unobserved Effects Models
  11.1 Unobserved Effects Models without the Strict Exogeneity Assumption
    11.1.1 Models under Sequential Moment Restrictions
    11.1.2 Models with Strictly and Sequentially Exogenous Explanatory Variables
    11.1.3 Models with Contemporaneous Correlation between Some Explanatory Variables and the Idiosyncratic Error
    11.1.4 Summary of Models without Strictly Exogenous Explanatory Variables
  11.2 Models with Individual-Specific Slopes
    11.2.1 A Random Trend Model
    11.2.2 General Models with Individual-Specific Slopes
  11.3 GMM Approaches to Linear Unobserved Effects Models
    11.3.1 Equivalence between 3SLS and Standard Panel Data Estimators
    11.3.2 Chamberlain’s Approach to Unobserved Effects Models
  11.4 Hausman and Taylor-Type Models
  11.5 Applying Panel Data Methods to Matched Pairs and Cluster Samples
  Problems

III GENERAL APPROACHES TO NONLINEAR ESTIMATION

12 M-Estimation
  12.1 Introduction
  12.2 Identification, Uniform Convergence, and Consistency
  12.3 Asymptotic Normality

  12.4 Two-Step M-Estimators
    12.4.1 Consistency
    12.4.2 Asymptotic Normality
  12.5 Estimating the Asymptotic Variance
    12.5.1 Estimation without Nuisance Parameters
    12.5.2 Adjustments for Two-Step Estimation
  12.6 Hypothesis Testing
    12.6.1 Wald Tests
    12.6.2 Score (or Lagrange Multiplier) Tests
    12.6.3 Tests Based on the Change in the Objective Function
    12.6.4 Behavior of the Statistics under Alternatives
  12.7 Optimization Methods
    12.7.1 The Newton-Raphson Method
    12.7.2 The Berndt, Hall, Hall, and Hausman Algorithm
    12.7.3 The Generalized Gauss-Newton Method
    12.7.4 Concentrating Parameters out of the Objective Function
  12.8 Simulation and Resampling Methods
    12.8.1 Monte Carlo Simulation
    12.8.2 Bootstrapping
  Problems

13 Maximum Likelihood Methods
  13.1 Introduction
  13.2 Preliminaries and Examples
  13.3 General Framework for Conditional MLE
  13.4 Consistency of Conditional MLE
  13.5 Asymptotic Normality and Asymptotic Variance Estimation
    13.5.1 Asymptotic Normality
    13.5.2 Estimating the Asymptotic Variance
  13.6 Hypothesis Testing
  13.7 Specification Testing
  13.8 Partial Likelihood Methods for Panel Data and Cluster Samples
    13.8.1 Setup for Panel Data
    13.8.2 Asymptotic Inference
    13.8.3 Inference with Dynamically Complete Models
    13.8.4 Inference under Cluster Sampling

  13.9 Panel Data Models with Unobserved Effects
    13.9.1 Models with Strictly Exogenous Explanatory Variables
    13.9.2 Models with Lagged Dependent Variables
  13.10 Two-Step MLE
  Problems
  Appendix 13A

14 Generalized Method of Moments and Minimum Distance Estimation
  14.1 Asymptotic Properties of GMM
  14.2 Estimation under Orthogonality Conditions
  14.3 Systems of Nonlinear Equations
  14.4 Panel Data Applications
  14.5 Efficient Estimation
    14.5.1 A General Efficiency Framework
    14.5.2 Efficiency of MLE
    14.5.3 Efficient Choice of Instruments under Conditional Moment Restrictions
  14.6 Classical Minimum Distance Estimation
  Problems
  Appendix 14A

IV NONLINEAR MODELS AND RELATED TOPICS

15 Discrete Response Models
  15.1 Introduction
  15.2 The Linear Probability Model for Binary Response
  15.3 Index Models for Binary Response: Probit and Logit
  15.4 Maximum Likelihood Estimation of Binary Response Index Models
  15.5 Testing in Binary Response Index Models
    15.5.1 Testing Multiple Exclusion Restrictions
    15.5.2 Testing Nonlinear Hypotheses about β
    15.5.3 Tests against More General Alternatives
  15.6 Reporting the Results for Probit and Logit
  15.7 Specification Issues in Binary Response Models
    15.7.1 Neglected Heterogeneity
    15.7.2 Continuous Endogenous Explanatory Variables

    15.7.3 A Binary Endogenous Explanatory Variable
    15.7.4 Heteroskedasticity and Nonnormality in the Latent Variable Model
    15.7.5 Estimation under Weaker Assumptions
  15.8 Binary Response Models for Panel Data and Cluster Samples
    15.8.1 Pooled Probit and Logit
    15.8.2 Unobserved Effects Probit Models under Strict Exogeneity
    15.8.3 Unobserved Effects Logit Models under Strict Exogeneity
    15.8.4 Dynamic Unobserved Effects Models
    15.8.5 Semiparametric Approaches
    15.8.6 Cluster Samples
  15.9 Multinomial Response Models
    15.9.1 Multinomial Logit
    15.9.2 Probabilistic Choice Models
  15.10 Ordered Response Models
    15.10.1 Ordered Logit and Ordered Probit
    15.10.2 Applying Ordered Probit to Interval-Coded Data
  Problems

16 Corner Solution Outcomes and Censored Regression Models
  16.1 Introduction and Motivation
  16.2 Derivations of Expected Values
  16.3 Inconsistency of OLS
  16.4 Estimation and Inference with Censored Tobit
  16.5 Reporting the Results
  16.6 Specification Issues in Tobit Models
    16.6.1 Neglected Heterogeneity
    16.6.2 Endogenous Explanatory Variables
    16.6.3 Heteroskedasticity and Nonnormality in the Latent Variable Model
    16.6.4 Estimation under Conditional Median Restrictions
  16.7 Some Alternatives to Censored Tobit for Corner Solution Outcomes
  16.8 Applying Censored Regression to Panel Data and Cluster Samples
    16.8.1 Pooled Tobit
    16.8.2 Unobserved Effects Tobit Models under Strict Exogeneity

    16.8.3 Dynamic Unobserved Effects Tobit Models
  Problems

17 Sample Selection, Attrition, and Stratified Sampling
  17.1 Introduction
  17.2 When Can Sample Selection Be Ignored?
    17.2.1 Linear Models: OLS and 2SLS
    17.2.2 Nonlinear Models
  17.3 Selection on the Basis of the Response Variable: Truncated Regression
  17.4 A Probit Selection Equation
    17.4.1 Exogenous Explanatory Variables
    17.4.2 Endogenous Explanatory Variables
    17.4.3 Binary Response Model with Sample Selection
  17.5 A Tobit Selection Equation
    17.5.1 Exogenous Explanatory Variables
    17.5.2 Endogenous Explanatory Variables
  17.6 Estimating Structural Tobit Equations with Sample Selection
  17.7 Sample Selection and Attrition in Linear Panel Data Models
    17.7.1 Fixed Effects Estimation with Unbalanced Panels
    17.7.2 Testing and Correcting for Sample Selection Bias
    17.7.3 Attrition
  17.8 Stratified Sampling
    17.8.1 Standard Stratified Sampling and Variable Probability Sampling
    17.8.2 Weighted Estimators to Account for Stratification
    17.8.3 Stratification Based on Exogenous Variables
  Problems

18 Estimating Average Treatment Effects
  18.1 Introduction
  18.2 A Counterfactual Setting and the Self-Selection Problem
  18.3 Methods Assuming Ignorability of Treatment
    18.3.1 Regression Methods
    18.3.2 Methods Based on the Propensity Score
  18.4 Instrumental Variables Methods
    18.4.1 Estimating the ATE Using IV

    18.4.2 Estimating the Local Average Treatment Effect by IV
  18.5 Further Issues
    18.5.1 Special Considerations for Binary and Corner Solution Responses
    18.5.2 Panel Data
    18.5.3 Nonbinary Treatments
    18.5.4 Multiple Treatments
  Problems

19 Count Data and Related Models
  19.1 Why Count Data Models?
  19.2 Poisson Regression Models with Cross Section Data
    19.2.1 Assumptions Used for Poisson Regression
    19.2.2 Consistency of the Poisson QMLE
    19.2.3 Asymptotic Normality of the Poisson QMLE
    19.2.4 Hypothesis Testing
    19.2.5 Specification Testing
  19.3 Other Count Data Regression Models
    19.3.1 Negative Binomial Regression Models
    19.3.2 Binomial Regression Models
  19.4 Other QMLEs in the Linear Exponential Family
    19.4.1 Exponential Regression Models
    19.4.2 Fractional Logit Regression
  19.5 Endogeneity and Sample Selection with an Exponential Regression Function
    19.5.1 Endogeneity
    19.5.2 Sample Selection
  19.6 Panel Data Methods
    19.6.1 Pooled QMLE
    19.6.2 Specifying Models of Conditional Expectations with Unobserved Effects
    19.6.3 Random Effects Methods
    19.6.4 Fixed Effects Poisson Estimation
    19.6.5 Relaxing the Strict Exogeneity Assumption
  Problems

20 Duration Analysis
  20.1 Introduction
  20.2 Hazard Functions
    20.2.1 Hazard Functions without Covariates
    20.2.2 Hazard Functions Conditional on Time-Invariant Covariates
    20.2.3 Hazard Functions Conditional on Time-Varying Covariates
  20.3 Analysis of Single-Spell Data with Time-Invariant Covariates
    20.3.1 Flow Sampling
    20.3.2 Maximum Likelihood Estimation with Censored Flow Data
    20.3.3 Stock Sampling
    20.3.4 Unobserved Heterogeneity
  20.4 Analysis of Grouped Duration Data
    20.4.1 Time-Invariant Covariates
    20.4.2 Time-Varying Covariates
    20.4.3 Unobserved Heterogeneity
  20.5 Further Issues
    20.5.1 Cox’s Partial Likelihood Method for the Proportional Hazard Model
    20.5.2 Multiple-Spell Data
    20.5.3 Competing Risks Models
  Problems

References
Index

Acknowledgments

My interest in panel data econometrics began in earnest when I was an assistant
professor at MIT, after I attended a seminar by a graduate student, Leslie Papke,
who would later become my wife. Her empirical research using nonlinear panel data
methods piqued my interest and eventually led to my research on estimating
nonlinear panel data models without distributional assumptions. I dedicate this
text to Leslie.
My former colleagues at MIT, particularly Jerry Hausman, Daniel McFadden,
Whitney Newey, Danny Quah, and Thomas Stoker, played significant roles in
encouraging my interest in cross section and panel data econometrics. I also have
learned much about the modern approach to panel data econometrics from Gary
Chamberlain of Harvard University.
I cannot discount the excellent training I received from Robert Engle, Clive
Granger, and especially Halbert White at the University of California at San Diego. I
hope they are not too disappointed that this book excludes time series econometrics.
I did not teach a course in cross section and panel data methods until I started
teaching at Michigan State. Fortunately, my colleague Peter Schmidt encouraged me
to teach the course at which this book is aimed. Peter also suggested that a text on
panel data methods that uses “vertical bars” would be a worthwhile contribution.
Several classes of students at Michigan State were subjected to this book in
manuscript form at various stages of development. I would like to thank these
students for their perseverance, helpful comments, and numerous corrections. I want
to specifically mention Scott Baier, Linda Bailey, Ali Berker, Yi-Yi Chen, William
Horrace, Robin Poston, Kyosti Pietola, Hailong Qian, Wendy Stock, and Andrew Toole.
Naturally, they are not responsible for any remaining errors.
I was fortunate to have several capable, conscientious reviewers for the manuscript.
Jason Abrevaya (University of Chicago), Joshua Angrist (MIT), David Drukker
(Stata Corporation), Brian McCall (University of Minnesota), James Ziliak
(University of Oregon), and three anonymous reviewers provided excellent suggestions,
many of which improved the book’s organization and coverage.
The people at MIT Press have been remarkably patient, and I have very much
enjoyed working with them. I owe a special debt to Terry Vaughn (now at Princeton
University Press) for initiating this project and then giving me the time to produce a
manuscript with which I felt comfortable. I am grateful to Jane McDonald and
Elizabeth Murry for reenergizing the project and for allowing me significant leeway
in crafting the final manuscript. Finally, Peggy Gordon and her crew at P. M. Gordon
Associates, Inc., did an expert job in editing the manuscript and in producing the
final text.

Preface

This book is intended primarily for use in a second-semester course in graduate
econometrics, after a first course at the level of Goldberger (1991) or Greene (1997).
Parts of the book can be used for special-topics courses, and it should serve as a
general reference.
My focus on cross section and panel data methods—in particular, what is often
dubbed microeconometrics—is novel, and it recognizes that, after coverage of the
basic linear model in a first-semester course, an increasingly popular approach is to
treat advanced cross section and panel data methods in one semester and time series
methods in a separate semester. This division reflects the current state of econometric
practice.
Modern empirical research that can be fitted into the classical linear model
paradigm is becoming increasingly rare. For instance, it is now widely recognized
that a student doing research in applied time series analysis cannot get very far by
ignoring recent advances in estimation and testing in models with trending and
strongly dependent processes. This theory takes a very different direction from the
classical linear model than does cross section or panel data analysis. Hamilton’s
(1994) time series text demonstrates this difference unequivocally.
Books intended to cover an econometric sequence of a year or more, beginning
with the classical linear model, tend to treat advanced topics in cross section and
panel data analysis as direct applications or minor extensions of the classical linear
model (if they are treated at all). Such treatment needlessly limits the scope of
applications and can result in poor econometric practice. The focus in such books on
the algebra and geometry of econometrics is appropriate for a first-semester course,
but it results in oversimplification or sloppiness in stating assumptions. Approaches
to estimation that are acceptable under the fixed regressor paradigm so prominent in
the classical linear model can lead one badly astray under practically important
departures from the fixed regressor assumption.
Books on “advanced” econometrics tend to be high-level treatments that focus on
general approaches to estimation, thereby attempting to cover all data configurations—
including cross section, panel data, and time series—in one framework, without giving
special attention to any. A hallmark of such books is that detailed regularity
conditions are treated on par with the practically more important assumptions that
have economic content. This is a burden for students learning about cross section and
panel data methods, especially those who are empirically oriented: definitions and
limit theorems about dependent processes need to be included among the regularity
conditions in order to cover time series applications.
In this book I have attempted to find a middle ground between more traditional
approaches and the more recent, very unified approaches. I present each model and
method with a careful discussion of assumptions of the underlying population model.
These assumptions, couched in terms of correlations, conditional expectations,
conditional variances and covariances, or conditional distributions, usually can be
given behavioral content. Except for the three more technical chapters in Part III,
regularity conditions—for example, the existence of moments needed to ensure that the
central limit theorem holds—are not discussed explicitly, as these have little bearing
on applied work. This approach makes the assumptions relatively easy to understand,
while at the same time emphasizing that assumptions concerning the underlying
population and the method of sampling need to be carefully considered in applying any
econometric method.
A unifying theme in this book is the analogy approach to estimation, as exposited
by Goldberger (1991) and Manski (1988). [For nonlinear estimation methods with
cross section data, Manski (1988) covers several of the topics included here in a more
compact format.] Loosely, the analogy principle states that an estimator is chosen to
solve the sample counterpart of a problem solved by the population parameter. The
analogy approach is complemented nicely by asymptotic analysis, and that is the focus
here.
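
To make the analogy principle concrete, here is a minimal sketch in Python. It is an illustration added at this point, not an example from the text. It assumes a hypothetical no-intercept population model y = βx + u with E(xu) = 0, so that β solves the population moment condition E[x(y − xβ)] = 0, that is, β = E(xy)/E(x²). The analogy estimator replaces each population expectation with a sample average. The function names `analog_slope` and `simulate`, the true value β = 2, and the normal draws are all assumptions chosen for the illustration.

```python
import random


def analog_slope(xs, ys):
    """Sample counterpart of beta = E[x*y] / E[x^2] (no-intercept model)."""
    n = len(xs)
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n  # sample analog of E[x*y]
    mean_xx = sum(x * x for x in xs) / n              # sample analog of E[x^2]
    return mean_xy / mean_xx


def simulate(n, beta=2.0, seed=0):
    """Draw a hypothetical random sample from y = beta*x + u with E(x*u) = 0."""
    rng = random.Random(seed)
    xs = [rng.gauss(1.0, 1.0) for _ in range(n)]
    ys = [beta * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


if __name__ == "__main__":
    # Compare the estimate in a small and a large sample.
    for n in (50, 5000):
        xs, ys = simulate(n)
        print(n, round(analog_slope(xs, ys), 3))
```

With a larger sample the estimate should settle near the assumed β, which is the kind of consistency property that asymptotic analysis formalizes.
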
By focusing on asymptotic properties I do not mean to imply that small-sample
properties of estimators and test statistics are unimportant. However, one typically
first applies the analogy principle to devise a sensible estimator and then derives its
asymptotic properties. This approach serves as a relatively simple guide to doing
inference, and it works well in large samples (and often in samples that are not so
large). Small-sample adjustments may improve performance, but such considerations
almost always come after a large-sample analysis and are often done on a
case-by-case basis.
The book contains proofs or outlines the proofs of many assertions, focusing on the
role played by the assumptions with economic content while downplaying or ignoring
regularity conditions. The book is primarily written to give applied researchers a very
firm understanding of why certain methods work and to give students the background
for developing new methods. But many of the arguments used throughout the book
are representative of those made in modern econometric research (sometimes without
the technical details). Students interested in doing research in cross section or panel
data methodology will find much here that is not available in other graduate texts.
I have also included several empirical examples with included data sets. Most of
the data sets come from published work or are intended to mimic data sets used in
modern empirical analysis. To save space I illustrate only the most commonly used
methods on the most common data structures. Not surprisingly, these overlap
considerably with methods that are packaged in econometric software programs. Other
examples are of models where, given access to the appropriate data set, one could
undertake an empirical analysis.
The numerous end-of-chapter problems are an important component of the book.
Some problems contain important points that are not fully described in the text;
others cover new ideas that can be analyzed using the tools presented in the current
and previous chapters. Several of the problems require using the data sets that are
included with the book.
As with any book, the topics here are selective and reflect what I believe to be the
methods needed most often by applied researchers. I also give coverage to topics that
have recently become important but are not adequately treated in other texts. Part I
of the book reviews some tools that are elusive in mainstream econometrics books—
in particular, the notion of conditional expectations, linear projections, and various
convergence results. Part II begins by applying these tools to the analysis of
single-equation linear models using cross section data. In principle, much of this
material should be review for students having taken a first-semester course. But
starting with single-equation linear models provides a bridge from the classical
analysis of linear models to a more modern treatment, and it is the simplest vehicle
to illustrate the application of the tools in Part I. In addition, several methods that
are used often in applications—but rarely covered adequately in texts—can be covered
in a single framework.
I approach estimation of linear systems of equations with endogenous variables
from a different perspective than traditional treatments. Rather than begin with
simultaneous equations models, we study estimation of a general linear system by
instrumental variables. This approach allows us to later apply these results to models
with the same statistical structure as simultaneous equations models, including
panel data models. Importantly, we can study the generalized method of moments
estimator from the beginning and easily relate it to the more traditional three-stage
least squares estimator.
The analysis of general estimation methods for nonlinear models in Part III begins
with a general treatment of asymptotic theory of estimators obtained from
nonlinear optimization problems. Maximum likelihood, partial maximum likelihood,
and generalized method of moments estimation are shown to be generally applicable
estimation approaches. The method of nonlinear least squares is also covered as a
method for estimating models of conditional means.
Part IV covers several nonlinear models used by modern applied researchers.
Chapters 15 and 16 treat limited dependent variable models, with attention given to
handling certain endogeneity problems in such models. Panel data methods for binary
response and censored variables, including some new estimation approaches, are also
covered in these chapters.
`Chapter 17 contains a treatment of sample selection problems for both cross sec-
`tion and panel data, including some recent advances. The focus is on the case where
`the population model is linear, but some results are given for nonlinear models as
`well. Attrition in panel data models is also covered, as are methods for dealing with
`stratified samples. Recent approaches to estimating average treatment e¤ects are
`treated in Chapter 18.
`Poisson and related regression models, both for cross section and panel data, are
`treated in Chapter 19. These rely heavily on the method of quasi-maximum likeli-
`hood estimation. A brief but modern treatment of duration models is provided in
`Chapter 20.
`I have given short shrift to some important, albeit more advanced, topics. The
`setting here is, at least in modern parlance, essentially parametric. I have not included
`detailed treatment of recent advances in semiparametric or nonparametric analysis.
`In many cases these topics are not conceptually difficult. In fact, many semiparametric
`methods focus primarily on estimating a finite dimensional parameter in the presence
`of an infinite dimensional nuisance parameter—a feature shared by traditional par-
`ametric methods, such as nonlinear least squares and partial maximum likelihood.
`It is estimating infinite dimensional parameters that is conceptually and technically
`challenging.
`At the appropriate point, in lieu of treating semiparametric and nonparametric
`methods, I mention when such extensions are possible, and I provide references. A
`benefit of a modern approach to parametric models is that it provides a seamless
`transition to semiparametric and nonparametric methods. General surveys of semi-
`parametric and nonparametric methods are available in Volume 4 of the Handbook
`of Econometrics—see Powell (1994) and Härdle and Linton (1994)—as well as in
`Volume 11 of the Handbook of Statistics—see Horowitz (1993) and Ullah and Vinod
`(1993).
`I only briefly treat simulation-based methods of estimation and inference. Computer
`simulations can be used to estimate complicated nonlinear models when traditional
`optimization methods are ineffective. The bootstrap method of inference and
`confidence interval construction can improve on asymptotic analysis. Volume 4 of
`the Handbook of Econometrics and Volume 11 of the Handbook of Statistics contain
`nice surveys of these topics (Hajivassiliou and Ruud, 1994; Hall, 1994; Hajivassiliou,
`1993; and Keane, 1993).
`On an organizational note, I refer to sections throughout the book first by chapter
`number followed by section number and, sometimes, subsection number. Therefore,
`Section 6.3 refers to Section 3 in Chapter 6, and Section 13.8.3 refers to Subsection 3
`of Section 8 in Chapter 13. By always including the chapter number, I hope to
`minimize confusion.
`
`Possible Course Outlines
`
`If all chapters in the book are covered in detail, there is enough mate
