We specify and estimate a diffusion model for the new molecule omeprazole into the anti-ulcer
drug market. Our model is based on a Bayesian learning process whereby doctors update their
beliefs about omeprazole’s quality relative to existing drugs after observing its effects on the
patients who have been prescribed this drug. The model also accommodates informational spillovers
and heterogeneity in informativeness across patients with different diagnoses. We obtain estimates
of the learning process parameters using a novel panel data set tracking doctors’ complete
prescription histories over a 3-year period.
`© 2003 Elsevier B.V. All rights reserved.
JEL classification: I10; L10
`Keywords: Entry of innovative drugs; Barriers to entry; Structural diffusion models
`1. Introduction
First-mover advantage is a well-documented phenomenon in many differentiated
product markets (see Urban et al. (1986) for a survey of the evidence). Economists have
tended to attribute this phenomenon to lack of information among consumers about the
quality or attributes of an entrant’s product; for example, Shapiro (1982, p. 7) states that
...the fundamental source of the entry barrier is an information one: consumers
have better information about established brands than about new ones [...] information
is the basic barrier to be overcome by a new product...
* Corresponding author. Tel: +1-410-516-8828; fax: +1-410-516-7600.
E-mail addresses: (A. Coscelli), (M. Shum).
0304-4076/$ - see front matter © 2003 Elsevier B.V. All rights reserved.
`A. Coscelli, M. Shum / Journal of Econometrics 122 (2004) 213 – 246
`The doctor/patient relationship is fraught with uncertainty. Doctors have incomplete
`information on the medical condition of a patient, and which treatment is best for the
`patient. Doctors learn about the quality of alternative treatments both through direct
`experience (actual prescriptions of the new drug), and indirect experience (such as
`promotional activity by pharmaceutical companies, articles in medical journals and
`attendance at medical conferences). This paper focuses on direct information, which
`accumulates slowly, and is confounded by heterogeneity across diagnoses: what works
`for diagnosis X may not work as well for diagnosis Y.
Using a novel panel data set of complete prescription histories for a sample of
doctors in the Rome (Italy) metropolitan area, we study the diffusion process of a
new anti-ulcer drug (omeprazole) during a 3-year period (1990–1992). The evolution
of omeprazole’s market share over time was marked by the gradual diffusion
which characterizes new product entry into many product markets: omeprazole’s market
share (as a proportion of total prescriptions) climbed from under 5% in the latter
half of 1990 to about 15% in early 1992, and eventually up to 25% by the middle
of 1995.
In this paper, we gauge how well this gradual diffusion pattern can be explained by
a learning model in which doctors, initially uncertain about the quality differential
between omeprazole and the incumbent drugs, update their beliefs about this differential
after observing noisy signals from patients to whom they have prescribed omeprazole.
To that end, we specify and estimate the parameters of such a learning model.
Furthermore, in order to accommodate features specific to the pharmaceutical prescription
process, we extend the basic learning model to allow for spillovers across all the
patients of a given doctor, as well as heterogeneity in informativeness across patients.
While there are alternative explanations for the individual-level diffusion process (such
as the publication of the results of post-marketing clinical trials in medical journals),
we focus on a learning explanation because our data includes especially rich detail on
doctors’ prescription histories.
Our results suggest that the learning model does very well in generating the observed
slow diffusion path of omeprazole in the Italian market. The parameters of
the learning model quantify, in informational terms, the disadvantage that omeprazole
suffered relative to the existing drugs upon its entry into the Italian anti-ulcer market.
This informational disadvantage can arise from either doctors’ initial pessimism
about omeprazole’s quality, or risk aversion. In addition, we find that the informational
spillovers are negative across some diagnosis groups, which tends to retard
the speed of learning. That is, we find that a positive outcome when prescribing
omeprazole for certain diagnoses leads doctors to regard it as less attractive for other
diagnoses.
The next section provides some background on the international and Italian anti-ulcer
drug markets. Section 3 describes the doctor-level learning model, Section 4 describes
our panel data set of complete prescription histories and Section 5 derives the estimating
equations associated with the learning model. Results from several specifications of the
learning model are presented and interpreted in Section 6, and we conclude in the last
section.


`2. Background
Several studies have documented the existence and, more importantly, the nature of
barriers to entry into pharmaceutical markets. Bond and Lean (1977) found evidence of
substantial pioneer advantage, but they also found that products containing some
therapeutic novelty managed to gain large market shares when backed by heavy promotional
campaigns. Berndt et al. (1997) document similar effects in the anti-ulcer drug market.
Their findings clearly show that technological advances do not necessarily translate into
large market shares without tremendous marketing muscle. 1 As striking as the results
from the two studies are, however, they never explain the causes of pioneer advantage.
The availability of doctor-level prescription histories allows us a unique opportunity
to assess the role of information in explaining the diffusion patterns observed in many
product markets. 2
This paper joins a growing empirical literature examining behavioral explanations for
diffusion patterns for new products in experience good markets. Among these studies,
Ackerberg (2002) and Erdem and Keane (1996) estimated structural learning models
to explain consumers’ purchase patterns for, respectively, yogurt and laundry detergent.
Ching (2000) has also estimated a demand model for pharmaceuticals based on
a Bayesian learning procedure. Our work differs from these papers because we consider
a more general learning model which allows for spillovers across all the patients
of a given doctor, as well as heterogeneity in informativeness across patients. These
extensions seem especially appropriate for pharmaceutical markets, since prescription
drugs (and in particular anti-ulcer drugs) are usually prescribed for several different
diagnoses.
Using aggregate market share data, Azoulay et al. (2003) estimate a diffusion model
to study the importance of consumption externalities in explaining the diffusion patterns
of H2-antagonist drugs into the anti-ulcer drug market. Our analysis extends their work
by using a novel micro-data set to quantify the extent of network-type spillovers across
patients belonging to the same doctor. 3
`3. The learning model
`In this section, we describe the behavioral model which forms the basis of our
`empirical analysis. In what follows, we index doctors by the subscript i, and assume
`that patients are heterogeneous in their diagnoses, which we subscript by j. We begin
1 Using a similar data set, Azoulay (2002) investigates how promotional activity and scientific
information arising from clinical trials affect the diffusion of competing molecules in the anti-ulcer drug market.
King (2000) focuses on the role of marketing in increasing the perceived product differentiation (i.e., degree
of substitutability) between competing anti-ulcer drugs.
2 A related literature (Stern, 1996; Ellison et al., 1997) has investigated the extent of competition in
pharmaceutical markets by estimating cross-price elasticities between the competing drugs in a market.
Unlike these papers, we abstract away from competition between existing anti-ulcer drugs.
3 Finally, there has been a long interest in diffusion models in the marketing literature. See Bass et al.
(1990) for a review of this largely theoretical and macro-level empirical literature. Chandrashekaran and
Sinha (1995) is one of the few papers in this literature which are formulated at the micro-level.


`by describing a baseline version of the learning model in which doctors are assumed
`to be risk neutral. At the end of this section, we discuss an alternative model which
`allows for risk aversion.
`Consider a given patient k, from diagnosis group j, who visits doctor i during period
`t. We assume that doctor i distinguishes between two treatment alternatives: the new
`molecule, omeprazole (alternative 1), and any of the other molecules (alternative 0).
The utilities for a given patient k with diagnosis j during period t from each alternative
are
\[ U^i_{1jkt} = \alpha^*_1 x_i + \beta p_{1t} + \gamma^*_1(t) + \delta^*_{1j} + \varepsilon^i_{1jkt} \quad \text{if omeprazole is taken;} \tag{3.1} \]
\[ U^i_{0jkt} = \alpha^*_0 x_i + \beta p_{0t} + \gamma^*_0(t) + \delta^*_{0j} + \varepsilon^i_{0jkt} \quad \text{otherwise,} \tag{3.2} \]
where
• $p_{1t}$ and $p_{0t}$ are, respectively, the price of omeprazole and a weighted average of the
prices of the incumbent drugs weighted by their market shares at time t. The vector
$x_i$ contains observed doctors’ characteristics.
• $\delta^*_{1j}$ and $\delta^*_{0j}$ parameterize the “unobserved quality” of omeprazole and the incumbent
drugs when treating diagnosis j. These are unobserved by the econometrician.
Doctors, however, are presumed to know $\delta^*_{0j}$, and have imperfect information about
$\delta^*_{1j}$. As described below, doctors learn about $\delta^*_{1j}$ by prescribing omeprazole to their
patients.
• $\gamma^*_1(t)$ and $\gamma^*_0(t)$ are flexible functions of time, which parameterize period t factors
which affect the attractiveness of, respectively, omeprazole and the incumbent drugs.
These are the same over all doctors, patients, and diagnoses. In particular, the function
$\gamma^*_1(t)$ proxies for aspects of the learning process which we do not explicitly
model, such as word of mouth, medical congresses, and articles in medical journals.
• $\varepsilon^i_{1jkt}$ and $\varepsilon^i_{0jkt}$ are i.i.d. (over doctors, patients, diagnoses, and time periods) shocks
associated with, respectively, omeprazole and the incumbent drugs. They are observed
by the doctors, but not by the econometrician.
`Throughout, we abstract away from agency problems between the doctor and the
`patient, and assume the doctor maximizes the patient’s utility from the prescription. 4
`Doctor i chooses the option with the higher per-period utility. 5 The choice rule for
4 The reputation effects resulting from the long-term nature of many patient–doctor relationships in Italy
(the National Health Service requires each enrollee to list a general practitioner) tend to minimize the
divergence between doctors’ and patients’ objective functions which potentially form the basis of agency
problems.
`5 For computational tractability we have assumed that doctors are myopic in our model, so that in any
`given time period, a doctor chooses the molecule with the highest per-period utility based solely on her
`current information. If the doctor were forward-looking, she would choose the molecule with the highest
`present discounted utility and thereby take into account the information that she would gain about omeprazole
`by prescribing it this period. Ongoing work by Crawford and Shum (2000) examines issues of uncertainty
`and matching in pharmaceutical demand in a fully forward-looking framework. Ferreyra (1999) has recently
`estimated a forward-looking dynamic learning model, using the same data that we use in this paper, but
`without allowing for spillovers across patients.


the doctor is to prescribe omeprazole if $E_t(U^i_{1jkt}) > U^i_{0jkt}$. If we assume that $\varepsilon^i_{1jkt}$ and
$\varepsilon^i_{0jkt}$ are i.i.d. with the type 1 extreme value distribution, the probability that doctor i
prescribes omeprazole takes the familiar logit form: 6
\[ \text{Prob(prescribe omeprazole)} = \frac{\exp(\alpha x_i + \beta \Delta p_t + \gamma(t) + E_t\delta_j)}{1 + \exp(\alpha x_i + \beta \Delta p_t + \gamma(t) + E_t\delta_j)}, \tag{3.3} \]
where we have substituted $\alpha \equiv \alpha^*_1 - \alpha^*_0$; $\Delta p_t \equiv p_{1t} - p_{0t}$; $\gamma(t) \equiv \gamma^*_1(t) - \gamma^*_0(t)$; and
$E_t\delta_j \equiv E_t\delta^*_{1j} - \delta^*_{0j}$. The parameters $\alpha$, $\beta$, and the $\gamma(t)$ function are to be estimated. 7
By distinguishing between different diagnoses, we allow the entrant and incumbent
anti-ulcer drugs to differ in their effectiveness and suitability across diagnoses. This
accommodates “segmentation” or “horizontal differentiation” in the market on the basis
of diagnosis, which we believe to be an important feature of the anti-ulcer drug market.
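The logit form in Eq. (3.3) can be illustrated with a minimal Python sketch. The function name, argument names, and all numerical values below are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import math


def prescribe_prob(alpha_x, beta, dprice, gamma_t, exp_quality):
    """Logit probability of prescribing omeprazole, following Eq. (3.3).

    alpha_x     : value of alpha * x_i (index of doctor characteristics)
    beta        : price coefficient
    dprice      : price differential between omeprazole and the incumbents
    gamma_t     : value of the time function gamma(t)
    exp_quality : E_t(delta_j), the current mean belief about the quality differential
    """
    v = alpha_x + beta * dprice + gamma_t + exp_quality
    # Numerically stable evaluation of exp(v) / (1 + exp(v)).
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


# Hypothetical inputs: only the mean belief about quality differs.
p_low = prescribe_prob(0.0, -1.0, 0.5, 0.0, -1.0)
p_high = prescribe_prob(0.0, -1.0, 0.5, 0.0, 1.0)
```

Holding everything else fixed, a more favourable mean belief raises the prescription probability, which is the channel through which learning drives diffusion in the model.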
`3.1. Bayesian updating
The main focus of the paper is to measure how well the diffusion pattern for omeprazole
can be explained by doctors’ learning about $\delta$. We explain this learning process
in this section. Throughout, we assume that the learning processes are independent
across doctors. 8 Therefore, we describe the learning process for doctor i, omitting the
superscript i in most of the equations below for expositional clarity. We assume that, at
time t = 0 (i.e., at omeprazole’s entry), she (doctor i) has the following initial beliefs
about $\tilde\delta$, the J-dimensional vector of quality differentials between omeprazole and the
incumbent drugs:
\[ \tilde\delta \sim N\left( \tilde\delta_1 \equiv \begin{pmatrix} E_1\delta_1 \\ \vdots \\ E_1\delta_J \end{pmatrix},\ \Sigma_{\delta,1} \equiv \begin{pmatrix} \sigma^2_{\delta,1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma^2_{\delta,J} \end{pmatrix} \right). \tag{3.4} \]
Throughout, we adopt the indexing convention that the subscript t denotes the beginning
of period t; therefore, $\tilde\delta_1$ denotes the mean of doctors’ beliefs at the beginning
of period 1, corresponding to the mean of the doctors’ initial beliefs (and $\Sigma_{\delta,1}$ is similarly
the initial variance–covariance matrix). The assumption that the initial variance–covariance
matrix $\Sigma_{\delta,1}$ is diagonal implies that the information that doctors had about
6 By aggregating all the non-omeprazole-based drugs into one alternative, we are implicitly assuming that
all these drugs are perfectly substitutable, and that an omeprazole-based drug substitutes equally well with all
of them. We make this assumption because we want to focus on the diffusion of drugs based on omeprazole
into the marketplace.
7 In most of the specifications reported below, we assume that the time function $\gamma(t)$ is a quadratic time
trend. As we point out below, since the price differential $\Delta p_t$ only varies over time, it would be impossible
to separately identify the price coefficient $\beta$ apart from a full set of time dummies.
8 Informational spillovers across doctors (“word of mouth”) at the aggregate level are captured by the
time function $\gamma(t)$.


omeprazole before its entry is specific to particular diagnoses. This reflects the institutional
feature that clinical trials (the results of which constitute most of doctors’ prior
information) are generally most informative as to a drug’s effectiveness for particular
diagnoses, and less informative regarding interactions of effects for different diagnoses,
which would lead to non-zero off-diagonal terms in the initial variance–covariance
matrix. 9
The evolution of doctor i’s beliefs over time can be derived period-by-period. Assume
that doctor i begins time period t with beliefs that
\[ \tilde\delta \sim N\bigl(E_t\tilde\delta,\ \Sigma_{\delta,t} \equiv E_t\tilde\delta\tilde\delta' - (E_t\tilde\delta)(E_t\tilde\delta)'\bigr) \tag{3.5} \]
(it will be clear later how these beliefs arise). During period t, the doctor prescribes
omeprazole to $k_j$ of her patients with diagnosis j, and observes $k_j$ noisy signals of $\delta_j$.
We assume that these $k_j$ signals ($\nu_{jtk}$, $k = 1, \dots, k_j$) take the following form:
\[ \nu_{jtk} = \delta_j + \eta_{jtk}, \tag{3.6} \]
where $\eta_{jtk}$ is normally distributed, with zero mean. Doctors attempt to form estimates
of $\delta_j$ from observations of the noisy signals $\nu_{jtk}$.
Correlation structure: In order to accommodate informational spillovers across patients
in different diagnosis groups (i.e., to capture the idea that “what is good for
diagnosis X may not be good for diagnosis Y”), we assume that, within a given period
t, the noise terms $\eta$ are correlated across signals. We induce correlation across
signals with the following variance components structure for each $\eta$:
\[ \eta_{jtk} = \lambda_j \omega_t + \xi_{jtk}, \quad j = 1, \dots, J, \tag{3.7} \]
where (i) $\omega_t$ is distributed $N(0, \sigma^2_\omega)$, i.i.d. over t; (ii) the $\xi_{jtk}$’s are independent over j,
t, and k, and distributed $N(0, \sigma^2_{\xi j})$, $j = 1, \dots, J$; and (iii) $\lambda_1, \dots, \lambda_J$ are time-invariant
parameters. Given these assumptions, then, the following correlation structure among
all the signals ($\nu$’s) that doctor i observes in period t emerges:
1. $\mathrm{Var}(\eta_{jtk}) = \lambda_j^2 \sigma^2_\omega + \sigma^2_{\xi j}$,
2. $\mathrm{Cov}(\eta_{jtk}, \eta_{jtk'}) = \lambda_j^2 \sigma^2_\omega$, for $k \neq k'$,
3. $\mathrm{Cov}(\eta_{jtk}, \eta_{j'tk'}) = \lambda_j \lambda_{j'} \sigma^2_\omega$, for $j \neq j'$ and $\forall\, k, k'$.
This one-factor variance components specification reduces the number of parameters,
while placing mild restrictions on the correlation structure. 10 In Appendix A we calculate
the variance–covariance matrix of a vector of signals, for a simple example.
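The three covariance rules of the one-factor structure can be assembled into a matrix as in the following Python sketch. The function name, the 0-based diagnosis indices, and the numerical values are assumptions made for illustration; this is not the example worked out in Appendix A.

```python
def signal_noise_cov(diagnoses, lam, var_omega, var_xi):
    """Covariance matrix of the noise terms for one doctor in one period.

    diagnoses : list giving the diagnosis index j of each signal received
    lam       : lam[j] is the loading of diagnosis j on the common period factor
    var_omega : variance of the common period factor
    var_xi    : var_xi[j] is the idiosyncratic noise variance for diagnosis j
    """
    n = len(diagnoses)
    cov = [[0.0] * n for _ in range(n)]
    for a, ja in enumerate(diagnoses):
        for b, jb in enumerate(diagnoses):
            # Common-factor part: loading_j * loading_j' * var_omega (rules 2 and 3).
            cov[a][b] = lam[ja] * lam[jb] * var_omega
            if a == b:
                # The idiosyncratic variance appears only on the diagonal (rule 1).
                cov[a][b] += var_xi[ja]
    return cov


# Hypothetical example: two signals for diagnosis 0 and one for diagnosis 1,
# with loadings of opposite sign so that cross-diagnosis noise is negatively correlated.
cov = signal_noise_cov([0, 0, 1], lam=[1.0, -0.5], var_omega=2.0, var_xi=[1.0, 3.0])
```

Loadings of opposite sign across two diagnoses produce a negative covariance between their signals, which is how this structure can represent the idea that what is good for one diagnosis may not be good for another.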
Period-by-period updating: Given the normality assumptions on the signals $\nu$ as
well as on the $\delta$’s, a doctor’s posterior beliefs about $\tilde\delta$ given $\tilde\nu_t$ are described by a
normal distribution with a mean and variance that can be derived using the multivariate
normal conditional mean and variance formulas (Amemiya, 1985, p. 3). The computed
posterior distribution in period t serves as the prior distribution for period t + 1. In this
way, we derive the sequence of a doctor’s posterior distributions over all the periods
9 Specific clinical evidence on omeprazole’s effectiveness for different diagnoses is presented further below.
10 We have attempted to estimate an extended model with a 2-factor variance components structure, but
we have experienced problems identifying some of the parameters in that case.


by repeatedly applying the conditional mean and variance formulas for jointly normally
distributed random variables.
To this end, we characterize the joint distribution of $(\tilde\delta, \tilde\nu_t)$ during period t:
\[ \begin{pmatrix} \tilde\delta \\ \tilde\nu_t \end{pmatrix} \sim N\left( \begin{pmatrix} \tilde\delta_t \\ \tilde\delta_{\nu,t} \end{pmatrix},\ \begin{pmatrix} \Sigma_{\delta,t} & \Sigma'_{\delta,\nu,t} \\ \Sigma_{\delta,\nu,t} & \Sigma_{\nu,t} \end{pmatrix} \right), \tag{3.8} \]
where $\tilde\delta_t$ and $\Sigma_{\delta,t}$ are, respectively, the mean and variance–covariance matrix of $\tilde\delta$
conditional on all the signals received before period t. $\tilde\delta_{\nu,t}$ and $\Sigma_{\nu,t}$ are the mean and
variance–covariance matrix of the vector of signals $\tilde\nu_t$ (Eqs. (A.2) and (A.3) in the
appendix are examples of these formulas), and $\Sigma_{\delta,\nu,t}$ is the matrix of covariance terms
between $\tilde\delta$ and $\tilde\nu_t$ (which is easy to derive given Eqs. (A.1), (3.6) and (3.7)).
Recall our indexing convention, whereby $\tilde\delta_{t+1} \equiv E(\tilde\delta \mid \tilde\nu_t)$ and $\Sigma_{\delta,t+1} \equiv \mathrm{Var}(\tilde\delta \mid \tilde\nu_t)$
are, respectively, the prior mean vector and variance–covariance matrix of the quality
vector $\tilde\delta$ at the beginning of period t + 1 (i.e., conditional on all the information signals
obtained up to, and including, period t). For the learning model described above, and
given the initial beliefs (3.4), these quantities can be recursively defined as
\[ \begin{aligned} \tilde\delta_{t+1} &= \tilde\delta_t + \Sigma'_{\delta,\nu,t}\,\Sigma^{-1}_{\nu,t}\,(\tilde\nu_t - \tilde\delta_{\nu,t}), \\ \Sigma_{\delta,t+1} &= \Sigma_{\delta,t} - \Sigma'_{\delta,\nu,t}\,\Sigma^{-1}_{\nu,t}\,\Sigma_{\delta,\nu,t} \end{aligned} \tag{3.9} \]
for periods t = 0, 1, 2, ...
Eq. (3.9) yields the means of the posterior distribution of the $\delta$’s which are substituted
into the logit prescription probabilities (cf. Eq. (3.3)). These probabilities form
the basis for our likelihood function, which is described in the next section.
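One period of this updating can be sketched in Python as follows. This is an illustrative reformulation rather than the authors' implementation: instead of inverting the signal covariance matrix as in Eq. (3.9), the sketch augments the belief state with the period's common noise factor and conditions on one signal at a time, which yields the same posterior under the normality and one-factor assumptions stated above. All names and numbers are hypothetical.

```python
def update_beliefs(mean, cov, signals, lam, var_omega, var_xi):
    """One period of Bayesian updating of a doctor's beliefs about quality.

    mean, cov : prior mean (length J list) and covariance (J x J nested list)
    signals   : list of (j, value) pairs, one per omeprazole prescription this period
    lam, var_omega, var_xi : parameters of the one-factor noise structure
    Returns the posterior mean and covariance of the J quality differentials.
    """
    J = len(mean)
    # Augment the state with the period's common factor (prior mean 0, variance var_omega).
    z = list(mean) + [0.0]
    P = [list(row) + [0.0] for row in cov] + [[0.0] * J + [var_omega]]
    for j, value in signals:
        # Signal = quality_j + lam_j * common factor + idiosyncratic noise.
        h = [0.0] * (J + 1)
        h[j] = 1.0
        h[J] = lam[j]
        Ph = [sum(P[r][c] * h[c] for c in range(J + 1)) for r in range(J + 1)]
        s = sum(h[r] * Ph[r] for r in range(J + 1)) + var_xi[j]  # variance of this signal
        resid = value - sum(h[r] * z[r] for r in range(J + 1))   # signal minus its mean
        # Scalar version of the normal conditional mean and variance formulas.
        z = [z[r] + Ph[r] * resid / s for r in range(J + 1)]
        P = [[P[r][c] - Ph[r] * Ph[c] / s for c in range(J + 1)] for r in range(J + 1)]
    # Drop the common factor: beliefs about quality are the leading J x J block.
    return z[:J], [row[:J] for row in P[:J]]


# Hypothetical example: two diagnoses, one signal each, positively correlated noise.
post_mean, post_cov = update_beliefs(
    mean=[0.0, 0.0], cov=[[1.0, 0.0], [0.0, 1.0]],
    signals=[(0, 1.0), (1, 1.0)], lam=[1.0, 1.0], var_omega=1.0, var_xi=[1.0, 1.0],
)
```

In this example the posterior covariance matrix is no longer diagonal even though the prior was: signals for one diagnosis help the doctor separate common noise from quality in signals for the other diagnosis, so beliefs become linked across diagnoses.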
The parameters of the model which we estimate are: (i) the elements of the period
zero initial mean vector $(E_1\delta_1, \dots, E_1\delta_J)$; (ii) the diagonal elements of the initial
variance–covariance matrix $(\sigma^2_{\delta,1}, \dots, \sigma^2_{\delta,J})$; (iii) the true values $\delta_1, \dots, \delta_J$; (iv) the
parameters of the correlation structure $\lambda_1, \dots, \lambda_J$, $\sigma^2_{\xi 1}, \dots, \sigma^2_{\xi J}$, and $\sigma^2_\omega$; and (v) the
parameters which enter the utility specification, $\alpha$, $\beta$, and the time function $\gamma(t)$.
`3.2. Remarks
Rational expectations and risk aversion: In the preceding model, we have not
allowed for risk aversion in the utility function. We can accommodate risk aversion
directly in the utility specification of Eq. (3.1) above by including the posterior variance
directly as an argument in the expected utility expression. Hence, the probability
that doctor i prescribes omeprazole takes the form
\[ \frac{\exp(\alpha x_i + \beta \Delta p_t + \gamma(t) + E_t\delta_j + \rho\,\mathrm{Var}_t\,\delta_j)}{1 + \exp(\alpha x_i + \beta \Delta p_t + \gamma(t) + E_t\delta_j + \rho\,\mathrm{Var}_t\,\delta_j)}, \tag{3.10} \]
where $\mathrm{Var}_t\,\delta_j$ denotes doctor i’s posterior variance on $\delta_j$ based on the information she
has obtained from prescriptions prior to period t, and $\rho$ measures the degree of risk


aversion. 11 Using the notation presented above in Eq. (3.8), we can write $\mathrm{Var}_t\,\delta_j =
\Sigma_{\delta,t}(j,j)$, the $(j,j)$th element of the period t variance–covariance matrix $\Sigma_{\delta,t}$.
Without additional assumptions, we cannot separately identify the prior mean $E_1\delta_j$
and the risk coefficient $\rho$. To see this, consider the simplest case of only one diagnosis.
In this case, the sequence of prior means and variances is given by the well-known
formulas (cf. DeGroot, Optimal Statistical Decisions, p. 167), for periods t = 2, 3, ...:
\[ E_t\delta = \frac{\sigma^2_\nu \delta_1 + \sigma^2_1 \sum_{t'=1}^{r_{t-1}} \nu_{t'}}{\sigma^2_\nu + \sigma^2_1 r_{t-1}}, \qquad \mathrm{Var}_t\,\delta = \frac{\sigma^2_1 \sigma^2_\nu}{\sigma^2_\nu + \sigma^2_1 r_{t-1}}, \]
where $r_{t-1}$ denotes the number of prescriptions of omeprazole up to (and including)
period t − 1; $\sigma^2_\nu$ denotes the variance of the prescription signals, and $\delta_1$ and $\sigma^2_1$ denote
the initial mean and variance. By substituting these expressions into the expression for
the choice probability (in Eq. (3.10) above), we see that the mean and variance above
always enter the choice probability as the sum
\[ \begin{aligned} E_t\delta + \rho\,\mathrm{Var}_t\,\delta &= \frac{\sigma^2_\nu \delta_1 + \sigma^2_1 \sum_{t'=1}^{r_{t-1}} \nu_{t'}}{\sigma^2_\nu + \sigma^2_1 r_{t-1}} + \rho\,\frac{\sigma^2_1 \sigma^2_\nu}{\sigma^2_\nu + \sigma^2_1 r_{t-1}} \\ &= \frac{\sigma^2_\nu \delta_1 + \sigma^2_1 \sum_{t'=1}^{r_{t-1}} (\delta + \eta_{t'} \sigma_\nu)}{\sigma^2_\nu + \sigma^2_1 r_{t-1}} + \rho\,\frac{\sigma^2_1 \sigma^2_\nu}{\sigma^2_\nu + \sigma^2_1 r_{t-1}} \\ &= \frac{\sigma^2_\nu \left[ \delta_1 + \rho \sigma^2_1 \right] + \sigma^2_1 r_{t-1} \delta + \sigma^2_1 \sigma_\nu \sum_{t'=1}^{r_{t-1}} \eta_{t'}}{\sigma^2_\nu + \sigma^2_1 r_{t-1}}, \end{aligned} \tag{3.11} \]
where we have re-written the signals as $\nu_{t'} = \delta + \eta_{t'} \sigma_\nu$ with $\eta_{t'}$ as i.i.d. standard-normal
random variables.
In our learning model, the parameters $\sigma^2_\nu$ (the signal variance), $\sigma^2_1$ (the prior variance),
$\delta$ (the true quality), $\delta_1$ (the prior mean), and $\rho$ (the risk aversion parameter)
affect the likelihood function only through expression (3.11) above. Clearly, if one only
has cross-sectional data for the initial period t = 1, it is impossible to identify all these
parameters separately (in this case, $r_0 = 0$ across all doctors, and the above expression
reduces to the constant $\delta_1 + \rho\sigma^2_1$).
However, inspection of the above expression yields that variation in $r_t$ (the number
of omeprazole prescriptions) across periods t and across doctors should be sufficient
to identify $\sigma^2_\nu$, $\sigma^2_1$, $\delta$, and the sum $[\delta_1 + \rho\sigma^2_1]$, just due to the non-linear updating
formulas of the Gaussian learning model. Since the two remaining parameters $\delta_1$ and
$\rho$ only enter the above expression via the sum $[\delta_1 + \rho\sigma^2_1]$, they cannot be separately
identified (i.e., for any value of Z, the locus of pairs $(\delta_1, \rho = (Z - \delta_1)/\sigma^2_1)$ yields the
same likelihood function value).
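The non-identification argument for the one-diagnosis case can be checked numerically with a short Python sketch. The function name and all numbers are hypothetical; the sketch only evaluates the sum of posterior mean and risk-weighted posterior variance for two parameter pairs that share the same value of Z.

```python
def choice_index(prior_mean, rho, prior_var, signal_var, signals):
    """Posterior mean plus rho times posterior variance, one-diagnosis Gaussian model."""
    r = len(signals)                      # number of omeprazole prescriptions so far
    denom = signal_var + prior_var * r
    post_mean = (signal_var * prior_mean + prior_var * sum(signals)) / denom
    post_var = prior_var * signal_var / denom
    return post_mean + rho * post_var


# Hypothetical values. Both pairs satisfy prior_mean + rho * prior_var = Z.
prior_var, signal_var, Z = 2.0, 1.5, -0.4
signals = [0.3, -0.1, 0.8]
pair_a = (0.0, (Z - 0.0) / prior_var)     # no pessimism, risk aversion only
pair_b = (-1.0, (Z + 1.0) / prior_var)    # pessimistic prior, different rho

index_a = [choice_index(pair_a[0], pair_a[1], prior_var, signal_var, signals[:r])
           for r in range(len(signals) + 1)]
index_b = [choice_index(pair_b[0], pair_b[1], prior_var, signal_var, signals[:r])
           for r in range(len(signals) + 1)]
```

The two sequences coincide for every number of observed signals, so data on prescription choices alone cannot distinguish a pessimistic prior mean from risk aversion in this simple case; with no signals yet observed the index equals Z itself.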
11 With CARA utility, $\rho = \frac{1}{2} r$, where r is the coefficient of absolute risk aversion.


This discussion highlights the infeasibility of identifying the risk aversion parameter
$\rho$ separately from the prior means $E_1\delta_j$, j = 1, 2, 3, 4. Therefore, in our estimation, we
consider an additional restriction which rules out pessimism by setting the prior means
equal to the true qualities:
\[ E_1\delta_j = \delta_j, \quad j = 1, 2, 3, 4. \]
This is a rational expectations assumption which is standard in many learning models,
and which is based on the assumption that doctors’ beliefs should be right on average
about the true quality of omeprazole. However, even with rational expectations, doctors
still face uncertainty about its quality, as parameterized by the prior variances $\sigma^2_{\delta,j} \equiv
\mathrm{Var}_1\,\delta_j$, j = 1, ..., 4.
For the more complex multivariate learning model with varying numbers of signals
per diagnosis employed in this paper, the argument for non-identification is more difficult
because the expressions for the posterior mean and variance cannot be written in
the manner above, as a function of $r_t$, $\sum_{t'} \nu_{t'}$, and the estimated parameters. Hence, we
explore the separate identification of the $\rho$ and prior mean parameters by simulation,
and our findings are presented in Appendix C.
Related work with the same data set: In ongoing work, one of the authors is
using the same data set to estimate an explicitly dynamic (forward-looking) model of
learning (cf. Crawford and Shum, 2000). Since the real world prescription process is
much more complicated than either of these approaches, the models in the two papers
accommodate contrasting sets of simplifying assumptions to shed light on different
aspects of the learning problems which we expect to be important in pharmaceutical
markets. We discuss several important differences between the two papers here.
First, the empirical questions considered in the two papers are quite different. The
current paper addresses the question of new good entry, and focuses on considering
a micro-level explanation for omeprazole’s aggregate diffusion pattern. For this reason,
we assume here that agents are uncertain about the quality of only omeprazole, but not
the other drugs. The Crawford–Shum paper, on the other hand, addresses the issue of
patient–drug matches, and focuses on estimating a model which explains the observed
treatment lengths and “switches” of patients from one drug to another. Since matching
problems arise only when agents face uncertainty about the returns from a number of
competing choices, the Crawford–Shum model assumes that patients are ignorant of
the relative qualities of all the competing drugs, not just omeprazole. 12
`Second, the model considered in Crawford and Shum (2000) is fully dynamic, and
`features patients who choose drugs via a dynamic discrete-choice optimization prob-
`lem. Since the computational burden of such a model is quite severe, the information
`structure is kept quite simple, and no attempt is made to accommodate informational
`spillovers across patients at the doctor-level. In the current paper, however, we ac-
`commodate a more complicated information structure (including these spillovers) by
`abstracting away from the dynamic forward-looking aspect of the learning problem.
`12 The matching problem resembles the well-known “multi-armed bandit” problem in decision theory.


`4. Data
The data used in this analysis was collected by the Italian National Institute of Health.
It records, for a 10% sample 13 of the doctors in the metropolitan area of Rome, all
prescriptions of anti-ulcer drugs (therapeutic class A02B 14) to all their patients during
a 3-year period (1990–1992). A prescription episode is the unit of observation in the
data set. The data set contains more than 660,000 observations, each of which records
the identity of the patient, the prescribing doctor, the drug prescribed, and the year and
month of the prescription; 326 doctors and 174,000 patients are represented in the data.
The median number of prescriptions for the 326 doctors is around 2000 prescriptions
during the 3-year sample period: 10% of the in-sample doctors have fewer than 1300
prescriptions, while only 10% have more than 2800 prescriptions. Appendix B provides
more details on the data, in particular describing the covariates which we use when
estimating the learning model.
The anti-ulcer drug market: The anti-ulcer drug market is the largest therapeutic
drug market worldwide. It is naturally segmented, with preferred treatments differing
across segments depending on the severity of the diagnosis, as summarized in the first
three columns of Table 1.
The two most common diagnoses requiring treatment using anti-ulcer drugs are peptic
ulcers and Gastroesophageal Reflux Disease (GERD). A peptic ulcer is an area of
Table 1
Segmentation in the anti-ulcer drug market: diagnoses and treatments

Diagnosis                                           Treatment (a)   Prescription assignment rule (b)
(1) Minor                                           Drugs or no     Patient has ≤ 2 in-sample
(2) Pathological                                                    Q > 133% average monthly quantity for ulcer in every month
(3) Attack therapy for GERD or peptic ulcer                         Q > 133% average monthly quantity for ulcer in the month
(4) Maintenance therapy for GERD or peptic ulcer                    Q < 133% average monthly quantity for ulcer in the month

(a) Medical Economics Co., 1997.
(b) Prescription assigned to diagnoses by the authors using the daily dosage for the average patient requiring
an ulcer treatment as suggested by Medical Economics Co. (1997).
13 The doctors in the sample were not chosen following any sampling technique, since the only information
available was their “id” number.
14 This four digit code, the ATC code, is an international classification s

