Biometrika (1965), 52, 3 and 4, p. 591
With 5 text-figures
Printed in Great Britain

An analysis of variance test for normality
(complete samples)†

BY S. S. SHAPIRO AND M. B. WILK

General Electric Co. and Bell Telephone Laboratories, Inc.

1. INTRODUCTION
`
The main intent of this paper is to introduce a new statistical procedure for testing a
complete sample for normality. The test statistic is obtained by dividing the square of an
appropriate linear combination of the sample order statistics by the usual symmetric
estimate of variance. This ratio is both scale and origin invariant and hence the statistic
is appropriate for a test of the composite hypothesis of normality.

Testing for distributional assumptions in general and for normality in particular has been
a major area of continuing statistical research, both theoretically and practically. A
possible cause of such sustained interest is that many statistical procedures have been
derived based on particular distributional assumptions, especially that of normality.
Although in many cases the techniques are more robust than the assumptions underlying
them, still a knowledge that the underlying assumption is incorrect may temper the use
and application of the methods. Moreover, the study of a body of data with the stimulus
of a distributional test may encourage consideration of, for example, normalizing
transformations and the use of alternate methods such as distribution-free techniques, as
well as detection of gross peculiarities such as outliers or errors.

The test procedure developed in this paper is defined and some of its analytical properties
described in §2. Operational information and tables useful in employing the test are detailed
in §3 (which may be read independently of the rest of the paper). Some examples are given
in §4. Section 5 consists of an extract from an empirical sampling study of the comparison of
the effectiveness of various alternative tests. Discussion and concluding remarks are given
in §6.
`
`2. THE W TEST FOR NORMALITY (COMPLETE SAMPLES)
`
2.1. Motivation and early work
`
This study was initiated, in part, in an attempt to summarize formally certain indications
of probability plots. In particular, could one condense departures from statistical linearity
of probability plots into one or a few 'degrees of freedom' in the manner of the application
of analysis of variance in regression analysis?

In a probability plot, one can consider the regression of the ordered observations on the
expected values of the order statistics from a standardized version of the hypothesized
distribution, the plot tending to be linear if the hypothesis is true. Hence a possible method
of testing the distributional assumption is by means of an analysis of variance type procedure.
Using generalized least squares (the ordered variates are correlated) linear and higher-order
models can be fitted and an F-type ratio used to evaluate the adequacy of the linear fit.
`
† Part of this research was supported by the Office of Naval Research while both authors were at
Rutgers University.
`
`This content downloaded from 128.255.6.125 on Mon, 29 May 2017 20:00:22 UTC
`All use subject to http://about.jstor.org/terms
`
`AstraZeneca Exhibit 2171 p. 1
`InnOPharma Licensing LLC v. AstraZeneca AB
`IPR2017-00904
`
`Fresenius-Kabi USA LLC v. AstraZeneca AB IPR2017-01910
`
`

`

`
This approach was investigated in preliminary work. While some promising results
were obtained, the procedure is subject to the serious shortcoming that the selection of the
higher-order model is, practically speaking, arbitrary. However, research is continuing
along these lines.

Another analysis of variance viewpoint which has been investigated by the present
authors is to compare the squared slope of the probability plot regression line, which under
the normality hypothesis is an estimate of the population variance multiplied by a constant,
with the residual mean square about the regression line, which is another estimate of the
variance. This procedure can be used with incomplete samples and has been described
elsewhere (Shapiro & Wilk, 1965b).

As an alternative to the above, for complete samples, the squared slope may be compared
with the usual symmetric sample sum of squares about the mean which is independent
of the ordering and easily computable. It is this last statistic that is discussed in the
remainder of this paper.
`
2.2. Derivation of the W statistic

Let $m' = (m_1, m_2, \ldots, m_n)$ denote the vector of expected values of standard normal
order statistics, and let $V = (v_{ij})$ be the corresponding $n \times n$ covariance matrix. That is, if
$x_1 < x_2 < \ldots < x_n$ denotes an ordered random sample of size n from a normal distribution with
mean 0 and variance 1, then

$$E(x)_i = m_i \quad (i = 1, 2, \ldots, n),$$

and

$$\operatorname{cov}(x_i, x_j) = v_{ij} \quad (i, j = 1, 2, \ldots, n).$$

Let $y' = (y_1, \ldots, y_n)$ denote a vector of ordered random observations. The objective is
to derive a test for the hypothesis that this is a sample from a normal distribution with
unknown mean $\mu$ and unknown variance $\sigma^2$.
Clearly, if the $\{y_i\}$ are a normal sample then $y_i$ may be expressed as

$$y_i = \mu + \sigma x_i \quad (i = 1, 2, \ldots, n).$$

It follows from the generalized least-squares theorem (Aitken, 1935; Lloyd, 1952) that the
best linear unbiased estimates of $\mu$ and $\sigma$ are those quantities that minimize the quadratic
form $(y - \mu 1 - \sigma m)' V^{-1} (y - \mu 1 - \sigma m)$, where $1' = (1, 1, \ldots, 1)$. These estimates are,
respectively,

$$\hat\mu = \frac{m' V^{-1} (m 1' - 1 m') V^{-1} y}{1' V^{-1} 1 \; m' V^{-1} m - (1' V^{-1} m)^2}$$

and

$$\hat\sigma = \frac{1' V^{-1} (1 m' - m 1') V^{-1} y}{1' V^{-1} 1 \; m' V^{-1} m - (1' V^{-1} m)^2}.$$

For symmetric distributions, $1' V^{-1} m = 0$, and hence

$$\hat\mu = \frac{1}{n} \sum_{1}^{n} y_i = \bar y \quad \text{and} \quad \hat\sigma = \frac{m' V^{-1} y}{m' V^{-1} m}.$$

Let

$$S^2 = \sum_{1}^{n} (y_i - \bar y)^2$$

denote the usual symmetric unbiased estimate of $(n-1)\sigma^2$.
The W test statistic for normality is defined by

$$W = \frac{R^4 \hat\sigma^2}{C^2 S^2} = \frac{b^2}{S^2} = \frac{(a'y)^2}{S^2} = \frac{\left( \sum_{i=1}^{n} a_i y_i \right)^2}{\sum_{i=1}^{n} (y_i - \bar y)^2},$$
`
`
where

$$R^2 = m' V^{-1} m,$$

$$C^2 = m' V^{-1} V^{-1} m,$$

$$a' = (a_1, \ldots, a_n) = \frac{m' V^{-1}}{(m' V^{-1} V^{-1} m)^{1/2}},$$

and

$$b = R^2 \hat\sigma / C.$$
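The definitions of $a$, $R^2$ and $C^2$ can be illustrated with a minimal sketch for the smallest case, n = 2. The values of m and V used here are the standard moments of the order statistics of two standard normal variates; they are an assumption supplied for illustration and are not taken from the paper. The helper names `solve`, `w_constants` and `w_statistic` are hypothetical.

```python
import math

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(map(float, A[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def w_constants(m, V):
    """Return (a, R2, C2): a = V^{-1}m / C, R2 = m'V^{-1}m, C2 = m'V^{-1}V^{-1}m."""
    u = solve(V, m)                       # u = V^{-1} m (V is symmetric)
    R2 = sum(mi * ui for mi, ui in zip(m, u))
    C2 = sum(ui * ui for ui in u)
    a = [ui / math.sqrt(C2) for ui in u]
    return a, R2, C2

def w_statistic(y, a):
    """W = (a'y)^2 / S^2 for the ordered sample y."""
    y = sorted(y)
    ybar = sum(y) / len(y)
    b = sum(ai * yi for ai, yi in zip(a, y))
    return b * b / sum((yi - ybar) ** 2 for yi in y)

# n = 2: assumed standard values, E x_(2) = 1/sqrt(pi), var = 1 - 1/pi, cov = 1/pi.
c = 1.0 / math.sqrt(math.pi)
m = [-c, c]
V = [[1 - 1 / math.pi, 1 / math.pi], [1 / math.pi, 1 - 1 / math.pi]]
a, R2, C2 = w_constants(m, V)
```

Under these assumptions the coefficients come out as roughly (-0.7071, 0.7071), in line with the n = 2 column of Table 5, and W equals 1 for any sample of two distinct values.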
`
Thus, b is, up to the normalizing constant C, the best linear unbiased estimate of the slope
of a linear regression of the ordered observations, $y_i$, on the expected values, $m_i$, of the
standard normal order statistics. The constant C is so defined that the linear coefficients are
normalized.

It may be noted that if one is indeed sampling from a normal population then the
numerator, $b^2$, and denominator, $S^2$, of W are both, up to a constant, estimating the same quantity,
namely $\sigma^2$. For non-normal populations, these quantities would not in general be estimating
the same thing. Heuristic considerations augmented by some fairly extensive empirical
sampling results (Shapiro & Wilk, 1964a) using populations with a wide range of $\sqrt{\beta_1}$ and
$\beta_2$ values, suggest that the mean values of W for non-null distributions tend to shift
to the left of that for the null case. Further it appears that the variance of the null
distribution of W tends to be smaller than that of the non-null distribution. It is likely
that this is due to the positive correlation between the numerator and denominator for a
normal population being greater than that for non-normal populations.
Note that the coefficients $\{a_i\}$ are just the normalized 'best linear unbiased' coefficients
tabulated in Sarhan & Greenberg (1956).
`
2.3. Some analytical properties of W

LEMMA 1. W is scale and origin invariant.

Proof. This follows from the fact that for normal (more generally symmetric) distributions,

$$-a_i = a_{n-i+1}.$$
`
`COROLLARY 1. W has a distribution which depends only on the sample size n, for samples
`from a normal distribution.
`
COROLLARY 2. W is statistically independent of $S^2$ and of $\bar y$, for samples from a normal
distribution.

Proof. This follows from the fact that $\bar y$ and $S^2$ are sufficient for $\mu$ and $\sigma^2$ (Hogg & Craig,
1956).

COROLLARY 3. $E\,W^r = E\,b^{2r} / E\,S^{2r}$, for any r.
`
`LEMMA 2. The maximum value of W is 1.
`
Proof. Assume $\bar y = 0$ since W is origin invariant by Lemma 1. Hence

$$W = \Bigl[ \sum_i a_i y_i \Bigr]^2 \Big/ \sum_i y_i^2.$$

Since

$$\Bigl( \sum_i a_i y_i \Bigr)^2 \le \sum_i y_i^2 \sum_i a_i^2 = \sum_i y_i^2,$$

because $\sum_i a_i^2 = a'a = 1$, by definition, then W is bounded by 1. This maximum is in fact
achieved when $y_i = \eta a_i$, for arbitrary $\eta$.
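Lemmas 1 and 2 are easy to check numerically. The sketch below uses the n = 5 coefficients from Table 5, written with the sign convention $a_1 = -a_5$; the sample values and the function name `w_statistic` are hypothetical. Because the tabulated coefficients are rounded to four decimals, $a'a$ is about 0.9998 rather than exactly 1, so the maximum is reproduced only to about that accuracy.

```python
def w_statistic(y, a):
    """W = (sum a_i y_(i))^2 / sum (y_i - ybar)^2, with a listed in ascending order."""
    y = sorted(y)
    ybar = sum(y) / len(y)
    b = sum(ai * yi for ai, yi in zip(a, y))
    return b * b / sum((yi - ybar) ** 2 for yi in y)

# Table 5 coefficients for n = 5 (rounded to 4 decimals), with a_1 = -a_5.
a = [-0.6646, -0.2413, 0.0, 0.2413, 0.6646]

y = [2.1, 3.4, 3.9, 5.0, 7.2]                               # hypothetical sample
w_original = w_statistic(y, a)
w_rescaled = w_statistic([3.0 * v + 7.0 for v in y], a)     # Lemma 1: same value

# Lemma 2: a sample proportional to the coefficients (plus any shift) attains the maximum.
eta = 2.5
w_max = w_statistic([eta * ai + 10.0 for ai in a], a)
```

The shift and rescaling leave W unchanged up to rounding error, and the sample built from the coefficients gives a value within about 0.0002 of 1.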
`
LEMMA 3. The minimum value of W is $n a_1^2 / (n-1)$.
`
`
Proof.† (Due to C. L. Mallows.) Since W is scale and origin invariant, it suffices to consider
the maximization of $\sum_{i=1}^{n} y_i^2$ subject to the constraints $\sum y_i = 0$, $\sum a_i y_i = 1$. Since this
is a convex region and $\sum y_i^2$ is a convex function, the maximum of the latter must occur at
one of the $(n-1)$ vertices of the region. These are

$$\left( \frac{n-1}{n a_1}, \frac{-1}{n a_1}, \ldots, \frac{-1}{n a_1} \right),$$

$$\left( \frac{n-2}{n(a_1+a_2)}, \frac{n-2}{n(a_1+a_2)}, \frac{-2}{n(a_1+a_2)}, \ldots, \frac{-2}{n(a_1+a_2)} \right),$$

$$\vdots$$

$$\left( \frac{1}{n(a_1+\ldots+a_{n-1})}, \frac{1}{n(a_1+\ldots+a_{n-1})}, \ldots, \frac{-(n-1)}{n(a_1+\ldots+a_{n-1})} \right).$$

It can now be checked numerically, for the values of the specific coefficients $\{a_i\}$, that the
maximum of $\sum_{i=1}^{n} y_i^2$ occurs at the first of these points and the corresponding minimum value
of W is as given in the Lemma.
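The numerical check mentioned in the proof can be sketched as follows for n = 5, using the Table 5 coefficients with $a_1 = -a_5$. The function name `vertex` is hypothetical. At every vertex the two constraints hold, so W reduces to $1/\sum y_i^2$, and the smallest W should coincide with $n a_1^2/(n-1)$.

```python
def vertex(a, k):
    """k-th vertex (k = 1..n-1): first k coordinates (n-k)/(n s_k), the rest -k/(n s_k),
    where s_k = a_1 + ... + a_k."""
    n = len(a)
    s_k = sum(a[:k])
    return [(n - k) / (n * s_k)] * k + [-k / (n * s_k)] * (n - k)

a = [-0.6646, -0.2413, 0.0, 0.2413, 0.6646]    # Table 5, n = 5, with a_1 = -a_5
n = len(a)

w_at_vertices = []
for k in range(1, n):
    y = vertex(a, k)
    # Both constraints of the proof hold at each vertex.
    assert abs(sum(y)) < 1e-12
    assert abs(sum(ai * yi for ai, yi in zip(a, y)) - 1.0) < 1e-12
    # With sum(y) = 0 and sum(a*y) = 1, W = 1 / sum(y^2).
    w_at_vertices.append(1.0 / sum(yi * yi for yi in y))

w_min = min(w_at_vertices)
lemma_value = n * a[0] ** 2 / (n - 1)
```

For these coefficients the minimum is attained at the first vertex (and, by symmetry, at the last one) and is about 0.552.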
`
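LEMMA 4. The half and first moments of W are given by

$$E\,W^{1/2} = \frac{R^2 \, \Gamma\{\tfrac12 (n-1)\}}{C \, \Gamma(\tfrac12 n) \sqrt{2}}$$

and

$$E\,W = \frac{R^2 (R^2 + 1)}{C^2 (n-1)},$$

where $R^2 = m' V^{-1} m$, and $C^2 = m' V^{-1} V^{-1} m$.

Proof. Using Corollary 3 of Lemma 1,

$$E\,W^{1/2} = E\,b / E\,S \quad \text{and} \quad E\,W = E\,b^2 / E\,S^2.$$

Now,

$$E\,S = \sigma \sqrt{2} \, \Gamma(\tfrac12 n) \big/ \Gamma\{\tfrac12 (n-1)\} \quad \text{and} \quad E\,S^2 = (n-1)\sigma^2.$$

From the general least squares theorem (see e.g. Kendall & Stuart, vol. II (1961)),

$$E\,b = \frac{R^2}{C} E\,\hat\sigma = \frac{R^2}{C} \sigma$$

and

$$E\,b^2 = \frac{R^4}{C^2} E\,\hat\sigma^2 = \frac{R^4}{C^2} \{ \operatorname{var}(\hat\sigma) + (E\,\hat\sigma)^2 \} = \sigma^2 R^2 (R^2 + 1) / C^2,$$

since $\operatorname{var}(\hat\sigma) = \sigma^2 / m' V^{-1} m = \sigma^2 / R^2$, and hence the results of the lemma follow.
Values of these moments are shown in Fig. 1 for sample sizes n = 3(1)20.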
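As a rough consistency check of Lemma 4, the sketch below evaluates both moments for n = 20 using numbers quoted elsewhere in the paper: $R^2 = 37.26$ (the "true value" given in §2.4), and C recovered as the ratio of the exact $|a^*_1| = 4.2013$ (Table 3) to $a_{20} = 0.4734$ (Table 5). Recovering C this way is an assumption made for illustration, and the function name `w_moments` is hypothetical.

```python
import math

def w_moments(R2, C2, n):
    """Return (E W^{1/2}, E W) from the formulas of Lemma 4."""
    # Gamma ratio via log-gamma, which stays stable for larger n.
    gamma_ratio = math.exp(math.lgamma(0.5 * (n - 1)) - math.lgamma(0.5 * n))
    half = R2 / math.sqrt(C2) * gamma_ratio / math.sqrt(2.0)
    first = R2 * (R2 + 1.0) / (C2 * (n - 1))
    return half, first

n = 20
R2 = 37.26                    # value of R^2 for n = 20 quoted in section 2.4
C = 4.2013 / 0.4734           # assumption: C = |a*_1| / a_n from Tables 3 and 5
half, first = w_moments(R2, C * C, n)
```

Under these assumptions the two moments come out close to the theoretical entries for n = 20 in Table 4 (0.9757 and 0.9523).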
`
LEMMA 5. A joint distribution involving W is defined by

$$h(W, \theta_2, \ldots, \theta_{n-2}) = K \, W^{-1/2} (1 - W)^{\frac12 (n-4)} \cos^{n-4}\theta_2 \ldots \cos\theta_{n-3},$$

over a region T on which the $\theta_i$'s and W are not independent, and where K is a constant.

† Lemma 3 was conjectured intuitively and verified by certain numerical studies. Subsequently
the above proof was given by C. L. Mallows.
`
`
Proof. Consider an orthogonal transformation B such that $y = Bu$, where

$$u_1 = \sum_{i=1}^{n} y_i / \sqrt{n} \quad \text{and} \quad u_2 = \sum_{i=1}^{n} a_i y_i = b.$$

The ordered $y_i$'s are distributed as

$$n! \, (2\pi\sigma^2)^{-\frac12 n} \exp\Bigl\{ -\frac12 \sum_{i=1}^{n} \Bigl( \frac{y_i - \mu}{\sigma} \Bigr)^2 \Bigr\} \quad (-\infty < y_1 < \ldots < y_n < \infty).$$

After integrating out $u_1$, the joint density for $u_2, \ldots, u_n$ is

$$K^* \exp\Bigl\{ -\frac{1}{2\sigma^2} \sum_{i=2}^{n} u_i^2 \Bigr\}$$

over the appropriate region $T^*$. Changing to polar co-ordinates such that

$$u_2 = \rho \sin\theta_1, \text{ etc.},$$

and then integrating over $\rho$, yields the joint density of $\theta_1, \ldots, \theta_{n-2}$ as

$$K^{**} \cos^{n-3}\theta_1 \cos^{n-4}\theta_2 \ldots \cos\theta_{n-3},$$

over some region $T^{**}$.
From these various transformations

$$W = \frac{b^2}{S^2} = \frac{u_2^2}{\sum_{i=2}^{n} u_i^2} = \frac{\rho^2 \sin^2\theta_1}{\rho^2} = \sin^2\theta_1,$$

from which the lemma follows. The $\theta_i$'s and W are not independent, they are restricted
in the sample space T.
`
[Fig. 1: plot not reproduced. Horizontal axis: sample size, n.]

Fig. 1. Moments of W, $E(W^r)$, n = 3(1)20, r = 1/2, 1.
`
COROLLARY 4. For n = 3, the density of W is

$$\frac{3}{\pi} (1 - W)^{-1/2} \, W^{-1/2}, \quad \tfrac34 \le W \le 1.$$
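For n = 3 the density of Corollary 4 can be integrated in closed form by substituting $W = \sin^2\theta$. The following sketch, with hypothetical function names, gives the resulting distribution function, its inverse, and the mean; it is an illustration derived from the stated density, not a computation taken from the paper.

```python
import math

def w3_pdf(w):
    """Density of W for n = 3 on [3/4, 1]."""
    return (3.0 / math.pi) / math.sqrt(w * (1.0 - w))

def w3_cdf(w):
    """Integral of the density from 3/4 to w: (6/pi) * asin(sqrt(w)) - 2."""
    return (6.0 / math.pi) * math.asin(math.sqrt(w)) - 2.0

def w3_quantile(p):
    """Inverse of w3_cdf."""
    return math.sin((math.pi / 6.0) * (2.0 + p)) ** 2

# Mean of W: integral of w * pdf over [3/4, 1].
w3_mean = 0.5 + 3.0 * math.sqrt(3.0) / (4.0 * math.pi)
lower_1pct = w3_quantile(0.01)
```

The mean evaluates to about 0.9135, which agrees with the theoretical first moment listed for n = 3 in Table 4. The exact 1% point is about 0.7545; Table 6 lists 0.753, a value obtained from smoothed sampling results rather than from this closed form.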
`
`
Note that for n = 3, the W statistic is equivalent (up to a constant multiplier) to the
statistic (range/standard deviation) advanced by David, Hartley & Pearson (1954) and
the result of the corollary is essentially given by Pearson & Stephens (1964).
It has not been possible, for general n, to integrate out the $\theta_i$'s of Lemma 5 to obtain
an explicit form for the distribution of W. However, explicit results have also been given
for n = 4, Shapiro (1964).
`
2.4. Approximations associated with the W test

The $\{a_i\}$ used in the W statistic are defined by

$$a_j = \sum_{i=1}^{n} m_i v^{ij} / C \quad (j = 1, 2, \ldots, n),$$

where $v^{ij}$ is read here as the $(i,j)$ element of $V^{-1}$, so that the expression agrees with
$a' = m'V^{-1}/C$, and $m_i$, V and C have been defined in §2.2. To determine the $a_i$ directly it
appears necessary to know both the vector of means m and the covariance matrix V. However,
to date, the elements of V are known only up to samples of size 20 (Sarhan & Greenberg, 1956).
Various approximations are presented in the remainder of this section to enable the use of W
for samples larger than 20.
By definition,

$$a' = \frac{m' V^{-1}}{(m' V^{-1} V^{-1} m)^{1/2}} = \frac{m' V^{-1}}{C}$$

is such that $a'a = 1$. Let $a^{*\prime} = m' V^{-1}$, then $C^2 = a^{*\prime} a^*$. Suggested approximations are

$$\hat a^*_i = 2 m_i \quad (i = 2, 3, \ldots, n-1),$$

and

$$\hat a_1^2 = \hat a_n^2 = \begin{cases} \dfrac{\Gamma(\tfrac12 n)}{\sqrt{2} \, \Gamma(\tfrac12 [n+1])}, & n \le 20, \\[2ex] \dfrac{\Gamma(\tfrac12 [n+1])}{\sqrt{2} \, \Gamma(\tfrac12 n + 1)}, & n > 20. \end{cases}$$
`
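The end-coefficient approximation can be evaluated directly. The sketch below assumes the gamma-function formula reads $\hat a_1^2 = \Gamma(\tfrac12 n) / \{\sqrt2\,\Gamma(\tfrac12[n+1])\}$ for $n \le 20$ and $\Gamma(\tfrac12[n+1]) / \{\sqrt2\,\Gamma(\tfrac12 n + 1)\}$ for $n > 20$; this reading is reconstructed from a damaged scan and should be checked against a clean copy of the paper. The function names are hypothetical.

```python
import math

def a1_squared_approx(n):
    """Approximation to a_1^2 = a_n^2; the assumed formula switches form at n = 20."""
    if n <= 20:
        log_ratio = math.lgamma(0.5 * n) - math.lgamma(0.5 * (n + 1))
    else:
        log_ratio = math.lgamma(0.5 * (n + 1)) - math.lgamma(0.5 * n + 1.0)
    return math.exp(log_ratio) / math.sqrt(2.0)

def interior_a_star_approx(m):
    """Approximate a*_i = 2 m_i for i = 2, ..., n-1 (m holds all n expected values)."""
    return [2.0 * mi for mi in m[1:-1]]

# "Approximate" column of Table 2 for n = 6, ..., 12.
table2_approx = {6: 0.426, 7: 0.392, 8: 0.365, 9: 0.343, 10: 0.324, 11: 0.308, 12: 0.295}
computed = {n: a1_squared_approx(n) for n in table2_approx}
```

For n = 6 to 12 the computed values agree with the approximate column of Table 2 to three decimals. For n = 30 the second branch gives a value whose square root is about 0.4255, close to the Table 5 entry 0.4254.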
A comparison of $a^*_i$ (the exact values) and $\hat a^*_i$ for various values of $i \ne 1$ and n = 5, 10,
15, 20 is given in Table 1. (Note $a_i = -a_{n-i+1}$.) It will be seen that the approximation is
generally in error by less than 1%, particularly as n increases. This encourages one to trust
the use of this approximation for n > 20. Necessary values of the $m_i$ for this approximation
are available in Harter (1961).

Table 1. Comparison of $|a^*_i|$ and $|\hat a^*_i| = |2m_i|$, for selected values of i (≠ 1) and n

 n   Value      i = 2   i = 3   i = 4   i = 5   i = 8   i = 10
 5   Exact      1.014   0.0       -       -       -       -
     Approx.    0.990   0.0       -       -       -       -
10   Exact      2.035   1.324   0.757   0.247     -       -
     Approx.    2.003   1.312   0.752   0.245     -       -
15   Exact      2.530   1.909   1.437   1.036   0.0       -
     Approx.    2.496   1.895   1.430   1.031   0.0       -
20   Exact      2.849   2.277   1.850   1.496   0.631   0.124
     Approx.    2.815   2.262   1.842   1.491   0.630   0.124
`
`
A comparison of $a_1^2$ and $\hat a_1^2$ for n = 6(1)20 is given in Table 2. While the errors of this
approximation are quite small for n < 20, the approximation and true values appear to
cross over at n = 19. Further comparisons with other approximations, discussed below,
suggested the changed formulation of $\hat a_1^2$ for n > 20 given above.

Table 2. Comparison of $a_1^2$ and $\hat a_1^2$

 n   Exact   Approximate      n   Exact   Approximate
 6   0.414   0.426           13   0.287   0.283
 7   0.388   0.392           14   0.276   0.272
 8   0.366   0.365           15   0.265   0.261
 9   0.347   0.343           16   0.256   0.254
10   0.329   0.324           17   0.247   0.245
11   0.314   0.308           18   0.239   0.237
12   0.300   0.295           19   0.231   0.231
                             20   0.224   0.226

[Fig. 2: plot not reproduced. Recoverable labels: $C^2$ (vertical axis), sample size n (horizontal axis), fitted line $R^2 = -2.41 + 1.98n$.]

Fig. 2. Plot of $C^2 = m'V^{-1}V^{-1}m$ and $R^2 = m'V^{-1}m$ as functions of the sample size n.
`
What is required for the W test are the normalized coefficients $\{a_i\}$. Thus $\hat a_1^2$ is directly
usable but the $\hat a^*_i$ (i = 2, ..., n-1), must be normalized by division by $C = (m'V^{-1}V^{-1}m)^{1/2}$.
A plot of the values of $C^2$ and of $R^2 = m'V^{-1}m$ as a function of n is given in Fig. 2. The
linearity of these may be summarized by the following least-squares equations:

$$C^2 = -2.722 + 4.083n,$$

which gave a regression mean square of 7331.6 and a residual mean square of 0.0186, and

$$R^2 = -2.411 + 1.981n,$$

with a regression mean square of 1725.7 and a residual mean square of 0.0016.
`
These results encourage the use of the extrapolated equations to estimate $C^2$ and $R^2$
for higher values of n.

A comparison can now be made between values of $C^2$ from the extrapolation equation
and from $\sum_{1}^{n} \hat a^{*2}_i$, using

$$\hat a^{*2}_1 = \hat a^{*2}_n = \frac{\hat a_1^2}{1 - 2\hat a_1^2} \sum_{i=2}^{n-1} \hat a^{*2}_i.$$

For the case n = 30, these give values of 119.77 and 120.47, respectively. This concordance
of the independent approximations increases faith in both.
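The two routes to $C^2$ for n = 30 can be retraced with a short sketch. The interior values $|2m_i|$ are taken from the "present approx." column of Table 3. The end coefficient $\hat a_1^2$ is computed from the gamma-function approximation for n > 20, assumed here to read $\Gamma(\tfrac12[n+1]) / \{\sqrt2\,\Gamma(\tfrac12 n + 1)\}$ (a reconstruction from the scan). The end terms follow from $C^2 = 2\hat a^{*2}_1 + \sum_{i=2}^{n-1} \hat a^{*2}_i$ together with $\hat a^{*2}_1 = \hat a_1^2 C^2$. Function names are hypothetical.

```python
import math

def c2_linear(n):
    """Least-squares extrapolation equation for C^2."""
    return -2.722 + 4.083 * n

def r2_linear(n):
    """Least-squares extrapolation equation for R^2."""
    return -2.411 + 1.981 * n

def c2_from_sum(a1_sq, interior_a_star):
    """C^2 as the sum of all a*_i^2, with the two end terms recovered from a_1^2."""
    interior = sum(v * v for v in interior_a_star)
    end_sq = a1_sq * interior / (1.0 - 2.0 * a1_sq)
    return 2.0 * end_sq + interior

# |2 m_i| for i = 2..15, n = 30 (Table 3); by symmetry i = 16..29 repeats these values.
half = [3.231, 2.730, 2.357, 2.052, 1.789, 1.553, 1.338, 1.137,
        0.947, 0.765, 0.589, 0.418, 0.249, 0.083]
interior = half + half
# Assumed n > 20 approximation for a_1^2 at n = 30.
a1_sq = math.exp(math.lgamma(15.5) - math.lgamma(16.0)) / math.sqrt(2.0)

c2_a = c2_linear(30)
c2_b = c2_from_sum(a1_sq, interior)
```

With these inputs the extrapolation equation gives about 119.77 and the summation route about 120.48, in line with the two figures quoted in the text.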
Plackett (1958) has suggested approximations for the elements of the vector $a^*$ and $R^2$.
While his approximations are valid for a wide range of distributions and can be used with
censored samples, they are more complex, for the normal case, than those suggested above.
For the normal case his approximations are

$$\tilde a^*_j = n m_j [F(m_{j+1}) - F(m_{j-1})] \quad (j = 2, 3, \ldots, n-1),$$

$$\tilde a^*_j = n \Bigl\{ \frac{m_j f^2(m_j)}{F(m_j)} + m_j^2 f(m_j) - f(m_j) + m_j [F(m_{j+1}) - F(m_j)] \Bigr\} \quad (j = 1),$$

where

$F(m_j)$ = cumulative distribution evaluated at $m_j$,

$f(m_j)$ = density function evaluated at $m_j$,

and

$$\tilde a^*_n = -\tilde a^*_1.$$

Plackett's approximation to $R^2$ is

$$\tilde R^2 = 2n \Bigl\{ \frac{m_1^2 f^2(m_1)}{F(m_1)} + m_1^3 f(m_1) + m_1 f(m_1) - 2F(m_1) + 1 \Bigr\}.$$

Plackett's $\tilde a^*_i$ approximations and the present $\hat a^*_i$ approximations are compared with the
exact values, for sample size 20, in Table 3. In addition a consistency comparison of the
two approximations is given for sample size 30. Plackett's result for $a_1$ (n = 20) was the
only case where his approximation was closer to the true value than the simpler
approximations suggested above. The differences in the two approximations for $a_1$ were negligible,
being less than 0.5%. Both methods give good approximations, being off no more than
three units in the second decimal place. The comparison of the two methods for n = 30
shows good agreement, most of the differences being in the third decimal place. The largest
discrepancy occurred for i = 2; the estimates differed by six units in the second decimal
place, an error of less than 2%.
The two methods of approximating $R^2$ were compared for n = 20. Plackett's method
gave a value of 36.09, the method suggested above gave a value of 37.21 and the true
value was 37.26.
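The $R^2$ comparison for n = 20 can be retraced numerically. The sketch below assumes that Plackett's approximation reads $\tilde R^2 = 2n\{m_1^2 f^2(m_1)/F(m_1) + m_1^3 f(m_1) + m_1 f(m_1) - 2F(m_1) + 1\}$; the exponents and the factor n are a reconstruction from a damaged scan, adopted because it reproduces the quoted value, and should be verified against a clean copy. The expected value $m_1 = -1.86748$ for n = 20 is an assumed tabulated value, not given in the text. Function names are hypothetical.

```python
from statistics import NormalDist

def plackett_r2(n, m1):
    """Plackett-type approximation to R^2 (form reconstructed from the scan: an assumption)."""
    std = NormalDist()
    f = std.pdf(m1)          # standard normal density at m_1
    F = std.cdf(m1)          # standard normal distribution function at m_1
    brace = m1 ** 2 * f ** 2 / F + m1 ** 3 * f + m1 * f - 2.0 * F + 1.0
    return 2.0 * n * brace

def linear_r2(n):
    """The paper's least-squares equation R^2 = -2.411 + 1.981 n."""
    return -2.411 + 1.981 * n

m1_n20 = -1.86748            # assumed expected value of the smallest of 20 standard normals
r2_plackett = plackett_r2(20, m1_n20)
r2_present = linear_r2(20)
```

Under these assumptions the two approximations come out near 36.1 and 37.21, against the quoted values 36.09 and 37.21 and the true value 37.26.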
`
The good practical agreement of these two approximations encourages the belief that
there is little risk in reasonable extrapolations for n > 20. The values of constants, for
n > 20, given in §3 below, were estimated from the simple approximations and
extrapolations described above.

As a further internal check the values of $a_n$, $a_{n-1}$ and $a_{n-4}$ were plotted as a function of
n for n = 3(1)50. The plots are shown in Fig. 3 which is seen to be quite smooth for each
of the three curves at the value n = 20. Since values for n < 20 are 'exact' the smooth
transition lends credence to the approximations for n > 20.
`
`
Table 3. Comparison of approximate values of $a^* = m'V^{-1}$

 n    i   Present approx.    Exact     Plackett
20    1      -4.223         -4.2013    -4.215
      2      -2.815         -2.8494    -2.764
      3      -2.262         -2.2765    -2.237
      4      -1.842         -1.8502    -1.820
      5      -1.491         -1.4960    -1.476
      6      -1.181         -1.1841    -1.169
      7      -0.897         -0.8990    -0.887
      8      -0.630         -0.6314    -0.622
      9      -0.374         -0.3784    -0.370
     10      -0.124         -0.1243    -0.123

30    1      -4.655            -       -4.671
      2      -3.231            -       -3.170
      3      -2.730            -       -2.768
      4      -2.357            -       -2.369
      5      -2.052            -       -2.013
      6      -1.789            -       -1.760
      7      -1.553            -       -1.528
      8      -1.338            -       -1.334
      9      -1.137            -       -1.132
     10      -0.947            -       -0.941
     11      -0.765            -       -0.759
     12      -0.589            -       -0.582
     13      -0.418            -       -0.413
     14      -0.249            -       -0.249
     15      -0.083            -       -0.082
`
[Fig. 3: plot not reproduced. Horizontal axis: sample size, n.]

Fig. 3. $a_i$ plotted as a function of sample size, n = 2(1)50, for
i = n, n-1, n-4 (n > 8).
`
`
[Fig. 5: plot not reproduced. Horizontal axis: sample size, n.]

Fig. 5. Selected empirical percentage points of W, n = 3(1)50.
`
`
Table 4. Some theoretical moments ($\mu_i$) and Monte Carlo moments ($\hat\mu_i$) of W

Columns: (1) $\mu_{1/2}$; (2) $\hat\mu_{1/2}$; (3) $\mu_1$; (4) $\hat\mu_1$; (5) $\hat\mu_2$;
(6) $\hat\mu_3 / \hat\mu_2^{3/2}$; (7) $\hat\mu_4 / \hat\mu_2^2$.

 n     (1)      (2)      (3)      (4)       (5)        (6)        (7)
 3   0.9549   0.9547   0.9135   0.9130   0.005698   -0.5930     2.3748
 4   0.9486   0.9489   0.9012   0.9019   0.005166   -0.8944     3.7231
 5   0.9494   0.9491   0.9026   0.9021   0.004491   -0.8176     7.8126
 6   0.9521   0.9525   0.9072   0.9082   0.003390   -1.1790     5.4295
 7   0.9547   0.9545   0.9123   0.9120   0.002995   -1.3229     6.4104
 8   0.9574   0.9575   0.9174   0.9175   0.002470   -1.3841     7.1092
 9   0.9600   0.9596   0.9221   0.9215   0.002293   -1.5987     8.4482
10   0.9622   0.9620   0.9264   0.9260   0.001972   -1.6655     9.2812
11   0.9643   0.9639   0.9303   0.9295   0.001717   -1.7494    11.0547
12   0.9661   0.9661   0.9337   0.9338   0.001483   -1.7744    11.9185
13   0.9678   0.9678   0.9369   0.9369   0.001316   -1.7581    13.0769
14   0.9692   0.9693   0.9398   0.9399   0.001168   -1.9025    14.0568
15   0.9706   0.9705   0.9424   0.9422   0.001023   -1.8876    16.7383
16   0.9718   0.9717   0.9447   0.9445   0.000964   -1.7968    17.6669
17   0.9730   0.9730   0.9470   0.9470   0.000823   -1.9468    22.1972
18   0.9741   0.9741   0.9491   0.9492   0.000810   -2.1391    24.7776
19   0.9750   0.9750   0.9508   0.9509   0.000711   -2.1305    29.7333
20   0.9757   0.9760   0.9523   0.9527   0.000651   -2.2761    32.5906
21     -      0.9771     -      0.9549   0.000594   -2.2827    36.0382
22     -      0.9776     -      0.9558   0.000568   -2.3984    44.5617
23     -      0.9782     -      0.9570   0.000504   -2.1862    40.7507
24     -      0.9787     -      0.9579   0.000504   -2.3517    43.4926
25     -      0.9789     -      0.9584   0.000458   -2.3448    46.3318
26     -      0.9796     -      0.9598   0.000421   -2.4978    58.9446
27     -      0.9801     -      0.9607   0.000404   -2.5903    60.5200
28     -      0.9805     -      0.9615   0.000382   -2.6964    64.1702
29     -      0.9810     -      0.9624   0.000369   -2.6090    68.9591
30     -      0.9811     -      0.9626   0.000344   -2.7288    71.7714
31     -      0.9816     -      0.9636   0.000336   -2.7997    77.4744
32     -      0.9819     -      0.9642   0.000326   -2.6900    76.8384
33     -      0.9823     -      0.9650   0.000308   -3.0181    93.2496
34     -      0.9825     -      0.9654   0.000293   -3.0166   100.4419
35     -      0.9827     -      0.9658   0.000268   -2.8574   108.5077
36     -      0.9829     -      0.9662   0.000264   -2.7965    91.7985
37     -      0.9833     -      0.9670   0.000253   -3.1566   120.0005
38     -      0.9837     -      0.9677   0.000235   -3.0679   118.2513
39     -      0.9837     -      0.9678   0.000239   -3.3283   134.3110
40     -      0.9839     -      0.9682   0.000229   -3.1719   136.4787
41     -      0.9840     -      0.9684   0.000227   -3.0740   129.9604
42     -      0.9844     -      0.9691   0.000212   -3.2885   136.3814
43     -      0.9846     -      0.9694   0.000196   -3.2646   151.7350
44     -      0.9846     -      0.9695   0.000193   -3.0803   140.2724
45     -      0.9849     -      0.9701   0.000192   -3.1645   137.2297
46     -      0.9850     -      0.9703   0.000184   -3.3742   176.0635
47     -      0.9854     -      0.9710   0.000170   -3.3353   179.2792
48     -      0.9853     -      0.9708   0.000179   -3.2972   173.6601
49     -      0.9855     -      0.9712   0.000165   -3.2810   183.9433
50     -      0.9855     -      0.9714   0.000154   -3.3240   212.4279
`
`
2.5. Approximation to the distribution of W

The complexity in the domain of the joint distribution of W and the angles $\{\theta_i\}$ in Lemma 5
necessitates consideration of an approximation to the null distribution of W. Since only
the first and second moments of normal order statistics are, practically, available, it follows
that only the one-half and first moments of W are known. Hence a technique such as the
Cornish-Fisher expansion cannot be used.
In the circumstance it seemed both appropriate and efficient to employ empirical
sampling to obtain an approximation for the null distribution.
Accordingly, normal random samples were obtained from the Rand Tables (Rand Corp.
(1955)). Repeated values of W were computed for n = 3(1)50 and the empirical percentage
points determined for each value of n. The number of samples, m, employed was as follows:

$$m = 5000 \quad \text{for } n = 3(1)20,$$

$$m = \left[ \frac{100{,}000}{n} \right] \quad \text{for } n = 21(1)50.$$

Fig. 4 gives the empirical C.D.F.'s for values of n = 5, 10, 15, 20, 35, 50. Fig. 5
gives a plot of the 1, 5, 10, 50, 90, 95, and 99 empirical percentage points of W for
n = 3(1)50.
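The sampling study can be imitated on a small scale. The sketch below draws m = 5000 pseudo-random normal samples of size n = 5, computes W with the Table 5 coefficients (written with $a_1 = -a_5$), and reads off empirical percentage points. A seeded pseudo-random generator stands in for the Rand Tables, and the percentage-point rule used is one simple convention among several; both are assumptions, and the function names are hypothetical.

```python
import random

def w_statistic(y, a):
    """W = (sum a_i y_(i))^2 / sum (y_i - ybar)^2, with a listed in ascending order."""
    y = sorted(y)
    ybar = sum(y) / len(y)
    b = sum(ai * yi for ai, yi in zip(a, y))
    return b * b / sum((yi - ybar) ** 2 for yi in y)

def empirical_percentage_point(values, p):
    """Simple order-statistic estimate of the 100p% point."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, max(0, int(p * len(ordered))))
    return ordered[index]

rng = random.Random(1965)                      # seeded generator in place of the Rand Tables
a = [-0.6646, -0.2413, 0.0, 0.2413, 0.6646]    # Table 5 coefficients for n = 5
m = 5000                                       # number of samples used for n = 3(1)20
w_values = [w_statistic([rng.gauss(0.0, 1.0) for _ in a], a) for _ in range(m)]

mean_w = sum(w_values) / m
p01 = empirical_percentage_point(w_values, 0.01)
p50 = empirical_percentage_point(w_values, 0.50)
```

The empirical mean should fall close to the theoretical first moment for n = 5 in Table 4 (0.9026), and no simulated value can fall below the Lemma 3 bound $n a_1^2/(n-1)$, about 0.552 for these coefficients.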
`
A check on the adequacy of the sampling study is given by comparing the empirical
one-half and the first moments of the sample with the corresponding theoretical moments
of W for n = 3(1)20. This comparison is given in Table 4, which provides additional
assurance of the adequacy of the sampling study. Also in Table 4 are given the sample
variance and the standardized third and fourth moments for n = 3(1)50.
After some preliminary investigation, the $S_B$ system of curves suggested by Johnson
(1949) was selected as a basis for smoothing the empirical null W distribution. Details of
this procedure and its results are given in Shapiro & Wilk (1965a). The tables of percentage
points of W given in §3 are based on these smoothed sampling results.
`
`3. SUMMARY OF OPERATIONAL INFORMATION
`
`The objective of this section is to bring together all the tables and descriptions needed
`to execute the W test for normality. This section may be employed independently of
`notational or other information from other sections.
`
`The object of the W test is to provide an index or test statistic to evaluate the supposed
`normality of a complete sample. The statistic has been shown to be an effective measure
`of normality even for small samples (n < 20) against a wide spectrum of non-normal alter-
`natives (see §5 below and Shapiro & Wilk (1964a)).
`The W statistic is scale and origin invariant and hence supplies a test of the composite
`null hypothesis of normality.
To compute the value of W, given a complete random sample of size n, $x_1, x_2, \ldots, x_n$,
one proceeds as follows:
(i) Order the observations to obtain an ordered sample $y_1 \le y_2 \le \ldots \le y_n$.
(ii) Compute

$$S^2 = \sum_{1}^{n} (y_i - \bar y)^2 = \sum_{1}^{n} (x_i - \bar x)^2.$$
`
`
(iii) (a) If n is even, n = 2k, compute

$$b = \sum_{i=1}^{k} a_{n-i+1} (y_{n-i+1} - y_i),$$

where the values of $a_{n-i+1}$ are given in Table 5.
(b) If n is odd, n = 2k + 1, the computation is just as in (iii) (a), since $a_{k+1} = 0$ when
n = 2k + 1. Thus one finds

$$b = a_n (y_n - y_1) + \ldots + a_{k+2} (y_{k+2} - y_k),$$

where the value of $y_{k+1}$, the sample median, does not enter the computation of b.
(iv) Compute $W = b^2 / S^2$.
(v) 1, 2, 5, 10, 50, 90, 95, 98 and 99% points of the distribution of W are given in Table 6.
Small values of W are significant, i.e. indicate non-normality.
(vi) A more precise significance level may be associated with an observed W value by
using the approximation detailed in Shapiro & Wilk (1965a).
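Steps (i) to (iv) can be sketched in a few lines. The coefficient lists are read from Table 5 for n = 10 and n = 5; the data and the function name `shapiro_wilk_w` are hypothetical. Comparing the result with the percentage points of Table 6, step (v), is left to the reader.

```python
def shapiro_wilk_w(x, upper_coeffs):
    """Steps (i)-(iv). upper_coeffs = [a_n, a_(n-1), ...] as listed in Table 5."""
    y = sorted(x)                                    # (i) order the observations
    n = len(y)
    k = n // 2                                       # for odd n the median is not used
    mean = sum(y) / n
    s2 = sum((v - mean) ** 2 for v in y)             # (ii) S^2
    # (iii) b = sum over i of a_(n-i+1) * (y_(n-i+1) - y_i)
    b = sum(upper_coeffs[i] * (y[n - 1 - i] - y[i]) for i in range(k))
    return b * b / s2                                # (iv) W

# Table 5 column for n = 10: a_10, a_9, ..., a_6.
A10 = [0.5739, 0.3291, 0.2141, 0.1224, 0.0399]
# Table 5 column for n = 5: a_5, a_4, a_3.
A5 = [0.6646, 0.2413, 0.0000]

even_sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]        # hypothetical, evenly spaced data
odd_sample = [2.1, 3.4, 3.9, 5.0, 7.2]               # hypothetical data
w_even = shapiro_wilk_w(even_sample, A10)
w_odd = shapiro_wilk_w(odd_sample, A5)
```

For the evenly spaced sample, b is about 8.946 and $S^2$ is 82.5, so W is about 0.970. The odd-sized sample gives W of about 0.969.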
`
Table 5. Coefficients $\{a_{n-i+1}\}$ for the W test for normality, for n = 2(1)50

 i \ n     2       3       4       5       6       7       8       9      10
   1    0.7071  0.7071  0.6872  0.6646  0.6431  0.6233  0.6052  0.5888  0.5739
   2      -     0.0000  0.1677  0.2413  0.2806  0.3031  0.3164  0.3244  0.3291
   3      -       -       -     0.0000  0.0875  0.1401  0.1743  0.1976  0.2141
   4      -       -       -       -       -     0.0000  0.0561  0.0947  0.1224
   5      -       -       -       -       -       -       -     0.0000  0.0399

 i \ n    11      12      13      14      15      16      17      18      19      20
   1    0.5601  0.5475  0.5359  0.5251  0.5150  0.5056  0.4968  0.4886  0.4808  0.4734
   2    0.3315  0.3325  0.3325  0.3318  0.3306  0.3290  0.3273  0.3253  0.3232  0.3211
   3    0.2260  0.2347  0.2412  0.2460  0.2495  0.2521  0.2540  0.2553  0.2561  0.2565
   4    0.1429  0.1586  0.1707  0.1802  0.1878  0.1939  0.1988  0.2027  0.2059  0.2085
   5    0.0695  0.0922  0.1099  0.1240  0.1353  0.1447  0.1524  0.1587  0.1641  0.1686
   6    0.0000  0.0303  0.0539  0.0727  0.0880  0.1005  0.1109  0.1197  0.1271  0.1334
   7      -       -     0.0000  0.0240  0.0433  0.0593  0.0725  0.0837  0.0932  0.1013
   8      -       -       -       -     0.0000  0.0196  0.0359  0.0496  0.0612  0.0711
   9      -       -       -       -       -       -     0.0000  0.0163  0.0303  0.0422
  10      -       -       -       -       -       -       -       -     0.0000  0.0140

 i \ n    21      22      23      24      25      26      27      28      29      30
   1    0.4643  0.4590  0.4542  0.4493  0.4450  0.4407  0.4366  0.4328  0.4291  0.4254
   2    0.3185  0.3156  0.3126  0.3098  0.3069  0.3043  0.3018  0.2992  0.2968  0.2944
   3    0.2578  0.2571  0.2563  0.2554  0.2543  0.2533  0.2522  0.2510  0.2499  0.2487
   4    0.2119  0.2131  0.2139  0.2145  0.2148  0.2151  0.2152  0.2151  0.2150  0.2148
   5    0.1736  0.1764  0.1787  0.1807  0.1822  0.1836  0.1848  0.1857  0.1864  0.1870
   6    0.1399  0.1443  0.1480  0.1512  0.1539  0.1563  0.1584  0.1601  0.1616  0.1630
   7    0.1092  0.1150  0.1201  0.1245  0.1283  0.1316  0.1346  0.1372  0.1395  0.1415
   8    0.0804  0.0878  0.0941  0.0997  0.1046  0.1089  0.1128  0.1162  0.1192  0.1219
   9    0.0530  0.0618  0.0696  0.0764  0.0823  0.0876  0.0923  0.0965  0.1002  0.1036
  10    0.0263  0.0368  0.0459  0.0539  0.0610  0.0672  0.0728  0.0778  0.0822  0.0862
  11    0.0000  0.0122  0.0228  0.0321  0.0403  0.0476  0.0540  0.0598  0.0650  0.0697
  12      -       -     0.0000  0.0107  0.0200  0.0284  0.0358  0.0424  0.0483  0.0537
  13      -       -       -       -     0.0000  0.0094  0.0178  0.0253  0.0320  0.0381
  14      -       -       -       -       -       -     0.0000  0.0084  0.0159  0.0227
  15      -       -       -       -       -       -       -       -     0.0000  0.0076
`
`
`Table 5. Ooefl‘icients {an_i+1} for the W test for normality,
`for n = 2(1)50 (com?)
`34
`35
`36
`
`37
`
`38
`
`39
`
`40
`
`32
`
`33
`
`31
`
`0-4220
`-2921
`-2475
`-2145
`-1874
`
`0-1641
`-1433
`-1243
`-1066
`-0899
`
`0-4188
`-2898
`-2463
`-2141
`-1878
`
`0-1651
`-1449
`-1265
`-1093
`-0931
`
`0-4156
`-2876
`2451
`-2137
`'1880
`
`0-1660
`-1463
`1284
`-1118
`'0961
`
`0-0812
`-0669
`-0530
`-0395
`-0262
`
`0-4127
`-2854
`-2439
`2132
`'1882
`
`01667
`-1475
`-1301
`-1140
`-0988
`
`00844
`-0706
`-0572
`-0441
`-0314
`
`0-4096
`-2834
`-2427
`-2127
`-1883
`
`01673
`1487
`-1317
`-1160
`-1013
`
`00873
`-0739
`'0610
`-0484
`-0361
`
`0-4068
`-2813
`-2415
`-2121
`1883
`
`0-1678
`1496
`~1331
`-1179
`-1036
`
`0-0900
`'0770
`-0645
`'0523
`'0404
`
`0-4040
`-2794
`-2403
`-2116
`-1883
`
`0-1683
`-1505
`-1344
`-1196
`-1056
`
`0-0924
`'0798
`-0677
`-0559
`-0444
`
`0-4015
`-2774
`-2391
`2110
`-1881
`
`0-1686
`-1513
`-1356
`-1211
`-1075
`
`0-0947
`-0824
`-0706
`-0592
`-0481
`
`00739 00777
`-0585
`0629
`-0435
`-0485
`-0289
`-0344
`-0144
`-0206
`
`0-0187
`00131
`0-0000 00068
`00287
`00239
`00372
`00331
`'0062
`-0000
`——
`H
`-0172
`-0119
`-0264
`-0220
`
`-0057
`-0000
`-0158
`'0110
`
`-0053
`-0000
`
`
`0-3989
`-2755
`-2380
`-2104
`~1880
`
`0-1689
`-1520
`~1366
`-1225
`-1092
`
`00967
`-0848
`-0733
`-0622
`-0515
`
`0-0409
`-0305
`-0203
`'0101
`-0000
`
`03964
`2737
`-2368
`-2098
`~1878
`
`0-1691
`-1526
`-1376
`-1237
`-1108
`
`00986
`-0870
`'0759
`-0651
`-0546
`
`0-0444
`-0343
`'0244
`-0146
`-0049
`
`
`
`mistaken/3cech~unhara:5ew~xa~mpmuu3)—"°'NHI-lI-Ih-lhull-ls'cocoqo
`
`
`
`
 i \ n    41      42      43      44      45      46      47      48      49      50
   1    0.3940  0.3917  0.3894  0.3872  0.3850  0.3830  0.3808  0.3789  0.3770  0.3751
   2    0.2719  0.2701  0.2684  0.2667  0.2651  0.2635  0.2620  0.2604  0.2589  0.2574
   3    0.2357  0.2345  0.2334  0.2323  0.2313  0.2302  0.2291  0.2281  0.2271  0.2260
   4    0.2091  0.2085  0.2078  0.2072  0.2065  0.2058  0.2052  0.2045  0.2038  0.2032
   5    0.1876  0.1874  0.1871  0.1868  0.1865  0.1862  0.1859  0.1855  0.1851  0.1847

   6    0.1693  0.1694  0.1695  0.1695  0.1695  0.1695  0.1695  0.1693  0.1692  0.1691
   7    0.1531  0.1535  0.1539  0.1542  0.1545  0.1548  0.1550  0.1551  0.1553  0.1554
   8    0.1384  0.1392  0.1398  0.1405  0.1410  0.1415  0.1420  0.1423  0.1427  0.1430
   9    0.1249  0.1259  0.1269  0.1278  0.1286  0.1293  0.1300  0.1306  0.1312  0.1317
  10    0.1123  0.1136  0.1149  0.1160  0.1170  0.1180  0.1189  0.1197  0.1205  0.1212

  11    0.1004  0.1020  0.1035  0.1049  0.1062  0.1073  0.1085  0.1095  0.1105  0.1113
  12    0.0891  0.0909  0.0927  0.0943  0.0959  0.0972  0.0986  0.0998  0.1010  0.1020
  13    0.0782  0.0804  0.0824  0.0842  0.0860  0.0876  0.0892  0.0906  0.0919  0.0932
  14    0.0677  0.0701  0.0724  0.0745  0.0765  0.0783  0.0801  0.0817  0.0832  0.0846
  15    0.0575  0.0602  0.0628  0.0651  0.0673  0.0694  0.0713  0.0731  0.0748  0.0764

  16    0.0476  0.0506  0.0534  0.0560  0.0584  0.0607  0.0628  0.0648  0.0667  0.0685
  17    0.0379  0.0411  0.0442  0.0471  0.0497  0.0522  0.0546  0.0568  0.0588  0.0608
  18    0.0283  0.0318  0.0352  0.0383  0.0412  0.0439  0.0465  0.0489  0.0511  0.0532
  19    0.0188  0.0227  0.0263  0.0296  0.0328  0.0357  0.0385  0.0411  0.0436  0.0459
  20    0.0094  0.0136  0.0175  0.0211  0.0245  0.0277  0.0307  0.0335  0.0361  0.0386

  21    0.0000  0.0045  0.0087  0.0126  0.0163  0.0197  0.0229  0.0259  0.0288  0.0314
  22      -       -     0.0000  0.0042  0.0081  0.0118  0.0153  0.0185  0.0215  0.0244
  23      -       -       -       -     0.0000  0.0039  0.0076  0.0111  0.0143  0.0174
  24      -       -       -       -       -       -     0.0000  0.0037  0.0071  0.0104
  25      -       -       -       -       -       -       -       -     0.0000  0.0035
`
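As an illustration of how the tabulated coefficients enter the test, the following is a minimal sketch in Python (not part of the original paper). It assumes the statistic has the form W = b^2 / S^2, with b the sum over i = 1, ..., [n/2] of a_{n-i+1}(y_{n-i+1} - y_i) for the ordered sample y_1 <= ... <= y_n, and S^2 the sum of squared deviations from the sample mean. The function name `w_statistic` and the constant `A_31` are hypothetical names chosen for this example; the coefficient values are those listed for n = 31 in Table 5.

```python
from typing import Sequence

# Coefficients a_{n-i+1}, i = 1..15, for n = 31, copied from Table 5.
# The 16th tabulated entry (0.0000) belongs to the median and is omitted.
A_31 = [0.4220, 0.2921, 0.2475, 0.2145, 0.1874,
        0.1641, 0.1433, 0.1243, 0.1066, 0.0899,
        0.0739, 0.0585, 0.0435, 0.0289, 0.0144]


def w_statistic(sample: Sequence[float], coeffs: Sequence[float]) -> float:
    """Return W = b^2 / S^2 for a complete sample.

    `coeffs` must hold the [n/2] leading coefficients a_n, a_{n-1}, ...
    for the sample size n = len(sample).
    """
    y = sorted(sample)                      # order statistics y_1 <= ... <= y_n
    n = len(y)
    k = n // 2
    if len(coeffs) != k:
        raise ValueError(f"expected {k} coefficients for n = {n}")
    # b pairs the i-th largest with the i-th smallest observation.
    b = sum(a * (y[n - 1 - i] - y[i]) for i, a in enumerate(coeffs))
    mean = sum(y) / n
    s2 = sum((v - mean) ** 2 for v in y)    # S^2, sum of squared deviations
    if s2 == 0.0:
        raise ValueError("all observations are equal; W is undefined")
    return b * b / s2


if __name__ == "__main__":
    sample = [float(v) for v in range(31)]  # illustrative data only
    print(w_statistic(sample, A_31))
```

Small values of W count against normality, so the computed value is compared with the lower percentage points of Table 6; for n = 31 the tabulated 0.01 level point is 0.902. Because W is a ratio of a squared linear contrast to S^2, shifting or rescaling the sample should leave the result unchanged, which gives a quick check on any implementation.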
`
Table 6. Percentage points of the W test* for n = 3(1)50

                                    Level
  n     0.01   0.02   0.05   0.10   0.50   0.90   0.95   0.98   0.99
  3    0.753  0.756  0.767  0.789  0.959  0.998  0.999  1.000  1.000
  4    0.687  0.707  0.748  0.792  0.935  0.987  0.992  0.996  0.997
  5    0.686  0.715  0.762  0.806  0.927  0.979  0.986  0.991  0.993

  6    0.713  0.743  0.788  0.826  0.927  0.974  0.981  0.986  0.989
  7    0.730  0.760  0.803  0.838  0.928  0.972  0.979  0.985  0.988
  8    0.749  0.778  0.818  0.851  0.932  0.972  0.978  0.984  0.987
  9    0.764  0.791  0.829  0.859  0.935  0.972  0.978  0.984  0.986
 10    0.781  0.806  0.842  0.869  0.938  0.972  0.978  0.983  0.986

 11    0.792  0.817  0.850  0.876  0.940  0.973  0.979  0.984  0.986
 12    0.805  0.828  0.859  0.883  0.943  0.973  0.979  0.984  0.986
 13    0.814  0.837  0.866  0.889  0.945  0.974  0.979  0.984  0.986
 14    0.825  0.846  0.874  0.895  0.947  0.975  0.980  0.984  0.986
 15    0.835  0.855  0.881  0.901  0.950  0.975  0.980  0.984  0.987

 16    0.844  0.863  0.887  0.906  0.952  0.976  0.981  0.985  0.987
 17    0.851  0.869  0.892  0.910  0.954  0.977  0.981  0.985  0.987
 18    0.858  0.874  0.897  0.914  0.956  0.978  0.982  0.986  0.988
 19    0.863  0.879  0.901  0.917  0.957  0.978  0.982  0.986  0.988
 20    0.868  0.884  0.905  0.920  0.959  0.979  0.983  0.986  0.988

 21    0.873  0.888    …    0.923  0.960  0.980  0.983  0.987  0.989
 22    0.878  0.892    …    0.926  0.961  0.980  0.984  0.987  0.989
 23    0.881  0.895    …    0.928  0.962  0.981  0.984  0.987  0.989
 24    0.884  0.898    …    0.930  0.963  0.981  0.984  0.987  0.989
 25    0.888  0.901    …    0.931  0.964  0.981  0.985  0.988  0.989

 26    0.891  0.904    …      …      …      …      …      …      …
 27    0.894  0.906    …      …      …      …      …      …      …
 28    0.896  0.908    …      …      …      …      …      …      …
 29    0.898  0.910    …      …      …      …      …      …      …
 30    0.900    …      …      …      …      …      …      …      …

 31    0.902    …      …      …      …      …      …      …      …
 32    0.904    …      …      …      …      …      …      …      …
 33    0.906    …      …      …      …      …      …      …      …
 34    0.908    …      …      …      …      …      …      …      …
 35    0.910    …      …      …      …      …      …      …      …

 36    0.912    …      …      …      …      …      …      …      …
 37    0.914    …      …      …      …      …      …      …      …
 38    0.916    …      …      …      …      …      …      …      …
 39    0.917    …      …      …      …      …      …      …      …
 40    0.919    …      …      …      …      …      …      …      …

 41    0.920    …      …      …      …      …      …      …      …
 42    0.922    …      …      …      …      …      …      …      …
 43    0.923    …      …      …      …      …      …      …      …
 44    0.924    …      …      …      …      …      …      …      …
 45    0.926    …      …      …      …      …      …      …      …

 46    0.927    …      …      …      …      …      …      …      …
 47    0.928    …      …      …      …      …      …      …      …
 48    0.929    …      …      …      …      …      …      …      …
 49    0.929    …      …      …      …      …      …      …      …
 50    0.930    …      …      …      …      …      …      …      …