`
`JERROLD H. ZAR
`
`Associate Professor
Department of Biological Sciences
`Northern Illinois University
`
PRENTICE-HALL, INC.
`
Englewood Cliffs, N.J.
`
`
`
`
Library of Congress Cataloging in Publication Data

Zar, Jerrold H.
    Biostatistical analysis.

    (Prentice-Hall biological sciences series)
    Bibliography: p.
    1. Biometry.  I. Title.  [DNLM: 1. Biometry.
2. Statistics.  QH 405 Z36b 1974]
QH323.5.Z37      574'.01'5195      73-3443
ISBN 0-13-076984-3
`
© 1974 by PRENTICE-HALL, INC., Englewood Cliffs, N.J.
`
All rights reserved. No part of this book may be reproduced in any form or by any means without permission in writing from the publisher.
`
`109 8
`
`7
`
`Printed in the United States of America
`
`
PRENTICE-HALL INTERNATIONAL, INC., London
PRENTICE-HALL OF AUSTRALIA, PTY. LTD., Sydney
PRENTICE-HALL OF CANADA, LTD., Toronto
PRENTICE-HALL OF INDIA PRIVATE LIMITED, New Delhi
PRENTICE-HALL OF JAPAN, INC., Tokyo
`
`
`
`
4

Measures of Dispersion and Variability
`
In addition to a measure of central tendency, it is generally desirable to have a measure of dispersion of data. A measure of dispersion, or a measure of variability, as it is sometimes called, is an indication of the clustering of measurements around the center of the distribution, or, conversely, an indication of how variable the measurements are. Measures of dispersion of populations are parameters of the population, and the sample measures of dispersion that estimate them are statistics.
`
`4.1 The Range
`
The difference between the highest and lowest measurements in a group of data is termed the range. If sample measurements are arranged in increasing order of magnitude, as if the median were about to be determined, then

$$\text{sample range} = X_n - X_1. \qquad (4.1)$$

Sample 1 in Example 4.1 is a hypothetical set of data in which X_1 = 1.2 g and X_n = 2.4 g. Thus, the range may be expressed as 1.2 to 2.4 g, or as 2.4 g - 1.2 g = 1.2 g. (We might bear in mind that X_1 is really within the limits of 1.15 to 1.25 g and X_n is really within 2.35 to 2.45 g, so that the range of the sample would be expressed by a few authors as 2.45 g - 1.15 g = 1.3 g.) Note that the range has the same units as the individual measurements.
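A short computational sketch (ours, not part of the original text) of Equation (4.1), applied to the hypothetical Sample 1 of Example 4.1; the variable names are illustrative only.

```python
# Sample 1 of Example 4.1: seven weights, in grams
sample_1 = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]

# Equation (4.1): with the data ordered by magnitude, range = X_n - X_1
ordered = sorted(sample_1)
sample_range = ordered[-1] - ordered[0]
print(f"range = {sample_range:.1f} g")   # range = 1.2 g
```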
The range is a relatively crude measure of dispersion, inasmuch as it does not take into account any measurements except the highest and the lowest. Furthermore, since it is unlikely that a sample will contain both the highest and lowest values in the population, the sample range usually underestimates the population range; therefore,
`29
`
`
`
`
`30
`
`Measures of Dispersion and Variability
`
`Ch. 4
`
`Example 4.1 Calculation of measures of dispersion for two hypothetical
`samples.
`
`
`
`
`
`
`
Sample 1

X_i (g)    X_i - X̄ (g)    |X_i - X̄| (g)    (X_i - X̄)² (g²)
 1.2          -0.6              0.6               0.36
 1.4          -0.4              0.4               0.16
 1.6          -0.2              0.2               0.04
 1.8           0.0              0.0               0.00
 2.0           0.2              0.2               0.04
 2.2           0.4              0.4               0.16
 2.4           0.6              0.6               0.36

Σ X_i = 12.6 g    Σ (X_i - X̄) = 0.0 g    Σ |X_i - X̄| = 2.4 g    Σ (X_i - X̄)² = 1.12 g² = "sum of squares"

X̄ = 12.6 g / 7 = 1.8 g
range = X_7 - X_1 = 2.4 g - 1.2 g = 1.2 g
mean deviation = Σ |X_i - X̄| / n = 2.4 g / 7 = 0.34 g
s² = Σ (X_i - X̄)² / (n - 1) = 1.12 g² / 6 = 0.1867 g²
s = √(0.1867 g²) = 0.43 g

Sample 2

X_i (g)    X_i - X̄ (g)    |X_i - X̄| (g)    (X_i - X̄)² (g²)
 1.2          -0.6              0.6               0.36
 1.6          -0.2              0.2               0.04
 1.7          -0.1              0.1               0.01
 1.8           0.0              0.0               0.00
 1.9           0.1              0.1               0.01
 2.0           0.2              0.2               0.04
 2.4           0.6              0.6               0.36

Σ X_i = 12.6 g    Σ (X_i - X̄) = 0.0 g    Σ |X_i - X̄| = 1.8 g    Σ (X_i - X̄)² = 0.82 g² = "sum of squares"

X̄ = 12.6 g / 7 = 1.8 g
range = X_7 - X_1 = 2.4 g - 1.2 g = 1.2 g
mean deviation = Σ |X_i - X̄| / n = 1.8 g / 7 = 0.26 g
s² = Σ (X_i - X̄)² / (n - 1) = 0.82 g² / 6 = 0.1367 g²
s = √(0.1367 g²) = 0.37 g
`
`
`
`
`
it is a biased and inefficient estimator. Nonetheless, it is useful in some circumstances to present the sample range as an estimate (although a poor one) of the population range. Taxonomists are frequently concerned, for example, with having an estimate of what the highest and lowest values in a population are expected to be. Whenever the range is specified in reporting data, however, it is usually a good practice to report another measure of dispersion as well. The range is applicable to ordinal, interval, and ratio scale data.
`
`4.2 The Mean Deviation
`
As is evident from the two samples in Example 4.1, the range conveys no information about how clustered about the middle of the distribution the measurements are. Since the mean is so useful a measure of central tendency, one might express dispersion in terms of deviations from the mean. The sum of all deviations from the mean, i.e., Σ (X_i - X̄), will always equal zero, however, so such a summation would be useless as a measure of dispersion (see Example 4.1).

To sum the absolute values of the deviations from the mean results in a quantity that is an expression of dispersion about the mean. Dividing this quantity by n yields a measure known as the mean deviation, or mean absolute deviation, of the sample. In Example 4.1, sample 1 is more variable (or more dispersed, or less concentrated) than sample 2. Although the two samples have the same range, the mean deviation, calculated as

$$\text{sample mean deviation} = \frac{\sum |X_i - \bar{X}|}{n}, \qquad (4.2)$$

expresses the differences in dispersion. The mean deviation can also be defined by using the sum of the absolute deviations from the median rather than from the mean.
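A minimal sketch (ours) of Equation (4.2), applied to the two hypothetical samples of Example 4.1; it reproduces the mean deviations of 0.34 g and 0.26 g, showing how two samples with the same range can differ in dispersion.

```python
def mean_deviation(x):
    """Equation (4.2): sum of |X_i - X-bar| divided by n."""
    x_bar = sum(x) / len(x)
    return sum(abs(xi - x_bar) for xi in x) / len(x)

sample_1 = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]   # grams
sample_2 = [1.2, 1.6, 1.7, 1.8, 1.9, 2.0, 2.4]   # grams

print(round(mean_deviation(sample_1), 2))   # 0.34 (g)
print(round(mean_deviation(sample_2), 2))   # 0.26 (g)
```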
`
`4.3 The Variance
`
Another method of eliminating the signs of the deviations from the mean is to square the deviations. The sum of the squares of the deviations from the mean is called the sum of squares, abbreviated SS, and is defined as follows:

$$\text{sample } SS = \sum (X_i - \bar{X})^2. \qquad (4.3)$$

The mean sum of squares is called the variance (or mean square, the latter being short for mean squared deviation), and for a population is denoted by σ² ("sigma squared," using the lowercase Greek letter):

$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}. \qquad (4.5)$$
`
`
`
`
`
`
`
`
`32
`
`Measures of Dispersion and Variability
`
`Ch, 4
`
`deviation, and coefficient of variation.
`
The best estimate of the population variance, σ², is the sample variance, s²:

$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}. \qquad (4.6)$$

The replacement of μ by X̄ and N by n in Equation (4.5) results in a quantity which is a biased estimate of σ². Dividing the sample sum of squares by n - 1 (called the degrees of freedom, abbreviated DF) rather than by n yields an unbiased estimate, and it is Equation (4.6) which should be used to calculate the sample variance. If all observations are equal, then there is no variability and s² = 0; and s² becomes increasingly large as the amount of variability, or dispersion, increases. Since s² is a mean sum of squares, it can never be a negative quantity.

The variance expresses the same type of information as does the mean deviation, but it has certain very important properties relative to probability and hypothesis testing that make it distinctly superior. Thus, the mean deviation is very seldom encountered in biostatistical analysis.
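A minimal computational sketch (ours, not the book's) of Equations (4.3) and (4.6), applied to Sample 1 of Example 4.1; the square root at the end anticipates the standard deviation of Section 4.4.

```python
import math

def sample_variance(x):
    """Equation (4.6): s^2 = SS / (n - 1), with SS from Equation (4.3)."""
    n = len(x)
    x_bar = sum(x) / n
    ss = sum((xi - x_bar) ** 2 for xi in x)   # Equation (4.3): sum of squares
    return ss / (n - 1)

sample_1 = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]   # grams
s2 = sample_variance(sample_1)
print(round(s2, 4))               # 0.1867 (g^2)
print(round(math.sqrt(s2), 2))    # 0.43 (g), the sample standard deviation
```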
Example 4.2  "Machine formula" calculation of variance, standard deviation, and coefficient of variation.

Sample 1                                 Sample 2

X_i (g)     X_i² (g²)                    X_i (g)     X_i² (g²)
 1.2          1.44                        1.2          1.44
 1.4          1.96                        1.6          2.56
 1.6          2.56                        1.7          2.89
 1.8          3.24                        1.8          3.24
 2.0          4.00                        1.9          3.61
 2.2          4.84                        2.0          4.00
 2.4          5.76                        2.4          5.76

Σ X_i = 12.6 g                           Σ X_i = 12.6 g
Σ X_i² = 23.80 g²                        Σ X_i² = 23.50 g²
n = 7                                    n = 7
X̄ = 12.6 g / 7 = 1.8 g                   X̄ = 12.6 g / 7 = 1.8 g

SS = Σ X_i² - (Σ X_i)²/n                 SS = 23.50 g² - (12.6 g)²/7
   = 23.80 g² - 22.68 g² = 1.12 g²          = 23.50 g² - 22.68 g² = 0.82 g²

s² = SS/(n - 1) = 1.12 g²/6              s² = 0.82 g²/6 = 0.1367 g²
   = 0.1867 g²

s = √(0.1867 g²) = 0.43 g                s = √(0.1367 g²) = 0.37 g

V = s/X̄ = 0.43 g/1.8 g                   V = 0.37 g/1.8 g
  = 0.24 = 24%                             = 0.21 = 21%
`
`
`
`
`
`
`
`
`
`
`33
`
`Sec. 4.4
`
`The Standard Deviation
`
The calculation of s² can be tedious for large samples, but it can be facilitated by the use of the equality

$$\text{sample } SS = \sum X_i^2 - \frac{\left(\sum X_i\right)^2}{n}. \qquad (4.7)$$

Although this formula might appear more complicated than (4.3), it is in reality simpler to work with. Example 4.2 demonstrates its use to obtain a sample sum of squares. Proof that Equations (4.3) and (4.7) are equivalent is given in Appendix B. Since the sample variance equals the sample SS divided by DF,

$$s^2 = \frac{\sum X_i^2 - \dfrac{\left(\sum X_i\right)^2}{n}}{n - 1}. \qquad (4.8)$$

This last formula is often referred to as a "working formula," or "machine formula," because of its computational advantages. There are, in fact, two major advantages in calculating SS by Equation (4.7) rather than by Equation (4.3). First, fewer computational steps are involved, a fact that decreases the chance of error. On a good desk calculator, the summed quantities, Σ X_i and Σ X_i², can both be obtained with only one pass through the data, whereas Equation (4.3) requires one pass through the data to calculate X̄, and at least one more pass to calculate and sum the squares of the deviations, X_i - X̄. Second, there may be a good deal of rounding error in calculating each X_i - X̄, a situation which leads to decreased accuracy in computation, but which is avoided by the use of Equation (4.7).
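The equivalence of Equations (4.3) and (4.7) can also be checked numerically; this sketch (ours) computes the sum of squares both ways for Sample 1 of Example 4.2 and confirms that they agree.

```python
sample_1 = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]   # grams
n = len(sample_1)
x_bar = sum(sample_1) / n

# Equation (4.3): deviations from the mean, squared and summed
ss_deviation = sum((x - x_bar) ** 2 for x in sample_1)

# Equation (4.7), the "machine formula": only sum(X) and sum(X^2) are needed
sum_x = sum(sample_1)
sum_x2 = sum(x ** 2 for x in sample_1)
ss_machine = sum_x2 - sum_x ** 2 / n

print(round(ss_deviation, 2), round(ss_machine, 2))   # 1.12 1.12
print(round(ss_machine / (n - 1), 4))                 # 0.1867, the variance of Equation (4.8)
```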
For data recorded in frequency tables,

$$\text{sample } SS = \sum f_i X_i^2 - \frac{\left(\sum f_i X_i\right)^2}{n}, \qquad (4.9)$$

where f_i is the frequency of observations with magnitude X_i. But with a desk calculator it is often faster to use Equation (4.7) for each individual observation, disregarding the class groupings.*
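A brief sketch (ours; the frequency table itself is hypothetical) of Equation (4.9), in which each distinct value X_i occurs with frequency f_i.

```python
# Hypothetical frequency table: value X_i (g) -> frequency f_i
freq = {1.2: 2, 1.4: 3, 1.6: 4, 1.8: 1}

n = sum(freq.values())
sum_fx = sum(f * x for x, f in freq.items())
sum_fx2 = sum(f * x ** 2 for x, f in freq.items())

# Equation (4.9): SS = sum(f X^2) - (sum(f X))^2 / n
ss = sum_fx2 - sum_fx ** 2 / n
print(round(ss, 3))   # 0.336 (g^2)
```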
`The variance has square units. If measurements are in grams, their variance will
`be in grams squared, or if the measurements are in cubic centimeters, their variance
`will be in terms of cubic centimeters squared, even though such squared units have no
`physical interpretation.
`
`4.4 The Standard Deviation
`
`The standard deviation is the positive square root of the variance; therefore, it has
`the same units as the original measurements. Thus, for a population,
`
*When calculating s² from frequency tables of continuous data (e.g., Example 1.5) or grouped discrete data (e.g., Example 1.4b), the result is a slightly biased estimate of σ², the statistic being a little inflated by an amount related to the class interval size. Sheppard's correction (Sheppard, 1898) occasionally is suggested to eliminate this bias; but it is only rarely employed, partly because the amount of bias generally is relatively very small (unless the data are grouped into too few classes), and partly because at times it results in a value for s² which is in fact more biased an estimator than is the uncorrected s² (Croxton, Cowden, and Klein, 1967: 213, 536).
`
`
`
`
`
`
`34
`
`Measures of Dispersion and Variability
`
`Ch. 4
`
$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}, \qquad (4.10)$$

and for a sample,

$$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{\sum X_i^2 - \dfrac{\left(\sum X_i\right)^2}{n}}{n - 1}}. \qquad (4.11)$$
Example 4.1 demonstrates the calculation of s. This quantity frequently is abbreviated SD, and on rare occasions is called the root mean square deviation. Remember that the standard deviation is, by definition, always a nonnegative quantity.*

Some modern desk calculators have automatic square root capability. Since many do not, Appendix Tables D.2 and D.3 are supplied, for the obtaining of square roots is a recurring necessity in statistical analysis.
`
`4.5 The Coefficient of Variation
`
The coefficient of variation, or coefficient of variability, is defined as

$$V = \frac{s}{\bar{X}}, \quad \text{or} \quad V = \frac{s}{\bar{X}} \times 100\%. \qquad (4.12)$$

Since s/X̄ is generally a small quantity, it is frequently multiplied by 100% in order to express V as a percentage.
As a measure of variability, the variance and standard deviation have magnitudes which are dependent on the magnitude of the data. Elephants have ears that are perhaps 100 times larger than those of mice. If elephant ears were no more variable, relative to their size, than mouse ears, relative to their size, the standard deviation of elephant ear lengths would be 100 times as great as the standard deviation of mouse ear lengths (and the variance of the former would be 100² = 10,000 times the variance of the latter). The coefficient of variation expresses sample variability relative to the mean of the sample (and is on rare occasion referred to as the "relative standard deviation"). It is called a measure of relative variability or relative dispersion.

Since s and X̄ have identical units, V has no units at all, a fact which emphasizes that it is a relative measure, divorced from the actual magnitude or units of measurement of the data. Thus, had the data in Example 4.2 been measured in pounds, kilograms, or tons, instead of grams, the calculated V would have been the same. The coefficient of variability may be calculated only for ratio scale data; it is, for example, not valid to calculate coefficients of variation of temperature data measured on the Celsius or Fahrenheit temperature scales. Simpson, Roe, and Lewontin (1960: 89-95) present a good discussion of V and its biological application, especially with regard to zoomorphological measurements.
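A short sketch (ours) of Equation (4.12), applied to the two samples of Example 4.2; it reproduces V = 24% and V = 21%, and the result is unchanged if the data are rescaled to other units.

```python
import math

def coeff_of_variation(x):
    """Equation (4.12): V = s / X-bar, expressed here as a percentage."""
    n = len(x)
    x_bar = sum(x) / n
    s = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    return 100.0 * s / x_bar

sample_1 = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]   # grams
sample_2 = [1.2, 1.6, 1.7, 1.8, 1.9, 2.0, 2.4]   # grams
print(round(coeff_of_variation(sample_1)))   # 24 (%)
print(round(coeff_of_variation(sample_2)))   # 21 (%)
```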
`
*The sample s is actually a slightly biased estimate of the population σ, in that on the average it estimates a trifle low, especially in small samples. But this fact generally is considered to be offset by the statistic's usefulness. Correction for this bias is sometimes possible (e.g., Bliss, 1967: 131; Dixon and Massey, 1969: 136; Gurland and Tripathi, 1971; Tolman, 1971), but it is rarely employed.
`
`
`
`
`
`
`
`
`
`7.3 The Distribution of Means
`
If random samples of size n are drawn from a normal population, the means of these samples will form a normal distribution. The distribution of means from a nonnormal population will not be normal but will tend toward normality as n increases in size. Furthermore, the variance of the distribution of means will decrease as n increases; in fact, the variance of the population of all possible means of samples of size n from a population with variance σ² is

$$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}. \qquad (7.12)$$

The quantity σ²_X̄ is called the variance of the mean, and the preceding comments on the distribution of means come from a very important mathematical theorem, known as the central limit theorem. A distribution of sample statistics is called a sampling distribution; therefore, we are discussing the sampling distribution of means.

Since σ²_X̄ has square units, its square root, σ_X̄, will have the same units as the original measurements (and, therefore, the same units as the mean, μ, and the standard deviation, σ). This value, σ_X̄, is the standard deviation of the mean. The standard deviation of a parameter or of a statistic is referred to as a standard error; thus, σ_X̄ is frequently called the standard error of the mean, or simply the standard error (sometimes abbreviated SE):

$$\sigma_{\bar{X}} = \sqrt{\frac{\sigma^2}{n}}, \quad \text{or} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}. \qquad (7.13)$$

Just as Z = (X_i - μ)/σ is a normal deviate that refers to the normal distribution of X_i values,

$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \qquad (7.14)$$

is a normal deviate referring to the normal distribution of means (X̄ values). Thus, we can ask questions such as: what is the probability of obtaining a random sample of nine measurements with a mean larger than 50.0 mm from a population having a mean of 47.0 mm and a standard deviation of 12.0 mm? This and other examples of the use of normal deviates for the sampling distribution of means are presented in Example 7.2.
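The question just posed can be worked directly from Equations (7.12) through (7.14); this sketch (ours; Example 7.2 itself is not reproduced in this excerpt) uses the standard normal distribution in place of a table of normal deviates.

```python
import math

mu, sigma, n = 47.0, 12.0, 9           # population mean and SD (mm), sample size
sigma_xbar = sigma / math.sqrt(n)      # Equation (7.13): standard error of the mean = 4.0 mm

Z = (50.0 - mu) / sigma_xbar           # Equation (7.14): Z = (X-bar - mu) / sigma_xbar

# P(X-bar > 50.0 mm) = P(Z > 0.75), from the standard normal survival function
p = 0.5 * math.erfc(Z / math.sqrt(2))
print(round(Z, 2), round(p, 3))        # 0.75 0.227
```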
As seen from Equation (7.13), to determine σ_X̄ one must know σ² (or σ), which is a population parameter. Since we very seldom can calculate population parameters, we must rely on estimating them from random samples taken from the population. The best estimate of σ²_X̄, the population variance of the mean, is

$$s_{\bar{X}}^2 = \frac{s^2}{n}, \qquad (7.15)$$

the sample variance of the mean. Thus,

$$s_{\bar{X}} = \sqrt{\frac{s^2}{n}}, \quad \text{or} \quad s_{\bar{X}} = \frac{s}{\sqrt{n}}, \qquad (7.16)$$

is an estimate of σ_X̄ and is the sample standard error of the mean. Example 7.3 demon-
`
`
`
`
`
`
`
Unfortunately, we are faced with the requirement of the variance ratio test that the two underlying distributions be normal (or nearly normal). Thus, this test must be applied with caution, for if the two sets of sample data are, in fact, from normal populations, the logarithms of the data will not be normally distributed. The requirement here is that the logarithms be normally distributed.
`
`
`9.3 Testing for Difference Between Two Means
`
Example 9.1 presented as data the number of moths captured in each of two types of traps. The two-tailed hypotheses, H0: μ1 - μ2 = 0 and HA: μ1 - μ2 ≠ 0, can be proposed to test whether the two traps possess the same efficiency in catching moths (i.e., whether they catch the same numbers of moths). These hypotheses are commonly expressed in their equivalent forms: H0: μ1 = μ2 and HA: μ1 ≠ μ2.

If the two samples came from normal populations, and if the two populations have equal variances, then a t value may be calculated in a manner analogous to the t test introduced in Section 8.1. The t value for testing the preceding hypotheses concerning the difference between two means is

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}. \qquad (9.4)$$

The quantity X̄1 - X̄2 is simply the difference between the two means, and s_X̄1-X̄2 is the standard error of the difference between the means.

The quantity s_X̄1-X̄2, along with s²_X̄1-X̄2, the variance of the difference between the means, is new to us, and we need to consider it further. Both s²_X̄1-X̄2 and s_X̄1-X̄2 are statistics that can be calculated from the sample data and are estimates of the population parameters, σ²_X̄1-X̄2 and σ_X̄1-X̄2, respectively. We can show mathematically that the variance of the difference between two variables is equal to the sum of the variances of the two variables, so that σ²_X̄1-X̄2 = σ²_X̄1 + σ²_X̄2. Since σ²_X̄ = σ²/n, we can write

$$\sigma_{\bar{X}_1 - \bar{X}_2}^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.$$

Recall that in the two-sample t test, we assume that σ1² = σ2²; therefore, we can write

$$\sigma_{\bar{X}_1 - \bar{X}_2}^2 = \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}. \qquad (9.5)$$

Thus, to calculate the estimate of σ²_X̄1-X̄2, we must have an estimate of σ². Since both s1² and s2² are assumed to estimate σ², we compute the pooled variance, s_p², which is then used as the best estimate of σ²:

$$s_p^2 = \frac{SS_1 + SS_2}{\nu_1 + \nu_2}. \qquad (9.6)$$
`°
`
`
`
`
`
`
`
Thus,

$$s_{\bar{X}_1 - \bar{X}_2}^2 = \frac{s_p^2}{n_1} + \frac{s_p^2}{n_2} \qquad (9.7)$$

and

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}. \qquad (9.8)$$

Example 9.4 summarizes the procedure for testing the hypotheses under consideration. The critical value to be obtained from Table D.10 is t_α(2),(ν1+ν2), the two-tailed t value for the α significance level, with ν1 + ν2 degrees of freedom.
`
Example 9.4  The two-sample t test for the two-tailed hypotheses, H0: μ1 = μ2 and HA: μ1 ≠ μ2 (which could also be stated as H0: μ1 - μ2 = 0 and HA: μ1 - μ2 ≠ 0). The calculations utilize the data of Example 9.1.

n1 = 11                    n2 = 8
ν1 = 10                    ν2 = 7
X̄1 = 34.5 moths            X̄2 = 57.2 moths
SS1 = 218.73 moths²        SS2 = 107.50 moths²

s_p² = (SS1 + SS2)/(ν1 + ν2) = (218.73 + 107.50)/17 = 326.23/17 = 19.19 moths²

s_X̄1-X̄2 = √(s_p²/n1 + s_p²/n2) = √(1.74 + 2.40) = √4.14 = 2.0 moths

t = (X̄1 - X̄2)/s_X̄1-X̄2 = (34.5 - 57.2)/2.0 = -22.7/2.0 = -11.35

t_0.05(2),(ν1+ν2) = t_0.05(2),17 = 2.110

Therefore, reject H0.

P(|t| ≥ 11.35) < 0.001
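A sketch (ours) of the pooled-variance t test of Equations (9.6) through (9.8), using the summary figures of Example 9.4. On raw data, scipy.stats.ttest_ind (with its default equal_var=True) performs the same pooled-variance test.

```python
import math

# Summary values from Example 9.4 (moth catches in two types of trap)
n1, ss1, xbar1 = 11, 218.73, 34.5
n2, ss2, xbar2 = 8, 107.50, 57.2
nu1, nu2 = n1 - 1, n2 - 1

sp2 = (ss1 + ss2) / (nu1 + nu2)        # Equation (9.6): pooled variance = 19.19 moths^2
se = math.sqrt(sp2 / n1 + sp2 / n2)    # square root of Equation (9.7): SE of the difference
t = (xbar1 - xbar2) / se               # Equation (9.8)

print(round(se, 2), round(t, 2))       # 2.04 -11.15
# The text rounds the standard error to 2.0 moths and so reports t = -11.35;
# either way |t| far exceeds t_0.05(2),17 = 2.110, and H0: mu1 = mu2 is rejected.
```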
`
`
`
`
One-tailed hypotheses can be tested in situations where the investigator is interested in detecting a difference in only one direction. For example, an entomologist might own a number of moth traps of the second type mentioned in Example 9.4, and he might wish to determine whether he should change to traps of type 1. Now, if type 1 traps are no more efficient at moth catching than are type 2 traps, there will be no reason to abandon the use of the latter. That is, if μ1 ≤ μ2, the entomologist would choose to retain his present supply of type 2 traps; but if μ1 > μ2, he would be justified in discarding them in favor of type 1 traps. The t statistic is calculated by Equation (9.8), just as for the two-tailed test. But this calculated t is then compared with the critical value, t_α(1),ν1+ν2, rather than with t_α(2),ν1+ν2. In other cases, the one-tailed hypotheses, H0: μ1 ≥ μ2 and HA: μ1 < μ2, may be appropriate.
Note that H0: μ1 = μ2 can be written H0: μ1 - μ2 = 0, H0: μ1 ≤ μ2 can be written H0: μ1 - μ2 ≤ 0, and H0: μ1 ≥ μ2 can be written H0: μ1 - μ2 ≥ 0. More generally, the hypothesized difference between the two means may be some constant, c, other than zero, and the generalized t statistic is

$$t = \frac{|\bar{X}_1 - \bar{X}_2| - c}{s_{\bar{X}_1 - \bar{X}_2}}. \qquad (9.9)$$
`
`
`
`
`
`
`
`
`
Thus, for example, the entomologist might have considered that because of the expense of purchasing an entire new set of moth traps, he would do so only if he had reason to conclude that the new traps could catch more than 10 more moths per night than the present traps. Here, H0: μ1 - μ2 ≤ 10 moths, HA: μ1 - μ2 > 10 moths, and t = (|X̄1 - X̄2| - 10 moths)/s_X̄1-X̄2 = (22.7 - 10)/2.0 = 6.35. The critical value is t_0.05(1),17 = 1.740, so we would reject H0. This test, then, allows us to conclude, with 95% confidence, that type 1 traps have a trapping efficiency at least 10 moths per night greater than do type 2 traps. Thus, the one-tailed two-sample test can examine a hypothesis that one population mean is a specified amount larger (or smaller) than a second. By the procedure of Section 9.6 one can even test whether the measurements in one population are a specified amount as large (or as small) as those in a second population.
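The same calculation, testing H0: μ1 - μ2 ≤ 10 moths with the generalized statistic of Equation (9.9), in a brief sketch (ours) that uses the rounded standard error quoted in the text.

```python
diff = 22.7    # |X-bar1 - X-bar2|, moths per night
c = 10.0       # hypothesized difference under H0: mu1 - mu2 <= 10 moths
se = 2.0       # standard error of the difference, as rounded in the text

t = (diff - c) / se      # Equation (9.9)
t_crit = 1.740           # t_0.05(1),17 from Table D.10
print(t, t > t_crit)     # 6.35 True, so H0 is rejected
```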
`
Violations of the Two-Sample t Test Assumptions.  The two-sample t test assumes, by dint of its underlying theory, that both samples came at random from normal populations with equal variances. The biological researcher cannot, however, always be assured that these assumptions are correct. Fortunately, numerous studies have shown that the t test is robust enough to stand considerable departures from its theoretical assumptions, especially if the sample sizes are equal or nearly equal, and especially when two-tailed hypotheses are considered (e.g., Boneau, 1960; Box, 1953; Cochran, 1947). If the underlying populations are markedly skewed, then one should be wary of one-tailed testing, and if there is considerable nonnormality in the populations, then very small significance levels (say, α < 0.01) should not be depended upon. Equal variances appear to be generally the more important of the two assumptions, and thus some authors have recommended the testing of the hypothesis H0: σ1² = σ2² prior to commencing a two-sample t test. However, the procedure for testing this hypothesis (Section 9.1) is adversely affected by deviations from its underlying normality assumption, whereas the t test is robust with regard to its underlying assumptions, so to "make the preliminary test on variances is rather like putting out to sea in a rowing boat to find out whether conditions are sufficiently calm for an ocean liner to leave port!" (Box, 1953).

In conclusion, two-sample t testing may be employed except in cases where it is felt there are severe deviations from the normality and equality of variance assumptions. In such cases, the nonparametric test of Section 9.6 would better be employed. Alternatively, the Behrens-Fisher procedure (Fisher and Yates, 1963: 3-4, 60-61) or appropriate modifications of the t test (e.g., Cochran, 1964; Cochran and Cox, 1957: 100-102; Dixon and Massey, 1969: 119; Satterthwaite, 1946) might be used.*
`
`9.4 Confidence Limits for Means
`
In Section 8.3, we defined the confidence interval for a population mean as X̄ ± t_α(2),ν s_X̄, where s_X̄ is the best estimate of σ_X̄ and is calculated as √(s²/n). For the two-sample situation, where we assume that σ1² = σ2², the confidence interval for either

*In Fisher and Yates, s refers to the standard error, not the standard deviation.
`
`
`
`
`
`
`
`
`
arranged in increasing order of magnitude. Pairwise differences between rank sums are then tabulated, starting with the difference between the largest and smallest rank sums, and proceeding in the same sequence as described in Section 12.1. The standard error is calculated as

$$SE = \sqrt{\frac{(n)(nk)(nk + 1)}{12}} \qquad (12.5)$$

(Nemenyi, 1963; Wilcoxon and Wilcox, 1964: 10), and the tabled Studentized range to be used is q_α,∞,k. Note that this multiple range test requires that there be equal numbers of data in each of the k groups.
`
`12.4 Comparison of a Control Mean to Each Other Group Mean
`
Sometimes the objective of multisample experiments with k samples, or groups, is to determine whether the mean of one group, designated as a "control," differs significantly from each of the means of the k - 1 other groups. Dunnett (1955) has provided a procedure for such testing, which differs from the multiple comparison approach in Section 12.1 in that the investigator is here not interested in all possible comparisons of pairs of group means, but only in those k - 1 comparisons involving the "control" group. Knowing k, the total number of groups in the experiment, and ν, the error degrees of freedom from the analysis of variance for H0: μ1 = μ2 = ⋯ = μk, one obtains critical values from either Table D.13 or Table D.14, depending on whether the hypotheses are to be one-tailed or two-tailed, respectively. We shall refer to these tabled values as q'_α,ν,p, for they are used in a manner similar to that of the q_α,ν,p values employed in the Newman-Keuls test. As in the SNK procedure, the error rate, α, denotes the probability of committing a Type I error somewhere among all of the pairwise comparisons made. The standard error for Dunnett's test is

$$SE = \sqrt{\frac{2s^2}{n}} \qquad (12.6)$$

where group sizes are equal, or

$$SE = \sqrt{s^2\left(\frac{1}{n_{\text{control}}} + \frac{1}{n_A}\right)} \qquad (12.7)$$

when group sizes are not equal (Steel and Torrie, 1960: 114). The testing procedure utilizing Equations (12.2) and (12.6) is demonstrated in Example 12.4. For a two-tailed test, if |q'| ≥ q'_α(2),ν,p, then H0: μcontrol = μA is rejected. In a one-tailed test, H0: μcontrol ≤ μA would be rejected if |q'| ≥ q'_α(1),ν,p and X̄control > X̄A; and H0: μcontrol ≥ μA would be rejected if |q'| ≥ q'_α(1),ν,p and X̄control < X̄A.

The null hypothesis H0: μcontrol = μA is, of course, a special case of H0: μcontrol - μA = c, where c = 0. Other values of c may appear in the hypothesis, however, and such hypotheses may be tested by placing |X̄control - X̄A| - c in the numerator of the q' calculation. In a similar manner, H0: μcontrol - μA ≤ c or H0: μcontrol - μA ≥ c may be tested.
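A sketch (ours; the numerical values are hypothetical, since Example 12.4 is not reproduced in this excerpt) of the standard-error computation for Dunnett's test, Equations (12.6) and (12.7), with q' formed, as in the multiple comparisons of Section 12.1, as the difference between the two means divided by SE.

```python
import math

# Hypothetical summary values for illustration only
error_ms = 9.765        # error mean square (s^2) from the analysis of variance
n = 10                  # observations per group, when group sizes are equal
xbar_control, xbar_a = 40.3, 44.8

# Equation (12.6): SE when all group sizes equal n
se_equal = math.sqrt(2 * error_ms / n)

# Equation (12.7): SE when the control group and group A differ in size
n_control, n_a = 10, 8
se_unequal = math.sqrt(error_ms * (1 / n_control + 1 / n_a))

# q' is then compared with the critical value q'_(alpha, nu, p) from Table D.13 or D.14
q_prime = (xbar_control - xbar_a) / se_equal
print(round(se_equal, 2), round(se_unequal, 2), round(q_prime, 2))
```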
`
`
`
`
`
`
`
TABLE D.10  Critical Values of the t Distribution

 ν   α(2): 0.50   0.20   0.10    0.05    0.02    0.01    0.005    0.002    0.001
     α(1): 0.25   0.10   0.05    0.025   0.01    0.005   0.0025   0.001    0.0005

 1        1.000  3.078  6.314  12.706  31.821  63.657  127.321  318.309  636.619
 2        0.816  1.886  2.920   4.303   6.965   9.925   14.089   22.327   31.599
 3        0.765  1.638  2.353   3.182   4.541   5.841    7.453   10.215   12.924
 4        0.741  1.533  2.132   2.776   3.747   4.604    5.598    7.173    8.610
 5        0.727  1.476  2.015   2.571   3.365   4.032    4.773    5.893    6.869
 6        0.718  1.440  1.943   2.447   3.143   3.707    4.317    5.208    5.959
 7        0.711  1.415  1.895   2.365   2.998   3.499    4.029    4.785    5.408
 8        0.706  1.397  1.860   2.306   2.896   3.355    3.833    4.501    5.041
 9        0.703  1.383  1.833   2.262   2.821   3.250    3.690    4.297    4.781
10        0.700  1.372  1.812   2.228   2.764   3.169    3.581    4.144    4.587
11        0.697  1.363  1.796   2.201   2.718   3.106    3.497    4.025    4.437
12        0.695  1.356  1.782   2.179   2.681   3.055    3.428    3.930    4.318
13        0.694  1.350  1.771   2.160   2.650   3.012    3.372    3.852    4.221
14        0.692  1.345  1.761   2.145   2.624   2.977    3.326    3.787    4.140
15        0.691  1.341  1.753   2.131   2.602   2.947    3.286    3.733    4.073
16        0.690  1.337  1.746   2.120   2.583   2.921    3.252    3.686    4.015
17        0.689  1.333  1.740   2.110   2.567   2.898    3.222    3.646    3.965
18        0.688  1.330  1.734   2.101   2.552   2.878    3.197    3.610    3.922
19        0.688  1.328  1.729   2.093   2.539   2.861    3.174    3.579    3.883
20        0.687  1.325  1.725   2.086   2.528   2.845    3.153    3.552    3.850
21        0.686  1.323  1.721   2.080   2.518   2.831    3.135    3.527    3.819
22        0.686  1.321  1.717   2.074   2.508   2.819    3.119    3.505    3.792
23        0.685  1.319  1.714   2.069   2.500   2.807    3.104    3.485    3.768
24        0.685  1.318  1.711   2.064   2.492   2.797    3.091    3.467    3.745
25        0.684  1.316  1.708   2.060   2.485   2.787    3.078    3.450    3.725
26        0.684  1.315  1.706   2.056   2.479   2.779    3.067    3.435    3.707
27        0.684  1.314  1.703   2.052   2.473   2.771    3.057    3.421    3.690
28        0.683  1.313  1.701   2.048   2.467   2.763    3.047    3.408    3.674
29        0.683  1.311  1.699   2.045   2.462   2.756    3.038    3.396    3.659
30        0.683  1.310  1.697   2.042   2.457   2.750    3.030    3.385    3.646
31        0.682  1.309  1.696   2.040   2.453   2.744    3.022    3.375    3.633
32        0.682  1.309  1.694   2.037   2.449   2.738    3.015    3.365    3.622
33        0.682  1.308  1.692   2.035   2.445   2.733    3.008    3.356    3.611
34        0.682  1.307  1.691   2.032   2.441   2.728    3.002    3.348    3.601
35        0.682  1.306  1.690   2.030   2.438   2.724    2.996    3.340    3.591
36        0.681  1.306  1.688   2.028   2.434   2.719    2.990    3.333    3.582
37        0.681  1.305  1.687   2.026   2.431   2.715    2.985    3.326    3.574
38        0.681  1.304  1.686   2.024   2.429   2.712    2.980    3.319    3.566
39        0.681  1.304  1.685   2.023   2.426   2.708    2.976    3.313    3.558
40        0.681  1.303  1.684   2.021   2.423   2.704    2.971    3.307    3.551
41        0.681  1.303  1.683   2.020   2.421   2.701    2.967    3.301    3.544
42        0.680  1.302  1.682   2.018   2.418   2.698    2.963    3.296    3.538
43        0.680  1.302  1.681   2.017   2.416   2.695    2.959    3.291    3.532
44        0.680  1.301  1.680   2.015   2.414   2.692    2.956    3.286    3.526
45        0.680  1.301  1.679   2.014   2.412   2.690    2.952    3.281    3.520
46        0.680  1.300  1.679   2.013   2.410   2.687    2.949    3.277    3.515
47        0.680  1.300  1.678   2.012   2.408   2.685    2.946    3.273    3.510
48        0.680  1.299  1.677   2.011   2.407   2.682    2.943    3.269    3.505
49        0.680  1.299  1.677   2.010   2.405   2.680    2.940    3.265    3.500
50        0.679  1.299  1.676   2.009   2.403   2.678    2.937    3.261    3.496
`
`
`
`
`