JERROLD H. ZAR

Associate Professor
Department of Biological Sciences
Northern Illinois University

PRENTICE-HALL, INC.
Englewood Cliffs, N.J.

1 of 14

Alkermes, Ex. 1030
Library of Congress Cataloging in Publication Data

Zar, Jerrold H.
    Biostatistical analysis.

    (Prentice-Hall biological sciences series)
    Bibliography: p.
    1. Biometry.  I. Title.  [DNLM: 1. Biometry.
2. Statistics.  QH 405 Z36b 1974]
QH323.5.Z37    574'.01'5195    73-3443
ISBN 0-13-076984-3

© 1974 by PRENTICE-HALL, INC., Englewood Cliffs, N.J.

All rights reserved. No part of this book may be reproduced in any form or by any means without permission in writing from the publisher.

10 9 8 7

Printed in the United States of America

PRENTICE-HALL INTERNATIONAL, INC., London
PRENTICE-HALL OF AUSTRALIA, PTY. LTD., Sydney
PRENTICE-HALL OF CANADA, LTD., Toronto
PRENTICE-HALL OF INDIA PRIVATE LIMITED, New Delhi
PRENTICE-HALL OF JAPAN, INC., Tokyo
Measures of Dispersion and Variability

In addition to a measure of central tendency, it is generally desirable to have a measure of dispersion of data. A measure of dispersion, or a measure of variability, as it is sometimes called, is an indication of the clustering of measurements around the center of the distribution, or, conversely, an indication of how variable the measurements are. Measures of dispersion of populations are parameters of the population, and the sample measures of dispersion that estimate them are statistics.
4.1 The Range

The difference between the highest and lowest measurements in a group of data is termed the range. If sample measurements are arranged in increasing order of magnitude, as if the median were about to be determined, then

sample range = Xn − X1.    (4.1)

Sample 1 in Example 4.1 is a hypothetical set of data in which X1 = 1.2 g and X7 = 2.4 g. Thus, the range may be expressed as 1.2 to 2.4 g, or as 2.4 g − 1.2 g = 1.2 g. (We might bear in mind that X1 is really within the limits of 1.15 to 1.25 g and X7 is really 2.35 to 2.45 g, so that the range of the sample would be expressed by a few authors as 2.45 g − 1.15 g = 1.3 g.) Note that the range has the same units as the individual measurements.
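For a modern reader, the computation of Equation (4.1) can be sketched in Python (the function name `sample_range` is ours, not the book's):

```python
def sample_range(data):
    """Range = largest minus smallest measurement (Equation 4.1)."""
    xs = sorted(data)      # arrange in increasing order, as for the median
    return xs[-1] - xs[0]  # Xn - X1

sample_1 = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]  # grams, Sample 1 of Example 4.1
print(round(sample_range(sample_1), 2))  # 1.2, i.e., 2.4 g - 1.2 g = 1.2 g
```

The result agrees with the range computed in Example 4.1.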
The range is a relatively crude measure of dispersion, inasmuch as it does not take into account any measurements except the highest and the lowest. Furthermore, since it is unlikely that a sample will contain both the highest and lowest values in the population, the sample range usually underestimates the population range; therefore,
Example 4.1  Calculation of measures of dispersion for two hypothetical samples.

Sample 1

Xi (g)    Xi − X̄ (g)    |Xi − X̄| (g)    (Xi − X̄)² (g²)
 1.2        −0.6            0.6              0.36
 1.4        −0.4            0.4              0.16
 1.6        −0.2            0.2              0.04
 1.8         0.0            0.0              0.00
 2.0         0.2            0.2              0.04
 2.2         0.4            0.4              0.16
 2.4         0.6            0.6              0.36
ΣXi = 12.6 g   Σ(Xi − X̄) = 0.0 g   Σ|Xi − X̄| = 2.4 g   Σ(Xi − X̄)² = 1.12 g² = "sum of squares"

X̄ = 12.6 g / 7 = 1.8 g
range = X7 − X1 = 2.4 g − 1.2 g = 1.2 g
mean deviation = Σ|Xi − X̄| / n = 2.4 g / 7 = 0.34 g
s² = Σ(Xi − X̄)² / (n − 1) = 1.12 g² / 6 = 0.1867 g²
s = √(0.1867 g²) = 0.43 g

Sample 2

Xi (g)    Xi − X̄ (g)    |Xi − X̄| (g)    (Xi − X̄)² (g²)
 1.2        −0.6            0.6              0.36
 1.6        −0.2            0.2              0.04
 1.7        −0.1            0.1              0.01
 1.8         0.0            0.0              0.00
 1.9         0.1            0.1              0.01
 2.0         0.2            0.2              0.04
 2.4         0.6            0.6              0.36
ΣXi = 12.6 g   Σ(Xi − X̄) = 0.0 g   Σ|Xi − X̄| = 1.8 g   Σ(Xi − X̄)² = 0.82 g² = "sum of squares"

X̄ = 12.6 g / 7 = 1.8 g
range = X7 − X1 = 2.4 g − 1.2 g = 1.2 g
mean deviation = Σ|Xi − X̄| / n = 1.8 g / 7 = 0.26 g
s² = Σ(Xi − X̄)² / (n − 1) = 0.82 g² / 6 = 0.1367 g²
s = √(0.1367 g²) = 0.37 g
it is a biased and inefficient estimator. Nonetheless, it is useful in some circumstances to present the sample range as an estimate (although a poor one) of the population range. Taxonomists are frequently concerned, for example, with having an estimate of what the highest and lowest values in a population are expected to be. Whenever the range is specified in reporting data, however, it is usually a good practice to report another measure of dispersion as well. The range is applicable to ordinal, interval, and ratio scale data.
4.2 The Mean Deviation

As is evident from the two samples in Example 4.1, the range conveys no information about how clustered about the middle of the distribution the measurements are. Since the mean is so useful a measure of central tendency, one might express dispersion in terms of deviations from the mean. The sum of all deviations from the mean, i.e., Σ(Xi − X̄), will always equal zero, however, so such a summation would be useless as a measure of dispersion (see Example 4.1).

To sum the absolute values of the deviations from the mean results in a quantity that is an expression of dispersion about the mean. Dividing this quantity by n yields a measure known as the mean deviation, or mean absolute deviation, of the sample. In Example 4.1, sample 1 is more variable (or more dispersed, or less concentrated) than sample 2. Although the two samples have the same range, the mean deviation, calculated as

sample mean deviation = Σ|Xi − X̄| / n,    (4.2)

expresses the differences in dispersion. Mean deviation can also be defined by using the sum of the absolute deviations from the median rather than from the mean.
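Equation (4.2) can be sketched in Python for the two samples of Example 4.1 (the function name is ours):

```python
def mean_deviation(data):
    """Mean absolute deviation from the mean (Equation 4.2)."""
    n = len(data)
    xbar = sum(data) / n
    return sum(abs(x - xbar) for x in data) / n

sample_1 = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]  # grams
sample_2 = [1.2, 1.6, 1.7, 1.8, 1.9, 2.0, 2.4]  # grams
print(round(mean_deviation(sample_1), 2))  # 0.34, i.e., 2.4 g / 7
print(round(mean_deviation(sample_2), 2))  # 0.26, i.e., 1.8 g / 7
```

Both samples have the same range (1.2 g), but the mean deviation distinguishes their dispersions, as in Example 4.1.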
4.3 The Variance

Another method of eliminating the signs of the deviations from the mean is to square the deviations. The sum of the squares of the deviations from the mean is called the sum of squares, abbreviated SS, and is defined as follows:

sample SS = Σ(Xi − X̄)².    (4.3)

The mean sum of squares is called the variance (or mean square, the latter being short for mean squared deviation), and for a population is denoted by σ² ("sigma squared," using the lowercase Greek letter):

σ² = Σ(Xi − μ)² / N.    (4.5)
The best estimate of the population variance, σ², is the sample variance, s²:

s² = Σ(Xi − X̄)² / (n − 1).    (4.6)

The replacement of μ by X̄ and N by n in Equation (4.5) results in a quantity which is a biased estimate of σ². The dividing of the sample sum of squares by n − 1 (called the degrees of freedom, abbreviated DF) rather than by n yields an unbiased estimate, and it is Equation (4.6) which should be used to calculate the sample variance. If all observations are equal, then there is no variability and s² = 0; and s² becomes increasingly large as the amount of variability, or dispersion, increases. Since s² is a mean sum of squares, it can never be a negative quantity.

The variance expresses the same type of information as does the mean deviation, but it has certain very important properties relative to probability and hypothesis testing that make it distinctly superior. Thus, the mean deviation is very seldom encountered in biostatistical analysis.
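Equation (4.6) can be checked against Example 4.1 with a short Python sketch (function name ours):

```python
def sample_variance(data):
    """s² = sum of squared deviations / (n − 1)  (Equation 4.6)."""
    n = len(data)
    xbar = sum(data) / n
    ss = sum((x - xbar) ** 2 for x in data)  # sum of squares, Equation (4.3)
    return ss / (n - 1)                      # divide by DF, not by n

sample_1 = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]  # grams
sample_2 = [1.2, 1.6, 1.7, 1.8, 1.9, 2.0, 2.4]  # grams
print(round(sample_variance(sample_1), 4))  # 0.1867, i.e., 1.12 g² / 6
print(round(sample_variance(sample_2), 4))  # 0.1367, i.e., 0.82 g² / 6
```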
Example 4.2  "Machine formula" calculation of variance, standard deviation, and coefficient of variation.

Sample 1                        Sample 2

Xi (g)    Xi² (g²)              Xi (g)    Xi² (g²)
 1.2       1.44                  1.2       1.44
 1.4       1.96                  1.6       2.56
 1.6       2.56                  1.7       2.89
 1.8       3.24                  1.8       3.24
 2.0       4.00                  1.9       3.61
 2.2       4.84                  2.0       4.00
 2.4       5.76                  2.4       5.76

ΣXi = 12.6 g   ΣXi² = 23.80 g²   ΣXi = 12.6 g   ΣXi² = 23.50 g²
n = 7                            n = 7
X̄ = 12.6 g / 7 = 1.8 g           X̄ = 12.6 g / 7 = 1.8 g

Sample 1:
SS = ΣXi² − (ΣXi)²/n
   = 23.80 g² − (12.6 g)²/7
   = 23.80 g² − 22.68 g²
   = 1.12 g²
s² = 1.12 g² / 6 = 0.1867 g²
s = √(0.1867 g²) = 0.43 g
V = s/X̄ = 0.43 g / 1.8 g = 0.24 = 24%

Sample 2:
SS = 23.50 g² − (12.6 g)²/7 = 0.82 g²
s² = 0.82 g² / 6 = 0.1367 g²
s = √(0.1367 g²) = 0.37 g
V = s/X̄ = 0.37 g / 1.8 g = 0.21 = 21%
The calculation of s² can be tedious for large samples, but it can be facilitated by the use of the equality

sample SS = ΣXi² − (ΣXi)²/n.    (4.7)

Although this formula might appear more complicated than (4.3), it is in reality simpler to work with. Example 4.2 demonstrates its use to obtain a sample sum of squares. Proof that Equations (4.3) and (4.7) are equivalent is given in Appendix B. Since sample variance equals sample SS divided by DF,

s² = [ΣXi² − (ΣXi)²/n] / (n − 1).    (4.8)

This last formula is often referred to as a "working formula," or "machine formula," because of its computational advantages. There are, in fact, two major advantages in calculating SS by Equation (4.7) rather than by Equation (4.3). First, fewer computational steps are involved, a fact that decreases chance of error. On a good desk calculator, the summed quantities, ΣXi and ΣXi², can both be obtained with only one pass through the data, whereas Equation (4.3) requires one pass through the data to calculate X̄, and at least one more pass to calculate and sum the squares of the deviations, Xi − X̄. Second, there may be a good deal of rounding error in calculating each Xi − X̄, a situation which leads to decreased accuracy in computation, but which is avoided by the use of Equation (4.7).

For data recorded in frequency tables,

sample SS = ΣfiXi² − (ΣfiXi)²/n,    (4.9)

where fi is the frequency of observations with magnitude Xi. But with a desk calculator it is often faster to use Equation (4.7) for each individual observation, disregarding the class groupings.*

The variance has square units. If measurements are in grams, their variance will be in grams squared, or if the measurements are in cubic centimeters, their variance will be in terms of cubic centimeters squared, even though such squared units have no physical interpretation.
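The one-pass character of Equation (4.7) can be sketched in Python, accumulating ΣXi and ΣXi² in a single loop (function name ours):

```python
def machine_ss(data):
    """One-pass sum of squares: SS = ΣXi² − (ΣXi)²/n  (Equation 4.7)."""
    n = sum_x = sum_x2 = 0
    for x in data:       # a single pass accumulates both running sums
        n += 1
        sum_x += x
        sum_x2 += x * x
    return sum_x2 - sum_x ** 2 / n

sample_1 = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]  # grams, Example 4.2
print(round(machine_ss(sample_1), 2))                        # 1.12 g²
print(round(machine_ss(sample_1) / (len(sample_1) - 1), 4))  # s² = 0.1867 g²
```

One caution the desk-calculator era could not anticipate: in fixed-precision floating-point arithmetic, Equation (4.7) subtracts two large, nearly equal quantities and can itself lose accuracy when X̄ is large relative to s, so modern numerical libraries often prefer the two-pass form (4.3) or an incremental update; the book's rounding-error argument applies to hand computation of each Xi − X̄.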
4.4 The Standard Deviation

The standard deviation is the positive square root of the variance; therefore, it has the same units as the original measurements. Thus, for a population,

*When calculating s² from frequency tables of continuous data (e.g., Example 1.5) or grouped discrete data (e.g., Example 1.4b), the result is a slightly biased estimate of σ², the statistic being a little inflated by an amount related to the class interval size. Sheppard's correction (Sheppard, 1898) occasionally is suggested to eliminate this bias; but it is only rarely employed, partly because the amount of bias generally is relatively very small (unless the data are grouped into too few classes), and partly because at times it results in a value for s² which is in fact more biased an estimator than is the uncorrected s² (Croxton, Crowdon, and Klein, 1967: 213, 536).
σ = √[Σ(Xi − μ)² / N],    (4.10)

and for a sample,

s = √s² = √[(ΣXi² − (ΣXi)²/n) / (n − 1)].    (4.11)

Example 4.1 demonstrates the calculation of s. This quantity frequently is abbreviated SD, and on rare occasions is called the root mean square deviation. Remember that the standard deviation is, by definition, always a nonnegative quantity.*

Some modern desk calculators have automatic square root capability. Since many do not, Appendix Tables D.2 and D.3 are supplied, for the obtaining of square roots is a recurring necessity in statistical analysis.
4.5 The Coefficient of Variation

The coefficient of variation, or coefficient of variability, is defined as

V = s/X̄,  or  V = (s/X̄) · 100%.    (4.12)

Since s/X̄ is generally a small quantity, it is frequently multiplied by 100% in order to express V as a percentage.

As a measure of variability, the variance and standard deviation have magnitudes which are dependent on the magnitude of the data. Elephants have ears that are perhaps 100 times larger than those of mice. If elephant ears were no more variable, relative to their size, than mouse ears, relative to their size, the standard deviation of elephant ear lengths would be 100 times as great as the standard deviation of mouse ear lengths (and the variance of the former would be 100² = 10,000 times the variance of the latter). The coefficient of variation expresses sample variability relative to the mean of the sample (and is on rare occasion referred to as the "relative standard deviation"). It is called a measure of relative variability or relative dispersion.

Since s and X̄ have identical units, V has no units at all, a fact which emphasizes that it is a relative measure, divorced from the actual magnitude or units of measurement of the data. Thus, had the data in Example 4.2 been measured in pounds, kilograms, or tons, instead of grams, the calculated V would have been the same. The coefficient of variation may be calculated only for ratio scale data; it is, for example, not valid to calculate coefficients of variation of temperature data measured on the Celsius or Fahrenheit temperature scales. Simpson, Roe, and Lewontin (1960: 89-95) present a good discussion of V and its biological application, especially with regard to zoomorphological measurements.
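Equations (4.11) and (4.12) can be sketched together in Python for sample 2 of Example 4.2; the unit-free character of V is easy to check by rescaling the data (function names ours):

```python
from math import sqrt

def std_dev(data):
    """s = positive square root of the sample variance (Equation 4.11)."""
    n = len(data)
    xbar = sum(data) / n
    return sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))

def coeff_variation(data):
    """V = s / X̄, as a proportion (Equation 4.12); ratio scale data only."""
    return std_dev(data) / (sum(data) / len(data))

sample_2 = [1.2, 1.6, 1.7, 1.8, 1.9, 2.0, 2.4]       # grams
print(round(std_dev(sample_2), 2))                    # 0.37 g
print(round(coeff_variation(sample_2), 2))            # 0.21, i.e., 21%

# Re-expressing the data in milligrams changes s but leaves V unchanged:
in_mg = [x * 1000 for x in sample_2]
print(round(coeff_variation(in_mg), 2))               # still 0.21
```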
*The sample s is actually a slightly biased estimate of the population σ, in that on the average it estimates a trifle low, especially in small samples. But this fact generally is considered to be offset by the statistic's usefulness. Correction for this bias is sometimes possible (e.g., Bliss, 1967: 131; Dixon and Massey, 1969: 136; Gurland and Tripathi, 1971; Tolman, 1971), but it is rarely employed.
7.3 The Distribution of Means

If random samples of size n are drawn from a normal population, the means of these samples will form a normal distribution. The distribution of means from a nonnormal population will not be normal but will tend toward normality as n increases in size. Furthermore, the variance of the distribution of means will decrease as n increases; in fact, the variance of the population of all possible means of samples of size n from a population with variance σ² is

σ²X̄ = σ²/n.    (7.12)

The quantity σ²X̄ is called the variance of the mean, and the preceding comments on the distribution of means come from a very important mathematical theorem, known as the central limit theorem. A distribution of sample statistics is called a sampling distribution; therefore, we are discussing the sampling distribution of means.

Since σ²X̄ has square units, its square root, σX̄, will have the same units as the original measurements (and, therefore, the same units as the mean, μ, and the standard deviation, σ). This value, σX̄, is the standard deviation of the mean. The standard deviation of a parameter or of a statistic is referred to as a standard error; thus, σX̄ is frequently called the standard error of the mean, or simply the standard error (sometimes abbreviated SE):

σX̄ = √(σ²/n)  or  σX̄ = σ/√n.    (7.13)

Just as Z = (Xi − μ)/σ is a normal deviate that refers to the normal distribution of Xi values,

Z = (X̄ − μ)/σX̄    (7.14)

is a normal deviate referring to the normal distribution of means (X̄ values). Thus, we can ask questions such as: What is the probability of obtaining a random sample of nine measurements with a mean larger than 50.0 mm from a population having a mean of 47.0 mm and a standard deviation of 12.0 mm? This and other examples of the use of normal deviates for the sampling distribution of means are presented in Example 7.2.
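The question just posed can be sketched in Python, using Equations (7.13) and (7.14) and the standard normal distribution via the error function (Example 7.2 itself is not reproduced in this excerpt; the function names are ours):

```python
from math import erf, sqrt

def z_for_mean(xbar, mu, sigma, n):
    """Normal deviate for a sample mean (Equation 7.14): Z = (X̄ − μ)/σX̄."""
    sigma_xbar = sigma / sqrt(n)  # standard error of the mean, Equation (7.13)
    return (xbar - mu) / sigma_xbar

def upper_tail_prob(z):
    """P(Z ≥ z) for the standard normal, via the error function."""
    return 1 - (1 + erf(z / sqrt(2))) / 2

z = z_for_mean(50.0, 47.0, 12.0, 9)  # σX̄ = 12.0/√9 = 4.0 mm
print(round(z, 2))                    # 0.75
print(round(upper_tail_prob(z), 4))   # ≈ 0.2266
```

So a sample mean larger than 50.0 mm would be expected in roughly 23% of such samples.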
As seen from Equation (7.13), to determine σX̄ one must know σ² (or σ), which is a population parameter. Since we very seldom can calculate population parameters, we must rely on estimating them from random samples taken from the population. The best estimate of σ²X̄, the population variance of the mean, is

s²X̄ = s²/n,    (7.15)

the sample variance of the mean. Thus,

sX̄ = √(s²/n)  or  sX̄ = s/√n    (7.16)

is an estimate of σX̄ and is the sample standard error of the mean. Example 7.3 demonstrates its calculation.
Unfortunately, we are faced with the requirement of the variance ratio test that the two underlying distributions be normal (or nearly normal). Thus, this test must be applied with caution, for if the two sets of sample data are, in fact, from normal populations, the logarithms of the data will not be normally distributed. The requirement here is that the logarithms be normally distributed.
9.3 Testing for Difference Between Two Means

Example 9.1 presented as data the number of moths captured in each of two types of traps. The two-tailed hypotheses, H0: μ1 − μ2 = 0 and HA: μ1 − μ2 ≠ 0, can be proposed to test whether the two traps possess the same efficiency in catching moths (i.e., whether they catch the same numbers of moths). These hypotheses are commonly expressed in their equivalent forms: H0: μ1 = μ2 and HA: μ1 ≠ μ2.

If the two samples came from normal populations, and if the two populations have equal variances, then a t value may be calculated in a manner analogous to the t test introduced in Section 8.1. The t value for testing the preceding hypotheses concerning the difference between two means is

t = (X̄1 − X̄2) / s(X̄1−X̄2).    (9.4)

The quantity X̄1 − X̄2 is simply the difference between the two means, and s(X̄1−X̄2) is the standard error of the difference between the means.

The quantity s(X̄1−X̄2), along with s²(X̄1−X̄2), the variance of the difference between the means, is new to us, and we need to consider it further. Both s²(X̄1−X̄2) and s(X̄1−X̄2) are statistics that can be calculated from the sample data and are estimates of the population parameters, σ²(X̄1−X̄2) and σ(X̄1−X̄2), respectively. We can show mathematically that the variance of the difference between two variables is equal to the sum of the variances of the two variables, so that σ²(X̄1−X̄2) = σ²X̄1 + σ²X̄2. Since σ²X̄ = σ²/n, we can write

σ²(X̄1−X̄2) = σ₁²/n1 + σ₂²/n2.

Recall that in the two-sample t test, we assume that σ₁² = σ₂²; therefore, we can write

σ²(X̄1−X̄2) = σ²/n1 + σ²/n2.    (9.5)

Thus, to calculate the estimate of σ²(X̄1−X̄2), we must have an estimate of σ². Since both s₁² and s₂² are assumed to estimate σ², we compute the pooled variance, s²p, which is then used as the best estimate of σ²:

s²p = (SS1 + SS2) / (ν1 + ν2),    (9.2)

and

s²(X̄1−X̄2) = s²p/n1 + s²p/n2.    (9.6)
Two-Sample Hypotheses    Ch. 9

Thus,

s(X̄1−X̄2) = √(s²p/n1 + s²p/n2),    (9.7)

and

t = (X̄1 − X̄2) / √(s²p/n1 + s²p/n2).    (9.8)
Example 9.4 summarizes the procedure for testing the hypotheses under consideration. The critical value to be obtained from Table D.10 is t α(2),(ν1+ν2), the two-tailed t value for the α significance level, with ν1 + ν2 degrees of freedom.

Example 9.4  The two-sample t test for the two-tailed hypotheses, H0: μ1 = μ2 and HA: μ1 ≠ μ2 (which could also be stated as H0: μ1 − μ2 = 0 and HA: μ1 − μ2 ≠ 0). The calculations utilize the data of Example 9.1.

n1 = 11                    n2 = 8
ν1 = 10                    ν2 = 7
X̄1 = 34.5 moths            X̄2 = 57.2 moths
SS1 = 218.73 moths²        SS2 = 107.50 moths²

s²p = (SS1 + SS2)/(ν1 + ν2) = (218.73 + 107.50)/(10 + 7) = 326.23/17 = 19.19 moths²

s(X̄1−X̄2) = √(19.19/11 + 19.19/8) = √4.14 = 2.0 moths

t = (X̄1 − X̄2)/s(X̄1−X̄2) = (34.5 − 57.2)/2.0 = −22.7/2.0 = −11.35

t 0.05(2),(ν1+ν2) = t 0.05(2),17 = 2.110

Therefore, reject H0.

P(|t| ≥ 11.35) << 0.001
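Example 9.4 can be replayed in Python from the summary statistics (function name ours). Note that the example rounds the standard error to 2.0 moths before dividing; carrying full precision gives a slightly different t, with the same emphatic conclusion:

```python
from math import sqrt

def pooled_t(n1, xbar1, ss1, n2, xbar2, ss2):
    """Two-sample t with pooled variance (Equations 9.2, 9.7, 9.8)."""
    sp2 = (ss1 + ss2) / ((n1 - 1) + (n2 - 1))  # pooled variance, Equation (9.2)
    se = sqrt(sp2 / n1 + sp2 / n2)             # SE of the difference, Equation (9.7)
    return (xbar1 - xbar2) / se

# Summary statistics of Example 9.4 (moth catches in two trap types):
t = pooled_t(11, 34.5, 218.73, 8, 57.2, 107.50)
print(round(t, 2))  # ≈ -11.15 at full precision; the text's -11.35 uses SE rounded to 2.0
# Either way, |t| far exceeds t 0.05(2),17 = 2.110, so H0 is rejected.
```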
One-tailed hypotheses can be tested in situations where the investigator is interested in detecting a difference in only one direction. For example, an entomologist might own a number of moth traps of the second type mentioned in Example 9.4, and he might wish to determine whether he should change to traps of type 1. Now, if type 1 traps are no more efficient at moth catching than are type 2 traps, there will be no reason to abandon the use of the latter. That is, if μ1 ≤ μ2, the entomologist would choose to retain his present supply of type 2 traps; but if μ1 > μ2, he would be justified in discarding them in favor of type 1 traps. The t statistic is calculated by Equation (9.8), just as for the two-tailed test. But this calculated t is then compared with the critical value, t α(1),(ν1+ν2), rather than with t α(2),(ν1+ν2). In other cases, the one-tailed hypotheses, H0: μ1 ≥ μ2 and HA: μ1 < μ2, may be appropriate.

Note that H0: μ1 = μ2 can be written H0: μ1 − μ2 = 0, H0: μ1 ≤ μ2 can be written H0: μ1 − μ2 ≤ 0, and H0: μ1 ≥ μ2 can be written H0: μ1 − μ2 ≥ 0. Such hypotheses may also be stated for a difference of some constant, c, other than zero; the generalized t statistic is

t = (|X̄1 − X̄2| − c) / s(X̄1−X̄2).    (9.9)
Thus, for example, the entomologist might have considered that because of the expense of purchasing an entire new set of moth traps, he would do so only if he had reason to conclude that the new traps could catch more than 10 more moths per night than the present traps. Here, H0: μ1 − μ2 ≤ 10 moths, HA: μ1 − μ2 > 10 moths, and t = (|X̄1 − X̄2| − 10 moths)/s(X̄1−X̄2) = (22.7 − 10)/2.0 = 6.35. The critical value is t 0.05(1),17 = 1.740, so we would reject H0. This test, then, allows us to conclude, with 95% confidence, that type 1 traps have a trapping efficiency at least 10 moths per night greater than do type 2 traps. Thus, the one-tailed two-sample test can examine a hypothesis that one population mean is a specified amount larger (or smaller) than a second. By the procedure of Section 9.6 one can even test whether the measurements in one population are a specified amount as large (or as small) as those in a second population.
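The shifted one-tailed test of Equation (9.9) can be sketched in Python with the figures just given (function name ours; SE is taken as the rounded 2.0 moths used in the text):

```python
def shifted_one_tailed_t(xbar1, xbar2, se, c):
    """t for H0: μ1 − μ2 ≤ c vs. HA: μ1 − μ2 > c  (Equation 9.9)."""
    return (abs(xbar1 - xbar2) - c) / se

# From the entomologist's problem: |X̄1 − X̄2| = 22.7 moths, SE = 2.0 moths, c = 10 moths
t = shifted_one_tailed_t(34.5, 57.2, 2.0, 10.0)
print(round(t, 2))  # 6.35, which exceeds t 0.05(1),17 = 1.740, so reject H0
```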
Violations of the Two-Sample t Test Assumptions. The two-sample t test assumes, by dint of its underlying theory, that both samples came at random from normal populations with equal variances. The biological researcher cannot, however, always be assured that these assumptions are correct. Fortunately, numerous studies have shown that the t test is robust enough to stand considerable departures from its theoretical assumptions, especially if the sample sizes are equal or nearly equal, and especially when two-tailed hypotheses are considered (e.g., Boneau, 1960; Box, 1953; Cochran, 1947). If the underlying populations are markedly skewed, then one should be wary of one-tailed testing, and if there is considerable nonnormality in the populations, then very small significance levels (say, α < 0.01) should not be depended upon. Equal variances appear to be generally the more important of the two assumptions, and thus some authors have recommended the testing of the hypothesis H0: σ₁² = σ₂² prior to commencing a two-sample t test. However, the procedure for testing this hypothesis (Section 9.1) is adversely affected by deviations from its underlying normality assumption, whereas the t test is robust with regard to its underlying assumptions, so to "make the preliminary test on variances is rather like putting out to sea in a rowing boat to find out whether conditions are sufficiently calm for an ocean liner to leave port!" (Box, 1953).

In conclusion, two-sample t testing may be employed except in cases where it is felt there are severe deviations from the normality and equality of variance assumptions. In such cases, the nonparametric test of Section 9.6 would better be employed. Alternatively, the Behrens-Fisher procedure (Fisher and Yates, 1963: 3-4, 60-61) or appropriate modifications of the t test (e.g., Cochran, 1964; Cochran and Cox, 1957: 100-102; Dixon and Massey, 1969: 119; Satterthwaite, 1946) might be used.*
9.4 Confidence Limits for Means

In Section 8.3, we defined the confidence interval for a population mean as X̄ ± t α(2),ν sX̄, where sX̄ is the best estimate of σX̄ and is calculated as √(s²/n). For the two-sample situation, where we assume that σ₁² = σ₂², the confidence interval for either

*In Fisher and Yates, s refers to the standard error, not the standard deviation.
arranged in increasing order of magnitude. Pairwise differences between rank sums are then tabulated, starting with the difference between the largest and smallest rank sums, and proceeding in the same sequence as described in Section 12.1. The standard error is calculated as

SE = √[n(nk)(nk + 1)/12]    (12.5)

(Nemenyi, 1963; Wilcoxon and Wilcox, 1964: 10), and the tabled Studentized range to be used is q α,∞,k. Note that this multiple range test requires that there be equal numbers of data in each of the k groups.
12.4 Comparison of a Control Mean to Each Other Group Mean

Sometimes the objective of multisample experiments with k samples, or groups, is to determine whether the mean of one group, designated as a "control," differs significantly from each of the means of the k − 1 other groups. Dunnett (1955) has provided a procedure for such testing, which differs from the multiple comparison approach in Section 12.1 in that the investigator is here not interested in all possible comparisons of pairs of group means, but only in those k − 1 comparisons involving the "control" group. Knowing k, the total number of groups in the experiment, and ν, the error degrees of freedom from the analysis of variance for H0: μ1 = μ2 = ··· = μk, one obtains critical values from either Table D.13 or Table D.14, depending on whether the hypotheses are to be one-tailed or two-tailed, respectively. We shall refer to these tabled values as q′ α,ν,k, for they are used in a manner similar to that of the q α,ν,k values employed in the Newman-Keuls test. As in the SNK procedure, the error rate, α, denotes the probability of committing a Type I error somewhere among all of the pairwise comparisons made. The standard error for Dunnett's test is

SE = √(2s²/n)    (12.6)

where group sizes are equal, or

SE = √(s²/n_control + s²/n_A)    (12.7)

when group sizes are not equal (Steel and Torrie, 1960: 114). The testing procedure utilizing Equations (12.2) and (12.6) is demonstrated in Example 12.4. For a two-tailed test, if |q′| ≥ q′ α(2),ν,k, then H0: μ_control = μ_A is rejected. In a one-tailed test, H0: μ_control ≤ μ_A would be rejected if |q′| ≥ q′ α(1),ν,k and X̄_control > X̄_A, and H0: μ_control ≥ μ_A would be rejected if |q′| ≥ q′ α(1),ν,k and X̄_control < X̄_A.

The null hypothesis H0: μ_control = μ_A is, of course, a special case of H0: μ_control − μ_A = c, where c = 0. Other values of c may appear in the hypothesis, however, and such hypotheses may be tested by placing |X̄_control − X̄_A| − c in the numerator of the q′ calculation. In a similar manner, H0: μ_control − μ_A ≤ c or H0: μ_control − μ_A ≥ c may be tested.
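The q′ computation for Dunnett's test can be sketched in Python. This is a hypothetical illustration: the function name is ours, `s2` stands for the error mean square from the ANOVA, and the numbers are invented (Example 12.4 is not reproduced in this excerpt):

```python
from math import sqrt

def dunnett_q(xbar_control, xbar_a, s2, n_control, n_a, c=0.0):
    """q' for comparing a control mean with another group mean (Dunnett's test).

    s2 is the error mean square from the ANOVA; the SE follows Equation (12.6)
    when group sizes are equal and Equation (12.7) otherwise.
    """
    if n_control == n_a:
        se = sqrt(2 * s2 / n_control)          # Equation (12.6)
    else:
        se = sqrt(s2 / n_control + s2 / n_a)   # Equation (12.7)
    return (abs(xbar_control - xbar_a) - c) / se

# Invented numbers, for illustration only:
q = dunnett_q(xbar_control=10.0, xbar_a=12.0, s2=4.0, n_control=8, n_a=8)
print(round(q, 2))  # 2.0; compare against the tabled q' α,ν,k from Table D.13 or D.14
```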
Mathematical and Statistical Tables    473

TABLE D.10  Critical Values of the t Distribution

[Tabled values of t α(2),ν for two-tailed α = 0.50, 0.20, 0.10, 0.05, 0.02, 0.01, 0.005, 0.002, and 0.001 (one-tailed α = 0.25, 0.10, 0.05, 0.025, 0.01, 0.005, 0.0025, 0.001, and 0.0005), by degrees of freedom ν; e.g., t 0.05(2),17 = 2.110 and t 0.05(1),17 = 1.740, as used in Examples 9.4 and the one-tailed test of Section 9.3. The scanned table values are not reproduced here.]