Statistics for Experimenters
An Introduction to Design,
Data Analysis,
and Model Building

GEORGE E. P. BOX
WILLIAM G. HUNTER
J. STUART HUNTER

John Wiley & Sons
New York • Chichester • Brisbane • Toronto
`
SNF Holding Company et al v BASF Corporation,
`
`Page 1 of 7
`IPR2015-00600
`
`
`
Copyright © 1978 by John Wiley & Sons, Inc.

All rights reserved. Published simultaneously in Canada.

Reproduction or translation of any part of this work beyond that permitted by Sections 107 or 108 of the 1976 United States Copyright Act without the permission of the copyright owner is unlawful. Requests for permission or further information should be addressed to the Permissions Department, John Wiley & Sons, Inc.
`
Library of Congress Cataloging in Publication Data

Box, George E. P.
Statistics for experimenters.

(Wiley series in probability and mathematical statistics)
Includes index.
1. Experimental design. 2. Analysis of variance. I. Hunter, William Gordon, 1937- joint author. II. Hunter, J. Stuart, 1923- joint author. III. Title.
QA279.B68    001.4'24    77-15087
ISBN 0-471-09315-7
`
Printed in the United States of America
`
`To
`
`
`
`
CHAPTER 9

Empirical Modeling

Models studied in preceding chapters contained mostly qualitative (or categorical) variables. The models we now consider also contain quantitative variables, such as temperature. Such models can take advantage of the inherent continuity of quantitative variables.
`
`9.1. MATHEMATICAL MODELS
`
Data have no meaning in themselves; they are meaningful only in relation to a conceptual model of the phenomenon studied. Suppose, for instance, that a clock was observed at midnight on Sunday and every 12 hours thereafter. Suppose that the hands were always found to point to 6 o'clock. A set of such observations is shown in Figure 9.1a. The interpretation of these data would be different, depending on what model was thought appropriate. One idea that would fit the facts is that the clock had stopped at 6 o'clock. The appropriate mathematical model is then

$$\eta = \beta_0 \tag{9.1}$$

where $\eta$ is the observed value of the reading of the hour hand, and $\beta_0$ is a constant equal to 6, as illustrated in Figure 9.1b. A second idea, illustrated in Figure 9.1c, is that the clock is correctly making one revolution every 12 hours but is 6 hours fast. The appropriate model is

$$\eta = (\beta_0 + x)_{\text{mod } 12} \tag{9.2}$$

where $x$ is the elapsed time in hours from the first reading, and $(\beta_0 + x)_{\text{mod } 12}$ means the remainder obtained after dividing $\beta_0 + x$ by 12. A third idea is that the clock is making, not one, but $p$ revolutions every 12 hours, where $p$ is any integer, in which case

$$\eta = (\beta_0 + px)_{\text{mod } 12} \tag{9.3}$$
`
`
`
FIGURE 9.1. Explanations of clock readings based on different models. [Figure not reproduced. Six panels plot the clock reading against time (hr) from 0 to 36: (a) data; (b) clock stopped; (c) 6 hours fast; (d) two revolutions every 12 hours; (e) hands running backward; (f) hands speeding up and slowing down within the cycle.]
`
as illustrated in Figure 9.1d for p = 2. In all the above, it is assumed that the hands move forward at a regular rate. Of course, the observations would be equally consistent with a model requiring the hands of the clock to run backward, as in Figure 9.1e, or to speed up at one part of the cycle and to slow down at another part, as in Figure 9.1f. The possibilities are clearly extensive. In practice, however, some basic knowledge of the phenomenon under study (the clock mechanism) is usually available. This prior knowledge enables an experimenter to classify certain
models as plausible and others as implausible. Based on the experimenter's tentative hypothesis regarding which model or models are plausible, an experimental design is chosen. Even when the investigator thinks he knows what the model should be, he must keep in mind reasonable alternatives. Accordingly the experimental design should be constructed so that it can detect inadequacies of the initial model. Model building, an important part of scientific work, is typically accomplished by the iterative process described in Chapter 1; alternative models are exposed to hazard, and survivors and new candidates are scrutinized further.
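
The clock example can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are illustrative, and treating the backward-running clock as the case p = -1 of Equation 9.3 is an assumption made here for convenience. It shows that every candidate model reproduces a reading of 6 at the observation times 0, 12, 24, and 36 hours, while the models disagree at an intermediate time.

```python
def stopped(x, beta0=6):
    """Equation 9.1: the hour hand stays at beta0 whatever the time x."""
    return beta0


def running(x, beta0=6, p=1):
    """Equations 9.2 (p = 1) and 9.3 (integer p): (beta0 + p*x) mod 12."""
    return (beta0 + p * x) % 12


observation_times = [0, 12, 24, 36]  # hours elapsed since the first reading

predictions = {
    "stopped (9.1)": [stopped(x) for x in observation_times],
    "6 hours fast (9.2)": [running(x) for x in observation_times],
    "two revolutions per 12 hours (9.3, p = 2)": [running(x, p=2) for x in observation_times],
    # Assumption: a clock running backward is represented here by p = -1.
    "running backward (p = -1)": [running(x, p=-1) for x in observation_times],
}

for name, readings in predictions.items():
    print(f"{name}: {readings}")

# A reading taken 3 hours after the first one would separate the models.
x_new = 3
discriminating = {
    "stopped": stopped(x_new),
    "6 hours fast": running(x_new),
    "p = 2": running(x_new, p=2),
    "backward": running(x_new, p=-1),
}
print(discriminating)
```

Because the readings coincide whenever x is a multiple of 12, observations every 12 hours cannot distinguish these models; a design that includes other times can, which is the sense in which a design should be able to detect inadequacies of the initial model.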
In general, experimenters are often interested in studying some relationship

$$\eta = f(x_1, x_2, \ldots, x_k) \tag{9.4}$$

between the mean value $\eta$ of a response, such as yield, quality, or efficiency, and the levels (or versions) of a number of variables $x_1, x_2, \ldots, x_k$, such as time, concentration, pressure, and catalyst type. For conciseness the relationship may be written as

$$\eta = f(\mathbf{x}) \tag{9.5}$$

where $\mathbf{x}$ refers jointly to the $k$ variables $x_1, x_2, \ldots, x_k$.
`
`Theoretical Models
`
Sometimes the phenomenon under study is well understood, and it is possible, from theoretical considerations, to write down a plausible functional form. The required physical laws are often expressed most directly by differential equations. For example, suppose that in a chemical reaction substance A is the reactant, substance B is the product, and first-order kinetics apply. Then the rate of formation of B at any instant of time is proportional to the concentration of unreacted substance A. If the mean value of the concentration of B at time $x$ is indicated by $\eta$, the relationship between $\eta$ and $x$ can be shown* to be

$$\eta = \beta_1 \left(1 - e^{-\beta_2 x}\right) \tag{9.6}$$

* Equation 9.6 results from solving a differential equation, which expresses mathematically the statement "The rate of formation of B is proportional to the concentration of unreacted substance A," as

$$\frac{d\eta}{dx} = \beta_2 (\beta_1 - \eta)$$

It is assumed that 1 mole of B is formed from 1 mole of A and that at $x = 0$ the concentration of B is zero ($\eta = 0$).
`
where $\beta_1$ is the ultimate concentration of B, and $\beta_2$ is the rate constant of the reaction. This equation is called a theoretical or mechanistic model because it is based directly on an appreciation of physical or mechanistic theory governing the system, in this particular case chemical kinetic theory. Mechanistic modeling is discussed further in Chapter 16.

Empirical Models

Frequently the mechanism underlying a process is not understood sufficiently well, or is too complicated, to allow an exact model to be postulated from theory. In such circumstances an empirical model may be useful, particularly if it is desired to approximate the response only over limited ranges of the variables. For example, suppose that a chemical reaction is being studied in two different chemical reactors. In both reactors measurements of yield are made over the range of reaction temperatures from 170 to 190°C. Over this range the experimenter guesses that the relationships between the mean value $\eta$ of the yield and temperature $x$ can be approximated by straight lines. Thus the empirical model contemplated is:

$$\begin{aligned}
\eta &= \alpha_1 + \beta_1 x \quad \text{for reactor 1} \\
\eta &= \alpha_2 + \beta_2 x \quad \text{for reactor 2}
\end{aligned} \tag{9.7}$$

where the $\alpha$'s and $\beta$'s are parameters measuring the intercepts and slopes of two straight lines. Equation 9.7 makes no claim to do more than locally approximate the true functions over a limited region. These two straight lines may be perfectly adequate for this purpose, even though the experimenter is quite certain that the true relationships cannot possibly be straight over wider ranges (see Figure 9.2).

The degree of complexity that should be incorporated in an empirical model can seldom be guessed with certainty a priori. One approach is to allow for a fairly general model that can be simplified in a variety of ways once the experimental results indicate what simplifications are reasonable. For example, the experimenter might hope that the straight line functions of Equation 9.7 would be adequate but, at the same time, make a provision for fitting quadratic relationships should they prove necessary:

$$\begin{aligned}
\eta &= \alpha_1 + \beta_1 x + \gamma_1 x^2 \quad \text{for reactor 1} \\
\eta &= \alpha_2 + \beta_2 x + \gamma_2 x^2 \quad \text{for reactor 2}
\end{aligned} \tag{9.8}$$

where the $\gamma$'s are parameters that measure curvature.

In practice, it would not be surprising if, over the temperature range studied, one or more of the following simplifications occurred:
`
1. Straight line relationships of the form of Equation 9.7 provided an adequate approximation (i.e., the coefficients $\gamma_1$ and $\gamma_2$ were zero).
2. To an adequate approximation the difference between the yields of the reactors was constant. In this case the shape of the yield-temperature curves would be the same ($\beta_1 = \beta_2$ and $\gamma_1 = \gamma_2$), but they would be displaced from one another on the yield axis ($\alpha_1 \neq \alpha_2$). This is the situation illustrated in Figure 9.2.
3. The two reactors gave identical results ($\alpha_1 = \alpha_2$, $\beta_1 = \beta_2$, and $\gamma_1 = \gamma_2$).

FIGURE 9.2. Straight lines approximating relationships between yield and temperature in two reactors. [Figure not reproduced. Yield $\eta$ is plotted against temperature $x$ (°C) for reactor 1 and reactor 2, showing the true functional relationships and the approximating straight lines over the region of interest, 170 to 190°C.]
`
A good design for this particular experimental situation will allow contemplation of these various possibilities. The most economical arrangement will be to use three levels of temperature for each of the two reactors in accordance with the scheme shown in Table 9.1. This arrangement is a 2 × 3 factorial design. The six sets of conditions it contains are just sufficient to determine the six constants in Equation 9.8. In addition the design allows each of the possible simplifications to be considered, whether by visual inspection of the results or by numerical computation. In practice, of course, each of the six conditions might be replicated several times.
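
The statement that six conditions are just sufficient to determine six constants can be illustrated with a short sketch. Everything numerical here is an assumption made for illustration: the temperature levels 170, 180, and 190°C (Table 9.1 is not reproduced in this excerpt), the "true" constants, and the noise-free yields generated from them. Exact rational arithmetic is used so that the recovered constants can be compared with the assumed ones without rounding error.

```python
from fractions import Fraction


def quadratic(x, alpha, beta, gamma):
    """Mean yield under the quadratic model of Equation 9.8."""
    return alpha + beta * x + gamma * x ** 2


def fit_quadratic(xs, ys):
    """Constants (alpha, beta, gamma) of the quadratic through three points.

    Uses Newton divided differences; the three x values must be distinct.
    """
    x0, x1, x2 = xs
    y0, y1, y2 = ys
    d01 = (y1 - y0) / (x1 - x0)
    d12 = (y2 - y1) / (x2 - x1)
    gamma = (d12 - d01) / (x2 - x0)
    beta = d01 - gamma * (x0 + x1)
    alpha = y0 - beta * x0 - gamma * x0 ** 2
    return alpha, beta, gamma


# Assumed temperature levels (degrees C) for the 2 x 3 factorial design.
temperatures = [170, 180, 190]

# Hypothetical constants (alpha, beta, gamma), chosen so that the two
# reactors differ only by a constant displacement (simplification 2).
assumed = {
    "reactor 1": (Fraction(-500), Fraction("6.5"), Fraction("-0.018")),
    "reactor 2": (Fraction(-503), Fraction("6.5"), Fraction("-0.018")),
}

fitted = {}
for reactor, constants in assumed.items():
    # Noise-free synthetic yields at the three levels for this reactor.
    yields = [quadratic(t, *constants) for t in temperatures]
    fitted[reactor] = fit_quadratic(temperatures, yields)
    print(reactor, [float(y) for y in yields], [float(c) for c in fitted[reactor]])

a1, b1, g1 = fitted["reactor 1"]
a2, b2, g2 = fitted["reactor 2"]

straight_lines_adequate = g1 == 0 and g2 == 0             # simplification 1
constant_difference = b1 == b2 and g1 == g2 and a1 != a2  # simplification 2
identical = (a1, b1, g1) == (a2, b2, g2)                  # simplification 3
print(straight_lines_adequate, constant_difference, identical)
```

With real, noisy data the exact comparisons above would not be appropriate; the fitted constants would have to be judged against experimental error, which is one reason the six conditions might be replicated.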
`
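
Returning to the mechanistic model of Equation 9.6, the link between the differential equation and its solution can also be checked numerically. The sketch below is illustrative only: the values chosen for the ultimate concentration and the rate constant are hypothetical, and simple Euler stepping is used as a convenient way to integrate dη/dx = β₂(β₁ − η) from η = 0 at x = 0, not as a method taken from the text.

```python
import math


def closed_form(x, beta1, beta2):
    """Equation 9.6: eta = beta1 * (1 - exp(-beta2 * x))."""
    return beta1 * (1.0 - math.exp(-beta2 * x))


def euler_solution(x_end, beta1, beta2, steps=5000):
    """Integrate d(eta)/dx = beta2 * (beta1 - eta) from eta(0) = 0 by Euler steps."""
    h = x_end / steps
    eta = 0.0
    for _ in range(steps):
        eta += h * beta2 * (beta1 - eta)
    return eta


# Hypothetical parameter values, chosen only for illustration.
beta1 = 1.0  # ultimate concentration of B
beta2 = 0.5  # rate constant of the reaction

for x in (1.0, 2.5, 5.0):
    numeric = euler_solution(x, beta1, beta2)
    exact = closed_form(x, beta1, beta2)
    print(f"x = {x}: Euler {numeric:.5f}, Equation 9.6 {exact:.5f}")
```

The two columns agree closely, and the closed form tends to β₁ for large x, consistent with β₁ being the ultimate concentration of B.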