`estimation,
`and control
`VOLUME 1
`
`PETER S. MAYBECK
`
`DEPARTMENT OF ELECTRICAL ENGINEERING
AIR FORCE INSTITUTE OF TECHNOLOGY
`WRIGHT-PATTERSON AIR FORCE BASE
`OHIO
`
`1979
`ACADEMIC PRESS New York San Francisco London
`A Subsidiary of Harcourt Brace Jovanovich, Publishers
`
`CHAPTER 1
`Introduction
`
1.1 WHY STOCHASTIC MODELS, ESTIMATION,
AND CONTROL?
`
When considering system analysis or controller design, the engineer has at
his disposal a wealth of knowledge derived from deterministic system and
control theories. One would then naturally ask, why do we have to go beyond
these results and propose stochastic system models, with ensuing concepts of
estimation and control based upon these stochastic models? To answer this
question, let us examine what the deterministic theories provide and determine
where the shortcomings might be.
Given a physical system, whether it be an aircraft, a chemical process, or
the national economy, an engineer first attempts to develop a mathematical
model that adequately represents some aspects of the behavior of that system.
Through physical insights, fundamental "laws," and empirical testing, he tries
to establish the interrelationships among certain variables of interest, inputs
to the system, and outputs from the system.
`With such a mathematical model and the tools provided by system and
`control theories, he is able to investigate the system structure and modes of
`response. If desired, he can design compensators that alter these characteristics
`and controllers that provide appropriate inputs to generate desired system
`responses.
In order to observe the actual system behavior, measurement devices are
constructed to output data signals proportional to certain variables of interest.
These output signals and the known inputs to the system are the only
information that is directly discernible about the system behavior. Moreover, if a
feedback controller is being designed, the measurement device outputs are the
only signals directly available for inputs to the controller.
`There are three basic reasons why deterministic system and control theories
`do not provide a totally sufficient means of performing this analysis and
design. First of all, no mathematical system model is perfect. Any such model
depicts only those characteristics of direct interest to the engineer's purpose.
For instance, although an endless number of bending modes would be
required to depict vehicle bending precisely, only a finite number of modes would
be included in a useful model. The objective of the model is to represent the
dominant or critical modes of system response, so many effects are knowingly
left unmodeled. In fact, models used for generating online data processors or
controllers must be pared to only the basic essentials in order to generate a
computationally feasible algorithm.
Even effects which are modeled are necessarily approximated by a
mathematical model. The "laws" of Newtonian physics are adequate approximations
to what is actually observed, partially due to our being unaccustomed to
speeds near that of light. It is often the case that such "laws" provide adequate
system structures, but various parameters within that structure are not
determined absolutely. Thus, there are many sources of uncertainty in any
mathematical model of a system.
A second shortcoming of deterministic models is that dynamic systems are
driven not only by our own control inputs, but also by disturbances which we
can neither control nor model deterministically. If a pilot tries to command a
certain angular orientation of his aircraft, the actual response will differ from
his expectation due to wind buffeting, imprecision of control surface actuator
responses, and even his inability to generate exactly the desired response from
his own arm and hands on the control stick.
A final shortcoming is that sensors do not provide perfect and complete data
about a system. First, they generally do not provide all the information we
would like to know: either a device cannot be devised to generate a
measurement of a desired variable or the cost (volume, weight, monetary, etc.) of
including such a measurement is prohibitive. In other situations, a number of
different devices yield functionally related signals, and one must then ask how
to generate a best estimate of the variables of interest based on partially
redundant data. Sensors do not provide exact readings of desired quantities,
but introduce their own system dynamics and distortions as well. Furthermore,
these devices are also always noise corrupted.
As can be seen from the preceding discussion, to assume perfect knowledge
of all quantities necessary to describe a system completely and/or to assume
perfect control over the system is a naive, and often inadequate, approach.
This motivates us to ask the following four questions:
`
(1) How do you develop system models that account for these
uncertainties in a direct and proper, yet practical, fashion?
(2) Equipped with such models and incomplete, noise-corrupted data from
available sensors, how do you optimally estimate the quantities of interest to
you?
(3) In the face of uncertain system descriptions, incomplete and
noise-corrupted data, and disturbances beyond your control, how do you optimally
control a system to perform in a desirable manner?
`(4) How do you evaluate the performance capabilities of such estimation
`and control systems, both before and after they are actually built?
`
This book has been organized specifically to answer these questions in a
meaningful and useful manner.
`
1.2 OVERVIEW OF THE TEXT
`
Chapters 2-4 are devoted to the stochastic modeling problem. First,
Chapter 2 reviews the pertinent aspects of deterministic system models, to be
exploited and generalized subsequently. Probability theory provides the basis of
all of our stochastic models, and Chapter 3 develops both the general concepts
and the natural result of static system models. In order to incorporate
dynamics into the model, Chapter 4 investigates stochastic processes, concluding
with practical linear dynamic system models. The basic form is a linear system
driven by white Gaussian noise, from which are available linear measurements
which are similarly corrupted by white Gaussian noise. This structure is
justified extensively, and means of describing a large class of problems in this
context are delineated.
Linear estimation is the subject of the remaining chapters. Optimal filtering
for cases in which a linear system model adequately describes the problem
dynamics is studied in Chapter 5. With this background, Chapter 6 describes
the design and performance analysis of practical online Kalman filters. Square
root filters have emerged as a means of solving some numerical precision
difficulties encountered when optimal filters are implemented on restricted
wordlength online computers, and these are detailed in Chapter 7.
Volume 1 is a complete text in and of itself. Nevertheless, Volume 2 will
extend the concepts of linear estimation to smoothing, compensation of model
inadequacies, system identification, and adaptive filtering. Nonlinear stochastic
system models and estimators based upon them will then be fully developed.
Finally, the theory and practical design of stochastic controllers will be
described.
`
1.3 THE KALMAN FILTER:
AN INTRODUCTION TO CONCEPTS
`
Before we delve into the details of the text, it would be useful to see where
we are going on a conceptual basis. Therefore, the rest of this chapter will
provide an overview of the optimal linear estimator, the Kalman filter. This
will be conducted at a very elementary level but will provide insights into the
underlying concepts. As we progress through this overview, contemplate the
ideas being presented: try to conceive of graphic images to portray the
concepts involved (such as time propagation of density functions), and to generate
a logical structure for the component pieces that are brought together to solve
the estimation problem. If this basic conceptual framework makes sense to
you, then you will better understand the need for the details to be developed
later in the text. Should the idea of where we are going ever become blurred
by the development of detail, refer back to this overview to regain sight of the
overall objectives.
First one must ask, what is a Kalman filter? A Kalman filter is simply an
optimal recursive data processing algorithm. There are many ways of defining
optimal, dependent upon the criteria chosen to evaluate performance. It will be
shown that, under the assumptions to be made in the next section, the Kalman
filter is optimal with respect to virtually any criterion that makes sense. One
aspect of this optimality is that the Kalman filter incorporates all information
that can be provided to it. It processes all available measurements, regardless
of their precision, to estimate the current value of the variables of interest,
with use of (1) knowledge of the system and measurement device dynamics,
(2) the statistical description of the system noises, measurement errors, and
uncertainty in the dynamics models, and (3) any available information about
initial conditions of the variables of interest. For example, to determine the
velocity of an aircraft, one could use a Doppler radar, or the velocity
indications of an inertial navigation system, or the pitot and static pressure and
relative wind information in the air data system. Rather than ignore any of
these outputs, a Kalman filter could be built to combine all of this data and
knowledge of the various systems' dynamics to generate an overall best
estimate of velocity.
The word recursive in the previous description means that, unlike certain
data processing concepts, the Kalman filter does not require all previous data
to be kept in storage and reprocessed every time a new measurement is taken.
This will be of vital importance to the practicality of filter implementation.
The "filter" is actually a data processing algorithm. Despite the typical
connotation of a filter as a "black box" containing electrical networks, the fact is
that in most practical applications, the "filter" is just a computer program in
a central processor. As such, it inherently incorporates discrete-time
measurement samples rather than continuous-time inputs.
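The recursive idea can be illustrated with a deliberately simple case that is not taken from the text: averaging a stream of measurements. The sketch below is only an analogy, and the names `RunningMean` and `update` are hypothetical. It stores just the current estimate and a sample count, yet arrives at the same answer as a batch average that keeps and reprocesses all the data.

```python
class RunningMean:
    """Recursive average: keeps only the current estimate and a sample count."""

    def __init__(self):
        self.count = 0
        self.estimate = 0.0

    def update(self, measurement):
        # New estimate = old estimate + gain * (measurement - old estimate),
        # with gain 1/count. No past measurements are stored.
        self.count += 1
        gain = 1.0 / self.count
        self.estimate += gain * (measurement - self.estimate)
        return self.estimate


data = [2.0, 4.0, 9.0, 1.0]          # illustrative measurement values
recursive = RunningMean()
for z in data:
    recursive.update(z)

batch = sum(data) / len(data)        # batch form: needs every stored sample
```

The update has the same "old estimate plus gain times difference" shape that the filter equations take later in this chapter, although here the gain is fixed by the sample count rather than by noise statistics.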
`Figure 1.1 depicts a typical situation in which a Kalman filter could be used
`advantageously. A system of some sort is driven by some known controls, and
`measuring devices provide the value of certain pertinent quantities. Knowledge
`of these system inputs and outputs is all that is explicitly available from the
`physical system for estimation purposes.

[Figure omitted. Labels: Controls; System error sources; System state (desired, but not known); Measurement error sources; Observed measurements; Optimal estimate of system state.]
FIG. 1.1 Typical Kalman filter application.

The need for a filter now becomes apparent. Often the variables of interest,
some finite number of quantities to describe the "state" of the system, cannot
be measured directly, and some means of inferring these values from the
available data must be generated. For instance, an air data system directly provides
static and pitot pressures, from which velocity must be inferred. This inference
is complicated by the facts that the system is typically driven by inputs other
than our own known controls and that the relationships among the various
"state" variables and measured outputs are known only with some degree of
uncertainty. Furthermore, any measurement will be corrupted to some degree
by noise, biases, and device inaccuracies, and so a means of extracting valuable
information from a noisy signal must be provided as well. There may also be
a number of different measuring devices, each with its own particular dynamics
and error characteristics, that provide some information about a particular
variable, and it would be desirable to combine their outputs in a systematic
and optimal manner. A Kalman filter combines all available measurement data,
plus prior knowledge about the system and measuring devices, to produce an
estimate of the desired variables in such a manner that the error is minimized
statistically. In other words, if we were to run a number of candidate filters
many times for the same application, then the average results of the Kalman
filter would be better than the average results of any other.
Conceptually, what any type of filter tries to do is obtain an "optimal"
estimate of desired quantities from data provided by a noisy environment,
"optimal" meaning that it minimizes errors in some respect. There are many
means of accomplishing this objective. If we adopt a Bayesian viewpoint,
then we want the filter to propagate the conditional probability density of
the desired quantities, conditioned on knowledge of the actual data coming
from the measuring devices. To understand this concept, consider Fig. 1.2, a
portrayal of a conditional probability density of the value of a scalar
quantity $x$ at time instant $i$ ($x(i)$), conditioned on knowledge that the vector
measurement $\mathbf{z}(1)$ at time instant 1 took on the value $\mathbf{z}_1$
($\mathbf{z}(1) = \mathbf{z}_1$) and similarly for instants 2 through $i$, plotted as a
function of possible $x(i)$ values. This is denoted as
$f_{x(i)|\mathbf{z}(1),\mathbf{z}(2),\ldots,\mathbf{z}(i)}(x \mid \mathbf{z}_1, \mathbf{z}_2, \ldots, \mathbf{z}_i)$.
For example, let $x(i)$ be the one-dimensional position of a vehicle at time
instant $i$, and let $\mathbf{z}(j)$ be a two-dimensional vector describing the
measurements of position at time $j$ by two separate radars. Such a conditional
probability density contains all the available information about $x(i)$: it
indicates, for the given value of all measurements taken up through time
instant $i$, what the probability would be of $x(i)$ assuming any particular
value or range of values.

[Figure omitted: conditional density plotted against $x$.]
FIG. 1.2 Conditional probability density.

It is termed a "conditional" probability density because its shape and
location on the $x$ axis are dependent upon the values of the measurements taken.
Its shape conveys the amount of certainty you have in the knowledge of the
value of $x$. If the density plot is a narrow peak, then most of the probability
"weight" is concentrated in a narrow band of $x$ values. On the other hand, if
the plot has a gradual shape, the probability "weight" is spread over a wider
range of $x$, indicating that you are less sure of its value.
Once such a conditional probability density function is propagated, the
"optimal" estimate can be defined. Possible choices would include

(1) the mean: the "center of probability mass" estimate;
(2) the mode: the value of $x$ that has the highest probability, locating the
peak of the density; and
(3) the median: the value of $x$ such that half of the probability weight lies
to the left and half to the right of it.
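
The three candidate estimates listed above can be computed numerically for a density sampled on a grid. The following is a minimal sketch, assuming a uniform grid and a simple rectangle-rule integration; the helper names `gaussian_pdf` and `summarize_density`, and the example mean and standard deviation, are illustrative choices rather than anything from the text.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Gaussian density with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def summarize_density(xs, pdf_values):
    """Return (mean, mode, median) of a density sampled on a uniform grid."""
    dx = xs[1] - xs[0]
    weights = [p * dx for p in pdf_values]      # probability "weight" per cell
    total = sum(weights)
    # Mean: center of probability mass.
    mean = sum(x * w for x, w in zip(xs, weights)) / total
    # Mode: grid point where the density peaks.
    mode = xs[max(range(len(xs)), key=lambda i: pdf_values[i])]
    # Median: first grid point with half the weight at or to its left.
    cumulative = 0.0
    median = xs[-1]
    for x, w in zip(xs, weights):
        cumulative += w
        if cumulative >= 0.5 * total:
            median = x
            break
    return mean, mode, median


xs = [-20.0 + 0.01 * i for i in range(4001)]        # grid from -20 to 20
pdf = [gaussian_pdf(x, 1.5, 2.0) for x in xs]       # assumed example density
mean, mode, median = summarize_density(xs, pdf)
```

For this Gaussian example the three values agree to within the grid spacing; for a skewed density the same function would generally return three different numbers.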
`
A Kalman filter performs this conditional probability density propagation
for problems in which the system can be described through a linear model and
in which system and measurement noises are white and Gaussian (to be
explained shortly). Under these conditions, the mean, mode, median, and virtually
any reasonable choice for an "optimal" estimate all coincide, so there is in
fact a unique "best" estimate of the value of $x$. Under these three restrictions,
the Kalman filter can be shown to be the best filter of any conceivable form.
Some of the restrictions can be relaxed, yielding a qualified optimal filter. For
instance, if the Gaussian assumption is removed, the Kalman filter can be
shown to be the best (minimum error variance) filter out of the class of linear
unbiased filters. However, these three assumptions can be justified for many
potential applications, as seen in the following section.
`
1.4 BASIC ASSUMPTIONS
`
At this point it is useful to look at the three basic assumptions in the
Kalman filter formulation. On first inspection, they may appear to be overly
restrictive and unrealistic. To allay any misgivings of this sort, this section will
briefly discuss the physical implications of these assumptions.
A linear system model is justifiable for a number of reasons. Often such a
model is adequate for the purpose at hand, and when nonlinearities do exist,
the typical engineering approach is to linearize about some nominal point or
trajectory, achieving a perturbation model or error model. Linear systems are
desirable in that they are more easily manipulated with engineering tools, and
linear system (or differential equation) theory is much more complete and
practical than nonlinear. The fact is that there are means of extending the
Kalman filter concept to some nonlinear applications or developing nonlinear
filters directly, but these are considered only if linear models prove inadequate.
"Whiteness" implies that the noise value is not correlated in time. Stated
more simply, if you know what the value of the noise is now, this knowledge
does you no good in predicting what its value will be at any other time.
Whiteness also implies that the noise has equal power at all frequencies. Since
this results in a noise with infinite power, a white noise obviously cannot really
exist. One might then ask, why even consider such a concept if it does not
exist in real life? The answer is twofold. First, any physical system of interest
has a certain frequency "bandpass," a frequency range of inputs to which it
can respond. Above this range, the input either has no effect, or the system so
severely attenuates the effect that it essentially does not exist. In Fig. 1.3, a
typical system bandpass curve is drawn on a plot of "power spectral density"
(interpreted as the amount of power content at a certain frequency) versus
frequency. Typically a system will be driven by wideband noise, one having
power at frequencies above the system bandpass, and essentially constant
power at all frequencies within the system bandpass, as shown in the figure.
On this same plot, a white noise would merely extend this constant power level
out across all frequencies. Now, within the bandpass of the system of interest,
the fictitious white noise looks identical to the real wideband noise. So what
has been gained? That is the second part of the answer to why a white noise
model is used. It turns out that the mathematics involved in the filter are
vastly simplified (in fact, made tractable) by replacing the real wideband noise
with a white noise which, from the system's "point of view," is identical.
Therefore, the white noise model is used.

[Figure omitted: power spectral density plotted against frequency, with the system bandpass (amplitude ratio Bode plot) marked.]
FIG. 1.3 Power spectral density bandwidths.

One might argue that there are cases in which the noise power level is not
constant over all frequencies within the system bandpass, or in which the noise
is in fact time correlated. For such instances, a white noise put through a
small linear system can duplicate virtually any form of time-correlated noise.
This small system, called a "shaping filter," is then added to the original
system, to achieve an overall linear system driven by white noise once again.
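
The shaping filter idea can be sketched in discrete time. The example below is an assumption-laden illustration, not the construction used in the text: it passes a white Gaussian sequence through a first-order linear recursion and measures how strongly adjacent output samples are correlated. The function names and the parameter `correlation` (the desired sample-to-sample correlation) are hypothetical.

```python
import random


def shaped_noise(num_samples, correlation, sigma, seed=0):
    """Pass white Gaussian noise through a first-order linear recursion.

    The output has steady-state standard deviation sigma and (ideally)
    lag-one correlation equal to `correlation`.
    """
    rng = random.Random(seed)
    # Scale the white driving noise so the output variance stays sigma**2.
    drive_sigma = sigma * (1.0 - correlation ** 2) ** 0.5
    state = rng.gauss(0.0, sigma)               # start in steady state
    samples = []
    for _ in range(num_samples):
        samples.append(state)
        state = correlation * state + rng.gauss(0.0, drive_sigma)
    return samples


def lag_one_correlation(samples):
    """Sample correlation between each value and the next one."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    numerator = sum(a * b for a, b in zip(centered, centered[1:]))
    denominator = sum(c * c for c in centered)
    return numerator / denominator


white = shaped_noise(20000, correlation=0.0, sigma=1.0)     # no shaping
colored = shaped_noise(20000, correlation=0.9, sigma=1.0)   # time correlated
```

With `correlation=0.0` the recursion simply passes the white sequence through, and the measured lag-one correlation is near zero; with `correlation=0.9` it is near 0.9, so knowing the present value now does help predict the next one.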
Whereas whiteness pertains to time or frequency relationships of a noise,
Gaussianness has to do with its amplitude. Thus, at any single point in time,
the probability density of a Gaussian noise amplitude takes on the shape of a
normal bell-shaped curve. This assumption can be justified physically by the
fact that a system or measurement noise is typically caused by a number of
small sources. It can be shown mathematically that when a number of
independent random variables are added together, the summed effect can be
described very closely by a Gaussian probability density, regardless of the shape
of the individual densities.
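
A quick numerical experiment makes the summing argument concrete. This sketch assumes twelve independent sources, each uniformly distributed (a distinctly non-Gaussian shape), and checks how much of the summed noise falls within one standard deviation of the mean; the choice of twelve sources, the uniform shape, and the name `summed_noise` are illustrative assumptions.

```python
import random


def summed_noise(num_sources, rng):
    """Sum of independent uniform error sources, each on [-0.5, 0.5]."""
    return sum(rng.uniform(-0.5, 0.5) for _ in range(num_sources))


rng = random.Random(1)
num_sources = 12    # each source has variance 1/12, so the sum has variance 1
samples = [summed_noise(num_sources, rng) for _ in range(50000)]

# Fraction of samples within one standard deviation (1.0) of the mean (0.0).
within_one_sigma = sum(1 for s in samples if abs(s) <= 1.0) / len(samples)
```

The fraction comes out close to the roughly 68% that a Gaussian density would give, even though no individual source is Gaussian.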
There is also a practical justification for using Gaussian densities. Similar
to whiteness, it makes the mathematics tractable. But more than that, typically
an engineer will know, at best, the first and second order statistics (mean and
variance or standard deviation) of a noise process. In the absence of any higher
order statistics, there is no better form to assume than the Gaussian density.
The first and second order statistics completely determine a Gaussian density,
unlike most densities which require an endless number of orders of statistics
to specify their shape entirely. Thus, the Kalman filter, which propagates the
first and second order statistics, includes all information contained in the
conditional probability density, rather than only some of it, as would be the case
with a different form of density.
The particular assumptions that are made are dictated by the objectives of,
and the underlying motivation for, the model being developed. If our objective
were merely to build good descriptive models, we would not confine our
attention to linear system models driven by white Gaussian noise. Rather, we would
seek the model, of whatever form, that best fits the data generated by the "real
world." It is our desire to build estimators and controllers based upon our
system models that drives us to these assumptions: other assumptions generally
do not yield tractable estimation or control problem formulations. Fortunately,
the class of models that yields tractable mathematics also provides adequate
representations for many applications of interest. Later, the model structure
will be extended somewhat to enlarge the range of applicability, but the
requirement of model usefulness in subsequent estimator or controller design will
again be a dominant influence on the manner in which the extensions are made.
`
`1.5 A SIMPLE EXAMPLE
`
To see how a Kalman filter works, a simple example will now be developed.
Any example of a single measuring device providing data on a single variable
would suffice, but the determination of a position is chosen because the
probability of one's exact location is a familiar concept that easily allows dynamics
to be incorporated into the problem.
Suppose that you are lost at sea during the night and have no idea at all
of your location. So you take a star sighting to establish your position (for the
sake of simplicity, consider a one-dimensional location). At some time $t_1$ you
determine your location to be $z_1$. However, because of inherent measuring
device inaccuracies, human error, and the like, the result of your measurement
is somewhat uncertain. Say you decide that the precision is such that the
standard deviation (one-sigma value) involved is $\sigma_{z_1}$ (or equivalently, the
variance, or second order statistic, is $\sigma_{z_1}^2$). Thus, you can establish the
conditional probability of $x(t_1)$, your position at time $t_1$, conditioned on the
observed value of the measurement being $z_1$, as depicted in Fig. 1.4. This is a
plot of $f_{x(t_1)|z(t_1)}(x \mid z_1)$ as a function of the location $x$: it tells you the
probability of being in any one location, based upon the measurement you took.
Note that $\sigma_{z_1}$ is a direct measure of the uncertainty: the larger $\sigma_{z_1}$ is, the
broader the probability peak is, spreading the probability "weight" over a larger
range of $x$ values. For a Gaussian density, 68.3% of the probability "weight" is
contained within the band $\sigma$ units to each side of the mean, the shaded portion
in Fig. 1.4.

[Figure omitted: Gaussian density centered on $z_1$, with the band within one $\sigma$ of the mean shaded.]
FIG. 1.4 Conditional density of position based on measured value $z_1$.
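
The 68.3% figure quoted above can be checked with the error function from the Python standard library. This is a small sketch; the function name `probability_within` is hypothetical.

```python
import math


def probability_within(num_sigmas):
    """Probability that a Gaussian variable lies within num_sigmas of its mean."""
    return math.erf(num_sigmas / math.sqrt(2.0))


one_sigma = probability_within(1.0)     # about 0.683
two_sigma = probability_within(2.0)     # about 0.954
```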
Based on this conditional probability density, the best estimate of your
position is

$$\hat{x}(t_1) = z_1 \tag{1-1}$$

and the variance of the error in the estimate is

$$\sigma_x^2(t_1) = \sigma_{z_1}^2 \tag{1-2}$$

Note that $\hat{x}$ is both the mode (peak) and the median (value with 1/2 of the
probability weight to each side), as well as the mean (center of mass).
Now say a trained navigator friend takes an independent fix right after you
do, at time $t_2 \cong t_1$ (so that the true position has not changed at all), and
obtains a measurement $z_2$ with a variance $\sigma_{z_2}^2$. Because he has a higher skill,
assume the variance in his measurement to be somewhat smaller than in yours.
Figure 1.5 presents the conditional density of your position at time $t_2$, based
only on the measured value $z_2$. Note the narrower peak due to smaller
variance, indicating that you are rather certain of your position based on his
measurement.

[Figure omitted.]
FIG. 1.5 Conditional density of position based on measurement $z_2$ alone.

[Figure omitted.]
FIG. 1.6 Conditional density of position based on data $z_1$ and $z_2$.

At this point, you have two measurements available for estimating your
position. The question is, how do you combine these data? It will be shown
subsequently that, based on the assumptions made, the conditional density of
your position at time $t_2 \cong t_1$, $x(t_2)$, given both $z_1$ and $z_2$, is a Gaussian density
with mean $\mu$ and variance $\sigma^2$ as indicated in Fig. 1.6, with

$$\mu = [\sigma_{z_2}^2/(\sigma_{z_1}^2 + \sigma_{z_2}^2)]\,z_1 + [\sigma_{z_1}^2/(\sigma_{z_1}^2 + \sigma_{z_2}^2)]\,z_2 \tag{1-3}$$

$$1/\sigma^2 = (1/\sigma_{z_1}^2) + (1/\sigma_{z_2}^2) \tag{1-4}$$

Note that, from (1-4), $\sigma$ is less than either $\sigma_{z_1}$ or $\sigma_{z_2}$, which is to say that the
uncertainty in your estimate of position has been decreased by combining the
two pieces of information.
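
Equations (1-3) and (1-4) translate directly into a few lines of code. The sketch below is illustrative only: the function name `fuse` and the numerical values are assumptions chosen to make the arithmetic easy to follow.

```python
def fuse(z1, var1, z2, var2):
    """Combine two independent Gaussian measurements of the same position.

    var1 and var2 are the measurement variances (sigma squared).
    """
    total = var1 + var2
    mean = (var2 / total) * z1 + (var1 / total) * z2     # Eq. (1-3)
    variance = 1.0 / (1.0 / var1 + 1.0 / var2)           # Eq. (1-4)
    return mean, variance


# Equal precision: the estimate is the plain average of the two fixes.
mean_equal, var_equal = fuse(10.0, 4.0, 12.0, 4.0)

# Second fix more precise: the estimate is pulled toward z2.
mean_skewed, var_skewed = fuse(10.0, 4.0, 12.0, 1.0)
```

In the first case the result is 11.0 with variance 2.0; in the second it is 11.6 with variance 0.8. In both cases the combined variance is smaller than either input variance.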
Given this density, the best estimate is

$$\hat{x}(t_2) = \mu \tag{1-5}$$

with an associated error variance $\sigma^2$. It is the mode and the mean (or, since it
is the mean of a conditional density, it is also termed the conditional mean).
Furthermore, it is also the maximum likelihood estimate, the weighted least
squares estimate, and the linear estimate whose variance is less than that of
any other linear unbiased estimate. In other words, it is the "best" you can do
according to just about any reasonable criterion.
After some study, the form of $\mu$ given in Eq. (1-3) makes good sense. If $\sigma_{z_1}$
were equal to $\sigma_{z_2}$, which is to say you think the measurements are of equal
precision, the equation says the optimal estimate of position is simply the
average of the two measurements, as would be expected. On the other hand,
if $\sigma_{z_1}$ were larger than $\sigma_{z_2}$, which is to say that the uncertainty involved in the
measurement $z_1$ is greater than that of $z_2$, then the equation dictates "weighting"
$z_2$ more heavily than $z_1$. Finally, the variance of the estimate is less than $\sigma_{z_1}^2$
even if $\sigma_{z_2}$ is very large: even poor quality data provide some information, and
should thus increase the precision of the filter output.
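The equation for $\hat{x}(t_2)$ can be rewritten as

$$\begin{aligned}
\hat{x}(t_2) &= [\sigma_{z_2}^2/(\sigma_{z_1}^2 + \sigma_{z_2}^2)]\,z_1 + [\sigma_{z_1}^2/(\sigma_{z_1}^2 + \sigma_{z_2}^2)]\,z_2 \\
&= z_1 + [\sigma_{z_1}^2/(\sigma_{z_1}^2 + \sigma_{z_2}^2)][z_2 - z_1]
\end{aligned} \tag{1-6}$$

or, in final form that is actually used in Kalman filter implementations [noting
that $\hat{x}(t_1) = z_1$],

$$\hat{x}(t_2) = \hat{x}(t_1) + K(t_2)[z_2 - \hat{x}(t_1)] \tag{1-7}$$

where

$$K(t_2) = \sigma_{z_1}^2/(\sigma_{z_1}^2 + \sigma_{z_2}^2) \tag{1-8}$$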
`
These equations say that the optimal estimate at time $t_2$, $\hat{x}(t_2)$, is equal to the
best prediction of its value before $z_2$ is taken, $\hat{x}(t_1)$, plus a correction term of
an optimal weighting value times the difference between $z_2$ and the best
prediction of its value before it is actually taken, $\hat{x}(t_1)$. It is worthwhile to understand
this "predictor-corrector" structure of the filter. Based on all previous
information, a prediction of the value that the desired variables and measurement will
have at the next measurement time is made. Then, when the next measurement
is taken, the difference between it and its predicted value is used to "correct"
the prediction of the desired variables.
Using the $K(t_2)$ in Eq. (1-8), the variance equation given by Eq. (1-4) can be
rewritten as

$$\sigma_x^2(t_2) = \sigma_x^2(t_1) - K(t_2)\,\sigma_x^2(t_1) \tag{1-9}$$

Note that the values of $\hat{x}(t_2)$ and $\sigma_x^2(t_2)$ embody all of the information in
$f_{x(t_2)|z(t_1),z(t_2)}(x \mid z_1, z_2)$. Stated differently, by propagating these two variables, the
conditional density of your position at time $t_2$, given $z_1$ and $z_2$, is completely
specified.
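
The predictor-corrector form of Eqs. (1-7) through (1-9) can be written as a single update function. This is a minimal sketch under the scalar assumptions of the example; the name `measurement_update` and the numbers are hypothetical.

```python
def measurement_update(x_prior, var_prior, z, var_z):
    """Correct a prior estimate with one new scalar measurement."""
    gain = var_prior / (var_prior + var_z)       # Eq. (1-8)
    x_post = x_prior + gain * (z - x_prior)      # Eq. (1-7)
    var_post = var_prior - gain * var_prior      # Eq. (1-9)
    return x_post, var_post, gain


# First fix sets the initial estimate and variance, Eqs. (1-1) and (1-2).
x1, var1 = 10.0, 4.0

# Second, more precise fix: z2 = 12.0 with variance 1.0.
x2, var2, gain = measurement_update(x1, var1, 12.0, 1.0)
```

Here the gain is 0.8, the updated estimate is 11.6, and the updated variance is 0.8, the same values that Eqs. (1-3) and (1-4) give for these inputs. Only the estimate and its variance need to be carried forward.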
Thus we have solved the static estimation problem. Now consider
incorporating dynamics into the problem.
Suppose that you travel for some time before taking another measurement.
Further assume that the best model you have of your motion is of the simple
form

$$dx/dt = u + w \tag{1-10}$$

where $u$ is a nominal velocity and $w$ is a noise term used to represent the
uncertainty in your knowledge of the actual velocity due to disturbances,
off-nominal conditions, effects not accounted for in the simple first order equation,
and the like. The "noise" $w$ will be modeled as a white Gaussian noise with a
mean of zero and variance of $\sigma_w^2$.

[Figure omitted.]
FIG. 1.7 Propagation of conditional probability density.

Figure 1.7 shows graphically what happens to the conditional density of
position, given $z_1$ and $z_2$. At time $t_2$ it is as previously derived. As time
progresses, the density travels along the $x$ axis at the nominal speed $u$, while
simultaneously spreading out about its mean. Thus, the probability density
starts at the best estimate, moves according to the nominal model of dynamics,
and spreads out in time because you become less sure of your exact position
due to the constant addition of uncertainty over time. At the time $t_3^-$, just
before the measurement is taken at time $t_3$, the density
$f_{x(t_3)|z(t_1),z(t_2)}(x \mid z_1, z_2)$ is as shown in Fig. 1.7, and can be expressed
mathematically as a Gaussian density with mean and variance given by

$$\hat{x}(t_3^-) = \hat{x}(t_2) + u[t_3 - t_2] \tag{1-11}$$

$$\sigma_x^2(t_3^-) = \sigma_x^2(t_2) + \sigma_w^2[t_3 - t_2] \tag{1-12}$$

Thus, $\hat{x}(t_3^-)$ is the optimal prediction of what the $x$ value is at $t_3^-$, before the
measurement is taken at $t_3$, and $\sigma_x^2(t_3^-)$ is the expected variance in that
prediction.
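
The propagation step of Eqs. (1-11) and (1-12) is equally short in code. The sketch below uses hypothetical names (`time_propagate`, `var_w` for the noise variance) and made-up numbers.

```python
def time_propagate(x_est, var_est, velocity, var_w, dt):
    """Move the estimate forward by dt and grow its variance."""
    x_pred = x_est + velocity * dt       # Eq. (1-11): mean moves at nominal speed
    var_pred = var_est + var_w * dt      # Eq. (1-12): uncertainty accumulates
    return x_pred, var_pred


# Start from an estimate of 11.6 with variance 0.8, then travel for 3 time units.
x_pred, var_pred = time_propagate(11.6, 0.8, velocity=2.0, var_w=0.5, dt=3.0)
```

The predicted position is 17.6 and the variance has grown from 0.8 to 2.3: the density has shifted and spread, as the figure for this step describes.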
Now a measurement is taken, and its value turns out to be $z_3$, and its
variance is assumed to be $\sigma_{z_3}^2$. As before, there are now two Gaussian densities
available that contain information about position, one encompassing all the
information available before the measurement, and the other being the
information provided by the measurement itself. By the same process as before, the
density with mean $\hat{x}(t_3^-)$ and variance $\sigma_x^2(t_3^-)$ is combined with the density
with mean $z_3$ and variance $\sigma_{z_3}^2$, to yield a Gaussian density with mean

$$\hat{x}(t_3) = \hat{x}(t_3^-) + K(t_3)[z_3 - \hat{x}(t_3^-)] \tag{1-13}$$

and variance

$$\sigma_x^2(t_3) = \sigma_x^2(t_3^-) - K(t_3)\,\sigma_x^2(t_3^-) \tag{1-14}$$

where the gain $K(t_3)$ is given by

$$K(t_3) = \sigma_x^2(t_3^-)/[\sigma_x^2(t_3^-) + \sigma_{z_3}^2] \tag{1-15}$$
`
The optimal estimate, $\hat{x}(t_3)$, satisfies the same form of equation as seen
previously in (1-7). The best prediction of its value before $z_3$ is taken is corrected
by an optimal weighting value times the difference between $z_3$ and the
prediction of its value. Similarly, the variance and gain equations are of the same
form as (1-8) and (1-9).
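
Putting propagation and update together gives one full cycle of the scalar filter. The following sketch assumes the scalar model of this example; `kalman_step` and all numerical values are hypothetical. It also evaluates two extreme noise settings to show how the gain responds.

```python
def kalman_step(x_est, var_est, velocity, var_w, dt, z, var_z):
    """One propagate-then-update cycle of the scalar filter."""
    # Propagate to just before the measurement, Eqs. (1-11) and (1-12).
    x_pred = x_est + velocity * dt
    var_pred = var_est + var_w * dt
    # Incorporate the measurement, Eqs. (1-13) to (1-15).
    gain = var_pred / (var_pred + var_z)
    x_new = x_pred + gain * (z - x_pred)
    var_new = var_pred - gain * var_pred
    return x_new, var_new, gain


# Nominal case: moderate model noise and measurement noise.
nominal = kalman_step(11.6, 0.8, 2.0, 0.5, 3.0, z=18.5, var_z=1.0)

# Very noisy measurement: the gain is nearly 0, so the prediction is kept.
noisy_sensor = kalman_step(11.6, 0.8, 2.0, 0.5, 3.0, z=18.5, var_z=1e12)

# Very noisy dynamics model: the gain is nearly 1, so the measurement is kept.
noisy_model = kalman_step(11.6, 0.8, 2.0, 1e12, 3.0, z=18.5, var_z=1.0)
```

In the nominal case the predicted position is 17.6 with variance 2.3, so the gain is 2.3/3.3 and the updated estimate lies between the prediction and the measurement 18.5, closer to the measurement.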
Observe the form of the equation for $K(t_3)$. If $\sigma_{z_3}^2$, the measurement noise
variance, is large, then $K(t_3)$ is small; this simply says that you would tend to
put little confidence in a very noisy measurement and so would weight it
lightly. In the limit as $\sigma_{z_3}^2 \to \infty$, $K(t_3)$ becomes zero, and $\hat{x}(t_3)$ equals $\hat{x}(t_3^-)$:
an infinitely noisy measurement is totally ignored. If the dynamic system noise
variance $\sigma_w^2$ is large, then $\sigma_x^2(t_3^-)$ will be large [see Eq. (1-12)] and so will
$K(t_3)$; in this case, you are not very certain of the output of the system model
within the filter structure and therefore would weight the measurement heavily.
Note that in the limit as $\sigma_w^2 \to \infty$, $\sigma_x^2(t_3^-) \to \infty$ and $K(t_3) \to 1$, so Eq. (1-13)
yields

$$\hat{x}(t_3) = \hat{x}(t_3^-) + 1 \cdot [z_3 - \hat{x}(t_3^-)] = z_3 \tag{1-16}$$
`
`Thus in the limit of absolutely no confidence in the