`10/14/21, 4:52 PM
`The Wayback Machine - https://web.archive.org/web/20010419095031/http://www.itl.nist.gov:80/div898/handbook/pmc/section4/pmc431.htm
`
`
`6.4.3.1. Single Exponential Smoothing
`
Exponential smoothing weights past observations with exponentially
decreasing weights to forecast future values.
`
This smoothing scheme begins by setting $S_0$ to $y_1$, where $S$ stands for
the smoothed observation or EWMA (exponentially weighted moving average),
and $y$ for the observation. The subscripts refer to the time periods
1, 2, ..., n. For the second period, $S_2 = \alpha y_2 + (1-\alpha) S_1$,
and so on.
`
For any time period t, the smoothed value $S_t$ is found by computing

$$S_t = \alpha y_t + (1-\alpha) S_{t-1}$$
`
This equation is due to Roberts (1959) and is called the basic equation of
exponential smoothing. The constant or parameter $\alpha$ is called the
smoothing constant.
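
The basic equation translates directly into a short loop. The following is a
minimal Python sketch and not part of the original handbook: the function
name `single_exponential_smoothing` and the sample values are hypothetical,
and it assumes the convention used above that $S_0$ defaults to $y_1$.

```python
def single_exponential_smoothing(y, alpha, s0=None):
    """Return [S_1, ..., S_n] for observations y = [y_1, ..., y_n].

    Uses S_t = alpha * y_t + (1 - alpha) * S_{t-1}.
    Assumption: S_0 is set to y_1 when no initial value is supplied.
    """
    if not y:
        return []
    s_prev = y[0] if s0 is None else s0
    smoothed = []
    for obs in y:
        s_prev = alpha * obs + (1 - alpha) * s_prev
        smoothed.append(s_prev)
    return smoothed


# Hypothetical observations, for illustration only.
observations = [10.0, 12.0, 11.0, 15.0]
print(single_exponential_smoothing(observations, alpha=0.5))
# prints [10.0, 11.0, 11.0, 13.0]
```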
`
`Setting the first EWMA
`
The initial EWMA plays an important role in computing all the subsequent
EWMAs. Setting $S_0$ to $y_1$ is one method of initialization. Another way
is to set it to the target of the process. Still another possibility would
be to average the first four or five observations.
`
It can also be shown that the smaller the value of $\alpha$, the more
important the selection of the initial EWMA becomes. The user would be wise
to try a few methods (assuming that the software has them available) before
finalizing the settings.
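
The effect of the starting value can be explored with a small experiment. The
sketch below is illustrative only: the data, the process target of 45, and
the helper name `smooth` are hypothetical. It compares the three
initializations mentioned above for a small and a large smoothing constant.

```python
def smooth(y, alpha, s0):
    """Apply S_t = alpha * y_t + (1 - alpha) * S_{t-1} starting from s0."""
    s, out = s0, []
    for obs in y:
        s = alpha * obs + (1 - alpha) * s
        out.append(s)
    return out


# Hypothetical data and process target, for illustration only.
y = [52.0, 49.0, 51.0, 50.0, 53.0, 48.0, 50.0, 51.0]
target = 45.0

initial_values = {
    "first observation": y[0],
    "process target": target,
    "mean of first four": sum(y[:4]) / 4,
}

for alpha in (0.1, 0.9):
    for label, s0 in initial_values.items():
        final = smooth(y, alpha, s0)[-1]
        print(f"alpha={alpha}, S0 from {label}: final EWMA = {final:.3f}")
```

Because the initial value enters the smoothed value at time t with weight
$(1-\alpha)^t$, the three starting values lead to visibly different results
for a small smoothing constant and to practically identical results for a
large one.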
`
`Why is it called "Exponential"?
Let us expand the basic equation by first substituting for $S_{t-1}$ in the
basic equation to obtain

$$\begin{aligned}
S_t &= \alpha y_t + (1-\alpha)\left[\alpha y_{t-1} + (1-\alpha) S_{t-2}\right] \\
    &= \alpha y_t + \alpha(1-\alpha) y_{t-1} + (1-\alpha)^2 S_{t-2}
\end{aligned}$$

By substituting for $S_{t-2}$, then for $S_{t-3}$, and so forth, until we
reach $S_0$, it can be shown that the expanded equation can be written as:

$$S_t = \alpha \sum_{i=0}^{t-1} (1-\alpha)^i y_{t-i} + (1-\alpha)^t S_0$$
`
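For example, the expanded equation for the smoothed value $S_4$ is:

$$S_4 = \alpha\left[(1-\alpha)^0 y_4 + (1-\alpha)^1 y_3 + (1-\alpha)^2 y_2 + (1-\alpha)^3 y_1\right] + (1-\alpha)^4 S_0$$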
`
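This illustrates the exponential behavior. The weights $\alpha(1-\alpha)^t$
decrease geometrically, and their sum approaches unity as t grows, as shown
below using a property of geometric series:

$$\alpha \sum_{i=0}^{t-1} (1-\alpha)^i = \alpha\left[\frac{1-(1-\alpha)^t}{1-(1-\alpha)}\right] = 1-(1-\alpha)^t$$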
`
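The expansion can be checked numerically. The sketch below assumes the
recurrence $S_t = \alpha y_t + (1-\alpha) S_{t-1}$, for which repeated
substitution gives $S_t = \alpha \sum_{i=0}^{t-1} (1-\alpha)^i y_{t-i} +
(1-\alpha)^t S_0$. The function names and the data are hypothetical and are
used for illustration only.

```python
def smooth_recursive(y, alpha, s0):
    """Final smoothed value obtained by applying the recurrence step by step."""
    s = s0
    for obs in y:
        s = alpha * obs + (1 - alpha) * s
    return s


def smooth_expanded(y, alpha, s0):
    """Final smoothed value obtained from the expanded (weighted-sum) form."""
    t = len(y)
    # y[t - 1 - i] is y_{t-i}, because Python lists are zero-based.
    weighted = sum(alpha * (1 - alpha) ** i * y[t - 1 - i] for i in range(t))
    return weighted + (1 - alpha) ** t * s0


def weight_sum(alpha, t):
    """Sum of the weights alpha * (1 - alpha)^i for i = 0, ..., t - 1."""
    return sum(alpha * (1 - alpha) ** i for i in range(t))


# Hypothetical observations, for illustration only.
y = [4.0, 7.0, 5.0, 9.0]
alpha, s0 = 0.3, y[0]

print(smooth_recursive(y, alpha, s0), smooth_expanded(y, alpha, s0))
print(weight_sum(alpha, len(y)), 1 - (1 - alpha) ** len(y))
```

Both lines should print two values that agree up to floating-point rounding:
the first pair confirms the expansion, the second the geometric-series sum.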
The summation term in the last formula shows that the contribution of an
observation to the smoothed value $S_t$ becomes smaller with each time
period that it recedes into the past.
Let $\alpha = 0.3$. Observe that the weights $\alpha(1-\alpha)^t$ decrease
exponentially (geometrically) with time.
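
The decay is easy to tabulate. This short sketch, which is not part of the
original text, lists the first few weights for a smoothing constant of 0.3;
the number of terms shown (five) is an arbitrary choice.

```python
alpha = 0.3

# Weight attached to the observation i periods back: alpha * (1 - alpha)^i.
weights = [alpha * (1 - alpha) ** i for i in range(5)]

for i, w in enumerate(weights):
    print(f"i = {i}: weight = {w:.4f}")
```

Each weight is the previous one multiplied by $(1-\alpha) = 0.7$, which is
what makes the sequence geometric.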
`
`
`
What is the "best" value for $\alpha$?
`
The speed at which the older responses are dampened (smoothed) is a
function of the value of $\alpha$. When $\alpha$ is close to 1, dampening is
quick, and when $\alpha$ is close to 0, dampening is slow. This is
illustrated in the table below:

(Table columns run towards past observations.)
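
A table of this kind can be regenerated with a few lines of Python. The
sketch below is an illustration rather than a reproduction of the original
table: the three smoothing constants, the four powers shown, and the helper
name `dampening_factors` are assumptions.

```python
def dampening_factors(alpha, max_power=4):
    """Return (1 - alpha)^k for k = 1, ..., max_power."""
    return [(1 - alpha) ** k for k in range(1, max_power + 1)]


# Hypothetical choice of smoothing constants, from fast to slow dampening.
alphas = (0.9, 0.5, 0.1)

print("alpha  " + "  ".join(f"(1-a)^{k}" for k in range(1, 5)))
for a in alphas:
    row = "  ".join(f"{factor:7.4f}" for factor in dampening_factors(a))
    print(f"{a:5.1f}  {row}")
```

Reading a row from left to right moves towards older observations, so a row
that drops to near zero quickly corresponds to fast dampening.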
`
The best value for $\alpha$ is that value which results in the smallest MSE.
`
Let us illustrate this principle with an example. Consider the following data
set consisting of 12 observations taken over time:
`
The sum of the squared errors (SSE) = 169.143. The mean of the squared
errors (MSE) is SSE/12 = 14.095.
`
The MSE was again calculated for $\alpha = 0.5$ and turned out to be 3.78, so
in this case we would select an $\alpha$ of 0.5. Can we do better? We could
apply the proven trial-and-error method. This is an iterative procedure
beginning with a range of $\alpha$ between 0.1 and 0.9. Then we find the
value of $\alpha$ in that range that gives the smallest MSE and search
between $\alpha - \Delta$ and $\alpha + \Delta$. We could repeat this
perhaps one more time to find the best $\alpha$ to 3 decimal places.
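
The trial-and-error search can be sketched as a coarse-to-fine grid search.
The code below is a minimal illustration under stated assumptions, not the
handbook's own procedure: the data are hypothetical, $S_0$ is set to $y_1$,
the error at time t is taken to be the one-step-ahead error $y_t - S_{t-1}$,
the MSE divides the SSE by the number of observations, and the names `mse`
and `search_alpha` are made up for this example.

```python
def mse(y, alpha, s0=None):
    """Mean of the squared one-step-ahead errors for a given alpha."""
    s = y[0] if s0 is None else s0      # assumption: S_0 = y_1
    sse = 0.0
    for obs in y:
        err = obs - s                   # assumption: error = y_t - S_{t-1}
        sse += err * err
        s = alpha * obs + (1 - alpha) * s
    return sse / len(y)


def search_alpha(y, lo=0.1, hi=0.9, step=0.1, rounds=3):
    """Grid search over [lo, hi], then refine around the best value.

    Each round searches best - step .. best + step on a grid ten times
    finer, so three rounds resolve alpha to 3 decimal places.
    """
    best = lo
    for _ in range(rounds):
        n = int(round((hi - lo) / step))
        candidates = [lo + i * step for i in range(n + 1)]
        best = min(candidates, key=lambda a: mse(y, a))
        lo = max(best - step, 0.0)      # keep alpha inside [0, 1]
        hi = min(best + step, 1.0)
        step /= 10
    return round(best, 3)


# Hypothetical series of 12 observations, for illustration only.
observations = [12.0, 15.0, 14.0, 16.0, 19.0, 18.0,
                21.0, 20.0, 23.0, 22.0, 25.0, 24.0]

best_alpha = search_alpha(observations)
print(best_alpha, mse(observations, best_alpha))
```

A grid search only finds the best value on the grid it visits; if the MSE has
several local minima, the refinement follows whichever coarse grid point
happened to be best.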
`
But there are better search methods, such as the Marquardt procedure. This
is a nonlinear optimizer that minimizes the sum of squares of residuals. In
general, most well-designed statistical software programs should be able to
find the value of $\alpha$ that minimizes the MSE.
`