IEEE TRANSACTIONS ON COMPONENTS, PACKAGING, AND MANUFACTURING TECHNOLOGY—PART C, VOL. 20, NO. 1, JANUARY 1997
`
`
`Real-Time Diagnosis of Semiconductor
`Manufacturing Equipment Using a
`Hybrid Neural Network Expert System
`
`Byungwhan Kim, Member, IEEE, and Gary S. May, Senior Member, IEEE
`
`Abstract— This paper presents a tool for the real-time diag-
`nosis of integrated circuit fabrication equipment. The approach
`focuses on integrating neural networks into an expert system.
`The system employs evidential reasoning to identify malfunctions
`by combining evidence originating from equipment maintenance
`history, on-line sensor data, and in-line post-process measure-
`ments. Neural networks are used in the maintenance phase
`of diagnosis to approximate the functional form of the failure
`history distribution of each component. Predicted failure rates
`are then converted to belief levels. For on-line diagnosis in the
`case of previously unencountered faults, a CUSUM control chart
`is implemented on real sensor data to detect very small process
`shifts and their trends. For the known fault case, continuous
`hypothesis testing on the statistical mean and variance of the
`sensor data is performed to search for similar data patterns and
`assign belief levels. Finally, neural process models of process
`figures of merit (such as etch uniformity) derived from prior
`experimentation are used to analyze the in-line measurements,
`and identify the most suitable candidate among faulty input
`parameters (such as gas flow) to explain process shifts. A working
`prototype for this hybrid diagnostic system has been implemented
`on the Plasma Therm 700 series reactive ion etcher located in the
`Georgia Tech Microelectronics Research Center.
`
`Index Terms—Diagnosis, expert systems, neural networks, re-
`active ion etching.
`
I. INTRODUCTION

As the semiconductor industry moves toward submicron fabrication technology, tight control of process variability is an essential requirement. A certain amount of variability is inherent in sophisticated semiconductor equipment, and significant performance shifts may occur when this variability becomes large compared to random process noise (i.e., fluctuations resulting from small and essentially uncontrollable causes). Such shifts are often indicative of equipment malfunctions. When unreliable equipment performance causes operating conditions to vary beyond an acceptable level, overall product quality is jeopardized. Thus, timely and accurate equipment malfunction diagnosis can be a key to the success of the semiconductor manufacturing process. Diagnosis involves identifying the assignable causes for the equipment malfunctions and correcting them quickly to prevent the subsequent occurrence of expensive misprocessing. With the advent of highly proficient sensors capable of monitoring process conditions in-situ, it is now desirable to perform diagnosis on a real-time basis.

Algorithmic diagnostic systems such as HIPPOCRATES [1] have been developed to identify process faults from statistical inference procedures and electrical measurements performed on finished IC wafers. Although this system makes good use of quantitative models of process behavior, it can only arrive at useful diagnostic conclusions in the limited regions of operation over which these models are valid. Furthermore, in critical process steps such as reactive ion etching (RIE), the theoretical basis for determining causal relationships is not well understood, thereby limiting the usefulness of physical models [2]. Expert systems such as PIES [3] have been designed to draw upon experiential knowledge to develop qualitative models of process behavior. This approach has attained limited success in attempting to diagnose unstructured problems which lack a solid conceptual foundation for reasoning. However, a purely knowledge-based technique often lacks the precision inherent in deep-level physical models, and is thus incapable of deriving solutions for unanticipated situations from the underlying principles surrounding the process.

Neural networks have recently emerged as an effective tool for process modeling [4], [5] as well as fault diagnosis [6], [7]. Diagnostic problem solving using neural networks requires the association of input patterns representing quantitative and qualitative process behavior to fault identification. Robustness to noisy sensor data and high speed parallel computation make neural networks an attractive alternative for real-time diagnosis. However, the pattern recognition-based neural network approach has limitations. First, a complete set of fault signatures is hard to obtain, and the representational inadequacy of a limited number of data sets can induce network overtraining, thus increasing the misclassification or "false alarm" rate. Also, pattern matching approaches in which diagnostic actions take place following a sequence of several processing steps are sub-optimal since evidence pertaining to potential equipment malfunctions accumulates at irregular intervals throughout the process sequence. At the end of a sequence, significant misprocessing and yield loss may have already taken place, making post-process diagnosis alone economically undesirable.

Manuscript received January 31, 1996; revised March 1997. This work was supported by the National Science Foundation Grant DDM-9358163 and the IEEE/CPMT Motorola Fellowship.
B. Kim is with the Memory R&D Division, Department of Equipment Engineering, Hyundai Electronics Industries Co., Ltd., Korea.
G. S. May is with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0250 USA.
Publisher Item Identifier S 1083-4400(97)04320-9.

1083–4400/97$10.00 © 1997 IEEE
`
`Authorized licensed use limited to: LEHIGH UNIVERSITY. Downloaded on July 12,2021 at 00:51:26 UTC from IEEE Xplore. Restrictions apply.
`
`Applied Materials, Inc. Ex. 1029
`Applied v. Ocean, IPR Patent No. 6,836,691
`Page 1 of 9
`
`

`

`
`This paper presents a prototype tool for the automated mal-
`function diagnosis of integrated circuit fabrication equipment.
`The methodology described combines the best characteristics
`of quantitative algorithmic, qualitative experiential and pattern
`recognition-based neural network approaches. This system
`offers advantages in that it yields a stable and reliable ranked
`list of fault possibilities, even in the presence of measurement
`noise (in part due to the inherent noise resistance of neural
networks). In addition, the varying degrees of belief in each stage of diagnosis aid in the early detection of suspicious trends, often prior to an actual failure occurrence. This working prototype is currently being developed and implemented on a Plasma Therm 700 series RIE located in the Georgia Tech Microelectronics Research Center.
`
`II. DIAGNOSTIC INFERENCE METHOD
As a diagnostic inference method, the Dempster–Shafer theory of evidential reasoning [8] has proven to be suitable for real-time malfunction diagnosis applications [9]. This technique allows the combination of various pieces of uncertain evidence obtained at irregular intervals, and its implementation results in time-varying, nonmonotonic belief functions which reflect the current status of diagnostic conclusions at any given point in time.

One of the basic concepts in Dempster–Shafer theory is the frame of discernment (symbolized by $\Theta$), defined as an exhaustive set of mutually exclusive propositions. In diagnosis, the frame of discernment is the union of all possible fault hypotheses. Each piece of collected evidence can be mapped to a fault or group of faults within $\Theta$. The likelihood of a fault proposition $A$ is expressed as a bounded interval $[s(A), p(A)]$ which lies in [0, 1]. The parameter $s(A)$ represents the support for $A$, which measures the weight of evidence in support of $A$. The other parameter, $p(A)$, called the plausibility of $A$, is defined as the degree to which contradictory evidence is lacking. Plausibility measures the maximum amount of belief that can possibly be assigned to $A$. The quantity $u(A)$ is the uncertainty of $A$, which is the difference between the evidential plausibility and support. For example, an evidence interval of [0.3, 0.7] for proposition $A$ indicates that the probability of $A$ is between 0.3 and 0.7, with an uncertainty of 0.4.

For diagnosis, proposition $A$ represents a given fault hypothesis. An evidence interval for fault $A$ is determined from a basic probability mass distribution (BPMD). The BPM $m(\cdot)$ indicates the portion of the total belief in evidence assigned to a particular fault hypothesis set. Any residual belief in the frame of discernment that cannot be attributed to any subset of $\Theta$ is assigned directly to $\Theta$ itself, which in effect introduces uncertainty into the diagnosis. Using this framework, the support and plausibility of proposition $A$ are given by

$$s(A) = \sum_{i} m(A_i) \tag{1}$$
$$p(A) = 1 - \sum_{i} m(B_i) \tag{2}$$

where $A_i \subseteq A$ and $B_i \subseteq \bar{A}$, and the summation is taken over all propositions in a given BPM. Thus the total belief in $A$ is the sum of support ascribed to $A$ and all subsets thereof.

Dempster's rules for evidence combination provide a deterministic and unambiguous method of combining BPMD's from separate and distinct sources of evidence contributing varying degrees of belief to several propositions under a common frame of discernment. The rule for combining the observed BPM's of two arbitrary and independent knowledge sources $m_1$ and $m_2$ into a third $m_3$ is as follows:

$$m_3(Z) = \frac{\sum m_1(X_i)\, m_2(Y_j)}{1 - k} \tag{3}$$

where $X_i \cap Y_j = Z$, and

$$k = \sum m_1(X_i)\, m_2(Y_j) \tag{4}$$

where $X_i \cap Y_j = \emptyset$. Here $X_i$ and $Y_j$ represent various propositions which consist of fault hypotheses and disjunctions thereof. Thus, the BPM of the intersection of $X_i$ and $Y_j$ is the product of the individual BPM's of $X_i$ and $Y_j$. The factor $(1 - k)$ is a normalization constant which prevents the total belief from exceeding unity due to attributing portions of belief to the empty set.

Consider the combination of $m_1$ and $m_2$ when each contains different evidence concerning the diagnosis of a malfunction in the RIE application. Such evidence could result from two different sensor readings, for example. In particular, suppose that the sensors have observed that the flow of one of the etch gases into the process chamber is too low. Let the frame of discernment $\Theta = \{A, B, C, D\}$, where $A$ through $D$ symbolically represent the following mutually exclusive equipment faults:
$A$ = mass flow controller miscalibration;
$B$ = gas line leak;
$C$ = throttle valve malfunction;
$D$ = incorrect sensor signal.
These components are illustrated graphically in the partial schematic of the etcher gas flow system shown in Fig. 1.

Fig. 1. Partial schematic of RIE gas delivery system.

Suppose that belief in this frame of discernment is distributed according to the BPMD's: […]

The calculation of the combined BPMD ($m_3 = m_1 \oplus m_2$) is shown in Table I. Each cell of the table contains the intersection of the
`
corresponding propositions from $m_1$ and $m_2$, along with the product of their individual beliefs. Note that the intersection of any proposition with $\Theta$ is the original proposition. The BPM attributed to the empty set, $k$, which originates from the presence of various propositions in $m_1$ and $m_2$ whose intersection is empty, is 0.11. By applying (3), BPM's for the remaining propositions result in […]

The plausibilities for the individual propositions in the combined BPM are calculated by applying (2). The individual evidential intervals implied by $m_3$ are […]. Combining the evidence available from knowledge sources $m_1$ and $m_2$ thus leads to the conclusion that the most likely cause of the insufficient gas flow malfunction is a miscalibration of the mass flow controller (proposition $A$).

TABLE I
ILLUSTRATION OF BPMD COMBINATION
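The combination rule in (3) and (4), together with the support and plausibility in (1) and (2), can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`combine`, `support`, `plausibility`) and the mass values in `m1` and `m2` are hypothetical and are not the values of Table I. Propositions are represented as frozensets of fault labels.

```python
from itertools import product

THETA = frozenset("ABCD")  # frame of discernment: four mutually exclusive faults


def combine(m1, m2):
    """Dempster's rule: return the combined BPMD and the conflict mass k."""
    combined = {}
    k = 0.0
    for (x, mass_x), (y, mass_y) in product(m1.items(), m2.items()):
        z = x & y                      # intersection of the two propositions
        if z:
            combined[z] = combined.get(z, 0.0) + mass_x * mass_y
        else:
            k += mass_x * mass_y       # belief that would fall on the empty set
    if k >= 1.0:
        raise ValueError("sources are in total conflict")
    return {z: v / (1.0 - k) for z, v in combined.items()}, k


def support(m, a):
    """Sum of the masses of all propositions contained in a."""
    return sum(v for x, v in m.items() if x <= a)


def plausibility(m, a, theta=THETA):
    """One minus the support for the complement of a."""
    return 1.0 - support(m, theta - a)


# Hypothetical masses chosen only for illustration (not taken from the paper).
m1 = {frozenset("A"): 0.4, frozenset("AB"): 0.3, THETA: 0.3}
m2 = {frozenset("A"): 0.5, frozenset("C"): 0.2, THETA: 0.3}
m3, k = combine(m1, m2)
a = frozenset("A")
print(round(k, 2), round(support(m3, a), 3), round(plausibility(m3, a), 3))
```

Assigning the residual mass to `THETA` in each source mirrors the way uncommitted belief is assigned to the frame of discernment itself.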
`
`III. NEURAL NETWORK-BASED RIE MODELING
`Neural networks have the capability of learning complex
`relationships between groups of related parameters. They
`consist of parallel processing units (called neurons), which
`are interconnected in such a way that knowledge is stored
`in the weight of the connections between them. Each neuron
`contains the weighted sum of its inputs filtered by a sigmoidal
`activation function. The nonlinear mapping capabilities of
`neural networks have recently been applied by several other
`researchers in semiconductor process modeling [10]–[13]. To
`model the RIE process, the quantitative relationships which
`relate input parameters to output responses have been encoded
`in feed-forward neural networks via the error back-propagation
`(BP) algorithm [14]. The structure of a typical BP network
`appears in Fig. 2. The specific manner in which BP neural
`nets are used in RIE diagnosis is described below.
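As a minimal sketch of the neuron computation described above (a weighted sum of inputs passed through a sigmoidal activation), the forward pass of a small feed-forward network might look as follows. The names `neuron_output` and `forward`, the layer representation, and the bias term are assumptions made for illustration; the source does not specify them, and training by back-propagation is not shown.

```python
import math


def sigmoid(z):
    """Logistic (sigmoidal) activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs, filtered by the sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def forward(inputs, layers):
    """Propagate inputs through layers; each layer is a list of (weights, bias)."""
    activations = list(inputs)
    for layer in layers:
        activations = [neuron_output(activations, w, b) for w, b in layer]
    return activations


# Hypothetical 2-2-1 network with arbitrary weights.
layers = [
    [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)],  # hidden layer
    [([1.0, -1.0], 0.0)],                      # output layer
]
print(forward([1.0, 2.0], layers))
```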
`
`A. Time Series Modeling
`For real-time diagnosis, it is critical to model the variation
`of in-situ sensor data and develop an efficient method for
`handling this voluminous and multidimensional data. Time-
`series modeling is a means to achieve each of these ends.
`Under malfunction conditions, sensor readings can serve as
`process “signatures” which assist in identifying the occurrence
`of a particular fault. Recently, neural networks have been
`proposed as a means to develop time series models of tool
`data [15].
`
`Fig. 2. Typical back-propagation neural network.
`
`Fig. 3. Data signatures for a malfunctioning CHF3 mass flow controller.
`
`Neural networks used to generalize the behavior of a time
`series are referred to as neural time series (NTS) models. The
`NTS model is capable of simultaneously filtering both auto-
`and cross-correlated data. That is, the NTS model can account
`for correlation among several variables being monitored simul-
`taneously. To illustrate, real-time tool data was collected via an
`equipment monitoring system designed to transfer data from an
`etcher to a remote workstation. Monitoring was accomplished
`using a Tektronix Model 2510 TestLab data acquisition system
`interfaced to the Plasma Therm RIE system via serial ports. In
this example, an equipment alarm was signaled, and its cause was later identified to be an insufficient gas supply from the trifluoromethane (CHF3) mass flow controller. Fig. 3 depicts the malfunctioning behavior of the CHF3 gas flow.
An NTS network was trained to model the CHF3 flow pattern in the RIE process using a simple sampling technique which involved training the network to forecast the next CHF3 value from the behavior of five past values. The training set for the NTS network consisted of one out of every ten data samples. As shown in Fig. 4, auto-correlation among consecutive CHF3 measurements was accounted for by simultaneously training the network on the present value of CHF3 and five past values. Cross-correlation between CHF3 and the other monitored variables was modeled by including as inputs to the network the present values of the temperature, incident and reflected RF power, oxygen and CHF3. The accuracy of the trained network was measured by its root-mean-squared (RMS) error, which was 2.2%. Once
`
`
Fig. 4. Inputs (auto and cross-correlated data) and output of a neural time-series model.
`
`trained, the NTS model provides a simple means to encode
`this fault signature for later use.
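The sampling scheme described for the NTS model (five past values plus the present readings as inputs, the next flow value as the target, and one training pattern per ten samples) can be sketched as follows. The function name `make_training_pairs`, the argument layout, and the synthetic data are hypothetical; the network training itself is not shown.

```python
def make_training_pairs(flow, others, n_past=5, stride=10):
    """Build (input, target) pairs for a neural time-series model.

    flow   -- list of CHF3 flow readings, one per time step
    others -- list of tuples with the other sensor readings at the same
              time steps (e.g., temperature, incident and reflected RF
              power, oxygen flow)
    """
    pairs = []
    # Stop one step early so that a "next" value always exists as the target.
    for t in range(n_past, len(flow) - 1, stride):
        past = list(flow[t - n_past:t])        # five past CHF3 values
        present = [flow[t]] + list(others[t])  # present CHF3 and other sensors
        target = flow[t + 1]                   # next CHF3 value to forecast
        pairs.append((past + present, target))
    return pairs


# Synthetic data used only to show the shapes involved.
flow = [float(i) for i in range(100)]
others = [(0.0, 0.0, 0.0, 0.0)] * 100
pairs = make_training_pairs(flow, others)
print(len(pairs), pairs[0])
```

Each input vector here has ten entries (five past values, the present flow, and four other sensor readings); the exact input ordering used by the authors is not given in the text.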
`
`B. Process Modeling
`Diagnostic information can also be extracted from in-line
`measurements of post-processed wafers. To achieve this, these
`measurements must be compared to values predicted by a
`process model. Differences between model predictions and
`measured responses are indicative of potential equipment
`malfunctions. In [5], neural network models of RIE responses
`were developed from a Box–Wilson central composite circum-
`scribed design requiring 27 trials [16]. Etching was performed
`on a test structure designed to facilitate the simultaneous
measurement of the etch rate, uniformity, and anisotropy of SiO2
in a CHF3 and oxygen plasma, as well as the selectivity of
the SiO2 etch with respect to photoresist. This characterization
`experiment provided neural network training data.
`A “forward” neural network-based process model defines
`a functional relationship between RIE process conditions (in-
`puts) such as RF power or gas composition and responses
`(outputs) such as etch rate or uniformity. The forward process
`model also provides a mechanism for comparing measured
`RIE output responses to predicted values. Large differences,
`which may indicate potential equipment faults, must then be
`traced back to fluctuations in model input parameters. To
`calculate the shift of the process input settings from their
`nominal values, an inverse neural process model is employed.
`This inverse model is obtained by training the network “in
`reverse” (i.e., using output/input pairs, rather than input/output
`pairs). The inverse model provides a means to identify the
`input parameter which is most likely responsible for an output
`process shift. Process shifts required for generating evidential
`support and plausibility can then be computed by utilizing the
`inverse neural process models.
`
`IV. GENERATION OF EVIDENTIAL SUPPORT
`The three relevant time periods for evidence collection in
`semiconductor manufacturing are:
`1) during equipment maintenance periods (before process-
`ing);
`2) during on-line equipment operation (during processing);
`3) during in-line, post-process physical and/or electrical
`inspection (after processing).
`
`Fig. 5. Chronological evidence sources for equipment malfunction diagno-
`sis.
`
`Diagnosis based on this framework for evidence collection
`takes place in three chronological stages (Fig. 5). Maintenance
`diagnosis is performed by examining the relevant historical
`records of equipment performance and building reliability
models of each equipment component. During on-line diagnosis, both neural time-series models and CUSUM control chart [17] techniques are employed to analyze fault patterns available from the equipment monitoring system. For in-line diagnosis, measurements on processed wafers are used in
`conjunction with neural network process models. In each
`phase, evidential support and plausibility for various fault
`hypotheses are generated and mapped to particular equipment
`components. The methodology employed to do so is discussed
`below.
`
`A. Maintenance Diagnosis
`During the maintenance phase, the objective of the diag-
`nostic system is to derive evidence of potential component
`failures based on the historical performance of equipment
`components. The data available from which evidential belief
`may be generated is limited, consisting of only the number of
`failures a given component has experienced and the compo-
`nent age. In order to derive evidential support for potential
`malfunctions from this information, a reliability modeling
`technique has been developed to investigate the aging behavior
`of components. The failure probability as a function of time
`and the instantaneous failure rate (or “hazard” rate) for each
`component may be estimated from a neural network trained on
failure history. The neural reliability model may then be used to generate evidential support, plausibility and uncertainty for each fault hypothesis (i.e., each potentially faulty component) in the frame of discernment.
`1) The Weibull Distribution: Consider reliability model-
`ing based on the Weibull distribution. The Weibull distribution
`has been used extensively as a model of time to failure in
`electrical and mechanical components and systems. Examples
`of systems which lend themselves to the Weibull model include
`electrical components such as batteries and ceramic multilayer
`capacitors, mechanical systems such as gas turbine engines,
`semiconductor devices such as memory circuits, individual
`
`
mechanical parts such as bearings, and structural elements in aircraft and automobiles [18]. When a system is composed of a number of components and failure is due to the most serious of a large number of possible faults, the Weibull distribution seems to be a particularly accurate model [17], and this closely resembles the situation being addressed in semiconductor equipment malfunction diagnosis.

The cumulative distribution function (which represents the failure probability of a component at time $t$) for the two-parameter Weibull distribution is given by

$$F(t) = 1 - \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right] \tag{5}$$

where $\alpha$ and $\beta$ are called scale and shape parameters, respectively. If a device exhibits Weibull-like reliability behavior, the appropriate selection of $\alpha$ and $\beta$ will allow this distribution function to closely approximate the observed failure behavior throughout its lifetime. The Weibull hazard rate, $\lambda(t)$, is given by

$$\lambda(t) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta - 1} \tag{6}$$

The hazard rate may be computed from the failure history of each component by plotting the number of failures versus time and finding the slope of this curve at each time point.

A scheme designed to extract the shape and scale parameters using neural networks has been developed and tested, and is outlined in [18]. This scheme is depicted schematically in Fig. 6. Here the network outputs represent the initially unknown scale and shape parameters. These outputs are iteratively adjusted to reach their optimal values as the neural network learns. The outputs are fed into the failure hazard function in (6), a predicted hazard rate ($\hat{\lambda}$) is computed, and the result is compared with the actual hazard rate ($\lambda$), which has been computed from the failure history data.

Fig. 6. Scheme to estimate Weibull function parameters.

The standard back-propagation training algorithm for feed-forward neural networks begins with a random set of weights. An input vector which has been normalized so that all input data lies in the interval between −1 and 1 is then presented to the network, and the output is calculated using this initial weight matrix. Next, the calculated output vector is compared to the measured output vector, and the squared difference between the two is used to determine the system error. Error minimization is accomplished via the gradient descent approach, in which the weights are adjusted in the direction of decreasing error.

The error signal ($E$) for the modified back-propagation neural network in this case is

(7)

Since the predicted hazard rate is differentiable, the error gradient with respect to the network weights ($w$) may be computed using the chain rule as

$$\frac{\partial E}{\partial w} = \frac{\partial E}{\partial \hat{\lambda}} \, \frac{\partial \hat{\lambda}}{\partial o_{j,k}} \, \frac{\partial o_{j,k}}{\partial w} \tag{8}$$

where $o_{j,k}$ is the calculated output of the $j$th neuron in the $k$th layer. The first partial derivative in (8) is […], and the third is the same as in the standard implementation of the back-propagation algorithm [14]. As for the second factor, this partial derivative may be computed separately for each individual output neuron (or equivalently, for each unknown parameter to be estimated). Due to the initially random network weights, the first predicted values of the hazard rate are arbitrary. However, after several training iterations, the predicted hazard rate converges to the actual rate. At this point, the scale and shape parameters computed at the network output are the estimates which best fit the distribution indicated by the training data.

Following parameter estimation using this technique, the evidential support for each equipment component is then obtained from the Weibull distribution function in (5) with the estimated parameters. The corresponding plausibility is the confidence level associated with this probability estimate, which is defined as [19]

(9)

where $n$ denotes the total number of component failures which have been observed at time $t$.

2) The Exponential Distribution: Although the Weibull distribution provides one approach to component reliability modeling, due to its simple functional form, the exponential distribution is widely used to describe the time elapsing between two failures by characterizing the period during which a failure rate is constant [17]. The cumulative distribution function of this distribution is

$$F(t) = 1 - \exp(-\lambda t) \tag{10}$$

where $\lambda$ is a constant equal to the reciprocal of the mean-time-to-failure. This parameter may be estimated as

$$\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} t_i} \tag{11}$$

where $t_i$ represents the elapsed time between the $i$th and $(i+1)$th failure of a specific component. The evidential support is obtained by inserting (11) into (10) and subsequently computing the corresponding Dempster–Shafer plausibility by using (9).
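For reference, the standard two-parameter Weibull distribution function and hazard rate, and the exponential model with its rate taken as the reciprocal of the mean time between failures, can be evaluated as in the sketch below. The function names and the sample intervals are hypothetical, and the neural parameter estimation scheme of Fig. 6 is not reproduced here; the scale and shape values are simply passed in.

```python
import math


def weibull_cdf(t, scale, shape):
    """Failure probability at time t for a two-parameter Weibull model."""
    return 1.0 - math.exp(-((t / scale) ** shape))


def weibull_hazard(t, scale, shape):
    """Instantaneous failure (hazard) rate of the Weibull model."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def exponential_rate(intervals):
    """Constant failure rate: reciprocal of the mean time between failures."""
    return len(intervals) / sum(intervals)


def exponential_cdf(t, rate):
    """Failure probability at time t for a constant failure rate."""
    return 1.0 - math.exp(-rate * t)


# Hypothetical times between failures (in hours) for one component.
intervals = [100.0, 200.0, 300.0]
rate = exponential_rate(intervals)
print(rate, exponential_cdf(150.0, rate))
print(weibull_cdf(150.0, scale=200.0, shape=1.5))
```

With a shape parameter of 1 the Weibull model reduces to the exponential model with rate `1 / scale`, which is a quick sanity check on both functions.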
`
`
`TABLE II
`SAMPLE FAULT RANKING AFTER MAINTENANCE DIAGNOSIS
`
3) The Normal Distribution: The normal distribution is a symmetrical two-parameter distribution characterized by the mean and the standard deviation. Its extensive popularity in statistical applications stems from the fact that many measurements exhibit normal behavior. The normal cumulative distribution function is

$$F(t) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{t} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right] dx \tag{12}$$

The normal parameters in (12) may be estimated by using maximum likelihood estimation. The resultant estimates of the mean and the variance, which are denoted as $\hat{\mu}$ and $\hat{\sigma}^2$, respectively, are [17]

$$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} t_i \tag{13}$$

$$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (t_i - \hat{\mu})^2 \tag{14}$$

The evidential belief in this case is computed from (12). The corresponding Dempster–Shafer plausibility is then calculated from (9).
In this diagnostic system, the user is able to select which probability distribution best describes the reliability behavior of each component. Applying this methodology to the Plasma Therm RIE system yields a ranked list of component faults similar to that shown in Table II.
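A minimal sketch of the normal-distribution case is given below, assuming the maximum likelihood estimates take their usual form (sample mean and the variance with divisor $n$). The function names and the failure times are hypothetical; the cumulative distribution function is evaluated through the error function from the Python standard library.

```python
import math


def normal_mle(times):
    """Maximum likelihood estimates of the mean and variance."""
    n = len(times)
    mu = sum(times) / n
    var = sum((t - mu) ** 2 for t in times) / n  # divisor n, not n - 1
    return mu, var


def normal_cdf(t, mu, var):
    """Normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf((t - mu) / math.sqrt(2.0 * var)))


# Hypothetical failure times (in hours) for one component.
times = [90.0, 100.0, 110.0]
mu, var = normal_mle(times)
print(mu, var, normal_cdf(105.0, mu, var))
```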
`
`B. On-Line Diagnosis
`1) Recognized Fault Case: In diagnosing faults previ-
`ously encountered by the system, NTS models are used
`to describe raw tool data indicating specific fault patterns.
The resemblance between stored NTS fault models and the pattern currently under examination is then quantified. The underlying assumption is that an equipment malfunction is triggered by an inadvertent process shift in one of the process settings. This shift is assumed
`to be larger than the variability inherent in the processing
`equipment. To ascribe evidential support and plausibility
`to such a shift, statistical hypothesis tests are applied to
`sample means and variances of the time series data. This
`again requires the assumption that the notion of statistical
`confidence is analogous to the Dempster-Shafer concept of
`plausibility [9]. Equivalently, this means that the significance
`of a given statistical hypothesis test is equal to one minus the
`plausibility of a given event.
`2) Hypothesis Test on Pattern Means: To compare two
`(potentially faulty) data patterns, we first make the assumption
`that if the two patterns are the same (or similar), then their
`means and variances are both similar. Further, it is assumed
`that an equipment malfunction may cause either a shift in
`the mean or variance of a given sensor signal. To apply a
`hypothesis test to assess the similarity of two data patterns,
`the mean and variance of each pattern must first be computed.
We begin by testing the hypothesis that the mean value ($\bar{x}_1$) of the current fault pattern equals the mean ($\bar{x}_2$) of previously stored fault patterns. The quantity $\bar{x}_2$ is calculated using the predictions from the stored NTS fault models. Letting $s_1^2$ and $s_2^2$ be the sample variances of the current pattern and stored pattern, the appropriate test statistic is the $t$-statistic [17], which is given by

(15)

where $n_1$ and $n_2$ are the sample sizes for the current and stored pattern, respectively. The statistical significance level for this hypothesis test ($\alpha_1$) satisfies the relationship: $t_0 = t_{\alpha_1, \nu}$, where $t_0$ is the statistic in (15) and $\nu$ is the number of degrees of freedom. After the significance level has been computed, the probability that the mean values of the two data patterns are equal ($\beta_1$) is equal to $1 - \alpha_1$.

3) Hypothesis Test on Pattern Variances: Under the assumption that two similar data patterns will have similar variances, we may also test the hypothesis that the variance of the fault pattern currently being analyzed ($\sigma_1^2$) equals the variance of each of the stored patterns ($\sigma_2^2$). The appropriate test statistic is now the $F$-ratio [17], defined as

$$F_0 = \frac{s_1^2}{s_2^2} \tag{16}$$

where $s_1^2$ and $s_2^2$ are the sample variances for the current fault pattern and the stored patterns, respectively. The statistical significance level for this hypothesis test ($\alpha_2$) satisfies the relationship

$$F_0 = F_{\alpha_2, \nu_1, \nu_2} \tag{17}$$

where $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$ are the degrees of freedom for $s_1^2$ and $s_2^2$, respectively. The parameters $n_1$ and $n_2$ are the numbers of samples collected for the current fault pattern and the stored patterns, respectively. The resultant probability of equal variances is $\beta_2 = 1 - \alpha_2$.
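The two test statistics can be computed from raw samples as in the sketch below. The $F$-ratio follows (16) directly. For the $t$-statistic, the unpooled two-sample form is assumed here purely for illustration, since the exact expression in (15) is not reproduced in this text. The function names and sample data are hypothetical, and converting the statistics to significance levels requires $t$ and $F$ distribution tables or a statistics library, which is not shown.

```python
import math
from statistics import mean, variance


def t_statistic(current, stored):
    """Two-sample t-statistic (unpooled form, assumed for illustration)."""
    n1, n2 = len(current), len(stored)
    s1_sq, s2_sq = variance(current), variance(stored)  # sample variances
    return (mean(current) - mean(stored)) / math.sqrt(s1_sq / n1 + s2_sq / n2)


def f_ratio(current, stored):
    """Ratio of the sample variances of the current and stored patterns."""
    return variance(current) / variance(stored)


# Hypothetical sensor patterns.
current = [1.0, 2.0, 3.0, 4.0, 5.0]
stored = [2.0, 4.0, 6.0, 8.0, 10.0]
print(t_statistic(current, stored), f_ratio(current, stored))
```

Identical patterns give a $t$-statistic of 0 and an $F$-ratio of 1, the values at which the hypotheses of equal mean and equal variance are most strongly supported.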
`
`
After the two hypothesis tests for equal mean and variance have been completed, the evidential support and plausibility that the current sampled pattern is similar to a previously stored pattern are defined as

Support
Plausibility
(18)

Using the rules of evidence combination outlined above, the support and plausibility of a particular pattern generated at each time point are integrated with their prior values for each sensor, thereby updating them continuously and providing the mechanism for real-time diagnosis.
To demonstrate this approach, sensor data corresponding to the faulty CHF3 flow shown in Fig. 3 was used to derive an NTS model for this fault pattern. The training set for the NTS models consisted of one out of every ten data samples. The NTS fault model is assumed to be stored in a database, from which it is compared to other patterns collected by sensors in real-time so that the similarity of the sensor data to this stored pattern can be evaluated. In this example, the pattern of CHF3 flow under consideration as a potential match to the stored fault pattern was sampled once for every 15 sensor data points. Using the sample mean and variance of each pattern, $t_0$ and $F_0$ were calculated using (15) and (16). After evaluating the data, the evidential support and plausibility for pattern similarity are shown in Fig. 7.

Fig. 7. Plot of real-time support and plausibility for a recognized gas flow fault.

4) Unrecognized Fault Case: In order to identify malfunctions which have not been previously encountered, May and Spanos established a technique based on the CUSUM control chart [9]. The technique allows the detection of very small process shifts, which is critical for fabrication steps such as reactive ion etching, where slight equipment miscalibrations may only have sufficient time to manifest themselves as small shifts when the total processing time is on the order of minutes. CUSUM charts monitor such shifts by comparing the cumulative sums of the deviations of the sample values from their targets. This is accomplished by means of the moving "V-mask" (see Fig. 8).

Fig. 8. Typical CUSUM control chart depicting the V-mask and scaling parameters.

Using this approach to generate evidential support requires the cumulative sums

$$S_H(i) = \max\left[0,\; \bar{x}_i - (\mu_0 + K) + S_H(i-1)\right] \tag{19}$$
$$S_L(i) = \max\left[0,\; (\mu_0 - K) - \bar{x}_i + S_L(i-1)\right] \tag{20}$$

where $S_H$ is the sum used to detect positive process shifts, $S_L$ is the sum used to detect negative shifts, $\bar{x}_i$ is the mean value of the current sample, and $\mu_0$ is the target value. The initial value of both $S_H$ and $S_L$ is set to zero. Both sums accumulate deviations from the target value greater than $K$, and both reset to zero upon becoming negative. The parameter $K$ is given by

$$K = 2\sigma \tan\theta \tag{21}$$

where $\sigma$ is the standard deviation of the sampled variable and $\theta$ is the aspect angle of the V-mask. This angle has been selected to detect one-sigma process shifts with 95% probability. The chart has an average run length of 50 wafers between alarms when the process is in control [17].
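The two one-sided cumulative sums can be sketched as a simple loop. This is a minimal illustration of the tabular CUSUM recursion, not the authors' implementation: the function name `tabular_cusum`, the slack value `k`, the alarm threshold `h`, and the sample data are all hypothetical and supplied by the caller.

```python
def tabular_cusum(sample_means, target, k, h):
    """Track the upper and lower cumulative sums and the first alarm index.

    sample_means -- sequence of sample mean values
    target       -- target value of the monitored variable
    k            -- slack: only deviations beyond target +/- k accumulate
    h            -- alarm threshold applied to either sum
    """
    s_hi, s_lo = 0.0, 0.0              # both sums start at zero
    history = []
    first_alarm = None
    for i, x in enumerate(sample_means):
        # Each sum resets to zero instead of going negative.
        s_hi = max(0.0, x - (target + k) + s_hi)
        s_lo = max(0.0, (target - k) - x + s_lo)
        history.append((s_hi, s_lo))
        if first_alarm is None and (s_hi > h or s_lo > h):
            first_alarm = i
    return history, first_alarm


# Hypothetical data: in control for five samples, then a small upward shift.
data = [10.0] * 5 + [10.6] * 10
history, first_alarm = tabular_cusum(data, target=10.0, k=0.25, h=2.0)
print(first_alarm, history[first_alarm])
```

Because the small shift accumulates sample after sample, the alarm is raised several samples after the shift begins even though no single reading is far from the target.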
When either $S_H$ or $S_L$ exceeds the decision interval ($H$), this signals that the process has shifted out of statistical control. The decision interval is

$$H = 2 d \sigma \tan\theta \tag{22}$$

where $d$ is the V-mask lead distance. The decision interval may be used as the process tolerance limit and the sums $S_H$ and $S_L$ are to be treated as measurement residuals. Thus, numerical support is derived from the
