`
`VI- 5
`
`EQUIPMENT ANALYSIS AND WAFER PARAMETER PREDICTION USING
`REAL-TIME TOOL DATA
`
`Sherry F. LEE and Costas J. SPANOS
`
Department of Electrical Engineering & Computer Sciences
University of California, Berkeley, CA 94720
`
We propose a system which uses real-time equipment sensor signals to automatically detect and analyze
semiconductor equipment faults, and evaluate the impact of the fault on the wafer parameters. The system,
`which has been applied on plasma processes, consists of three modules: (1) fault detection, (2) fault analysis,
`and (3) prediction of final wafer parameters such as etch rate, uniformity, selectivity, and anisotropy.
`
`1.0 Introduction
To compete in today's semiconductor industry, companies must continuously improve
their manufacturing skills to both maintain high yield and reduce the cost of
ownership of the equipment on the manufacturing line. A key element required to
achieve these goals is to monitor the equipment to ensure that the semiconductor
wafers are processed properly at each step. Measuring each wafer after it completes
each step, however, is especially difficult in semiconductor factories producing
chips with over 100 manufacturing steps. Moreover, due to throughput requirements,
each wafer processed in each machine cannot be measured individually. Present
practice is to measure monitor wafers periodically, perhaps at the start of each
work shift, after performing maintenance, or after changing the machine settings.
Unfortunately, monitor wafers give no guarantee that subsequent production wafers
will be processed properly. Thus, instead of detecting equipment faults causing
yield loss early in the process flow, yield loss is usually found at the very end
of the processing line.
`We propose a novel system which uses equipment sensor
`signals to automatically detect equipment faults in real-time,
`and analyze and evaluate the effect of the fault on wafer
`parameters on a run-to-run basis. The three modules of the
`system are (1) detection of equipment malfunctions, (2) anal-
`ysis (classification) of equipment faults, and (3) the predic-
`tion of output wafer parameters (Figure 1). The system
`impacts the semiconductor fabrication line by reducing the
`scrap produced by the equipment, reducing the down-time,
`and reducing the mean-time-to-repair. The result is a reduc-
`tion in the overall cost of ownership of the equipment.
This general methodology is verified on a plasma etcher, one of the costliest
pieces of equipment in the semiconductor fabrication line. Not only is the etcher
usually a bottleneck piece of equipment, it is difficult to control because it is
not well understood. Most importantly, each etcher can generate up to $100,000
worth of scrap per hour. The plasma etcher used in this work is a Lam Rainbow 4400
polysilicon etcher.
In this paper, the designed experiment is described first. A discussion of the
fault detection and analysis modules is next, followed by a description of the
wafer parameter prediction module.
`
Figure 1 Proposed system (legible block labels: Equipment, Real-time, Analysis)
`
`2.0 Designed Experiment
This section describes the experiment conducted to obtain the real-time data sets
used to develop and verify the system. First, the wafer test structure is briefly
described, followed by a discussion of both the training and the prediction
experiments. Finally, the measurements taken on each wafer are described along
with a discussion of the real-time signals collected during wafer processing.
`
`2.1 Test Structure
The test structure was designed so all processes of interest are simultaneously
obtained in the same etch step. Due to complex loading effects, this method results
in more accurate etch rates and selectivities than etching blanket wafers
individually1). A simplified view of the test structure indicating the etched
surfaces is shown in Figure 2. First, a 600 Å thermal gate oxide is grown on the
4-inch wafers, followed by 5500 Å of n+ doped polysilicon, deposited via low
pressure chemical vapor deposition. After a 20 minute nitrogen anneal at 950 °C,
2800 Å of undoped low temperature oxide (LTO) is deposited by chemical vapor
deposition. A three-step mask process is required to build the test structure.
`
`- 133 -
`
`Authorized licensed use limited to: LEHIGH UNIVERSITY. Downloaded on July 12,2021 at 04:22:28 UTC from IEEE Xplore. Restrictions apply.
`
`Applied Materials, Inc. Ex. 1026
`Applied v. Ocean, IPR Patent No. 6,836,691
`Page 1 of 4
`
`
`
`2.2 Training and Prediction Experiments
In both the training and prediction experiments, a fixed pre-etch recipe was used
for all runs. The main etch recipe was modified according to a designed experiment
described below. To obtain accurate etch rates, the main etch was a timed etch, so
no overetch was performed. The input parameters varied in the experiment are the
chamber pressure, RF forward power, electrode gap spacing, the ratio of Cl2 to He,
and the total gas flow of Cl2 and He. Because the ratio and total gas flows are
more significant to the etch results, they were varied in the experiment instead
of the individual gas flows. The output wafer parameters of interest are the etch
rate of polysilicon, selectivity of polysilicon to oxide and I-line positive
photoresist, polysilicon wafer uniformity, and anisotropy.
`
Figure 2 Test structure for the experiment (legible labels: 2800 Å LTO, 5500 Å
polySi, 600 Å gate oxide, PR, 0.9 µm).
`
`2.2.1 Training Experiment
The training experiment consisted of two phases. Phase I, the variable screening
stage, determined the statistically significant variables in the models. Phase II
assessed the quadratic nature of the system via a star design. The input values
used for all experiments are listed in Table 1, in terms of percent offset from
the nominal values. The particular values were chosen to cover a wide range of
operating conditions of the machine. Of the 37 runs in both phases of the training
experiment, including replicated runs, 10 were eliminated before modeling due to
unstable real-time signals or misprocessing.
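As a rough illustration of the two-phase design described in this section, the sketch below generates a 16-run 2^(5-1) fractional factorial with 4 center points (phase I) and two star points per variable plus 2 center points (phase II), all in coded units. The generator E = ABCD, the star distance `alpha`, and every function name are assumptions made for the example; the source states only the run counts and the resolution of the design.

```python
from itertools import product

# Coded factor order (names are illustrative labels for the five inputs).
FACTORS = ["pressure", "power", "gap", "flow_ratio", "total_flow"]

def half_fraction():
    """16-run 2^(5-1) design: full factorial in four factors, fifth = product."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        e = a * b * c * d  # assumed generator E = ABCD (gives resolution V)
        runs.append((a, b, c, d, e))
    return runs

def star_points(alpha=1.0):
    """Two axial points per factor; all other factors stay at the center (0)."""
    runs = []
    for i in range(len(FACTORS)):
        for level in (-alpha, alpha):
            point = [0.0] * len(FACTORS)
            point[i] = level
            runs.append(tuple(point))
    return runs

def center_points(n):
    """n replicated runs at the nominal (coded 0) settings."""
    return [(0.0,) * len(FACTORS)] * n

phase1 = half_fraction() + center_points(4)   # screening runs
phase2 = star_points() + center_points(2)     # runs added for quadratic terms
print(len(phase1), len(phase2))  # 20 12
```

The coded levels -1, 0, and +1 would then be mapped to the percent offsets from nominal listed in Table 1.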
`
Table 1: Change in % From Nominal

Parameter     Phase I    Phase II    Prediction
Pressure      15% (*)    22.5%       10% (*)
Power         (*)        22.5%       (*)
Flow Ratio    19%        23%         10%
Total Flow    11%        22%         10%

(*) Only one value is legible in the source for the Pressure and Power rows of
this column.

Phase I consists of a two-level, 16-run 2^(5-1) fractional factorial design and 4
center points. The design is resolution V with no blocking, but drops to
resolution III when blocked for time and split lots. The design is essentially
resolution V because blocking was not a factor in any of the phase I response
surface models. Assuming that four-factor interactions are negligible, this
experiment provides a good estimate of the main effects.
The variable screening analysis was performed by building models from the phase I
data. The statistical significance of each parameter was determined via the
Student's t-test at the 0.05 significance level. Results of the analysis show that
although the input settings are not all statistically significant in every model,
all are required to model the four output characteristics of interest.
Additional runs were performed in phase II to estimate the quadratic behavior of
the system. The models are limited to quadratic terms to limit complexity. The
phase II runs consisted of center points and “star” points, arranged symmetrically
along the axis of each variable2). Two star points were run for each variable. Two
center points were also run, for a total of 12 additional runs.

2.2.2 Prediction Experiment
The purpose of the prediction experiment is to collect another data set used to
simulate equipment faults and test the prediction capability of the models. The
prediction experiment was run approximately four weeks after the phase II
experiment. The input settings for this experiment were varied one at a time.

2.3 Wafer Measurements
In both experiments, film thickness measurements were taken by a Nanometrics
Nanospec AFT system on 9 die per wafer. Four points were measured on the outer
perimeter of the wafer, four were measured half-way from the edge of the wafer,
and one point was measured at the center. Measurement error was approximately
10 Å. The Alphastep 200 Automatic Step Profiler was used to confirm the Nanospec
measurements. Film thicknesses were measured before and after etching; etch rates
at each measured point are calculated by subtracting the post-etch from the
pre-etch measurements, and dividing by the etch time. Wafer etch rates are
averaged over the 5 inner points. Uniformity is calculated by scaling the
difference between the etch rates of the outer and the inner rings by the etch
rate of the inner ring.

2.4 Real-time Data
The real-time data collected from the plasma etcher consist of various electrical
and mechanical signals. Between six and thirteen signals are collected. Six
signals are collected via a Comdel Real Power Monitor (RPM-I), placed directly
above the upper electrode3). The remaining seven signals are collected via the
LamStation software, which reads the signals from the SECS-II serial port on the
etcher4).
`
Because a number of the measurements are related electrically or mechanically,
many of the signals are highly correlated. A few signals are collected from
different places in the equipment by the two different monitoring systems.
Although correlated, these signals are not identical. The important signals
monitored are RF Power, RF Voltage (rms), RF Current (rms), Load Impedance, RF
Phase Error, Tune Vane Position, Load Coil Position, Peak-to-Peak Voltage, and
End Point Data.
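The etch-rate and uniformity calculations described in Section 2.3 can be sketched as follows. This is a minimal illustration, not the authors' code: the site ordering, the function names, and the thickness values are hypothetical, and treating the "inner ring" as the four half-way sites (without the center point) is an assumption, since the text does not say which sites form it.

```python
from statistics import mean

def point_etch_rates(pre, post, etch_time_min):
    """Etch rate per site: (pre-etch - post-etch thickness) / etch time."""
    return [(a - b) / etch_time_min for a, b in zip(pre, post)]

def wafer_summary(pre, post, etch_time_min):
    """pre/post: 9 thicknesses in Å, ordered [4 outer, 4 half-way, 1 center]."""
    rates = point_etch_rates(pre, post, etch_time_min)
    outer, half_way, center = rates[:4], rates[4:8], rates[8:]
    etch_rate = mean(half_way + center)   # average over the 5 inner points
    inner_ring = mean(half_way)           # assumption: inner ring = half-way sites
    uniformity = (mean(outer) - inner_ring) / inner_ring
    return etch_rate, uniformity

# Hypothetical polysilicon thicknesses (Å) before and after a 0.5 min timed etch.
pre = [5500.0] * 9
post = [3300.0] * 4 + [3500.0] * 4 + [3500.0]
etch_rate, uniformity = wafer_summary(pre, post, etch_time_min=0.5)
print(etch_rate, uniformity)  # etch rate in Å/min, uniformity as a fraction
```

A positive uniformity value here means the outer ring etched faster than the inner ring.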

3.0 Fault Detection and Analysis
Since the data are collected sequentially at a sampling rate of 1 Hz, the real-time
signals are correlated in time, demonstrating time series behavior. The real-time
fault detection module utilizes time series models to analyze the real-time
signals. The objective is to use these automatically collected signals to
establish the baseline behavior of a complex tool and later detect deviations from
this baseline. The fault detection algorithm is implemented through RTSPC, a
software utility which automatically collects real-time sensor data and generates
real-time alarms7). Examples of faults include shifts in the process parameters,
such as changes in chamber pressure, RF power, or gas flows.
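The idea of fitting a time series model to baseline data and alarming on deviations can be illustrated with a small sketch. The source says only that time series models are used inside RTSPC; the first-order autoregressive (AR(1)) model, the 3-sigma alarm limit, the simulated signal, and all names below are assumptions chosen to keep the example short.

```python
import random
from statistics import mean, pstdev

def fit_ar1(x):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t]; returns (c, phi, sigma)."""
    prev, curr = x[:-1], x[1:]
    mp, mc = mean(prev), mean(curr)
    sxx = sum((p - mp) ** 2 for p in prev)
    sxy = sum((p - mp) * (q - mc) for p, q in zip(prev, curr))
    phi = sxy / sxx
    c = mc - phi * mp
    resid = [q - (c + phi * p) for p, q in zip(prev, curr)]
    return c, phi, pstdev(resid)

def alarms(x, model, k=3.0):
    """Sample indices whose one-step-ahead residual exceeds k * sigma."""
    c, phi, sigma = model
    return [t for t in range(1, len(x))
            if abs(x[t] - (c + phi * x[t - 1])) > k * sigma]

def simulate(n, shift_at=None, shift=0.0, seed=0):
    """Hypothetical 1 Hz signal; an additive disturbance starts at shift_at."""
    rng = random.Random(seed)
    x = [25.0]
    for t in range(1, n):
        step = shift if shift_at is not None and t >= shift_at else 0.0
        x.append(10.0 + 0.6 * x[-1] + rng.gauss(0.0, 0.1) + step)
    return x

baseline = simulate(300)                      # fault-free training data
model = fit_ar1(baseline)
faulty = simulate(40, shift_at=20, shift=2.0, seed=1)
flagged = alarms(faulty, model)               # samples that violate the baseline
print(flagged[:3])
```

Because the residuals of a well-fitted model are approximately uncorrelated, a conventional control limit can be applied to them even though the raw signal is autocorrelated.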
If no equipment faults are detected, normal operation of the machine continues.
When a malfunction is detected, the diagnostic routine is triggered, and an alarm
is generated to alert the operator. After being filtered in RTSPC, the real-time
residual data form distinct signatures which can be traced back to a specific
equipment fault or group of faults. Initially, a training set of faults must be
generated to teach the diagnosis module fault signatures, creating a library of
signatures. Discriminant analysis techniques are employed to analyze the equipment
faults and train the system. Once training is complete, equipment faults are
detected and analyzed on a run-to-run basis.
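A minimal sketch of how a signature library and discriminant analysis might fit together is given below. It implements linear discriminant analysis with equal priors for two-dimensional signatures, which reduces to assigning a new signature to the class with the smallest Mahalanobis distance under a pooled covariance. The two-signal restriction, the fault labels, and the numbers are hypothetical; the source does not specify which discriminant technique or features were used.

```python
from statistics import mean

def train_lda(library):
    """library: {fault_label: [(r1, r2), ...]} of 2-D residual signatures."""
    means = {}
    pooled = [[0.0, 0.0], [0.0, 0.0]]
    dof = 0
    for label, rows in library.items():
        m = (mean(r[0] for r in rows), mean(r[1] for r in rows))
        means[label] = m
        for r in rows:
            dev = (r[0] - m[0], r[1] - m[1])
            for i in range(2):
                for j in range(2):
                    pooled[i][j] += dev[i] * dev[j]
        dof += len(rows) - 1
    # Pooled within-class covariance and its 2x2 inverse.
    (a, b), (c, d) = [[v / dof for v in row] for row in pooled]
    det = a * d - b * c
    inverse = [[d / det, -b / det], [-c / det, a / det]]
    return means, inverse

def classify(signature, model):
    """Return the fault label whose class mean is nearest in Mahalanobis distance."""
    means, inverse = model
    def dist2(m):
        dev = (signature[0] - m[0], signature[1] - m[1])
        return sum(dev[i] * inverse[i][j] * dev[j]
                   for i in range(2) for j in range(2))
    return min(means, key=lambda label: dist2(means[label]))

# Hypothetical training library: mean residuals of two signals per faulty run.
library = {
    "pressure_shift": [(1.0, 0.1), (1.2, -0.1), (0.9, 0.0), (1.1, 0.2)],
    "power_shift": [(0.1, 1.0), (-0.1, 1.1), (0.0, 0.8), (0.2, 1.2)],
}
model = train_lda(library)
print(classify((1.05, 0.05), model))  # pressure_shift
```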
`4.0 Wafer Parameter Prediction
`Empirical models are used to predict the outcome of each
`wafer immediately after it is processed by the equipment. To
`provide useful prediction capabilities, robust prediction mod-
`els of the machines are required. The industry standard is to
`use response surface methodology to build models relating
`the input settings of the machine to the output characteristics.
`Response surface models, however, become unusable in time
`due to machine drifts, rendering them ineffective for predic-
`tion.
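For reference, a response surface model limited to quadratic terms is commonly written in the general second-order form below, where the x_i are the k = 5 coded input settings and the beta coefficients are estimated by regression. This is the textbook form, shown as an illustration; the source does not list which terms were retained in each fitted model.

```latex
\hat{y} = \beta_0
        + \sum_{i=1}^{k} \beta_i x_i
        + \sum_{i=1}^{k} \beta_{ii} x_i^2
        + \sum_{i<j} \beta_{ij} x_i x_j
```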
We propose that using real-time signals to build the models results in better
prediction capabilities. Four types of regression methods were explored: ordinary
least squares (OLS) regression, ridge regression, principal component regression
(PCR), and partial least squares regression (PLSR). No one modeling method was
overwhelmingly better than the others, although OLS regression resulted in
slightly better prediction models for polysilicon etch rate. PCR and PLSR models,
however, were less sensitive to overfitting.
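To make the comparison of regression methods concrete, the sketch below fits OLS and ridge models with a small hand-written linear solver. Only these two of the four methods are shown; PCR and PLSR are omitted for brevity. The data values, the penalty of 1.0, and all function names are hypothetical, and in practice the predictors would normally be standardized before a ridge penalty is applied.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_linear(X, y, ridge=0.0):
    """Return [intercept, b1, ...]; ridge > 0 penalizes the slopes only."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    for i in range(1, p):
        xtx[i][i] += ridge
    return solve(xtx, xty)

def predict(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]

def rss(coef, X, y):
    """Residual sum of squares on the given data."""
    return sum((t - p) ** 2 for t, p in zip(y, predict(coef, X)))

# Hypothetical per-wafer averages of two correlated signals and a response.
X = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
ols = fit_linear(X, y)
shrunk = fit_linear(X, y, ridge=1.0)
print(rss(ols, X, y) <= rss(shrunk, X, y))  # True
```

OLS always gives the smaller training error, while the ridge penalty shrinks the slope estimates, which is one way to trade fit for stability when the signals are highly correlated.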
`The time series nature of the signals is not exploited in
`the prediction module. Instead, each signal is averaged over
`the duration of the main etch step, which lasts approximately
`30 seconds. Approximately 30 points are collected per signal
`per wafer etch. Since the wafer-to-wafer variance of the real-
`time signals is much larger than the within wafer variance, the
`average values per signal across each wafer are used as the
`input for the prediction models built with the real-time sig-
`nals. The training model is built from data collected during
`the training experiment. The final prediction metric is based
`on how well the training model predicts the outcome of the
`data collected during the prediction experiment. By using this
`prediction data, the true prediction capability of the models
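can be gauged. The two metrics used are the average prediction
error (PE) and the standard error of prediction (SEP), where
Y_i is the ith observation, \hat{Y}_i is the predicted value of the ith
point, and n is the number of observations:

PE = \frac{1}{n} \sum_{i=1}^{n} \left| Y_i - \hat{Y}_i \right|

SEP = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2 }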
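The per-wafer averaging of the real-time signals and the two prediction metrics can be sketched as follows. The definitions used here, PE as the mean absolute error and SEP as the root-mean-square error, are assumptions about the intended formulas, and the helper names and numbers are hypothetical. The last helper expresses a metric as a percentage of the observed range of the output.

```python
from math import sqrt
from statistics import mean

def wafer_features(trace):
    """trace: {signal_name: [samples during main etch]} -> per-signal averages."""
    return {name: mean(samples) for name, samples in trace.items()}

def prediction_error(y, y_hat):
    """Average prediction error, taken here as the mean absolute error."""
    return mean(abs(a - b) for a, b in zip(y, y_hat))

def standard_error_of_prediction(y, y_hat):
    """Taken here as the root-mean-square prediction error."""
    return sqrt(mean((a - b) ** 2 for a, b in zip(y, y_hat)))

def percent_of_range(value, low, high):
    """Express a metric as a percentage of the output range [low, high]."""
    return 100.0 * value / (high - low)

# Hypothetical 1 Hz samples for one wafer (shortened from about 30 points).
features = wafer_features({"rf_power": [249.0, 250.0, 251.0],
                           "rf_voltage": [99.0, 101.0]})

# Hypothetical measured and predicted oxide etch rates for three wafers.
measured = [160.0, 200.0, 230.0]
predicted = [170.0, 190.0, 230.0]
pe = prediction_error(measured, predicted)
sep = standard_error_of_prediction(measured, predicted)
print(features, pe, sep, percent_of_range(pe, 160.0, 230.0))
```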
It must be emphasized that models with the best adjusted R^2 value are not
necessarily good for prediction. The best models built with real-time data have
adjusted R^2 values of 0.95 or greater. These models, however, can still have very
large prediction errors even if all the terms in the model are statistically
significant; this is especially true of OLS models. To allow for better
prediction, a few terms in the models are eliminated, at the cost of reducing the
adjusted R^2 value.
Two sets of models were built for each of the wafer characteristics to show that
the real-time signals are better suited for prediction than the machine input
settings. The first uses the real-time signals and the second uses the input
settings. Due to the small ranges of selectivities across the design space, models
are created for the individual etch rates of gate oxide and photoresist instead of
modeling the selectivities. The ranges of each output parameter are listed to give
a relative measure of the accuracy of the prediction. The resulting PE and SEP
values for the etch rate and uniformity models are listed in Table 2 in terms of
percent error of the range. In all cases, the models built with the real-time data
are superior to those built with input settings. The best models for uniformity
have 25% prediction error, indicating that uniformity cannot be successfully
modeled with this data set.
One reason for the better prediction is that models built with input settings
generally include statistically significant blocking terms to account for
differences in the machine between sets of runs. In this experiment, the chlorine
bottle was refilled between phases I and II, causing a slight shift in the
baseline behavior of the machine. This shift is accounted for in the models built
with input settings through a blocking parameter. Blocking terms, however, cannot
be used in prediction models. Without the blocking parameter, the prediction
capability of the model built with input settings suffers.
Table 2 Comparison of Models Built Using Input Settings vs. Real-Time Data

[Table body not recoverable from the source. Legible row labels: Oxide Etch Rate
(160 - 230 Å), PR Etch Rate (1280 - 1750 Å), Uniformity (4 - 20%). Legible column
labels: PE, SEP. Legible entries, in source order: 34%, 82%, 4%, 6%, 58%, 83%,
25%, 25%.]

Unlike the fixed input settings, the real-time signals change with the state of
the machine, eliminating the need for blocking terms. Figure 3 compares the
modeling results of 6 centerpoint wafers. The model built with the fixed input
settings predicts a constant etch rate, while the real-time model adjusts the
prediction as a result of small changes in the machine state. Thus, from the
results listed in Table 2 and the above argument, we conclude that models built
using real-time data predict etch rates with more accuracy than those built with
input settings.

Figure 3 Comparison of the model built with input settings versus the model built
with real-time signals (legible plot labels: Input Setting Model; x-axis:
Replication Wafer #).

5.0 Conclusions
The three-module system presented is especially powerful because it does not
depend upon monitor wafers or expensive metrology; rather, it uses non-invasive
real-time signals collected automatically from the tool while the wafer is
processing. These signals are used effectively to detect and analyze equipment
faults. Prediction models have also been developed using real-time signals. Since
the wafer parameters are predicted immediately after the wafer has finished
processing in the machine, important yield information is obtained on a run-to-run
basis. In addition to catching problems with the machine, the real-time data can
be used to assess the quality of the wafer immediately after processing, making it
possible to ensure that only wafers worth processing continue down the line.
Another consequence of the prediction capability of the real-time models is that
inexpensive run-to-run control is possible. Future work includes pursuing such a
run-to-run control scheme of plasma etch equipment which will bring the specified
output parameters back to their target value in the case of equipment drift.

1) G.S. May, J. Huang, C.J. Spanos, “Statistical Experimental Design in Plasma
Etch Modeling,” IEEE Trans. Semiconductor Manufacturing, vol. 4, no. 2 (1991).
2) G.E.P. Box, N.R. Draper, Empirical Model-Building and Response Surfaces, Wiley
(1987).
3) Real Power Monitor (RPM-I), Comdel Inc.
4) LamStation Rainbow, v 3.6, Brookside Software (1991).
5) C.J. Spanos, S. Leang, S.F. Lee, “A Control & Diagnosis Scheme for
Semiconductor Manufacturing,” Proc. American Control Conference, San Francisco
(1993).
6) S.F. Lee, C.J. Spanos, “Real-time Diagnosis for Plasma Etch Equipment,”
Techcon, pp. 16-18 (1993).
7) S.F. Lee, E.D. Boskin, H.C. Liu, E. Wen, C.J. Spanos, “RTSPC: A Software
Utility for Real-Time SPC and Tool Data Analysis,” to appear in Trans.
Semiconductor Mfg.
`