In the manufacture of products, it is often necessary to conduct quality assurance measurements at various stages in the manufacturing process, to ensure that the ultimate product meets required specifications. Similarly, during research in the development of new products, it is necessary to be able to quantitatively characterize the structure of a prototype material. These requirements are particularly applicable in the field of advanced materials. In the context of the present invention, the term "advanced materials" encompasses a variety of different types of crystalline, polycrystalline and amorphous materials, including semiconductors, metallic thin films, glasses, coatings, superconductors, and the like. One example of a situation in which these types of measurement are particularly applicable is the manufacture of semiconductor products. Typically, a silicon wafer which forms the starting material for a semiconductor product has an appreciable cost associated with it. Each successive major processing step increases the cost of the overall process by a factor of about ten. Hence, it is necessary to be able to characterize the structure of the wafer during research and throughout the manufacturing process, to identify potential problems and correct them before significant waste occurs. Similar concerns apply in the processing of other types of advanced materials as well.
Originally, destructive testing methods were employed to determine characteristics such as the thickness of various layers of a structure, the roughness of their interfaces, their respective densities, strain and the tilt of crystalline layers. Over time, there has been a tendency to employ larger wafers in the process of manufacturing advanced materials. Because of the cost of each individual wafer, destructive testing methods have become quite uneconomical, and consequently undesirable. For this reason, non-destructive methods are employed.
One particularly useful category of non-destructive testing is X-ray metrology. In addition to being non-destructive, X-ray metrological methods provide several other advantages for both process development and process control of advanced material structures. For instance, they are capable of detecting unacceptable levels of internal defects generated by various processes. Furthermore, as individual material layers on such structures become thinner, traditional methods for measuring thickness and roughness become less accurate, relative to the results that can be achieved by X-rays. A variety of different X-ray metrological methods are known, for measuring different characteristics of a structure. Examples of these methods include X-ray absorption, diffraction, fluorescence, reflectivity, scattering, imaging and fringe analysis. In the context of the present invention, the term "X-ray scattering" is employed as a generic term which collectively encompasses any known X-ray technique that is applied to materials characterization.
The various types of X-ray metrological methods basically involve the projection of an X-ray beam onto a specimen being tested, and measurement of the intensity of the rays which are scattered by the specimen. If the crystallographic structure of a device is known, it is possible to calculate its X-ray scattering. For example, the equations for calculating X-ray diffraction are described by S. Takagi, Acta Cryst., Vol. 15, p. 1311 (1962) and J. Phys. Soc. Japan, Vol. 26, p. 1239 (1969), and by D. Taupin, Bull. Soc. Fr. Mineral Crystallogr., Vol. 87, p. 469 (1964). In addition, software programs for modeling and predicting experimental data are commercially available. Examples of such include the RADS program for modeling high resolution X-ray diffraction, and the REFS program for modeling X-ray reflectivity, both of which are distributed by Bede Scientific Instruments Ltd., Durham, UK.
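As a rough illustration of such a forward calculation, the following sketch computes the specular reflectivity of a single uniform layer on a thick substrate using the standard Fresnel coefficients. It is not the Takagi-Taupin formalism cited above, nor the algorithm of the RADS or REFS programs. The function name and the numerical values for wavelength, thickness and refractive indices are illustrative assumptions, not measured material data.

```python
import cmath
import math

def layer_reflectivity(theta, wavelength, thickness, n_layer, n_substrate):
    """Reflectivity of one uniform layer on a thick substrate.

    theta is the grazing angle in radians; wavelength and thickness share
    one length unit. Refractive indices are complex (n = 1 - delta + i*beta).
    """
    k = 2.0 * math.pi / wavelength
    cos_sq = math.cos(theta) ** 2
    # Normal component of the wavevector in vacuum, layer and substrate.
    kz0 = k * math.sin(theta)
    kz1 = k * cmath.sqrt(n_layer ** 2 - cos_sq)
    kz2 = k * cmath.sqrt(n_substrate ** 2 - cos_sq)
    # Fresnel amplitude coefficients at the two interfaces.
    r01 = (kz0 - kz1) / (kz0 + kz1)
    r12 = (kz1 - kz2) / (kz1 + kz2)
    # Phase (and attenuation) acquired on a round trip through the layer.
    phase = cmath.exp(2j * kz1 * thickness)
    amplitude = (r01 + r12 * phase) / (1.0 + r01 * r12 * phase)
    return abs(amplitude) ** 2

# Illustrative placeholder values (assumptions), lengths in nanometres.
WAVELENGTH_NM = 0.154
N_LAYER = complex(1 - 7.6e-6, 1.7e-7)
N_SUBSTRATE = complex(1 - 1.5e-5, 5.0e-7)

angles_deg = [0.05 + 0.01 * i for i in range(196)]
curve = [
    layer_reflectivity(math.radians(a), WAVELENGTH_NM, 50.0, N_LAYER, N_SUBSTRATE)
    for a in angles_deg
]
```

Given the structural parameters, the curve follows directly. The difficulty discussed in this background lies in going the other way, from a measured curve to the parameters.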
While it is possible to predict the X-ray scattering data that can be obtained from a given structure, the inverse operation cannot be directly achieved. In particular, it is not possible to calculate the structure of a device directly from its X-ray scattering data. This limitation is due, at least in part, to the fact that the X-ray scattering data provides only a measurement of the intensity of the detected rays, i.e., the squared magnitude of their amplitude, but not their phase. As a result, insufficient information is available from which to directly derive the parameters that characterize the structure, such as layer thicknesses, density, surface roughness, strain, etc.
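The loss of information can be seen in a minimal sketch: two scattered waves with different phases produce the same detected intensity, so the detector reading alone cannot distinguish them. The function name and the numerical values are illustrative assumptions.

```python
import cmath

def detected_intensity(amplitude: complex) -> float:
    """A detector records only the squared magnitude of the complex amplitude."""
    return abs(amplitude) ** 2

# Two hypothetical scattered waves: same magnitude, different phase.
wave_a = cmath.rect(2.0, 0.3)
wave_b = cmath.rect(2.0, 2.1)

intensity_a = detected_intensity(wave_a)
intensity_b = detected_intensity(wave_b)
```

Both intensities are equal to within rounding, although the underlying amplitudes differ.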
Consequently, to characterize a structure on the basis of its X-ray scattering data, it is necessary to infer such structure by means of a trial-and-error technique. In the known techniques, an iterative procedure is employed. A structural model for the specimen is first suggested on the basis of available information, such as the materials which were employed in the manufacture of the specimen being tested. The X-ray scattering data for this suggested model is then simulated, for example by means of one of the above-mentioned programs. This simulated data is compared with the experimental, or measured, data from the specimen, either graphically or by using any suitable function which provides an indication of the differences or similarities between them. Based on this comparison, the suggested model is refined, i.e., certain parameters are changed. The X-ray scattering data for the revised model is then simulated and compared against the measured data. This process is repeated until the difference between the simulation and the measured data converges to a minimum or falls below a predefined threshold.
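The simulate, compare and refine loop described above can be sketched as follows. The forward model here is a toy fringe function with a single thickness parameter, chosen only for illustration; the refinement rule (try a step in each direction and halve the step when neither helps) and all names are assumptions, not the procedure of any particular program.

```python
import math

def simulate(thickness, angles):
    """Toy forward model (an assumption): fringes whose period depends on thickness."""
    return [1.0 + 0.5 * math.cos(thickness * a) for a in angles]

def error(simulated, measured):
    """Mean squared difference between simulated and measured curves."""
    return sum((s - m) ** 2 for s, m in zip(simulated, measured)) / len(measured)

def fit(measured, angles, guess, step=1.0, threshold=1e-10, max_iter=200):
    best = guess
    best_err = error(simulate(best, angles), measured)
    for _ in range(max_iter):
        if best_err < threshold:
            break  # the simulation matches the measurement closely enough
        improved = False
        # Refine the model: try a slightly thinner and a slightly thicker layer.
        for trial in (best - step, best + step):
            trial_err = error(simulate(trial, angles), measured)
            if trial_err < best_err:
                best, best_err, improved = trial, trial_err, True
        if not improved:
            step /= 2.0  # no improvement at this scale, so search more finely
    return best, best_err

angles = [0.01 * i for i in range(200)]
measured = simulate(12.0, angles)  # stands in for experimental data
thickness, final_err = fit(measured, angles, guess=11.3)
```

In this sketch the loop starts reasonably close to the true value. The dependence of such local refinement on the starting model is the weakness that the remainder of this background discusses.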
This iterative procedure can be characterized as a non-linear data fitting problem. Given a particular curve, i.e., the experimental data, it is necessary to find a known curve which most closely matches the experimental curve. The parameters which define the known curve are then assumed to be those which characterize the tested specimen. In the past, various approaches have been applied to non-linear data fitting. In one approach, known as direct search, the parameter space is divided into small, but finite, regions. An error function is calculated for each region, and the region that produces the smallest error value is chosen as the best-fit parameter vector. In a related approach, known as the Monte Carlo method, the parameter space is again divided into small regions. The regions are selected at random, and the error function is evaluated for each. After a certain number of regions have been chosen, or when the error value is smaller than a specified value, the search is stopped. The region with the smallest error value is chosen as the best fit.
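The two approaches can be compared in a short sketch that uses a toy one-parameter fringe model as the forward simulation. The model, the parameter range, the number of regions and all function names are illustrative assumptions.

```python
import math
import random

def simulate(thickness, angles):
    """Toy forward model (an assumption): fringes whose period depends on thickness."""
    return [1.0 + 0.5 * math.cos(thickness * a) for a in angles]

def error(simulated, measured):
    return sum((s - m) ** 2 for s, m in zip(simulated, measured)) / len(measured)

def region_center(index, low, high, n_regions):
    """Representative parameter value for one finite region of the parameter space."""
    width = (high - low) / n_regions
    return low + (index + 0.5) * width

def direct_search(score, low, high, n_regions):
    """Evaluate the error for every region and keep the smallest."""
    centers = [region_center(i, low, high, n_regions) for i in range(n_regions)]
    best = min(centers, key=score)
    return best, score(best)

def monte_carlo_search(score, low, high, n_regions, max_samples, target, rng):
    """Evaluate randomly chosen regions until the budget or the target error is reached."""
    best, best_err = None, math.inf
    for _ in range(max_samples):
        trial = region_center(rng.randrange(n_regions), low, high, n_regions)
        trial_err = score(trial)
        if trial_err < best_err:
            best, best_err = trial, trial_err
        if best_err <= target:
            break
    return best, best_err

angles = [0.01 * i for i in range(200)]
measured = simulate(12.0, angles)  # stands in for experimental data

def score(thickness):
    return error(simulate(thickness, angles), measured)

grid_best, grid_err = direct_search(score, 5.0, 20.0, 150)
mc_best, mc_err = monte_carlo_search(score, 5.0, 20.0, 150, 60, 1e-3, random.Random(0))
```

The direct search visits all 150 regions, while the Monte Carlo search visits at most 60, so it is cheaper but can never return a smaller error than the exhaustive search over the same regions. With several parameters the number of regions grows multiplicatively, which makes the exhaustive form costly.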
Other approaches to parameter optimization are also well known. In one approach, identified as Downhill Simplex, an initial guess for the best-fit parameters is made by the user. A geometrical construct known as a simplex then moves in directions that decrease the error value. The parameter vector that yields the smallest error value in the neighborhood of the initial guess is chosen as the best-fit parameter vector. In the Levenberg-Marquardt method, an initial guess for the best-fit parameters is again made by the user. This method combines linearization and gradient searching of the error function, to minimize the error in the neighborhood of the initial guess. The parameter vector which provides the smallest error value is selected as the best fit.
These latter methods have relatively limited usefulness, since they are highly dependent upon the accuracy of the initial estimate. In operation, they tend to settle upon a local minimum in the error function which is in the neighborhood of the initial estimate. An optimal solution, i.e., a global minimum, which is significantly different from the initial guess may therefore never be found. Hence, the fitting can only be carried out by a person having a high level of skill who is able to make an accurate initial estimate. Even then, it may take several hours to find the best solution.
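The dependence on the initial estimate can be illustrated with a minimal one-parameter sketch in the style of the Levenberg-Marquardt method (a damped Gauss-Newton step that is accepted only if it lowers the error). The toy fringe model, the damping schedule and all names are assumptions made for illustration.

```python
import math

def simulate(thickness, angles):
    """Toy forward model (an assumption): fringes whose period depends on thickness."""
    return [1.0 + 0.5 * math.cos(thickness * a) for a in angles]

def residuals(thickness, angles, measured):
    return [s - m for s, m in zip(simulate(thickness, angles), measured)]

def sum_squares(values):
    return sum(v * v for v in values)

def lm_fit(measured, angles, guess, damping=1e-2, iterations=100):
    thickness = guess
    err = sum_squares(residuals(thickness, angles, measured))
    for _ in range(iterations):
        res = residuals(thickness, angles, measured)
        # Derivative of the simulated curve with respect to thickness.
        jac = [-0.5 * a * math.sin(thickness * a) for a in angles]
        gradient = sum(j * r for j, r in zip(jac, res))
        curvature = sum(j * j for j in jac)
        if curvature == 0.0:
            break
        # Damped Gauss-Newton step: large damping gives short, cautious steps.
        step = -gradient / (curvature * (1.0 + damping))
        trial_err = sum_squares(residuals(thickness + step, angles, measured))
        if trial_err < err:
            thickness, err = thickness + step, trial_err
            damping /= 10.0  # step helped: trust the linearization more
        else:
            damping *= 10.0  # step failed: fall back toward a small gradient step
    return thickness, err

angles = [0.01 * i for i in range(200)]
measured = simulate(12.0, angles)  # stands in for experimental data

near_fit, near_err = lm_fit(measured, angles, guess=11.5)
far_fit, far_err = lm_fit(measured, angles, guess=20.0)
```

Started at 11.5, the search reaches the true thickness of 12. Started at 20, it settles in a nearby local minimum of the error function and never approaches 12, because every accepted step must lower the error.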
To avoid the problem of having to start with an estimate that is close to the optimum solution, another approach known as simulated annealing has been employed. This approach borrows from the physical principles that govern annealing, namely the slow cooling of a liquid so that it forms a crystal, to obtain the best-fit parameters. While this approach provides improved results, in that it is capable of escaping from local minima, it can be quite difficult to design and set up the annealing schedule. The process also requires a considerable amount of time, since it does not search the parameter space in an efficient manner, and it is therefore not well suited for on-line quality control of manufacturing processes.
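A minimal sketch of simulated annealing on a toy one-parameter fringe model is given below. The starting temperature, final temperature, cooling factor and step size together form the annealing schedule; the values shown, like the model and the function names, are illustrative assumptions rather than recommended settings.

```python
import math
import random

def simulate(thickness, angles):
    """Toy forward model (an assumption): fringes whose period depends on thickness."""
    return [1.0 + 0.5 * math.cos(thickness * a) for a in angles]

def error(simulated, measured):
    return sum((s - m) ** 2 for s, m in zip(simulated, measured)) / len(measured)

def anneal(score, start, low, high, rng,
           temp=1.0, temp_end=1e-4, cooling=0.995, step=1.0):
    current, current_err = start, score(start)
    best, best_err = current, current_err
    while temp > temp_end:
        trial = current + rng.gauss(0.0, step)
        if low <= trial <= high:
            trial_err = score(trial)
            delta = trial_err - current_err
            # Always accept improvements; accept a worse trial with a
            # probability that shrinks as the temperature falls.
            if delta < 0 or rng.random() < math.exp(-delta / temp):
                current, current_err = trial, trial_err
                if current_err < best_err:
                    best, best_err = current, current_err
        temp *= cooling  # slow cooling
    return best, best_err

angles = [0.01 * i for i in range(200)]
measured = simulate(12.0, angles)  # stands in for experimental data

def score(thickness):
    return error(simulate(thickness, angles), measured)

start_err = score(20.0)
best, best_err = anneal(score, 20.0, 5.0, 20.0, random.Random(0))
```

Accepting occasional uphill moves is what lets the search leave a local minimum. Whether a given run reaches the global minimum, and how many error evaluations it spends doing so, depends on the schedule and on the random sequence, which reflects the set-up difficulty noted above.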
It is desirable, therefore, to provide a technique for fitting experimental X-ray scattering data to simulated data which operates in an efficient and robust manner to find a global solution, while avoiding the need to begin with an estimate that is close to the optimum, and which can therefore be practiced by non-skilled operators.