The present invention relates generally to automated imaging inspection techniques, and more particularly to automated inspection of arrays of identical objects using highly constrained tomography and Bayesian estimation.
Tomographic reconstruction from projections provides cross-sectional and 3-dimensional information, and is utilized in many fields including medical imaging, security screening, and automated inspection of manufactured goods. In manufacturing electronic assemblies, for example, solder joints or other interconnects are often not accessible for electrical testing or optical inspection. As a result, imaging using penetrating radiation, e.g. X-rays, is often used for automated inspection of such joints. Tomographic or 3-dimensional methods are required for two reasons. First, modern printed circuit assemblies (PCAs) are typically double-sided, and, in addition, may possess multiple internal layers. As a result, joints or components frequently obscure other joints in transmission radiographs, preventing easy interpretation. Second, many joint types are themselves 3-dimensional in nature, making it difficult or impossible to distinguish good joints from bad joints in transmission radiographs. In addition to their 3-dimensional nature, these joint types are typically deployed in dense arrays (linear, areal, or even 3-dimensional) with large numbers of similar joints in close proximity.
Tomographic methods which have been applied to automated X-ray inspection of solder joints include laminography, tomosynthesis, and various forms of cone beam or fan beam computed tomography (CT). CT methods include filtered backprojection (also known as convolution backprojection) as well as other transform methods, iterative methods including conjugate gradients, and specialized variations of these techniques. Unfortunately, such conventional methods typically perform poorly when applied to arrays of 3-dimensional solder joints, leading to severe artifacts in the reconstructed images that impede interpretation and automatic classification. Among the causes of such artifacts are poor signal-to-noise ratio (SNR), limited projection angles, lack of linearity, and X-ray scattering.
Solder joints typically contain lead, tin, or both, and are highly attenuating for X-rays. Each joint in an array can be up to several millimeters thick, attenuating more than 99% of the incoming X-ray photons, and resulting in a very low SNR in regions of greatest interest. Additionally, angles at which projection data may be collected are severely limited by nearby joints which are similarly highly attenuating. Equatorial projections are not available, for example, since each ray would typically pass through a large number of joints. Axial projections with acceptable SNR may also be difficult or impossible to obtain at reasonable doses (e.g., for CGAs or similar joints which are longest in the axial direction). As a result, projections are often available only for a limited range of angles approximately 30°–60° off the axial direction. The well-known Radon inversion theorem (See A. Louis and F. Natterer, “Mathematical Problems of Computerized Tomography”, Proc. IEEE 71:379–389 (1983)) guarantees that an object can be reconstructed from noiseless projections from all possible angles. When projections from a limited range of angles are available, or are corrupted by noise, exact reconstruction is not possible in the absence of additional information.
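As a rough worked example of why such attenuation degrades SNR, assume (this is an assumption made for illustration, not a statement from the text) that detection is limited by Poisson counting noise, with N_0 photons incident per detector pixel and a transmitted fraction T. The value N_0 = 10^4 is hypothetical; T = 0.01 corresponds to the 99% attenuation mentioned above.

```latex
\begin{aligned}
N &= N_0 T, \qquad \mathrm{SNR} = \frac{N}{\sqrt{N}} = \sqrt{N_0 T}, \\
N_0 &= 10^{4},\; T = 10^{-2} \;\Rightarrow\; N = 100,\; \mathrm{SNR} = 10, \\
\mathrm{SNR}_{\text{unattenuated}} &= \sqrt{N_0} = 100 .
\end{aligned}
```

Under these assumptions the SNR behind the joint is lower than in an unobstructed region by a factor of 1/\sqrt{T}, i.e., a factor of 10 at 99% attenuation.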
High attenuation combined with the use of a polychromatic source also leads to difficulties with reconstruction. Transmission tomography generally assumes exponential attenuation, $I/I_0 = e^{-\lambda X}$, where $I_0$ and $I$ are the original and attenuated intensities, respectively, $\lambda$ is a linear attenuation coefficient, and $X$ is the distance traveled through the object. Under these assumptions, taking logarithms leads to a linear system: $-\log(I) = -\log(I_0) + \lambda X$. X-ray sources used for solder joint inspection are broad-band, often emitting X-rays ranging in energy from 10 keV up to 160 keV or more. The attenuation coefficient, λ, is not constant over this range, and instead must be treated as a function of energy, with various materials having characteristic spectra. This poses two closely related difficulties for standard methods of tomographic reconstruction. First, the system equations cannot be readily linearized, violating the assumptions underlying all common tomographic methods. Second, as the beam passes further into the sample, the effective spectrum changes as radiation at some energies is preferentially absorbed or scattered. Typically, lower energies and energies near absorption edges are attenuated more strongly, resulting in so-called “beam hardening”. The net result is that identical absorbers at different locations can produce different attenuation, again violating the assumptions underlying tomography.
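The loss of linearity described above can be illustrated with a minimal sketch. It assumes a hypothetical two-bin spectrum; the weights and attenuation coefficients in `SPECTRUM` are invented for illustration and do not describe any real source or material. The apparent attenuation coefficient, computed as the negative log of the transmitted fraction divided by thickness, is constant for a monochromatic beam but falls with thickness for the polychromatic one.

```python
import math

# Hypothetical two-bin spectrum: (relative weight, attenuation coefficient in 1/mm).
# These numbers are illustrative assumptions, not values from the text.
SPECTRUM = [(0.5, 2.0), (0.5, 0.5)]

def transmitted_fraction(thickness_mm, spectrum=SPECTRUM):
    """Transmitted fraction for a polychromatic beam: weighted sum of exponentials."""
    return sum(w * math.exp(-mu * thickness_mm) for w, mu in spectrum)

def effective_attenuation(thickness_mm, spectrum=SPECTRUM):
    """Apparent coefficient -log(transmitted fraction) / thickness."""
    return -math.log(transmitted_fraction(thickness_mm, spectrum)) / thickness_mm

if __name__ == "__main__":
    for x in (0.1, 1.0, 3.0):
        print(f"X = {x:3.1f} mm  effective lambda = {effective_attenuation(x):.3f} /mm")
```

In this toy spectrum the strongly attenuated bin is removed first, so thicker absorbers appear less attenuating per unit length, which is the beam-hardening effect in miniature.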
In addition, all classical methods of tomography, both transmission and emission, are predicated on straight-line propagation from source to detector, forming projections of the object under test. A significant minority of X-rays, however, is deflected during propagation through a sample. Both elastic and inelastic scattering occur, resulting in changes in direction (or equivalently, absorption followed by emission in a different direction), with or without an associated change of energy. Inelastic scattering, for example, is prominent in low-Z materials such as the glass-epoxy composites typically used as substrates for printed circuit assemblies. While the fraction of incoming photons undergoing inelastic scattering is typically small, these photons can nonetheless represent a large percentage of the detected X-rays in dark regions, as when a low-Z material such as a PCB substrate is adjacent to a highly attenuating material such as a solder joint. Scattering is particularly troublesome when area array sensors are used, since collimation is typically not practical. Heuristic corrections for scatter and beam hardening have been proposed, and can reduce, but not eliminate, artifacts resulting from these mechanisms (e.g., see B. Ohnesorge, T. Flohr, K. Klingenbeck-Regn, “Efficient Object Scatter Correction Algorithm For Third and Fourth Generation CT Scanners”, Eur. Radiol. 9, 563–569 (1999)).
In principle, Bayesian reconstruction methods can surmount these difficulties. As a brief overview of the principles of Bayesian tomography, let D represent a set of measured projection data, and let M represent a model of the object under consideration. Typically, M consists of a set of parameters describing the object(s) of interest. The goal is to assign values to the parameters of M that accurately reflect the objects being inspected, given the noisy and potentially incomplete data, D. In maximum likelihood (ML) estimation, the estimated model M_ML is taken to be the value of M which maximizes the likelihood, p(D|M), of observing D, assuming M is correct. Equivalently, and more commonly, the log-likelihood can be maximized. Under appropriate assumptions, each of the classical tomography methods is equivalent to a form of maximum likelihood estimation.
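As one illustration of the final statement, suppose (an assumption introduced here, not stated in the text) that the log-domain measurements are linear in the model, D = AM + n, where A is a hypothetical system matrix of ray path lengths and n is independent zero-mean Gaussian noise of variance σ². Then:

```latex
\begin{aligned}
\log p(D \mid M) &= -\frac{1}{2\sigma^{2}} \lVert D - A M \rVert^{2} + \text{const}, \\
M_{\mathrm{ML}} &= \arg\max_{M} \log p(D \mid M) = \arg\min_{M} \lVert D - A M \rVert^{2} .
\end{aligned}
```

Under these assumptions, maximum likelihood estimation reduces to a least-squares fit of the projections, which is the kind of problem that iterative solvers such as conjugate gradients are commonly applied to.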
In contrast, Bayesian estimation incorporates additional a priori or prior information, p(M), about the model M, summarizing all objective and subjective information about how likely alternative models are thought to be in the absence of measured data, D. The posterior probability, p(M|D), for any particular model (i.e., the probability that a given model M is correct having observed data set D) can be calculated using Bayes' rule, p(M|D)=p(D|M)p(M)/p(D), where p(D)=∫p(D|M)p(M)dM serves as a normalization constant.
In the simplest form of Bayesian analysis, so-called maximum “a posteriori” or MAP estimation, the estimated model M_MAP is taken to be the value of M which maximizes p(D|M)p(M). The normalizing factor, p(D), is not required, since it is not a function of M. MAP estimation generalizes maximum likelihood by incorporating prior information, and this has been shown to be effective in reducing artifacts in tomographic reconstruction. (See S. Geman and D. McClure, “Bayesian Image Analysis: An Application to Single Photon Emission Tomography”, Proc. Statist. Comput. Sect. Amer. Stat. Soc. Washington, D.C., pp. 12–18 (1985); T. Hebert and R. Leahy, “A Generalized EM Algorithm for 3-D Bayesian Reconstruction From Poisson Data Using Gibbs Priors”, IEEE Trans. on Medical Imaging 8:194–202 (1989); P. J. Green, “Bayesian Reconstruction From Emission Tomography Data Using A Modified EM Algorithm”, IEEE Trans. on Medical Imaging 9:84–93 (1990); K. Sauer and C. A. Bouman, “A Local Update Strategy For Iterative Reconstruction From Projections”, IEEE Trans. On Signal Processing 41:534–548 (1993); and K. Hanson and G. Wecksung, “Bayesian Approach to Limited-angle Reconstruction in Computed Tomography”, J. Opt. Soc. Am. 73:1501–1509 (1983), each of which is incorporated herein by reference for all that it teaches). Iterative methods to increase posterior probability, p(M|D), are typically used, sometimes using a filtered backprojection reconstruction as a starting estimate. MAP methods are also available for discrete tomography, where one or more model parameters (e.g., the attenuation coefficient in a particular region) are to be chosen from among a finite and typically small number of choices.
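The difference between ML and MAP estimation in the discrete case can be sketched with a toy problem. Everything below is a hypothetical illustration: a single region whose attenuation coefficient takes one of three candidate values, two rays of known path length, Gaussian noise on the log-attenuation measurements, and an invented prior. None of these values come from the text.

```python
import math

# Hypothetical discrete-tomography toy (all numbers are illustrative assumptions).
CANDIDATES = [0.0, 0.5, 1.0]   # allowed values of the single model parameter M
PRIOR = [0.7, 0.2, 0.1]        # p(M): prior belief in each candidate
PATH_LENGTHS = [1.0, 2.0]      # distance each ray travels through the region
SIGMA = 0.5                    # assumed Gaussian noise on each log-attenuation value

def likelihood(data, m):
    """p(D | M) for independent Gaussian noise on each measurement."""
    p = 1.0
    for d, x in zip(data, PATH_LENGTHS):
        p *= math.exp(-0.5 * ((d - m * x) / SIGMA) ** 2) / (SIGMA * math.sqrt(2 * math.pi))
    return p

def posterior(data):
    """Bayes' rule; p(D) is a sum because M takes only finitely many values."""
    joint = [likelihood(data, m) * pm for m, pm in zip(CANDIDATES, PRIOR)]
    evidence = sum(joint)
    return [j / evidence for j in joint]

def ml_estimate(data):
    """Candidate maximizing p(D | M)."""
    return max(CANDIDATES, key=lambda m: likelihood(data, m))

def map_estimate(data):
    """Candidate maximizing p(D | M) p(M); the evidence p(D) does not affect the argmax."""
    return max(zip(CANDIDATES, PRIOR), key=lambda mp: likelihood(data, mp[0]) * mp[1])[0]

if __name__ == "__main__":
    data = [0.3, 0.6]  # hypothetical noisy measurements
    print("ML:", ml_estimate(data), "MAP:", map_estimate(data), "posterior:", posterior(data))
```

With the sample data the likelihood alone favors the middle candidate, while the strong prior on the first candidate shifts the MAP estimate; whether such a shift is desirable depends entirely on how well the prior reflects reality.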
Considerable effort has been devoted to developing methods that converge rapidly, some of which require differentiable likelihoods and others of which do not. Multi-resolution analysis has been shown to speed convergence and improve solution quality in some cases. (See T. Frese, C. Bouman, and K. Sauer, “Multiscale Bayesian Methods for Discrete Tomography”, Discrete Tomography: Foundations, Algorithms, and Applications, edited by G. Herman and A. Kuba, Birkhauser Boston, Cambridge, Mass., pp. 237–261 (1999), incorporated herein by reference for all that it teaches). In multi-resolution analysis, prior information has typically been incorporated via a potential function or Markov random field penalizing neighboring pixel values which are thought to be unlikely (e.g., a sharp difference not lying along well defined edges).
Both ML and MAP estimation result in a single model for the reconstructed object, namely that model which maximizes, respectively, the likelihood or the posterior probability. This can be appropriate and effective when the corresponding function (likelihood or posterior probability) is symmetrical or sharply peaked at the maximum. It can, however, be very misleading with distributions that do not satisfy these assumptions. (See C. Fox and G. Nicholls, “Exact MAP States and Expectations from Perfect Sampling: Greig, Porteous, and Seheult Revisited”, in Bayesian Inference and Maximum Entropy Methods in Science and Engineering, edited by A. Djafari, AIP Conference Proceedings, 568:252–263 (2001), incorporated herein by reference for all that it teaches).
Full Bayesian inference, as opposed to MAP estimation, avoids this problem by treating the posterior probability, p(M|D), as a distribution over all possible models. Quantities of interest, such as parameter values or utility functions, are estimated by taking expectations over the posterior probability. In rare cases, the posterior probability and associated expectations may be computed analytically. More typically, numerical approximation is required. Markov Chain Monte Carlo (MCMC) methods represent the current state-of-the-art for approximating such expectations. Discrete and continuous parameters may be freely mixed, and differentiable likelihood is not required. As in the case of MAP estimation, multi-resolution analysis may be applied in conjunction with MCMC methods to improve convergence.
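A minimal sketch of MCMC estimation of a posterior expectation follows. It uses a random-walk Metropolis sampler, one of the simplest MCMC methods, on a hypothetical one-parameter posterior chosen to be skewed (a Gamma shape with mode 2 and mean 3), so that the posterior mode and the posterior mean differ. The density, step size, and sample counts are illustrative assumptions, not part of the text.

```python
import math
import random

def log_posterior(x):
    """Unnormalized log p(M | D) for a hypothetical skewed posterior
    (Gamma shape with mode at 2 and mean at 3)."""
    if x <= 0.0:
        return -math.inf
    return 2.0 * math.log(x) - x

def metropolis(log_p, start, step, n_samples, burn_in, seed=0):
    """Random-walk Metropolis sampler; log_p is needed only up to an additive constant."""
    rng = random.Random(seed)
    x, lp = start, log_p(start)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = x + rng.gauss(0.0, step)
        lp_new = log_p(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if lp_new >= lp or rng.random() < math.exp(lp_new - lp):
            x, lp = proposal, lp_new
        if i >= burn_in:
            samples.append(x)
    return samples

samples = metropolis(log_posterior, start=1.0, step=1.5, n_samples=20000, burn_in=1000)
posterior_mean = sum(samples) / len(samples)  # Monte Carlo estimate of E[M | D]

if __name__ == "__main__":
    print(f"posterior mean estimate: {posterior_mean:.2f} (mode of this density is 2.0)")
```

Note that the sampler never evaluates a derivative or the normalizing constant p(D), which is consistent with the remark that a differentiable likelihood is not required.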
Full Bayesian analysis incorporating calculations of expectations (e.g., using MCMC) is computationally demanding compared to MAP estimation, which in turn is often slower and more demanding than conventional tomography. A key factor in successful implementation of full Bayesian tomography is therefore careful choice of a framework in which prior knowledge can be naturally and effectively represented, and in which MCMC approximation converges rapidly.
In order to illustrate some of the limitations and difficulties when attempting to use conventional model representations with Bayesian tomography, consider a specific example from the field of printed circuit inspection. FIG. 1 shows a stylized cross-section through an ideal ball grid array (BGA) joint 10 connecting a pad 3 of a printed circuit board (PCB) 1 to a pad 7 of an integrated circuit (IC) 2. As shown, BGA joint 10 includes a ball 5 soldered to the PCB pad 3 with solder fillet 4 and soldered to the IC pad 7 with solder fillet 6.
In addition to normal joints, several types of defective joints can occur. FIGS. 2 and 3 show cross-sections of defective BGA joints in which the PCB interface solder fillet 4 and the IC interface solder fillet 6, respectively, are missing. Another defect that can occur is insufficient solder at fillet 4 (as shown in FIG. 4) or fillet 6 (not shown), which is especially problematic when the ball 5 is of the non-collapsible type. Further, excessive solder can result in a short between two or more neighboring joints.
The simplest representation for reconstruction of a 3-dimensional region is a uniform voxelation. A BGA joint 10, for example, might be divided into 15×15×15 cubical voxels of uniform size, as illustrated in FIG. 5. In the simplest and most conventional use of such a representation, fitting the model, M, for a single BGA joint involves assigning a value of the so-called CT number, λX, at each of the 3,375 voxels. Such a voxel-based model is an example of what is typically called a low-level representation. Voxel-based models are extremely general and capable of representing virtually any object. Bayesian tomography generally requires computing the likelihood that projection images, D, could result from a given model, M. This computation is straightforward when a voxel-based model is used. Simple, deterministic ray-tracing can be used to predict the projections that would result in the absence of noise. A noise model of known distribution is then applied to the predicted projections, completing the calculation of the likelihood.
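The likelihood computation for a voxel model can be sketched as follows. This is a deliberately simplified illustration: rays are taken parallel to one grid axis (a degenerate case of ray-tracing), noise is assumed Gaussian with known standard deviation on each line integral, and the ball radius, attenuation value, and helper names are hypothetical.

```python
import math

def make_ball_volume(n=15, radius=5.0, mu=1.0):
    """n x n x n voxel grid with attenuation mu inside a centered sphere (hypothetical values)."""
    c = (n - 1) / 2.0
    return [[[mu if (i - c) ** 2 + (j - c) ** 2 + (k - c) ** 2 <= radius ** 2 else 0.0
              for k in range(n)] for j in range(n)] for i in range(n)]

def project_along_z(volume, voxel_size=1.0):
    """Noise-free line integrals for rays parallel to the z axis, one ray per (i, j) column."""
    return [[voxel_size * sum(column) for column in plane] for plane in volume]

def gaussian_log_likelihood(measured, predicted, sigma):
    """log p(D | M) assuming independent Gaussian noise of known sigma on each line integral."""
    ll = 0.0
    for row_m, row_p in zip(measured, predicted):
        for d, p in zip(row_m, row_p):
            ll += -0.5 * ((d - p) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))
    return ll

if __name__ == "__main__":
    data = project_along_z(make_ball_volume(radius=5.0))  # stand-in for measured data
    for r in (3.0, 5.0):
        predicted = project_along_z(make_ball_volume(radius=r))
        print(f"radius {r}: log-likelihood = {gaussian_log_likelihood(data, predicted, 0.5):.1f}")
```

A realistic forward map would trace oblique rays and would typically use a noise model matched to the detector (for example Poisson counting statistics); the structure of the calculation, prediction followed by a noise likelihood, stays the same.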
Despite these advantages, voxel-based and other low-level representations are often poorly suited for Bayesian tomography. The number of parameters to be fitted is large, and may exceed the number of measurements, particularly when the number of projections is limited, as in X-ray PCA inspection. The resulting system of equations is typically ill-conditioned and often inconsistent, leading to slow convergence and artifacts in the fitted model. Local maxima are also common, since the system equations are not truly linear. Additionally, constraints and prior knowledge can be difficult or impossible to express in such a representation. Local information, e.g. smoothing or other regularization, can be easily incorporated, but global information is problematic. BGA balls are approximately spherical with diameters falling predominantly in a narrow range, for example, and the spacing between joints is known in advance. Unfortunately, expressing this information in a low-level representation is difficult and inefficient.
As described above, Bayesian reconstruction can minimize the effects of limited or poor data by incorporating prior information. Additionally, a detailed forward map accurately reflecting image formation and noise statistics replaces the often inappropriate and numerically unstable inverse map used by conventional methods. Nonetheless, Bayesian methods have found limited use to date in tomographic reconstruction, particularly in realtime automated industrial imaging inspection applications where the inspection and analysis operations must keep up with the rate of the particular manufacturing line, for example in electronic assembly manufacturing and/or testing lines. This is due both to the need to understand and model the relevant processes in considerable detail, including relevant prior knowledge, and to the high computational demands of Bayesian inference. Further, representation of prior knowledge and constraints is often difficult in conventional voxel-based representations.
It would therefore be desirable to have a computationally tractable framework for practical exploitation of the benefits of Bayesian reconstruction in industrial imaging that incorporates prior knowledge in a clear, concise, and natural fashion, and which is computationally efficient and robust. It would also be desirable that such framework allow realtime inspection, analysis, and classification in an automated industrial imaging inspection environment, especially one where the object(s) to be inspected consist of dense arrays of similar objects.