Tomographic imaging as a general concept means producing an image of the internal structures of an essentially solid object by observing how certain transmitted waves (e.g., acoustic or electromagnetic) or particle beams behave when they pass through such structures. A typical application is medical X-ray tomography, in which the object is a living organism or a part thereof, and the waves used for irradiation are X-rays in the range from a few to some tens of keV, or even around a hundred keV. The objective of the imaging process is to make diagnostic observations about properties of the object that are not readily seen on the surface. Other applications of tomography include, but are not limited to, various industrial processes, in which it is useful to obtain knowledge about what is hidden inside a piece of raw material or a certain product.
For example, log tomography aims at examining logs prior to sawing, so that each log can be sawn into planks in the best possible way.
FIG. 1 illustrates a basic principle of what is known as computerised tomography with sparse angle data in two dimensions. An object 102 is irradiated from a limited number of directions with a radiation source, which in FIG. 1 is shown in two exemplary positions 106 and 107. A spatially sensitive detector, shown correspondingly in two positions 104 and 105, gives spatially bound attenuation information for each exposure. In FIG. 1 the circular regions 103 inside the object 102 are assumed to attenuate radiation more than the bulk of the object. The region 101 outside the object is also assumed to have a different attenuation than the bulk of the object. The readings from the detector and the corresponding irradiation angles are taken into a computer 108, which uses a mathematical reconstruction method to calculate what configuration of attenuation regions must be located inside the object 102 in order to give just these particular attenuation profiles. The result of the calculation, which essentially represents an attenuation coefficient map of a two-dimensional slice of the object 102, is shown on a display 109. In order to gather additional information for the measurement, external measurement means such as a stereoscopic camera 110, a mechanical measurement arm 111 or a laser scanner 112 can be used to scan the location and form of the outer boundary of the object 102.
An advantageous method for performing tomographic imaging is using statistical inversion, also known as Bayesian statistics. Methods based on this approach are known to produce very good images for limited angle tomography reconstructions. As a mathematical method it is not new, but it has long been regarded as computationally too intensive to be used in practical high resolution tomography applications. However, due to recent advances in algorithms (see, for example, the Finnish patent number FI 116324) and computational capabilities of computers, this method can be seen as a practical alternative.
Bayesian statistics is in essence based on inspecting the posterior probability density function
    p(x|m) = p(m|x) p(x) / p(m)  (1)

that allows us to make inferences about the unknown x on the basis of the measurements m together with prior information. In this equation the term p(m|x) is the probability of the measurements, given the model that explains the measurements. This is also known as the likelihood function, and it contains the forward model of the tomographic measurement. The density p(m) is the probability of the measurements, and it can be viewed as a normalization constant. The term p(x) is the so-called prior probability density function. This prior probability density can be used to make prior assumptions about the statistical properties of the unknown x, which is vital for limited angle tomographic reconstruction.
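As a purely illustrative sketch of equation (1), the following discretizes a one-dimensional unknown x on a grid and combines an assumed Gaussian prior p(x) with a Gaussian likelihood p(m|x) for a single hypothetical measurement m; all numerical values (grid range, prior centre, noise level) are assumptions chosen for demonstration only, not values taken from any particular tomography system.

```python
import numpy as np

# Candidate values of the unknown x (e.g. an attenuation coefficient),
# discretized on a grid so that densities become finite vectors.
x_grid = np.linspace(0.0, 2.0, 201)

# p(x): an assumed Gaussian prior centred at 1.0 with spread 0.5.
prior = np.exp(-0.5 * ((x_grid - 1.0) / 0.5) ** 2)
prior /= prior.sum()

# p(m|x): likelihood of one hypothetical measurement m with assumed
# Gaussian measurement noise of standard deviation sigma.
m = 1.3
sigma = 0.2
likelihood = np.exp(-0.5 * ((m - x_grid) / sigma) ** 2)

# Equation (1): posterior = likelihood * prior / p(m), where the division
# by p(m) amounts to normalizing the product so that it sums to one.
posterior = likelihood * prior
posterior /= posterior.sum()
```

The posterior peak lands between the prior centre and the measured value, illustrating how the prior regularizes the answer that the measurement alone would give.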
All information about the object is contained in the posterior probability density. However, because this density function is very high-dimensional, it cannot be easily used for visualization. For this purpose, one has to form some sort of estimate. There are two common ways of forming such an estimate: the conditional mean (CM) estimate:
    x̂_CM = ∫ x p(m|x) p(x) / p(m) dx  (2)

or the maximum a posteriori (MAP) estimate:
    x̂_MAP = arg max_x { p(m|x) p(x) / p(m) }  (3)

The CM estimate x̂_CM can be seen as the statistically expected tomographic reconstruction, while the MAP estimate x̂_MAP can be seen as the tomographic reconstruction that maximizes the posterior probability density function. In both cases, the nature of the estimators is to find a balance between the likelihood p(m|x) of the measurements and the prior assumptions p(x) about the statistical properties of the unknown x. In Bayesian statistics, the role of the prior distribution is to regularize the solution by adding constraints on the unknown x.
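The estimates (2) and (3) can be sketched numerically on a discretized one-dimensional posterior; the Gaussian prior/likelihood pair and all numbers below are illustrative assumptions, not part of any actual reconstruction.

```python
import numpy as np

# Discretized unknown and an assumed Gaussian prior/likelihood pair.
x = np.linspace(0.0, 2.0, 201)
prior = np.exp(-0.5 * ((x - 1.0) / 0.5) ** 2)
likelihood = np.exp(-0.5 * ((1.3 - x) / 0.2) ** 2)

# Posterior up to p(m); normalizing plays the role of dividing by p(m).
posterior = prior * likelihood
posterior /= posterior.sum()

# Equation (2): the CM estimate is the expectation of x under the posterior
# (the integral becomes a weighted sum on the grid).
x_cm = np.sum(x * posterior)

# Equation (3): the MAP estimate is the grid point of highest posterior value.
x_map = x[np.argmax(posterior)]
```

For a unimodal, nearly symmetric posterior such as this one, the two estimates nearly coincide; for skewed or multimodal posteriors they can differ substantially.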
When formulating a forward model for tomography, one needs to parameterize the medium in some way. FIG. 2 shows an exemplary way of doing this, by dividing the modelled region into equally spaced pixels (in the two-dimensional case) or voxels (in the three-dimensional case, illustrated in FIG. 2). FIG. 3 illustrates an alternative parameterization of the medium by using an irregular grid 301. The propagation of radiation along the path of each ray passing from a radiation source to the detector can then be modelled using one of many possible interpolation methods. Examples of rays are shown as 201 in FIG. 2 and 302 in FIG. 3, and the detector is illustrated schematically as 202 in FIG. 2 and 303 in FIG. 3. The grid has an exaggeratedly coarse resolution in both FIG. 2 and FIG. 3 for demonstration purposes only. In practice the grid may contain thousands or even millions of points.
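A minimal sketch of such a forward model on a regular pixel grid follows: the line integral of an attenuation map along one ray is approximated by sampling the ray densely and interpolating (here with the crudest choice, nearest-neighbour lookup) in the grid. The grid size, the inclusion, and the ray endpoints are hypothetical values for illustration.

```python
import numpy as np

# An assumed 8x8 attenuation map with a small attenuating inclusion.
n = 8
grid = np.zeros((n, n))
grid[3:5, 3:5] = 1.0

# One ray from an assumed source position to an assumed detector pixel,
# here running diagonally across the grid.
src = np.array([0.0, 0.0])
det = np.array([float(n), float(n)])

# Sample the ray densely and accumulate attenuation values along it.
n_samples = 1000
ts = np.linspace(0.0, 1.0, n_samples)
pts = src + ts[:, None] * (det - src)
ray_len = np.linalg.norm(det - src)

# Nearest-neighbour interpolation: truncate coordinates to pixel indices.
ix = np.clip(pts[:, 0].astype(int), 0, n - 1)
iy = np.clip(pts[:, 1].astype(int), 0, n - 1)
line_integral = grid[iy, ix].sum() * (ray_len / n_samples)
```

The diagonal ray crosses the two attenuating pixels corner to corner, so the sampled line integral approximates 2·√2; higher-order interpolation (bilinear, exact ray-pixel intersection lengths) would refine this at additional cost.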
Ignoring possible calibration parameters, the likelihood function p(m|x) of a tomographic model is completely defined by the set of model parameters x_i, which define the two- or three-dimensional map of attenuation coefficients (or other characteristic descriptors) of the medium. The subscript i refers in general to the way in which the model parameters are identified; for example, in FIG. 2 a three-dimensional indexing scheme is utilised (with the parameters of the 16 front voxels illustrated in FIG. 2), and in FIG. 3 a consecutive numbering scheme is utilised. Pixels of the detector are similarly indexed, so that in FIG. 2 we have a two-dimensional detector of M×N pixels and an exemplary two-dimensional indexing scheme m_mn of detector pixels, where the indices m ∈ [1, M] and n ∈ [1, N]. FIG. 3 illustrates schematically a linear detector of N pixels m_1, m_2, . . . , m_N.
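To illustrate how a likelihood p(m|x) arises from such a parameterization, the sketch below flattens a hypothetical 4×4 pixel map into a vector x, maps it to detector readings through an illustrative random system matrix A (standing in for the ray-interpolation weights), and evaluates the log-likelihood under an assumed Gaussian noise model; none of the dimensions or noise levels come from an actual device.

```python
import numpy as np

rng = np.random.default_rng(2)

# 16 model parameters x_i (a flattened 4x4 pixel map) and 8 detector pixels.
n_pixels, n_detector = 16, 8

# Illustrative stand-in for the ray-weight (system) matrix: entry [j, i]
# is the weight with which pixel i contributes to detector reading j.
A = rng.random((n_detector, n_pixels))

# A "true" map with a single attenuating pixel, used to simulate data.
x_true = np.zeros(n_pixels)
x_true[5] = 1.0

# Simulated measurements with assumed additive Gaussian noise.
sigma = 0.05
m = A @ x_true + rng.normal(0.0, sigma, n_detector)

def log_likelihood(x):
    # log p(m|x) up to an additive constant for the Gaussian noise model:
    # -||m - A x||^2 / (2 sigma^2).
    r = m - A @ x
    return -0.5 * np.dot(r, r) / sigma**2
```

The parameter map that actually produced the data scores a far higher log-likelihood than, say, an all-zero map, which is what any estimator based on (2) or (3) exploits.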
So-called Markov chain Monte Carlo (MCMC) methods can be used for finding the most acceptable value of a random variable x. We may suppose that the random variable x has a so-called posterior distribution. In order to find the most acceptable value of x we generate a Markov chain by starting with some x_i and using a general rule of generating each x_{i+1} from x_i by drawing it from a probability distribution

    D(x_{i+1}|x_i) = p(x_i, x_{i+1}).  (4)

Supposing now that

    f(y) = ∫ f(x) p(x, y) dx  (5)

it follows from ergodic theorems that
    (1/N) Σ_{i=1}^{N} g(x_i) → E_f(g(x))  (6)

for any measurable function g(x) as the number N of elements in the chain (x_i) grows. In other words, we can use the chain (x_i) as a simulated sample from the posterior distribution f(x) to calculate e.g. the mean value E_f(x) of x, the posterior second moment E_f(x²) or any higher moments of x.
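The ergodic average (6) can be sketched with a toy chain: a simple Metropolis-type transition rule whose invariant distribution f(x) is, by assumption, a standard Gaussian. The target, the proposal spread, and the chain length are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# A Markov chain targeting f(x) proportional to exp(-x^2 / 2):
# propose x_{i+1} near x_i, accept with the Metropolis ratio f(x')/f(x).
x = 0.0
samples = []
for _ in range(20000):
    proposal = x + rng.normal(0.0, 1.0)
    if rng.random() < np.exp(0.5 * (x**2 - proposal**2)):
        x = proposal
    samples.append(x)

# Equation (6): averages over the chain approximate expectations under f.
mean_est = np.mean(samples)                    # approximates E_f(x) = 0
second_moment = np.mean(np.square(samples))    # approximates E_f(x^2) = 1
```

The chain elements are correlated, so these averages converge more slowly than independent samples would, but they do converge to the moments of the target distribution.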
The two commonly used methods for generating the Markov chain are the so-called Metropolis-Hastings (MH) method and Gibbs sampling. The MH method generates the whole new sample directly, whereas Gibbs sampling generates one new component of the vector x at a time. The MH method is relatively fast for all kinds of distributions, but it does not work well when the dimension of the random vector x is high. Gibbs sampling works for vectors x of arbitrarily high dimension. From the viewpoint of introducing additional constraints, such as a positivity requirement, Gibbs sampling is easy whereas the MH method is not.
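As a sketch of why a positivity constraint is easy in Gibbs sampling, the following samples a hypothetical bivariate Gaussian (correlation 0.8) restricted to the positive quadrant, updating one component at a time: each conditional is the usual Gaussian conditional, and the constraint is enforced simply by redrawing any negative proposal. The target distribution and chain length are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
rho = 0.8                          # assumed correlation of the target

x1, x2 = 1.0, 1.0                  # arbitrary positive starting point
samples = []
for _ in range(20000):
    # Gibbs step for x1: its conditional given x2 is N(rho*x2, 1 - rho^2);
    # the positivity constraint is imposed by rejecting negative draws.
    while True:
        x1 = rng.normal(rho * x2, np.sqrt(1.0 - rho**2))
        if x1 > 0:
            break
    # Symmetric Gibbs step for x2 given the freshly updated x1.
    while True:
        x2 = rng.normal(rho * x1, np.sqrt(1.0 - rho**2))
        if x2 > 0:
            break
    samples.append((x1, x2))

samples = np.array(samples)
```

Every chain element satisfies the constraint by construction, and by the symmetry of the target the two components have essentially equal sample means; doing the same in a whole-vector MH scheme would require rejecting entire proposals whenever any single component went negative.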
A basic principle of all digital image processing is the use of an imaging grid, so that in a two-dimensional case the image consists of a number of picture elements or pixels. A three-dimensional object is modelled in a three-dimensional grid that consists of volumetric pixels known as voxels. Each pixel or voxel is associated with a value, for example in X-ray tomography with an absorption coefficient value, which at the beginning of the calculation is an unknown variable. Together the unknown variables constitute unknown vectors and matrices. The reconstruction process thus involves a large number of matrix calculations, in which the required number of processor operations scales quadratically or cubically with the number of image elements. Meaningful results are thus difficult to obtain in reasonable time and with reasonable computing resources. As an example, Gibbs sampling has been considered fast enough for practical applications only in a limited number of distributions.