In many fields, for example speech signal processing, artificial intelligence, telecommunications and medical imaging, it is desired to obtain estimates of statistical information such as probability distributions from discrete, sampled signals, such as digitised speech, still images or video. Examples of probability distributions are probability density functions (PDFs) and cumulative distribution functions (CDFs). For a one-dimensional signal y=f(x), the PDF gives the density of samples of that signal having particular values. As an example, to assist understanding of the background of the present invention, for the signal y=sin(x), the PDF can be calculated analytically and is plotted in FIG. 1. FIG. 1 also plots the CDF, which is obtained by integrating the PDF and gives the probability that the signal has a value less than a particular y value. For example, for y=1, the CDF is 1 because the function sin(x) is always less than or equal to 1. For y=0, the CDF is 0.5 because there is a probability of 0.5 that a sample of the signal will be less than zero.
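As an illustrative sketch only, the analytic distributions referred to above for y=sin(x) can be written down under the assumption that x is uniformly distributed over a whole number of periods. Under that assumption the PDF is the arcsine density 1/(π√(1−y²)) and the CDF is 1/2+arcsin(y)/π. The function names sine_pdf and sine_cdf are hypothetical and are not taken from FIG. 1.

```python
import math


def sine_pdf(y):
    """Density of y = sin(x), assuming x is uniform over whole periods."""
    if abs(y) > 1.0:
        return 0.0          # the signal never takes values outside [-1, 1]
    if abs(y) == 1.0:
        return math.inf     # the density is unbounded at the turning points
    return 1.0 / (math.pi * math.sqrt(1.0 - y * y))


def sine_cdf(y):
    """Probability that sin(x) is less than or equal to y (same assumption)."""
    if y <= -1.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    return 0.5 + math.asin(y) / math.pi
```

With these definitions, sine_cdf(1.0) evaluates to 1 and sine_cdf(0.0) to 0.5, matching the two values discussed above, while sine_pdf exceeds 1 close to y=±1.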
For real-world signals that are sampled and whose analytic form is not known, a conventional method of PDF estimation is histogramming. The possible values of the signal are divided into a number of ranges known as bins. For each bin, a count is kept of how many of the available samples of the signal fall within the range of that bin. A histogram can then be plotted of the number of samples falling within each bin divided by the total number of samples. FIG. 2 shows an example of such a histogram for the signal y=sin(x). In this case, the signal values y, which must lie between −1 and +1, are divided into 64 bins, each with a bin-width of 1/32. It can be seen that this histogram gives an approximation to the PDF. The continuous line representing the PDF is superimposed on FIG. 2. The PDF has been normalized by a weighting factor, on the assumption that the point evaluation of the PDF is constant across the width of each bin. Each bin value of a histogram represents the probability that the signal lies between two value points, the upper and lower bin boundaries, and therefore cannot exceed 1. The histogram is a piecewise constant function. A PDF, on the other hand, is a continuous function whose value at a point represents the density of values which that function (signal) passes through, and hence can be greater than 1. In the limit of zero bin-width, the histogram (normalized by the bin-width) tends to the PDF.
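The histogramming procedure described above can be sketched as follows. This is a minimal illustration, not the method used to produce FIG. 2: the function name histogram, the choice of 100,000 samples and the uniform sampling of one period of sin(x) are all assumptions made for the example.

```python
import math


def histogram(samples, n_bins=64, lo=-1.0, hi=1.0):
    """Return (bin probabilities, bin width) for samples lying in [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for s in samples:
        k = int((s - lo) / width)
        k = max(0, min(k, n_bins - 1))  # keep boundary values in the end bins
        counts[k] += 1
    total = len(samples)
    return [c / total for c in counts], width


# Assumed test signal: one period of sin(x), uniformly sampled.
n = 100_000
samples = [math.sin(2.0 * math.pi * (i + 0.5) / n) for i in range(n)]

probs, width = histogram(samples)       # bin values are probabilities
density = [p / width for p in probs]    # dividing by the width gives a density
```

Here probs holds the bin values, each of which is a probability and so cannot exceed 1, whereas density is the same histogram divided by the bin-width, which is the quantity comparable with the PDF and which can exceed 1 in the bins nearest ±1.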
Probability distribution estimation techniques generally fall into one of three categories: parametric, non-parametric and semi-parametric. Parametric techniques are suitable where a particular form of function can be assumed for application-specific reasons. For example, Rician and Rayleigh functions are often used in medical signal processing applications, e.g. ultrasound imaging. However, in most applications such analytical forms for the PDF are either not known or not appropriate.
Probably the simplest and most widely used non-parametric technique is histogramming, as explained above. However, this technique has a number of associated problems. The number of bins must be defined in advance and the arbitrary bin boundaries must be specified, both of which render the histogram sensitive to slight displacements of the signal, and the resulting PDF estimate is block-like in nature. Furthermore, the resulting PDF estimates tend to be poor and require large numbers of samples to produce stable estimates (typically the number of samples must considerably exceed the number of bins). Conventionally, it is widely assumed that only the samples available can be used and hence, for a given portion of the signal, the number of samples is fixed. For example, if it were desired to estimate the PDF of the part of an image corresponding to a face, and this was represented by 50 pixels, then conventional methods would use only those 50 pixels. A number of techniques to repair these problems are available, such as Parzen windowing, in which the PDF is approximated as the superposition of kernels placed at domain positions corresponding to the (co-domain) value of each sample. However, they do not work well in general. For example, Parzen windowing avoids arbitrary bin assignments and leads to smoother PDFs, but a suitable kernel shape and size must be chosen. Conventionally this choice has been somewhat arbitrary and non-systematic, and so does not give stable or universally predictable results.
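To make the dependence on the kernel concrete, the following is a minimal sketch of Parzen windowing. The Gaussian kernel, the function name parzen_pdf, the two bandwidth values and the use of 50 samples of sin(x) (echoing the 50-pixel example above) are assumptions chosen for illustration only.

```python
import math


def parzen_pdf(samples, y, bandwidth):
    """Parzen estimate at y: the average of Gaussian kernels, one per sample."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((y - s) / bandwidth) ** 2) for s in samples
    )


# Assumed data: 50 uniformly spaced samples of one period of sin(x).
samples = [math.sin(2.0 * math.pi * (i + 0.5) / 50) for i in range(50)]

narrow = parzen_pdf(samples, 0.0, bandwidth=0.02)  # small kernel: spiky estimate
wide = parzen_pdf(samples, 0.0, bandwidth=0.3)     # large kernel: smooth estimate
```

The two estimates at y=0 differ substantially even though they are computed from the same 50 samples, which illustrates why an arbitrary choice of kernel size leads to results that are not stable or predictable.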
Semi-parametric techniques, such as Gaussian mixture models, offer a compromise between the parametric and non-parametric approaches, whereby the superposition of a number of parametric densities is used to approximate the underlying density.
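As a brief sketch of what such a superposition looks like, a Gaussian mixture density can be evaluated as a weighted sum of Gaussian components. The function name mixture_pdf and the two-component parameters below are hypothetical values chosen for illustration; in practice the weights, means and standard deviations would be fitted to the data, which is not shown here.

```python
import math


def gaussian_pdf(y, mean, std):
    """Density of a single Gaussian component."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(y, components):
    """Weighted superposition of Gaussian densities; weights should sum to 1."""
    return sum(w * gaussian_pdf(y, m, s) for w, m, s in components)


# Hypothetical (weight, mean, standard deviation) triples for a bimodal density.
components = [(0.5, -0.8, 0.15), (0.5, 0.8, 0.15)]
value_at_zero = mixture_pdf(0.0, components)
```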
Thus the current techniques suffer from the problem that the PDF can only be estimated from large parts of the signal (to obtain enough samples), and the resulting PDF estimates often exhibit poor stability (if the signal is shifted, the PDF estimate changes), poor accuracy, and poor resolution (limited to the bin width); conventional techniques also require the careful setting of several parameters, such as bin-widths or smoothing kernel shapes and sizes. However, according to the invention, it has been realized that some of these limitations and problems arise from the fact that these techniques do not use all of the information in the sampled signal.
The Whittaker-Shannon sampling theorem states that a band-limited continuous signal y(t) can be uniquely reconstructed from its samples (assumed to be taken periodically) as long as the sampling rate Fs satisfies the relationship Fs≧2B, where B is the highest frequency present in y(t). When Fs=2B the signal is said to be critically sampled and the respective sampling frequency is referred to as the Nyquist rate. Since real-world signals have infinite bandwidth (in theory at least), they do not have an upper limit on their frequency. Therefore, they must be low-pass filtered prior to sampling in order to avoid corruption of the reconstruction known as aliasing. In such a case, the reconstruction is of the band-limited signal, with its bandwidth chosen appropriately for the application. For example, since speech can generally be understood with ease even if the signal is cut off at 4 kHz, telephone quality speech can be sampled at a rate of 8 kHz. Similarly, for a signal representing an image, filtering (in this case spatial filtering) is performed by the optics, such as the camera lens and aperture.
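The reconstruction referred to above can be sketched with the Whittaker-Shannon interpolation formula, in which each sample weights a shifted sinc function. This is a minimal illustration under stated assumptions: the function names, the 5 Hz test tone and the 50 Hz sampling rate are hypothetical, and because only a finite number of samples is summed the result is approximate rather than exact.

```python
import math


def meets_sampling_criterion(fs, b):
    """True if the sampling rate fs satisfies Fs >= 2B."""
    return fs >= 2.0 * b


def sinc(x):
    """Normalized sinc function, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, fs, t):
    """Approximate y(t) from periodic samples y(n / fs) by sinc interpolation."""
    return sum(y_n * sinc(fs * t - n) for n, y_n in enumerate(samples))


# Assumed test signal: a 5 Hz sine sampled at 50 Hz for 400 samples.
fs, f0 = 50.0, 5.0
samples = [math.sin(2.0 * math.pi * f0 * n / fs) for n in range(400)]

t_mid = 200.5 / fs                          # halfway between two samples
estimate = reconstruct(samples, fs, t_mid)
true_value = math.sin(2.0 * math.pi * f0 * t_mid)
```

The estimate is evaluated at an instant that was never sampled, yet it lies close to the true value of the signal there. The telephone example above corresponds to meets_sampling_criterion(8000.0, 4000.0).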
Essentially, band-limited signals of practical interest, such as speech or images, can be reconstructed exactly given three pieces of information: (1) the samples; (2) their order; and (3) the sampling pre-filter characteristics.
Often, conventional techniques for PDF estimation, such as histogramming, Parzen windowing and Gaussian mixture models, assume that the samples are independent and identically distributed (IID) samples from some continuous underlying PDF which is to be estimated. However, this assumption is not true for band-limited signals sampled at or above the Nyquist rate. Essentially these methods use only the first piece of information, i.e. the sample values, and so disregard the remaining information. For example, given a sample value of a signal at one point, the next consecutive sample cannot take an arbitrary value selected from the probability distribution because of constraints such as the frequency band limitation. This loss of information caused by the IID assumption leads to poor system performance in which, for example, the number of samples determines the quality of the final estimate.
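The point that the sample values alone disregard the order of the samples can be illustrated with a small sketch. The test signal, the fixed random seed and the function name lag1_correlation are assumptions made for this example. Shuffling the samples leaves the set of values, and therefore any histogram of them, unchanged, but destroys the dependence between consecutive samples that a band-limited signal exhibits.

```python
import math
import random


def lag1_correlation(x):
    """Correlation between each sample and the next one (mean removed)."""
    mean = sum(x) / len(x)
    d = [v - mean for v in x]
    num = sum(a * b for a, b in zip(d, d[1:]))
    den = sum(a * a for a in d)
    return num / den


# Assumed signal: a slowly varying sine, 20 samples per period.
signal = [math.sin(2.0 * math.pi * 0.05 * n) for n in range(1000)]

shuffled = signal[:]
random.Random(0).shuffle(shuffled)          # same values, order discarded

r_signal = lag1_correlation(signal)         # close to 1: strongly dependent
r_shuffled = lag1_correlation(shuffled)     # close to 0: looks independent
```

An estimator that treats the samples as IID produces the same result for signal and for shuffled, even though only the former is consistent with the band limitation.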