This application claims priority under 35 U.S.C. §119(b) to British Patent Application No. 0103030.3 filed Feb. 7, 2001, the disclosure of which is incorporated herein by reference in its entirety.
The invention relates to a method of processing a spectrum, in particular an elastic scattering spectrum taken from tissue and to apparatus including a spectrum processor for carrying out the method.
Elastic scattering spectroscopy is a known technique for investigating tissue. In essence, light is shone into human tissue, generally living human tissue, and a photoreceptor measures the light transmitted to the photoreceptor through scattering in the tissue. The spectrum of light passing through the tissue is then recorded, and used to assist in diagnosis of any of a number of medical conditions that the patient may have. Thus, the technique may be described as optical biopsy.
Prior art apparatus for carrying out optical biopsy is presented in WO98/27865 to David Benaron, and in U.S. Pat. No. 5,303,026 to Stroble et al. The latter patent describes a system having a light source feeding into a reference optical fiber and a probe optical fiber. The probe optical fiber is brought to a probe tip. The probe tip has another optical fiber arranged adjacent to it, which collects light and brings it to a detection system that compares its intensity to the intensity of light on the reference optical fiber. When the probe tip is brought against human tissue, the detection system can record the difference between the reference signal strength and that of the light scattered by the human tissue as a function of wavelength to obtain an optical biopsy spectrum.
The use of an elastic scattering spectrum to diagnose a number of medical conditions is described in a number of papers. Zhengfang Ge et al describe in the paper “Identification of Colonic Dysplasia and Neoplasia by Diffuse Reflectance Spectroscopy and Pattern Recognition Techniques”, Applied Spectroscopy Volume 52 number 6 (1998) p 833, a method of identifying colonic dysplasia and neoplasia. The paper describes a number of different pattern recognition techniques used to evaluate the samples.
One of these is multiple linear regression analysis, which is used to fit to reflectance intensities measured at 26 different wavelengths every 16 nm in the range 350 nm to 750 nm. An output score is obtained from the formula:

$$\mathrm{score} = k + \sum_{j=1}^{26} a_j \, D_i(\lambda_j)$$

The coefficients $a_j$ are fitted coefficients arranged such that the score is +1 for adenomatous polyps and −1 for hyperplastic polyps. $D_i(\lambda_j)$ is the reflectance value for the ith tissue sample at the jth wavelength.
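By way of illustration only, the score formula above may be sketched in Python as follows. The function names and the zero decision threshold are assumptions made for this sketch; the text states only that the coefficients are fitted so that the score is +1 or −1 for the two polyp types.

```python
# Illustrative sketch only: the names and the zero threshold are assumptions.

# 26 wavelengths every 16 nm from 350 nm to 750 nm, as stated in the text.
WAVELENGTHS_NM = [350 + 16 * j for j in range(26)]


def regression_score(reflectance, coefficients, intercept):
    """Return k + sum_j a_j * D_i(lambda_j) for one tissue sample."""
    if len(reflectance) != len(coefficients):
        raise ValueError("one coefficient is needed per wavelength")
    return intercept + sum(a * d for a, d in zip(coefficients, reflectance))


def polyp_class(score, threshold=0.0):
    """Assumed decision rule: positive scores are read as adenomatous."""
    return "adenomatous" if score > threshold else "hyperplastic"
```

In use, the 26 reflectance values of a sample would be passed together with 26 previously fitted coefficients and the constant k.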
Another approach described in the paper by Zhengfang Ge et al is linear discriminant analysis. This is a method of classifying a test into one of k groups using a classification score that can be computed from a formula. The test is classified into the group which gives the lowest classification score.
A test object $X_i = (x_1, x_2, \ldots, x_d)$ containing d independent integers is assigned to one of k groups using the classification score defined as

$$D_k^2 = (X_i - \mu_k)^T M^{-1} (X_i - \mu_k)$$

where $M^{-1}$ is the inverse of the pooled covariance matrix over all classes

$$M = \frac{1}{n} \sum_k \sum_{i=1}^{n_k} (X_i - \mu_k)(X_i - \mu_k)^T.$$
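The classification score may be sketched, by way of example only, using the Python standard library. All function names and the two small training groups are hypothetical. The sketch solves M y = (X − μ) by Gaussian elimination rather than forming the inverse explicitly, which gives the same score.

```python
# Illustrative sketch only: names and data are hypothetical.

def mean_vector(samples):
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def pooled_covariance(groups):
    """M = (1/n) * sum_k sum_i (X_i - mu_k)(X_i - mu_k)^T over all groups."""
    d = len(groups[0][0])
    n = sum(len(g) for g in groups)
    M = [[0.0] * d for _ in range(d)]
    for g in groups:
        mu = mean_vector(g)
        for x in g:
            diff = [xi - mi for xi, mi in zip(x, mu)]
            for r in range(d):
                for c in range(d):
                    M[r][c] += diff[r] * diff[c] / n
    return M


def solve(M, b):
    """Solve M y = b by Gaussian elimination with partial pivoting."""
    d = len(b)
    A = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("pooled covariance matrix is singular")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d + 1):
                A[r][c] -= f * A[col][c]
    y = [0.0] * d
    for r in range(d - 1, -1, -1):
        s = sum(A[r][c] * y[c] for c in range(r + 1, d))
        y[r] = (A[r][d] - s) / A[r][r]
    return y


def classification_score(x, mu, M):
    """D_k^2 = (X - mu_k)^T M^-1 (X - mu_k)."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    return sum(di * yi for di, yi in zip(diff, solve(M, diff)))


def classify(x, groups):
    """Return the index of the group with the lowest score, and all scores."""
    M = pooled_covariance(groups)
    scores = [classification_score(x, mean_vector(g), M) for g in groups]
    return min(range(len(groups)), key=lambda k: scores[k]), scores


group_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
group_b = [(4.0, 4.0), (5.0, 4.0), (4.0, 5.0), (5.0, 5.0)]
best_group, scores = classify((1.0, 1.0), [group_a, group_b])
```

The test point lies much closer to the first group, so that group gives the lowest score and is selected.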
A third approach is backpropagating neural network analysis using a multilayer neural network with n input nodes, a hidden layer and an output layer. Neural network techniques have been widely reported and will not be discussed further here.
Other papers describe the use of elastic scattering spectroscopy in the diagnosis of a number of conditions. Backman et al describe the detection of precancerous epithelial cells in “Detection of Preinvasive Cancer Cells”, Nature, vol 406 p35 (2000). Perelman et al, in “Observation of Periodic Fine Structure in Reflectance from Biological Tissue: A New Technique for Measuring Nuclear Size Distribution”, Phys. Rev. Lett. vol 80 p627 (1998) describe periodic fine structure in mucosal membranes. The diagnosis of bladder cancer is described in “Spectroscopic Diagnosis of Bladder Cancer with Elastic Light Scattering”, Mourant et al, Lasers in Surgery and Medicine, Volume 17 page 350 (1995). The use of elastic scattering to diagnose pathologies in the gastrointestinal tract is described in “Elastic Scattering Spectroscopy as a diagnostic tool for differentiating pathologies in the Gastrointestinal tract: preliminary testing”, Mourant et al, Journal of Biomedical Optics, Vol 1 p192, and in “Ultraviolet and visible spectroscopies for tissue diagnostics: fluorescence spectroscopy and elastic scattering spectroscopy”, Bigio and Mourant, Phys. Med. Biol. Volume 42 p803 (1997).
It is thus clear that the use of elastic scattering spectroscopy is attracting interest as a diagnostic tool. In spite of this research interest the most reliable approach presently used for detection of cancer in tissue and other conditions is histology. However, this is time consuming and laborious and in many situations, especially for diagnosing cancer, multiple biopsies may be needed.
There is thus a need to develop optical techniques further. One application is to guide conventional biopsies, avoiding false negatives and reducing the number taken while increasing yield. The long-term goal is to develop the techniques to a point where they can be used rapidly, efficiently and reliably to diagnose conditions without the need for histology.
According to the invention there is provided a method of processing a broadband elastic scattering spectrum obtained from tissue comprising the steps of: obtaining, in a plurality of fitting ranges of wavelength, fitting parameters giving the best fit to the spectrum in the respective fitting ranges; and recording the fitting parameters as a parameter data set representing the spectrum; wherein in at least one fitting range the fit is to the absorption of at least one predetermined component, and in the remainder of the fitting ranges the fit is to a smooth function.
By fitting in a number of different fitting regions to known absorption spectra and to a smooth function, a measured spectrum including a large number of data points can be reduced to a very much smaller number of data points, i.e. the fitting parameters. Subsequent data processing using the fitting parameters instead of the whole spectrum (as used in the prior art discussed above) may allow simpler, more reliable and more rapid assessment of the patient's condition.
The method can be thought of as using model dependent fitting, i.e. of analysing the spectrum using a model of the absorption with certain absorbing components absorbing at certain frequencies before carrying out any diagnosis or discrimination.
The fit to the absorption of at least one predetermined component may be to the absorption line shape of the at least one predetermined absorbing component. In particular, the fit may use a parabolic approximation to the peak of absorption of that absorbing component.
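One simple way to realise a parabolic approximation to an absorption peak, offered here purely as a sketch, is to pass a parabola through three equally spaced samples around the maximum. The function name, the three-point choice and the sample values are assumptions; a least-squares parabola over more points would serve equally well.

```python
# Illustrative sketch only: a three-point parabola is one possible choice.

def parabolic_peak(w_center, step, y_left, y_center, y_right):
    """Fit a parabola through three equally spaced samples.

    Returns (peak wavelength, peak height, second derivative).
    """
    curvature = y_left - 2.0 * y_center + y_right
    if curvature == 0.0:
        raise ValueError("samples are collinear; no parabolic peak")
    offset = 0.5 * (y_left - y_right) / curvature  # in units of step
    peak_wavelength = w_center + offset * step
    peak_height = y_center - 0.25 * (y_left - y_right) * offset
    return peak_wavelength, peak_height, curvature / step ** 2


def example_absorption(w):
    """Hypothetical parabolic absorption peaking at 416 nm."""
    return 1.0 - 0.01 * (w - 416.0) ** 2


peak_w, peak_a, second_derivative = parabolic_peak(
    415.0, 5.0,
    example_absorption(410.0),
    example_absorption(415.0),
    example_absorption(420.0),
)
```

The three returned numbers could then serve as the fitting parameters recorded for that fitting range.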
The fit to the absorption of at least one predetermined component may be a fit to an absorption spectrum previously measured using an optical biopsy probe on a sample of the predetermined absorbing component in a tissue-like matrix. This absorption spectrum in general differs, owing to scattering, from the simple absorption spectrum obtained from a conventional optical transmission cell and available in most textbooks. The use of a spectrum measured using an optical biopsy probe on a sample in a tissue-like substrate has not previously been suggested, as far as the inventor is aware.
Alternatively, especially for single component systems, the fit may be nothing more than determining the excess of absorption in the spectrum at a predetermined frequency over the background spectral lineshape due to scattering calculated by extrapolating a straight line fit in a neighbouring region of the spectrum. The predetermined frequency is preferably the peak absorption wavelength of the absorbing component.
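The excess-absorption calculation just described may be sketched as follows. The function names, the synthetic spectrum and the choice of the nearest measured point to the peak wavelength are assumptions made for illustration.

```python
# Illustrative sketch only: names and the synthetic data are hypothetical.

def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two points are needed for a line fit")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all x values are identical")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def excess_absorption(wavelengths, absorption, baseline_range, peak_wavelength):
    """Absorption at the peak minus a line extrapolated from a neighbouring range."""
    lo, hi = baseline_range
    pts = [(w, a) for w, a in zip(wavelengths, absorption) if lo <= w <= hi]
    slope, intercept = fit_line([p[0] for p in pts], [p[1] for p in pts])
    # Use the measured point nearest to the predetermined peak wavelength.
    idx = min(range(len(wavelengths)),
              key=lambda i: abs(wavelengths[i] - peak_wavelength))
    return absorption[idx] - (slope * wavelengths[idx] + intercept)


# Synthetic example: sloping background with one absorbing point at 650 nm.
wavelengths = [float(w) for w in range(600, 701, 10)]
absorption = [1.0 - 0.001 * w for w in wavelengths]
absorption[5] += 0.2
excess = excess_absorption(wavelengths, absorption, (660.0, 700.0), 650.0)
```

Here the straight line is fitted between 660 nm and 700 nm and extrapolated back to 650 nm, so the single returned number is the one fitting parameter for the component.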
The fit to a smooth function is preferably to a straight line. Such fits are straightforward to carry out and with suitable choice of fitting ranges can parameterise an absorption slowly varying with wavelength, i.e. the spectral lineshape due to scattering in the absence of any absorption features.
The fitting ranges, taken together, preferably include at least 60%, further preferably 80% of the wavelength range of the complete broadband elastic spectrum, at least in the range in which the spectrum has been measured with reasonable accuracy. In this way substantially all of the measured spectrum may be parameterised.
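Whether a chosen set of fitting ranges meets the 60% or 80% preference can be checked with a short calculation such as the sketch below. The function name is hypothetical, the example ranges are taken from the ranges mentioned elsewhere in this specification, and the measured range of 320 nm to 810 nm is an assumption. Overlapping ranges are counted once.

```python
# Illustrative sketch only: the measured range is an assumption.

def coverage_fraction(fitting_ranges, measured_range):
    """Fraction of the measured wavelength range covered by the union of ranges."""
    lo, hi = measured_range
    clipped = sorted(
        (max(a, lo), min(b, hi))
        for a, b in fitting_ranges
        if min(b, hi) > max(a, lo)
    )
    covered = 0.0
    cur_start = cur_end = None
    for a, b in clipped:
        if cur_end is None or a > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = a, b
        else:
            cur_end = max(cur_end, b)  # overlapping range: extend, do not double count
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / (hi - lo)


ranges_nm = [(320, 330), (340, 360), (320, 620), (630, 810)]
fraction = coverage_fraction(ranges_nm, (320, 810))
```

With these example ranges the covered fraction is 480/490, comfortably above the 80% figure.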
The method may also include, after obtaining fitting parameters in one fitting range, calculating a modified spectrum to compensate for the shape of the spectrum represented by the fitting parameters in that fitting range, and using the modified spectrum when fitting parameters in at least one further fitting range. This may be done by inputting the recorded fitting parameters in that fitting frequency range into a model of the absorption, calculating the expected absorption spectrum determined by the model with the input fitting parameters and subtracting the calculated absorption spectrum from the initial spectrum to obtain the modified spectrum used when fitting parameters in at least one further fitting range.
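The sequence of fitting, subtracting and refitting may be sketched as follows. The Gaussian line shape, the synthetic spectrum and all names are assumptions made for this illustration. The amplitude step regresses the spectrum on the known line shape with a constant offset; it recovers the amplitude exactly here only because the fitting range is symmetric about the peak, and a sloping background would bias it on an asymmetric range.

```python
# Illustrative sketch only: line shape, data and names are hypothetical.
import math


def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all x values are identical")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def remove_component(spectrum, shape, amplitude):
    """Subtract the modelled absorption (amplitude * shape) from the spectrum."""
    return [s - amplitude * g for s, g in zip(spectrum, shape)]


wavelengths = [500.0 + 5.0 * i for i in range(21)]  # 500 nm to 600 nm
shape = [math.exp(-0.5 * ((w - 550.0) / 15.0) ** 2) for w in wavelengths]
spectrum = [0.5 + 0.002 * (w - 500.0) + 0.3 * g
            for w, g in zip(wavelengths, shape)]

# Step 1: fit the known absorber in its fitting range (slope = amplitude).
amplitude, _ = fit_line(shape, spectrum)

# Step 2: calculate the modified spectrum with that absorption removed.
modified = remove_component(spectrum, shape, amplitude)

# Step 3: fit a straight line to the modified spectrum in a further range,
# which here coincides with the first range.
slope, intercept = fit_line(wavelengths, modified)

parameter_set = {"amplitude": amplitude, "slope": slope, "intercept": intercept}
```

The 21 measured points are thereby reduced to three recorded fitting parameters.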
It should be noted that the fitting regions may overlap. For example, it may be desired to fit to a line shape of a known absorption component in a certain fitting region and then to multiply by a predetermined function to remove that line shape. However, there may still be information in the residual intensity of the lineshape due to scattering and this can be fitted using linear fitting parameters in a fitting region that may overlap or even be identical to the fitting region used to fit to the line shape of the absorption component.
The preprocessed spectrum may be fed to a discriminating algorithm to determine whether or not the data corresponds to one or more medical conditions. The discriminating algorithm may be trained to detect a particular medical condition or to discriminate between a number of clinically similar conditions. The training will use a number of training samples. The skilled person will realise that there are a number of suitable models with a number of variable fitting parameters for implementing the discriminating algorithm. For example, a neural network approach, a linear discriminant analysis or a hierarchical cluster analysis may be used. All of these are known per se, and the first two are, for example, discussed in the paper by Zhengfang Ge et al mentioned above. It is generally the case, however, that the smaller number of model dependent fitting parameters obtained using the preprocessing method according to the invention can provide a benefit whatever fitting and diagnosis approach is used.
One reason for the improvement is the reduction in the number of points in the data set for fitting. Whether using a neural network or other discriminant analysis, the large number of points in the original data set of the whole spectrum means that a large number of training samples would be needed to train any model of the results output from the preprocessor. By reducing the number of points in the data set to be fitted less training is required and the fit can be carried out more reliably.
A preferred approach is hierarchical cluster analysis in which the n parameters of the spectrum define a unique point in n-dimensional space. Clusters of points are determined; the diagnosis corresponds to identifying in which cluster a measured spectrum point lies. Hierarchical cluster analysis has the advantage that it allows a “don't know” response, if for example the measured point is located far from any of the clusters identified. This is of advantage in preventing false diagnosis in cases where no such diagnosis can be reliably made.
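A minimal sketch of this idea, under stated assumptions, is given below. It uses a simple agglomerative (bottom-up) clustering with centroid linkage, a merge cutoff and a maximum acceptance distance; these choices, the class labels and the two-parameter training points are all hypothetical and are not prescribed by the method.

```python
# Illustrative sketch only: linkage, cutoffs, labels and data are assumptions.
import math


def centroid(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def agglomerate(points, merge_cutoff):
    """Merge the two closest clusters until the closest pair exceeds the cutoff."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > merge_cutoff:
            break
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def diagnose(point, clusters, labels, max_distance):
    """Return the label of the nearest cluster, or "don't know" if too far away."""
    distances = [math.dist(point, centroid(c)) for c in clusters]
    k = min(range(len(clusters)), key=lambda i: distances[i])
    return labels[k] if distances[k] <= max_distance else "don't know"


training = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
clusters = agglomerate(training, merge_cutoff=3.0)
labels = ["class A", "class B"]  # hypothetical labels, one per cluster

near = diagnose((0.5, 0.5), clusters, labels, max_distance=2.0)
far = diagnose((5.0, 5.0), clusters, labels, max_distance=2.0)
```

A point midway between the two clusters is returned as "don't know" rather than being forced into either class.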
A number of different absorbing components can be fitted and each absorbing component will absorb in a different wavelength range and hence a different fitting range.
One example of general application in human tissue is to fit to the haemoglobin absorption; this can be done by fitting to the saturation and the total hematocrit concentration. The saturation is defined as the percentage of oxygenated haemoglobin to the total hematocrit (oxygenated and deoxygenated haemoglobin). Such a fit to haemoglobin concentration may be carried out in a fitting range including at least part of the region of the spectrum from 320 nm to 620 nm.
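The fit to saturation and total concentration may be sketched as a two-component linear least-squares fit. The reference spectra below are hypothetical Gaussian shapes, not measured haemoglobin data, and the sketch ignores the scattering background; in practice reference spectra measured as described above would be substituted. All names are assumptions.

```python
# Illustrative sketch only: reference spectra are synthetic placeholders.
import math


def gaussian(w, center, width):
    return math.exp(-0.5 * ((w - center) / width) ** 2)


def fit_haemoglobin(spectrum, oxy_ref, deoxy_ref):
    """Least-squares fit of spectrum = a * oxy_ref + b * deoxy_ref.

    Returns (saturation in percent, total concentration a + b).
    """
    s11 = sum(o * o for o in oxy_ref)
    s12 = sum(o * d for o, d in zip(oxy_ref, deoxy_ref))
    s22 = sum(d * d for d in deoxy_ref)
    t1 = sum(o * s for o, s in zip(oxy_ref, spectrum))
    t2 = sum(d * s for d, s in zip(deoxy_ref, spectrum))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("reference spectra are not independent")
    oxy = (t1 * s22 - t2 * s12) / det
    deoxy = (s11 * t2 - s12 * t1) / det
    total = oxy + deoxy
    if total == 0.0:
        raise ValueError("no haemoglobin absorption found")
    return 100.0 * oxy / total, total


wavelengths = [float(w) for w in range(500, 601, 2)]
oxy_ref = [gaussian(w, 542.0, 10.0) + gaussian(w, 577.0, 10.0) for w in wavelengths]
deoxy_ref = [gaussian(w, 555.0, 15.0) for w in wavelengths]

# Synthetic measurement: 0.6 parts oxygenated, 0.2 parts deoxygenated.
measured = [0.6 * o + 0.2 * d for o, d in zip(oxy_ref, deoxy_ref)]
saturation, total = fit_haemoglobin(measured, oxy_ref, deoxy_ref)
```

For this synthetic input the fit returns a saturation of 75% and a total of 0.8 in the arbitrary units of the reference spectra.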
Haemoglobin gives rise to an absorption feature which may dominate the spectrum in the region of 415 nm, called the Soret band. Other constituents of tissue also absorb near this wavelength and once the “normal” Hb absorption has been removed the absorption due to these features will be observed and may be fitted. The other features may be the absorption due to components such as cytokines.
Other absorbing components are relevant to a number of different kinds of spectrum. For example, to detect breast cancer it is preferred to fit to the beta-carotene absorption spectrum in the fitting range 400-520 nm.
In some studies, exogenous dye may be introduced into tissue for diagnostic purposes and accordingly the preprocessing method can include fitting to the spectrum of the dye used. For example, for suspected cases of breast cancer, blue dye can be introduced into human tissue to trace the spread of the disease. For the dye used in studies to date, Patent V blue dye (Trade Mark), a fitting range including at least part of the range 530 nm to 720 nm is suitable.
One predetermined fitting region may be a region in the range 630 nm to 810 nm, and the fit may be a linear fit in this region. The method may also include fitting to a linear model in a number of other regions. These can include linear fitting in the range 340 nm to 360 nm, and/or the range 320 nm to 330 nm. These fittable regions will be observed after the removal of absorption features.
The spectral trace may be checked, and the spectrum rejected if the check reveals measurement errors or unsuitable data. For example, it may be advisable to check for a minimal transmitted intensity in the Soret band and to reject the spectrum if the measured transmitted intensity is substantially zero in this band. In other words, if the measured transmitted intensity is less than 10% of full scale, preferably less than 5% of full scale, the spectrum may be rejected. Another possibility is to check for interference from background illumination and to reject the spectrum if background illumination is too high. Further, the spectrum can be checked for contact between probe and tissue.
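The Soret band check may be sketched as below. The band limits of 405 nm to 425 nm around the 415 nm feature, the function name and the sample values are assumptions; only the 10% and 5% of full scale thresholds come from the text.

```python
# Illustrative sketch only: band limits and names are assumptions.

def soret_band_ok(wavelengths, intensities, full_scale,
                  min_fraction=0.10, band=(405.0, 425.0)):
    """Return True if the lowest intensity in the band is at least
    min_fraction of full scale; otherwise the spectrum should be rejected."""
    in_band = [i for w, i in zip(wavelengths, intensities)
               if band[0] <= w <= band[1]]
    if not in_band:
        raise ValueError("no measured points inside the Soret band")
    return min(in_band) >= min_fraction * full_scale


wavelengths = [400.0, 410.0, 415.0, 420.0, 430.0]
saturated_absorption = [0.50, 0.08, 0.02, 0.07, 0.40]  # nearly zero at 415 nm
usable_spectrum = [0.60, 0.35, 0.30, 0.33, 0.55]

reject_first = not soret_band_ok(wavelengths, saturated_absorption, full_scale=1.0)
reject_second = not soret_band_ok(wavelengths, usable_spectrum, full_scale=1.0)
```

Passing `min_fraction=0.05` applies the stricter 5% criterion instead.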
The invention also relates to a method including recording an elastic scattering spectrum from tissue, and preprocessing the spectrum as described above.
The tissue may be in vivo, i.e. tissue incorporated in the living human body.
The tissue may be in vitro, i.e. tissue removed from the body.
In another aspect there is provided a method comprising the steps of recording an elastic scattering spectrum; preprocessing the spectrum using a preprocessing method to obtain a plurality of fitting parameters characterising the spectrum; testing the preprocessed spectrum using a discriminant model; and outputting a result based on the model.
The result may be an output indicating to which class, if any, the recorded elastic scattering spectrum belongs. The output may thus be a diagnosis.
In embodiments, the discriminant model may be a neural network, linear discrimination, hierarchical cluster analysis or other methods as are known to those skilled in the art.
The preferred discriminant model uses hierarchical cluster analysis. This groups data into unbounded class regions permitting “not sure” diagnostic indications, rather than forcing a decision and risking a false diagnosis.
In another aspect, the invention relates to a method comprising the steps of recording an elastic scattering spectrum from tissue; processing the spectrum to produce a number of parameters characterising the spectrum; determining to which, if any, of a number of classes the parameterised spectrum belongs; and outputting the class, if any, to which the spectrum is determined to belong.
In a further aspect there is provided a training method comprising the steps of recording a plurality of optical biopsy spectra from tissue for which it is known whether the tissue displays a predetermined medical condition; preprocessing each of the spectra using a method as defined above; and training a discriminant model using the preprocessed spectra.
In another aspect there is provided an apparatus for optical biopsy, comprising apparatus for elastic scattering spectroscopy of tissue, including a light source for emitting light over a broad range of frequencies; a probe for transmitting light from the light source to tissue and for receiving light scattered in the tissue; a spectrometer for measuring the intensities of the received light as a function of frequency; and a processor for processing the measured light spectrum arranged to carry out the method as described above.
The apparatus may include a first optical fiber bringing light from the light source to a probe tip; and a second optical fiber bringing scattered light from the probe tip to the spectrometer; wherein the ends of the first and second fibers at the probe tip are arranged adjacently spaced apart by a predetermined distance.
The apparatus may include a decision processor for checking the fitted parameters against the results for one or more predetermined medical conditions and outputting the best fit medical condition based on the decision processor output.