Automated systems for the rapid and accurate screening of cytological specimens are being developed to address cost, labour and liability issues. One such example is the development of automated systems for the well-known Pap (i.e. Papanicolaou) test. The Pap test is a screening test for evidence of pre-cancerous lesions in exfoliated cervical cells. The Pap test involves a tedious, manual examination of tens of thousands of cervical epithelial cells, and as a result is costly to apply and is subject to human error. Nevertheless, the Pap test has an enviable record of reducing cervical cancer mortality in the countries where it is applied. Thus, an automated alternative has been eagerly sought.
The most successful automated Pap test systems emulate cytotechnologists, the highly trained professionals who screen this and other tests. Just as the cytotechnologist relies on the visual evaluation of cervical cells, the automated systems depend upon some type of image analysis.
For machines, image analysis typically comprises a series of four steps. First, the microscopic image is digitized to convert the image into a form that may be readily used by the electronic hardware and the software instruction set for the machine. Second, the digitized image is segmented. Segmentation involves separating the relevant portions of the digitized image from the rest of the image. In most image analysis systems, segmentation is the most difficult and crucial step in the processing sequence. Since segmentation must be done well in advance of any pattern recognition operation, the segmentation procedure must be designed to use visual keys such as edges to find and separate the important image components. The third step is known as feature extraction. Each of the segmented regions or objects in the image is subjected to a range of mathematical measures that seek to encapsulate its visual appearance in numerical form. The fourth step, known as classification, involves using the numerical features to arrive at some type of conclusion about the object's identity.
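The four steps above can be illustrated with a minimal sketch, offered only as an illustration and not as the method of any particular system. All function names, the grey-level threshold, the two features (area and mean grey level) and the "large"/"small" labels are hypothetical. For brevity, segmentation here uses a simple intensity threshold with connected-region labelling in place of the edge-based keys mentioned above.

```python
from collections import deque

def digitize(analog, levels=256):
    """Step 1: quantize intensities in [0.0, 1.0] to integer grey levels."""
    return [[min(levels - 1, int(v * levels)) for v in row] for row in analog]

def segment(image, threshold=128):
    """Step 2: label 4-connected regions of dark pixels (value < threshold).

    Thresholding is a stand-in here; it is not the edge-based approach
    described in the text.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] >= threshold or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] < threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

def extract_features(image, labels, count):
    """Step 3: reduce each labelled object to numbers (area, mean grey level)."""
    sums = {k: [0, 0] for k in range(1, count + 1)}  # label -> [area, grey total]
    for r, row in enumerate(labels):
        for c, k in enumerate(row):
            if k:
                sums[k][0] += 1
                sums[k][1] += image[r][c]
    return {k: {"area": a, "mean_grey": t / a} for k, (a, t) in sums.items()}

def classify(features, area_cutoff=3):
    """Step 4: a toy decision rule on the numerical features (hypothetical)."""
    return {k: ("large" if f["area"] >= area_cutoff else "small")
            for k, f in features.items()}

# A tiny synthetic "microscope field": bright background, two dark objects.
analog = [
    [0.9, 0.9, 0.9, 0.9, 0.9],
    [0.9, 0.2, 0.2, 0.9, 0.1],
    [0.9, 0.2, 0.2, 0.9, 0.9],
    [0.9, 0.9, 0.9, 0.9, 0.9],
]
image = digitize(analog)
labels, count = segment(image)
features = extract_features(image, labels, count)
classes = classify(features)
print(count, features, classes)
```

In a practical system each stage would be considerably more elaborate, but the data flow is the same: pixels, then labelled regions, then feature vectors, then a decision per object.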
Near Infrared (NIR) spectroscopy is an established technique for the extraction of quantitative measures in a wide variety of materials. Recently, NIR spectroscopy has been applied to human tissue samples in order to discriminate between cancerous or pre-cancerous tissue and normal tissue.
Most of the NIR absorption spectra of organic compounds are generated by the vibrational overtones or the combination bands of the fundamental O-H (oxygen-hydrogen), C-H (carbon-hydrogen), N-H (nitrogen-hydrogen), and C-C (carbon-carbon) transitions. Because these fundamental transitions fall in the mid-infrared region, their overtones and combination bands produce NIR spectra in the easily accessible range between 0.7 microns and 2.5 microns. However, these spectra are one to three orders of magnitude weaker than the associated fundamentals, and special care must therefore be taken to recover and analyze this information.
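The relationship between a mid-infrared fundamental and its NIR overtones can be checked with simple arithmetic. The sketch below assumes a harmonic approximation, in which the n-th overtone lies at (n + 1) times the fundamental wavenumber; real overtones are shifted somewhat by anharmonicity. The fundamental wavenumbers used are approximate textbook stretching values chosen for illustration and are not taken from the text above; the function names are hypothetical.

```python
NIR_RANGE_MICRONS = (0.7, 2.5)  # range quoted in the text

# Approximate stretching fundamentals in cm^-1 (illustrative assumptions).
FUNDAMENTALS_CM1 = {"C-H": 2900.0, "N-H": 3300.0, "O-H": 3400.0}

def wavenumber_to_microns(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in microns."""
    return 1.0e4 / wavenumber_cm1

def overtone_microns(fundamental_cm1, order):
    """Wavelength of the given overtone (order=1 is the first overtone),
    ignoring anharmonicity."""
    return wavenumber_to_microns((order + 1) * fundamental_cm1)

def in_nir(wavelength_microns):
    low, high = NIR_RANGE_MICRONS
    return low <= wavelength_microns <= high

for bond, nu in FUNDAMENTALS_CM1.items():
    fundamental = wavenumber_to_microns(nu)
    first = overtone_microns(nu, 1)
    second = overtone_microns(nu, 2)
    print(f"{bond}: fundamental {fundamental:.2f} um, "
          f"first overtone {first:.2f} um, second overtone {second:.2f} um")
```

Under these assumptions each fundamental lies beyond 2.5 microns, while the first and second overtones fall inside the 0.7 to 2.5 micron window.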
Known research into the NIR response of normal and pre-cancerous human tissue has uncovered a host of structural and chemical changes which may be used to discriminate between normal and pre-cancerous tissues. These features include increases in glycogen content, extensive hydrogen bonding of phosphodiester groups in nucleic acids, tighter physical packing of nucleic acids, phosphorylation of C-OH groups in carbohydrates and proteins, increased disorder of methylene chains in membrane lipids, increased ratio of methyl to methylene, reduction in the hydrogen bond strength in the amide groups of α-helical segments and an increase in the hydrogen bond strength in the amide groups of the β-sheet segments.
On the basis of these results, proposals for NIR-based cancer screening protocols have been made, including a screening protocol for the early detection of pre-cancerous lesions of the uterine cervix and carcinomatous breast tissue.