Cellular behavior is primarily dictated by the selective expression of a subset of genes. Normal growth and differentiation depend on the appropriate genes being expressed in the appropriate context. Various disease states alter gene expression relative to that of normal tissue. For example, malignant transformation of cancer tissues involves or induces altered gene expression. Through signal transduction cascades and transcriptional networks, alterations of one gene can impact a large number of genes and result in global effects on cell behavior. Regulation of translation and post-transcriptional modification play significant roles, but, invariably, signal transduction pathways lead to the nucleus and to changes in gene transcription.
Therefore, there has been enormous interest in the development of techniques that allow the analysis of differential gene expression between different tissues or cell lines. One such technique uses ordered micro-arrays that allow two-color fluorescence detection of hybridization signals. Individual DNA targets are arrayed on a small glass surface and hybridized with fluorescently labeled heterogeneous DNA probes derived from cDNA. The amount of fluorescence at each DNA spot correlates with the abundance of that DNA fragment in the probe mixture.
Using micro-arrays, the expression levels of up to thousands of genes can be quantitated simultaneously. Because hundreds of copies of the same array can be printed, numerous tissues can be easily analyzed for relative expression levels. As such, the technique provides a powerful new tool for analyzing differential gene expression in numerous biological problems. In addition to the determination of gene expression differences between tissues, genomic micro-arrays are useful for genomic mapping, for genomic ploidy measurements, and as hybridization targets for genomic mismatch scanning. Such techniques require rapid quantitative analysis of fluorescent hybridization for hundreds to tens of thousands of DNA spots. However, there is a severe bottleneck in gene expression data collection due to inadequate methods for processing individual DNA spot images to determine the quantitative fluorescent hybridization levels.
Some existing methods include manual processing of DNA spot images using a generic image processing tool, such as NIH Image. Using such a tool, a user visually locates each DNA spot image in a micro-array image, moves a display pointer to the spot image, and manually defines a small area around it. The image processing tool then reports image intensity values within the small area. The user then manually records the intensity values and repeats this process for the other visually located DNA spot images in the micro-array image.
However, such manual methods are impractical for micro-arrays with more than a handful of spot images. Further, such methods are tedious and repetitive, requiring considerable time and effort. For example, with a micro-array image having about 600 DNA spot images, such manual methods can take about 8 hours of work and result in quantification of only a limited number of spot images that visually seem to have a “good” expression level. As micro-arrays become denser and more complex, use of such methods becomes even more prohibitive. For example, current micro-array sizes range from several hundred to 1,200 genes, arrayed in a 1.8×1.8 cm area. As tip fabrication has improved, arrays with greater than 50,000 genes are viable. Such methods are also prone to various errors, including errors in manually recording the intensity values. Further, such methods provide inconsistent quantification of intensity values, both for different spot images measured by a single individual and for multiple individuals making measurements from the same micro-array image.
To alleviate the shortcomings of manual methods, some existing methods attempt to automate the process of locating DNA spot images in micro-array images and quantifying the corresponding expression values. Such methods utilize a computer with which a user manually positions a cell grid on an area of the micro-array image containing an array of DNA spot images. The grid can be resized, and individual columns and rows of the grid can be manually adjusted to better fit the arrayed pattern of DNA spot images. The grid position is then used by the computer to quantify the expression values using the intensity levels at each cell in the grid. However, such methods are inflexible, since the grid placement requires extensive user interaction to fine-tune the grid. Further, the grid used in such methods is either completely fixed in shape or has only limited global flexibility (e.g., resizing and rotating the entire grid).
Such limitations are a major handicap in most DNA array image analysis applications, since the DNA spots in a micro-array, such as that shown in FIG. 1, are never perfectly formed in a regular grid pattern. Although a robot used in spotting DNA fragments on a glass surface has positional accuracy to within ±5 µm, larger variations in the precise spacing of the arrayed DNA spots occur due to tip variations and surface interactions of the solution with the silanized surface. Moreover, printing tips are difficult to fabricate and many do not work uniformly. Therefore, as shown in FIG. 2, not only are DNA spots occasionally placed out of the regular grid pattern, but they also vary in size. It is therefore rare for a fixed grid to match exactly the pattern in the micro-array. Though in existing methods the grid can be manually resized and rotated, and a column or a row of the grid can be moved, the individual grid cells cannot be manipulated. Therefore, such methods are impractical for most DNA array image analysis applications, and especially for high density micro-arrays.
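The effect of a rigid grid on a displaced spot can be illustrated with a minimal sketch. The Python code below is illustrative only and is not taken from any existing tool; the function name quantify_fixed_grid, its parameters, and the toy image are hypothetical. It assumes an image stored as a list of pixel rows and a grid of equally sized rectangular cells, and it reports the mean intensity of each cell.

```python
from typing import List

Image = List[List[float]]  # image[y][x] is one pixel intensity


def quantify_fixed_grid(image: Image, top: int, left: int,
                        cell_h: int, cell_w: int,
                        rows: int, cols: int) -> List[List[float]]:
    """Return the mean pixel intensity of every cell in a rigid grid.

    Hypothetical helper: every cell has the same size and cannot be
    moved individually, mirroring the fixed grid described above.
    """
    means = []
    for r in range(rows):
        row_means = []
        for c in range(cols):
            y0 = top + r * cell_h
            x0 = left + c * cell_w
            pixels = [image[y][x]
                      for y in range(y0, y0 + cell_h)
                      for x in range(x0, x0 + cell_w)]
            row_means.append(sum(pixels) / len(pixels))
        means.append(row_means)
    return means


# Toy 4 x 8 image covered by two 4 x 4 cells (columns 0-3 and 4-7).
image = [[0.0] * 8 for _ in range(4)]
for y in (1, 2):
    for x in (1, 2):   # spot A: lies entirely inside the first cell
        image[y][x] = 100.0
    for x in (3, 4):   # spot B: belongs to the second cell but has drifted
        image[y][x] = 100.0  # left, so it straddles the cell boundary

grid_means = quantify_fixed_grid(image, top=0, left=0,
                                 cell_h=4, cell_w=4, rows=1, cols=2)
print(grid_means)
```

In this toy image the two spots are equally bright, yet the rigid grid credits half of the drifted spot to the neighboring cell, so the two cells report different values. Correcting this would require moving an individual cell, which the grids described above do not permit.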
Further, DNA spot image signals derived from the micro-arrays are susceptible to surface noise and to laser reflection caused by surface dust. In addition, nonspecific DNA binding to the silanized surface occurs in a non-uniform pattern, creating a varying background of fluorescence over the surface. Existing methods are unable to cope with irregular micro-array patterns, search for DNA spot images, and accurately quantify specific signals while accounting for the local background.
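What accounting for the local background can mean in practice is sketched below. This is one common approach, offered as an assumption rather than as the method of any particular system: the spot signal is taken as the mean intensity inside a square window, and the background as the median of a surrounding frame of pixels, the median being chosen so that an isolated bright dust pixel has little influence. The function name background_corrected_signal, its parameters, and the toy image are hypothetical, and bg_margin is assumed to be at least 1 so that the frame is not empty.

```python
from statistics import median
from typing import List

Image = List[List[float]]  # image[y][x] is one pixel intensity


def background_corrected_signal(image: Image, cy: int, cx: int,
                                spot_radius: int, bg_margin: int) -> float:
    """Mean of a square spot window minus the median of the frame around it.

    Hypothetical helper; assumes bg_margin >= 1 so the frame has pixels.
    """
    spot, frame = [], []
    outer = spot_radius + bg_margin
    for y in range(cy - outer, cy + outer + 1):
        for x in range(cx - outer, cx + outer + 1):
            # Skip coordinates that fall outside the image.
            if not (0 <= y < len(image) and 0 <= x < len(image[0])):
                continue
            if abs(y - cy) <= spot_radius and abs(x - cx) <= spot_radius:
                spot.append(image[y][x])
            else:
                frame.append(image[y][x])
    return sum(spot) / len(spot) - median(frame)


# Toy 7 x 7 image: uniform background of 10, a 3 x 3 spot of 60 at the
# center, and one saturated dust pixel in a corner.
image = [[10.0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        image[y][x] = 60.0
image[0][6] = 255.0

corrected = background_corrected_signal(image, cy=3, cx=3,
                                        spot_radius=1, bg_margin=2)
print(corrected)
```

Because the background is estimated separately around each spot, a background that varies across the surface is removed spot by spot rather than with a single global offset.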
Other existing methods do not use a grid at all, but apply a “spot” filter to detect locations in the micro-array image that look like DNA spot images. However, with such methods it is difficult to define what a spot should look like. Furthermore, extensive noise and variations in the spot shape, due to the processing and scanning mechanisms, significantly reduce the signal-to-noise ratio (SNR) of the spot images. Thus, the detection scheme misses many real spots and processes many false patches in the image as real DNA spot images.
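The failure modes of a grid-free spot filter can be illustrated with a deliberately crude sketch. The code below stands in for a spot filter with a simple thresholded box (mean) filter; actual spot filters may be more elaborate, and the function name box_filter_detect, the window size, the threshold, and the toy image are all hypothetical. The window size k is assumed to be odd.

```python
from typing import List, Tuple

Image = List[List[float]]  # image[y][x] is one pixel intensity


def box_filter_detect(image: Image, k: int,
                      threshold: float) -> List[Tuple[int, int]]:
    """Return centers (y, x) of k x k windows whose mean exceeds threshold.

    Hypothetical stand-in for a spot filter; assumes k is odd.
    """
    half = k // 2
    hits = []
    for cy in range(half, len(image) - half):
        for cx in range(half, len(image[0]) - half):
            total = sum(image[y][x]
                        for y in range(cy - half, cy + half + 1)
                        for x in range(cx - half, cx + half + 1))
            if total / (k * k) > threshold:
                hits.append((cy, cx))
    return hits


# Toy 5 x 12 image: a faint but real 3 x 3 spot centered at (2, 2) and a
# single saturated dust pixel at (2, 9).
image = [[0.0] * 12 for _ in range(5)]
for y in range(1, 4):
    for x in range(1, 4):
        image[y][x] = 20.0
image[2][9] = 255.0

hits = box_filter_detect(image, k=3, threshold=25.0)
print(hits)
```

With these toy values the faint real spot is not detected, while every window that contains the dust pixel is reported as a spot. Lowering the threshold would recover the real spot only at the cost of admitting still more false patches.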
Another disadvantage of existing systems is their inability to display micro-array image pixel intensities (corresponding, for example, to gene expression values in related DNA spots) in an intuitive manner. As such, the user cannot easily determine gene properties from such DNA spots.
There is, therefore, a need for a DNA array image analysis method for automatically segmenting DNA array images into individual DNA spot images for quantification. There is also a need for such a method to process irregular micro-array patterns, search for DNA spot images, and accurately quantify, and intuitively display, specific signals while accounting for the local background.