A variety of techniques have been developed to analyze DNA or other biological samples to identify diseases, mutations, or other conditions present within a patient providing the sample. Such techniques may determine, for example, whether the patient has a particular disease such as cancer or AIDS, has a predisposition toward the disease, or has other medical conditions. DNA-based analysis may be used either as an in-vitro or as an in-vivo control mechanism to monitor progression of disease, assess effectiveness of therapy, or design dosage formulations. DNA-based analysis is also used to verify the presence or absence of expressed genes and polymorphisms.
One particularly promising technique for analyzing biological samples uses a DNA-based microarray (or microelectronics biochip) which generates a hybridization pattern representative of the characteristics of the DNA within the sample. Briefly, a DNA microarray includes a rectangular array of immobilized single stranded DNA fragments. Each element within the array includes from a few tens to millions of copies of identical single stranded strips of DNA containing specific sequences of nucleotide bases. Identical or different fragments of DNA may be provided at each different element of the array. For example, location (1,1) may contain a different single stranded fragment of DNA than location (1,2), which in turn differs from location (1,3), and so on. Certain biochip designs may replicate the same nucleotide sequence in multiple cells.
DNA-based microarrays deploy chemiluminescence, fluorescence or electrical phenomenology to achieve the analysis. In methods that exploit fluorescence imaging, a target DNA sample to be analyzed is first separated into individual single stranded sequences and fragmented, with each sequence being tagged with a fluorescent marker molecule. The fragments are applied to the microarray, where each fragment binds only with complementary DNA fragments already embedded on the microarray. Fragments which do not match any of the elements of the microarray simply do not bind at any of the sites of the microarray and are discarded during subsequent fluidic reactions. Thus, only those microarray locations containing fragments that bind complementary sequences within the target DNA sample will receive the fluorescent molecules. Typically, a fluorescent light source is then applied to the microarray to generate a fluorescent image identifying which elements of the microarray bind to the patient DNA sample and which do not. The image is then analyzed to determine which specific DNA fragments were contained within the original sample and to determine therefrom whether particular diseases, mutations or other conditions are present in the patient sample.
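The binding rule described above can be illustrated with a minimal sketch. It assumes an idealized, all-or-nothing model in which a tagged fragment binds a probe only when it is the exact reverse complement of that probe; actual hybridization is a matter of degree. The function names, array locations and sequences are hypothetical and chosen only for illustration.

```python
# Toy model of hybridization (an assumption, not a chemical simulation):
# a labeled fragment binds a probe only if it is the probe's exact
# reverse complement. All sequences below are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the strand that pairs with seq under Watson-Crick rules."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hybridize(probes, fragments):
    """Map each array location to True if some fragment binds its probe.

    probes: dict mapping (row, col) -> immobilized probe sequence
    fragments: iterable of fluorescently tagged sample sequences
    """
    fragment_set = set(fragments)
    return {loc: reverse_complement(probe) in fragment_set
            for loc, probe in probes.items()}


probes = {(1, 1): "ACGTTGCA", (1, 2): "GGATCCAA", (1, 3): "TTTTCCCC"}
# The last fragment matches no probe and would be washed away.
fragments = ["TGCAACGT", "GGGGAAAA", "CCCCCCCC"]
pattern = hybridize(probes, fragments)
```

In this sketch, locations (1,1) and (1,3) would fluoresce and location (1,2) would remain dark, which is the binary form of the fluorescent image discussed above.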
For example, a particular element of the microarray may be exposed to fragments of DNA representative of a particular type of cancer. If that element of the array fluoresces under fluorescent illumination, then the DNA of the sample contains the DNA sequence representative of that particular type of cancer. Hence, a conclusion can be drawn that the patient providing the sample either already has that particular type of cancer or is perhaps predisposed towards that cancer. As can be appreciated, by providing a wide variety of known DNA fragments on the microarray, the resulting fluorescent image can be analyzed to identify a wide range of conditions.
Unfortunately, under conventional techniques, the step of analyzing the fluorescent pattern to determine the nature of any conditions characterized by the DNA is expensive, time consuming, and somewhat unreliable for all but a few particular conditions or diseases. One major problem with many conventional techniques is that the techniques have poor repeatability. Hence, if the same sample is analyzed twice using two different chips, different results are often obtained. Also, the results may vary from lab to lab. Consistent results are achieved only when the target sample has high concentrations of oligonucleotides of interest. Also, skilled technicians are required to prepare DNA samples, implement the hybridization protocol, and analyze the DNA microarray output, which can result in high costs. One reason that repeatability is poor is that the signatures within the digitized hybridization pattern (also known as a "dot spectrogram") that are representative of mutations of interest are typically very weak and are immersed in considerable noise. Conventional techniques are not particularly effective in extracting mutation signatures from dot spectrograms in low signal to noise circumstances. Circumstances wherein the signal to noise ratio ranges from 0 dB to strongly negative values (-2 to -30 dB) are particularly intractable.
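To give a sense of scale for the signal to noise figures quoted above, the following sketch converts between decibels and linear ratios. It assumes the common power-ratio convention (10 times the base-10 logarithm); the text does not state whether a power or an amplitude convention is intended, and the function names are hypothetical.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal to noise ratio in decibels, assuming a power-ratio convention."""
    return 10.0 * math.log10(signal_power / noise_power)


def power_ratio(db):
    """Inverse conversion: linear signal/noise power ratio for a dB value."""
    return 10.0 ** (db / 10.0)


# Under this convention, 0 dB means signal and noise power are equal,
# -2 dB means the signal carries roughly 0.63 times the noise power, and
# -30 dB means the noise power is about 1000 times the signal power.
ratios = {db: power_ratio(db) for db in (0.0, -2.0, -30.0)}
```

Under this assumption, a mutation signature at -30 dB carries about one thousandth of the power of the surrounding noise, which suggests why such signatures are difficult to extract.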
Accordingly, it would be highly desirable to provide an improved method and apparatus for analyzing the output of the DNA microarray to more expediently, reliably, and inexpensively determine the presence of any medical conditions or concerns within the patient providing the DNA sample. It is particularly desirable to provide a technique that can identify mutation signatures within dot spectrograms even in circumstances wherein the signal to noise ratio is extremely low. It is to these ends that aspects of the invention are generally drawn.
Referring now to FIG. 1, conventional techniques for designing DNA microarray chips and for analyzing the output thereof will now be described in greater detail. Initially, at step 100, fluorescently labeled primers are prepared for flanking loci of genes of interest within the DNA sample. The primers are applied to the DNA sample such that the fluorescently labeled primers flank genes of interest. At step 102, the DNA sample is fragmented at the locations where the fluorescently labeled primers are attached to the genes of interest to thereby produce a set of DNA fragments, also called "oligonucleotides", for applying to the DNA microarray.
In general, there are two types of DNA microarrays: passive hybridization microarrays and active hybridization microarrays. Under passive hybridization, oligonucleotides characterizing the DNA sample are simply applied to the DNA microarray where they passively attach to complementary DNA fragments embedded on the array. With active hybridization, the DNA array is configured to externally enhance the interaction between the fragments of the DNA samples and the fragments embedded on the microarray using, for example, electronic techniques. Within FIG. 1, both passive hybridization and active hybridization steps are illustrated in parallel. It should be understood that, currently, for any particular microarray, either the passive hybridization or the active hybridization steps, but not both, are typically employed.
Referring first to passive hybridization, at step 104 a DNA microarray chip is prefabricated with oligonucleotides of interest embedded or otherwise attached to particular elements within the microarray. At step 106, the oligonucleotides of the DNA sample generated at step 102 are applied to the microarray. Oligonucleotides within the sample that match any of the oligonucleotides embedded on the microarray passively bind with the oligonucleotides of the array while retaining their fluorescently labeled primers, such that only those locations in the microarray having corresponding oligonucleotides within the sample receive the primers. It should be noted that each individual nucleotide base within the oligonucleotide sequence (with lengths ranging from 5 to 25 base pairs) can bond with up to four different nucleotides within the microarray, but only one oligonucleotide represents an exact match. When illuminated with fluorescent light, the exact matches fluoresce most effectively and the non-exact matches fluoresce considerably less or not at all.
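The distinction between exact and non-exact matches may be illustrated with a brief sketch. It reads the passage above as describing, for a given base position, up to four array oligonucleotides that differ only in the base at that position, of which one is an exact match. That reading, the helper names and the sequence are assumptions made for illustration; mismatch counting stands in for the relative fluorescence, and for simplicity each probe is written as the sequence it is designed to capture.

```python
BASES = "ACGT"


def position_variants(probe, index):
    """Return the four probes that differ from probe only at index."""
    return [probe[:index] + base + probe[index + 1:] for base in BASES]


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 10-base sequence present in the sample.
target = "ACGTACGTAC"
variants = position_variants(target, 4)
scores = {v: mismatches(v, target) for v in variants}
exact = [v for v, s in scores.items() if s == 0]
```

Here exactly one of the four variants has zero mismatches and would be expected to fluoresce most strongly; the remaining three each carry a single mismatch and would be expected to fluoresce considerably less or not at all.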
At step 108, the DNA microarray with the sample loaded thereon is placed within a fluidics station provided with chemicals to facilitate the hybridization reaction, i.e., the chemicals facilitate the bonding of the oligonucleotide sample with corresponding oligonucleotides within the microarray. At step 110, the microarray is illuminated under fluorescent light, perhaps generated using an argon-ion laser, and the resulting fluorescent pattern is digitized and recorded. Alternately, a photograph of the fluorescent pattern may be taken, developed, then scanned into a computer to provide a digital representation of the fluorescent pattern. In any case, at step 112, the digitized pattern is processed using dedicated software programs which operate to focus the digital pattern and to subsequently quantize the pattern to yield a fluorescent intensity value for each element within the microarray pattern. At step 114, the resulting focused array pattern is processed using additional software programs which compute an average intensity value at each array location and provide for necessary normalization, color compensation and scaling. Hence, following step 114, a digitized fluorescent pattern has been produced identifying locations within the microarray wherein oligonucleotides from the DNA sample have bonded. This fluorescent pattern is referred to herein as a "dot spectrogram".
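The averaging and normalization of step 114 can be sketched as follows. The sketch assumes that each array element occupies a square block of pixels in the digitized image and that normalization is a simple min-max scaling to the range 0 to 1; neither detail is specified above, and color compensation is omitted. The function names and pixel values are hypothetical.

```python
def cell_means(image, cell_size):
    """Average pixel intensity over each cell_size x cell_size block."""
    rows = len(image) // cell_size
    cols = len(image[0]) // cell_size
    means = []
    for r in range(rows):
        row_means = []
        for c in range(cols):
            pixels = [image[r * cell_size + i][c * cell_size + j]
                      for i in range(cell_size)
                      for j in range(cell_size)]
            row_means.append(sum(pixels) / len(pixels))
        means.append(row_means)
    return means


def normalize(grid):
    """Min-max scale values to [0, 1] (an assumed normalization)."""
    flat = [value for row in grid for value in row]
    low, high = min(flat), max(flat)
    span = high - low
    if span == 0:
        # A uniform image carries no contrast; map every element to 0.
        return [[0.0 for _ in row] for row in grid]
    return [[(value - low) / span for value in row] for row in grid]


# Hypothetical 4x4 pixel image covering a 2x2 microarray.
image = [
    [10, 12, 200, 204],
    [14, 12, 196, 200],
    [50, 54, 10, 10],
    [46, 50, 10, 10],
]
means = cell_means(image, 2)
dots = normalize(means)
```

The resulting grid holds one scaled value per array location and is a simplified stand-in for the dot spectrogram; an actual processing chain would also need to handle image registration and background correction.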
In existing biochips that actively initiate, facilitate or selectively block hybridization, a DNA microarray is prefabricated for active hybridization at step 116. At step 118, the DNA sample is applied to the active array and electronic signals are transmitted into the array to help facilitate bonding between the oligonucleotides of the sample and the oligonucleotides of the array. The microarray is then placed, at step 120, within a fluidics station which further facilitates the bonding. Thereafter, at step 122, an electronic or fluorescent readout is generated from the microarray. When electrical output signals from the biochip array are used to quantify and classify the post-hybridization output, the output signal is indicative of the number of oligonucleotide fragments bonded to each site within the array. At step 124, the electronic output is processed to generate a dot spectrogram similar or identical to the dot spectrogram generated using the optical readout technique of steps 110-114. Hence, regardless of whether steps 104-114 or steps 116-124 are performed, the result is a dot spectrogram representative of oligonucleotides present within the DNA sample. Here it should be noted that some conventional passive hybridization DNA microarrays provide electronic output and some active hybridization microelectronic arrays provide optical readout. Thus, for at least some techniques, the output of step 108 is processed in accordance with steps 122 and 124. For other techniques, the output of step 120 is processed in accordance with steps 110-114. Again, the final results are substantially the same, i.e., a dot spectrogram.
At step 126, the dot spectrogram is analyzed using clustering software to generate a gene array amplitude readout pattern representative of mutations of interest within the target DNA sample. In essence, step 126 operates to correlate oligonucleotides represented by the dot spectrogram with corresponding DNA mutations. Next, at step 128, the resulting digital representation of the mutations of interest is processed using mapping software to determine whether the mutations are representative of particular diagnostic conditions, such as certain diseases or conditions. Hence, step 128 operates to perform a mutation-to-diagnostic analysis. Finally, at step 130, the diagnostic conditions detected using step 128 are evaluated to further determine whether or not the diagnostic, if any, can properly be based upon the DNA sample. Classical methods, such as probabilistic estimators (for example, a maximum a posteriori (MAP) estimator or a maximum likelihood estimator (MLE)) or inferencing mechanisms, may be used to render a diagnostic assessment.
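The difference between the two estimators named above can be shown with a small sketch over two competing hypotheses. The likelihood and prior values are hypothetical, as are the function and hypothesis names; the sketch illustrates only the generic estimators and is not the evaluation actually performed at step 130.

```python
def mle(likelihoods):
    """Maximum likelihood: pick the hypothesis that best explains the data."""
    return max(likelihoods, key=likelihoods.get)


def map_estimate(likelihoods, priors):
    """Maximum a posteriori: weight each likelihood by its prior probability.

    Returns the winning hypothesis and the normalized posterior.
    """
    unnormalized = {h: likelihoods[h] * priors[h] for h in likelihoods}
    total = sum(unnormalized.values())
    posterior = {h: p / total for h, p in unnormalized.items()}
    return max(posterior, key=posterior.get), posterior


# Hypothetical values: the observed pattern is somewhat more likely under
# "mutation_present", but the mutation is rare in the population.
likelihoods = {"mutation_present": 0.60, "mutation_absent": 0.40}
priors = {"mutation_present": 0.01, "mutation_absent": 0.99}

ml_choice = mle(likelihoods)
map_choice, posterior = map_estimate(likelihoods, priors)
```

With these assumed numbers the maximum likelihood estimator selects "mutation_present", whereas the MAP estimator selects "mutation_absent" because the low prior outweighs the modest likelihood advantage. The two estimators coincide when the prior is uniform.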
As noted above, it would be desirable to provide improved techniques for analyzing the outputs of DNA microarrays to more quickly, reliably and inexpensively yield a valid diagnostic assessment. To this end, the invention is directed primarily to providing a sequence of steps for replacing steps 114-130 of FIG. 1.