The invention generally relates to techniques for analyzing biological samples, such as DNA or RNA samples, and in particular to techniques for analyzing the output of a hybridized biochip to which the sample has been applied.
A variety of techniques have been developed to analyze DNA samples, RNA samples, or other biological samples to identify diseases, mutations, or other conditions present within a patient providing the sample. Such techniques may determine, for example, whether the patient has any particular disease such as cancer or AIDS, or has a genetic predisposition toward the disease.
One particularly promising technique for analyzing biological samples uses a DNA biochip (or microarray) which generates a hybridization pattern representative of the characteristics of the DNA within the sample. Briefly, a DNA microarray includes a rectangular array of single stranded DNA fragments. Each element within the array includes millions of copies of identical single stranded strips of DNA containing specific sequences of bases. A different fragment of DNA may be provided at each different element of the array. In other words, location (1,1) contains a different single stranded fragment of DNA than location (1,2) which also differs from location (1,3) etc.
A DNA sample to be analyzed is first fragmented into individual single stranded sequences with each sequence being tagged with a fluorescent molecule. The fragments are applied to the microarray where each fragment bonds only with matching DNA fragments already embedded on the microarray. Fragments which do not match any of the elements of the microarray simply do not bond at any of the sites of the microarray and are discarded. Thus, only those microarray locations containing fragments that match fragments within the DNA sample will receive the fluorescent molecules. Typically, a fluorescent light source is then applied to the microarray to generate a fluorescent image identifying which elements of the microarray bonded with fragments of the DNA sample and which did not. The image is then analyzed to determine which specific DNA fragments were contained within the original sample and to determine therefrom whether particular diseases, mutations or other conditions are present within the DNA sample.
For example, a particular element of the microarray may be provided with fragments of DNA representative of a particular type of cancer. If that element of the array fluoresces under fluorescent illumination, then the DNA of the sample contains the DNA sequence representative of that particular type of cancer. Hence, a conclusion can be drawn that the patient providing the sample either already has that particular type of cancer or is perhaps predisposed towards that cancer. As can be appreciated, by providing a wide variety of known DNA fragments on the microarray, the resulting fluorescent image can be analyzed to identify a wide range of conditions.
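The readout logic described above can be sketched in a few lines of Python. This is a minimal illustration only: the probe labels, the intensity values, and the single fixed threshold are hypothetical and are not taken from any particular microarray system.

```python
def hybridized_probes(intensity, probe_map, threshold):
    """Return labels of array elements whose fluorescence exceeds threshold.

    intensity: 2-D list of fluorescence readings, one per array element.
    probe_map: dict mapping (row, col) to a label for the probe at that element.
    threshold: hypothetical cutoff separating bonded from unbonded elements.
    """
    hits = []
    for r, row in enumerate(intensity):
        for c, value in enumerate(row):
            if value > threshold and (r, c) in probe_map:
                hits.append(probe_map[(r, c)])
    return hits


# Hypothetical 2 x 3 array. Indices are zero-based here, so (0, 0)
# corresponds to location (1,1) in the text.
intensity = [[0.05, 0.91, 0.10],
             [0.07, 0.02, 0.88]]
probe_map = {(0, 0): "probe_A", (0, 1): "probe_B", (0, 2): "probe_C",
             (1, 0): "probe_D", (1, 1): "probe_E", (1, 2): "probe_F"}
detected = hybridized_probes(intensity, probe_map, threshold=0.5)
```

Here only the elements holding probe_B and probe_F fluoresce above the cutoff, so those two fragments would be reported as present in the sample.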
Unfortunately, under conventional techniques, the step of analyzing the fluorescent pattern to determine the nature of any conditions characterized by the DNA is expensive, time consuming, and somewhat unreliable for all but a few particular conditions or diseases. FIG. 1 illustrates various conventional techniques for analyzing the fluorescent pattern. Some prior art systems utilize two or more of the techniques. It should be noted though that many combinations of the techniques are not provided in the prior art. Briefly, the fluorescent pattern is quantized to yield a dot spectrogram at step 10. Any of four different techniques are employed to identify oligonucleotides represented by the dot spectrogram and to then identify mutations based upon the oligonucleotides. More specifically, the dot spectrogram may be analyzed using a trained neural network recognizer 12, a statistical decision theory recognizer 14, a fuzzy/expectation method (EM) clustering algorithm 16 or a rule-based inferencing/truth table search 18.
The results are interpreted to yield mutations of interest at step 20. Then the mutations are again processed using either a trained neural network recognizer 22, a statistical decision theory recognizer 24, a fuzzy/EM clustering algorithm 26 or a rule-based inferencing/truth table search 28. The results are combined at step 30 to yield a diagnosis.
Finally, disease confirmation is performed at step 32 by “reduction”; in other words, the disease is confirmed by probabilistic inferencing.
One major problem with many conventional techniques is that the techniques have poor repeatability. Hence, if the same sample is analyzed twice, different results are often obtained. Also, the results may vary from lab to lab. Also, skilled technicians are required to operate the DNA microarrays and to analyze the output resulting in high costs. One reason that repeatability is poor is that the signatures within the dot spectrogram that are representative of mutations of interest are typically very weak and are immersed in considerable noise. Conventional techniques are not particularly effective in extracting mutation signatures from dot spectrograms in low signal to noise circumstances.
Accordingly, it would be highly desirable to provide an improved method and apparatus for analyzing the output of the DNA microarray to more expediently, reliably, and inexpensively determine the presence of any conditions within the patient providing the DNA sample. It is particularly desirable to provide a technique that can identify mutation signatures within dot spectrograms even in circumstances wherein the signal to noise ratio is extremely low. It is to these ends that aspects of the invention are generally drawn.
One analysis technique for achieving the aforementioned advantages is described in co-pending U.S. patent application Ser. No. 09/253,789, now U.S. Pat. No. 6,136,541, filed contemporaneously herewith, entitled “Method and Apparatus for Interpreting Hybridized Biochip Patterns Using Resonant Interactions Employing Quantum Expressor Functions” (the “co-pending application”) and incorporated by reference herein. Briefly, the method of the co-pending application operates as follows. The method identifies mutations, if any, present in a biological sample from a set of known mutations by analyzing a dot spectrogram representative of quantized hybridization activity of oligonucleotides in the biological sample to identify the mutations. A resonance pattern is generated which is representative of resonances between a stimulus pattern associated with the set of known mutations and the dot spectrogram. The resonance pattern is interpreted to yield a set of confirmed mutations by comparing resonances found therein with predetermined resonances expected for the selected set of mutations. In a particular example described in the co-pending application, the resonance pattern is generated by iteratively processing the dot spectrogram by performing a convergent reverberation to yield a resonance pattern representative of resonances between a predetermined set of selected Quantum Expressor Functions and the dot spectrogram until a predetermined degree of convergence is achieved between the resonances found in the resonance pattern and resonances expected for the set of mutations. The resonance pattern is then analyzed to yield a set of confirmed mutations by mapping the confirmed mutations to known diseases associated with the pre-selected set of known mutations to identify diseases, if any, indicated by the DNA sample.
A diagnostic confirmation is then made by taking the identified diseases and solving in reverse for the associated Quantum Expressor Functions and then comparing those Quantum Expressor Functions with ones expected for the mutations associated with the identified disease to verify correspondence. If no correspondence is found, a new sub-set of known mutations is selected and the steps are repeated to determine whether any of the new set of mutations are present in the sample.
By exploiting a resonant interaction, mutation signatures may be identified within a dot spectrogram even in circumstances involving low signal to noise ratios or, in some cases, negative signal to noise ratios. By permitting the mutation signatures to be identified in such circumstances, the reliability of dot spectrogram analysis is thereby greatly enhanced. With an increase in reliability, costs associated with performing the analysis are decreased, in part, because there is less of a requirement for skilled technicians. Other advantages of the invention arise as well.
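A “negative” signal to noise ratio is most naturally read on a logarithmic (decibel) scale, where the value drops below zero whenever the noise power exceeds the signal power. The short sketch below assumes that reading; the function name and the power values are illustrative only and do not come from the co-pending application.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal to noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    return 10.0 * math.log10(signal_power / noise_power)


equal_power = snr_db(1.0, 1.0)    # signal and noise equally strong
buried_signal = snr_db(1.0, 10.0)  # noise ten times stronger than signal
```

With equal powers the ratio is 0 dB, and a signal ten times weaker than the noise yields a negative value, which is the regime in which a mutation signature is immersed in noise.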
Although the method of the co-pending application represents a significant advance over techniques of the prior art, room for further improvement remains. In particular, it would be desirable to enhance the method of the co-pending application to achieve a higher degree of repeatability.
In particular, repeatability is affected by spatio-temporal degradation of hybridization in DNA microarrays implementing both passive and active hybridization. In bioelectronic systems implementing passive hybridization, sources affecting repeatability of analysis results include:
Stochastic variability in chemical kinetics
Immobilized oligonucleotide damage during fabrication
Uneven kinetics during thermally facilitated fluidics reaction
Post hybridization thermal degradation. Currently biochips are “amplification limited”. This is in large part due to losses during high-temperature hybridization downstream. During periods when the sample temperature changes from high to low or low to high, extraneous, undesirable reactions can occur that consume important reagents and create unwanted and interfering chemicals. Rapid transitions ensure that the sample spends a minimum of time at undesirable intermediate temperatures, so that the amplified DNA product has optimum fidelity and purity. So current methods rely on excessive amplification to compensate for these losses.
Oligonucleotide entanglement
Environmental decoherence due to energy and radiation
Uneven fluidic catalysis
Unstable fluorescence and chemiluminescence marker binding
Spontaneous emissions
Partial bindings
Anti-aliasing during readout and digitization
Active hybridization is degraded by:
Capacitive coupling between elements of the immobilized matrix
Partial bindings due to current leakage and uneven conductance
Ultrascale quantum squeeze effects
Spontaneous emission
Nonspecific oligonucleotide trapping
Chaotic relaxation across the array
Hence, aspects of the present invention are directed, in part, to providing enhanced repeatability of biological sample analysis despite these factors.
In accordance with a first aspect of the invention, a method is provided for identifying mutations, if any, present in a biological sample. The method operates to analyze a biochip output pattern generated using the sample to identify the mutations in the sample. In accordance with the method the output pattern is tessellated. A stimulus pattern associated with the set of known mutations is generated. A resonance pattern is then generated which is representative of resonances between the stimulus pattern and the tessellated output patterns. The resonance pattern is interpreted to yield a set of confirmed mutations by comparing resonances found therein with predetermined resonances expected for the selected set of mutations.
In an exemplary embodiment, the output pattern is a dot spectrogram representative of quantized hybridization activity of oligonucleotides in a DNA sample. The stimulus pattern is generated based upon Quantum Expressor Functions. The dot spectrogram is tessellated to match morphological characteristics of the Quantum Expressor Functions and local parametrics are extracted. The tessellated dot spectrogram and the stimulus pattern are transformed to a metrically transitive random field via phase shifting. The resonance pattern is generated by iteratively processing the tessellated dot spectrogram by performing a convergent reverberation to yield a resonance pattern representative of resonances between the Quantum Expressor Functions and the tessellated dot spectrogram until a predetermined degree of convergence is achieved between the resonances found in the resonance pattern and resonances expected for the set of mutations. The convergent reverberation includes the step of performing a convergent reverberant dynamics resonance analysis of the tessellated dot spectrogram using the resonance stimulus pattern to identify mutations represented by the tessellated dot spectrogram. The convergent reverberation also includes the step of performing a convergent reverberant dynamics resonance analysis of the mutations using the resonance stimulus pattern to identify diagnostic conditions represented by the mutations.
Also in the exemplary embodiment, the convergent reverberant dynamics resonance analyses are performed by determining resonance dynamics relaxation values based upon the tessellated dot spectrogram and the resonance stimulus; filtering the dynamics relaxation values using ensemble boundary and complete spatial randomness (CSR) filters to yield a second set of values; applying bulk property estimators to the dynamics relaxation values to yield a third set of values; evaluating the second and third sets of values to determine a degree of resonance convergence; and then determining from the degree of resonance convergence whether a paralysis of dynamics has occurred and, if so, repeating the aforementioned steps.
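The control flow of the steps just listed can be sketched as a loop. The sketch below shows only the ordering of the steps and is not the method of the invention: every callable is a caller-supplied placeholder, the names are hypothetical, and feeding the relaxation values back in as the next state is an assumption made so that the loop has something to iterate on.

```python
def reverberation_loop(state, stimulus, steps, max_iter=100):
    """Repeat the listed steps until the paralysis check no longer fires.

    steps: dict of hypothetical callables with keys "relax", "filter",
    "bulk", "evaluate" and "paralysed".
    """
    for iteration in range(1, max_iter + 1):
        relaxation = steps["relax"](state, stimulus)  # relaxation values
        second = steps["filter"](relaxation)          # filtered (second) set
        third = steps["bulk"](relaxation)             # bulk estimate (third) set
        degree = steps["evaluate"](second, third)     # degree of convergence
        if not steps["paralysed"](degree):
            return relaxation, degree, iteration
        state = relaxation  # assumption: feed the result back and repeat
    raise RuntimeError("no exit from the loop within max_iter iterations")


# Toy stand-ins, chosen only so that the loop can be run end to end.
toy_steps = {
    "relax": lambda s, t: [a + 0.5 * (b - a) for a, b in zip(s, t)],
    "filter": lambda v: v,                    # identity in place of real filters
    "bulk": lambda v: sum(v) / len(v),        # mean as a bulk property
    "evaluate": lambda second, third: min(min(second), third),
    "paralysed": lambda degree: degree < 0.99,
}
values, degree, iterations = reverberation_loop([0.0, 0.0], [1.0, 1.0], toy_steps)
```

In this toy run the state moves halfway toward the stimulus on each pass, so the loop exits once the placeholder convergence measure reaches the arbitrary 0.99 target.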
In the exemplary embodiments, by tessellating the dot spectrogram to match morphological characteristics of the Quantum Expressor Functions and by exploiting a resonant interaction employing a resonance convergence check which uses extracted tessellation parametrics, mutation signatures may be identified within a dot spectrogram with a high degree of repeatability. By achieving a high degree of repeatability, the reliability of dot spectrogram analysis is thereby greatly enhanced. With an increase in reliability, costs associated with performing the analysis are decreased, in part, because there is less of a requirement for skilled technicians. Other advantages of the invention arise as well.
In accordance with a second aspect of the invention, a method is provided for preconditioning a dot spectrogram representative of quantized hybridization activity of oligonucleotides in a DNA sample. The method comprises the steps of tessellating the dot spectrogram to match characteristics of a predetermined stimulus pattern yielding a tessellated image; extracting local parametrics from the tessellated image; determining whether a degree of amplitude wandering representative of the local parametrics is within a predetermined allowable generator function limit; and if not, renormalizing the tessellated image to further match spectral properties of the stimulus pattern and repeating the steps.
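The tessellate, extract, check, and renormalize cycle of the second aspect can be illustrated with a toy loop. Everything concrete in the sketch is an assumption made for illustration: square tiles stand in for the tessellation, the per-tile mean stands in for the local parametrics, the spread of tile means stands in for amplitude wandering, and pulling each tile halfway toward the overall mean stands in for renormalization against the stimulus pattern.

```python
def tessellate(grid, tile):
    """Split a 2-D list into non-overlapping tile x tile blocks, row-major."""
    return [[row[c:c + tile] for row in grid[r:r + tile]]
            for r in range(0, len(grid), tile)
            for c in range(0, len(grid[0]), tile)]


def local_means(tiles):
    """Stand-in local parametric: mean amplitude of each tile."""
    return [sum(sum(row) for row in t) / sum(len(row) for row in t)
            for t in tiles]


def renormalize(grid, tile, means):
    """Stand-in renormalization: shift each tile halfway to the overall mean."""
    target = sum(means) / len(means)
    tiles_per_row = -(-len(grid[0]) // tile)  # ceiling division
    return [[value + 0.5 * (target - means[(r // tile) * tiles_per_row + c // tile])
             for c, value in enumerate(row)]
            for r, row in enumerate(grid)]


def precondition(grid, tile, limit, max_iter=20):
    """Repeat tessellate, extract, check until wandering is within limit."""
    for passes in range(max_iter):
        means = local_means(tessellate(grid, tile))
        wander = max(means) - min(means)  # stand-in amplitude wandering
        if wander <= limit:
            return grid, wander, passes
        grid = renormalize(grid, tile, means)
    raise RuntimeError("wandering still above limit after max_iter passes")


# Hypothetical 2 x 4 amplitude grid with two 2 x 2 tiles.
grid = [[1.0, 1.0, 3.0, 3.0],
        [1.0, 1.0, 3.0, 3.0]]
result, wander, passes = precondition(grid, tile=2, limit=0.5)
```

In this toy case the spread of tile means halves on each pass, so the loop stops after two renormalizations once the spread reaches the chosen limit.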
In accordance with a third aspect of the invention, a method is provided for performing a convergent reverberant dynamics resonance analysis of a dot spectrogram representative of quantized hybridization activity of oligonucleotides in a DNA sample to identify mutations represented thereby. The method comprises the steps of determining resonance dynamics relaxation values based upon the preconditioned dot spectrogram and the resonance stimulus; filtering the dynamics relaxation values using ensemble boundary and CSR filters to yield a second set of values; applying bulk property estimators to the dynamics relaxation values to yield a third set of values; evaluating the second and third sets of values to determine whether a predetermined degree of resonance convergence has been achieved; and determining whether a paralysis of dynamics has occurred and, if so, repeating the steps.
Among other applications, principles of the invention are applicable to the analysis of various arrayed biomolecular, ionic, bioelectronic, biochemical, optoelectronic, radio frequency (RF) and electronic microdevices. Principles of the invention are particularly applicable to mutation expression analysis at ultra-low concentrations using ultra-high density passive and/or active hybridization DNA-based microarrays. Techniques implemented in accordance with the invention are generally independent of the physical method employed to accumulate initial amplitude information from the bio-chip array, such as fluorescence labeling, charge clustering, phase shift integration and tracer imaging. Also, principles of the invention are applicable to optical, optoelectronic, and electronic readout of hybridization amplitude patterns. Furthermore, principles of the invention are applicable to molecular expression analysis at all levels of abstraction: namely DNA expression analysis, RNA expression analysis, protein interactions and protein-DNA interactions for medical diagnosis at the molecular level.
Apparatus embodiments are also provided.