The present invention relates to novel methods for analyzing the sequences of nucleic acid molecules using spectroscopic methods. In particular, the present invention relates to methods for gene expression analysis, sequence checking, mutation detection, and sequencing of nucleic acids. As such, the methods of the present invention broadly relate to molecular genetics and medical diagnostics.
Knowledge of genetic information in the form of the nucleotide sequence of genes is critical to an understanding of various biological phenomena such as cell development and differentiation, organism growth and reproduction, the underlying causes of disease, etc. For example, proteins serve a variety of structural and catalytic functions. These properties of proteins, however, are a function of the amino acid sequence of the protein, which in turn is encoded by nucleic acid sequences. Nucleic acids can also play a more direct role in cellular processes by functioning in the control and regulation of gene expression.
A variety of hybridization techniques have been developed to conduct various types of nucleic acid analyses to gain insight into how genetic information functions in these different types of biological processes. Typically, hybridization techniques involve the binding of certain target nucleic acids by nucleic acid probes under controlled conditions such that hybridization only occurs between complementary sequences. Using such hybridization techniques, it is possible to conduct gene expression studies and sequence checking studies, to determine the sequence of nucleic acids of unknown sequence, and to perform a variety of other types of analysis.
Gene expression studies are important because differential expression of genes has been shown to be associated with cell development, cell differentiation, disease states and adaptation to various environmental stimuli. For example, many diseases have been characterized by differences in the expression levels of various genes, either through changes in the copy number of the genomic DNA or through alterations in levels of transcription. In certain diseases, infection is frequently characterized by elevated expression of genes from a particular virus.
Sequence checking refers to methods in which samples containing nucleic acid targets are analyzed to detect the presence of a sequence of interest. This type of analysis has utility in diverse applications, including research, clinical diagnostics, quality control, etc. One particularly important type of sequence checking is the identification of polymorphisms, which are variations in the genetic code. Often polymorphisms take the form of a change in a single nucleotide and are called single nucleotide polymorphisms (SNPs). In other instances, the polymorphism may exist as a stretch of repeating sequences that vary in length among different individuals. In those instances in which these variations exist in a significant percentage of the population, they can readily be used as markers linked to genes involved in mono- and polygenic traits. Thus, analysis of polymorphisms can play an important role in locating, identifying and characterizing genes which are responsible for specific traits. In particular, polymorphisms can be used to identify genes responsible for certain diseases. Similarly, diagnostic tests can also be developed to detect polymorphisms known to be associated with certain diseases or disorders.
Hybridization techniques can also successfully be used in sequencing nucleic acids of unknown sequence. Such methods typically are considerably faster than conventional sequencing techniques.
Chips to which nucleic acid probes are attached can be used to conduct nucleic acid analyses. Probes can be attached at specific locations on the chip; these locations are often referred to as elements or sites. In some applications, the chip may include many elements arranged in the form of an array. Genetic methods utilizing arrays on chips have the advantage of allowing for parallel processing that can dramatically increase the rate at which analyses can be conducted as compared to conventional methods, which often require laborious electrophoretic separations. However, the current nucleic acid methods using chips typically require complex labeling procedures in order to identify which nucleic acid probes have hybridized with a target molecule. Moreover, the methods frequently involve complicated stringency washes in order to minimize binding between probes and targets which are not fully complementary.
The present invention provides new methods for conducting various types of nucleic acid analysis in which hybridization of probe and target sequences can be detected directly, thereby allowing the analyses to be simplified relative to existing methodologies.
The present invention provides various methods of analyzing nucleic acids utilizing a system which is sensitive to the dielectric properties of molecules and binding complexes, such as hybridization complexes formed between a nucleic acid probe and a nucleic acid target. The methods include diagnostic methods which involve detecting the presence of one or more target nucleic acids in a sample, quantitative methods, kinetic methods, and a variety of other types of analysis such as sequence checking, expression analysis and de novo sequencing. The methods can detect binding between nucleic acids without the use of labels. Certain methods involve the use of arrays, which allows for rapid throughput. Other methods involve the use of spectral profiles, which makes it possible to distinguish between different types of hybridization complexes.
Some methods provided by the present invention involve contacting a nucleic acid probe that is electromagnetically coupled to a portion of a signal path with a sample containing a target nucleic acid. The portion of the signal path to which the nucleic acid probe is coupled typically is a continuous transmission line. A response signal is detected for a hybridization complex formed between the nucleic acid probe and the nucleic acid target. Detection may involve propagating a test signal along the signal path and then detecting a response signal formed through modulation of the test signal by the hybridization complex.
Certain diagnostic methods utilize this general approach and include using a nucleic acid probe which is complementary to a target of known sequence. A sample potentially containing the target of known sequence is contacted with the complementary probe. In some methods, the target and probe are allowed to hybridize and then the targets and probes are washed under stringent conditions. In other methods, the stringency wash is unnecessary. Detection of a response signal is indicative of the sample containing the target of known sequence. Such methods can be used in detecting a single nucleotide polymorphism (SNP). The nucleic acid target containing a polymorphic site includes a first or a second base at the polymorphic site. The nucleic acid probe is selected to be complementary either to a nucleic acid target wherein the polymorphic site includes the first base or to a nucleic acid target wherein the polymorphic site includes the second base. With knowledge of the sequence of the nucleic acid probe, detection of a response signal makes it possible to identify whether the target contains the first or second base at the polymorphic site.
In other aspects, the present invention provides a variety of methods which utilize spectral profiles to analyze nucleic acid hybridization complexes. A profile is a spectrum for a particular hybridization complex. It can include certain signals which are characteristic of the particular complex, thus making it possible to utilize such characteristic signals, or signatures, as a diagnostic tool and as a way to distinguish between different types of binding. Thus, certain methods include acquiring a spectrum for a hybridization complex formed between a nucleic acid probe and a nucleic acid target, wherein the nucleic acid probe is electromagnetically coupled to a signal path. A test signal is propagated along the signal path and a response signal for the hybridization complex formed between the probe and target is detected. As the test signal is propagated down the signal path, the test signal is varied with time (for example, by varying the wavelength or frequency of the test signal). Certain spectra include plots of a signal parameter (e.g., transmitted power) as a function of frequency, for example.
Methods utilizing profiles can be used for diagnostic purposes. These methods involve obtaining a spectrum as just described wherein the nucleic acid probe is contacted with a sample containing a target nucleic acid prior to propagation of the test signal. The resulting spectrum is then analyzed for the presence of a known signal which is characteristic for a known hybridization complex between a particular probe and a particular target. The presence of the known signal in the spectrum is indicative of the particular target nucleic acid being present in the sample.
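By way of illustration only, the analysis of a spectrum for a known signal might be sketched as follows. The description above does not specify the form of the characteristic signal; this sketch assumes, hypothetically, that it appears as a dip in transmitted power near a characteristic frequency. The function name `has_known_signal`, its parameters, and all numeric values are illustrative assumptions and are not taken from the methods described.

```python
from statistics import median


def has_known_signal(spectrum, center_hz, half_width_hz, min_depth_db):
    """Return True if the spectrum shows a dip of at least min_depth_db
    within half_width_hz of center_hz, measured against the median
    transmitted power outside that window.

    spectrum: list of (frequency_hz, transmitted_power_db) pairs.
    The dip model is an assumption made for this sketch only.
    """
    inside = [p for f, p in spectrum if abs(f - center_hz) <= half_width_hz]
    outside = [p for f, p in spectrum if abs(f - center_hz) > half_width_hz]
    if not inside or not outside:
        return False  # the sweep does not cover the window; no decision
    baseline = median(outside)
    return baseline - min(inside) >= min_depth_db


# Synthetic sweep from 1.0 GHz to 2.0 GHz in 0.1 GHz steps: flat at -3 dB
# except for a single dip to -9 dB at 1.5 GHz (made-up values).
sweep = [(1.0e9 + i * 1.0e8, -3.0) for i in range(11)]
sweep[5] = (1.5e9, -9.0)

target_present = has_known_signal(sweep, 1.5e9, 5.0e7, 3.0)
```

Here `target_present` would indicate that the hypothetical signature was found in the sweep. A practical system would presumably calibrate the window and depth threshold against spectra of known hybridization complexes.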
Related methods utilize profiles to distinguish between complementary hybridization complexes and mismatch complexes. In these methods, the spectrum that is obtained using a known probe is examined for the presence of a complementary signal and/or a mismatch signal. The presence of the complementary signal is indicative of complementary binding between the nucleic acid probe and the nucleic acid target; likewise, the presence of a mismatch signal is indicative of a hybridization complex between the probe and target which includes a mismatch.
In still other related methods, profiles are used to identify whether a SNP is of the wild type form or a variant form. The target includes a polymorphic site which can include a first or second base. The nucleic acid probe sequence is selected so that if the target includes the first base at the polymorphic site, the target forms a complementary hybridization complex. If, however, the target includes the second base at the polymorphic site, then a mismatch hybridization complex is formed. Hence, the presence of a complementary signal in the test spectrum is indicative of the target including the first base at the polymorphic site, whereas the presence of a mismatch signal in the test spectrum is indicative of the target including the second base at the polymorphic site. Similar approaches can be used when there are more than two allelic forms.
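One possible way to decide between a complementary signal and a mismatch signal is to compare the test spectrum against stored reference profiles. The following is a minimal sketch under stated assumptions: the spectra are sampled at the same frequencies, reference profiles for both complex types are available, and the root-mean-square difference is an adequate similarity measure. None of these choices is prescribed by the methods described; the names `rms_difference`, `call_base`, and `max_rms_db`, and all values, are hypothetical.

```python
from math import sqrt


def rms_difference(a, b):
    """Root-mean-square difference between two equally sampled profiles."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def call_base(test_profile, complementary_ref, mismatch_ref, max_rms_db=1.0):
    """Report which reference profile the test profile resembles.

    Returns "first base" when the test profile is closest to the
    complementary reference, "second base" when closest to the mismatch
    reference, and "no call" when neither is within max_rms_db or when
    the two distances are equal.
    """
    d_comp = rms_difference(test_profile, complementary_ref)
    d_mis = rms_difference(test_profile, mismatch_ref)
    if min(d_comp, d_mis) > max_rms_db or d_comp == d_mis:
        return "no call"
    return "first base" if d_comp < d_mis else "second base"


# Made-up transmitted-power profiles (dB) at five shared frequencies.
complementary_ref = [-3.0, -3.0, -9.0, -3.0, -3.0]
mismatch_ref = [-3.0, -6.0, -3.0, -3.0, -3.0]
test_profile = [-3.1, -2.9, -8.7, -3.0, -3.2]

result = call_base(test_profile, complementary_ref, mismatch_ref)
```

With more than two allelic forms, the same comparison could be extended to one reference profile per allele, again as an assumption rather than a described implementation.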
Certain methods include the use of arrays. An array includes multiple elements, each element including a continuous transmission line and a nucleic acid probe (or plurality of probes) that is electromagnetically coupled to the continuous transmission line located within the element. The elements are contacted with a sample containing a nucleic acid target. A response signal is then detected for those elements in which a hybridization complex is formed.
Utilizing this general approach, arrays can be used to rapidly detect the presence of multiple targets in a sample. For instance, in a sample potentially containing a first target of known sequence and a second target of known sequence, nucleic acid probes are selected such that a first set of probes is complementary to the first target and a second set of probes is complementary to the second target. The first and second sets of probes are typically located at a first and second element, respectively. Detection of a response signal from the first element is indicative of the sample including the first target; similarly, a response signal from the second element is indicative of the sample including the second target. Through appropriate selection of the sequence of probes in the various elements, these methods can be used to distinguish between SNPs, to determine which genes are expressed in a particular cell (i.e., to conduct an expression analysis), and to determine the sequence of a nucleic acid.
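The element-by-element logic just described can be sketched as a lookup from array elements to the targets their probes complement. This is an illustrative sketch only: the element identifiers, the scalar response magnitudes, and the fixed detection threshold are assumptions introduced for the example, and `detect_targets` is a hypothetical name.

```python
def detect_targets(element_targets, responses, threshold):
    """Return the sorted names of targets whose elements gave a response.

    element_targets: dict mapping element id -> name of the target that
        the probes at that element are complementary to.
    responses: dict mapping element id -> response signal magnitude
        (arbitrary units; a scalar summary is assumed for simplicity).
    threshold: minimum magnitude treated as a detected response signal.
    """
    detected = {
        element_targets[element]
        for element, magnitude in responses.items()
        if element in element_targets and magnitude >= threshold
    }
    return sorted(detected)


# Hypothetical three-element array and made-up response magnitudes.
element_targets = {"E1": "target_1", "E2": "target_2", "E3": "target_3"}
responses = {"E1": 0.80, "E2": 0.05, "E3": 0.60}

present = detect_targets(element_targets, responses, threshold=0.5)
```

In this example the first and third elements exceed the threshold, so `present` lists the first and third targets. Placing probes for the two forms of a polymorphic site at separate elements would allow the same lookup to distinguish between SNP forms.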
The present invention also provides methods for obtaining quantitative information on nucleic acid hybridization events. Such methods typically include contacting a nucleic acid probe electromagnetically coupled to a portion of a signal path with a sample that includes a nucleic acid target, whereby a hybridization complex is formed between the probe and target. Changes in a signal or set of signals that are characteristic of the hybridization complex are then monitored. In certain methods, changes in signal amplitude or frequency are measured at different time points to obtain multiple measured values. The multiple values can be utilized, for example, to calculate kinetic parameters.
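As a hedged illustration of how multiple measured values might yield a kinetic parameter, the sketch below assumes a pseudo-first-order association model, S(t) = S_max (1 - exp(-k_obs t)), where S is the change in signal amplitude and S_max is its plateau value. The model, the known plateau, and the name `estimate_k_obs` are assumptions for this example; the description does not specify a particular kinetic model.

```python
from math import exp, log


def estimate_k_obs(times, signals, s_max):
    """Estimate an observed rate constant from a signal time course.

    Linearizes the assumed model as -ln(1 - S/S_max) = k_obs * t and
    returns the least-squares slope of a line through the origin.
    Points at t <= 0 or at or beyond the plateau are skipped because
    the logarithm is undefined or uninformative there.
    """
    numerator = 0.0
    denominator = 0.0
    for t, s in zip(times, signals):
        fraction = s / s_max
        if t <= 0 or not 0.0 <= fraction < 1.0:
            continue
        y = -log(1.0 - fraction)
        numerator += t * y
        denominator += t * t
    if denominator == 0.0:
        raise ValueError("no usable time points")
    return numerator / denominator


# Synthetic, noise-free time course generated from the assumed model
# with k_obs = 0.05 per second and a plateau of 2.0 (arbitrary units).
times = [10.0, 20.0, 40.0, 80.0]
signals = [2.0 * (1.0 - exp(-0.05 * t)) for t in times]

k_obs = estimate_k_obs(times, signals, s_max=2.0)
```

With real measurements the plateau would usually have to be estimated as well, and a nonlinear fit may be preferable to this linearization, which weights noisy points near the plateau heavily.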